From patchwork Sat Feb 17 18:22:17 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 128674
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Sat, 17 Feb 2018 10:22:17 -0800
Message-Id: <20180217182323.25885-2-richard.henderson@linaro.org>
In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org>
References: <20180217182323.25885-1-richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH v2 01/67] target/arm: Enable SVE for aarch64-linux-user
Cc: qemu-arm@nongnu.org

Enable ARM_FEATURE_SVE for the generic "any" cpu.
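
A reading aid for the reset change in this patch. The two deposit64() calls each write 0b11 into a two-bit field of CPACR_EL1: bits [21:20] (FPEN) and bits [17:16] (ZEN). ZCR_ELx.LEN holds the vector length in 128-bit quadwords, minus one. The sketch below is standalone C, not QEMU code: deposit64_sketch() is a local stand-in written for this note and assumed to behave like QEMU's deposit64(), and the value 16 used for SKETCH_ARM_MAX_VQ is an assumption about how ARM_MAX_VQ is defined in this series.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in for QEMU's deposit64(); written for this note only. */
static inline uint64_t deposit64_sketch(uint64_t value, int start, int length,
                                        uint64_t fieldval)
{
    uint64_t mask;

    assert(start >= 0 && length > 0 && length <= 64 - start);
    mask = (~0ULL >> (64 - length)) << start;
    return (value & ~mask) | ((fieldval << start) & mask);
}

/* Assumed value of ARM_MAX_VQ (count of 128-bit quadwords). */
#define SKETCH_ARM_MAX_VQ 16

/* Vector length in bytes implied by a ZCR_ELx.LEN field value. */
static inline unsigned zcr_len_to_bytes(unsigned len)
{
    return (len + 1) * 16;
}

/* CPACR_EL1 after the two deposits made at user-mode reset. */
static inline uint64_t cpacr_after_reset(uint64_t cpacr)
{
    cpacr = deposit64_sketch(cpacr, 20, 2, 3);  /* FPEN = 0b11 */
    cpacr = deposit64_sketch(cpacr, 16, 2, 3);  /* ZEN  = 0b11 */
    return cpacr;
}
```

Starting from a zero register, cpacr_after_reset(0) yields 0x330000. Under the assumed quadword count of 16, a reset LEN of 15 corresponds to a 256-byte (2048-bit) vector.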

Signed-off-by: Richard Henderson
---
 target/arm/cpu.c   | 7 +++++++
 target/arm/cpu64.c | 1 +
 2 files changed, 8 insertions(+)

--
2.14.3

Reviewed-by: Peter Maydell

diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index 1b3ae62db6..10843994c3 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -150,6 +150,13 @@ static void arm_cpu_reset(CPUState *s)
         env->cp15.sctlr_el[1] |= SCTLR_UCT | SCTLR_UCI | SCTLR_DZE;
         /* and to the FP/Neon instructions */
         env->cp15.cpacr_el1 = deposit64(env->cp15.cpacr_el1, 20, 2, 3);
+        /* and to the SVE instructions */
+        env->cp15.cpacr_el1 = deposit64(env->cp15.cpacr_el1, 16, 2, 3);
+        env->cp15.cptr_el[3] |= CPTR_EZ;
+        /* with maximum vector length */
+        env->vfp.zcr_el[1] = ARM_MAX_VQ - 1;
+        env->vfp.zcr_el[2] = ARM_MAX_VQ - 1;
+        env->vfp.zcr_el[3] = ARM_MAX_VQ - 1;
 #else
         /* Reset into the highest available EL */
         if (arm_feature(env, ARM_FEATURE_EL3)) {
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index efc519b49b..36ef9e9d9d 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -231,6 +231,7 @@ static void aarch64_any_initfn(Object *obj)
     set_feature(&cpu->env, ARM_FEATURE_V8_PMULL);
     set_feature(&cpu->env, ARM_FEATURE_CRC);
     set_feature(&cpu->env, ARM_FEATURE_V8_FP16);
+    set_feature(&cpu->env, ARM_FEATURE_SVE);
     cpu->ctr = 0x80038003; /* 32 byte I and D cacheline size, VIPT icache */
     cpu->dcz_blocksize = 7; /* 512 bytes */
 }

From patchwork Sat Feb 17 18:22:18 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 128672
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Sat, 17 Feb 2018 10:22:18 -0800
Message-Id: <20180217182323.25885-3-richard.henderson@linaro.org>
In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org>
References: <20180217182323.25885-1-richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH v2 02/67] target/arm: Introduce translate-a64.h
Cc: qemu-arm@nongnu.org

Move some stuff that will be common to both translate-a64.c and translate-sve.c.
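
A reading aid for vec_reg_offset(), one of the helpers this patch moves into the header. On a big-endian host the helper first computes the offset as if all 128 bits were stored big-endian, then XORs with 8 because d[0] remains the low 64-bit half in memory. The sketch below is standalone C rather than QEMU code; q_element_offset() and its host_big_endian parameter are hypothetical names standing in for the compile-time HOST_WORDS_BIGENDIAN switch, and the base offset of the register inside CPUARMState is left out.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Byte offset of an element within one 16-byte Q register slot.
 * size_log2 is log2 of the element size in bytes (0 to 3).
 * host_big_endian selects the branch that the real helper picks
 * at build time.
 */
static inline int q_element_offset(int element, int size_log2,
                                   bool host_big_endian)
{
    int bytes;
    int offs;

    assert(size_log2 >= 0 && size_log2 <= 3);
    bytes = 1 << size_log2;
    assert(element >= 0 && (element + 1) * bytes <= 16);

    if (host_big_endian) {
        /* Fully big-endian 128-bit layout ... */
        offs = 16 - (element + 1) * bytes;
        /* ... then swap the two 64-bit halves. */
        offs ^= 8;
    } else {
        offs = element * bytes;
    }
    return offs;
}
```

For example, 64-bit element 0 sits at offset 0 on either host, while 32-bit element 0 sits at offset 0 on a little-endian host and offset 4 on a big-endian one, since it is the low word of a big-endian d[0].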

Signed-off-by: Richard Henderson
---
 target/arm/translate-a64.h | 110 +++++++++++++++++++++++++++++++++++++++++++++
 target/arm/translate-a64.c | 101 ++++++-----------------------------------
 2 files changed, 123 insertions(+), 88 deletions(-)
 create mode 100644 target/arm/translate-a64.h

--
2.14.3

Reviewed-by: Peter Maydell
Reviewed-by: Alex Bennée

diff --git a/target/arm/translate-a64.h b/target/arm/translate-a64.h
new file mode 100644
index 0000000000..e519aee314
--- /dev/null
+++ b/target/arm/translate-a64.h
@@ -0,0 +1,110 @@
+/*
+ * AArch64 translation, common definitions.
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see .
+ */
+
+#ifndef TARGET_ARM_TRANSLATE_A64_H
+#define TARGET_ARM_TRANSLATE_A64_H
+
+void unallocated_encoding(DisasContext *s);
+
+#define unsupported_encoding(s, insn)                                    \
+    do {                                                                 \
+        qemu_log_mask(LOG_UNIMP,                                         \
+                      "%s:%d: unsupported instruction encoding 0x%08x "  \
+                      "at pc=%016" PRIx64 "\n",                          \
+                      __FILE__, __LINE__, insn, s->pc - 4);              \
+        unallocated_encoding(s);                                         \
+    } while (0)
+
+TCGv_i64 new_tmp_a64(DisasContext *s);
+TCGv_i64 new_tmp_a64_zero(DisasContext *s);
+TCGv_i64 cpu_reg(DisasContext *s, int reg);
+TCGv_i64 cpu_reg_sp(DisasContext *s, int reg);
+TCGv_i64 read_cpu_reg(DisasContext *s, int reg, int sf);
+TCGv_i64 read_cpu_reg_sp(DisasContext *s, int reg, int sf);
+void write_fp_dreg(DisasContext *s, int reg, TCGv_i64 v);
+TCGv_ptr get_fpstatus_ptr(bool);
+bool logic_imm_decode_wmask(uint64_t *result, unsigned int immn,
+                            unsigned int imms, unsigned int immr);
+uint64_t vfp_expand_imm(int size, uint8_t imm8);
+
+/* We should have at some point before trying to access an FP register
+ * done the necessary access check, so assert that
+ * (a) we did the check and
+ * (b) we didn't then just plough ahead anyway if it failed.
+ * Print the instruction pattern in the abort message so we can figure
+ * out what we need to fix if a user encounters this problem in the wild.
+ */
+static inline void assert_fp_access_checked(DisasContext *s)
+{
+#ifdef CONFIG_DEBUG_TCG
+    if (unlikely(!s->fp_access_checked || s->fp_excp_el)) {
+        fprintf(stderr, "target-arm: FP access check missing for "
+                "instruction 0x%08x\n", s->insn);
+        abort();
+    }
+#endif
+}
+
+/* Return the offset into CPUARMState of an element of specified
+ * size, 'element' places in from the least significant end of
+ * the FP/vector register Qn.
+ */
+static inline int vec_reg_offset(DisasContext *s, int regno,
+                                 int element, TCGMemOp size)
+{
+    int offs = 0;
+#ifdef HOST_WORDS_BIGENDIAN
+    /* This is complicated slightly because vfp.zregs[n].d[0] is
+     * still the low half and vfp.zregs[n].d[1] the high half
+     * of the 128 bit vector, even on big endian systems.
+     * Calculate the offset assuming a fully bigendian 128 bits,
+     * then XOR to account for the order of the two 64 bit halves.
+     */
+    offs += (16 - ((element + 1) * (1 << size)));
+    offs ^= 8;
+#else
+    offs += element * (1 << size);
+#endif
+    offs += offsetof(CPUARMState, vfp.zregs[regno]);
+    assert_fp_access_checked(s);
+    return offs;
+}
+
+/* Return the offset info CPUARMState of the "whole" vector register Qn. */
+static inline int vec_full_reg_offset(DisasContext *s, int regno)
+{
+    assert_fp_access_checked(s);
+    return offsetof(CPUARMState, vfp.zregs[regno]);
+}
+
+/* Return a newly allocated pointer to the vector register. */
+static inline TCGv_ptr vec_full_reg_ptr(DisasContext *s, int regno)
+{
+    TCGv_ptr ret = tcg_temp_new_ptr();
+    tcg_gen_addi_ptr(ret, cpu_env, vec_full_reg_offset(s, regno));
+    return ret;
+}
+
+/* Return the byte size of the "whole" vector register, VL / 8. */
+static inline int vec_full_reg_size(DisasContext *s)
+{
+    return s->sve_len;
+}
+
+bool disas_sve(DisasContext *, uint32_t);
+
+#endif /* TARGET_ARM_TRANSLATE_A64_H */
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 032cbfa17d..e0e7ebf68c 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -36,13 +36,13 @@
 #include "exec/log.h"
 
 #include "trace-tcg.h"
+#include "translate-a64.h"
 
 static TCGv_i64 cpu_X[32];
 static TCGv_i64 cpu_pc;
 
 /* Load/store exclusive handling */
 static TCGv_i64 cpu_exclusive_high;
-static TCGv_i64 cpu_reg(DisasContext *s, int reg);
 
 static const char *regnames[] = {
     "x0", "x1", "x2", "x3", "x4", "x5", "x6", "x7",
@@ -392,22 +392,13 @@ static inline void gen_goto_tb(DisasContext *s, int n, uint64_t dest)
     }
 }
 
-static void unallocated_encoding(DisasContext *s)
+void unallocated_encoding(DisasContext *s)
 {
     /* Unallocated and reserved encodings are uncategorized */
     gen_exception_insn(s, 4, EXCP_UDEF, syn_uncategorized(),
                        default_exception_el(s));
 }
 
-#define unsupported_encoding(s, insn)                                    \
-    do {                                                                 \
-        qemu_log_mask(LOG_UNIMP,                                         \
-                      "%s:%d: unsupported instruction encoding 0x%08x "  \
-                      "at pc=%016" PRIx64 "\n",                          \
-                      __FILE__, __LINE__, insn, s->pc - 4);              \
-        unallocated_encoding(s);                                         \
-    } while (0)
-
 static void init_tmp_a64_array(DisasContext *s)
 {
 #ifdef CONFIG_DEBUG_TCG
@@ -425,13 +416,13 @@ static void free_tmp_a64(DisasContext *s)
     init_tmp_a64_array(s);
 }
 
-static TCGv_i64 new_tmp_a64(DisasContext *s)
+TCGv_i64 new_tmp_a64(DisasContext *s)
 {
     assert(s->tmp_a64_count < TMP_A64_MAX);
     return s->tmp_a64[s->tmp_a64_count++] = tcg_temp_new_i64();
 }
 
-static TCGv_i64 new_tmp_a64_zero(DisasContext *s)
+TCGv_i64 new_tmp_a64_zero(DisasContext *s)
 {
     TCGv_i64 t = new_tmp_a64(s);
     tcg_gen_movi_i64(t, 0);
@@ -453,7 +444,7 @@ static TCGv_i64 new_tmp_a64_zero(DisasContext *s)
  * to cpu_X[31] and ZR accesses to a temporary which can be discarded.
  * This is the point of the _sp forms.
  */
-static TCGv_i64 cpu_reg(DisasContext *s, int reg)
+TCGv_i64 cpu_reg(DisasContext *s, int reg)
 {
     if (reg == 31) {
         return new_tmp_a64_zero(s);
@@ -463,7 +454,7 @@ static TCGv_i64 cpu_reg(DisasContext *s, int reg)
 }
 
 /* register access for when 31 == SP */
-static TCGv_i64 cpu_reg_sp(DisasContext *s, int reg)
+TCGv_i64 cpu_reg_sp(DisasContext *s, int reg)
 {
     return cpu_X[reg];
 }
@@ -472,7 +463,7 @@ static TCGv_i64 cpu_reg_sp(DisasContext *s, int reg)
  * representing the register contents. This TCGv is an auto-freed
  * temporary so it need not be explicitly freed, and may be modified.
  */
-static TCGv_i64 read_cpu_reg(DisasContext *s, int reg, int sf)
+TCGv_i64 read_cpu_reg(DisasContext *s, int reg, int sf)
 {
     TCGv_i64 v = new_tmp_a64(s);
     if (reg != 31) {
@@ -487,7 +478,7 @@ static TCGv_i64 read_cpu_reg(DisasContext *s, int reg, int sf)
     return v;
 }
 
-static TCGv_i64 read_cpu_reg_sp(DisasContext *s, int reg, int sf)
+TCGv_i64 read_cpu_reg_sp(DisasContext *s, int reg, int sf)
 {
     TCGv_i64 v = new_tmp_a64(s);
     if (sf) {
@@ -498,72 +489,6 @@ static TCGv_i64 read_cpu_reg_sp(DisasContext *s, int reg, int sf)
     return v;
 }
 
-/* We should have at some point before trying to access an FP register
- * done the necessary access check, so assert that
- * (a) we did the check and
- * (b) we didn't then just plough ahead anyway if it failed.
- * Print the instruction pattern in the abort message so we can figure
- * out what we need to fix if a user encounters this problem in the wild.
- */
-static inline void assert_fp_access_checked(DisasContext *s)
-{
-#ifdef CONFIG_DEBUG_TCG
-    if (unlikely(!s->fp_access_checked || s->fp_excp_el)) {
-        fprintf(stderr, "target-arm: FP access check missing for "
-                "instruction 0x%08x\n", s->insn);
-        abort();
-    }
-#endif
-}
-
-/* Return the offset into CPUARMState of an element of specified
- * size, 'element' places in from the least significant end of
- * the FP/vector register Qn.
- */
-static inline int vec_reg_offset(DisasContext *s, int regno,
-                                 int element, TCGMemOp size)
-{
-    int offs = 0;
-#ifdef HOST_WORDS_BIGENDIAN
-    /* This is complicated slightly because vfp.zregs[n].d[0] is
-     * still the low half and vfp.zregs[n].d[1] the high half
-     * of the 128 bit vector, even on big endian systems.
-     * Calculate the offset assuming a fully bigendian 128 bits,
-     * then XOR to account for the order of the two 64 bit halves.
-     */
-    offs += (16 - ((element + 1) * (1 << size)));
-    offs ^= 8;
-#else
-    offs += element * (1 << size);
-#endif
-    offs += offsetof(CPUARMState, vfp.zregs[regno]);
-    assert_fp_access_checked(s);
-    return offs;
-}
-
-/* Return the offset info CPUARMState of the "whole" vector register Qn. */
-static inline int vec_full_reg_offset(DisasContext *s, int regno)
-{
-    assert_fp_access_checked(s);
-    return offsetof(CPUARMState, vfp.zregs[regno]);
-}
-
-/* Return a newly allocated pointer to the vector register. */
-static TCGv_ptr vec_full_reg_ptr(DisasContext *s, int regno)
-{
-    TCGv_ptr ret = tcg_temp_new_ptr();
-    tcg_gen_addi_ptr(ret, cpu_env, vec_full_reg_offset(s, regno));
-    return ret;
-}
-
-/* Return the byte size of the "whole" vector register, VL / 8. */
-static inline int vec_full_reg_size(DisasContext *s)
-{
-    /* FIXME SVE: We should put the composite ZCR_EL* value into tb->flags.
-       In the meantime this is just the AdvSIMD length of 128. */
-    return 128 / 8;
-}
-
 /* Return the offset into CPUARMState of a slice (from
  * the least significant end) of FP register Qn (ie
  * Dn, Sn, Hn or Bn).
@@ -620,7 +545,7 @@ static void clear_vec_high(DisasContext *s, bool is_q, int rd)
     }
 }
 
-static void write_fp_dreg(DisasContext *s, int reg, TCGv_i64 v)
+void write_fp_dreg(DisasContext *s, int reg, TCGv_i64 v)
 {
     unsigned ofs = fp_reg_offset(s, reg, MO_64);
 
@@ -637,7 +562,7 @@ static void write_fp_sreg(DisasContext *s, int reg, TCGv_i32 v)
     tcg_temp_free_i64(tmp);
 }
 
-static TCGv_ptr get_fpstatus_ptr(bool is_f16)
+TCGv_ptr get_fpstatus_ptr(bool is_f16)
 {
     TCGv_ptr statusptr = tcg_temp_new_ptr();
     int offset;
@@ -3130,8 +3055,8 @@ static inline uint64_t bitmask64(unsigned int length)
  * value (ie should cause a guest UNDEF exception), and true if they are
  * valid, in which case the decoded bit pattern is written to result.
  */
-static bool logic_imm_decode_wmask(uint64_t *result, unsigned int immn,
-                                   unsigned int imms, unsigned int immr)
+bool logic_imm_decode_wmask(uint64_t *result, unsigned int immn,
+                            unsigned int imms, unsigned int immr)
 {
     uint64_t mask;
     unsigned e, levels, s, r;
@@ -5164,7 +5089,7 @@ static void disas_fp_3src(DisasContext *s, uint32_t insn)
  * the range 01....1xx to 10....0xx, and the most significant 4 bits of
  * the mantissa; see VFPExpandImm() in the v8 ARM ARM.
  */
-static uint64_t vfp_expand_imm(int size, uint8_t imm8)
+uint64_t vfp_expand_imm(int size, uint8_t imm8)
 {
     uint64_t imm;
 

From patchwork Sat Feb 17 18:22:19 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 128676
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Sat, 17 Feb 2018 10:22:19 -0800
Message-Id: <20180217182323.25885-4-richard.henderson@linaro.org>
In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org>
References: <20180217182323.25885-1-richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH v2 03/67] target/arm: Add SVE decode skeleton
Cc: qemu-arm@nongnu.org

Including only 4, as-yet unimplemented, instruction patterns so that the whole thing compiles.

Signed-off-by: Richard Henderson
---
 target/arm/translate-a64.c | 11 +++++++-
 target/arm/translate-sve.c | 63 ++++++++++++++++++++++++++++++++++++++++++++++
 .gitignore                 |  1 +
 target/arm/Makefile.objs   | 10 ++++++++
 target/arm/sve.decode      | 45 +++++++++++++++++++++++++++++++++
 5 files changed, 129 insertions(+), 1 deletion(-)
 create mode 100644 target/arm/translate-sve.c
 create mode 100644 target/arm/sve.decode

--
2.14.3

Reviewed-by: Peter Maydell

diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index e0e7ebf68c..a50fef98af 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -12772,9 +12772,18 @@ static void disas_a64_insn(CPUARMState *env, DisasContext *s)
     s->fp_access_checked = false;
 
     switch (extract32(insn, 25, 4)) {
-    case 0x0: case 0x1: case 0x2: case 0x3: /* UNALLOCATED */
+    case 0x0: case 0x1: case 0x3: /* UNALLOCATED */
         unallocated_encoding(s);
         break;
+    case 0x2:
+        if (!arm_dc_feature(s, ARM_FEATURE_SVE)) {
+            unallocated_encoding(s);
+        } else if (!sve_access_check(s) || !fp_access_check(s)) {
+            /* exception raised */
+        } else if (!disas_sve(s, insn)) {
+            unallocated_encoding(s);
+        }
+        break;
     case 0x8: case 0x9: /* Data processing - immediate */
         disas_data_proc_imm(s, insn);
         break;
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
new file mode 100644
index 0000000000..2c9e4733cb
--- /dev/null
+++ b/target/arm/translate-sve.c
@@ -0,0 +1,63 @@
+/*
+ * AArch64 SVE translation
+ *
+ * Copyright (c) 2018 Linaro, Ltd
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see .
+ */
+
+#include "qemu/osdep.h"
+#include "cpu.h"
+#include "exec/exec-all.h"
+#include "tcg-op.h"
+#include "tcg-op-gvec.h"
+#include "qemu/log.h"
+#include "arm_ldst.h"
+#include "translate.h"
+#include "internals.h"
+#include "exec/helper-proto.h"
+#include "exec/helper-gen.h"
+#include "exec/log.h"
+#include "trace-tcg.h"
+#include "translate-a64.h"
+
+/*
+ * Include the generated decoder.
+ */
+
+#include "decode-sve.inc.c"
+
+/*
+ * Implement all of the translator functions referenced by the decoder.
+ */
+
+static void trans_AND_zzz(DisasContext *s, arg_AND_zzz *a, uint32_t insn)
+{
+    unsupported_encoding(s, insn);
+}
+
+static void trans_ORR_zzz(DisasContext *s, arg_ORR_zzz *a, uint32_t insn)
+{
+    unsupported_encoding(s, insn);
+}
+
+static void trans_EOR_zzz(DisasContext *s, arg_EOR_zzz *a, uint32_t insn)
+{
+    unsupported_encoding(s, insn);
+}
+
+static void trans_BIC_zzz(DisasContext *s, arg_BIC_zzz *a, uint32_t insn)
+{
+    unsupported_encoding(s, insn);
+}
diff --git a/.gitignore b/.gitignore
index 704b22285d..abe2b81a26 100644
--- a/.gitignore
+++ b/.gitignore
@@ -140,3 +140,4 @@ trace-dtrace-root.h
 trace-dtrace-root.dtrace
 trace-ust-all.h
 trace-ust-all.c
+/target/arm/decode-sve.inc.c
diff --git a/target/arm/Makefile.objs b/target/arm/Makefile.objs
index 847fb52ee0..9934cf1d4d 100644
--- a/target/arm/Makefile.objs
+++ b/target/arm/Makefile.objs
@@ -10,3 +10,13 @@ obj-y += gdbstub.o
 obj-$(TARGET_AARCH64) += cpu64.o translate-a64.o helper-a64.o gdbstub64.o
 obj-y += crypto_helper.o
 obj-$(CONFIG_SOFTMMU) += arm-powerctl.o
+
+DECODETREE = $(SRC_PATH)/scripts/decodetree.py
+
+target/arm/decode-sve.inc.c: $(SRC_PATH)/target/arm/sve.decode $(DECODETREE)
+	$(call quiet-command,\
+	  $(PYTHON) $(DECODETREE) --decode disas_sve -o $@ $<,\
+	  "GEN", $(TARGET_DIR)$@)
+
+target/arm/translate-sve.o: target/arm/decode-sve.inc.c
+obj-$(TARGET_AARCH64) += translate-sve.o
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
new file mode 100644
index 0000000000..2c13a6024a
--- /dev/null
+++ b/target/arm/sve.decode
@@ -0,0 +1,45 @@
+# AArch64 SVE instruction descriptions
+#
+# Copyright (c) 2017 Linaro, Ltd
+#
+# This library is free software; you can redistribute it and/or
+# modify it under the terms of the GNU Lesser General Public
+# License as published by the Free Software Foundation; either
+# version 2 of the License, or (at your option) any later version.
+#
+# This library is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+# Lesser General Public License for more details.
+#
+# You should have received a copy of the GNU Lesser General Public
+# License along with this library; if not, see .
+
+#
+# This file is processed by scripts/decodetree.py
+#
+
+###########################################################################
+# Named attribute sets. These are used to make nice(er) names
+# when creating helpers common to those for the individual
+# instruction patterns.
+
+&rrr_esz        rd rn rm esz
+
+###########################################################################
+# Named instruction formats. These are generally used to
+# reduce the amount of duplication between instruction patterns.
+
+# Three operand with unused vector element size
+@rd_rn_rm_e0    ........ ... rm:5 ... ... rn:5 rd:5     &rrr_esz esz=0
+
+###########################################################################
+# Instruction patterns. Grouped according to the SVE encodingindex.xhtml.
+
+### SVE Logical - Unpredicated Group
+
+# SVE bitwise logical operations (unpredicated)
+AND_zzz         00000100 00 1 ..... 001 100 ..... .....         @rd_rn_rm_e0
+ORR_zzz         00000100 01 1 ..... 001 100 ..... .....         @rd_rn_rm_e0
+EOR_zzz         00000100 10 1 ..... 001 100 ..... .....         @rd_rn_rm_e0
+BIC_zzz         00000100 11 1 ..... 001 100 ..... .....
@rd_rn_rm_e0 From patchwork Sat Feb 17 18:22:20 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 128680 Delivered-To: patch@linaro.org Received: by 10.46.124.24 with SMTP id x24csp1821323ljc; Sat, 17 Feb 2018 10:33:02 -0800 (PST) X-Google-Smtp-Source: AH8x227ZPZ/YqbGm0IyESl9iY2BixnT18vFmSaLZTzrX8E32jLnfjwPq9m1MRSWf/1BH/kCqmJR+ X-Received: by 10.13.198.193 with SMTP id i184mr7365145ywd.446.1518892382655; Sat, 17 Feb 2018 10:33:02 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1518892382; cv=none; d=google.com; s=arc-20160816; b=E1QB7JHNL+SOC8YMMwMjyEAOCKVwYz0HB73SGnD+vDsC4Aur1ml9kHm5vL6+lH0YWK j95DrSLMn1cHbHBabCGr0p48MyHMZFMZwhbLtChsv/+aSehutGxXNzOQvqdBOw2KTg9w pVNtEBn6xAIDogCjhpiPuY/gppAwP27iZCRsN7ooyTaQUQrtXRibRhobPrTuzOthn+T0 cfuvRERinOFscLGjNTLG10oPSazonUzWovgQgNha1QKRiFSrC0PQHYXxK5lxS8d8ggnR QMjyeLcE5b7jWMP2Qt3vRYcicRK3wFJnGWFywTmerpJ4iwXsPYTH2K1S9qx7DMnJffkK KhQg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature:arc-authentication-results; bh=crjT9mdFhV4QIVceZtyE/uLyIPkijbglCCBt+D3M/dI=; b=mZ0vWJukFe8EvMzjshoY8NEgo+vXAJY4ttC3TVDaWjQjHYahZdv064ftinhq5IMcG8 +/dE1Hi+haV6v4ejYp4ATxLjZnKT63o+GT/doiUCzyCFMPJdruPnVZNTW4YYnlFat6c0 qjF4sk9elOGNRQGq3NuLC9YS7LGB4BkaUQOU9k9ruGVUrhmk0imKKwI4hbTuvzD1AqGI hh9vc8YFSPgRLP3aUGmtQxM5Qu0MywWwz7slDZFbkzJNu6QdUeBEhxg51AFyf9cshSw0 RDUZxUJ3qqpbBaPMeOxPda/34mqaqNF0hXt8AYD1yWyTX7yeAPlN+LGJ6Ysrh6Hc39M/ 2jyQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=et7eDw+j; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom=qemu-devel-bounces+patch=linaro.org@nongnu.org; dmarc=fail 
(p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [2001:4830:134:3::11]) by mx.google.com with ESMTPS id h79si1147875ywc.111.2018.02.17.10.33.02 for (version=TLS1 cipher=AES128-SHA bits=128/128); Sat, 17 Feb 2018 10:33:02 -0800 (PST) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) client-ip=2001:4830:134:3::11; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=et7eDw+j; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom=qemu-devel-bounces+patch=linaro.org@nongnu.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:48109 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en7IE-0007B1-2X for patch@linaro.org; Sat, 17 Feb 2018 13:33:02 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39527) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en795-000075-78 for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:23:36 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1en794-0001TG-AA for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:23:35 -0500 Received: from mail-pl0-x244.google.com ([2607:f8b0:400e:c01::244]:46293) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1en794-0001Ss-4a for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:23:34 -0500 Received: by mail-pl0-x244.google.com with SMTP id x19so3427827plr.13 for ; Sat, 17 Feb 2018 10:23:34 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=crjT9mdFhV4QIVceZtyE/uLyIPkijbglCCBt+D3M/dI=; b=et7eDw+j8BCHiHiM5qQ/tvuBRTE9lvahCck7xOYIWpK5kHrZGzh6Fu09G0YZHn9sSU 
MbphDAAhkkvf7gc6iRiiCzGqsLzn9HYBvVBRcB8CwobVtGVHHblMvTI+RCJv+EZEU979 O3gd8yFaS1Ks4T4zJY02j5kxY3zXJxlxlGlSE= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=crjT9mdFhV4QIVceZtyE/uLyIPkijbglCCBt+D3M/dI=; b=GdZwOPEpX8EaDw4Wz0g3R5qgUMOdGIU9W3XzFNt0GHH2zHTgmnEhcSt/3h4UEkBegF ZW+Y2NmZcbm/02/qtG0QqfO9aGX6ShuYYD7htZpjKlpr8epMBEXvaqjH+SFVwNaFINtX hlcIlADueA1b7Ktc9Gyp5wjRml8vPpg3EQhrebtHZ2D+U5jKMwTDhQCdB7SHPU/1WTP9 TX9D5QBXCNDfC3yIn5a4WemAMwhOfRZ9Q1RUnOT2NMq41mWTupCe6QQm3MHczz/L/CFh 9h4Vwxf5C9rXx+9ZHVg5lcMSAgppFOO+5lQqEVaep4Him0Tm2qIVNY+GNxkWkyDVy4vK CKhA== X-Gm-Message-State: APf1xPB3dDxHXd/Oha2SHAm/Cu1wJb/XtuXlHDEmae/0Djzs5EfHcJHK ind+6Zu1kmpeMSADBPJEXTeB4my5q0k= X-Received: by 2002:a17:902:8304:: with SMTP id bd4-v6mr9608299plb.123.1518891812941; Sat, 17 Feb 2018 10:23:32 -0800 (PST) Received: from cloudburst.twiddle.net ([50.0.192.64]) by smtp.gmail.com with ESMTPSA id h15sm13466712pfi.56.2018.02.17.10.23.31 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Sat, 17 Feb 2018 10:23:32 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:22:20 -0800 Message-Id: <20180217182323.25885-5-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. 
Subject: [Qemu-devel] [PATCH v2 04/67] target/arm: Implement SVE Bitwise Logical - Unpredicated Group
Cc: qemu-arm@nongnu.org

These were the instructions that were stubbed out when
introducing the decode skeleton.

Signed-off-by: Richard Henderson
---
 target/arm/translate-sve.c | 50 +++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 43 insertions(+), 7 deletions(-)

--
2.14.3

Reviewed-by: Peter Maydell

diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index 2c9e4733cb..50cf2a1fdd 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -32,6 +32,10 @@
 #include "trace-tcg.h"
 #include "translate-a64.h"
 
+typedef void GVecGen2Fn(unsigned, uint32_t, uint32_t, uint32_t, uint32_t);
+typedef void GVecGen3Fn(unsigned, uint32_t, uint32_t,
+                        uint32_t, uint32_t, uint32_t);
+
 /*
  * Include the generated decoder.
  */
@@ -42,22 +46,54 @@
  * Implement all of the translator functions referenced by the decoder.
  */
 
-static void trans_AND_zzz(DisasContext *s, arg_AND_zzz *a, uint32_t insn)
+/* Invoke a vector expander on two Zregs. */
+static void do_vector2_z(DisasContext *s, GVecGen2Fn *gvec_fn,
+                         int esz, int rd, int rn)
 {
-    unsupported_encoding(s, insn);
+    unsigned vsz = vec_full_reg_size(s);
+    gvec_fn(esz, vec_full_reg_offset(s, rd),
+            vec_full_reg_offset(s, rn), vsz, vsz);
 }
 
-static void trans_ORR_zzz(DisasContext *s, arg_ORR_zzz *a, uint32_t insn)
+/* Invoke a vector expander on three Zregs. */
+static void do_vector3_z(DisasContext *s, GVecGen3Fn *gvec_fn,
+                         int esz, int rd, int rn, int rm)
 {
-    unsupported_encoding(s, insn);
+    unsigned vsz = vec_full_reg_size(s);
+    gvec_fn(esz, vec_full_reg_offset(s, rd), vec_full_reg_offset(s, rn),
+            vec_full_reg_offset(s, rm), vsz, vsz);
 }
 
-static void trans_EOR_zzz(DisasContext *s, arg_EOR_zzz *a, uint32_t insn)
+/* Invoke a vector move on two Zregs. */
+static void do_mov_z(DisasContext *s, int rd, int rn)
 {
-    unsupported_encoding(s, insn);
+    do_vector2_z(s, tcg_gen_gvec_mov, 0, rd, rn);
+}
+
+/*
+ *** SVE Logical - Unpredicated Group
+ */
+
+static void trans_AND_zzz(DisasContext *s, arg_rrr_esz *a, uint32_t insn)
+{
+    do_vector3_z(s, tcg_gen_gvec_and, 0, a->rd, a->rn, a->rm);
+}
+
+static void trans_ORR_zzz(DisasContext *s, arg_rrr_esz *a, uint32_t insn)
+{
+    if (a->rn == a->rm) { /* MOV */
+        do_mov_z(s, a->rd, a->rn);
+    } else {
+        do_vector3_z(s, tcg_gen_gvec_or, 0, a->rd, a->rn, a->rm);
+    }
+}
+
+static void trans_EOR_zzz(DisasContext *s, arg_rrr_esz *a, uint32_t insn)
+{
+    do_vector3_z(s, tcg_gen_gvec_xor, 0, a->rd, a->rn, a->rm);
 }
 
 static void trans_BIC_zzz(DisasContext *s, arg_BIC_zzz *a, uint32_t insn)
 {
-    unsupported_encoding(s, insn);
+    do_vector3_z(s, tcg_gen_gvec_andc, 0, a->rd, a->rn, a->rm);
 }

From patchwork Sat Feb 17 18:22:21 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 128677
Delivered-To: patch@linaro.org
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Sat, 17 Feb 2018 10:22:21 -0800
Message-Id: <20180217182323.25885-6-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.14.3
In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org>
References: <20180217182323.25885-1-richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH v2 05/67] target/arm: Implement SVE load vector/predicate
Cc: qemu-arm@nongnu.org

Signed-off-by: Richard Henderson
---
 target/arm/translate-sve.c | 132 +++++++++++++++++++++++++++++++++++++++++++++
 target/arm/sve.decode      |  22 +++++++-
 2 files changed, 153 insertions(+), 1 deletion(-)

--
2.14.3

diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index 50cf2a1fdd..c0cccfda6f 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -46,6 +46,19 @@ typedef void GVecGen3Fn(unsigned, uint32_t, uint32_t,
  * Implement all of the translator functions referenced by the decoder.
  */
 
+/* Return the offset into CPUARMState of the predicate vector register Pn.
+ * Note for this purpose, FFR is P16. */
+static inline int pred_full_reg_offset(DisasContext *s, int regno)
+{
+    return offsetof(CPUARMState, vfp.pregs[regno]);
+}
+
+/* Return the byte size of the whole predicate register, VL / 64. */
+static inline int pred_full_reg_size(DisasContext *s)
+{
+    return s->sve_len >> 3;
+}
+
 /* Invoke a vector expander on two Zregs. */
 static void do_vector2_z(DisasContext *s, GVecGen2Fn *gvec_fn,
                          int esz, int rd, int rn)
@@ -97,3 +110,122 @@ static void trans_BIC_zzz(DisasContext *s, arg_BIC_zzz *a, uint32_t insn)
 {
     do_vector3_z(s, tcg_gen_gvec_andc, 0, a->rd, a->rn, a->rm);
 }
+
+/*
+ *** SVE Memory - 32-bit Gather and Unsized Contiguous Group
+ */
+
+/* Subroutine loading a vector register at VOFS of LEN bytes.
+ * The load should begin at the address Rn + IMM.
+ */
+
+#if UINTPTR_MAX == UINT32_MAX
+# define ptr i32
+#else
+# define ptr i64
+#endif
+
+static void do_ldr(DisasContext *s, uint32_t vofs, uint32_t len,
+                   int rn, int imm)
+{
+    uint32_t len_align = QEMU_ALIGN_DOWN(len, 8);
+    uint32_t len_remain = len % 8;
+    uint32_t nparts = len / 8 + ctpop8(len_remain);
+    int midx = get_mem_index(s);
+    TCGv_i64 addr, t0, t1;
+
+    addr = tcg_temp_new_i64();
+    t0 = tcg_temp_new_i64();
+
+    /* Note that unpredicated load/store of vector/predicate registers
+     * are defined as a stream of bytes, which equates to little-endian
+     * operations on larger quantities. There is no nice way to force
+     * a little-endian load for aarch64_be-linux-user out of line.
+     *
+     * Attempt to keep code expansion to a minimum by limiting the
+     * amount of unrolling done.
+     */
+    if (nparts <= 4) {
+        int i;
+
+        for (i = 0; i < len_align; i += 8) {
+            tcg_gen_addi_i64(addr, cpu_reg_sp(s, rn), imm + i);
+            tcg_gen_qemu_ld_i64(t0, addr, midx, MO_LEQ);
+            tcg_gen_st_i64(t0, cpu_env, vofs + i);
+        }
+    } else {
+        TCGLabel *loop = gen_new_label();
+        TCGv_ptr i = TCGV_NAT_TO_PTR(glue(tcg_const_local_, ptr)(0));
+        TCGv_ptr dest;
+
+        gen_set_label(loop);
+
+        /* Minimize the number of local temps that must be re-read from
+         * the stack each iteration. Instead, re-compute values other
+         * than the loop counter.
+         */
+        dest = tcg_temp_new_ptr();
+        tcg_gen_addi_ptr(dest, i, imm);
+#if UINTPTR_MAX == UINT32_MAX
+        tcg_gen_extu_i32_i64(addr, TCGV_PTR_TO_NAT(dest));
+        tcg_gen_add_i64(addr, addr, cpu_reg_sp(s, rn));
+#else
+        tcg_gen_add_i64(addr, TCGV_PTR_TO_NAT(dest), cpu_reg_sp(s, rn));
+#endif
+
+        tcg_gen_qemu_ld_i64(t0, addr, midx, MO_LEQ);
+
+        tcg_gen_add_ptr(dest, cpu_env, i);
+        tcg_gen_addi_ptr(i, i, 8);
+        tcg_gen_st_i64(t0, dest, vofs);
+        tcg_temp_free_ptr(dest);
+
+        glue(tcg_gen_brcondi_, ptr)(TCG_COND_LTU, TCGV_PTR_TO_NAT(i),
+                                    len_align, loop);
+        tcg_temp_free_ptr(i);
+    }
+
+    /* Predicate register loads can be any multiple of 2.
+     * Note that we still store the entire 64-bit unit into cpu_env.
+     */
+    if (len_remain) {
+        tcg_gen_addi_i64(addr, cpu_reg_sp(s, rn), imm + len_align);
+
+        switch (len_remain) {
+        case 2:
+        case 4:
+        case 8:
+            tcg_gen_qemu_ld_i64(t0, addr, midx, MO_LE | ctz32(len_remain));
+            break;
+
+        case 6:
+            t1 = tcg_temp_new_i64();
+            tcg_gen_qemu_ld_i64(t0, addr, midx, MO_LEUL);
+            tcg_gen_addi_i64(addr, addr, 4);
+            tcg_gen_qemu_ld_i64(t1, addr, midx, MO_LEUW);
+            tcg_gen_deposit_i64(t0, t0, t1, 32, 32);
+            tcg_temp_free_i64(t1);
+            break;
+
+        default:
+            g_assert_not_reached();
+        }
+        tcg_gen_st_i64(t0, cpu_env, vofs + len_align);
+    }
+    tcg_temp_free_i64(addr);
+    tcg_temp_free_i64(t0);
+}
+
+#undef ptr
+
+static void trans_LDR_zri(DisasContext *s, arg_rri *a, uint32_t insn)
+{
+    int size = vec_full_reg_size(s);
+    do_ldr(s, vec_full_reg_offset(s, a->rd), size, a->rn, a->imm * size);
+}
+
+static void trans_LDR_pri(DisasContext *s, arg_rri *a, uint32_t insn)
+{
+    int size = pred_full_reg_size(s);
+    do_ldr(s, pred_full_reg_offset(s, a->rd), size, a->rn, a->imm * size);
+}
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index 2c13a6024a..0c6a7ba34d 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -19,11 +19,17 @@
 # This file is processed by scripts/decodetree.py
 #
 
+###########################################################################
+# Named fields. These are primarily for disjoint fields.
+
+%imm9_16_10 16:s6 10:3
+
 ###########################################################################
 # Named attribute sets. These are used to make nice(er) names
 # when creating helpers common to those for the individual
 # instruction patterns.
 
+&rri rd rn imm
 &rrr_esz rd rn rm esz
 
 ###########################################################################
@@ -31,7 +37,13 @@
 # reduce the amount of duplication between instruction patterns.
 
 # Three operand with unused vector element size
-@rd_rn_rm_e0 ........ ... rm:5 ... ... rn:5 rd:5 &rrr_esz esz=0
+@rd_rn_rm_e0 ........ ... rm:5 ... ... rn:5 rd:5 &rrr_esz esz=0
+
+# Basic Load/Store with 9-bit immediate offset
+@pd_rn_i9 ........ ........ ...... rn:5 . rd:4 \
+        &rri imm=%imm9_16_10
+@rd_rn_i9 ........ ........ ...... rn:5 rd:5 \
+        &rri imm=%imm9_16_10
 
 ###########################################################################
 # Instruction patterns. Grouped according to the SVE encodingindex.xhtml.
@@ -43,3 +55,11 @@ AND_zzz 00000100 00 1 ..... 001 100 ..... ..... @rd_rn_rm_e0
 ORR_zzz 00000100 01 1 ..... 001 100 ..... ..... @rd_rn_rm_e0
 EOR_zzz 00000100 10 1 ..... 001 100 ..... ..... @rd_rn_rm_e0
 BIC_zzz 00000100 11 1 ..... 001 100 ..... ..... @rd_rn_rm_e0
+
+### SVE Memory - 32-bit Gather and Unsized Contiguous Group
+
+# SVE load predicate register
+LDR_pri 10000101 10 ...... 000 ... ..... 0 .... @pd_rn_i9
+
+# SVE load vector register
+LDR_zri 10000101 10 ...... 010 ... ..... ..... @rd_rn_i9

From patchwork Sat Feb 17 18:22:22 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 128681
Delivered-To: patch@linaro.org
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Sat, 17 Feb 2018 10:22:22 -0800
Message-Id: <20180217182323.25885-7-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.14.3
In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org>
References: <20180217182323.25885-1-richard.henderson@linaro.org>
X-Received-From: 2607:f8b0:400e:c05::243 Subject: [Qemu-devel] [PATCH v2 06/67] target/arm: Implement SVE predicate test X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 21 +++++++++++++ target/arm/helper.h | 1 + target/arm/sve_helper.c | 77 ++++++++++++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 62 +++++++++++++++++++++++++++++++++++++ target/arm/Makefile.objs | 2 +- target/arm/sve.decode | 5 +++ 6 files changed, 167 insertions(+), 1 deletion(-) create mode 100644 target/arm/helper-sve.h create mode 100644 target/arm/sve_helper.c -- 2.14.3 Reviewed-by: Peter Maydell diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h new file mode 100644 index 0000000000..b6e91539ae --- /dev/null +++ b/target/arm/helper-sve.h @@ -0,0 +1,21 @@ +/* + * AArch64 SVE specific helper definitions + * + * Copyright (c) 2018 Linaro, Ltd + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library; if not, see . 
+ */ + +DEF_HELPER_FLAGS_2(sve_predtest1, TCG_CALL_NO_WG, i32, i64, i64) +DEF_HELPER_FLAGS_3(sve_predtest, TCG_CALL_NO_WG, i32, ptr, ptr, i32) diff --git a/target/arm/helper.h b/target/arm/helper.h index 6dd8504ec3..be3c2fcdc0 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -567,4 +567,5 @@ DEF_HELPER_FLAGS_2(neon_pmull_64_hi, TCG_CALL_NO_RWG_SE, i64, i64, i64) #ifdef TARGET_AARCH64 #include "helper-a64.h" +#include "helper-sve.h" #endif diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c new file mode 100644 index 0000000000..7d13fd40ed --- /dev/null +++ b/target/arm/sve_helper.c @@ -0,0 +1,77 @@ +/* + * ARM SVE Operations + * + * Copyright (c) 2018 Linaro + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library; if not, see . + */ + +#include "qemu/osdep.h" +#include "cpu.h" +#include "exec/exec-all.h" +#include "exec/cpu_ldst.h" +#include "exec/helper-proto.h" +#include "tcg/tcg-gvec-desc.h" + + +/* Return a value for NZCV as per the ARM PredTest pseudofunction. + * + * The return value has bit 31 set if N is set, bit 1 set if Z is clear, + * and bit 0 set if C is set. + * + * This is an iterative function, called for each Pd and Pg word + * moving forward. + */ + +/* For no G bits set, NZCV = C. */ +#define PREDTEST_INIT 1 + +static uint32_t iter_predtest_fwd(uint64_t d, uint64_t g, uint32_t flags) +{ + if (g) { + /* Compute N from first D & G. 
+ Use bit 2 to signal first G bit seen. */ + if (!(flags & 4)) { + flags |= ((d & (g & -g)) != 0) << 31; + flags |= 4; + } + + /* Accumulate Z from each D & G. */ + flags |= ((d & g) != 0) << 1; + + /* Compute C from last !(D & G). Replace previous. */ + flags = deposit32(flags, 0, 1, (d & pow2floor(g)) == 0); + } + return flags; +} + +/* The same for a single word predicate. */ +uint32_t HELPER(sve_predtest1)(uint64_t d, uint64_t g) +{ + return iter_predtest_fwd(d, g, PREDTEST_INIT); +} + +/* The same for a multi-word predicate. */ +uint32_t HELPER(sve_predtest)(void *vd, void *vg, uint32_t words) +{ + uint32_t flags = PREDTEST_INIT; + uint64_t *d = vd, *g = vg; + uintptr_t i = 0; + + do { + flags = iter_predtest_fwd(d[i], g[i], flags); + } while (++i < words); + + return flags; +} diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index c0cccfda6f..c2e7fac938 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -83,6 +83,43 @@ static void do_mov_z(DisasContext *s, int rd, int rn) do_vector2_z(s, tcg_gen_gvec_mov, 0, rd, rn); } +/* Set the cpu flags as per a return from an SVE helper. */ +static void do_pred_flags(TCGv_i32 t) +{ + tcg_gen_mov_i32(cpu_NF, t); + tcg_gen_andi_i32(cpu_ZF, t, 2); + tcg_gen_andi_i32(cpu_CF, t, 1); + tcg_gen_movi_i32(cpu_VF, 0); +} + +/* Subroutines computing the ARM PredTest pseudofunction.
*/ +static void do_predtest1(TCGv_i64 d, TCGv_i64 g) +{ + TCGv_i32 t = tcg_temp_new_i32(); + + gen_helper_sve_predtest1(t, d, g); + do_pred_flags(t); + tcg_temp_free_i32(t); +} + +static void do_predtest(DisasContext *s, int dofs, int gofs, int words) +{ + TCGv_ptr dptr = tcg_temp_new_ptr(); + TCGv_ptr gptr = tcg_temp_new_ptr(); + TCGv_i32 t; + + tcg_gen_addi_ptr(dptr, cpu_env, dofs); + tcg_gen_addi_ptr(gptr, cpu_env, gofs); + t = tcg_const_i32(words); + + gen_helper_sve_predtest(t, dptr, gptr, t); + tcg_temp_free_ptr(dptr); + tcg_temp_free_ptr(gptr); + + do_pred_flags(t); + tcg_temp_free_i32(t); +} + /* *** SVE Logical - Unpredicated Group */ @@ -111,6 +148,31 @@ static void trans_BIC_zzz(DisasContext *s, arg_BIC_zzz *a, uint32_t insn) do_vector3_z(s, tcg_gen_gvec_andc, 0, a->rd, a->rn, a->rm); } +/* + *** SVE Predicate Misc Group + */ + +void trans_PTEST(DisasContext *s, arg_PTEST *a, uint32_t insn) +{ + int nofs = pred_full_reg_offset(s, a->rn); + int gofs = pred_full_reg_offset(s, a->pg); + int words = DIV_ROUND_UP(pred_full_reg_size(s), 8); + + if (words == 1) { + TCGv_i64 pn = tcg_temp_new_i64(); + TCGv_i64 pg = tcg_temp_new_i64(); + + tcg_gen_ld_i64(pn, cpu_env, nofs); + tcg_gen_ld_i64(pg, cpu_env, gofs); + do_predtest1(pn, pg); + + tcg_temp_free_i64(pn); + tcg_temp_free_i64(pg); + } else { + do_predtest(s, nofs, gofs, words); + } +} + /* *** SVE Memory - 32-bit Gather and Unsized Contiguous Group */ diff --git a/target/arm/Makefile.objs b/target/arm/Makefile.objs index 9934cf1d4d..452ac6f453 100644 --- a/target/arm/Makefile.objs +++ b/target/arm/Makefile.objs @@ -19,4 +19,4 @@ target/arm/decode-sve.inc.c: $(SRC_PATH)/target/arm/sve.decode $(DECODETREE) "GEN", $(TARGET_DIR)$@) target/arm/translate-sve.o: target/arm/decode-sve.inc.c -obj-$(TARGET_AARCH64) += translate-sve.o +obj-$(TARGET_AARCH64) += translate-sve.o sve_helper.o diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 0c6a7ba34d..7efaa8fe8e 100644 --- a/target/arm/sve.decode +++ 
b/target/arm/sve.decode @@ -56,6 +56,11 @@ ORR_zzz 00000100 01 1 ..... 001 100 ..... ..... @rd_rn_rm_e0 EOR_zzz 00000100 10 1 ..... 001 100 ..... ..... @rd_rn_rm_e0 BIC_zzz 00000100 11 1 ..... 001 100 ..... ..... @rd_rn_rm_e0 +### SVE Predicate Misc Group + +# SVE predicate test +PTEST 00100101 01010000 11 pg:4 0 rn:4 00000 + ### SVE Memory - 32-bit Gather and Unsized Contiguous Group # SVE load predicate register From patchwork Sat Feb 17 18:22:23 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 128678 Delivered-To: patch@linaro.org From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:22:23 -0800 Message-Id: <20180217182323.25885-8-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org>
X-Received-From: 2607:f8b0:400e:c05::244 Subject: [Qemu-devel] [PATCH v2 07/67] target/arm: Implement SVE Predicate Logical Operations Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/cpu.h | 4 +- target/arm/helper-sve.h | 10 ++ target/arm/sve_helper.c | 39 ++++++ target/arm/translate-sve.c | 338 ++++++++++++++++++++++++++++++++++++++++++++- target/arm/sve.decode | 16 +++ 5 files changed, 405 insertions(+), 2 deletions(-) -- 2.14.3 Reviewed-by: Peter Maydell diff --git a/target/arm/cpu.h b/target/arm/cpu.h index 70e05f00fe..8befe43a01 100644 --- a/target/arm/cpu.h +++ b/target/arm/cpu.h @@ -527,6 +527,8 @@ typedef struct CPUARMState { #ifdef TARGET_AARCH64 /* Store FFR as pregs[16] to make it easier to treat as any other. */ ARMPredicateReg pregs[17]; + /* Scratch space for aa64 sve predicate temporary. */ + ARMPredicateReg preg_tmp; #endif uint32_t xregs[16]; @@ -534,7 +536,7 @@ typedef struct CPUARMState { int vec_len; int vec_stride; - /* scratch space when Tn are not sufficient. */ + /* Scratch space for aa32 neon expansion. 
*/ uint32_t scratch[8]; /* There are a number of distinct float control structures: diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index b6e91539ae..57adc4d912 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -19,3 +19,13 @@ DEF_HELPER_FLAGS_2(sve_predtest1, TCG_CALL_NO_WG, i32, i64, i64) DEF_HELPER_FLAGS_3(sve_predtest, TCG_CALL_NO_WG, i32, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_sel_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_orr_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_orn_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_nor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_nand_pppp, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 7d13fd40ed..b63e7cc90e 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -75,3 +75,42 @@ uint32_t HELPER(sve_predtest)(void *vd, void *vg, uint32_t words) return flags; } + +#define LOGICAL_PPPP(NAME, FUNC) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, uint32_t desc) \ +{ \ + uintptr_t opr_sz = simd_oprsz(desc); \ + uint64_t *d = vd, *n = vn, *m = vm, *g = vg; \ + uintptr_t i; \ + for (i = 0; i < opr_sz / 8; ++i) { \ + d[i] = FUNC(n[i], m[i], g[i]); \ + } \ +} + +#define DO_AND(N, M, G) (((N) & (M)) & (G)) +#define DO_BIC(N, M, G) (((N) & ~(M)) & (G)) +#define DO_EOR(N, M, G) (((N) ^ (M)) & (G)) +#define DO_ORR(N, M, G) (((N) | (M)) & (G)) +#define DO_ORN(N, M, G) (((N) | ~(M)) & (G)) +#define DO_NOR(N, M, G) (~((N) | (M)) & (G)) +#define DO_NAND(N, M, G) (~((N) & (M)) & (G)) +#define DO_SEL(N, M, G) (((N) & (G)) | ((M) & 
~(G))) + +LOGICAL_PPPP(sve_and_pppp, DO_AND) +LOGICAL_PPPP(sve_bic_pppp, DO_BIC) +LOGICAL_PPPP(sve_eor_pppp, DO_EOR) +LOGICAL_PPPP(sve_sel_pppp, DO_SEL) +LOGICAL_PPPP(sve_orr_pppp, DO_ORR) +LOGICAL_PPPP(sve_orn_pppp, DO_ORN) +LOGICAL_PPPP(sve_nor_pppp, DO_NOR) +LOGICAL_PPPP(sve_nand_pppp, DO_NAND) + +#undef DO_AND +#undef DO_BIC +#undef DO_EOR +#undef DO_ORR +#undef DO_ORN +#undef DO_NOR +#undef DO_NAND +#undef DO_SEL +#undef LOGICAL_PPPP diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index c2e7fac938..405f9397a1 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -59,6 +59,24 @@ static inline int pred_full_reg_size(DisasContext *s) return s->sve_len >> 3; } +/* Round up the size of a predicate register to a size allowed by + * the tcg vector infrastructure. Any operation which uses this + * size may assume that the bits above pred_full_reg_size are zero, + * and must leave them the same way. + * + * Note that this is not needed for the vector registers as they + * are always properly sized for tcg vectors. + */ +static int pred_gvec_reg_size(DisasContext *s) +{ + int size = pred_full_reg_size(s); + if (size <= 8) { + return 8; + } else { + return QEMU_ALIGN_UP(size, 16); + } +} + /* Invoke a vector expander on two Zregs. */ static void do_vector2_z(DisasContext *s, GVecGen2Fn *gvec_fn, int esz, int rd, int rn) @@ -83,6 +101,40 @@ static void do_mov_z(DisasContext *s, int rd, int rn) do_vector2_z(s, tcg_gen_gvec_mov, 0, rd, rn); } +/* Invoke a vector expander on two Pregs. */ +static void do_vector2_p(DisasContext *s, GVecGen2Fn *gvec_fn, + int esz, int rd, int rn) +{ + unsigned psz = pred_gvec_reg_size(s); + gvec_fn(esz, pred_full_reg_offset(s, rd), + pred_full_reg_offset(s, rn), psz, psz); +} + +/* Invoke a vector expander on three Pregs.
*/ +static void do_vector3_p(DisasContext *s, GVecGen3Fn *gvec_fn, + int esz, int rd, int rn, int rm) +{ + unsigned psz = pred_gvec_reg_size(s); + gvec_fn(esz, pred_full_reg_offset(s, rd), pred_full_reg_offset(s, rn), + pred_full_reg_offset(s, rm), psz, psz); +} + +/* Invoke a vector operation on four Pregs. */ +static void do_vecop4_p(DisasContext *s, const GVecGen4 *gvec_op, + int rd, int rn, int rm, int rg) +{ + unsigned psz = pred_gvec_reg_size(s); + tcg_gen_gvec_4(pred_full_reg_offset(s, rd), pred_full_reg_offset(s, rn), + pred_full_reg_offset(s, rm), pred_full_reg_offset(s, rg), + psz, psz, gvec_op); +} + +/* Invoke a vector move on two Pregs. */ +static void do_mov_p(DisasContext *s, int rd, int rn) +{ + do_vector2_p(s, tcg_gen_gvec_mov, 0, rd, rn); +} + /* Set the cpu flags as per a return from an SVE helper. */ static void do_pred_flags(TCGv_i32 t) { @@ -148,11 +200,295 @@ static void trans_BIC_zzz(DisasContext *s, arg_BIC_zzz *a, uint32_t insn) do_vector3_z(s, tcg_gen_gvec_andc, 0, a->rd, a->rn, a->rm); } +/* + *** SVE Predicate Logical Operations Group + */ + +static void do_pppp_flags(DisasContext *s, arg_rprr_s *a, + const GVecGen4 *gvec_op) +{ + unsigned psz = pred_gvec_reg_size(s); + int dofs = pred_full_reg_offset(s, a->rd); + int nofs = pred_full_reg_offset(s, a->rn); + int mofs = pred_full_reg_offset(s, a->rm); + int gofs = pred_full_reg_offset(s, a->pg); + + if (psz == 8) { + /* Do the operation and the flags generation in temps. */ + TCGv_i64 pd = tcg_temp_new_i64(); + TCGv_i64 pn = tcg_temp_new_i64(); + TCGv_i64 pm = tcg_temp_new_i64(); + TCGv_i64 pg = tcg_temp_new_i64(); + + tcg_gen_ld_i64(pn, cpu_env, nofs); + tcg_gen_ld_i64(pm, cpu_env, mofs); + tcg_gen_ld_i64(pg, cpu_env, gofs); + + gvec_op->fni8(pd, pn, pm, pg); + tcg_gen_st_i64(pd, cpu_env, dofs); + + do_predtest1(pd, pg); + + tcg_temp_free_i64(pd); + tcg_temp_free_i64(pn); + tcg_temp_free_i64(pm); + tcg_temp_free_i64(pg); + } else { + /* The operation and flags generation is large. 
The computation + * of the flags depends on the original contents of the guarding + * predicate. If the destination overwrites the guarding predicate, + * then the easiest way to get this right is to save a copy. + */ + int tofs = gofs; + if (a->rd == a->pg) { + tofs = offsetof(CPUARMState, vfp.preg_tmp); + tcg_gen_gvec_mov(0, tofs, gofs, psz, psz); + } + + tcg_gen_gvec_4(dofs, nofs, mofs, gofs, psz, psz, gvec_op); + do_predtest(s, dofs, tofs, psz / 8); + } +} + +static void gen_and_pg_i64(TCGv_i64 pd, TCGv_i64 pn, TCGv_i64 pm, TCGv_i64 pg) +{ + tcg_gen_and_i64(pd, pn, pm); + tcg_gen_and_i64(pd, pd, pg); +} + +static void gen_and_pg_vec(unsigned vece, TCGv_vec pd, TCGv_vec pn, + TCGv_vec pm, TCGv_vec pg) +{ + tcg_gen_and_vec(vece, pd, pn, pm); + tcg_gen_and_vec(vece, pd, pd, pg); +} + +static void trans_AND_pppp(DisasContext *s, arg_rprr_s *a, uint32_t insn) +{ + static const GVecGen4 op = { + .fni8 = gen_and_pg_i64, + .fniv = gen_and_pg_vec, + .fno = gen_helper_sve_and_pppp, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + }; + if (a->s) { + do_pppp_flags(s, a, &op); + } else if (a->pg == a->rn && a->rn == a->rm) { + do_mov_p(s, a->rd, a->rn); + } else if (a->pg == a->rn || a->pg == a->rm) { + do_vector3_p(s, tcg_gen_gvec_and, 0, a->rd, a->rn, a->rm); + } else { + do_vecop4_p(s, &op, a->rd, a->rn, a->rm, a->pg); + } +} + +static void gen_bic_pg_i64(TCGv_i64 pd, TCGv_i64 pn, TCGv_i64 pm, TCGv_i64 pg) +{ + tcg_gen_andc_i64(pd, pn, pm); + tcg_gen_and_i64(pd, pd, pg); +} + +static void gen_bic_pg_vec(unsigned vece, TCGv_vec pd, TCGv_vec pn, + TCGv_vec pm, TCGv_vec pg) +{ + tcg_gen_andc_vec(vece, pd, pn, pm); + tcg_gen_and_vec(vece, pd, pd, pg); +} + +static void trans_BIC_pppp(DisasContext *s, arg_rprr_s *a, uint32_t insn) +{ + static const GVecGen4 op = { + .fni8 = gen_bic_pg_i64, + .fniv = gen_bic_pg_vec, + .fno = gen_helper_sve_bic_pppp, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + }; + if (a->s) { + do_pppp_flags(s, a, &op); + } else if (a->pg == a->rn) { + 
do_vector3_p(s, tcg_gen_gvec_andc, 0, a->rd, a->rn, a->rm); + } else { + do_vecop4_p(s, &op, a->rd, a->rn, a->rm, a->pg); + } +} + +static void gen_eor_pg_i64(TCGv_i64 pd, TCGv_i64 pn, TCGv_i64 pm, TCGv_i64 pg) +{ + tcg_gen_xor_i64(pd, pn, pm); + tcg_gen_and_i64(pd, pd, pg); +} + +static void gen_eor_pg_vec(unsigned vece, TCGv_vec pd, TCGv_vec pn, + TCGv_vec pm, TCGv_vec pg) +{ + tcg_gen_xor_vec(vece, pd, pn, pm); + tcg_gen_and_vec(vece, pd, pd, pg); +} + +static void trans_EOR_pppp(DisasContext *s, arg_rprr_s *a, uint32_t insn) +{ + static const GVecGen4 op = { + .fni8 = gen_eor_pg_i64, + .fniv = gen_eor_pg_vec, + .fno = gen_helper_sve_eor_pppp, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + }; + if (a->s) { + do_pppp_flags(s, a, &op); + } else { + do_vecop4_p(s, &op, a->rd, a->rn, a->rm, a->pg); + } +} + +static void gen_sel_pg_i64(TCGv_i64 pd, TCGv_i64 pn, TCGv_i64 pm, TCGv_i64 pg) +{ + tcg_gen_and_i64(pn, pn, pg); + tcg_gen_andc_i64(pm, pm, pg); + tcg_gen_or_i64(pd, pn, pm); +} + +static void gen_sel_pg_vec(unsigned vece, TCGv_vec pd, TCGv_vec pn, + TCGv_vec pm, TCGv_vec pg) +{ + tcg_gen_and_vec(vece, pn, pn, pg); + tcg_gen_andc_vec(vece, pm, pm, pg); + tcg_gen_or_vec(vece, pd, pn, pm); +} + +static void trans_SEL_pppp(DisasContext *s, arg_rprr_s *a, uint32_t insn) +{ + static const GVecGen4 op = { + .fni8 = gen_sel_pg_i64, + .fniv = gen_sel_pg_vec, + .fno = gen_helper_sve_sel_pppp, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + }; + if (a->s) { + unallocated_encoding(s); + } else { + do_vecop4_p(s, &op, a->rd, a->rn, a->rm, a->pg); + } +} + +static void gen_orr_pg_i64(TCGv_i64 pd, TCGv_i64 pn, TCGv_i64 pm, TCGv_i64 pg) +{ + tcg_gen_or_i64(pd, pn, pm); + tcg_gen_and_i64(pd, pd, pg); +} + +static void gen_orr_pg_vec(unsigned vece, TCGv_vec pd, TCGv_vec pn, + TCGv_vec pm, TCGv_vec pg) +{ + tcg_gen_or_vec(vece, pd, pn, pm); + tcg_gen_and_vec(vece, pd, pd, pg); +} + +static void trans_ORR_pppp(DisasContext *s, arg_rprr_s *a, uint32_t insn) +{ + static const GVecGen4 op 
= { + .fni8 = gen_orr_pg_i64, + .fniv = gen_orr_pg_vec, + .fno = gen_helper_sve_orr_pppp, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + }; + if (a->s) { + do_pppp_flags(s, a, &op); + } else if (a->pg == a->rn && a->rn == a->rm) { + do_mov_p(s, a->rd, a->rn); + } else { + do_vecop4_p(s, &op, a->rd, a->rn, a->rm, a->pg); + } +} + +static void gen_orn_pg_i64(TCGv_i64 pd, TCGv_i64 pn, TCGv_i64 pm, TCGv_i64 pg) +{ + tcg_gen_orc_i64(pd, pn, pm); + tcg_gen_and_i64(pd, pd, pg); +} + +static void gen_orn_pg_vec(unsigned vece, TCGv_vec pd, TCGv_vec pn, + TCGv_vec pm, TCGv_vec pg) +{ + tcg_gen_orc_vec(vece, pd, pn, pm); + tcg_gen_and_vec(vece, pd, pd, pg); +} + +static void trans_ORN_pppp(DisasContext *s, arg_rprr_s *a, uint32_t insn) +{ + static const GVecGen4 op = { + .fni8 = gen_orn_pg_i64, + .fniv = gen_orn_pg_vec, + .fno = gen_helper_sve_orn_pppp, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + }; + if (a->s) { + do_pppp_flags(s, a, &op); + } else { + do_vecop4_p(s, &op, a->rd, a->rn, a->rm, a->pg); + } +} + +static void gen_nor_pg_i64(TCGv_i64 pd, TCGv_i64 pn, TCGv_i64 pm, TCGv_i64 pg) +{ + tcg_gen_or_i64(pd, pn, pm); + tcg_gen_andc_i64(pd, pg, pd); +} + +static void gen_nor_pg_vec(unsigned vece, TCGv_vec pd, TCGv_vec pn, + TCGv_vec pm, TCGv_vec pg) +{ + tcg_gen_or_vec(vece, pd, pn, pm); + tcg_gen_andc_vec(vece, pd, pg, pd); +} + +static void trans_NOR_pppp(DisasContext *s, arg_rprr_s *a, uint32_t insn) +{ + static const GVecGen4 op = { + .fni8 = gen_nor_pg_i64, + .fniv = gen_nor_pg_vec, + .fno = gen_helper_sve_nor_pppp, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + }; + if (a->s) { + do_pppp_flags(s, a, &op); + } else { + do_vecop4_p(s, &op, a->rd, a->rn, a->rm, a->pg); + } +} + +static void gen_nand_pg_i64(TCGv_i64 pd, TCGv_i64 pn, TCGv_i64 pm, TCGv_i64 pg) +{ + tcg_gen_and_i64(pd, pn, pm); + tcg_gen_andc_i64(pd, pg, pd); +} + +static void gen_nand_pg_vec(unsigned vece, TCGv_vec pd, TCGv_vec pn, + TCGv_vec pm, TCGv_vec pg) +{ + tcg_gen_and_vec(vece, pd, pn, pm); + 
tcg_gen_andc_vec(vece, pd, pg, pd); +} + +static void trans_NAND_pppp(DisasContext *s, arg_rprr_s *a, uint32_t insn) +{ + static const GVecGen4 op = { + .fni8 = gen_nand_pg_i64, + .fniv = gen_nand_pg_vec, + .fno = gen_helper_sve_nand_pppp, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + }; + if (a->s) { + do_pppp_flags(s, a, &op); + } else { + do_vecop4_p(s, &op, a->rd, a->rn, a->rm, a->pg); + } +} + /* *** SVE Predicate Misc Group */ -void trans_PTEST(DisasContext *s, arg_PTEST *a, uint32_t insn) +static void trans_PTEST(DisasContext *s, arg_PTEST *a, uint32_t insn) { int nofs = pred_full_reg_offset(s, a->rn); int gofs = pred_full_reg_offset(s, a->pg); diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 7efaa8fe8e..d92886127a 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -31,6 +31,7 @@ &rri rd rn imm &rrr_esz rd rn rm esz +&rprr_s rd pg rn rm s ########################################################################### # Named instruction formats. These are generally used to @@ -39,6 +40,9 @@ # Three operand with unused vector element size @rd_rn_rm_e0 ........ ... rm:5 ... ... rn:5 rd:5 &rrr_esz esz=0 +# Three predicate operand, with governing predicate, flag setting +@pd_pg_pn_pm_s ........ . s:1 .. rm:4 .. pg:4 . rn:4 . rd:4 &rprr_s + # Basic Load/Store with 9-bit immediate offset @pd_rn_i9 ........ ........ ...... rn:5 . rd:4 \ &rri imm=%imm9_16_10 @@ -56,6 +60,18 @@ ORR_zzz 00000100 01 1 ..... 001 100 ..... ..... @rd_rn_rm_e0 EOR_zzz 00000100 10 1 ..... 001 100 ..... ..... @rd_rn_rm_e0 BIC_zzz 00000100 11 1 ..... 001 100 ..... ..... @rd_rn_rm_e0 +### SVE Predicate Logical Operations Group + +# SVE predicate logical operations +AND_pppp 00100101 0. 00 .... 01 .... 0 .... 0 .... @pd_pg_pn_pm_s +BIC_pppp 00100101 0. 00 .... 01 .... 0 .... 1 .... @pd_pg_pn_pm_s +EOR_pppp 00100101 0. 00 .... 01 .... 1 .... 0 .... @pd_pg_pn_pm_s +SEL_pppp 00100101 0. 00 .... 01 .... 1 .... 1 .... @pd_pg_pn_pm_s +ORR_pppp 00100101 1. 00 .... 01 ....
0 .... 0 .... @pd_pg_pn_pm_s +ORN_pppp 00100101 1. 00 .... 01 .... 0 .... 1 .... @pd_pg_pn_pm_s +NOR_pppp 00100101 1. 00 .... 01 .... 1 .... 0 .... @pd_pg_pn_pm_s +NAND_pppp 00100101 1. 00 .... 01 .... 1 .... 1 .... @pd_pg_pn_pm_s + ### SVE Predicate Misc Group # SVE predicate test From patchwork Sat Feb 17 18:22:24 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 128686 Delivered-To: patch@linaro.org From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:22:24 -0800 Message-Id: <20180217182323.25885-9-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org>
X-Received-From: 2607:f8b0:400e:c05::244 Subject: [Qemu-devel] [PATCH v2 08/67] target/arm: Implement SVE Predicate Misc Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/cpu.h | 3 + target/arm/helper-sve.h | 3 + target/arm/sve_helper.c | 86 +++++++++++++++++++++++- target/arm/translate-sve.c | 163 ++++++++++++++++++++++++++++++++++++++++++++- target/arm/sve.decode | 41 ++++++++++++ 5 files changed, 293 insertions(+), 3 deletions(-) -- 2.14.3 Reviewed-by: Peter Maydell diff --git a/target/arm/cpu.h b/target/arm/cpu.h index 8befe43a01..27f395183b 100644 --- a/target/arm/cpu.h +++ b/target/arm/cpu.h @@ -2915,4 +2915,7 @@ static inline uint64_t *aa64_vfp_qreg(CPUARMState *env, unsigned regno) return &env->vfp.zregs[regno].d[0]; } +/* Shared between translate-sve.c and sve_helper.c. 
*/ +extern const uint64_t pred_esz_masks[4]; + #endif diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 57adc4d912..0c04afff8c 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -20,6 +20,9 @@ DEF_HELPER_FLAGS_2(sve_predtest1, TCG_CALL_NO_WG, i32, i64, i64) DEF_HELPER_FLAGS_3(sve_predtest, TCG_CALL_NO_WG, i32, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_pfirst, TCG_CALL_NO_WG, i32, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_pnext, TCG_CALL_NO_WG, i32, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index b63e7cc90e..cee7d9bcf6 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -39,7 +39,7 @@ static uint32_t iter_predtest_fwd(uint64_t d, uint64_t g, uint32_t flags) { - if (g) { + if (likely(g)) { /* Compute N from first D & G. Use bit 2 to signal first G bit seen. */ if (!(flags & 4)) { @@ -114,3 +114,87 @@ LOGICAL_PPPP(sve_nand_pppp, DO_NAND) #undef DO_NAND #undef DO_SEL #undef LOGICAL_PPPP + +/* Similar to the ARM LastActiveElement pseudocode function, except the + result is multiplied by the element size. This includes the not found + indication; e.g. not found for esz=3 is -8. 
*/ +static intptr_t last_active_element(uint64_t *g, intptr_t words, intptr_t esz) +{ + uint64_t mask = pred_esz_masks[esz]; + intptr_t i = words; + + do { + uint64_t this_g = g[--i] & mask; + if (this_g) { + return i * 64 + (63 - clz64(this_g)); + } + } while (i > 0); + return (intptr_t)-1 << esz; +} + +uint32_t HELPER(sve_pfirst)(void *vd, void *vg, uint32_t words) +{ + uint32_t flags = PREDTEST_INIT; + uint64_t *d = vd, *g = vg; + intptr_t i = 0; + + do { + uint64_t this_d = d[i]; + uint64_t this_g = g[i]; + + if (this_g) { + if (!(flags & 4)) { + /* Set in D the first bit of G. */ + this_d |= this_g & -this_g; + d[i] = this_d; + } + flags = iter_predtest_fwd(this_d, this_g, flags); + } + } while (++i < words); + + return flags; +} + +uint32_t HELPER(sve_pnext)(void *vd, void *vg, uint32_t pred_desc) +{ + intptr_t words = extract32(pred_desc, 0, SIMD_OPRSZ_BITS); + intptr_t esz = extract32(pred_desc, SIMD_DATA_SHIFT, 2); + uint32_t flags = PREDTEST_INIT; + uint64_t *d = vd, *g = vg, esz_mask; + intptr_t i, next; + + next = last_active_element(vd, words, esz) + (1 << esz); + esz_mask = pred_esz_masks[esz]; + + /* Similar to the pseudocode for pnext, but scaled by ESZ + so that we find the correct bit. 
*/ + if (next < words * 64) { + uint64_t mask = -1; + + if (next & 63) { + mask = ~((1ull << (next & 63)) - 1); + next &= -64; + } + do { + uint64_t this_g = g[next / 64] & esz_mask & mask; + if (this_g != 0) { + next = (next & -64) + ctz64(this_g); + break; + } + next += 64; + mask = -1; + } while (next < words * 64); + } + + i = 0; + do { + uint64_t this_d = 0; + if (i == next / 64) { + this_d = 1ull << (next & 63); + } + d[i] = this_d; + flags = iter_predtest_fwd(this_d, g[i] & esz_mask, flags); + } while (++i < words); + + return flags; +} diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 405f9397a1..a9b6ae046d 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -22,6 +22,7 @@ #include "exec/exec-all.h" #include "tcg-op.h" #include "tcg-op-gvec.h" +#include "tcg-gvec-desc.h" #include "qemu/log.h" #include "arm_ldst.h" #include "translate.h" @@ -67,9 +68,8 @@ static inline int pred_full_reg_size(DisasContext *s) * Note that this is not needed for the vector registers as they * are always properly sized for tcg vectors. */ -static int pred_gvec_reg_size(DisasContext *s) +static int size_for_gvec(int size) { - int size = pred_full_reg_size(s); if (size <= 8) { return 8; } else { @@ -77,6 +77,11 @@ static int pred_gvec_reg_size(DisasContext *s) } } +static int pred_gvec_reg_size(DisasContext *s) +{ + return size_for_gvec(pred_full_reg_size(s)); +} + /* Invoke a vector expander on two Zregs. */ static void do_vector2_z(DisasContext *s, GVecGen2Fn *gvec_fn, int esz, int rd, int rn) @@ -172,6 +177,12 @@ static void do_predtest(DisasContext *s, int dofs, int gofs, int words) tcg_temp_free_i32(t); } +/* For each element size, the bits within a predicate word that are active. 
*/ +const uint64_t pred_esz_masks[4] = { + 0xffffffffffffffffull, 0x5555555555555555ull, + 0x1111111111111111ull, 0x0101010101010101ull +}; + /* *** SVE Logical - Unpredicated Group */ @@ -509,6 +520,154 @@ static void trans_PTEST(DisasContext *s, arg_PTEST *a, uint32_t insn) } } +/* See the ARM pseudocode DecodePredCount. */ +static unsigned decode_pred_count(unsigned fullsz, int pattern, int esz) +{ + unsigned elements = fullsz >> esz; + unsigned bound; + + switch (pattern) { + case 0x0: /* POW2 */ + return pow2floor(elements); + case 0x1: /* VL1 */ + case 0x2: /* VL2 */ + case 0x3: /* VL3 */ + case 0x4: /* VL4 */ + case 0x5: /* VL5 */ + case 0x6: /* VL6 */ + case 0x7: /* VL7 */ + case 0x8: /* VL8 */ + bound = pattern; + break; + case 0x9: /* VL16 */ + case 0xa: /* VL32 */ + case 0xb: /* VL64 */ + case 0xc: /* VL128 */ + case 0xd: /* VL256 */ + bound = 16 << (pattern - 9); + break; + case 0x1d: /* MUL4 */ + return elements - elements % 4; + case 0x1e: /* MUL3 */ + return elements - elements % 3; + case 0x1f: /* ALL */ + return elements; + default: /* #uimm5 */ + return 0; + } + return elements >= bound ? bound : 0; +} + +static void trans_PTRUE(DisasContext *s, arg_PTRUE *a, uint32_t insn) +{ + unsigned fullsz = vec_full_reg_size(s); + unsigned ofs = pred_full_reg_offset(s, a->rd); + unsigned numelem, setsz, i; + uint64_t word, lastword; + TCGv_i64 t; + + numelem = decode_pred_count(fullsz, a->pat, a->esz); + + /* Determine what we must store into each bit, and how many. 
*/ + if (numelem == 0) { + lastword = word = 0; + setsz = fullsz; + } else { + setsz = numelem << a->esz; + lastword = word = pred_esz_masks[a->esz]; + if (setsz % 64) { + lastword &= ~(-1ull << (setsz % 64)); + } + } + + t = tcg_temp_new_i64(); + if (fullsz <= 64) { + tcg_gen_movi_i64(t, lastword); + tcg_gen_st_i64(t, cpu_env, ofs); + goto done; + } + + if (word == lastword) { + unsigned maxsz = size_for_gvec(fullsz / 8); + unsigned oprsz = size_for_gvec(setsz / 8); + + if (oprsz * 8 == setsz) { + tcg_gen_gvec_dup64i(ofs, oprsz, maxsz, word); + goto done; + } + if (oprsz * 8 == setsz + 8) { + tcg_gen_gvec_dup64i(ofs, oprsz, maxsz, word); + tcg_gen_movi_i64(t, 0); + tcg_gen_st_i64(t, cpu_env, ofs + oprsz - 8); + goto done; + } + } + + setsz /= 8; + fullsz /= 8; + + tcg_gen_movi_i64(t, word); + for (i = 0; i < setsz; i += 8) { + tcg_gen_st_i64(t, cpu_env, ofs + i); + } + if (lastword != word) { + tcg_gen_movi_i64(t, lastword); + tcg_gen_st_i64(t, cpu_env, ofs + i); + i += 8; + } + if (i < fullsz) { + tcg_gen_movi_i64(t, 0); + for (; i < fullsz; i += 8) { + tcg_gen_st_i64(t, cpu_env, ofs + i); + } + } + + done: + tcg_temp_free_i64(t); + + /* PTRUES */ + if (a->s) { + tcg_gen_movi_i32(cpu_NF, -(word != 0)); + tcg_gen_movi_i32(cpu_CF, word == 0); + tcg_gen_movi_i32(cpu_VF, 0); + tcg_gen_mov_i32(cpu_ZF, cpu_NF); + } +} + +static void do_pfirst_pnext(DisasContext *s, arg_rr_esz *a, + void (*gen_fn)(TCGv_i32, TCGv_ptr, + TCGv_ptr, TCGv_i32)) +{ + TCGv_ptr t_pd = tcg_temp_new_ptr(); + TCGv_ptr t_pg = tcg_temp_new_ptr(); + TCGv_i32 t; + unsigned desc; + + desc = DIV_ROUND_UP(pred_full_reg_size(s), 8); + desc = deposit32(desc, SIMD_DATA_SHIFT, 2, a->esz); + + tcg_gen_addi_ptr(t_pd, cpu_env, pred_full_reg_offset(s, a->rd)); + tcg_gen_addi_ptr(t_pg, cpu_env, pred_full_reg_offset(s, a->rn)); + t = tcg_const_i32(desc); + + gen_fn(t, t_pd, t_pg, t); + tcg_temp_free_ptr(t_pd); + tcg_temp_free_ptr(t_pg); + + do_pred_flags(t); + tcg_temp_free_i32(t); +} + +static void 
trans_PFIRST(DisasContext *s, arg_rr_esz *a, uint32_t insn) +{ + do_pfirst_pnext(s, a, gen_helper_sve_pfirst); +} + +static void trans_PNEXT(DisasContext *s, arg_rr_esz *a, uint32_t insn) +{ + do_pfirst_pnext(s, a, gen_helper_sve_pnext); +} + /* *** SVE Memory - 32-bit Gather and Unsized Contiguous Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index d92886127a..2e27ef41cd 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -23,20 +23,30 @@ # Named fields. These are primarily for disjoint fields. %imm9_16_10 16:s6 10:3 +%preg4_5 5:4 ########################################################################### # Named attribute sets. These are used to make nice(er) names # when creating helpers common to those for the individual # instruction patterns. +&rr_esz rd rn esz &rri rd rn imm &rrr_esz rd rn rm esz &rprr_s rd pg rn rm s +&ptrue rd esz pat s + ########################################################################### # Named instruction formats. These are generally used to # reduce the amount of duplication between instruction patterns. +# Two operand with unused vector element size +@pd_pn_e0 ........ ........ ....... rn:4 . rd:4 &rr_esz esz=0 + +# Two operand +@pd_pn ........ esz:2 .. .... ....... rn:4 . rd:4 &rr_esz + # Three operand with unused vector element size @rd_rn_rm_e0 ........ ... rm:5 ... ... rn:5 rd:5 &rrr_esz esz=0 @@ -77,6 +87,37 @@ NAND_pppp 00100101 1. 00 .... 01 .... 1 .... 1 .... @pd_pg_pn_pm_s # SVE predicate test PTEST 00100101 01010000 11 pg:4 0 rn:4 00000 +# SVE predicate initialize +PTRUE 00100101 esz:2 01100 s:1 111000 pat:5 0 rd:4 &ptrue + +# SVE initialize FFR (SETFFR) +PTRUE 00100101 0010 1100 1001 0000 0000 0000 \ + &ptrue rd=16 esz=0 pat=31 s=0 + +# SVE zero predicate register (PFALSE) +# Note that pat=32 is outside of the natural 0..31, and will +# always hit the default #uimm5 case of decode_pred_count. 
+PTRUE 00100101 0001 1000 1110 0100 0000 rd:4 \ + &ptrue esz=0 pat=32 s=0 + +# SVE predicate read from FFR (predicated) (RDFFR) +ORR_pppp 00100101 0 s:1 0110001111000 pg:4 0 rd:4 \ + &rprr_s rn=16 rm=16 + +# SVE predicate read from FFR (unpredicated) (RDFFR) +ORR_pppp 00100101 0001 1001 1111 0000 0000 rd:4 \ + &rprr_s rn=16 rm=16 pg=16 s=0 + +# SVE FFR write from predicate (WRFFR) +ORR_pppp 00100101 0010 1000 1001 000 rn:4 00000 \ + &rprr_s rd=16 rm=%preg4_5 pg=%preg4_5 s=0 + +# SVE predicate first active +PFIRST 00100101 01 011 000 11000 00 .... 0 .... @pd_pn_e0 + +# SVE predicate next active +PNEXT 00100101 .. 011 001 11000 10 .... 0 .... @pd_pn + ### SVE Memory - 32-bit Gather and Unsized Contiguous Group # SVE load predicate register
From patchwork Sat Feb 17 18:22:25 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 128683 From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:22:25 -0800
Message-Id: <20180217182323.25885-10-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c05::242 Subject: [Qemu-devel] [PATCH v2 09/67] target/arm: Implement SVE Integer Binary Arithmetic - Predicated Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 145 +++++++++++++++++++++++++++++++++ target/arm/sve_helper.c | 196 ++++++++++++++++++++++++++++++++++++++++++++- target/arm/translate-sve.c | 65 +++++++++++++++ target/arm/sve.decode | 42 ++++++++++ 4 files changed, 447 insertions(+), 1 deletion(-) -- 2.14.3 diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 0c04afff8c..5b82ba1501 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -23,6 +23,151 @@ DEF_HELPER_FLAGS_3(sve_predtest, TCG_CALL_NO_WG, i32, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve_pfirst, TCG_CALL_NO_WG, i32, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve_pnext, TCG_CALL_NO_WG, i32, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_and_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_and_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_and_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_and_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_eor_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_eor_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_eor_zpzz_s, 
TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_eor_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_orr_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_orr_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_orr_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_orr_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_bic_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_bic_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_bic_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_bic_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_add_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_add_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_add_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_add_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_sub_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_sub_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_sub_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_sub_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_smax_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_smax_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_smax_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_smax_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_umax_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_umax_zpzz_h, 
TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_umax_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_umax_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_smin_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_smin_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_smin_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_smin_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_umin_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_umin_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_umin_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_umin_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_sabd_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_sabd_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_sabd_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_sabd_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_uabd_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_uabd_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_uabd_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_uabd_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_mul_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_mul_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_mul_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_mul_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + 
+DEF_HELPER_FLAGS_5(sve_smulh_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_smulh_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_smulh_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_smulh_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_umulh_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_umulh_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_umulh_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_umulh_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_sdiv_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_sdiv_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_udiv_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_udiv_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index cee7d9bcf6..26c177c2fd 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -25,6 +25,22 @@ #include "tcg/tcg-gvec-desc.h" +/* Note that vector data is stored in host-endian 64-bit chunks, + so addressing units smaller than that needs a host-endian fixup. */ +#ifdef HOST_WORDS_BIGENDIAN +#define H1(x) ((x) ^ 7) +#define H1_2(x) ((x) ^ 6) +#define H1_4(x) ((x) ^ 4) +#define H2(x) ((x) ^ 3) +#define H4(x) ((x) ^ 1) +#else +#define H1(x) (x) +#define H1_2(x) (x) +#define H1_4(x) (x) +#define H2(x) (x) +#define H4(x) (x) +#endif + /* Return a value for NZCV as per the ARM PredTest pseudofunction. 
* * The return value has bit 31 set if N is set, bit 1 set if Z is clear, @@ -105,7 +121,7 @@ LOGICAL_PPPP(sve_orn_pppp, DO_ORN) LOGICAL_PPPP(sve_nor_pppp, DO_NOR) LOGICAL_PPPP(sve_nand_pppp, DO_NAND) -#undef DO_ADD +#undef DO_AND #undef DO_BIC #undef DO_EOR #undef DO_ORR @@ -115,6 +131,184 @@ LOGICAL_PPPP(sve_nand_pppp, DO_NAND) #undef DO_SEL #undef LOGICAL_PPPP +/* Fully general three-operand expander, controlled by a predicate. + * This is complicated by the host-endian storage of the register file. + */ +/* ??? I don't expect the compiler could ever vectorize this itself. + * With some tables we can convert bit masks to byte masks, and with + * extra care wrt byte/word ordering we could use gcc generic vectors + * and do 16 bytes at a time. + */ +#define DO_ZPZZ(NAME, TYPE, H, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, uint32_t desc) \ +{ \ + intptr_t i, opr_sz = simd_oprsz(desc); \ + for (i = 0; i < opr_sz; ) { \ + uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \ + do { \ + if (pg & 1) { \ + TYPE nn = *(TYPE *)(vn + H(i)); \ + TYPE mm = *(TYPE *)(vm + H(i)); \ + *(TYPE *)(vd + H(i)) = OP(nn, mm); \ + } \ + i += sizeof(TYPE), pg >>= sizeof(TYPE); \ + } while (i & 15); \ + } \ +} + +/* Similarly, specialized for 64-bit operands. */ +#define DO_ZPZZ_D(NAME, TYPE, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, uint32_t desc) \ +{ \ + intptr_t i, opr_sz = simd_oprsz(desc) / 8; \ + TYPE *d = vd, *n = vn, *m = vm; \ + uint8_t *pg = vg; \ + for (i = 0; i < opr_sz; i += 1) { \ + if (pg[H1(i)] & 1) { \ + TYPE nn = n[i], mm = m[i]; \ + d[i] = OP(nn, mm); \ + } \ + } \ +} + +#define DO_AND(N, M) (N & M) +#define DO_EOR(N, M) (N ^ M) +#define DO_ORR(N, M) (N | M) +#define DO_BIC(N, M) (N & ~M) +#define DO_ADD(N, M) (N + M) +#define DO_SUB(N, M) (N - M) +#define DO_MAX(N, M) ((N) >= (M) ? (N) : (M)) +#define DO_MIN(N, M) ((N) >= (M) ? (M) : (N)) +#define DO_ABD(N, M) ((N) >= (M) ? 
(N) - (M) : (M) - (N)) +#define DO_MUL(N, M) (N * M) +#define DO_DIV(N, M) (M ? N / M : 0) + +DO_ZPZZ(sve_and_zpzz_b, uint8_t, H1, DO_AND) +DO_ZPZZ(sve_and_zpzz_h, uint16_t, H1_2, DO_AND) +DO_ZPZZ(sve_and_zpzz_s, uint32_t, H1_4, DO_AND) +DO_ZPZZ_D(sve_and_zpzz_d, uint64_t, DO_AND) + +DO_ZPZZ(sve_orr_zpzz_b, uint8_t, H1, DO_ORR) +DO_ZPZZ(sve_orr_zpzz_h, uint16_t, H1_2, DO_ORR) +DO_ZPZZ(sve_orr_zpzz_s, uint32_t, H1_4, DO_ORR) +DO_ZPZZ_D(sve_orr_zpzz_d, uint64_t, DO_ORR) + +DO_ZPZZ(sve_eor_zpzz_b, uint8_t, H1, DO_EOR) +DO_ZPZZ(sve_eor_zpzz_h, uint16_t, H1_2, DO_EOR) +DO_ZPZZ(sve_eor_zpzz_s, uint32_t, H1_4, DO_EOR) +DO_ZPZZ_D(sve_eor_zpzz_d, uint64_t, DO_EOR) + +DO_ZPZZ(sve_bic_zpzz_b, uint8_t, H1, DO_BIC) +DO_ZPZZ(sve_bic_zpzz_h, uint16_t, H1_2, DO_BIC) +DO_ZPZZ(sve_bic_zpzz_s, uint32_t, H1_4, DO_BIC) +DO_ZPZZ_D(sve_bic_zpzz_d, uint64_t, DO_BIC) + +DO_ZPZZ(sve_add_zpzz_b, uint8_t, H1, DO_ADD) +DO_ZPZZ(sve_add_zpzz_h, uint16_t, H1_2, DO_ADD) +DO_ZPZZ(sve_add_zpzz_s, uint32_t, H1_4, DO_ADD) +DO_ZPZZ_D(sve_add_zpzz_d, uint64_t, DO_ADD) + +DO_ZPZZ(sve_sub_zpzz_b, uint8_t, H1, DO_SUB) +DO_ZPZZ(sve_sub_zpzz_h, uint16_t, H1_2, DO_SUB) +DO_ZPZZ(sve_sub_zpzz_s, uint32_t, H1_4, DO_SUB) +DO_ZPZZ_D(sve_sub_zpzz_d, uint64_t, DO_SUB) + +DO_ZPZZ(sve_smax_zpzz_b, int8_t, H1, DO_MAX) +DO_ZPZZ(sve_smax_zpzz_h, int16_t, H1_2, DO_MAX) +DO_ZPZZ(sve_smax_zpzz_s, int32_t, H1_4, DO_MAX) +DO_ZPZZ_D(sve_smax_zpzz_d, int64_t, DO_MAX) + +DO_ZPZZ(sve_umax_zpzz_b, uint8_t, H1, DO_MAX) +DO_ZPZZ(sve_umax_zpzz_h, uint16_t, H1_2, DO_MAX) +DO_ZPZZ(sve_umax_zpzz_s, uint32_t, H1_4, DO_MAX) +DO_ZPZZ_D(sve_umax_zpzz_d, uint64_t, DO_MAX) + +DO_ZPZZ(sve_smin_zpzz_b, int8_t, H1, DO_MIN) +DO_ZPZZ(sve_smin_zpzz_h, int16_t, H1_2, DO_MIN) +DO_ZPZZ(sve_smin_zpzz_s, int32_t, H1_4, DO_MIN) +DO_ZPZZ_D(sve_smin_zpzz_d, int64_t, DO_MIN) + +DO_ZPZZ(sve_umin_zpzz_b, uint8_t, H1, DO_MIN) +DO_ZPZZ(sve_umin_zpzz_h, uint16_t, H1_2, DO_MIN) +DO_ZPZZ(sve_umin_zpzz_s, uint32_t, H1_4, DO_MIN) +DO_ZPZZ_D(sve_umin_zpzz_d, 
uint64_t, DO_MIN) + +DO_ZPZZ(sve_sabd_zpzz_b, int8_t, H1, DO_ABD) +DO_ZPZZ(sve_sabd_zpzz_h, int16_t, H1_2, DO_ABD) +DO_ZPZZ(sve_sabd_zpzz_s, int32_t, H1_4, DO_ABD) +DO_ZPZZ_D(sve_sabd_zpzz_d, int64_t, DO_ABD) + +DO_ZPZZ(sve_uabd_zpzz_b, uint8_t, H1, DO_ABD) +DO_ZPZZ(sve_uabd_zpzz_h, uint16_t, H1_2, DO_ABD) +DO_ZPZZ(sve_uabd_zpzz_s, uint32_t, H1_4, DO_ABD) +DO_ZPZZ_D(sve_uabd_zpzz_d, uint64_t, DO_ABD) + +/* Because the computation type is at least twice as large as required, + these work for both signed and unsigned source types. */ +static inline uint8_t do_mulh_b(int32_t n, int32_t m) +{ + return (n * m) >> 8; +} + +static inline uint16_t do_mulh_h(int32_t n, int32_t m) +{ + return (n * m) >> 16; +} + +static inline uint32_t do_mulh_s(int64_t n, int64_t m) +{ + return (n * m) >> 32; +} + +static inline uint64_t do_smulh_d(uint64_t n, uint64_t m) +{ + uint64_t lo, hi; + muls64(&lo, &hi, n, m); + return hi; +} + +static inline uint64_t do_umulh_d(uint64_t n, uint64_t m) +{ + uint64_t lo, hi; + mulu64(&lo, &hi, n, m); + return hi; +} + +DO_ZPZZ(sve_mul_zpzz_b, uint8_t, H1, DO_MUL) +DO_ZPZZ(sve_mul_zpzz_h, uint16_t, H1_2, DO_MUL) +DO_ZPZZ(sve_mul_zpzz_s, uint32_t, H1_4, DO_MUL) +DO_ZPZZ_D(sve_mul_zpzz_d, uint64_t, DO_MUL) + +DO_ZPZZ(sve_smulh_zpzz_b, int8_t, H1, do_mulh_b) +DO_ZPZZ(sve_smulh_zpzz_h, int16_t, H1_2, do_mulh_h) +DO_ZPZZ(sve_smulh_zpzz_s, int32_t, H1_4, do_mulh_s) +DO_ZPZZ_D(sve_smulh_zpzz_d, uint64_t, do_smulh_d) + +DO_ZPZZ(sve_umulh_zpzz_b, uint8_t, H1, do_mulh_b) +DO_ZPZZ(sve_umulh_zpzz_h, uint16_t, H1_2, do_mulh_h) +DO_ZPZZ(sve_umulh_zpzz_s, uint32_t, H1_4, do_mulh_s) +DO_ZPZZ_D(sve_umulh_zpzz_d, uint64_t, do_umulh_d) + +DO_ZPZZ(sve_sdiv_zpzz_s, int32_t, H1_4, DO_DIV) +DO_ZPZZ_D(sve_sdiv_zpzz_d, int64_t, DO_DIV) + +DO_ZPZZ(sve_udiv_zpzz_s, uint32_t, H1_4, DO_DIV) +DO_ZPZZ_D(sve_udiv_zpzz_d, uint64_t, DO_DIV) + +#undef DO_AND +#undef DO_ORR +#undef DO_EOR +#undef DO_BIC +#undef DO_ADD +#undef DO_SUB +#undef DO_MAX +#undef DO_MIN +#undef DO_ABD +#undef 
DO_MUL +#undef DO_DIV +#undef DO_ZPZZ +#undef DO_ZPZZ_D + /* Similar to the ARM LastActiveElement pseudocode function, except the result is multiplied by the element size. This includes the not found indication; e.g. not found for esz=3 is -8. */ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index a9b6ae046d..116002792a 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -211,6 +211,71 @@ static void trans_BIC_zzz(DisasContext *s, arg_BIC_zzz *a, uint32_t insn) do_vector3_z(s, tcg_gen_gvec_andc, 0, a->rd, a->rn, a->rm); } +/* + *** SVE Integer Arithmetic - Binary Predicated Group + */ + +static void do_zpzz_ool(DisasContext *s, arg_rprr_esz *a, gen_helper_gvec_4 *fn) +{ + unsigned vsz = vec_full_reg_size(s); + if (fn == NULL) { + unallocated_encoding(s); + return; + } + tcg_gen_gvec_4_ool(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + pred_full_reg_offset(s, a->pg), + vsz, vsz, 0, fn); +} + +#define DO_ZPZZ(NAME, name) \ +void trans_##NAME##_zpzz(DisasContext *s, arg_rprr_esz *a, uint32_t insn) \ +{ \ + static gen_helper_gvec_4 * const fns[4] = { \ + gen_helper_sve_##name##_zpzz_b, gen_helper_sve_##name##_zpzz_h, \ + gen_helper_sve_##name##_zpzz_s, gen_helper_sve_##name##_zpzz_d, \ + }; \ + do_zpzz_ool(s, a, fns[a->esz]); \ +} + +DO_ZPZZ(AND, and) +DO_ZPZZ(EOR, eor) +DO_ZPZZ(ORR, orr) +DO_ZPZZ(BIC, bic) + +DO_ZPZZ(ADD, add) +DO_ZPZZ(SUB, sub) + +DO_ZPZZ(SMAX, smax) +DO_ZPZZ(UMAX, umax) +DO_ZPZZ(SMIN, smin) +DO_ZPZZ(UMIN, umin) +DO_ZPZZ(SABD, sabd) +DO_ZPZZ(UABD, uabd) + +DO_ZPZZ(MUL, mul) +DO_ZPZZ(SMULH, smulh) +DO_ZPZZ(UMULH, umulh) + +void trans_SDIV_zpzz(DisasContext *s, arg_rprr_esz *a, uint32_t insn) +{ + static gen_helper_gvec_4 * const fns[4] = { + NULL, NULL, gen_helper_sve_sdiv_zpzz_s, gen_helper_sve_sdiv_zpzz_d + }; + do_zpzz_ool(s, a, fns[a->esz]); +} + +void trans_UDIV_zpzz(DisasContext *s, arg_rprr_esz *a, uint32_t insn) +{ + static gen_helper_gvec_4 * 
const fns[4] = { + NULL, NULL, gen_helper_sve_udiv_zpzz_s, gen_helper_sve_udiv_zpzz_d + }; + do_zpzz_ool(s, a, fns[a->esz]); +} + +#undef DO_ZPZZ + /* *** SVE Predicate Logical Operations Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 2e27ef41cd..5fafe02575 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -25,6 +25,10 @@ %imm9_16_10 16:s6 10:3 %preg4_5 5:4 +# Either a copy of rd (at bit 0), or a different source +# as propagated via the MOVPRFX instruction. +%reg_movprfx 0:5 + ########################################################################### # Named attribute sets. These are used to make nice(er) names # when creating helpers common to those for the individual @@ -34,6 +38,7 @@ &rri rd rn imm &rrr_esz rd rn rm esz &rprr_s rd pg rn rm s +&rprr_esz rd pg rn rm esz &ptrue rd esz pat s @@ -53,6 +58,12 @@ # Three prediate operand, with governing predicate, flag setting @pd_pg_pn_pm_s ........ . s:1 .. rm:4 .. pg:4 . rn:4 . rd:4 &rprr_s +# Two register operand, with governing predicate, vector element size +@rdn_pg_rm ........ esz:2 ... ... ... pg:3 rm:5 rd:5 \ + &rprr_esz rn=%reg_movprfx +@rdm_pg_rn ........ esz:2 ... ... ... pg:3 rn:5 rd:5 \ + &rprr_esz rm=%reg_movprfx + # Basic Load/Store with 9-bit immediate offset @pd_rn_i9 ........ ........ ...... rn:5 . rd:4 \ &rri imm=%imm9_16_10 @@ -62,6 +73,37 @@ ########################################################################### # Instruction patterns. Grouped according to the SVE encodingindex.xhtml. +### SVE Integer Arithmetic - Binary Predicated Group + +# SVE bitwise logical vector operations (predicated) +ORR_zpzz 00000100 .. 011 000 000 ... ..... ..... @rdn_pg_rm +EOR_zpzz 00000100 .. 011 001 000 ... ..... ..... @rdn_pg_rm +AND_zpzz 00000100 .. 011 010 000 ... ..... ..... @rdn_pg_rm +BIC_zpzz 00000100 .. 011 011 000 ... ..... ..... @rdn_pg_rm + +# SVE integer add/subtract vectors (predicated) +ADD_zpzz 00000100 .. 000 000 000 ... ..... ..... 
@rdn_pg_rm +SUB_zpzz 00000100 .. 000 001 000 ... ..... ..... @rdn_pg_rm +SUB_zpzz 00000100 .. 000 011 000 ... ..... ..... @rdm_pg_rn # SUBR + +# SVE integer min/max/difference (predicated) +SMAX_zpzz 00000100 .. 001 000 000 ... ..... ..... @rdn_pg_rm +UMAX_zpzz 00000100 .. 001 001 000 ... ..... ..... @rdn_pg_rm +SMIN_zpzz 00000100 .. 001 010 000 ... ..... ..... @rdn_pg_rm +UMIN_zpzz 00000100 .. 001 011 000 ... ..... ..... @rdn_pg_rm +SABD_zpzz 00000100 .. 001 100 000 ... ..... ..... @rdn_pg_rm +UABD_zpzz 00000100 .. 001 101 000 ... ..... ..... @rdn_pg_rm + +# SVE integer multiply/divide (predicated) +MUL_zpzz 00000100 .. 010 000 000 ... ..... ..... @rdn_pg_rm +SMULH_zpzz 00000100 .. 010 010 000 ... ..... ..... @rdn_pg_rm +UMULH_zpzz 00000100 .. 010 011 000 ... ..... ..... @rdn_pg_rm +# Note that divide requires size >= 2; below 2 is unallocated. +SDIV_zpzz 00000100 .. 010 100 000 ... ..... ..... @rdn_pg_rm +UDIV_zpzz 00000100 .. 010 101 000 ... ..... ..... @rdn_pg_rm +SDIV_zpzz 00000100 .. 010 110 000 ... ..... ..... @rdm_pg_rn # SDIVR +UDIV_zpzz 00000100 .. 010 111 000 ... ..... ..... 
@rdm_pg_rn # UDIVR + ### SVE Logical - Unpredicated Group # SVE bitwise logical operations (unpredicated) From patchwork Sat Feb 17 18:22:26 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 128688 Delivered-To: patch@linaro.org From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:22:26 -0800 Message-Id: <20180217182323.25885-11-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org>
X-Received-From: 2607:f8b0:400e:c05::241 Subject: [Qemu-devel] [PATCH v2 10/67] target/arm: Implement SVE Integer Reduction Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Excepting MOVPRFX, which isn't a reduction. Presumably it is placed within the group because of its encoding. Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 44 +++++++++++++++++++++ target/arm/sve_helper.c | 95 +++++++++++++++++++++++++++++++++++++++++++++- target/arm/translate-sve.c | 65 +++++++++++++++++++++++++++++++ target/arm/sve.decode | 22 +++++++++++ 4 files changed, 224 insertions(+), 2 deletions(-) -- 2.14.3 Reviewed-by: Peter Maydell diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 5b82ba1501..6b6bbeb272 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -168,6 +168,50 @@ DEF_HELPER_FLAGS_5(sve_udiv_zpzz_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(sve_udiv_zpzz_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_orv_b, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_orv_h, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_orv_s, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_orv_d, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(sve_eorv_b, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_eorv_h, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_eorv_s, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_eorv_d, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(sve_andv_b, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_andv_h, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_andv_s, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_andv_d, TCG_CALL_NO_RWG, i64, ptr, ptr, 
i32) + +DEF_HELPER_FLAGS_3(sve_saddv_b, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_saddv_h, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_saddv_s, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(sve_uaddv_b, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_uaddv_h, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_uaddv_s, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_uaddv_d, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(sve_smaxv_b, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_smaxv_h, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_smaxv_s, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_smaxv_d, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(sve_umaxv_b, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_umaxv_h, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_umaxv_s, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_umaxv_d, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(sve_sminv_b, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_sminv_h, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_sminv_s, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_sminv_d, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(sve_uminv_b, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_uminv_h, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_uminv_s, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_uminv_d, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 26c177c2fd..18fb27805e 100644 --- 
a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -295,6 +295,99 @@ DO_ZPZZ_D(sve_sdiv_zpzz_d, int64_t, DO_DIV) DO_ZPZZ(sve_udiv_zpzz_s, uint32_t, H1_4, DO_DIV) DO_ZPZZ_D(sve_udiv_zpzz_d, uint64_t, DO_DIV) +#undef DO_ZPZZ +#undef DO_ZPZZ_D + +/* Two-operand reduction expander, controlled by a predicate. + * The difference between TYPERED and TYPERET has to do with + * sign-extension. E.g. for SMAX, TYPERED must be signed, + * but TYPERET must be unsigned so that e.g. a 32-bit value + * is not sign-extended to the ABI uint64_t return type. + */ +/* ??? If we were to vectorize this by hand the reduction ordering + * would change. For integer operands, this is perfectly fine. + */ +#define DO_VPZ(NAME, TYPEELT, TYPERED, TYPERET, H, INIT, OP) \ +uint64_t HELPER(NAME)(void *vn, void *vg, uint32_t desc) \ +{ \ + intptr_t i, opr_sz = simd_oprsz(desc); \ + TYPERED ret = INIT; \ + for (i = 0; i < opr_sz; ) { \ + uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \ + do { \ + if (pg & 1) { \ + TYPEELT nn = *(TYPEELT *)(vn + H(i)); \ + ret = OP(ret, nn); \ + } \ + i += sizeof(TYPEELT), pg >>= sizeof(TYPEELT); \ + } while (i & 15); \ + } \ + return (TYPERET)ret; \ +} + +#define DO_VPZ_D(NAME, TYPEE, TYPER, INIT, OP) \ +uint64_t HELPER(NAME)(void *vn, void *vg, uint32_t desc) \ +{ \ + intptr_t i, opr_sz = simd_oprsz(desc) / 8; \ + TYPEE *n = vn; \ + uint8_t *pg = vg; \ + TYPER ret = INIT; \ + for (i = 0; i < opr_sz; i += 1) { \ + if (pg[H1(i)] & 1) { \ + TYPEE nn = n[i]; \ + ret = OP(ret, nn); \ + } \ + } \ + return ret; \ +} + +DO_VPZ(sve_orv_b, uint8_t, uint8_t, uint8_t, H1, 0, DO_ORR) +DO_VPZ(sve_orv_h, uint16_t, uint16_t, uint16_t, H1_2, 0, DO_ORR) +DO_VPZ(sve_orv_s, uint32_t, uint32_t, uint32_t, H1_4, 0, DO_ORR) +DO_VPZ_D(sve_orv_d, uint64_t, uint64_t, 0, DO_ORR) + +DO_VPZ(sve_eorv_b, uint8_t, uint8_t, uint8_t, H1, 0, DO_EOR) +DO_VPZ(sve_eorv_h, uint16_t, uint16_t, uint16_t, H1_2, 0, DO_EOR) +DO_VPZ(sve_eorv_s, uint32_t, uint32_t, uint32_t, H1_4, 0, DO_EOR) 
+DO_VPZ_D(sve_eorv_d, uint64_t, uint64_t, 0, DO_EOR) + +DO_VPZ(sve_andv_b, uint8_t, uint8_t, uint8_t, H1, -1, DO_AND) +DO_VPZ(sve_andv_h, uint16_t, uint16_t, uint16_t, H1_2, -1, DO_AND) +DO_VPZ(sve_andv_s, uint32_t, uint32_t, uint32_t, H1_4, -1, DO_AND) +DO_VPZ_D(sve_andv_d, uint64_t, uint64_t, -1, DO_AND) + +DO_VPZ(sve_saddv_b, int8_t, uint64_t, uint64_t, H1, 0, DO_ADD) +DO_VPZ(sve_saddv_h, int16_t, uint64_t, uint64_t, H1_2, 0, DO_ADD) +DO_VPZ(sve_saddv_s, int32_t, uint64_t, uint64_t, H1_4, 0, DO_ADD) + +DO_VPZ(sve_uaddv_b, uint8_t, uint64_t, uint64_t, H1, 0, DO_ADD) +DO_VPZ(sve_uaddv_h, uint16_t, uint64_t, uint64_t, H1_2, 0, DO_ADD) +DO_VPZ(sve_uaddv_s, uint32_t, uint64_t, uint64_t, H1_4, 0, DO_ADD) +DO_VPZ_D(sve_uaddv_d, uint64_t, uint64_t, 0, DO_ADD) + +DO_VPZ(sve_smaxv_b, int8_t, int8_t, uint8_t, H1, INT8_MIN, DO_MAX) +DO_VPZ(sve_smaxv_h, int16_t, int16_t, uint16_t, H1_2, INT16_MIN, DO_MAX) +DO_VPZ(sve_smaxv_s, int32_t, int32_t, uint32_t, H1_4, INT32_MIN, DO_MAX) +DO_VPZ_D(sve_smaxv_d, int64_t, int64_t, INT64_MIN, DO_MAX) + +DO_VPZ(sve_umaxv_b, uint8_t, uint8_t, uint8_t, H1, 0, DO_MAX) +DO_VPZ(sve_umaxv_h, uint16_t, uint16_t, uint16_t, H1_2, 0, DO_MAX) +DO_VPZ(sve_umaxv_s, uint32_t, uint32_t, uint32_t, H1_4, 0, DO_MAX) +DO_VPZ_D(sve_umaxv_d, uint64_t, uint64_t, 0, DO_MAX) + +DO_VPZ(sve_sminv_b, int8_t, int8_t, uint8_t, H1, INT8_MAX, DO_MIN) +DO_VPZ(sve_sminv_h, int16_t, int16_t, uint16_t, H1_2, INT16_MAX, DO_MIN) +DO_VPZ(sve_sminv_s, int32_t, int32_t, uint32_t, H1_4, INT32_MAX, DO_MIN) +DO_VPZ_D(sve_sminv_d, int64_t, int64_t, INT64_MAX, DO_MIN) + +DO_VPZ(sve_uminv_b, uint8_t, uint8_t, uint8_t, H1, -1, DO_MIN) +DO_VPZ(sve_uminv_h, uint16_t, uint16_t, uint16_t, H1_2, -1, DO_MIN) +DO_VPZ(sve_uminv_s, uint32_t, uint32_t, uint32_t, H1_4, -1, DO_MIN) +DO_VPZ_D(sve_uminv_d, uint64_t, uint64_t, -1, DO_MIN) + +#undef DO_VPZ +#undef DO_VPZ_D + #undef DO_AND #undef DO_ORR #undef DO_EOR @@ -306,8 +399,6 @@ DO_ZPZZ_D(sve_udiv_zpzz_d, uint64_t, DO_DIV) #undef DO_ABD #undef 
DO_MUL #undef DO_DIV -#undef DO_ZPZZ -#undef DO_ZPZZ_D /* Similar to the ARM LastActiveElement pseudocode function, except the result is multiplied by the element size. This includes the not found diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 116002792a..49251a53c1 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -276,6 +276,71 @@ void trans_UDIV_zpzz(DisasContext *s, arg_rprr_esz *a, uint32_t insn) #undef DO_ZPZZ +/* + *** SVE Integer Reduction Group + */ + +typedef void gen_helper_gvec_reduc(TCGv_i64, TCGv_ptr, TCGv_ptr, TCGv_i32); +static void do_vpz_ool(DisasContext *s, arg_rpr_esz *a, + gen_helper_gvec_reduc *fn) +{ + unsigned vsz = vec_full_reg_size(s); + TCGv_ptr t_zn, t_pg; + TCGv_i32 desc; + TCGv_i64 temp; + + if (fn == 0) { + unallocated_encoding(s); + return; + } + + desc = tcg_const_i32(simd_desc(vsz, vsz, 0)); + temp = tcg_temp_new_i64(); + t_zn = tcg_temp_new_ptr(); + t_pg = tcg_temp_new_ptr(); + + tcg_gen_addi_ptr(t_zn, cpu_env, vec_full_reg_offset(s, a->rn)); + tcg_gen_addi_ptr(t_pg, cpu_env, pred_full_reg_offset(s, a->pg)); + fn(temp, t_zn, t_pg, desc); + tcg_temp_free_ptr(t_zn); + tcg_temp_free_ptr(t_pg); + tcg_temp_free_i32(desc); + + write_fp_dreg(s, a->rd, temp); + tcg_temp_free_i64(temp); +} + +#define DO_VPZ(NAME, name) \ +static void trans_##NAME(DisasContext *s, arg_rpr_esz *a, uint32_t insn) \ +{ \ + static gen_helper_gvec_reduc * const fns[4] = { \ + gen_helper_sve_##name##_b, gen_helper_sve_##name##_h, \ + gen_helper_sve_##name##_s, gen_helper_sve_##name##_d, \ + }; \ + do_vpz_ool(s, a, fns[a->esz]); \ +} + +DO_VPZ(ORV, orv) +DO_VPZ(ANDV, andv) +DO_VPZ(EORV, eorv) + +DO_VPZ(UADDV, uaddv) +DO_VPZ(SMAXV, smaxv) +DO_VPZ(UMAXV, umaxv) +DO_VPZ(SMINV, sminv) +DO_VPZ(UMINV, uminv) + +static void trans_SADDV(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + static gen_helper_gvec_reduc * const fns[4] = { + gen_helper_sve_saddv_b, gen_helper_sve_saddv_h, + gen_helper_sve_saddv_s, NULL + 
}; + do_vpz_ool(s, a, fns[a->esz]); +} + +#undef DO_VPZ + /* *** SVE Predicate Logical Operations Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 5fafe02575..b390d8f398 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -37,6 +37,7 @@ &rr_esz rd rn esz &rri rd rn imm &rrr_esz rd rn rm esz +&rpr_esz rd pg rn esz &rprr_s rd pg rn rm s &rprr_esz rd pg rn rm esz @@ -64,6 +65,9 @@ @rdm_pg_rn ........ esz:2 ... ... ... pg:3 rn:5 rd:5 \ &rprr_esz rm=%reg_movprfx +# One register operand, with governing predicate, vector element size +@rd_pg_rn ........ esz:2 ... ... ... pg:3 rn:5 rd:5 &rpr_esz + # Basic Load/Store with 9-bit immediate offset @pd_rn_i9 ........ ........ ...... rn:5 . rd:4 \ &rri imm=%imm9_16_10 @@ -104,6 +108,24 @@ UDIV_zpzz 00000100 .. 010 101 000 ... ..... ..... @rdn_pg_rm SDIV_zpzz 00000100 .. 010 110 000 ... ..... ..... @rdm_pg_rn # SDIVR UDIV_zpzz 00000100 .. 010 111 000 ... ..... ..... @rdm_pg_rn # UDIVR +### SVE Integer Reduction Group + +# SVE bitwise logical reduction (predicated) +ORV 00000100 .. 011 000 001 ... ..... ..... @rd_pg_rn +EORV 00000100 .. 011 001 001 ... ..... ..... @rd_pg_rn +ANDV 00000100 .. 011 010 001 ... ..... ..... @rd_pg_rn + +# SVE integer add reduction (predicated) +# Note that saddv requires size != 3. +UADDV 00000100 .. 000 001 001 ... ..... ..... @rd_pg_rn +SADDV 00000100 .. 000 000 001 ... ..... ..... @rd_pg_rn + +# SVE integer min/max reduction (predicated) +SMAXV 00000100 .. 001 000 001 ... ..... ..... @rd_pg_rn +UMAXV 00000100 .. 001 001 001 ... ..... ..... @rd_pg_rn +SMINV 00000100 .. 001 010 001 ... ..... ..... @rd_pg_rn +UMINV 00000100 .. 001 011 001 ... ..... ..... 
@rd_pg_rn + ### SVE Logical - Unpredicated Group # SVE bitwise logical operations (unpredicated) From patchwork Sat Feb 17 18:22:27 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 128691 Delivered-To: patch@linaro.org From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:22:27 -0800 Message-Id: <20180217182323.25885-12-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org>
X-Received-From: 2607:f8b0:400e:c00::242 Subject: [Qemu-devel] [PATCH v2 11/67] target/arm: Implement SVE bitwise shift by immediate (predicated) X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 25 +++++ target/arm/sve_helper.c | 265 +++++++++++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 128 ++++++++++++++++++++++ target/arm/sve.decode | 29 ++++- 4 files changed, 445 insertions(+), 2 deletions(-) -- 2.14.3 Reviewed-by: Peter Maydell diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 6b6bbeb272..b3c89579af 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -212,6 +212,31 @@ DEF_HELPER_FLAGS_3(sve_uminv_h, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve_uminv_s, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve_uminv_d, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_clr_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_clr_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_clr_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_clr_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve_asr_zpzi_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_asr_zpzi_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_asr_zpzi_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_asr_zpzi_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve_lsr_zpzi_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_lsr_zpzi_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_lsr_zpzi_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_lsr_zpzi_d, 
TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve_lsl_zpzi_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_lsl_zpzi_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_lsl_zpzi_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_lsl_zpzi_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve_asrd_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_asrd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_asrd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_asrd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 18fb27805e..b1a170fd70 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -92,6 +92,150 @@ uint32_t HELPER(sve_predtest)(void *vd, void *vg, uint32_t words) return flags; } +/* Expand active predicate bits to bytes, for byte elements. 
+ * for (i = 0; i < 256; ++i) { + * unsigned long m = 0; + * for (j = 0; j < 8; j++) { + * if ((i >> j) & 1) { + * m |= 0xfful << (j << 3); + * } + * } + * printf("0x%016lx,\n", m); + * } + */ +static inline uint64_t expand_pred_b(uint8_t byte) +{ + static const uint64_t word[256] = { + 0x0000000000000000, 0x00000000000000ff, 0x000000000000ff00, + 0x000000000000ffff, 0x0000000000ff0000, 0x0000000000ff00ff, + 0x0000000000ffff00, 0x0000000000ffffff, 0x00000000ff000000, + 0x00000000ff0000ff, 0x00000000ff00ff00, 0x00000000ff00ffff, + 0x00000000ffff0000, 0x00000000ffff00ff, 0x00000000ffffff00, + 0x00000000ffffffff, 0x000000ff00000000, 0x000000ff000000ff, + 0x000000ff0000ff00, 0x000000ff0000ffff, 0x000000ff00ff0000, + 0x000000ff00ff00ff, 0x000000ff00ffff00, 0x000000ff00ffffff, + 0x000000ffff000000, 0x000000ffff0000ff, 0x000000ffff00ff00, + 0x000000ffff00ffff, 0x000000ffffff0000, 0x000000ffffff00ff, + 0x000000ffffffff00, 0x000000ffffffffff, 0x0000ff0000000000, + 0x0000ff00000000ff, 0x0000ff000000ff00, 0x0000ff000000ffff, + 0x0000ff0000ff0000, 0x0000ff0000ff00ff, 0x0000ff0000ffff00, + 0x0000ff0000ffffff, 0x0000ff00ff000000, 0x0000ff00ff0000ff, + 0x0000ff00ff00ff00, 0x0000ff00ff00ffff, 0x0000ff00ffff0000, + 0x0000ff00ffff00ff, 0x0000ff00ffffff00, 0x0000ff00ffffffff, + 0x0000ffff00000000, 0x0000ffff000000ff, 0x0000ffff0000ff00, + 0x0000ffff0000ffff, 0x0000ffff00ff0000, 0x0000ffff00ff00ff, + 0x0000ffff00ffff00, 0x0000ffff00ffffff, 0x0000ffffff000000, + 0x0000ffffff0000ff, 0x0000ffffff00ff00, 0x0000ffffff00ffff, + 0x0000ffffffff0000, 0x0000ffffffff00ff, 0x0000ffffffffff00, + 0x0000ffffffffffff, 0x00ff000000000000, 0x00ff0000000000ff, + 0x00ff00000000ff00, 0x00ff00000000ffff, 0x00ff000000ff0000, + 0x00ff000000ff00ff, 0x00ff000000ffff00, 0x00ff000000ffffff, + 0x00ff0000ff000000, 0x00ff0000ff0000ff, 0x00ff0000ff00ff00, + 0x00ff0000ff00ffff, 0x00ff0000ffff0000, 0x00ff0000ffff00ff, + 0x00ff0000ffffff00, 0x00ff0000ffffffff, 0x00ff00ff00000000, + 0x00ff00ff000000ff, 
0x00ff00ff0000ff00, 0x00ff00ff0000ffff, + 0x00ff00ff00ff0000, 0x00ff00ff00ff00ff, 0x00ff00ff00ffff00, + 0x00ff00ff00ffffff, 0x00ff00ffff000000, 0x00ff00ffff0000ff, + 0x00ff00ffff00ff00, 0x00ff00ffff00ffff, 0x00ff00ffffff0000, + 0x00ff00ffffff00ff, 0x00ff00ffffffff00, 0x00ff00ffffffffff, + 0x00ffff0000000000, 0x00ffff00000000ff, 0x00ffff000000ff00, + 0x00ffff000000ffff, 0x00ffff0000ff0000, 0x00ffff0000ff00ff, + 0x00ffff0000ffff00, 0x00ffff0000ffffff, 0x00ffff00ff000000, + 0x00ffff00ff0000ff, 0x00ffff00ff00ff00, 0x00ffff00ff00ffff, + 0x00ffff00ffff0000, 0x00ffff00ffff00ff, 0x00ffff00ffffff00, + 0x00ffff00ffffffff, 0x00ffffff00000000, 0x00ffffff000000ff, + 0x00ffffff0000ff00, 0x00ffffff0000ffff, 0x00ffffff00ff0000, + 0x00ffffff00ff00ff, 0x00ffffff00ffff00, 0x00ffffff00ffffff, + 0x00ffffffff000000, 0x00ffffffff0000ff, 0x00ffffffff00ff00, + 0x00ffffffff00ffff, 0x00ffffffffff0000, 0x00ffffffffff00ff, + 0x00ffffffffffff00, 0x00ffffffffffffff, 0xff00000000000000, + 0xff000000000000ff, 0xff0000000000ff00, 0xff0000000000ffff, + 0xff00000000ff0000, 0xff00000000ff00ff, 0xff00000000ffff00, + 0xff00000000ffffff, 0xff000000ff000000, 0xff000000ff0000ff, + 0xff000000ff00ff00, 0xff000000ff00ffff, 0xff000000ffff0000, + 0xff000000ffff00ff, 0xff000000ffffff00, 0xff000000ffffffff, + 0xff0000ff00000000, 0xff0000ff000000ff, 0xff0000ff0000ff00, + 0xff0000ff0000ffff, 0xff0000ff00ff0000, 0xff0000ff00ff00ff, + 0xff0000ff00ffff00, 0xff0000ff00ffffff, 0xff0000ffff000000, + 0xff0000ffff0000ff, 0xff0000ffff00ff00, 0xff0000ffff00ffff, + 0xff0000ffffff0000, 0xff0000ffffff00ff, 0xff0000ffffffff00, + 0xff0000ffffffffff, 0xff00ff0000000000, 0xff00ff00000000ff, + 0xff00ff000000ff00, 0xff00ff000000ffff, 0xff00ff0000ff0000, + 0xff00ff0000ff00ff, 0xff00ff0000ffff00, 0xff00ff0000ffffff, + 0xff00ff00ff000000, 0xff00ff00ff0000ff, 0xff00ff00ff00ff00, + 0xff00ff00ff00ffff, 0xff00ff00ffff0000, 0xff00ff00ffff00ff, + 0xff00ff00ffffff00, 0xff00ff00ffffffff, 0xff00ffff00000000, + 0xff00ffff000000ff, 
0xff00ffff0000ff00, 0xff00ffff0000ffff, + 0xff00ffff00ff0000, 0xff00ffff00ff00ff, 0xff00ffff00ffff00, + 0xff00ffff00ffffff, 0xff00ffffff000000, 0xff00ffffff0000ff, + 0xff00ffffff00ff00, 0xff00ffffff00ffff, 0xff00ffffffff0000, + 0xff00ffffffff00ff, 0xff00ffffffffff00, 0xff00ffffffffffff, + 0xffff000000000000, 0xffff0000000000ff, 0xffff00000000ff00, + 0xffff00000000ffff, 0xffff000000ff0000, 0xffff000000ff00ff, + 0xffff000000ffff00, 0xffff000000ffffff, 0xffff0000ff000000, + 0xffff0000ff0000ff, 0xffff0000ff00ff00, 0xffff0000ff00ffff, + 0xffff0000ffff0000, 0xffff0000ffff00ff, 0xffff0000ffffff00, + 0xffff0000ffffffff, 0xffff00ff00000000, 0xffff00ff000000ff, + 0xffff00ff0000ff00, 0xffff00ff0000ffff, 0xffff00ff00ff0000, + 0xffff00ff00ff00ff, 0xffff00ff00ffff00, 0xffff00ff00ffffff, + 0xffff00ffff000000, 0xffff00ffff0000ff, 0xffff00ffff00ff00, + 0xffff00ffff00ffff, 0xffff00ffffff0000, 0xffff00ffffff00ff, + 0xffff00ffffffff00, 0xffff00ffffffffff, 0xffffff0000000000, + 0xffffff00000000ff, 0xffffff000000ff00, 0xffffff000000ffff, + 0xffffff0000ff0000, 0xffffff0000ff00ff, 0xffffff0000ffff00, + 0xffffff0000ffffff, 0xffffff00ff000000, 0xffffff00ff0000ff, + 0xffffff00ff00ff00, 0xffffff00ff00ffff, 0xffffff00ffff0000, + 0xffffff00ffff00ff, 0xffffff00ffffff00, 0xffffff00ffffffff, + 0xffffffff00000000, 0xffffffff000000ff, 0xffffffff0000ff00, + 0xffffffff0000ffff, 0xffffffff00ff0000, 0xffffffff00ff00ff, + 0xffffffff00ffff00, 0xffffffff00ffffff, 0xffffffffff000000, + 0xffffffffff0000ff, 0xffffffffff00ff00, 0xffffffffff00ffff, + 0xffffffffffff0000, 0xffffffffffff00ff, 0xffffffffffffff00, + 0xffffffffffffffff, + }; + return word[byte]; +} + +/* Similarly for half-word elements. 
+ *  for (i = 0; i < 256; ++i) {
+ *      unsigned long m = 0;
+ *      if (i & 0xaa) {
+ *          continue;
+ *      }
+ *      for (j = 0; j < 8; j += 2) {
+ *          if ((i >> j) & 1) {
+ *              m |= 0xfffful << (j << 3);
+ *          }
+ *      }
+ *      printf("[0x%x] = 0x%016lx,\n", i, m);
+ *  }
+ */
+static inline uint64_t expand_pred_h(uint8_t byte)
+{
+    static const uint64_t word[] = {
+        [0x01] = 0x000000000000ffff, [0x04] = 0x00000000ffff0000,
+        [0x05] = 0x00000000ffffffff, [0x10] = 0x0000ffff00000000,
+        [0x11] = 0x0000ffff0000ffff, [0x14] = 0x0000ffffffff0000,
+        [0x15] = 0x0000ffffffffffff, [0x40] = 0xffff000000000000,
+        [0x41] = 0xffff00000000ffff, [0x44] = 0xffff0000ffff0000,
+        [0x45] = 0xffff0000ffffffff, [0x50] = 0xffffffff00000000,
+        [0x51] = 0xffffffff0000ffff, [0x54] = 0xffffffffffff0000,
+        [0x55] = 0xffffffffffffffff,
+    };
+    return word[byte & 0x55];
+}
+
+/* Similarly for single word elements. */
+static inline uint64_t expand_pred_s(uint8_t byte)
+{
+    static const uint64_t word[] = {
+        [0x01] = 0x00000000ffffffffull,
+        [0x10] = 0xffffffff00000000ull,
+        [0x11] = 0xffffffffffffffffull,
+    };
+    return word[byte & 0x11];
+}
+
 #define LOGICAL_PPPP(NAME, FUNC) \
 void HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, uint32_t desc) \
 { \
@@ -483,3 +627,124 @@ uint32_t HELPER(sve_pnext)(void *vd, void *vg, uint32_t pred_desc)
 
     return flags;
 }
+
+/* Store zero into every active element of Zd. We will use this for two
+ * and three-operand predicated instructions for which logic dictates a
+ * zero result. In particular, logical shift by element size, which is
+ * otherwise undefined on the host.
+ *
+ * For element sizes smaller than uint64_t, we use tables to expand
+ * the N bits of the controlling predicate to a byte mask, and clear
+ * those bytes.
+ */
+void HELPER(sve_clr_b)(void *vd, void *vg, uint32_t desc)
+{
+    intptr_t i, opr_sz = simd_oprsz(desc) / 8;
+    uint64_t *d = vd;
+    uint8_t *pg = vg;
+    for (i = 0; i < opr_sz; i += 1) {
+        d[i] &= ~expand_pred_b(pg[H1(i)]);
+    }
+}
+
+void HELPER(sve_clr_h)(void *vd, void *vg, uint32_t desc)
+{
+    intptr_t i, opr_sz = simd_oprsz(desc) / 8;
+    uint64_t *d = vd;
+    uint8_t *pg = vg;
+    for (i = 0; i < opr_sz; i += 1) {
+        d[i] &= ~expand_pred_h(pg[H1(i)]);
+    }
+}
+
+void HELPER(sve_clr_s)(void *vd, void *vg, uint32_t desc)
+{
+    intptr_t i, opr_sz = simd_oprsz(desc) / 8;
+    uint64_t *d = vd;
+    uint8_t *pg = vg;
+    for (i = 0; i < opr_sz; i += 1) {
+        d[i] &= ~expand_pred_s(pg[H1(i)]);
+    }
+}
+
+void HELPER(sve_clr_d)(void *vd, void *vg, uint32_t desc)
+{
+    intptr_t i, opr_sz = simd_oprsz(desc) / 8;
+    uint64_t *d = vd;
+    uint8_t *pg = vg;
+    for (i = 0; i < opr_sz; i += 1) {
+        if (pg[H1(i)] & 1) {
+            d[i] = 0;
+        }
+    }
+}
+
+/* Three-operand expander, immediate operand, controlled by a predicate.
+ */
+#define DO_ZPZI(NAME, TYPE, H, OP) \
+void HELPER(NAME)(void *vd, void *vn, void *vg, uint32_t desc) \
+{ \
+    intptr_t i, opr_sz = simd_oprsz(desc); \
+    TYPE imm = simd_data(desc); \
+    for (i = 0; i < opr_sz; ) { \
+        uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \
+        do { \
+            if (pg & 1) { \
+                TYPE nn = *(TYPE *)(vn + H(i)); \
+                *(TYPE *)(vd + H(i)) = OP(nn, imm); \
+            } \
+            i += sizeof(TYPE), pg >>= sizeof(TYPE); \
+        } while (i & 15); \
+    } \
+}
+
+/* Similarly, specialized for 64-bit operands. */
+#define DO_ZPZI_D(NAME, TYPE, OP) \
+void HELPER(NAME)(void *vd, void *vn, void *vg, uint32_t desc) \
+{ \
+    intptr_t i, opr_sz = simd_oprsz(desc) / 8; \
+    TYPE *d = vd, *n = vn; \
+    TYPE imm = simd_data(desc); \
+    uint8_t *pg = vg; \
+    for (i = 0; i < opr_sz; i += 1) { \
+        if (pg[H1(i)] & 1) { \
+            TYPE nn = n[i]; \
+            d[i] = OP(nn, imm); \
+        } \
+    } \
+}
+
+#define DO_SHR(N, M) (N >> M)
+#define DO_SHL(N, M) (N << M)
+
+/* Arithmetic shift right for division. This rounds negative numbers
+   toward zero as per signed division. Therefore before shifting,
+   when N is negative, add 2**M-1. */
+#define DO_ASRD(N, M) ((N + (N < 0 ? ((__typeof(N))1 << M) - 1 : 0)) >> M)
+
+DO_ZPZI(sve_asr_zpzi_b, int8_t, H1, DO_SHR)
+DO_ZPZI(sve_asr_zpzi_h, int16_t, H1_2, DO_SHR)
+DO_ZPZI(sve_asr_zpzi_s, int32_t, H1_4, DO_SHR)
+DO_ZPZI_D(sve_asr_zpzi_d, int64_t, DO_SHR)
+
+DO_ZPZI(sve_lsr_zpzi_b, uint8_t, H1, DO_SHR)
+DO_ZPZI(sve_lsr_zpzi_h, uint16_t, H1_2, DO_SHR)
+DO_ZPZI(sve_lsr_zpzi_s, uint32_t, H1_4, DO_SHR)
+DO_ZPZI_D(sve_lsr_zpzi_d, uint64_t, DO_SHR)
+
+DO_ZPZI(sve_lsl_zpzi_b, uint8_t, H1, DO_SHL)
+DO_ZPZI(sve_lsl_zpzi_h, uint16_t, H1_2, DO_SHL)
+DO_ZPZI(sve_lsl_zpzi_s, uint32_t, H1_4, DO_SHL)
+DO_ZPZI_D(sve_lsl_zpzi_d, uint64_t, DO_SHL)
+
+DO_ZPZI(sve_asrd_b, int8_t, H1, DO_ASRD)
+DO_ZPZI(sve_asrd_h, int16_t, H1_2, DO_ASRD)
+DO_ZPZI(sve_asrd_s, int32_t, H1_4, DO_ASRD)
+DO_ZPZI_D(sve_asrd_d, int64_t, DO_ASRD)
+
+#undef DO_SHR
+#undef DO_SHL
+#undef DO_ASRD
+
+#undef DO_ZPZI
+#undef DO_ZPZI_D
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index 49251a53c1..4218300960 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -37,6 +37,30 @@ typedef void GVecGen2Fn(unsigned, uint32_t, uint32_t, uint32_t, uint32_t);
 typedef void GVecGen3Fn(unsigned, uint32_t, uint32_t,
                         uint32_t, uint32_t, uint32_t);
 
+/*
+ * Helpers for extracting complex instruction fields.
+ */
+
+/* See e.g. ASR (immediate, predicated).
+ * Returns -1 for unallocated encoding; diagnose later.
+ */
+static int tszimm_esz(int x)
+{
+    x >>= 3;  /* discard imm3 */
+    return 31 - clz32(x);
+}
+
+static int tszimm_shr(int x)
+{
+    return (16 << tszimm_esz(x)) - x;
+}
+
+/* See e.g. LSL (immediate, predicated). */
+static int tszimm_shl(int x)
+{
+    return x - (8 << tszimm_esz(x));
+}
+
 /*
  * Include the generated decoder.
  */
@@ -341,6 +365,110 @@ static void trans_SADDV(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
 
 #undef DO_VPZ
 
+/*
+ *** SVE Shift by Immediate - Predicated Group
+ */
+
+/* Store zero into every active element of Zd. We will use this for two
+ * and three-operand predicated instructions for which logic dictates a
+ * zero result.
+ */
+static void do_clr_zp(DisasContext *s, int rd, int pg, int esz)
+{
+    static gen_helper_gvec_2 * const fns[4] = {
+        gen_helper_sve_clr_b, gen_helper_sve_clr_h,
+        gen_helper_sve_clr_s, gen_helper_sve_clr_d,
+    };
+    unsigned vsz = vec_full_reg_size(s);
+    tcg_gen_gvec_2_ool(vec_full_reg_offset(s, rd),
+                       pred_full_reg_offset(s, pg),
+                       vsz, vsz, 0, fns[esz]);
+}
+
+static void do_zpzi_ool(DisasContext *s, arg_rpri_esz *a,
+                        gen_helper_gvec_3 *fn)
+{
+    unsigned vsz = vec_full_reg_size(s);
+    tcg_gen_gvec_3_ool(vec_full_reg_offset(s, a->rd),
+                       vec_full_reg_offset(s, a->rn),
+                       pred_full_reg_offset(s, a->pg),
+                       vsz, vsz, a->imm, fn);
+}
+
+static void trans_ASR_zpzi(DisasContext *s, arg_rpri_esz *a, uint32_t insn)
+{
+    static gen_helper_gvec_3 * const fns[4] = {
+        gen_helper_sve_asr_zpzi_b, gen_helper_sve_asr_zpzi_h,
+        gen_helper_sve_asr_zpzi_s, gen_helper_sve_asr_zpzi_d,
+    };
+    if (a->esz < 0) {
+        /* Invalid tsz encoding -- see tszimm_esz. */
+        unallocated_encoding(s);
+        return;
+    }
+    /* Shift by element size is architecturally valid. For
+       arithmetic right-shift, it's the same as by one less. */
+    a->imm = MIN(a->imm, (8 << a->esz) - 1);
+    do_zpzi_ool(s, a, fns[a->esz]);
+}
+
+static void trans_LSR_zpzi(DisasContext *s, arg_rpri_esz *a, uint32_t insn)
+{
+    static gen_helper_gvec_3 * const fns[4] = {
+        gen_helper_sve_lsr_zpzi_b, gen_helper_sve_lsr_zpzi_h,
+        gen_helper_sve_lsr_zpzi_s, gen_helper_sve_lsr_zpzi_d,
+    };
+    if (a->esz < 0) {
+        unallocated_encoding(s);
+        return;
+    }
+    /* Shift by element size is architecturally valid.
+       For logical shifts, it is a zeroing operation. */
+    if (a->imm >= (8 << a->esz)) {
+        do_clr_zp(s, a->rd, a->pg, a->esz);
+    } else {
+        do_zpzi_ool(s, a, fns[a->esz]);
+    }
+}
+
+static void trans_LSL_zpzi(DisasContext *s, arg_rpri_esz *a, uint32_t insn)
+{
+    static gen_helper_gvec_3 * const fns[4] = {
+        gen_helper_sve_lsl_zpzi_b, gen_helper_sve_lsl_zpzi_h,
+        gen_helper_sve_lsl_zpzi_s, gen_helper_sve_lsl_zpzi_d,
+    };
+    if (a->esz < 0) {
+        unallocated_encoding(s);
+        return;
+    }
+    /* Shift by element size is architecturally valid.
+       For logical shifts, it is a zeroing operation. */
+    if (a->imm >= (8 << a->esz)) {
+        do_clr_zp(s, a->rd, a->pg, a->esz);
+    } else {
+        do_zpzi_ool(s, a, fns[a->esz]);
+    }
+}
+
+static void trans_ASRD(DisasContext *s, arg_rpri_esz *a, uint32_t insn)
+{
+    static gen_helper_gvec_3 * const fns[4] = {
+        gen_helper_sve_asrd_b, gen_helper_sve_asrd_h,
+        gen_helper_sve_asrd_s, gen_helper_sve_asrd_d,
+    };
+    if (a->esz < 0) {
+        unallocated_encoding(s);
+        return;
+    }
+    /* Shift by element size is architecturally valid. For arithmetic
+       right shift for division, it is a zeroing operation. */
+    if (a->imm >= (8 << a->esz)) {
+        do_clr_zp(s, a->rd, a->pg, a->esz);
+    } else {
+        do_zpzi_ool(s, a, fns[a->esz]);
+    }
+}
+
 /*
  *** SVE Predicate Logical Operations Group
  */
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index b390d8f398..c265ff9899 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -22,12 +22,20 @@
 ###########################################################################
 # Named fields. These are primarily for disjoint fields.
 
+%imm6_22_5      22:1 5:5
 %imm9_16_10     16:s6 10:3
 %preg4_5        5:4
 
+# A combination of tsz:imm3 -- extract esize.
+%tszimm_esz     22:2 5:5 !function=tszimm_esz
+# A combination of tsz:imm3 -- extract (2 * esize) - (tsz:imm3)
+%tszimm_shr     22:2 5:5 !function=tszimm_shr
+# A combination of tsz:imm3 -- extract (tsz:imm3) - esize
+%tszimm_shl     22:2 5:5 !function=tszimm_shl
+
 # Either a copy of rd (at bit 0), or a different source
 # as propagated via the MOVPRFX instruction.
-%reg_movprfx 0:5
+%reg_movprfx    0:5
 
 ###########################################################################
 # Named attribute sets. These are used to make nice(er) names
@@ -40,7 +48,7 @@
 &rpr_esz        rd pg rn esz
 &rprr_s         rd pg rn rm s
 &rprr_esz       rd pg rn rm esz
-
+&rpri_esz       rd pg rn imm esz
 &ptrue          rd esz pat s
 
 ###########################################################################
@@ -68,6 +76,11 @@
 # One register operand, with governing predicate, vector element size
 @rd_pg_rn       ........ esz:2 ... ... ... pg:3 rn:5 rd:5       &rpr_esz
 
+# Two register operand, one immediate operand, with predicate,
+# element size encoded as TSZHL. User must fill in imm.
+@rdn_pg_tszimm  ........ .. ... ... ... pg:3 ..... rd:5 \
+                &rpri_esz rn=%reg_movprfx esz=%tszimm_esz
+
 # Basic Load/Store with 9-bit immediate offset
 @pd_rn_i9       ........ ........ ...... rn:5 . rd:4 \
                 &rri imm=%imm9_16_10
@@ -126,6 +139,18 @@ UMAXV 00000100 .. 001 001 001 ... ..... ..... @rd_pg_rn
 SMINV           00000100 .. 001 010 001 ... ..... .....         @rd_pg_rn
 UMINV           00000100 .. 001 011 001 ... ..... .....         @rd_pg_rn
 
+### SVE Shift by Immediate - Predicated Group
+
+# SVE bitwise shift by immediate (predicated)
+ASR_zpzi        00000100 .. 000 000 100 ... .. ... ..... \
+                @rdn_pg_tszimm imm=%tszimm_shr
+LSR_zpzi        00000100 .. 000 001 100 ... .. ... ..... \
+                @rdn_pg_tszimm imm=%tszimm_shr
+LSL_zpzi        00000100 .. 000 011 100 ... .. ... ..... \
+                @rdn_pg_tszimm imm=%tszimm_shl
+ASRD            00000100 .. 000 100 100 ... .. ... ..... \
+                @rdn_pg_tszimm imm=%tszimm_shr
+
 ### SVE Logical - Unpredicated Group
 
 # SVE bitwise logical operations (unpredicated)

From patchwork Sat Feb 17 18:22:28 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 128682
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Sat, 17 Feb 2018 10:22:28 -0800
Message-Id: <20180217182323.25885-13-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.14.3
In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org>
References: <20180217182323.25885-1-richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH v2 12/67] target/arm: Implement SVE bitwise shift by vector (predicated)
Cc: qemu-arm@nongnu.org

Signed-off-by: Richard Henderson
---
 target/arm/helper-sve.h    | 27 +++++++++++++++++++++++++++
 target/arm/sve_helper.c    | 25 +++++++++++++++++++++++++
 target/arm/translate-sve.c |  4 ++++
 target/arm/sve.decode      |  8 ++++++++
 4 files changed, 64 insertions(+)

--
2.14.3

Reviewed-by: Peter Maydell

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index b3c89579af..0cc02ee59e 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -168,6 +168,33 @@ DEF_HELPER_FLAGS_5(sve_udiv_zpzz_s, TCG_CALL_NO_RWG,
 DEF_HELPER_FLAGS_5(sve_udiv_zpzz_d, TCG_CALL_NO_RWG,
                    void, ptr, ptr, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_5(sve_asr_zpzz_b, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_asr_zpzz_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_asr_zpzz_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_asr_zpzz_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(sve_lsr_zpzz_b, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_lsr_zpzz_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_lsr_zpzz_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_lsr_zpzz_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(sve_lsl_zpzz_b, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_lsl_zpzz_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_lsl_zpzz_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_lsl_zpzz_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+
 DEF_HELPER_FLAGS_3(sve_orv_b, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
 DEF_HELPER_FLAGS_3(sve_orv_h, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
 DEF_HELPER_FLAGS_3(sve_orv_s, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index b1a170fd70..6ea806d12b 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -439,6 +439,28 @@ DO_ZPZZ_D(sve_sdiv_zpzz_d, int64_t, DO_DIV)
 DO_ZPZZ(sve_udiv_zpzz_s, uint32_t, H1_4, DO_DIV)
 DO_ZPZZ_D(sve_udiv_zpzz_d, uint64_t, DO_DIV)
 
+/* Note that all bits of the shift are significant
+   and not modulo the element size. */
+#define DO_ASR(N, M) (N >> MIN(M, sizeof(N) * 8 - 1))
+#define DO_LSR(N, M) (M < sizeof(N) * 8 ? N >> M : 0)
+#define DO_LSL(N, M) (M < sizeof(N) * 8 ? N << M : 0)
+
+DO_ZPZZ(sve_asr_zpzz_b, int8_t, H1, DO_ASR)
+DO_ZPZZ(sve_lsr_zpzz_b, uint8_t, H1, DO_LSR)
+DO_ZPZZ(sve_lsl_zpzz_b, uint8_t, H1, DO_LSL)
+
+DO_ZPZZ(sve_asr_zpzz_h, int16_t, H1_2, DO_ASR)
+DO_ZPZZ(sve_lsr_zpzz_h, uint16_t, H1_2, DO_LSR)
+DO_ZPZZ(sve_lsl_zpzz_h, uint16_t, H1_2, DO_LSL)
+
+DO_ZPZZ(sve_asr_zpzz_s, int32_t, H1_4, DO_ASR)
+DO_ZPZZ(sve_lsr_zpzz_s, uint32_t, H1_4, DO_LSR)
+DO_ZPZZ(sve_lsl_zpzz_s, uint32_t, H1_4, DO_LSL)
+
+DO_ZPZZ_D(sve_asr_zpzz_d, int64_t, DO_ASR)
+DO_ZPZZ_D(sve_lsr_zpzz_d, uint64_t, DO_LSR)
+DO_ZPZZ_D(sve_lsl_zpzz_d, uint64_t, DO_LSL)
+
 #undef DO_ZPZZ
 #undef DO_ZPZZ_D
 
@@ -543,6 +565,9 @@ DO_VPZ_D(sve_uminv_d, uint64_t, uint64_t, -1, DO_MIN)
 #undef DO_ABD
 #undef DO_MUL
 #undef DO_DIV
+#undef DO_ASR
+#undef DO_LSR
+#undef DO_LSL
 
 /* Similar to the ARM LastActiveElement pseudocode function, except the
    result is multiplied by the element size. This includes the not found
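
The DO_ASR, DO_LSR and DO_LSL macros in the sve_helper.c hunk above treat every bit of the shift count as significant. In plain C, shifting by the operand width or more is undefined behaviour, which is why the macros clamp or test the count first. What follows is a minimal standalone sketch of that behaviour for 8-bit elements only. The names asr8, lsr8 and lsl8 are hypothetical and do not appear in QEMU, and the sketch assumes the compiler implements >> on negative signed values as an arithmetic shift, as GCC and Clang do.

```c
#include <stdint.h>

/* Hypothetical stand-ins for DO_ASR / DO_LSR / DO_LSL on 8-bit elements.
 * These are illustrations only, not the QEMU helpers themselves. */

/* Arithmetic shift right: a count of 8 or more behaves like a shift by 7,
 * so the result consists entirely of copies of the sign bit. */
static int8_t asr8(int8_t n, uint8_t m)
{
    return (int8_t)(n >> (m < 7 ? m : 7));
}

/* Logical shift right: a count of 8 or more yields zero. */
static uint8_t lsr8(uint8_t n, uint8_t m)
{
    return m < 8 ? (uint8_t)(n >> m) : 0;
}

/* Logical shift left: a count of 8 or more yields zero. */
static uint8_t lsl8(uint8_t n, uint8_t m)
{
    return m < 8 ? (uint8_t)(n << m) : 0;
}
```

With these definitions a shift count such as 200 is not reduced to 200 mod 8: asr8 saturates at 7 and the two logical shifts return zero.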
This includes the not found diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 4218300960..08c56e55a0 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -282,6 +282,10 @@ DO_ZPZZ(MUL, mul) DO_ZPZZ(SMULH, smulh) DO_ZPZZ(UMULH, umulh) +DO_ZPZZ(ASR, asr) +DO_ZPZZ(LSR, lsr) +DO_ZPZZ(LSL, lsl) + void trans_SDIV_zpzz(DisasContext *s, arg_rprr_esz *a, uint32_t insn) { static gen_helper_gvec_4 * const fns[4] = { diff --git a/target/arm/sve.decode b/target/arm/sve.decode index c265ff9899..7ddff8e6bb 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -151,6 +151,14 @@ LSL_zpzi 00000100 .. 000 011 100 ... .. ... ..... \ ASRD 00000100 .. 000 100 100 ... .. ... ..... \ @rdn_pg_tszimm imm=%tszimm_shr +# SVE bitwise shift by vector (predicated) +ASR_zpzz 00000100 .. 010 000 100 ... ..... ..... @rdn_pg_rm +LSR_zpzz 00000100 .. 010 001 100 ... ..... ..... @rdn_pg_rm +LSL_zpzz 00000100 .. 010 011 100 ... ..... ..... @rdn_pg_rm +ASR_zpzz 00000100 .. 010 100 100 ... ..... ..... @rdm_pg_rn # ASRR +LSR_zpzz 00000100 .. 010 101 100 ... ..... ..... @rdm_pg_rn # LSRR +LSL_zpzz 00000100 .. 010 111 100 ... ..... ..... 
@rdm_pg_rn # LSLR + ### SVE Logical - Unpredicated Group # SVE bitwise logical operations (unpredicated) From patchwork Sat Feb 17 18:22:29 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 128687 Delivered-To: patch@linaro.org Received: by 10.46.124.24 with SMTP id x24csp1824466ljc; Sat, 17 Feb 2018 10:37:49 -0800 (PST) X-Google-Smtp-Source: AH8x227+UFpqwTpvIXTL6OON16l8ZYQoPTFaY7HwthdtUtC9drNu4cRXRWpFRglF1S3wospcYp8H X-Received: by 10.129.207.11 with SMTP id u11mr7350195ywi.126.1518892669727; Sat, 17 Feb 2018 10:37:49 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1518892669; cv=none; d=google.com; s=arc-20160816; b=p9TYZAkAvpXBp2Zqndl5IXuL/0YJ6h6ELNWeDcDL9OUSWhCwYJdFzg5o2BOmMN8a3z zSwCTLiKPXsjY1KD1WA3nAWa38VguzTiijP/pXURUFWhGI46F/JmnKeCE3wYZfmcfFbO ZKkkAvt1+ka22Rxq82V1U4/3N7Pv2DzNlp+GFUS+30w+6+L32TIEAQNJlVKjY+o7CLQP ru5XvG6YxJ30DJqwy0+wwHlkgK/L0qv933g0u+NOTsIFf7W/B+Z3ZdMAgwI7dmL7psJW g5m0ZLMaw2SZzu8DkLGMbuBqWBHjoJyTVObWIQMvnTvR2TCC+OEmeAzAkgruWkPaIMe/ QPyg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature:arc-authentication-results; bh=z2zyjf9WQFOCWG11hDLk8qhZRQx78/ZLD2//zJu/aqw=; b=zo47JdeAJXrHa88mkDdnBN4SpWxR/YWBMYo5jhdKz0FpK2aS0mqOV+1abhw/suh6oT KGNUxXYYUPpIigD9xgzuFjGC5iIyC83J7P4i11hi8bnO4nL9bhI4DLid4uMbs0DwgfvU Q7AYWfmxSfUNEqDlkz3EqjddZxvH2xku5+2gKYeYcWn6TpbIGi9oS7ExUATmO+tMNsFe KydVmV0QY20elCHgZka7PskU+KilNPMOytNxQCp4d+oavMEHtxigeleFWGvZ0Bxekh3y Nf92PKJ4Sr+nxLM3o8dgI5YqxcA4R0UldV1PPtY8usuxlZOQPOnJHIHFXS3nCENA0LEX jufQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=MDZ/nNY3; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as 
permitted sender) smtp.mailfrom=qemu-devel-bounces+patch=linaro.org@nongnu.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [2001:4830:134:3::11]) by mx.google.com with ESMTPS id l186si3427075ywe.796.2018.02.17.10.37.49 for (version=TLS1 cipher=AES128-SHA bits=128/128); Sat, 17 Feb 2018 10:37:49 -0800 (PST) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) client-ip=2001:4830:134:3::11; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=MDZ/nNY3; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom=qemu-devel-bounces+patch=linaro.org@nongnu.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:48168 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en7Mr-0003S4-21 for patch@linaro.org; Sat, 17 Feb 2018 13:37:49 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39765) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en79K-0000OG-HI for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:23:51 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1en79J-0001bs-18 for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:23:50 -0500 Received: from mail-pf0-x242.google.com ([2607:f8b0:400e:c00::242]:45873) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1en79I-0001bL-PS for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:23:48 -0500 Received: by mail-pf0-x242.google.com with SMTP id w83so592216pfi.12 for ; Sat, 17 Feb 2018 10:23:48 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; 
bh=z2zyjf9WQFOCWG11hDLk8qhZRQx78/ZLD2//zJu/aqw=; b=MDZ/nNY3d7UY3Rqv3OzI6wHTPQrdSS2AHfFpYOCxaZ7aC5dBBT59v74eFDUv9UlyL2 agfuO9mokB0vjNCgitmWN1+w9z8Xo4sMLdzGrxg7fjyTp8KKOdvAL6YgsWVeavJgcTHX H2pTEOk7lJ3hnWehZPCUTtpu/oGkfAl+XlRQA= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=z2zyjf9WQFOCWG11hDLk8qhZRQx78/ZLD2//zJu/aqw=; b=kB9aNAn53lUGD6RIfmSrCIc68JJyrwB4lYOzDUlF7kcNIfpftcfQpvGc3OopWj46aO 9pprlBT9MEkIC0baOz8sByXQAH+XTjlZ+ceFxQlM4d5jsYjiwJa/ucWNn/9zTbeAtcnS Q0sK455LyvQpEB6My190WtzHbzmjbnniCrrcNabgH1QnWNTa0sxgMRDgYG9Bh/DL27Ru YdxZHDmjXuEJZzgoXfEJC+cgyUMdfyoWVjkFtgJ0cSakIePZ8KR9vXvNXJvvSsz2KLOj J7ZE5pF2fyBDTiJQg8t68YTneIM6bJ+AiUm4hIe34cGMETbO2/p966fIHHYvYjhZlFOg vptA== X-Gm-Message-State: APf1xPAB5sB/AoD6u1P2vS5UUxmQSxpmr8NYd0WgqAGcGKcIxofJhNxf BQmP7X+0jqoW8hW2OTSrKJ+fkcdGS20= X-Received: by 10.167.130.193 with SMTP id f1mr9609038pfn.241.1518891827522; Sat, 17 Feb 2018 10:23:47 -0800 (PST) Received: from cloudburst.twiddle.net ([50.0.192.64]) by smtp.gmail.com with ESMTPSA id h15sm13466712pfi.56.2018.02.17.10.23.46 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Sat, 17 Feb 2018 10:23:46 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:22:29 -0800 Message-Id: <20180217182323.25885-14-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. 
X-Received-From: 2607:f8b0:400e:c00::242 Subject: [Qemu-devel] [PATCH v2 13/67] target/arm: Implement SVE bitwise shift by wide elements (predicated) X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 21 +++++++++++++++++++++ target/arm/sve_helper.c | 35 +++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 25 +++++++++++++++++++++++++ target/arm/sve.decode | 6 ++++++ 4 files changed, 87 insertions(+) -- 2.14.3 Reviewed-by: Peter Maydell diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 0cc02ee59e..d516580134 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -195,6 +195,27 @@ DEF_HELPER_FLAGS_5(sve_lsl_zpzz_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(sve_lsl_zpzz_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_asr_zpzw_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_asr_zpzw_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_asr_zpzw_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_lsr_zpzw_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_lsr_zpzw_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_lsr_zpzw_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_lsl_zpzw_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_lsl_zpzw_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_lsl_zpzw_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_3(sve_orv_b, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve_orv_h, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve_orv_s, 
TCG_CALL_NO_RWG, i64, ptr, ptr, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 6ea806d12b..3054b3cc99 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -464,6 +464,41 @@ DO_ZPZZ_D(sve_lsl_zpzz_d, uint64_t, DO_LSL) #undef DO_ZPZZ #undef DO_ZPZZ_D +/* Three-operand expander, controlled by a predicate, in which the + * third operand is "wide". That is, for D = N op M, the same 64-bit + * value of M is used with all of the narrower values of N. + */ +#define DO_ZPZW(NAME, TYPE, TYPEW, H, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, uint32_t desc) \ +{ \ + intptr_t i, opr_sz = simd_oprsz(desc); \ + for (i = 0; i < opr_sz; ) { \ + uint8_t pg = *(uint8_t *)(vg + H1(i >> 3)); \ + TYPEW mm = *(TYPEW *)(vm + i); \ + do { \ + if (pg & 1) { \ + TYPE nn = *(TYPE *)(vn + H(i)); \ + *(TYPE *)(vd + H(i)) = OP(nn, mm); \ + } \ + i += sizeof(TYPE), pg >>= sizeof(TYPE); \ + } while (i & 7); \ + } \ +} + +DO_ZPZW(sve_asr_zpzw_b, int8_t, uint64_t, H1, DO_ASR) +DO_ZPZW(sve_lsr_zpzw_b, uint8_t, uint64_t, H1, DO_LSR) +DO_ZPZW(sve_lsl_zpzw_b, uint8_t, uint64_t, H1, DO_LSL) + +DO_ZPZW(sve_asr_zpzw_h, int16_t, uint64_t, H1_2, DO_ASR) +DO_ZPZW(sve_lsr_zpzw_h, uint16_t, uint64_t, H1_2, DO_LSR) +DO_ZPZW(sve_lsl_zpzw_h, uint16_t, uint64_t, H1_2, DO_LSL) + +DO_ZPZW(sve_asr_zpzw_s, int32_t, uint64_t, H1_4, DO_ASR) +DO_ZPZW(sve_lsr_zpzw_s, uint32_t, uint64_t, H1_4, DO_LSR) +DO_ZPZW(sve_lsl_zpzw_s, uint32_t, uint64_t, H1_4, DO_LSL) + +#undef DO_ZPZW + /* Two-operand reduction expander, controlled by a predicate. * The difference between TYPERED and TYPERET has to do with * sign-extension. E.g. 
for SMAX, TYPERED must be signed, diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 08c56e55a0..35bcd9229d 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -473,6 +473,31 @@ static void trans_ASRD(DisasContext *s, arg_rpri_esz *a, uint32_t insn) } } +/* + *** SVE Bitwise Shift - Predicated Group + */ + +#define DO_ZPZW(NAME, name) \ +static void trans_##NAME##_zpzw(DisasContext *s, arg_rprr_esz *a, \ + uint32_t insn) \ +{ \ + static gen_helper_gvec_4 * const fns[3] = { \ + gen_helper_sve_##name##_zpzw_b, gen_helper_sve_##name##_zpzw_h, \ + gen_helper_sve_##name##_zpzw_s, \ + }; \ + if (a->esz >= 0 && a->esz < 3) { \ + do_zpzz_ool(s, a, fns[a->esz]); \ + } else { \ + unallocated_encoding(s); \ + } \ +} + +DO_ZPZW(ASR, asr) +DO_ZPZW(LSR, lsr) +DO_ZPZW(LSL, lsl) + +#undef DO_ZPZW + /* *** SVE Predicate Logical Operations Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 7ddff8e6bb..177f338fed 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -159,6 +159,12 @@ ASR_zpzz 00000100 .. 010 100 100 ... ..... ..... @rdm_pg_rn # ASRR LSR_zpzz 00000100 .. 010 101 100 ... ..... ..... @rdm_pg_rn # LSRR LSL_zpzz 00000100 .. 010 111 100 ... ..... ..... @rdm_pg_rn # LSLR +# SVE bitwise shift by wide elements (predicated) +# Note these require size != 3. +ASR_zpzw 00000100 .. 011 000 100 ... ..... ..... @rdn_pg_rm +LSR_zpzw 00000100 .. 011 001 100 ... ..... ..... @rdn_pg_rm +LSL_zpzw 00000100 .. 011 011 100 ... ..... ..... 
@rdn_pg_rm + ### SVE Logical - Unpredicated Group # SVE bitwise logical operations (unpredicated) From patchwork Sat Feb 17 18:22:30 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 128693 Delivered-To: patch@linaro.org Received: by 10.46.124.24 with SMTP id x24csp1828221ljc; Sat, 17 Feb 2018 10:43:54 -0800 (PST) X-Google-Smtp-Source: AH8x225n9bjLbkV+jf+0r5gPe2NUxVj+8XKRCbPNHZmtG2VtKqONHi08MrrILDxcV9rhzT9MqD6n X-Received: by 10.129.201.2 with SMTP id o2mr7573392ywi.2.1518893034236; Sat, 17 Feb 2018 10:43:54 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1518893034; cv=none; d=google.com; s=arc-20160816; b=nXxVyPDQIobrLENKIrrP+XEBIdrvcmW49ShASh3r+bcNu7mlMHIbe8eYhOCuvioX5G nO69ysM4RQOMkpJTR8B8+RiKFZZCOXDXy7u7IP6XBJSNza/H+HaBqnK317UJrQ0D08ug o/LvNlN8xXqBK9wz8bx0waViQlZb4NlfipP2y2fOLspMJysGfYPOEmBZmefL/YGYIvkL Q+H3dwBb/hzE/fdKVuAz76FG2h95drO9qm5P9zxeKkcUijyBBDTBxswulBNxunyUmrie eNxKhQAROEqZfXFy15UrWHqZv1dyjus1Q9n7W5FxZ+oqFtF8hhg26Vpj/8Xva69Q2vo9 O8SA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature:arc-authentication-results; bh=FjMo7EZbo9fN48yISfUcPdtxkrSfQEMIR1b8FIyft8c=; b=hNfRLGn6a5pIv4U6QHCZoNrswoqE/HurYRuwBBT248KLulBv3oQ454mcm0QfaoHvto jGmHf7kFrKwmzlFF2yXOJ//0umPv9O66T+Ls3vd07oMzCWJs09CJDA7QNbE5PIbveayP HktkNR6lzOx5mOkVbntfAOw9C4IXd4Zg+GDVZDykuaRqws34AGkIdE1MvqcXl2l5teXe gQ5am+wm8LYpIXZbXoJ9SL/VFXSriUU3+JZgsQKW8S6vuu8+F31+zimQuWyEbMhskNGe dvOp1F7Xf9lcr+1iken69I8vgQVclm3ot2Nv6vLVwSQ53ugMEyR8q942unpE+s6Y0AKe CbnQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=gPSWg/L2; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted 
sender) smtp.mailfrom=qemu-devel-bounces+patch=linaro.org@nongnu.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [2001:4830:134:3::11]) by mx.google.com with ESMTPS id k5si3174436ywl.328.2018.02.17.10.43.54 for (version=TLS1 cipher=AES128-SHA bits=128/128); Sat, 17 Feb 2018 10:43:54 -0800 (PST) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) client-ip=2001:4830:134:3::11; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=gPSWg/L2; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom=qemu-devel-bounces+patch=linaro.org@nongnu.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:48198 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en7Sj-0007fA-If for patch@linaro.org; Sat, 17 Feb 2018 13:43:53 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39798) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en79M-0000Sb-PP for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:23:54 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1en79K-0001cw-Uq for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:23:52 -0500 Received: from mail-pf0-x244.google.com ([2607:f8b0:400e:c00::244]:42839) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1en79K-0001cW-Mb for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:23:50 -0500 Received: by mail-pf0-x244.google.com with SMTP id b25so592216pfd.9 for ; Sat, 17 Feb 2018 10:23:50 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=FjMo7EZbo9fN48yISfUcPdtxkrSfQEMIR1b8FIyft8c=; 
b=gPSWg/L2BgM8dScJdxcZ8TbIij2E91/NkzUYycCIBTItntP65/dQZw9rtLLIgZFtrK fZlB4fVpkPbQYDbP7ZvGvq6wdpH6lfMoMUE+XpyUZIwqRzzsul0nBoBzxztQgac/bWI+ xV23aIo/L7K7d9G/zKkkjHUcI0fGIK21d+ReE= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=FjMo7EZbo9fN48yISfUcPdtxkrSfQEMIR1b8FIyft8c=; b=UqaZ7J4mViIZsAZG4wZxjI5VgyeX5LCnvg0uwceTNJMgNeTIvI83MQLzthewjC+klO z6z8SRWmmNDDzq6c9JFy9CZL71UrQ56/r1X296gOzoA9dsJO45cEyY4eb++4ZvYOzxpm XVMAOAT4ANVqYh5sBkHEJplhRi8ymV0b6sj8bbGaig8F+CdvqTrZMRR2LvJ2UQcXvt11 3IBf3LIqO7btRaNReUlh+8EWnpOCJf5YKwm/8K1xRIFAOg8KAG2Jvpg6tCHrvSLXdaLk 60gAfznJEonsKQzJrMB5Xu6ShOUjyPbuBDDWck2QA/Hap1RsNhDeDHnTDPC4oUS8xSEI 6BTA== X-Gm-Message-State: APf1xPDbTvjJ6JyDsSDpYRYDOWifSPrGnIAT0y3CKgTdk46HfPujqZME EfKQlKhzXaOqs8LQcyuwc7swXW/BunI= X-Received: by 10.98.83.6 with SMTP id h6mr5098866pfb.174.1518891829303; Sat, 17 Feb 2018 10:23:49 -0800 (PST) Received: from cloudburst.twiddle.net ([50.0.192.64]) by smtp.gmail.com with ESMTPSA id h15sm13466712pfi.56.2018.02.17.10.23.47 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Sat, 17 Feb 2018 10:23:48 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:22:30 -0800 Message-Id: <20180217182323.25885-15-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. 
Subject: [Qemu-devel] [PATCH v2 14/67] target/arm: Implement SVE Integer Arithmetic - Unary Predicated Group
Cc: qemu-arm@nongnu.org

Signed-off-by: Richard Henderson
---
 target/arm/helper-sve.h    |  60 +++++++++++++++++++++
 target/arm/sve_helper.c    | 127 +++++++++++++++++++++++++++++++++++++++++++++
 target/arm/translate-sve.c | 111 +++++++++++++++++++++++++++++++++++++++
 target/arm/sve.decode      |  23 ++++++++
 4 files changed, 321 insertions(+)

-- 
2.14.3

Reviewed-by: Peter Maydell

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index d516580134..11644125d1 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -285,6 +285,66 @@ DEF_HELPER_FLAGS_4(sve_asrd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(sve_asrd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(sve_asrd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_4(sve_cls_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_cls_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_cls_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_cls_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(sve_clz_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_clz_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_clz_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_clz_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(sve_cnt_zpz_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_cnt_zpz_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_cnt_zpz_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_cnt_zpz_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(sve_cnot_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_cnot_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_cnot_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_cnot_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(sve_fabs_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_fabs_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_fabs_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(sve_fneg_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_fneg_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_fneg_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(sve_not_zpz_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_not_zpz_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_not_zpz_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_not_zpz_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(sve_sxtb_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_sxtb_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_sxtb_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(sve_uxtb_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_uxtb_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_uxtb_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(sve_sxth_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_sxth_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(sve_uxth_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_uxth_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(sve_sxtw_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_uxtw_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(sve_abs_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_abs_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_abs_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_abs_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(sve_neg_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_neg_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_neg_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_neg_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
 DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index 3054b3cc99..e11823a727 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -499,6 +499,133 @@ DO_ZPZW(sve_lsl_zpzw_s, uint32_t, uint64_t, H1_4, DO_LSL)
 
 #undef DO_ZPZW
 
+/* Fully general two-operand expander, controlled by a predicate.
+ */
+#define DO_ZPZ(NAME, TYPE, H, OP) \
+void HELPER(NAME)(void *vd, void *vn, void *vg, uint32_t desc) \
+{ \
+    intptr_t i, opr_sz = simd_oprsz(desc); \
+    for (i = 0; i < opr_sz; ) { \
+        uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \
+        do { \
+            if (pg & 1) { \
+                TYPE nn = *(TYPE *)(vn + H(i)); \
+                *(TYPE *)(vd + H(i)) = OP(nn); \
+            } \
+            i += sizeof(TYPE), pg >>= sizeof(TYPE); \
+        } while (i & 15); \
+    } \
+}
+
+/* Similarly, specialized for 64-bit operands. */
+#define DO_ZPZ_D(NAME, TYPE, OP) \
+void HELPER(NAME)(void *vd, void *vn, void *vg, uint32_t desc) \
+{ \
+    intptr_t i, opr_sz = simd_oprsz(desc) / 8; \
+    TYPE *d = vd, *n = vn; \
+    uint8_t *pg = vg; \
+    for (i = 0; i < opr_sz; i += 1) { \
+        if (pg[H1(i)] & 1) { \
+            TYPE nn = n[i]; \
+            d[i] = OP(nn); \
+        } \
+    } \
+}
+
+#define DO_CLS_B(N) (clrsb32(N) - 24)
+#define DO_CLS_H(N) (clrsb32(N) - 16)
+
+DO_ZPZ(sve_cls_b, int8_t, H1, DO_CLS_B)
+DO_ZPZ(sve_cls_h, int16_t, H1_2, DO_CLS_H)
+DO_ZPZ(sve_cls_s, int32_t, H1_4, clrsb32)
+DO_ZPZ_D(sve_cls_d, int64_t, clrsb64)
+
+#define DO_CLZ_B(N) (clz32(N) - 24)
+#define DO_CLZ_H(N) (clz32(N) - 16)
+
+DO_ZPZ(sve_clz_b, uint8_t, H1, DO_CLZ_B)
+DO_ZPZ(sve_clz_h, uint16_t, H1_2, DO_CLZ_H)
+DO_ZPZ(sve_clz_s, uint32_t, H1_4, clz32)
+DO_ZPZ_D(sve_clz_d, uint64_t, clz64)
+
+DO_ZPZ(sve_cnt_zpz_b, uint8_t, H1, ctpop8)
+DO_ZPZ(sve_cnt_zpz_h, uint16_t, H1_2, ctpop16)
+DO_ZPZ(sve_cnt_zpz_s, uint32_t, H1_4, ctpop32)
+DO_ZPZ_D(sve_cnt_zpz_d, uint64_t, ctpop64)
+
+#define DO_CNOT(N) (N == 0)
+
+DO_ZPZ(sve_cnot_b, uint8_t, H1, DO_CNOT)
+DO_ZPZ(sve_cnot_h, uint16_t, H1_2, DO_CNOT)
+DO_ZPZ(sve_cnot_s, uint32_t, H1_4, DO_CNOT)
+DO_ZPZ_D(sve_cnot_d, uint64_t, DO_CNOT)
+
+#define DO_FABS(N) (N & ((__typeof(N))-1 >> 1))
+
+DO_ZPZ(sve_fabs_h, uint16_t, H1_2, DO_FABS)
+DO_ZPZ(sve_fabs_s, uint32_t, H1_4, DO_FABS)
+DO_ZPZ_D(sve_fabs_d, uint64_t, DO_FABS)
+
+#define DO_FNEG(N) (N ^ ~((__typeof(N))-1 >> 1))
+
+DO_ZPZ(sve_fneg_h, uint16_t, H1_2, DO_FNEG)
+DO_ZPZ(sve_fneg_s, uint32_t, H1_4, DO_FNEG)
+DO_ZPZ_D(sve_fneg_d, uint64_t, DO_FNEG)
+
+#define DO_NOT(N) (~N)
+
+DO_ZPZ(sve_not_zpz_b, uint8_t, H1, DO_NOT)
+DO_ZPZ(sve_not_zpz_h, uint16_t, H1_2, DO_NOT)
+DO_ZPZ(sve_not_zpz_s, uint32_t, H1_4, DO_NOT)
+DO_ZPZ_D(sve_not_zpz_d, uint64_t, DO_NOT)
+
+#define DO_SXTB(N) ((int8_t)N)
+#define DO_SXTH(N) ((int16_t)N)
+#define DO_SXTS(N) ((int32_t)N)
+#define DO_UXTB(N) ((uint8_t)N)
+#define DO_UXTH(N) ((uint16_t)N)
+#define DO_UXTS(N) ((uint32_t)N)
+
+DO_ZPZ(sve_sxtb_h, uint16_t, H1_2, DO_SXTB)
+DO_ZPZ(sve_sxtb_s, uint32_t, H1_4, DO_SXTB)
+DO_ZPZ(sve_sxth_s, uint32_t, H1_4, DO_SXTH)
+DO_ZPZ_D(sve_sxtb_d, uint64_t, DO_SXTB)
+DO_ZPZ_D(sve_sxth_d, uint64_t, DO_SXTH)
+DO_ZPZ_D(sve_sxtw_d, uint64_t, DO_SXTS)
+
+DO_ZPZ(sve_uxtb_h, uint16_t, H1_2, DO_UXTB)
+DO_ZPZ(sve_uxtb_s, uint32_t, H1_4, DO_UXTB)
+DO_ZPZ(sve_uxth_s, uint32_t, H1_4, DO_UXTH)
+DO_ZPZ_D(sve_uxtb_d, uint64_t, DO_UXTB)
+DO_ZPZ_D(sve_uxth_d, uint64_t, DO_UXTH)
+DO_ZPZ_D(sve_uxtw_d, uint64_t, DO_UXTS)
+
+#define DO_ABS(N) (N < 0 ? -N : N)
+
+DO_ZPZ(sve_abs_b, int8_t, H1, DO_ABS)
+DO_ZPZ(sve_abs_h, int16_t, H1_2, DO_ABS)
+DO_ZPZ(sve_abs_s, int32_t, H1_4, DO_ABS)
+DO_ZPZ_D(sve_abs_d, int64_t, DO_ABS)
+
+#define DO_NEG(N) (-N)
+
+DO_ZPZ(sve_neg_b, uint8_t, H1, DO_NEG)
+DO_ZPZ(sve_neg_h, uint16_t, H1_2, DO_NEG)
+DO_ZPZ(sve_neg_s, uint32_t, H1_4, DO_NEG)
+DO_ZPZ_D(sve_neg_d, uint64_t, DO_NEG)
+
+#undef DO_CLS_B
+#undef DO_CLS_H
+#undef DO_CLZ_B
+#undef DO_CLZ_H
+#undef DO_CNOT
+#undef DO_FABS
+#undef DO_FNEG
+#undef DO_ABS
+#undef DO_NEG
+#undef DO_ZPZ
+#undef DO_ZPZ_D
+
 /* Two-operand reduction expander, controlled by a predicate.
  * The difference between TYPERED and TYPERET has to do with
  * sign-extension. E.g. for SMAX, TYPERED must be signed,
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index 35bcd9229d..dce8ba8dc0 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -304,6 +304,117 @@ void trans_UDIV_zpzz(DisasContext *s, arg_rprr_esz *a, uint32_t insn)
 
 #undef DO_ZPZZ
 
+/*
+ *** SVE Integer Arithmetic - Unary Predicated Group
+ */
+
+static void do_zpz_ool(DisasContext *s, arg_rpr_esz *a, gen_helper_gvec_3 *fn)
+{
+    unsigned vsz = vec_full_reg_size(s);
+    if (fn == NULL) {
+        unallocated_encoding(s);
+        return;
+    }
+    tcg_gen_gvec_3_ool(vec_full_reg_offset(s, a->rd),
+                       vec_full_reg_offset(s, a->rn),
+                       pred_full_reg_offset(s, a->pg),
+                       vsz, vsz, 0, fn);
+}
+
+#define DO_ZPZ(NAME, name) \
+static void trans_##NAME(DisasContext *s, arg_rpr_esz *a, uint32_t insn) \
+{ \
+    static gen_helper_gvec_3 * const fns[4] = { \
+        gen_helper_sve_##name##_b, gen_helper_sve_##name##_h, \
+        gen_helper_sve_##name##_s, gen_helper_sve_##name##_d, \
+    }; \
+    do_zpz_ool(s, a, fns[a->esz]); \
+}
+
+DO_ZPZ(CLS, cls)
+DO_ZPZ(CLZ, clz)
+DO_ZPZ(CNT_zpz, cnt_zpz)
+DO_ZPZ(CNOT, cnot)
+DO_ZPZ(NOT_zpz, not_zpz)
+DO_ZPZ(ABS, abs)
+DO_ZPZ(NEG, neg)
+
+static void trans_FABS(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    static gen_helper_gvec_3 * const fns[4] = {
+        NULL,
+        gen_helper_sve_fabs_h,
+        gen_helper_sve_fabs_s,
+        gen_helper_sve_fabs_d
+    };
+    do_zpz_ool(s, a, fns[a->esz]);
+}
+
+static void trans_FNEG(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    static gen_helper_gvec_3 * const fns[4] = {
+        NULL,
+        gen_helper_sve_fneg_h,
+        gen_helper_sve_fneg_s,
+        gen_helper_sve_fneg_d
+    };
+    do_zpz_ool(s, a, fns[a->esz]);
+}
+
+static void trans_SXTB(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    static gen_helper_gvec_3 * const fns[4] = {
+        NULL,
+        gen_helper_sve_sxtb_h,
+        gen_helper_sve_sxtb_s,
+        gen_helper_sve_sxtb_d
+    };
+    do_zpz_ool(s, a, fns[a->esz]);
+}
+
+static void trans_UXTB(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    static gen_helper_gvec_3 * const fns[4] = {
+        NULL,
+        gen_helper_sve_uxtb_h,
+        gen_helper_sve_uxtb_s,
+        gen_helper_sve_uxtb_d
+    };
+    do_zpz_ool(s, a, fns[a->esz]);
+}
+
+static void trans_SXTH(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    static gen_helper_gvec_3 * const fns[4] = {
+        NULL, NULL,
+        gen_helper_sve_sxth_s,
+        gen_helper_sve_sxth_d
+    };
+    do_zpz_ool(s, a, fns[a->esz]);
+}
+
+static void trans_UXTH(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    static gen_helper_gvec_3 * const fns[4] = {
+        NULL, NULL,
+        gen_helper_sve_uxth_s,
+        gen_helper_sve_uxth_d
+    };
+    do_zpz_ool(s, a, fns[a->esz]);
+}
+
+static void trans_SXTW(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    do_zpz_ool(s, a, a->esz == 3 ? gen_helper_sve_sxtw_d : NULL);
+}
+
+static void trans_UXTW(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    do_zpz_ool(s, a, a->esz == 3 ? gen_helper_sve_uxtw_d : NULL);
+}
+
+#undef DO_ZPZ
+
 /*
  *** SVE Integer Reduction Group
  */
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index 177f338fed..b875501475 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -165,6 +165,29 @@ ASR_zpzw 00000100 .. 011 000 100 ... ..... ..... @rdn_pg_rm
 LSR_zpzw 00000100 .. 011 001 100 ... ..... ..... @rdn_pg_rm
 LSL_zpzw 00000100 .. 011 011 100 ... ..... ..... @rdn_pg_rm
 
+### SVE Integer Arithmetic - Unary Predicated Group
+
+# SVE unary bit operations (predicated)
+# Note esz != 0 for FABS and FNEG.
+CLS 00000100 .. 011 000 101 ... ..... ..... @rd_pg_rn
+CLZ 00000100 .. 011 001 101 ... ..... ..... @rd_pg_rn
+CNT_zpz 00000100 .. 011 010 101 ... ..... ..... @rd_pg_rn
+CNOT 00000100 .. 011 011 101 ... ..... ..... @rd_pg_rn
+NOT_zpz 00000100 .. 011 110 101 ... ..... ..... @rd_pg_rn
+FABS 00000100 .. 011 100 101 ... ..... ..... @rd_pg_rn
+FNEG 00000100 .. 011 101 101 ... ..... ..... @rd_pg_rn
+
+# SVE integer unary operations (predicated)
+# Note esz > original size for extensions.
+ABS 00000100 .. 010 110 101 ... ..... ..... @rd_pg_rn
+NEG 00000100 .. 010 111 101 ... ..... ..... @rd_pg_rn
+SXTB 00000100 .. 010 000 101 ... ..... ..... @rd_pg_rn
+UXTB 00000100 .. 010 001 101 ... ..... ..... @rd_pg_rn
+SXTH 00000100 .. 010 010 101 ... ..... ..... @rd_pg_rn
+UXTH 00000100 .. 010 011 101 ... ..... ..... @rd_pg_rn
+SXTW 00000100 .. 010 100 101 ... ..... ..... @rd_pg_rn
+UXTW 00000100 .. 010 101 101 ... ..... ..... @rd_pg_rn
+
 ### SVE Logical - Unpredicated Group
 
 # SVE bitwise logical operations (unpredicated)

From patchwork Sat Feb 17 18:22:31 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 128673
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Sat, 17 Feb 2018 10:22:31 -0800
Message-Id: <20180217182323.25885-16-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.14.3
In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org>
References: <20180217182323.25885-1-richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH v2 15/67] target/arm: Implement SVE Integer Multiply-Add Group
Cc: qemu-arm@nongnu.org

Signed-off-by: Richard Henderson
---
 target/arm/helper-sve.h    | 18 ++++++++++++++
 target/arm/sve_helper.c    | 58 +++++++++++++++++++++++++++++++++++++++++++++-
 target/arm/translate-sve.c | 31 +++++++++++++++++++++++++
 target/arm/sve.decode      | 17 ++++++++++++++
 4 files changed, 123 insertions(+), 1 deletion(-)

-- 
2.14.3

Reviewed-by: Peter Maydell

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index 11644125d1..b31d497f31 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -345,6 +345,24 @@ DEF_HELPER_FLAGS_4(sve_neg_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(sve_neg_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(sve_neg_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_6(sve_mla_b, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_mla_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_mla_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_mla_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_6(sve_mls_b, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_mls_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_mls_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_mls_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, ptr, i32)
+
 DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index e11823a727..4b08a38ce8 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -932,6 +932,62 @@ DO_ZPZI_D(sve_asrd_d, int64_t, DO_ASRD)
 #undef DO_SHR
 #undef DO_SHL
 #undef DO_ASRD
-
 #undef DO_ZPZI
 #undef DO_ZPZI_D
+
+/* Fully general four-operand expander, controlled by a predicate.
+ */
+#define DO_ZPZZZ(NAME, TYPE, H, OP) \
+void HELPER(NAME)(void *vd, void *va, void *vn, void *vm, \
+                  void *vg, uint32_t desc) \
+{ \
+    intptr_t i, opr_sz = simd_oprsz(desc); \
+    for (i = 0; i < opr_sz; ) { \
+        uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \
+        do { \
+            if (pg & 1) { \
+                TYPE nn = *(TYPE *)(vn + H(i)); \
+                TYPE mm = *(TYPE *)(vm + H(i)); \
+                TYPE aa = *(TYPE *)(va + H(i)); \
+                *(TYPE *)(vd + H(i)) = OP(aa, nn, mm); \
+            } \
+            i += sizeof(TYPE), pg >>= sizeof(TYPE); \
+        } while (i & 15); \
+    } \
+}
+
+/* Similarly, specialized for 64-bit operands. */
+#define DO_ZPZZZ_D(NAME, TYPE, OP) \
+void HELPER(NAME)(void *vd, void *va, void *vn, void *vm, \
+                  void *vg, uint32_t desc) \
+{ \
+    intptr_t i, opr_sz = simd_oprsz(desc) / 8; \
+    TYPE *d = vd, *a = va, *n = vn, *m = vm; \
+    uint8_t *pg = vg; \
+    for (i = 0; i < opr_sz; i += 1) { \
+        if (pg[H1(i)] & 1) { \
+            TYPE aa = a[i], nn = n[i], mm = m[i]; \
+            d[i] = OP(aa, nn, mm); \
+        } \
+    } \
+}
+
+#define DO_MLA(A, N, M) (A + N * M)
+#define DO_MLS(A, N, M) (A - N * M)
+
+DO_ZPZZZ(sve_mla_b, uint8_t, H1, DO_MLA)
+DO_ZPZZZ(sve_mls_b, uint8_t, H1, DO_MLS)
+
+DO_ZPZZZ(sve_mla_h, uint16_t, H1_2, DO_MLA)
+DO_ZPZZZ(sve_mls_h, uint16_t, H1_2, DO_MLS)
+
+DO_ZPZZZ(sve_mla_s, uint32_t, H1_4, DO_MLA)
+DO_ZPZZZ(sve_mls_s, uint32_t, H1_4, DO_MLS)
+
+DO_ZPZZZ_D(sve_mla_d, uint64_t, DO_MLA)
+DO_ZPZZZ_D(sve_mls_d, uint64_t, DO_MLS)
+
+#undef DO_MLA
+#undef DO_MLS
+#undef DO_ZPZZZ
+#undef DO_ZPZZZ_D
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index dce8ba8dc0..b956d87636 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -609,6 +609,37 @@ DO_ZPZW(LSL, lsl)
 
 #undef DO_ZPZW
 
+/*
+ *** SVE Integer Multiply-Add Group
+ */
+
+static void do_zpzzz_ool(DisasContext *s, arg_rprrr_esz *a,
+                         gen_helper_gvec_5 *fn)
+{
+    unsigned vsz = vec_full_reg_size(s);
+    tcg_gen_gvec_5_ool(vec_full_reg_offset(s, a->rd),
+                       vec_full_reg_offset(s, a->ra),
+                       vec_full_reg_offset(s, a->rn),
+                       vec_full_reg_offset(s, a->rm),
+                       pred_full_reg_offset(s, a->pg),
+                       vsz, vsz, 0, fn);
+}
+
+#define DO_ZPZZZ(NAME, name) \
+static void trans_##NAME(DisasContext *s, arg_rprrr_esz *a, uint32_t insn) \
+{ \
+    static gen_helper_gvec_5 * const fns[4] = { \
+        gen_helper_sve_##name##_b, gen_helper_sve_##name##_h, \
+        gen_helper_sve_##name##_s, gen_helper_sve_##name##_d, \
+    }; \
+    do_zpzzz_ool(s, a, fns[a->esz]); \
+}
+
+DO_ZPZZZ(MLA, mla)
+DO_ZPZZZ(MLS, mls)
+
+#undef DO_ZPZZZ
+
 /*
  *** SVE Predicate Logical Operations Group
  */
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index b875501475..68a1823b72 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -48,6 +48,7 @@
 &rpr_esz rd pg rn esz
 &rprr_s rd pg rn rm s
 &rprr_esz rd pg rn rm esz
+&rprrr_esz rd pg rn rm ra esz
 &rpri_esz rd pg rn imm esz
 &ptrue rd esz pat s
 
@@ -73,6 +74,12 @@
 @rdm_pg_rn ........ esz:2 ... ... ... pg:3 rn:5 rd:5 \
         &rprr_esz rm=%reg_movprfx
 
+# Three register operand, with governing predicate, vector element size
+@rda_pg_rn_rm ........ esz:2 . rm:5 ... pg:3 rn:5 rd:5 \
+        &rprrr_esz ra=%reg_movprfx
+@rdn_pg_ra_rm ........ esz:2 . rm:5 ... pg:3 ra:5 rd:5 \
+        &rprrr_esz rn=%reg_movprfx
+
 # One register operand, with governing predicate, vector element size
 @rd_pg_rn ........ esz:2 ... ... ... pg:3 rn:5 rd:5 &rpr_esz
 
@@ -188,6 +195,16 @@ UXTH 00000100 .. 010 011 101 ... ..... ..... @rd_pg_rn
 SXTW 00000100 .. 010 100 101 ... ..... ..... @rd_pg_rn
 UXTW 00000100 .. 010 101 101 ... ..... ..... @rd_pg_rn
 
+### SVE Integer Multiply-Add Group
+
+# SVE integer multiply-add writing addend (predicated)
+MLA 00000100 .. 0 ..... 010 ... ..... ..... @rda_pg_rn_rm
+MLS 00000100 .. 0 ..... 011 ... ..... ..... @rda_pg_rn_rm
+
+# SVE integer multiply-add writing multiplicand (predicated)
+MLA 00000100 .. 0 ..... 110 ... ..... ..... @rdn_pg_ra_rm # MAD
+MLS 00000100 .. 0 ..... 111 ... ..... ..... @rdn_pg_ra_rm # MSB
+
 ### SVE Logical - Unpredicated Group
 
 # SVE bitwise logical operations (unpredicated)

From patchwork Sat Feb 17 18:22:32 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 128679
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Sat, 17 Feb 2018 10:22:32 -0800
Message-Id: <20180217182323.25885-17-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.14.3
In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org>
References: <20180217182323.25885-1-richard.henderson@linaro.org>
X-Received-From: 2607:f8b0:400e:c01::243 Subject: [Qemu-devel] [PATCH v2 16/67] target/arm: Implement SVE Integer Arithmetic - Unpredicated Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/translate-sve.c | 41 ++++++++++++++++++++++++++++++++++++++--- target/arm/sve.decode | 13 +++++++++++++ 2 files changed, 51 insertions(+), 3 deletions(-) -- 2.14.3 Reviewed-by: Peter Maydell diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index b956d87636..8baec6c674 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -235,6 +235,40 @@ static void trans_BIC_zzz(DisasContext *s, arg_BIC_zzz *a, uint32_t insn) do_vector3_z(s, tcg_gen_gvec_andc, 0, a->rd, a->rn, a->rm); } +/* + *** SVE Integer Arithmetic - Unpredicated Group + */ + +static void trans_ADD_zzz(DisasContext *s, arg_rrr_esz *a, uint32_t insn) +{ + do_vector3_z(s, tcg_gen_gvec_add, a->esz, a->rd, a->rn, a->rm); +} + +static void trans_SUB_zzz(DisasContext *s, arg_rrr_esz *a, uint32_t insn) +{ + do_vector3_z(s, tcg_gen_gvec_sub, a->esz, a->rd, a->rn, a->rm); +} + +static void trans_SQADD_zzz(DisasContext *s, arg_rrr_esz *a, uint32_t insn) +{ + do_vector3_z(s, tcg_gen_gvec_ssadd, a->esz, a->rd, a->rn, a->rm); +} + +static void trans_SQSUB_zzz(DisasContext *s, arg_rrr_esz *a, uint32_t insn) +{ + do_vector3_z(s, tcg_gen_gvec_sssub, a->esz, a->rd, a->rn, a->rm); +} + +static void trans_UQADD_zzz(DisasContext *s, arg_rrr_esz *a, uint32_t insn) +{ + do_vector3_z(s, tcg_gen_gvec_usadd, a->esz, a->rd, a->rn, a->rm); +} + +static void trans_UQSUB_zzz(DisasContext *s, arg_rrr_esz *a, uint32_t insn) +{ + do_vector3_z(s, tcg_gen_gvec_ussub, a->esz, a->rd, a->rn, a->rm); +} + /* *** SVE Integer Arithmetic - Binary Predicated 
Group */ @@ -254,7 +288,8 @@ static void do_zpzz_ool(DisasContext *s, arg_rprr_esz *a, gen_helper_gvec_4 *fn) } #define DO_ZPZZ(NAME, name) \ -void trans_##NAME##_zpzz(DisasContext *s, arg_rprr_esz *a, uint32_t insn) \ +static void trans_##NAME##_zpzz(DisasContext *s, arg_rprr_esz *a, \ + uint32_t insn) \ { \ static gen_helper_gvec_4 * const fns[4] = { \ gen_helper_sve_##name##_zpzz_b, gen_helper_sve_##name##_zpzz_h, \ @@ -286,7 +321,7 @@ DO_ZPZZ(ASR, asr) DO_ZPZZ(LSR, lsr) DO_ZPZZ(LSL, lsl) -void trans_SDIV_zpzz(DisasContext *s, arg_rprr_esz *a, uint32_t insn) +static void trans_SDIV_zpzz(DisasContext *s, arg_rprr_esz *a, uint32_t insn) { static gen_helper_gvec_4 * const fns[4] = { NULL, NULL, gen_helper_sve_sdiv_zpzz_s, gen_helper_sve_sdiv_zpzz_d @@ -294,7 +329,7 @@ void trans_SDIV_zpzz(DisasContext *s, arg_rprr_esz *a, uint32_t insn) do_zpzz_ool(s, a, fns[a->esz]); } -void trans_UDIV_zpzz(DisasContext *s, arg_rprr_esz *a, uint32_t insn) +static void trans_UDIV_zpzz(DisasContext *s, arg_rprr_esz *a, uint32_t insn) { static gen_helper_gvec_4 * const fns[4] = { NULL, NULL, gen_helper_sve_udiv_zpzz_s, gen_helper_sve_udiv_zpzz_d diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 68a1823b72..b40d7dc9a2 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -68,6 +68,9 @@ # Three prediate operand, with governing predicate, flag setting @pd_pg_pn_pm_s ........ . s:1 .. rm:4 .. pg:4 . rn:4 . rd:4 &rprr_s +# Three operand, vector element size +@rd_rn_rm ........ esz:2 . rm:5 ... ... rn:5 rd:5 &rrr_esz + # Two register operand, with governing predicate, vector element size @rdn_pg_rm ........ esz:2 ... ... ... pg:3 rm:5 rd:5 \ &rprr_esz rn=%reg_movprfx @@ -205,6 +208,16 @@ MLS 00000100 .. 0 ..... 011 ... ..... ..... @rda_pg_rn_rm MLA 00000100 .. 0 ..... 110 ... ..... ..... @rdn_pg_ra_rm # MAD MLS 00000100 .. 0 ..... 111 ... ..... ..... 
@rdn_pg_ra_rm # MSB +### SVE Integer Arithmetic - Unpredicated Group + +# SVE integer add/subtract vectors (unpredicated) +ADD_zzz 00000100 .. 1 ..... 000 000 ..... ..... @rd_rn_rm +SUB_zzz 00000100 .. 1 ..... 000 001 ..... ..... @rd_rn_rm +SQADD_zzz 00000100 .. 1 ..... 000 100 ..... ..... @rd_rn_rm +UQADD_zzz 00000100 .. 1 ..... 000 101 ..... ..... @rd_rn_rm +SQSUB_zzz 00000100 .. 1 ..... 000 110 ..... ..... @rd_rn_rm +UQSUB_zzz 00000100 .. 1 ..... 000 111 ..... ..... @rd_rn_rm + ### SVE Logical - Unpredicated Group # SVE bitwise logical operations (unpredicated)
From patchwork Sat Feb 17 18:22:33 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 128698
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Sat, 17 Feb 2018 10:22:33 -0800
Message-Id: <20180217182323.25885-18-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.14.3
In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org>
References: <20180217182323.25885-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c01::243 Subject: [Qemu-devel] [PATCH v2 17/67] target/arm: Implement SVE Index Generation Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 5 ++++ target/arm/sve_helper.c | 40 +++++++++++++++++++++++++++ target/arm/translate-sve.c | 67 ++++++++++++++++++++++++++++++++++++++++++++++ target/arm/sve.decode | 14 ++++++++++ 4 files changed, 126 insertions(+) -- 2.14.3 Reviewed-by: Peter Maydell diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index b31d497f31..2a2dbe98dd 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -363,6 +363,11 @@ DEF_HELPER_FLAGS_6(sve_mls_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_6(sve_mls_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_index_b, TCG_CALL_NO_RWG, void, ptr, i32, i32, i32) +DEF_HELPER_FLAGS_4(sve_index_h, TCG_CALL_NO_RWG, void, ptr, i32, i32, i32) +DEF_HELPER_FLAGS_4(sve_index_s, TCG_CALL_NO_RWG, void, ptr, i32, i32, i32) +DEF_HELPER_FLAGS_4(sve_index_d, TCG_CALL_NO_RWG, void, ptr, i64, i64, i32) + DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 4b08a38ce8..950012e70a 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -991,3 +991,43 @@ DO_ZPZZZ_D(sve_mls_d, uint64_t, DO_MLS) #undef DO_MLS #undef DO_ZPZZZ #undef DO_ZPZZZ_D + +void 
HELPER(sve_index_b)(void *vd, uint32_t start, + uint32_t incr, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + uint8_t *d = vd; + for (i = 0; i < opr_sz; i += 1) { + d[H1(i)] = start + i * incr; + } +} + +void HELPER(sve_index_h)(void *vd, uint32_t start, + uint32_t incr, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc) / 2; + uint16_t *d = vd; + for (i = 0; i < opr_sz; i += 1) { + d[H2(i)] = start + i * incr; + } +} + +void HELPER(sve_index_s)(void *vd, uint32_t start, + uint32_t incr, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc) / 4; + uint32_t *d = vd; + for (i = 0; i < opr_sz; i += 1) { + d[H4(i)] = start + i * incr; + } +} + +void HELPER(sve_index_d)(void *vd, uint64_t start, + uint64_t incr, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc) / 8; + uint64_t *d = vd; + for (i = 0; i < opr_sz; i += 1) { + d[i] = start + i * incr; + } +} diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 8baec6c674..773f0bfded 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -675,6 +675,73 @@ DO_ZPZZZ(MLS, mls) #undef DO_ZPZZZ +/* + *** SVE Index Generation Group + */ + +static void do_index(DisasContext *s, int esz, int rd, + TCGv_i64 start, TCGv_i64 incr) +{ + unsigned vsz = vec_full_reg_size(s); + TCGv_i32 desc = tcg_const_i32(simd_desc(vsz, vsz, 0)); + TCGv_ptr t_zd = tcg_temp_new_ptr(); + + tcg_gen_addi_ptr(t_zd, cpu_env, vec_full_reg_offset(s, rd)); + if (esz == 3) { + gen_helper_sve_index_d(t_zd, start, incr, desc); + } else { + typedef void index_fn(TCGv_ptr, TCGv_i32, TCGv_i32, TCGv_i32); + static index_fn * const fns[3] = { + gen_helper_sve_index_b, + gen_helper_sve_index_h, + gen_helper_sve_index_s, + }; + TCGv_i32 s32 = tcg_temp_new_i32(); + TCGv_i32 i32 = tcg_temp_new_i32(); + + tcg_gen_extrl_i64_i32(s32, start); + tcg_gen_extrl_i64_i32(i32, incr); + fns[esz](t_zd, s32, i32, desc); + + tcg_temp_free_i32(s32); + tcg_temp_free_i32(i32); + } + tcg_temp_free_ptr(t_zd); + 
tcg_temp_free_i32(desc); +} + +static void trans_INDEX_ii(DisasContext *s, arg_INDEX_ii *a, uint32_t insn) +{ + TCGv_i64 start = tcg_const_i64(a->imm1); + TCGv_i64 incr = tcg_const_i64(a->imm2); + do_index(s, a->esz, a->rd, start, incr); + tcg_temp_free_i64(start); + tcg_temp_free_i64(incr); +} + +static void trans_INDEX_ir(DisasContext *s, arg_INDEX_ir *a, uint32_t insn) +{ + TCGv_i64 start = tcg_const_i64(a->imm); + TCGv_i64 incr = cpu_reg(s, a->rm); + do_index(s, a->esz, a->rd, start, incr); + tcg_temp_free_i64(start); +} + +static void trans_INDEX_ri(DisasContext *s, arg_INDEX_ri *a, uint32_t insn) +{ + TCGv_i64 start = cpu_reg(s, a->rn); + TCGv_i64 incr = tcg_const_i64(a->imm); + do_index(s, a->esz, a->rd, start, incr); + tcg_temp_free_i64(incr); +} + +static void trans_INDEX_rr(DisasContext *s, arg_INDEX_rr *a, uint32_t insn) +{ + TCGv_i64 start = cpu_reg(s, a->rn); + TCGv_i64 incr = cpu_reg(s, a->rm); + do_index(s, a->esz, a->rd, start, incr); +} + /* *** SVE Predicate Logical Operations Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index b40d7dc9a2..d7b078e92f 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -226,6 +226,20 @@ ORR_zzz 00000100 01 1 ..... 001 100 ..... ..... @rd_rn_rm_e0 EOR_zzz 00000100 10 1 ..... 001 100 ..... ..... @rd_rn_rm_e0 BIC_zzz 00000100 11 1 ..... 001 100 ..... ..... @rd_rn_rm_e0 +### SVE Index Generation Group + +# SVE index generation (immediate start, immediate increment) +INDEX_ii 00000100 esz:2 1 imm2:s5 010000 imm1:s5 rd:5 + +# SVE index generation (immediate start, register increment) +INDEX_ir 00000100 esz:2 1 rm:5 010010 imm:s5 rd:5 + +# SVE index generation (register start, immediate increment) +INDEX_ri 00000100 esz:2 1 imm:s5 010001 rn:5 rd:5 + +# SVE index generation (register start, register increment) +INDEX_rr 00000100 .. 1 ..... 010011 ..... ..... 
@rd_rn_rm + ### SVE Predicate Logical Operations Group # SVE predicate logical operations
From patchwork Sat Feb 17 18:22:34 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 128685
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Sat, 17 Feb 2018 10:22:34 -0800
Message-Id: <20180217182323.25885-19-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.14.3
In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org>
References: <20180217182323.25885-1-richard.henderson@linaro.org>
X-Received-From: 2607:f8b0:400e:c00::244 Subject: [Qemu-devel] [PATCH v2 18/67] target/arm: Implement SVE Stack Allocation Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/translate-sve.c | 24 ++++++++++++++++++++++++ target/arm/sve.decode | 12 ++++++++++++ 2 files changed, 36 insertions(+) -- 2.14.3 Reviewed-by: Peter Maydell diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 773f0bfded..4a38020c8a 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -742,6 +742,30 @@ static void trans_INDEX_rr(DisasContext *s, arg_INDEX_rr *a, uint32_t insn) do_index(s, a->esz, a->rd, start, incr); } +/* + *** SVE Stack Allocation Group + */ + +static void trans_ADDVL(DisasContext *s, arg_ADDVL *a, uint32_t insn) +{ + TCGv_i64 rd = cpu_reg_sp(s, a->rd); + TCGv_i64 rn = cpu_reg_sp(s, a->rn); + tcg_gen_addi_i64(rd, rn, a->imm * vec_full_reg_size(s)); +} + +static void trans_ADDPL(DisasContext *s, arg_ADDPL *a, uint32_t insn) +{ + TCGv_i64 rd = cpu_reg_sp(s, a->rd); + TCGv_i64 rn = cpu_reg_sp(s, a->rn); + tcg_gen_addi_i64(rd, rn, a->imm * pred_full_reg_size(s)); +} + +static void trans_RDVL(DisasContext *s, arg_RDVL *a, uint32_t insn) +{ + TCGv_i64 reg = cpu_reg(s, a->rd); + tcg_gen_movi_i64(reg, a->imm * vec_full_reg_size(s)); +} + /* *** SVE Predicate Logical Operations Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index d7b078e92f..0b47869dcd 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -86,6 +86,9 @@ # One register operand, with governing predicate, vector element size @rd_pg_rn ........ esz:2 ... ... ... pg:3 rn:5 rd:5 &rpr_esz +# Two register operands with a 6-bit signed immediate. +@rd_rn_i6 ........ ... rn:5 ..... 
imm:s6 rd:5 &rri + # Two register operand, one immediate operand, with predicate, # element size encoded as TSZHL. User must fill in imm. @rdn_pg_tszimm ........ .. ... ... ... pg:3 ..... rd:5 \ @@ -240,6 +243,15 @@ INDEX_ri 00000100 esz:2 1 imm:s5 010001 rn:5 rd:5 # SVE index generation (register start, register increment) INDEX_rr 00000100 .. 1 ..... 010011 ..... ..... @rd_rn_rm +### SVE Stack Allocation Group + +# SVE stack frame adjustment +ADDVL 00000100 001 ..... 01010 ...... ..... @rd_rn_i6 +ADDPL 00000100 011 ..... 01010 ...... ..... @rd_rn_i6 + +# SVE stack frame size +RDVL 00000100 101 11111 01010 imm:s6 rd:5 + ### SVE Predicate Logical Operations Group # SVE predicate logical operations
From patchwork Sat Feb 17 18:22:35 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 128696
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Sat, 17 Feb 2018 10:22:35 -0800
Message-Id: <20180217182323.25885-20-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c05::243 Subject: [Qemu-devel] [PATCH v2 19/67] target/arm: Implement SVE Bitwise Shift - Unpredicated Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 12 +++++++ target/arm/sve_helper.c | 30 +++++++++++++++++ target/arm/translate-sve.c | 81 ++++++++++++++++++++++++++++++++++++++++++++++ target/arm/sve.decode | 26 +++++++++++++++ 4 files changed, 149 insertions(+) -- 2.14.3 Reviewed-by: Peter Maydell diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 2a2dbe98dd..00e3cd48bb 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -368,6 +368,18 @@ DEF_HELPER_FLAGS_4(sve_index_h, TCG_CALL_NO_RWG, void, ptr, i32, i32, i32) DEF_HELPER_FLAGS_4(sve_index_s, TCG_CALL_NO_RWG, void, ptr, i32, i32, i32) DEF_HELPER_FLAGS_4(sve_index_d, TCG_CALL_NO_RWG, void, ptr, i64, i64, i32) +DEF_HELPER_FLAGS_4(sve_asr_zzw_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_asr_zzw_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_asr_zzw_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve_lsr_zzw_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_lsr_zzw_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_lsr_zzw_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve_lsl_zzw_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_lsl_zzw_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, 
i32) +DEF_HELPER_FLAGS_4(sve_lsl_zzw_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 950012e70a..4c6e2713fa 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -614,6 +614,36 @@ DO_ZPZ(sve_neg_h, uint16_t, H1_2, DO_NEG) DO_ZPZ(sve_neg_s, uint32_t, H1_4, DO_NEG) DO_ZPZ_D(sve_neg_d, uint64_t, DO_NEG) +/* Three-operand expander, unpredicated, in which the third operand is "wide". + */ +#define DO_ZZW(NAME, TYPE, TYPEW, H, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ +{ \ + intptr_t i, opr_sz = simd_oprsz(desc); \ + for (i = 0; i < opr_sz; ) { \ + TYPEW mm = *(TYPEW *)(vm + i); \ + do { \ + TYPE nn = *(TYPE *)(vn + H(i)); \ + *(TYPE *)(vd + H(i)) = OP(nn, mm); \ + i += sizeof(TYPE); \ + } while (i & 7); \ + } \ +} + +DO_ZZW(sve_asr_zzw_b, int8_t, uint64_t, H1, DO_ASR) +DO_ZZW(sve_lsr_zzw_b, uint8_t, uint64_t, H1, DO_LSR) +DO_ZZW(sve_lsl_zzw_b, uint8_t, uint64_t, H1, DO_LSL) + +DO_ZZW(sve_asr_zzw_h, int16_t, uint64_t, H1_2, DO_ASR) +DO_ZZW(sve_lsr_zzw_h, uint16_t, uint64_t, H1_2, DO_LSR) +DO_ZZW(sve_lsl_zzw_h, uint16_t, uint64_t, H1_2, DO_LSL) + +DO_ZZW(sve_asr_zzw_s, int32_t, uint64_t, H1_4, DO_ASR) +DO_ZZW(sve_lsr_zzw_s, uint32_t, uint64_t, H1_4, DO_LSR) +DO_ZZW(sve_lsl_zzw_s, uint32_t, uint64_t, H1_4, DO_LSL) + +#undef DO_ZZW + #undef DO_CLS_B #undef DO_CLS_H #undef DO_CLZ_B diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 4a38020c8a..43e9f1ad08 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -130,6 +130,13 @@ static void do_mov_z(DisasContext *s, int rd, int rn) do_vector2_z(s, tcg_gen_gvec_mov, 0, rd, rn); } +/* Initialize a Zreg with replications of a 64-bit 
immediate. */ +static void do_dupi_z(DisasContext *s, int rd, uint64_t word) +{ + unsigned vsz = vec_full_reg_size(s); + tcg_gen_gvec_dup64i(vec_full_reg_offset(s, rd), vsz, vsz, word); +} + /* Invoke a vector expander on two Pregs. */ static void do_vector2_p(DisasContext *s, GVecGen2Fn *gvec_fn, int esz, int rd, int rn) @@ -644,6 +651,80 @@ DO_ZPZW(LSL, lsl) #undef DO_ZPZW +/* + *** SVE Bitwise Shift - Unpredicated Group + */ + +static void do_shift_imm(DisasContext *s, arg_rri_esz *a, bool asr, + void (*gvec_fn)(unsigned, uint32_t, uint32_t, + int64_t, uint32_t, uint32_t)) +{ + unsigned vsz = vec_full_reg_size(s); + if (a->esz < 0) { + /* Invalid tsz encoding -- see tszimm_esz. */ + unallocated_encoding(s); + return; + } + /* Shift by element size is architecturally valid. For + arithmetic right-shift, it's the same as by one less. + Otherwise it is a zeroing operation. */ + if (a->imm >= 8 << a->esz) { + if (asr) { + a->imm = (8 << a->esz) - 1; + } else { + do_dupi_z(s, a->rd, 0); + return; + } + } + gvec_fn(a->esz, vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), a->imm, vsz, vsz); +} + +static void trans_ASR_zzi(DisasContext *s, arg_rri_esz *a, uint32_t insn) +{ + do_shift_imm(s, a, true, tcg_gen_gvec_sari); +} + +static void trans_LSR_zzi(DisasContext *s, arg_rri_esz *a, uint32_t insn) +{ + do_shift_imm(s, a, false, tcg_gen_gvec_shri); +} + +static void trans_LSL_zzi(DisasContext *s, arg_rri_esz *a, uint32_t insn) +{ + do_shift_imm(s, a, false, tcg_gen_gvec_shli); +} + +static void do_zzw_ool(DisasContext *s, arg_rrr_esz *a, gen_helper_gvec_3 *fn) +{ + unsigned vsz = vec_full_reg_size(s); + if (fn == NULL) { + unallocated_encoding(s); + return; + } + tcg_gen_gvec_3_ool(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + vsz, vsz, 0, fn); +} + +#define DO_ZZW(NAME, name) \ +static void trans_##NAME##_zzw(DisasContext *s, arg_rrr_esz *a, \ + uint32_t insn) \ +{ \ + static gen_helper_gvec_3 * const 
fns[4] = { \ + gen_helper_sve_##name##_zzw_b, gen_helper_sve_##name##_zzw_h, \ + gen_helper_sve_##name##_zzw_s, NULL \ + }; \ + do_zzw_ool(s, a, fns[a->esz]); \ +} + +DO_ZZW(ASR, asr) +DO_ZZW(LSR, lsr) +DO_ZZW(LSL, lsl) + +#undef DO_ZZW + /* *** SVE Integer Multiply-Add Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 0b47869dcd..f71ea1b60d 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -33,6 +33,11 @@ # A combination of tsz:imm3 -- extract (tsz:imm3) - esize %tszimm_shl 22:2 5:5 !function=tszimm_shl +# Similarly for the tszh/tszl pair at 22/16 for zzi +%tszimm16_esz 22:2 16:5 !function=tszimm_esz +%tszimm16_shr 22:2 16:5 !function=tszimm_shr +%tszimm16_shl 22:2 16:5 !function=tszimm_shl + # Either a copy of rd (at bit 0), or a different source # as propagated via the MOVPRFX instruction. %reg_movprfx 0:5 @@ -44,6 +49,7 @@ &rr_esz rd rn esz &rri rd rn imm +&rri_esz rd rn imm esz &rrr_esz rd rn rm esz &rpr_esz rd pg rn esz &rprr_s rd pg rn rm s @@ -94,6 +100,10 @@ @rdn_pg_tszimm ........ .. ... ... ... pg:3 ..... rd:5 \ &rpri_esz rn=%reg_movprfx esz=%tszimm_esz +# Similarly without predicate. +@rd_rn_tszimm ........ .. ... ... ...... rn:5 rd:5 \ + &rri_esz esz=%tszimm16_esz + # Basic Load/Store with 9-bit immediate offset @pd_rn_i9 ........ ........ ...... rn:5 . rd:4 \ &rri imm=%imm9_16_10 @@ -252,6 +262,22 @@ ADDPL 00000100 011 ..... 01010 ...... ..... @rd_rn_i6 # SVE stack frame size RDVL 00000100 101 11111 01010 imm:s6 rd:5 +### SVE Bitwise Shift - Unpredicated Group + +# SVE bitwise shift by immediate (unpredicated) +ASR_zzi 00000100 .. 1 ..... 1001 00 ..... ..... \ + @rd_rn_tszimm imm=%tszimm16_shr +LSR_zzi 00000100 .. 1 ..... 1001 01 ..... ..... \ + @rd_rn_tszimm imm=%tszimm16_shr +LSL_zzi 00000100 .. 1 ..... 1001 11 ..... ..... \ + @rd_rn_tszimm imm=%tszimm16_shl + +# SVE bitwise shift by wide elements (unpredicated) +# Note esz != 3 +ASR_zzw 00000100 .. 1 ..... 1000 00 ..... ..... @rd_rn_rm +LSR_zzw 00000100 .. 
1 ..... 1000 01 ..... ..... @rd_rn_rm +LSL_zzw 00000100 .. 1 ..... 1000 11 ..... ..... @rd_rn_rm + ### SVE Predicate Logical Operations Group # SVE predicate logical operations From patchwork Sat Feb 17 18:22:36 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 128689 From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:22:36 -0800 Message-Id: <20180217182323.25885-21-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org>
X-Received-From: 2607:f8b0:400e:c00::242 Subject: [Qemu-devel] [PATCH v2 20/67] target/arm: Implement SVE Compute Vector Address Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 5 +++++ target/arm/sve_helper.c | 40 ++++++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 33 +++++++++++++++++++++++++++++++++ target/arm/sve.decode | 12 ++++++++++++ 4 files changed, 90 insertions(+) -- 2.14.3 Reviewed-by: Peter Maydell diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 00e3cd48bb..5280d375f9 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -380,6 +380,11 @@ DEF_HELPER_FLAGS_4(sve_lsl_zzw_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_lsl_zzw_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_lsl_zzw_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_adr_p32, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_adr_p64, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_adr_s32, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_adr_u32, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 4c6e2713fa..a290a58c02 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1061,3 +1061,43 @@ void HELPER(sve_index_d)(void *vd, uint64_t start, d[i] = start + i * incr; } } + +void HELPER(sve_adr_p32)(void *vd, void *vn, void *vm, 
uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc) / 4; + uint32_t sh = simd_data(desc); + uint32_t *d = vd, *n = vn, *m = vm; + for (i = 0; i < opr_sz; i += 1) { + d[i] = n[i] + (m[i] << sh); + } +} + +void HELPER(sve_adr_p64)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc) / 8; + uint64_t sh = simd_data(desc); + uint64_t *d = vd, *n = vn, *m = vm; + for (i = 0; i < opr_sz; i += 1) { + d[i] = n[i] + (m[i] << sh); + } +} + +void HELPER(sve_adr_s32)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc) / 8; + uint64_t sh = simd_data(desc); + uint64_t *d = vd, *n = vn, *m = vm; + for (i = 0; i < opr_sz; i += 1) { + d[i] = n[i] + ((uint64_t)(int32_t)m[i] << sh); + } +} + +void HELPER(sve_adr_u32)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc) / 8; + uint64_t sh = simd_data(desc); + uint64_t *d = vd, *n = vn, *m = vm; + for (i = 0; i < opr_sz; i += 1) { + d[i] = n[i] + ((uint64_t)(uint32_t)m[i] << sh); + } +} diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 43e9f1ad08..34cc8c2773 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -847,6 +847,39 @@ static void trans_RDVL(DisasContext *s, arg_RDVL *a, uint32_t insn) tcg_gen_movi_i64(reg, a->imm * vec_full_reg_size(s)); } +/* + *** SVE Compute Vector Address Group + */ + +static void do_adr(DisasContext *s, arg_rrri *a, gen_helper_gvec_3 *fn) +{ + unsigned vsz = vec_full_reg_size(s); + tcg_gen_gvec_3_ool(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + vsz, vsz, a->imm, fn); +} + +static void trans_ADR_p32(DisasContext *s, arg_rrri *a, uint32_t insn) +{ + do_adr(s, a, gen_helper_sve_adr_p32); +} + +static void trans_ADR_p64(DisasContext *s, arg_rrri *a, uint32_t insn) +{ + do_adr(s, a, gen_helper_sve_adr_p64); +} + +static void trans_ADR_s32(DisasContext *s, arg_rrri *a, uint32_t insn) +{ + do_adr(s, a, 
gen_helper_sve_adr_s32); +} + +static void trans_ADR_u32(DisasContext *s, arg_rrri *a, uint32_t insn) +{ + do_adr(s, a, gen_helper_sve_adr_u32); +} + /* *** SVE Predicate Logical Operations Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index f71ea1b60d..6ec1f94832 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -49,6 +49,7 @@ &rr_esz rd rn esz &rri rd rn imm +&rrri rd rn rm imm &rri_esz rd rn imm esz &rrr_esz rd rn rm esz &rpr_esz rd pg rn esz @@ -77,6 +78,9 @@ # Three operand, vector element size @rd_rn_rm ........ esz:2 . rm:5 ... ... rn:5 rd:5 &rrr_esz +# Three operand with "memory" size, aka immediate left shift +@rd_rn_msz_rm ........ ... rm:5 .... imm:2 rn:5 rd:5 &rrri + # Two register operand, with governing predicate, vector element size @rdn_pg_rm ........ esz:2 ... ... ... pg:3 rm:5 rd:5 \ &rprr_esz rn=%reg_movprfx @@ -278,6 +282,14 @@ ASR_zzw 00000100 .. 1 ..... 1000 00 ..... ..... @rd_rn_rm LSR_zzw 00000100 .. 1 ..... 1000 01 ..... ..... @rd_rn_rm LSL_zzw 00000100 .. 1 ..... 1000 11 ..... ..... @rd_rn_rm +### SVE Compute Vector Address Group + +# SVE vector address generation +ADR_s32 00000100 00 1 ..... 1010 .. ..... ..... @rd_rn_msz_rm +ADR_u32 00000100 01 1 ..... 1010 .. ..... ..... @rd_rn_msz_rm +ADR_p32 00000100 10 1 ..... 1010 .. ..... ..... @rd_rn_msz_rm +ADR_p64 00000100 11 1 ..... 1010 .. ..... ..... 
@rd_rn_msz_rm + ### SVE Predicate Logical Operations Group # SVE predicate logical operations From patchwork Sat Feb 17 18:22:37 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 128700 From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:22:37 -0800 Message-Id: <20180217182323.25885-22-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org>
X-Received-From: 2607:f8b0:400e:c05::244 Subject: [Qemu-devel] [PATCH v2 21/67] target/arm: Implement SVE floating-point exponential accelerator X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 4 +++ target/arm/sve_helper.c | 81 ++++++++++++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 22 +++++++++++++ target/arm/sve.decode | 7 ++++ 4 files changed, 114 insertions(+) -- 2.14.3 Reviewed-by: Peter Maydell diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 5280d375f9..e2925ff8ec 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -385,6 +385,10 @@ DEF_HELPER_FLAGS_4(sve_adr_p64, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_adr_s32, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_adr_u32, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_fexpa_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_fexpa_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_fexpa_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index a290a58c02..4d42653eef 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1101,3 +1101,84 @@ void HELPER(sve_adr_u32)(void *vd, void *vn, void *vm, uint32_t desc) d[i] = n[i] + ((uint64_t)(uint32_t)m[i] << sh); } } + +void HELPER(sve_fexpa_h)(void *vd, void *vn, uint32_t desc) +{ + static const uint16_t coeff[] = { + 0x0000, 0x0016, 0x002d, 
0x0045, 0x005d, 0x0075, 0x008e, 0x00a8, + 0x00c2, 0x00dc, 0x00f8, 0x0114, 0x0130, 0x014d, 0x016b, 0x0189, + 0x01a8, 0x01c8, 0x01e8, 0x0209, 0x022b, 0x024e, 0x0271, 0x0295, + 0x02ba, 0x02e0, 0x0306, 0x032e, 0x0356, 0x037f, 0x03a9, 0x03d4, + }; + intptr_t i, opr_sz = simd_oprsz(desc) / 2; + uint16_t *d = vd, *n = vn; + + for (i = 0; i < opr_sz; i++) { + uint16_t nn = n[i]; + intptr_t idx = extract32(nn, 0, 5); + uint16_t exp = extract32(nn, 5, 5); + d[i] = coeff[idx] | (exp << 10); + } +} + +void HELPER(sve_fexpa_s)(void *vd, void *vn, uint32_t desc) +{ + static const uint32_t coeff[] = { + 0x000000, 0x0164d2, 0x02cd87, 0x043a29, + 0x05aac3, 0x071f62, 0x08980f, 0x0a14d5, + 0x0b95c2, 0x0d1adf, 0x0ea43a, 0x1031dc, + 0x11c3d3, 0x135a2b, 0x14f4f0, 0x16942d, + 0x1837f0, 0x19e046, 0x1b8d3a, 0x1d3eda, + 0x1ef532, 0x20b051, 0x227043, 0x243516, + 0x25fed7, 0x27cd94, 0x29a15b, 0x2b7a3a, + 0x2d583f, 0x2f3b79, 0x3123f6, 0x3311c4, + 0x3504f3, 0x36fd92, 0x38fbaf, 0x3aff5b, + 0x3d08a4, 0x3f179a, 0x412c4d, 0x4346cd, + 0x45672a, 0x478d75, 0x49b9be, 0x4bec15, + 0x4e248c, 0x506334, 0x52a81e, 0x54f35b, + 0x5744fd, 0x599d16, 0x5bfbb8, 0x5e60f5, + 0x60ccdf, 0x633f89, 0x65b907, 0x68396a, + 0x6ac0c7, 0x6d4f30, 0x6fe4ba, 0x728177, + 0x75257d, 0x77d0df, 0x7a83b3, 0x7d3e0c, + }; + intptr_t i, opr_sz = simd_oprsz(desc) / 4; + uint32_t *d = vd, *n = vn; + + for (i = 0; i < opr_sz; i++) { + uint32_t nn = n[i]; + intptr_t idx = extract32(nn, 0, 6); + uint32_t exp = extract32(nn, 6, 8); + d[i] = coeff[idx] | (exp << 23); + } +} + +void HELPER(sve_fexpa_d)(void *vd, void *vn, uint32_t desc) +{ + static const uint64_t coeff[] = { + 0x0000000000000, 0x02C9A3E778061, 0x059B0D3158574, 0x0874518759BC8, + 0x0B5586CF9890F, 0x0E3EC32D3D1A2, 0x11301D0125B51, 0x1429AAEA92DE0, + 0x172B83C7D517B, 0x1A35BEB6FCB75, 0x1D4873168B9AA, 0x2063B88628CD6, + 0x2387A6E756238, 0x26B4565E27CDD, 0x29E9DF51FDEE1, 0x2D285A6E4030B, + 0x306FE0A31B715, 0x33C08B26416FF, 0x371A7373AA9CB, 0x3A7DB34E59FF7, + 0x3DEA64C123422, 
0x4160A21F72E2A, 0x44E086061892D, 0x486A2B5C13CD0, + 0x4BFDAD5362A27, 0x4F9B2769D2CA7, 0x5342B569D4F82, 0x56F4736B527DA, + 0x5AB07DD485429, 0x5E76F15AD2148, 0x6247EB03A5585, 0x6623882552225, + 0x6A09E667F3BCD, 0x6DFB23C651A2F, 0x71F75E8EC5F74, 0x75FEB564267C9, + 0x7A11473EB0187, 0x7E2F336CF4E62, 0x82589994CCE13, 0x868D99B4492ED, + 0x8ACE5422AA0DB, 0x8F1AE99157736, 0x93737B0CDC5E5, 0x97D829FDE4E50, + 0x9C49182A3F090, 0xA0C667B5DE565, 0xA5503B23E255D, 0xA9E6B5579FDBF, + 0xAE89F995AD3AD, 0xB33A2B84F15FB, 0xB7F76F2FB5E47, 0xBCC1E904BC1D2, + 0xC199BDD85529C, 0xC67F12E57D14B, 0xCB720DCEF9069, 0xD072D4A07897C, + 0xD5818DCFBA487, 0xDA9E603DB3285, 0xDFC97337B9B5F, 0xE502EE78B3FF6, + 0xEA4AFA2A490DA, 0xEFA1BEE615A27, 0xF50765B6E4540, 0xFA7C1819E90D8, + }; + intptr_t i, opr_sz = simd_oprsz(desc) / 8; + uint64_t *d = vd, *n = vn; + + for (i = 0; i < opr_sz; i++) { + uint64_t nn = n[i]; + intptr_t idx = extract32(nn, 0, 6); + uint64_t exp = extract32(nn, 6, 11); + d[i] = coeff[idx] | (exp << 52); + } +} diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 34cc8c2773..2f23f1b192 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -880,6 +880,28 @@ static void trans_ADR_u32(DisasContext *s, arg_rrri *a, uint32_t insn) do_adr(s, a, gen_helper_sve_adr_u32); } +/* + *** SVE Integer Misc - Unpredicated Group + */ + +static void trans_FEXPA(DisasContext *s, arg_rr_esz *a, uint32_t insn) +{ + static gen_helper_gvec_2 * const fns[4] = { + NULL, + gen_helper_sve_fexpa_h, + gen_helper_sve_fexpa_s, + gen_helper_sve_fexpa_d, + }; + unsigned vsz = vec_full_reg_size(s); + if (a->esz == 0) { + unallocated_encoding(s); + return; + } + tcg_gen_gvec_2_ool(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vsz, vsz, 0, fns[a->esz]); +} + /* *** SVE Predicate Logical Operations Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 6ec1f94832..e791fe8031 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ 
-68,6 +68,7 @@ # Two operand @pd_pn ........ esz:2 .. .... ....... rn:4 . rd:4 &rr_esz +@rd_rn ........ esz:2 ...... ...... rn:5 rd:5 &rr_esz # Three operand with unused vector element size @rd_rn_rm_e0 ........ ... rm:5 ... ... rn:5 rd:5 &rrr_esz esz=0 @@ -290,6 +291,12 @@ ADR_u32 00000100 01 1 ..... 1010 .. ..... ..... @rd_rn_msz_rm ADR_p32 00000100 10 1 ..... 1010 .. ..... ..... @rd_rn_msz_rm ADR_p64 00000100 11 1 ..... 1010 .. ..... ..... @rd_rn_msz_rm +### SVE Integer Misc - Unpredicated Group + +# SVE floating-point exponential accelerator +# Note esz != 0 +FEXPA 00000100 .. 1 00000 101110 ..... ..... @rd_rn + ### SVE Predicate Logical Operations Group # SVE predicate logical operations From patchwork Sat Feb 17 18:22:38 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 128702 Delivered-To: patch@linaro.org Received: by 10.46.124.24 with SMTP id x24csp1833029ljc; Sat, 17 Feb 2018 10:51:57 -0800 (PST) X-Google-Smtp-Source: AH8x224pix/Y/9gt/UZYEBZLkhkYDc+86eNgJDhSPkebwVydluJdEIQZrTAjaURkch1wj0+AMHPC X-Received: by 10.129.120.82 with SMTP id t79mr5115652ywc.230.1518893516985; Sat, 17 Feb 2018 10:51:56 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1518893516; cv=none; d=google.com; s=arc-20160816; b=hpk1mlXmkv4u829ciCup4pbmhQVMj9bGWwkBWyLN2+r6hIfX7S3TfLmnHMaNHYuyJH zm5qCnADWjBCt7zSTjoytaGtGSEtqa6oySxX1/bCrxqU2nXjWJmGJpysw7B4pVFV2amt jZ6ghqvzYUEaLezruNjwdoE7p/z1xexGiJ+lZWedDqTFHd3qSgaGMi9r+G473sE2hn0w xflCX0HCy6wJOnjCxGOdN4bEaCxiXoh3/DXgtgpgcWp7mKg17kUO63N0/PcpXiNMq5t9 VISzULoL9B813KTiBRdzfDER7d0dWdBjxqIFyxlQ8GuOAmn/PFBDP3p0gxaQsj94/UOc s8ag== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature:arc-authentication-results; 
From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:22:38 -0800 Message-Id: <20180217182323.25885-23-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c05::241 Subject: [Qemu-devel] [PATCH v2 22/67] target/arm: Implement SVE floating-point trig select coefficient X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 4 ++++ target/arm/sve_helper.c | 43 +++++++++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 19 +++++++++++++++++++ target/arm/sve.decode | 4 ++++ 4 files changed, 70 insertions(+) -- 2.14.3 Reviewed-by: Peter Maydell diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index e2925ff8ec..4f1bd5a62f 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -389,6 +389,10 @@ DEF_HELPER_FLAGS_3(sve_fexpa_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve_fexpa_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve_fexpa_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_ftssel_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_ftssel_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_ftssel_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 4d42653eef..b4f70af23f 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -23,6 +23,7 @@ 
#include "exec/cpu_ldst.h" #include "exec/helper-proto.h" #include "tcg/tcg-gvec-desc.h" +#include "fpu/softfloat.h" /* Note that vector data is stored in host-endian 64-bit chunks, @@ -1182,3 +1183,45 @@ void HELPER(sve_fexpa_d)(void *vd, void *vn, uint32_t desc) d[i] = coeff[idx] | (exp << 52); } } + +void HELPER(sve_ftssel_h)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc) / 2; + uint16_t *d = vd, *n = vn, *m = vm; + for (i = 0; i < opr_sz; i += 1) { + uint16_t nn = n[i]; + uint16_t mm = m[i]; + if (mm & 1) { + nn = float16_one; + } + d[i] = nn ^ (mm & 2) << 14; + } +} + +void HELPER(sve_ftssel_s)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc) / 4; + uint32_t *d = vd, *n = vn, *m = vm; + for (i = 0; i < opr_sz; i += 1) { + uint32_t nn = n[i]; + uint32_t mm = m[i]; + if (mm & 1) { + nn = float32_one; + } + d[i] = nn ^ (mm & 2) << 30; + } +} + +void HELPER(sve_ftssel_d)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc) / 8; + uint64_t *d = vd, *n = vn, *m = vm; + for (i = 0; i < opr_sz; i += 1) { + uint64_t nn = n[i]; + uint64_t mm = m[i]; + if (mm & 1) { + nn = float64_one; + } + d[i] = nn ^ (mm & 2) << 62; + } +} diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 2f23f1b192..e32be385fd 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -902,6 +902,25 @@ static void trans_FEXPA(DisasContext *s, arg_rr_esz *a, uint32_t insn) vsz, vsz, 0, fns[a->esz]); } +static void trans_FTSSEL(DisasContext *s, arg_rrr_esz *a, uint32_t insn) +{ + static gen_helper_gvec_3 * const fns[4] = { + NULL, + gen_helper_sve_ftssel_h, + gen_helper_sve_ftssel_s, + gen_helper_sve_ftssel_d, + }; + unsigned vsz = vec_full_reg_size(s); + if (a->esz == 0) { + unallocated_encoding(s); + return; + } + tcg_gen_gvec_3_ool(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + vsz, vsz, 0, 
fns[a->esz]); +} + /* *** SVE Predicate Logical Operations Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index e791fe8031..4ea3f33919 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -297,6 +297,10 @@ ADR_p64 00000100 11 1 ..... 1010 .. ..... ..... @rd_rn_msz_rm # Note esz != 0 FEXPA 00000100 .. 1 00000 101110 ..... ..... @rd_rn +# SVE floating-point trig select coefficient +# Note esz != 0 +FTSSEL 00000100 .. 1 ..... 101100 ..... ..... @rd_rn_rm + ### SVE Predicate Logical Operations Group # SVE predicate logical operations From patchwork Sat Feb 17 18:22:39 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 128701 Delivered-To: patch@linaro.org Received: by 10.46.124.24 with SMTP id x24csp1832842ljc; Sat, 17 Feb 2018 10:51:36 -0800 (PST) X-Google-Smtp-Source: AH8x224NhdTpBNzKprEx1xudJOyGXoEkzea3X0+zG5EwWD+tr4TcZ2+P5hjPwmgWw1VIigz8AeiB X-Received: by 10.37.128.3 with SMTP id m3mr1433725ybk.3.1518893496610; Sat, 17 Feb 2018 10:51:36 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1518893496; cv=none; d=google.com; s=arc-20160816; b=uq8kuB8g72G7XFZP6CoT1Nc9XOQwlb3+7RQtTZwxYZB6EBGktegO2C1FuQ3pebXm/E 2oHwR4Qd3NgI24uPIRGtV6afdu9dWtFZLkCNG44ROWYfJsqMj2Cpqw7MQ4lt/1/a/rFN I/ypVeVpIRKnQb7/Jk32KiAl2vR07YyjuV3YrD9DncR9rXmieJ6qCefyDEm7Ypyb7xek WrpI8MkWCrcXNcQbTQ/LpPIdOwcX9ToLDB191k/2XDNMg6Z2bK8yuZq1xXI6QEvCmTd3 u9eDHpRZ0Cw2mIWx6JLAr7DH3LDovELjrf4o32z6nC4/uppWMFfMKiplF+X5Sk6Hppc1 ArDg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature:arc-authentication-results; bh=rsh86Uyni96VeArd/ErWXmUyZyr8qyIKJCMXUaKWZU0=; b=R4mSayDExeKPthn34ikhe1Y2S7MQuwuw8PH+HYcds/B44wSX0DAsWRQJ3Xi03Jn7x7 
From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:22:39 -0800 Message-Id: <20180217182323.25885-24-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References:
<20180217182323.25885-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c00::244 Subject: [Qemu-devel] [PATCH v2 23/67] target/arm: Implement SVE Element Count Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 11 ++ target/arm/sve_helper.c | 136 ++++++++++++++++++++++ target/arm/translate-sve.c | 274 ++++++++++++++++++++++++++++++++++++++++++++- target/arm/sve.decode | 30 ++++- 4 files changed, 448 insertions(+), 3 deletions(-) -- 2.14.3 Reviewed-by: Peter Maydell diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 4f1bd5a62f..2831e1643b 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -393,6 +393,17 @@ DEF_HELPER_FLAGS_4(sve_ftssel_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_ftssel_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_ftssel_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_sqaddi_b, TCG_CALL_NO_RWG, void, ptr, ptr, s32, i32) +DEF_HELPER_FLAGS_4(sve_sqaddi_h, TCG_CALL_NO_RWG, void, ptr, ptr, s32, i32) +DEF_HELPER_FLAGS_4(sve_sqaddi_s, TCG_CALL_NO_RWG, void, ptr, ptr, s64, i32) +DEF_HELPER_FLAGS_4(sve_sqaddi_d, TCG_CALL_NO_RWG, void, ptr, ptr, s64, i32) + +DEF_HELPER_FLAGS_4(sve_uqaddi_b, TCG_CALL_NO_RWG, void, ptr, ptr, s32, i32) +DEF_HELPER_FLAGS_4(sve_uqaddi_h, TCG_CALL_NO_RWG, void, ptr, ptr, s32, i32) +DEF_HELPER_FLAGS_4(sve_uqaddi_s, TCG_CALL_NO_RWG, void, ptr, ptr, s64, i32) +DEF_HELPER_FLAGS_4(sve_uqaddi_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(sve_uqsubi_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) + DEF_HELPER_FLAGS_5(sve_and_pppp, 
TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index b4f70af23f..cfda16d520 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1225,3 +1225,139 @@ void HELPER(sve_ftssel_d)(void *vd, void *vn, void *vm, uint32_t desc) d[i] = nn ^ (mm & 2) << 62; } } + +/* + * Signed saturating addition with scalar operand. + */ + +void HELPER(sve_sqaddi_b)(void *d, void *a, int32_t b, uint32_t desc) +{ + intptr_t i, oprsz = simd_oprsz(desc); + + for (i = 0; i < oprsz; i += sizeof(int8_t)) { + int r = *(int8_t *)(a + i) + b; + if (r > INT8_MAX) { + r = INT8_MAX; + } else if (r < INT8_MIN) { + r = INT8_MIN; + } + *(int8_t *)(d + i) = r; + } +} + +void HELPER(sve_sqaddi_h)(void *d, void *a, int32_t b, uint32_t desc) +{ + intptr_t i, oprsz = simd_oprsz(desc); + + for (i = 0; i < oprsz; i += sizeof(int16_t)) { + int r = *(int16_t *)(a + i) + b; + if (r > INT16_MAX) { + r = INT16_MAX; + } else if (r < INT16_MIN) { + r = INT16_MIN; + } + *(int16_t *)(d + i) = r; + } +} + +void HELPER(sve_sqaddi_s)(void *d, void *a, int64_t b, uint32_t desc) +{ + intptr_t i, oprsz = simd_oprsz(desc); + + for (i = 0; i < oprsz; i += sizeof(int32_t)) { + int64_t r = *(int32_t *)(a + i) + b; + if (r > INT32_MAX) { + r = INT32_MAX; + } else if (r < INT32_MIN) { + r = INT32_MIN; + } + *(int32_t *)(d + i) = r; + } +} + +void HELPER(sve_sqaddi_d)(void *d, void *a, int64_t b, uint32_t desc) +{ + intptr_t i, oprsz = simd_oprsz(desc); + + for (i = 0; i < oprsz; i += sizeof(int64_t)) { + int64_t ai = *(int64_t *)(a + i); + int64_t r = ai + b; + if (((r ^ ai) & ~(ai ^ b)) < 0) { + /* Signed overflow. */ + r = (r < 0 ? INT64_MAX : INT64_MIN); + } + *(int64_t *)(d + i) = r; + } +} + +/* + * Unsigned saturating addition with scalar operand. 
+ */ + +void HELPER(sve_uqaddi_b)(void *d, void *a, int32_t b, uint32_t desc) +{ + intptr_t i, oprsz = simd_oprsz(desc); + + for (i = 0; i < oprsz; i += sizeof(uint8_t)) { + int r = *(uint8_t *)(a + i) + b; + if (r > UINT8_MAX) { + r = UINT8_MAX; + } else if (r < 0) { + r = 0; + } + *(uint8_t *)(d + i) = r; + } +} + +void HELPER(sve_uqaddi_h)(void *d, void *a, int32_t b, uint32_t desc) +{ + intptr_t i, oprsz = simd_oprsz(desc); + + for (i = 0; i < oprsz; i += sizeof(uint16_t)) { + int r = *(uint16_t *)(a + i) + b; + if (r > UINT16_MAX) { + r = UINT16_MAX; + } else if (r < 0) { + r = 0; + } + *(uint16_t *)(d + i) = r; + } +} + +void HELPER(sve_uqaddi_s)(void *d, void *a, int64_t b, uint32_t desc) +{ + intptr_t i, oprsz = simd_oprsz(desc); + + for (i = 0; i < oprsz; i += sizeof(uint32_t)) { + int64_t r = *(uint32_t *)(a + i) + b; + if (r > UINT32_MAX) { + r = UINT32_MAX; + } else if (r < 0) { + r = 0; + } + *(uint32_t *)(d + i) = r; + } +} + +void HELPER(sve_uqaddi_d)(void *d, void *a, uint64_t b, uint32_t desc) +{ + intptr_t i, oprsz = simd_oprsz(desc); + + for (i = 0; i < oprsz; i += sizeof(uint64_t)) { + uint64_t r = *(uint64_t *)(a + i) + b; + if (r < b) { + r = UINT64_MAX; + } + *(uint64_t *)(d + i) = r; + } +} + +void HELPER(sve_uqsubi_d)(void *d, void *a, uint64_t b, uint32_t desc) +{ + intptr_t i, oprsz = simd_oprsz(desc); + + for (i = 0; i < oprsz; i += sizeof(uint64_t)) { + uint64_t ai = *(uint64_t *)(a + i); + *(uint64_t *)(d + i) = (ai < b ? 0 : ai - b); + } +} diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index e32be385fd..702f20e97b 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -61,6 +61,11 @@ static int tszimm_shl(int x) return x - (8 << tszimm_esz(x)); } +static inline int plus1(int x) +{ + return x + 1; +} + /* * Include the generated decoder. */ @@ -127,7 +132,9 @@ static void do_vector3_z(DisasContext *s, GVecGen3Fn *gvec_fn, /* Invoke a vector move on two Zregs. 
*/ static void do_mov_z(DisasContext *s, int rd, int rn) { - do_vector2_z(s, tcg_gen_gvec_mov, 0, rd, rn); + if (rd != rn) { + do_vector2_z(s, tcg_gen_gvec_mov, 0, rd, rn); + } } /* Initialize a Zreg with replications of a 64-bit immediate. */ @@ -168,7 +175,9 @@ static void do_vecop4_p(DisasContext *s, const GVecGen4 *gvec_op, /* Invoke a vector move on two Pregs. */ static void do_mov_p(DisasContext *s, int rd, int rn) { - do_vector2_p(s, tcg_gen_gvec_mov, 0, rd, rn); + if (rd != rn) { + do_vector2_p(s, tcg_gen_gvec_mov, 0, rd, rn); + } } /* Set the cpu flags as per a return from an SVE helper. */ @@ -1378,6 +1387,267 @@ static void trans_PNEXT(DisasContext *s, arg_rr_esz *a, uint32_t insn) do_pfirst_pnext(s, a, gen_helper_sve_pnext); } +/* + *** SVE Element Count Group + */ + +/* Perform an inline saturating addition of a 32-bit value within + * a 64-bit register. The second operand is known to be positive, + * which halves the comparisons we must perform to bound the result. + */ +static void do_sat_addsub_32(TCGv_i64 reg, TCGv_i64 val, bool u, bool d) +{ + int64_t ibound; + TCGv_i64 bound; + TCGCond cond; + + /* Use normal 64-bit arithmetic to detect 32-bit overflow. */ + if (u) { + tcg_gen_ext32u_i64(reg, reg); + } else { + tcg_gen_ext32s_i64(reg, reg); + } + if (d) { + tcg_gen_sub_i64(reg, reg, val); + ibound = (u ? 0 : INT32_MIN); + cond = TCG_COND_LT; + } else { + tcg_gen_add_i64(reg, reg, val); + ibound = (u ? UINT32_MAX : INT32_MAX); + cond = TCG_COND_GT; + } + bound = tcg_const_i64(ibound); + tcg_gen_movcond_i64(cond, reg, reg, bound, bound, reg); + tcg_temp_free_i64(bound); +} + +/* Similarly with 64-bit values. 
*/ +static void do_sat_addsub_64(TCGv_i64 reg, TCGv_i64 val, bool u, bool d) +{ + TCGv_i64 t0 = tcg_temp_new_i64(); + TCGv_i64 t1 = tcg_temp_new_i64(); + TCGv_i64 t2; + + if (u) { + if (d) { + tcg_gen_sub_i64(t0, reg, val); + tcg_gen_movi_i64(t1, 0); + tcg_gen_movcond_i64(TCG_COND_LTU, reg, reg, val, t1, t0); + } else { + tcg_gen_add_i64(t0, reg, val); + tcg_gen_movi_i64(t1, -1); + tcg_gen_movcond_i64(TCG_COND_LTU, reg, t0, reg, t1, t0); + } + } else { + if (d) { + /* Detect signed overflow for subtraction: the operands differ in sign and the result's sign differs from reg. */ + tcg_gen_xor_i64(t0, reg, val); + tcg_gen_sub_i64(t1, reg, val); + tcg_gen_xor_i64(reg, reg, t1); + tcg_gen_and_i64(t0, t0, reg); + + /* Bound the result. */ + tcg_gen_movi_i64(reg, INT64_MIN); + t2 = tcg_const_i64(0); + tcg_gen_movcond_i64(TCG_COND_LT, reg, t0, t2, reg, t1); + } else { + /* Detect signed overflow for addition. */ + tcg_gen_xor_i64(t0, reg, val); + tcg_gen_add_i64(reg, reg, val); + tcg_gen_xor_i64(t1, reg, val); + tcg_gen_andc_i64(t0, t1, t0); + + /* Bound the result. */ + tcg_gen_movi_i64(t1, INT64_MAX); + t2 = tcg_const_i64(0); + tcg_gen_movcond_i64(TCG_COND_LT, reg, t0, t2, t1, reg); + } + tcg_temp_free_i64(t2); + } + tcg_temp_free_i64(t0); + tcg_temp_free_i64(t1); +} + +/* Similarly with a vector and a scalar operand. 
*/ +static void do_sat_addsub_vec(DisasContext *s, int esz, int rd, int rn, + TCGv_i64 val, bool u, bool d) +{ + unsigned vsz = vec_full_reg_size(s); + TCGv_ptr dptr, nptr; + TCGv_i32 t32, desc; + TCGv_i64 t64; + + dptr = tcg_temp_new_ptr(); + nptr = tcg_temp_new_ptr(); + tcg_gen_addi_ptr(dptr, cpu_env, vec_full_reg_offset(s, rd)); + tcg_gen_addi_ptr(nptr, cpu_env, vec_full_reg_offset(s, rn)); + desc = tcg_const_i32(simd_desc(vsz, vsz, 0)); + + switch (esz) { + case MO_8: + t32 = tcg_temp_new_i32(); + tcg_gen_extrl_i64_i32(t32, val); + if (d) { + tcg_gen_neg_i32(t32, t32); + } + if (u) { + gen_helper_sve_uqaddi_b(dptr, nptr, t32, desc); + } else { + gen_helper_sve_sqaddi_b(dptr, nptr, t32, desc); + } + tcg_temp_free_i32(t32); + break; + + case MO_16: + t32 = tcg_temp_new_i32(); + tcg_gen_extrl_i64_i32(t32, val); + if (d) { + tcg_gen_neg_i32(t32, t32); + } + if (u) { + gen_helper_sve_uqaddi_h(dptr, nptr, t32, desc); + } else { + gen_helper_sve_sqaddi_h(dptr, nptr, t32, desc); + } + tcg_temp_free_i32(t32); + break; + + case MO_32: + t64 = tcg_temp_new_i64(); + if (d) { + tcg_gen_neg_i64(t64, val); + } else { + tcg_gen_mov_i64(t64, val); + } + if (u) { + gen_helper_sve_uqaddi_s(dptr, nptr, t64, desc); + } else { + gen_helper_sve_sqaddi_s(dptr, nptr, t64, desc); + } + tcg_temp_free_i64(t64); + break; + + case MO_64: + if (u) { + if (d) { + gen_helper_sve_uqsubi_d(dptr, nptr, val, desc); + } else { + gen_helper_sve_uqaddi_d(dptr, nptr, val, desc); + } + } else if (d) { + t64 = tcg_temp_new_i64(); + tcg_gen_neg_i64(t64, val); + gen_helper_sve_sqaddi_d(dptr, nptr, t64, desc); + tcg_temp_free_i64(t64); + } else { + gen_helper_sve_sqaddi_d(dptr, nptr, val, desc); + } + break; + + default: + g_assert_not_reached(); + } + + tcg_temp_free_ptr(dptr); + tcg_temp_free_ptr(nptr); + tcg_temp_free_i32(desc); +} + +static void trans_CNT_r(DisasContext *s, arg_CNT_r *a, uint32_t insn) +{ + unsigned fullsz = vec_full_reg_size(s); + unsigned numelem = decode_pred_count(fullsz, a->pat, 
a->esz); + + tcg_gen_movi_i64(cpu_reg(s, a->rd), numelem * a->imm); +} + +static void trans_INCDEC_r(DisasContext *s, arg_incdec_cnt *a, uint32_t insn) +{ + unsigned fullsz = vec_full_reg_size(s); + unsigned numelem = decode_pred_count(fullsz, a->pat, a->esz); + int inc = numelem * a->imm * (a->d ? -1 : 1); + TCGv_i64 reg = cpu_reg(s, a->rd); + + tcg_gen_addi_i64(reg, reg, inc); +} + +static void trans_SINCDEC_r_32(DisasContext *s, arg_incdec_cnt *a, + uint32_t insn) +{ + unsigned fullsz = vec_full_reg_size(s); + unsigned numelem = decode_pred_count(fullsz, a->pat, a->esz); + int inc = numelem * a->imm; + TCGv_i64 reg = cpu_reg(s, a->rd); + + /* Use normal 64-bit arithmetic to detect 32-bit overflow. */ + if (inc == 0) { + if (a->u) { + tcg_gen_ext32u_i64(reg, reg); + } else { + tcg_gen_ext32s_i64(reg, reg); + } + } else { + TCGv_i64 t = tcg_const_i64(inc); + do_sat_addsub_32(reg, t, a->u, a->d); + tcg_temp_free_i64(t); + } +} + +static void trans_SINCDEC_r_64(DisasContext *s, arg_incdec_cnt *a, + uint32_t insn) +{ + unsigned fullsz = vec_full_reg_size(s); + unsigned numelem = decode_pred_count(fullsz, a->pat, a->esz); + int inc = numelem * a->imm; + TCGv_i64 reg = cpu_reg(s, a->rd); + + if (inc != 0) { + TCGv_i64 t = tcg_const_i64(inc); + do_sat_addsub_64(reg, t, a->u, a->d); + tcg_temp_free_i64(t); + } +} + +static void trans_INCDEC_v(DisasContext *s, arg_incdec2_cnt *a, uint32_t insn) +{ + unsigned fullsz = vec_full_reg_size(s); + unsigned numelem = decode_pred_count(fullsz, a->pat, a->esz); + int inc = numelem * a->imm; + + if (a->esz == 0) { + unallocated_encoding(s); + return; + } + if (inc != 0) { + TCGv_i64 t = tcg_const_i64(a->d ? 
-inc : inc); + tcg_gen_gvec_adds(a->esz, vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), t, fullsz, fullsz); + tcg_temp_free_i64(t); + } else { + do_mov_z(s, a->rd, a->rn); + } +} + +static void trans_SINCDEC_v(DisasContext *s, arg_incdec2_cnt *a, + uint32_t insn) +{ + unsigned fullsz = vec_full_reg_size(s); + unsigned numelem = decode_pred_count(fullsz, a->pat, a->esz); + int inc = numelem * a->imm; + + if (a->esz == 0) { + unallocated_encoding(s); + return; + } + if (inc != 0) { + TCGv_i64 t = tcg_const_i64(inc); + do_sat_addsub_vec(s, a->esz, a->rd, a->rn, t, a->u, a->d); + tcg_temp_free_i64(t); + } else { + do_mov_z(s, a->rd, a->rn); + } +} + /* *** SVE Memory - 32-bit Gather and Unsized Contiguous Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 4ea3f33919..5690b5fcb9 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -22,6 +22,7 @@ ########################################################################### # Named fields. These are primarily for disjoint fields. +%imm4_16_p1 16:4 !function=plus1 %imm6_22_5 22:1 5:5 %imm9_16_10 16:s6 10:3 %preg4_5 5:4 @@ -58,6 +59,8 @@ &rprrr_esz rd pg rn rm ra esz &rpri_esz rd pg rn imm esz &ptrue rd esz pat s +&incdec_cnt rd pat esz imm d u +&incdec2_cnt rd rn pat esz imm d u ########################################################################### # Named instruction formats. These are generally used to @@ -115,6 +118,13 @@ @rd_rn_i9 ........ ........ ...... rn:5 rd:5 \ &rri imm=%imm9_16_10 +# One register, pattern, and uint4+1. +# User must fill in U and D. +@incdec_cnt ........ esz:2 .. .... ...... pat:5 rd:5 \ + &incdec_cnt imm=%imm4_16_p1 +@incdec2_cnt ........ esz:2 .. .... ...... pat:5 rd:5 \ + &incdec2_cnt imm=%imm4_16_p1 rn=%reg_movprfx + ########################################################################### # Instruction patterns. Grouped according to the SVE encodingindex.xhtml. @@ -301,7 +311,25 @@ FEXPA 00000100 .. 1 00000 101110 ..... ..... 
@rd_rn # Note esz != 0 FTSSEL 00000100 .. 1 ..... 101100 ..... ..... @rd_rn_rm -### SVE Predicate Logical Operations Group +### SVE Element Count Group + +# SVE element count +CNT_r 00000100 .. 10 .... 1110 0 0 ..... ..... @incdec_cnt d=0 u=1 + +# SVE inc/dec register by element count +INCDEC_r 00000100 .. 11 .... 1110 0 d:1 ..... ..... @incdec_cnt u=1 + +# SVE saturating inc/dec register by element count +SINCDEC_r_32 00000100 .. 10 .... 1111 d:1 u:1 ..... ..... @incdec_cnt +SINCDEC_r_64 00000100 .. 11 .... 1111 d:1 u:1 ..... ..... @incdec_cnt + +# SVE inc/dec vector by element count +# Note this requires esz != 0. +INCDEC_v 00000100 .. 1 1 .... 1100 0 d:1 ..... ..... @incdec2_cnt u=1 + +# SVE saturating inc/dec vector by element count +# Note these require esz != 0. +SINCDEC_v 00000100 .. 1 0 .... 1100 d:1 u:1 ..... ..... @incdec2_cnt # SVE predicate logical operations AND_pppp 00100101 0. 00 .... 01 .... 0 .... 0 .... @pd_pg_pn_pm_s From patchwork Sat Feb 17 18:22:40 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 128707 Delivered-To: patch@linaro.org Received: by 10.46.124.24 with SMTP id x24csp1836003ljc; Sat, 17 Feb 2018 10:57:13 -0800 (PST) X-Google-Smtp-Source: AH8x225GRog/G26Y3y+OSEmG2jwTyOwhhGpV2y/Hsv3E0yMDbViHAccpzMhW2I9YfoQJtoEBTZPP X-Received: by 10.129.57.134 with SMTP id g128mr7424603ywa.373.1518893833657; Sat, 17 Feb 2018 10:57:13 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1518893833; cv=none; d=google.com; s=arc-20160816; b=MiHT4b5el9WpfcGJZU5p8tDWGk9PJRuQ5Cv56icsGsJA1ZyMuXOxB95QF9nf94duBo qNDP1O2V3aWtHwf5G85xJ29w1sP4br2NwHgZj4+lq8RGjT9PM+Yxb2ATTTOFveD37wdQ 5FBdWa7q8ZxbyMK4xjBxLrE4NZD4a2Rsxh3t6B71TDRyyUXpylFRBivkYVxZoWnoiqRb M6lNwcjsNgBoSVhltr3KZuAbtcI8q52D1OLs0Qp5LWaPJcD36Y7pId1EdLsai0BXD4h3 /FRvOoR/dfDxd6pErWzgSG6erQiWCII9jeIlWNGUcmlc2v6dm+WQSvzBWRCCE4YdScIx vMcQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; 
From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:22:40 -0800 Message-Id: <20180217182323.25885-25-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org>
X-Received-From: 2607:f8b0:400e:c05::243 Subject: [Qemu-devel] [PATCH v2 24/67] target/arm: Implement SVE Bitwise Immediate Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/translate-sve.c | 50 ++++++++++++++++++++++++++++++++++++++++++++++ target/arm/sve.decode | 17 ++++++++++++++++ 2 files changed, 67 insertions(+) -- 2.14.3 Reviewed-by: Peter Maydell diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 702f20e97b..21b1e4df85 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -34,6 +34,8 @@ #include "translate-a64.h" typedef void GVecGen2Fn(unsigned, uint32_t, uint32_t, uint32_t, uint32_t); +typedef void GVecGen2iFn(unsigned, uint32_t, uint32_t, + int64_t, uint32_t, uint32_t); typedef void GVecGen3Fn(unsigned, uint32_t, uint32_t, uint32_t, uint32_t, uint32_t); @@ -1648,6 +1650,54 @@ static void trans_SINCDEC_v(DisasContext *s, arg_incdec2_cnt *a, } } +/* + *** SVE Bitwise Immediate Group + */ + +static void do_zz_dbm(DisasContext *s, arg_rr_dbm *a, GVecGen2iFn *gvec_fn) +{ + unsigned vsz; + uint64_t imm; + + if (!logic_imm_decode_wmask(&imm, extract32(a->dbm, 12, 1), + extract32(a->dbm, 0, 6), + extract32(a->dbm, 6, 6))) { + unallocated_encoding(s); + return; + } + + vsz = vec_full_reg_size(s); + gvec_fn(MO_64, vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), imm, vsz, vsz); +} + +static void trans_AND_zzi(DisasContext *s, arg_rr_dbm *a, uint32_t insn) +{ + do_zz_dbm(s, a, tcg_gen_gvec_andi); +} + +static void trans_ORR_zzi(DisasContext *s, arg_rr_dbm *a, uint32_t insn) +{ + do_zz_dbm(s, a, tcg_gen_gvec_ori); +} + +static void trans_EOR_zzi(DisasContext *s, arg_rr_dbm *a, uint32_t insn) +{ + do_zz_dbm(s, a, tcg_gen_gvec_xori); +} + 
+static void trans_DUPM(DisasContext *s, arg_DUPM *a, uint32_t insn) +{ + uint64_t imm; + if (!logic_imm_decode_wmask(&imm, extract32(a->dbm, 12, 1), + extract32(a->dbm, 0, 6), + extract32(a->dbm, 6, 6))) { + unallocated_encoding(s); + return; + } + do_dupi_z(s, a->rd, imm); +} + /* *** SVE Memory - 32-bit Gather and Unsized Contiguous Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 5690b5fcb9..0990d135f4 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -50,6 +50,7 @@ &rr_esz rd rn esz &rri rd rn imm +&rr_dbm rd rn dbm &rrri rd rn rm imm &rri_esz rd rn imm esz &rrr_esz rd rn rm esz @@ -112,6 +113,10 @@ @rd_rn_tszimm ........ .. ... ... ...... rn:5 rd:5 \ &rri_esz esz=%tszimm16_esz +# Two register operand, one encoded bitmask. +@rdn_dbm ........ .. .... dbm:13 rd:5 \ + &rr_dbm rn=%reg_movprfx + # Basic Load/Store with 9-bit immediate offset @pd_rn_i9 ........ ........ ...... rn:5 . rd:4 \ &rri imm=%imm9_16_10 @@ -331,6 +336,18 @@ INCDEC_v 00000100 .. 1 1 .... 1100 0 d:1 ..... ..... @incdec2_cnt u=1 # Note these require esz != 0. SINCDEC_v 00000100 .. 1 0 .... 1100 d:1 u:1 ..... ..... @incdec2_cnt +### SVE Bitwise Immediate Group + +# SVE bitwise logical with immediate (unpredicated) +ORR_zzi 00000101 00 0000 ............. ..... @rdn_dbm +EOR_zzi 00000101 01 0000 ............. ..... @rdn_dbm +AND_zzi 00000101 10 0000 ............. ..... @rdn_dbm + +# SVE broadcast bitmask immediate +DUPM 00000101 11 0000 dbm:13 rd:5 + +### SVE Predicate Logical Operations Group + # SVE predicate logical operations AND_pppp 00100101 0. 00 .... 01 .... 0 .... 0 .... @pd_pg_pn_pm_s BIC_pppp 00100101 0. 00 .... 01 .... 0 .... 1 .... 
@pd_pg_pn_pm_s
From patchwork Sat Feb 17 18:22:41 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 128705 Delivered-To: patch@linaro.org From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:22:41 -0800 Message-Id: <20180217182323.25885-26-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org>
X-Received-From: 2607:f8b0:400e:c01::244 Subject: [Qemu-devel] [PATCH v2 25/67] target/arm: Implement SVE Integer Wide Immediate - Predicated Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 10 +++++ target/arm/sve_helper.c | 108 +++++++++++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 92 ++++++++++++++++++++++++++++++++++++++ target/arm/sve.decode | 17 +++++++ 4 files changed, 227 insertions(+) -- 2.14.3 Reviewed-by: Peter Maydell diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 2831e1643b..79493ab647 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -404,6 +404,16 @@ DEF_HELPER_FLAGS_4(sve_uqaddi_s, TCG_CALL_NO_RWG, void, ptr, ptr, s64, i32) DEF_HELPER_FLAGS_4(sve_uqaddi_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) DEF_HELPER_FLAGS_4(sve_uqsubi_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_5(sve_cpy_m_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_5(sve_cpy_m_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_5(sve_cpy_m_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_5(sve_cpy_m_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i64, i32) + +DEF_HELPER_FLAGS_4(sve_cpy_z_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(sve_cpy_z_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(sve_cpy_z_s, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(sve_cpy_z_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) + DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, 
ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index cfda16d520..6a95d1ec48 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1361,3 +1361,111 @@ void HELPER(sve_uqsubi_d)(void *d, void *a, uint64_t b, uint32_t desc) *(uint64_t *)(d + i) = (ai < b ? 0 : ai - b); } } + +/* Two operand predicated copy immediate with merge. All valid immediates + * can fit within 17 signed bits in the simd_data field. + */ +void HELPER(sve_cpy_m_b)(void *vd, void *vn, void *vg, + uint64_t mm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc) / 8; + uint64_t *d = vd, *n = vn; + uint8_t *pg = vg; + + mm = (mm & 0xff) * (-1ull / 0xff); + for (i = 0; i < opr_sz; i += 1) { + uint64_t nn = n[i]; + uint64_t pp = expand_pred_b(pg[H1(i)]); + d[i] = (mm & pp) | (nn & ~pp); + } +} + +void HELPER(sve_cpy_m_h)(void *vd, void *vn, void *vg, + uint64_t mm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc) / 8; + uint64_t *d = vd, *n = vn; + uint8_t *pg = vg; + + mm = (mm & 0xffff) * (-1ull / 0xffff); + for (i = 0; i < opr_sz; i += 1) { + uint64_t nn = n[i]; + uint64_t pp = expand_pred_h(pg[H1(i)]); + d[i] = (mm & pp) | (nn & ~pp); + } +} + +void HELPER(sve_cpy_m_s)(void *vd, void *vn, void *vg, + uint64_t mm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc) / 8; + uint64_t *d = vd, *n = vn; + uint8_t *pg = vg; + + mm = deposit64(mm, 32, 32, mm); + for (i = 0; i < opr_sz; i += 1) { + uint64_t nn = n[i]; + uint64_t pp = expand_pred_s(pg[H1(i)]); + d[i] = (mm & pp) | (nn & ~pp); + } +} + +void HELPER(sve_cpy_m_d)(void *vd, void *vn, void *vg, + uint64_t mm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc) / 8; + uint64_t *d = vd, *n = vn; + uint8_t *pg = vg; + + for (i = 0; i < opr_sz; i += 1) { + uint64_t nn = n[i]; + d[i] = (pg[H1(i)] & 1 ? 
mm : nn); + } +} + +void HELPER(sve_cpy_z_b)(void *vd, void *vg, uint64_t val, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc) / 8; + uint64_t *d = vd; + uint8_t *pg = vg; + + val = (val & 0xff) * (-1ull / 0xff); + for (i = 0; i < opr_sz; i += 1) { + d[i] = val & expand_pred_b(pg[H1(i)]); + } +} + +void HELPER(sve_cpy_z_h)(void *vd, void *vg, uint64_t val, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc) / 8; + uint64_t *d = vd; + uint8_t *pg = vg; + + val = (val & 0xffff) * (-1ull / 0xffff); + for (i = 0; i < opr_sz; i += 1) { + d[i] = val & expand_pred_h(pg[H1(i)]); + } +} + +void HELPER(sve_cpy_z_s)(void *vd, void *vg, uint64_t val, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc) / 8; + uint64_t *d = vd; + uint8_t *pg = vg; + + val = deposit64(val, 32, 32, val); + for (i = 0; i < opr_sz; i += 1) { + d[i] = val & expand_pred_s(pg[H1(i)]); + } +} + +void HELPER(sve_cpy_z_d)(void *vd, void *vg, uint64_t val, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc) / 8; + uint64_t *d = vd; + uint8_t *pg = vg; + + for (i = 0; i < opr_sz; i += 1) { + d[i] = (pg[H1(i)] & 1 ? val : 0); + } +} diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 21b1e4df85..dd085b084b 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -68,6 +68,12 @@ static inline int plus1(int x) return x + 1; } +/* The SH bit is in bit 8. Extract the low 8 and shift. */ +static inline int expand_imm_sh8s(int x) +{ + return (int8_t)x << (x & 0x100 ? 8 : 0); +} + /* * Include the generated decoder. */ @@ -1698,6 +1704,92 @@ static void trans_DUPM(DisasContext *s, arg_DUPM *a, uint32_t insn) do_dupi_z(s, a->rd, imm); } +/* + *** SVE Integer Wide Immediate - Predicated Group + */ + +/* Implement all merging copies. This is used for CPY (immediate), + * FCPY, CPY (scalar), CPY (SIMD&FP scalar). 
+ */ +static void do_cpy_m(DisasContext *s, int esz, int rd, int rn, int pg, + TCGv_i64 val) +{ + typedef void gen_cpy(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i64, TCGv_i32); + static gen_cpy * const fns[4] = { + gen_helper_sve_cpy_m_b, gen_helper_sve_cpy_m_h, + gen_helper_sve_cpy_m_s, gen_helper_sve_cpy_m_d, + }; + unsigned vsz = vec_full_reg_size(s); + TCGv_i32 desc = tcg_const_i32(simd_desc(vsz, vsz, 0)); + TCGv_ptr t_zd = tcg_temp_new_ptr(); + TCGv_ptr t_zn = tcg_temp_new_ptr(); + TCGv_ptr t_pg = tcg_temp_new_ptr(); + + tcg_gen_addi_ptr(t_zd, cpu_env, vec_full_reg_offset(s, rd)); + tcg_gen_addi_ptr(t_zn, cpu_env, vec_full_reg_offset(s, rn)); + tcg_gen_addi_ptr(t_pg, cpu_env, pred_full_reg_offset(s, pg)); + + fns[esz](t_zd, t_zn, t_pg, val, desc); + + tcg_temp_free_ptr(t_zd); + tcg_temp_free_ptr(t_zn); + tcg_temp_free_ptr(t_pg); + tcg_temp_free_i32(desc); +} + +static void trans_FCPY(DisasContext *s, arg_FCPY *a, uint32_t insn) +{ + uint64_t imm; + TCGv_i64 t_imm; + + if (a->esz == 0) { + unallocated_encoding(s); + return; + } + + /* Decode the VFP immediate. 
*/ + imm = vfp_expand_imm(a->esz, a->imm); + + t_imm = tcg_const_i64(imm); + do_cpy_m(s, a->esz, a->rd, a->rn, a->pg, t_imm); + tcg_temp_free_i64(t_imm); +} + +static void trans_CPY_m_i(DisasContext *s, arg_rpri_esz *a, uint32_t insn) +{ + TCGv_i64 t_imm; + + if (a->esz == 0 && extract32(insn, 13, 1)) { + unallocated_encoding(s); + return; + } + + t_imm = tcg_const_i64(a->imm); + do_cpy_m(s, a->esz, a->rd, a->rn, a->pg, t_imm); + tcg_temp_free_i64(t_imm); +} + +static void trans_CPY_z_i(DisasContext *s, arg_CPY_z_i *a, uint32_t insn) +{ + static gen_helper_gvec_2i * const fns[4] = { + gen_helper_sve_cpy_z_b, gen_helper_sve_cpy_z_h, + gen_helper_sve_cpy_z_s, gen_helper_sve_cpy_z_d, + }; + unsigned vsz = vec_full_reg_size(s); + TCGv_i64 t_imm; + + if (a->esz == 0 && extract32(insn, 13, 1)) { + unallocated_encoding(s); + return; + } + + t_imm = tcg_const_i64(a->imm); + tcg_gen_gvec_2i_ool(vec_full_reg_offset(s, a->rd), + pred_full_reg_offset(s, a->pg), + t_imm, vsz, vsz, 0, fns[a->esz]); + tcg_temp_free_i64(t_imm); +} + /* *** SVE Memory - 32-bit Gather and Unsized Contiguous Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 0990d135f4..e6e10a4f84 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -39,6 +39,9 @@ %tszimm16_shr 22:2 16:5 !function=tszimm_shr %tszimm16_shl 22:2 16:5 !function=tszimm_shl +# Signed 8-bit immediate, optionally shifted left by 8. +%sh8_i8s 5:9 !function=expand_imm_sh8s + # Either a copy of rd (at bit 0), or a different source # as propagated via the MOVPRFX instruction. %reg_movprfx 0:5 @@ -113,6 +116,11 @@ @rd_rn_tszimm ........ .. ... ... ...... rn:5 rd:5 \ &rri_esz esz=%tszimm16_esz +# Two register operand, one immediate operand, with 4-bit predicate. +# User must fill in imm. +@rdn_pg4 ........ esz:2 .. pg:4 ... ........ rd:5 \ + &rpri_esz rn=%reg_movprfx + # Two register operand, one encoded bitmask. @rdn_dbm ........ .. .... 
dbm:13 rd:5 \ &rr_dbm rn=%reg_movprfx @@ -346,6 +354,15 @@ AND_zzi 00000101 10 0000 ............. ..... @rdn_dbm # SVE broadcast bitmask immediate DUPM 00000101 11 0000 dbm:13 rd:5 +### SVE Integer Wide Immediate - Predicated Group + +# SVE copy floating-point immediate (predicated) +FCPY 00000101 .. 01 .... 110 imm:8 ..... @rdn_pg4 + +# SVE copy integer immediate (predicated) +CPY_m_i 00000101 .. 01 .... 01 . ........ ..... @rdn_pg4 imm=%sh8_i8s +CPY_z_i 00000101 .. 01 .... 00 . ........ ..... @rdn_pg4 imm=%sh8_i8s + ### SVE Predicate Logical Operations Group # SVE predicate logical operations
From patchwork Sat Feb 17 18:22:42 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 128694 Delivered-To: patch@linaro.org From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:22:42 -0800 Message-Id: <20180217182323.25885-27-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org>
References: <20180217182323.25885-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c01::241 Subject: [Qemu-devel] [PATCH v2 26/67] target/arm: Implement SVE Permute - Extract Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 2 ++ target/arm/sve_helper.c | 81 ++++++++++++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 29 +++++++++++++++++ target/arm/sve.decode | 9 +++++- 4 files changed, 120 insertions(+), 1 deletion(-) -- 2.14.3 Reviewed-by: Peter Maydell diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 79493ab647..94f4356ce9 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -414,6 +414,8 @@ DEF_HELPER_FLAGS_4(sve_cpy_z_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) DEF_HELPER_FLAGS_4(sve_cpy_z_s, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) DEF_HELPER_FLAGS_4(sve_cpy_z_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(sve_ext, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 6a95d1ec48..fb3f54300b 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1469,3 +1469,84 @@ void HELPER(sve_cpy_z_d)(void *vd, void *vg, uint64_t val, uint32_t desc) d[i] = (pg[H1(i)] & 1 ? val : 0); } } + +/* Big-endian hosts need to frob the byte indices.
If the copy + * happens to be 8-byte aligned, then no frobbing necessary. + */ +static void swap_memmove(void *vd, void *vs, size_t n) +{ + uintptr_t d = (uintptr_t)vd; + uintptr_t s = (uintptr_t)vs; + uintptr_t o = (d | s | n) & 7; + size_t i; + +#ifndef HOST_WORDS_BIGENDIAN + o = 0; +#endif + switch (o) { + case 0: + memmove(vd, vs, n); + break; + + case 4: + if (d < s || d >= s + n) { + for (i = 0; i < n; i += 4) { + *(uint32_t *)H1_4(d + i) = *(uint32_t *)H1_4(s + i); + } + } else { + for (i = n; i > 0; ) { + i -= 4; + *(uint32_t *)H1_4(d + i) = *(uint32_t *)H1_4(s + i); + } + } + break; + + case 2: + case 6: + if (d < s || d >= s + n) { + for (i = 0; i < n; i += 2) { + *(uint16_t *)H1_2(d + i) = *(uint16_t *)H1_2(s + i); + } + } else { + for (i = n; i > 0; ) { + i -= 2; + *(uint16_t *)H1_2(d + i) = *(uint16_t *)H1_2(s + i); + } + } + break; + + default: + if (d < s || d >= s + n) { + for (i = 0; i < n; i++) { + *(uint8_t *)H1(d + i) = *(uint8_t *)H1(s + i); + } + } else { + for (i = n; i > 0; ) { + i -= 1; + *(uint8_t *)H1(d + i) = *(uint8_t *)H1(s + i); + } + } + break; + } +} + +void HELPER(sve_ext)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t opr_sz = simd_oprsz(desc); + size_t n_ofs = simd_data(desc); + size_t n_siz = opr_sz - n_ofs; + + if (vd != vm) { + swap_memmove(vd, vn + n_ofs, n_siz); + swap_memmove(vd + n_siz, vm, n_ofs); + } else if (vd != vn) { + swap_memmove(vd + n_siz, vd, n_ofs); + swap_memmove(vd, vn + n_ofs, n_siz); + } else { + /* vd == vn == vm. Need temp space. 
*/ + ARMVectorReg tmp; + swap_memmove(&tmp, vm, n_ofs); + swap_memmove(vd, vd + n_ofs, n_siz); + memcpy(vd + n_siz, &tmp, n_ofs); + } +} diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index dd085b084b..07a5eac092 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -1790,6 +1790,35 @@ static void trans_CPY_z_i(DisasContext *s, arg_CPY_z_i *a, uint32_t insn) tcg_temp_free_i64(t_imm); } +/* + *** SVE Permute Extract Group + */ + +static void trans_EXT(DisasContext *s, arg_EXT *a, uint32_t insn) +{ + unsigned vsz = vec_full_reg_size(s); + unsigned n_ofs = a->imm >= vsz ? 0 : a->imm; + unsigned n_siz = vsz - n_ofs; + unsigned d = vec_full_reg_offset(s, a->rd); + unsigned n = vec_full_reg_offset(s, a->rn); + unsigned m = vec_full_reg_offset(s, a->rm); + + /* Use host vector move insns if we have appropriate sizes + and no unfortunate overlap. */ + if (m != d + && n_ofs == size_for_gvec(n_ofs) + && n_siz == size_for_gvec(n_siz) + && (d != n || n_siz <= n_ofs)) { + tcg_gen_gvec_mov(0, d, n + n_ofs, n_siz, n_siz); + if (n_ofs != 0) { + tcg_gen_gvec_mov(0, d + n_siz, m, n_ofs, n_ofs); + } + return; + } + + tcg_gen_gvec_3_ool(d, n, m, vsz, vsz, n_ofs, gen_helper_sve_ext); +} + /* *** SVE Memory - 32-bit Gather and Unsized Contiguous Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index e6e10a4f84..5e3a9839d4 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -22,8 +22,9 @@ ########################################################################### # Named fields. These are primarily for disjoint fields. -%imm4_16_p1 16:4 !function=plus1 +%imm4_16_p1 16:4 !function=plus1 %imm6_22_5 22:1 5:5 +%imm8_16_10 16:5 10:3 %imm9_16_10 16:s6 10:3 %preg4_5 5:4 @@ -363,6 +364,12 @@ FCPY 00000101 .. 01 .... 110 imm:8 ..... @rdn_pg4 CPY_m_i 00000101 .. 01 .... 01 . ........ ..... @rdn_pg4 imm=%sh8_i8s CPY_z_i 00000101 .. 01 .... 00 . ........ ..... 
@rdn_pg4 imm=%sh8_i8s +### SVE Permute - Extract Group + +# SVE extract vector (immediate offset) +EXT 00000101 001 ..... 000 ... rm:5 rd:5 \ + &rrri rn=%reg_movprfx imm=%imm8_16_10 + ### SVE Predicate Logical Operations Group # SVE predicate logical operations From patchwork Sat Feb 17 18:22:43 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 128684 Delivered-To: patch@linaro.org Received: by 10.46.124.24 with SMTP id x24csp1823686ljc; Sat, 17 Feb 2018 10:36:37 -0800 (PST) X-Google-Smtp-Source: AH8x227z6VVsTPXlRf1YVA1giDqw0Rv4eZhwIWytwkXETpe5+gaCcFnb7h4K1cwRjqRkK1/RhYWW X-Received: by 10.37.174.29 with SMTP id a29mr5975522ybj.309.1518892597089; Sat, 17 Feb 2018 10:36:37 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1518892597; cv=none; d=google.com; s=arc-20160816; b=l3a14epTGSL6d1LXBmkKrU2N4ZTvrQMP5u61OnEel9mY5IU888UgwVv9z6Pk71PeeR f9i0pjx1gvXZldKGce/s8JY/47+23Azz/ZQnLS9FAmo6Sft5s3ZXYBV1UHUTp/4mWBNo /Jg29/0Ik9HLGOrFq7h4TUpMrJ6XTpmVJLOEDR80pP6XtINl0qci5WMmu1nNS8oVZyBG r5/F2CQyN9B2E02BIvtL+utFmaplvoYilU2zRJ6AfBT5/bZdZfJOurdjC7TSfcZZCuTS GEnVuYkMNAWNBAGa8ZYtmufmqKq+tk4hjAc6fDnetpB4+bxDiD5a//DOMlI/hMqyNTqk o5cA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature:arc-authentication-results; bh=qPsbfGNpCfDIlap2bj2lVE1RF8fhYlKz98g2CqBrEjw=; b=DYY0KdSAHq+o+lQ5F0cLdRPuZ9UbeiDPErma0rYtJZdBUB2ERDKT5JuPMo6/c9VGMx rGBQIdGkdAzjCevlvojX/0TGN4iKeNW1juhQapJc+gdg2QWmBD8JfkARvWKyGTvgwvVG qdQbwmcGCJRywWj8aAI8uPK0VfjDl8CEShSftcWUUeDdvE5VZM/cBoFDhVbPXiI9lZpI zlZq5Z/p6SX30+Nz/fgj4F9gbkojd14+VUL1vtsUHMvUcDR7HXXfxZiNldPZGejm7L7p r0paANpWjIbXQL6AtrNZ1c3CwZyBw52emVBqXLqgzLFJzbUPXmzmnI7xUG4VQIbNoUwL WhaA== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail 
From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:22:43 -0800 Message-Id: <20180217182323.25885-28-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized.
X-Received-From: 2607:f8b0:400e:c01::243 Subject: [Qemu-devel] [PATCH v2 27/67] target/arm: Implement SVE Permute - Unpredicated Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 23 +++++++++ target/arm/translate-a64.h | 14 +++--- target/arm/sve_helper.c | 114 +++++++++++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 113 ++++++++++++++++++++++++++++++++++++++++++++ target/arm/sve.decode | 29 +++++++++++- 5 files changed, 285 insertions(+), 8 deletions(-) -- 2.14.3 Reviewed-by: Peter Maydell diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 94f4356ce9..0c9aad575e 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -416,6 +416,29 @@ DEF_HELPER_FLAGS_4(sve_cpy_z_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) DEF_HELPER_FLAGS_4(sve_ext, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_insr_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(sve_insr_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(sve_insr_s, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(sve_insr_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) + +DEF_HELPER_FLAGS_3(sve_rev_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_rev_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_rev_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_rev_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve_tbl_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_tbl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_tbl_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_tbl_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + 
+DEF_HELPER_FLAGS_3(sve_sunpk_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_sunpk_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_sunpk_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(sve_uunpk_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_uunpk_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_uunpk_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/translate-a64.h b/target/arm/translate-a64.h index e519aee314..328aa7fce1 100644 --- a/target/arm/translate-a64.h +++ b/target/arm/translate-a64.h @@ -66,18 +66,18 @@ static inline void assert_fp_access_checked(DisasContext *s) static inline int vec_reg_offset(DisasContext *s, int regno, int element, TCGMemOp size) { - int offs = 0; + int element_size = 1 << size; + int offs = element * element_size; #ifdef HOST_WORDS_BIGENDIAN /* This is complicated slightly because vfp.zregs[n].d[0] is * still the low half and vfp.zregs[n].d[1] the high half * of the 128 bit vector, even on big endian systems. - * Calculate the offset assuming a fully bigendian 128 bits, - * then XOR to account for the order of the two 64 bit halves. + * Calculate the offset assuming a fully little-endian 128 bits, + * then XOR to account for the order of the 64 bit units. 
*/ - offs += (16 - ((element + 1) * (1 << size))); - offs ^= 8; -#else - offs += element * (1 << size); + if (element_size < 8) { + offs ^= 8 - element_size; + } #endif offs += offsetof(CPUARMState, vfp.zregs[regno]); assert_fp_access_checked(s); diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index fb3f54300b..466a209c1e 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1550,3 +1550,117 @@ void HELPER(sve_ext)(void *vd, void *vn, void *vm, uint32_t desc) memcpy(vd + n_siz, &tmp, n_ofs); } } + +#define DO_INSR(NAME, TYPE, H) \ +void HELPER(NAME)(void *vd, void *vn, uint64_t val, uint32_t desc) \ +{ \ + intptr_t opr_sz = simd_oprsz(desc); \ + swap_memmove(vd + sizeof(TYPE), vn, opr_sz - sizeof(TYPE)); \ + *(TYPE *)(vd + H(0)) = val; \ +} + +DO_INSR(sve_insr_b, uint8_t, H1) +DO_INSR(sve_insr_h, uint16_t, H1_2) +DO_INSR(sve_insr_s, uint32_t, H1_4) +DO_INSR(sve_insr_d, uint64_t, ) + +#undef DO_INSR + +void HELPER(sve_rev_b)(void *vd, void *vn, uint32_t desc) +{ + intptr_t i, j, opr_sz = simd_oprsz(desc); + for (i = 0, j = opr_sz - 8; i < opr_sz / 2; i += 8, j -= 8) { + uint64_t f = *(uint64_t *)(vn + i); + uint64_t b = *(uint64_t *)(vn + j); + *(uint64_t *)(vd + i) = bswap64(b); + *(uint64_t *)(vd + j) = bswap64(f); + } +} + +static inline uint64_t hswap64(uint64_t h) +{ + uint64_t m = 0x0000ffff0000ffffull; + h = rol64(h, 32); + return ((h & m) << 16) | ((h >> 16) & m); +} + +void HELPER(sve_rev_h)(void *vd, void *vn, uint32_t desc) +{ + intptr_t i, j, opr_sz = simd_oprsz(desc); + for (i = 0, j = opr_sz - 8; i < opr_sz / 2; i += 8, j -= 8) { + uint64_t f = *(uint64_t *)(vn + i); + uint64_t b = *(uint64_t *)(vn + j); + *(uint64_t *)(vd + i) = hswap64(b); + *(uint64_t *)(vd + j) = hswap64(f); + } +} + +void HELPER(sve_rev_s)(void *vd, void *vn, uint32_t desc) +{ + intptr_t i, j, opr_sz = simd_oprsz(desc); + for (i = 0, j = opr_sz - 8; i < opr_sz / 2; i += 8, j -= 8) { + uint64_t f = *(uint64_t *)(vn + i); + uint64_t b = *(uint64_t 
*)(vn + j); + *(uint64_t *)(vd + i) = rol64(b, 32); + *(uint64_t *)(vd + j) = rol64(f, 32); + } +} + +void HELPER(sve_rev_d)(void *vd, void *vn, uint32_t desc) +{ + intptr_t i, j, opr_sz = simd_oprsz(desc); + for (i = 0, j = opr_sz - 8; i < opr_sz / 2; i += 8, j -= 8) { + uint64_t f = *(uint64_t *)(vn + i); + uint64_t b = *(uint64_t *)(vn + j); + *(uint64_t *)(vd + i) = b; + *(uint64_t *)(vd + j) = f; + } +} + +#define DO_TBL(NAME, TYPE, H) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ +{ \ + intptr_t i, opr_sz = simd_oprsz(desc); \ + uintptr_t elem = opr_sz / sizeof(TYPE); \ + TYPE *d = vd, *n = vn, *m = vm; \ + ARMVectorReg tmp; \ + if (unlikely(vd == vn)) { \ + n = memcpy(&tmp, vn, opr_sz); \ + } \ + for (i = 0; i < elem; i++) { \ + TYPE j = m[H(i)]; \ + d[H(i)] = j < elem ? n[H(j)] : 0; \ + } \ +} + +DO_TBL(sve_tbl_b, uint8_t, H1) +DO_TBL(sve_tbl_h, uint16_t, H2) +DO_TBL(sve_tbl_s, uint32_t, H4) +DO_TBL(sve_tbl_d, uint64_t, ) + +#undef DO_TBL + +#define DO_UNPK(NAME, TYPED, TYPES, HD, HS) \ +void HELPER(NAME)(void *vd, void *vn, uint32_t desc) \ +{ \ + intptr_t i, opr_sz = simd_oprsz(desc); \ + TYPED *d = vd; \ + TYPES *n = vn; \ + ARMVectorReg tmp; \ + if (unlikely(vn - vd < opr_sz)) { \ + n = memcpy(&tmp, n, opr_sz / 2); \ + } \ + for (i = 0; i < opr_sz / sizeof(TYPED); i++) { \ + d[HD(i)] = n[HS(i)]; \ + } \ +} + +DO_UNPK(sve_sunpk_h, int16_t, int8_t, H2, H1) +DO_UNPK(sve_sunpk_s, int32_t, int16_t, H4, H2) +DO_UNPK(sve_sunpk_d, int64_t, int32_t, , H4) + +DO_UNPK(sve_uunpk_h, uint16_t, uint8_t, H2, H1) +DO_UNPK(sve_uunpk_s, uint32_t, uint16_t, H4, H2) +DO_UNPK(sve_uunpk_d, uint64_t, uint32_t, , H4) + +#undef DO_UNPK diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 07a5eac092..3724f6290c 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -1819,6 +1819,119 @@ static void trans_EXT(DisasContext *s, arg_EXT *a, uint32_t insn) tcg_gen_gvec_3_ool(d, n, m, vsz, vsz, n_ofs, gen_helper_sve_ext); }
+/* + *** SVE Permute - Unpredicated Group + */ + +static void trans_DUP_s(DisasContext *s, arg_DUP_s *a, uint32_t insn) +{ + unsigned vsz = vec_full_reg_size(s); + tcg_gen_gvec_dup_i64(a->esz, vec_full_reg_offset(s, a->rd), + vsz, vsz, cpu_reg_sp(s, a->rn)); +} + +static void trans_DUP_x(DisasContext *s, arg_DUP_x *a, uint32_t insn) +{ + unsigned vsz = vec_full_reg_size(s); + unsigned dofs = vec_full_reg_offset(s, a->rd); + unsigned esz, index; + + if ((a->imm & 0x1f) == 0) { + unallocated_encoding(s); + return; + } + esz = ctz32(a->imm); + index = a->imm >> (esz + 1); + + if ((index << esz) < vsz) { + unsigned nofs = vec_reg_offset(s, a->rn, index, esz); + tcg_gen_gvec_dup_mem(esz, dofs, nofs, vsz, vsz); + } else { + tcg_gen_gvec_dup64i(dofs, vsz, vsz, 0); + } +} + +static void do_insr_i64(DisasContext *s, arg_rrr_esz *a, TCGv_i64 val) +{ + typedef void gen_insr(TCGv_ptr, TCGv_ptr, TCGv_i64, TCGv_i32); + static gen_insr * const fns[4] = { + gen_helper_sve_insr_b, gen_helper_sve_insr_h, + gen_helper_sve_insr_s, gen_helper_sve_insr_d, + }; + unsigned vsz = vec_full_reg_size(s); + TCGv_i32 desc = tcg_const_i32(simd_desc(vsz, vsz, 0)); + TCGv_ptr t_zd = tcg_temp_new_ptr(); + TCGv_ptr t_zn = tcg_temp_new_ptr(); + + tcg_gen_addi_ptr(t_zd, cpu_env, vec_full_reg_offset(s, a->rd)); + tcg_gen_addi_ptr(t_zn, cpu_env, vec_full_reg_offset(s, a->rn)); + + fns[a->esz](t_zd, t_zn, val, desc); + + tcg_temp_free_ptr(t_zd); + tcg_temp_free_ptr(t_zn); + tcg_temp_free_i32(desc); +} + +static void trans_INSR_f(DisasContext *s, arg_rrr_esz *a, uint32_t insn) +{ + TCGv_i64 t = tcg_temp_new_i64(); + tcg_gen_ld_i64(t, cpu_env, vec_reg_offset(s, a->rm, 0, MO_64)); + do_insr_i64(s, a, t); + tcg_temp_free_i64(t); +} + +static void trans_INSR_r(DisasContext *s, arg_rrr_esz *a, uint32_t insn) +{ + do_insr_i64(s, a, cpu_reg(s, a->rm)); +} + +static void trans_REV_v(DisasContext *s, arg_rr_esz *a, uint32_t insn) +{ + static gen_helper_gvec_2 * const fns[4] = { + gen_helper_sve_rev_b, 
gen_helper_sve_rev_h, + gen_helper_sve_rev_s, gen_helper_sve_rev_d + }; + unsigned vsz = vec_full_reg_size(s); + + tcg_gen_gvec_2_ool(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vsz, vsz, 0, fns[a->esz]); +} + +static void trans_TBL(DisasContext *s, arg_rrr_esz *a, uint32_t insn) +{ + static gen_helper_gvec_3 * const fns[4] = { + gen_helper_sve_tbl_b, gen_helper_sve_tbl_h, + gen_helper_sve_tbl_s, gen_helper_sve_tbl_d + }; + unsigned vsz = vec_full_reg_size(s); + + tcg_gen_gvec_3_ool(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + vsz, vsz, 0, fns[a->esz]); +} + +static void trans_UNPK(DisasContext *s, arg_UNPK *a, uint32_t insn) +{ + static gen_helper_gvec_2 * const fns[4][2] = { + { NULL, NULL }, + { gen_helper_sve_sunpk_h, gen_helper_sve_uunpk_h }, + { gen_helper_sve_sunpk_s, gen_helper_sve_uunpk_s }, + { gen_helper_sve_sunpk_d, gen_helper_sve_uunpk_d }, + }; + unsigned vsz = vec_full_reg_size(s); + + if (a->esz == 0) { + unallocated_encoding(s); + return; + } + tcg_gen_gvec_2_ool(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn) + (a->h ? vsz / 2 : 0), + vsz, vsz, 0, fns[a->esz][a->u]); +} + /* *** SVE Memory - 32-bit Gather and Unsized Contiguous Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 5e3a9839d4..8af47ad27b 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -24,6 +24,7 @@ %imm4_16_p1 16:4 !function=plus1 %imm6_22_5 22:1 5:5 +%imm7_22_16 22:2 16:5 %imm8_16_10 16:5 10:3 %imm9_16_10 16:s6 10:3 %preg4_5 5:4 @@ -85,7 +86,9 @@ @pd_pg_pn_pm_s ........ . s:1 .. rm:4 .. pg:4 . rn:4 . rd:4 &rprr_s # Three operand, vector element size -@rd_rn_rm ........ esz:2 . rm:5 ... ... rn:5 rd:5 &rrr_esz +@rd_rn_rm ........ esz:2 . rm:5 ... ... rn:5 rd:5 &rrr_esz +@rdn_rm ........ esz:2 ...... ...... rm:5 rd:5 \ + &rrr_esz rn=%reg_movprfx # Three operand with "memory" size, aka immediate left shift @rd_rn_msz_rm ........ ... rm:5 .... 
imm:2 rn:5 rd:5 &rrri @@ -370,6 +373,30 @@ CPY_z_i 00000101 .. 01 .... 00 . ........ ..... @rdn_pg4 imm=%sh8_i8s EXT 00000101 001 ..... 000 ... rm:5 rd:5 \ &rrri rn=%reg_movprfx imm=%imm8_16_10 +### SVE Permute - Unpredicated Group + +# SVE broadcast general register +DUP_s 00000101 .. 1 00000 001110 ..... ..... @rd_rn + +# SVE broadcast indexed element +DUP_x 00000101 .. 1 ..... 001000 rn:5 rd:5 \ + &rri imm=%imm7_22_16 + +# SVE insert SIMD&FP scalar register +INSR_f 00000101 .. 1 10100 001110 ..... ..... @rdn_rm + +# SVE insert general register +INSR_r 00000101 .. 1 00100 001110 ..... ..... @rdn_rm + +# SVE reverse vector elements +REV_v 00000101 .. 1 11000 001110 ..... ..... @rd_rn + +# SVE vector table lookup +TBL 00000101 .. 1 ..... 001100 ..... ..... @rd_rn_rm + +# SVE unpack vector elements +UNPK 00000101 esz:2 1100 u:1 h:1 001110 rn:5 rd:5 + ### SVE Predicate Logical Operations Group # SVE predicate logical operations From patchwork Sat Feb 17 18:22:44 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 128699 Delivered-To: patch@linaro.org Received: by 10.46.124.24 with SMTP id x24csp1831353ljc; Sat, 17 Feb 2018 10:49:01 -0800 (PST) X-Google-Smtp-Source: AH8x225ipDIIBdhNwDOF4UOF82ftyOPGjhWhoIexPM1kG5n+u8BS3dYOOPD3aH6H5BVgTl2Eq97O X-Received: by 10.37.8.6 with SMTP id 6mr6941157ybi.203.1518893340910; Sat, 17 Feb 2018 10:49:00 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1518893340; cv=none; d=google.com; s=arc-20160816; b=cJ4lROGEfidArkEqTXVlf7QCyQkf84NPkJbyttCCoLX9jswZIcovyfSW75H/YLpR4l WAeH8uMgJCQhhepXV+xkYv+aeNhUJW9n/1ya83LeTEaYGE3De4nooz19k7gp0FdlNRQs L/1Fut4twfXsvWZFZig3z5wNDI9jly6RDjoznBL4dQEkKW1kleGyZoooL9UdRo+C+s3z xXD4k9AwN+b2uqaMAjrGBmhtTWLHaVh0qoepNeEtAhJO7pvbzh02es906qfBg+FzJl84 lNN+0xFTpnDzsWzWa58dU8acA+XdJ8d3jsSw9t4hpvCZLcd7t316RS3PsORyPim6kNFR 7imw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; 
From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:22:44 -0800 Message-Id: <20180217182323.25885-29-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized.
X-Received-From: 2607:f8b0:400e:c00::241 Subject: [Qemu-devel] [PATCH v2 28/67] target/arm: Implement SVE Permute - Predicates Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 6 + target/arm/sve_helper.c | 280 +++++++++++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 110 ++++++++++++++++++ target/arm/sve.decode | 18 +++ 4 files changed, 414 insertions(+) -- 2.14.3 Reviewed-by: Peter Maydell diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 0c9aad575e..ff958fcebd 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -439,6 +439,12 @@ DEF_HELPER_FLAGS_3(sve_uunpk_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve_uunpk_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve_uunpk_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_zip_p, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_uzp_p, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_trn_p, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_rev_p, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_punpk_p, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 466a209c1e..c3a2706a16 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1664,3 +1664,283 @@ DO_UNPK(sve_uunpk_s, uint32_t, uint16_t, H4, H2) DO_UNPK(sve_uunpk_d, uint64_t, uint32_t, , H4) #undef DO_UNPK + +static 
const uint64_t expand_bit_data[5][2] = { + { 0x1111111111111111ull, 0x2222222222222222ull }, + { 0x0303030303030303ull, 0x0c0c0c0c0c0c0c0cull }, + { 0x000f000f000f000full, 0x00f000f000f000f0ull }, + { 0x000000ff000000ffull, 0x0000ff000000ff00ull }, + { 0x000000000000ffffull, 0x00000000ffff0000ull } +}; + +/* Expand units of 2**N bits to units of 2**(N+1) bits, + with the higher bits zero. */ +static uint64_t expand_bits(uint64_t x, int n) +{ + int i, sh; + for (i = 4, sh = 16; i >= n; i--, sh >>= 1) { + x = ((x & expand_bit_data[i][1]) << sh) | (x & expand_bit_data[i][0]); + } + return x; +} + +/* Compress units of 2**(N+1) bits to units of 2**N bits. */ +static uint64_t compress_bits(uint64_t x, int n) +{ + int i, sh; + for (i = n, sh = 1 << n; i <= 4; i++, sh <<= 1) { + x = ((x >> sh) & expand_bit_data[i][1]) | (x & expand_bit_data[i][0]); + } + return x; +} + +void HELPER(sve_zip_p)(void *vd, void *vn, void *vm, uint32_t pred_desc) +{ + intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + int esz = extract32(pred_desc, SIMD_DATA_SHIFT, 2); + intptr_t high = extract32(pred_desc, SIMD_DATA_SHIFT + 2, 1); + uint64_t *d = vd; + intptr_t i; + + if (oprsz <= 8) { + uint64_t nn = *(uint64_t *)vn; + uint64_t mm = *(uint64_t *)vm; + int half = 4 * oprsz; + + nn = extract64(nn, high * half, half); + mm = extract64(mm, high * half, half); + nn = expand_bits(nn, esz); + mm = expand_bits(mm, esz); + d[0] = nn + (mm << (1 << esz)); + } else { + ARMPredicateReg tmp_n, tmp_m; + + /* We produce output faster than we consume input. + Therefore we must be mindful of possible overlap. 
*/ + if ((vn - vd) < (uintptr_t)oprsz) { + vn = memcpy(&tmp_n, vn, oprsz); + } + if ((vm - vd) < (uintptr_t)oprsz) { + vm = memcpy(&tmp_m, vm, oprsz); + } + if (high) { + high = oprsz >> 1; + } + + if ((high & 3) == 0) { + uint32_t *n = vn, *m = vm; + high >>= 2; + + for (i = 0; i < DIV_ROUND_UP(oprsz, 8); i++) { + uint64_t nn = n[H4(high + i)]; + uint64_t mm = m[H4(high + i)]; + + nn = expand_bits(nn, esz); + mm = expand_bits(mm, esz); + d[i] = nn + (mm << (1 << esz)); + } + } else { + uint8_t *n = vn, *m = vm; + uint16_t *d16 = vd; + + for (i = 0; i < oprsz / 2; i++) { + uint16_t nn = n[H1(high + i)]; + uint16_t mm = m[H1(high + i)]; + + nn = expand_bits(nn, esz); + mm = expand_bits(mm, esz); + d16[H2(i)] = nn + (mm << (1 << esz)); + } + } + } +} + +void HELPER(sve_uzp_p)(void *vd, void *vn, void *vm, uint32_t pred_desc) +{ + intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + int esz = extract32(pred_desc, SIMD_DATA_SHIFT, 2); + int odd = extract32(pred_desc, SIMD_DATA_SHIFT + 2, 1) << esz; + uint64_t *d = vd, *n = vn, *m = vm; + uint64_t l, h; + intptr_t i; + + if (oprsz <= 8) { + l = compress_bits(n[0] >> odd, esz); + h = compress_bits(m[0] >> odd, esz); + d[0] = extract64(l + (h << (4 * oprsz)), 0, 8 * oprsz); + } else { + ARMPredicateReg tmp_m; + intptr_t oprsz_16 = oprsz / 16; + + if ((vm - vd) < (uintptr_t)oprsz) { + m = memcpy(&tmp_m, vm, oprsz); + } + + for (i = 0; i < oprsz_16; i++) { + l = n[2 * i + 0]; + h = n[2 * i + 1]; + l = compress_bits(l >> odd, esz); + h = compress_bits(h >> odd, esz); + d[i] = l + (h << 32); + } + + /* For VL which is not a power of 2, the results from M do not + align nicely with the uint64_t for D. Put the aligned results + from M into TMP_M and then copy it into place afterward. 
*/ + if (oprsz & 15) { + d[i] = compress_bits(n[2 * i] >> odd, esz); + + for (i = 0; i < oprsz_16; i++) { + l = m[2 * i + 0]; + h = m[2 * i + 1]; + l = compress_bits(l >> odd, esz); + h = compress_bits(h >> odd, esz); + tmp_m.p[i] = l + (h << 32); + } + tmp_m.p[i] = compress_bits(m[2 * i] >> odd, esz); + + swap_memmove(vd + oprsz / 2, &tmp_m, oprsz / 2); + } else { + for (i = 0; i < oprsz_16; i++) { + l = m[2 * i + 0]; + h = m[2 * i + 1]; + l = compress_bits(l >> odd, esz); + h = compress_bits(h >> odd, esz); + d[oprsz_16 + i] = l + (h << 32); + } + } + } +} + +static const uint64_t even_bit_esz_masks[4] = { + 0x5555555555555555ull, + 0x3333333333333333ull, + 0x0f0f0f0f0f0f0f0full, + 0x00ff00ff00ff00ffull +}; + +void HELPER(sve_trn_p)(void *vd, void *vn, void *vm, uint32_t pred_desc) +{ + intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + uintptr_t esz = extract32(pred_desc, SIMD_DATA_SHIFT, 2); + bool odd = extract32(pred_desc, SIMD_DATA_SHIFT + 2, 1); + uint64_t *d = vd, *n = vn, *m = vm; + uint64_t mask; + int shr, shl; + intptr_t i; + + shl = 1 << esz; + shr = 0; + mask = even_bit_esz_masks[esz]; + if (odd) { + mask <<= shl; + shr = shl; + shl = 0; + } + + for (i = 0; i < DIV_ROUND_UP(oprsz, 8); i++) { + uint64_t nn = (n[i] & mask) >> shr; + uint64_t mm = (m[i] & mask) << shl; + d[i] = nn + mm; + } +} + +/* Reverse units of 2**N bits. 
*/ +static uint64_t reverse_bits_64(uint64_t x, int n) +{ + int i, sh; + + x = bswap64(x); + for (i = 2, sh = 4; i >= n; i--, sh >>= 1) { + uint64_t mask = even_bit_esz_masks[i]; + x = ((x & mask) << sh) | ((x >> sh) & mask); + } + return x; +} + +static uint8_t reverse_bits_8(uint8_t x, int n) +{ + static const uint8_t mask[3] = { 0x55, 0x33, 0x0f }; + int i, sh; + + for (i = 2, sh = 4; i >= n; i--, sh >>= 1) { + x = ((x & mask[i]) << sh) | ((x >> sh) & mask[i]); + } + return x; +} + +void HELPER(sve_rev_p)(void *vd, void *vn, uint32_t pred_desc) +{ + intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + int esz = extract32(pred_desc, SIMD_DATA_SHIFT, 2); + intptr_t i, oprsz_2 = oprsz / 2; + + if (oprsz <= 8) { + uint64_t l = *(uint64_t *)vn; + l = reverse_bits_64(l << (64 - 8 * oprsz), esz); + *(uint64_t *)vd = l; + } else if ((oprsz & 15) == 0) { + for (i = 0; i < oprsz_2; i += 8) { + intptr_t ih = oprsz - 8 - i; + uint64_t l = reverse_bits_64(*(uint64_t *)(vn + i), esz); + uint64_t h = reverse_bits_64(*(uint64_t *)(vn + ih), esz); + *(uint64_t *)(vd + i) = h; + *(uint64_t *)(vd + ih) = l; + } + } else { + for (i = 0; i < oprsz_2; i += 1) { + intptr_t il = H1(i); + intptr_t ih = H1(oprsz - 1 - i); + uint8_t l = reverse_bits_8(*(uint8_t *)(vn + il), esz); + uint8_t h = reverse_bits_8(*(uint8_t *)(vn + ih), esz); + *(uint8_t *)(vd + il) = h; + *(uint8_t *)(vd + ih) = l; + } + } +} + +void HELPER(sve_punpk_p)(void *vd, void *vn, uint32_t pred_desc) +{ + intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + intptr_t high = extract32(pred_desc, SIMD_DATA_SHIFT + 2, 1); + uint64_t *d = vd; + intptr_t i; + + if (oprsz <= 8) { + uint64_t nn = *(uint64_t *)vn; + int half = 4 * oprsz; + + nn = extract64(nn, high * half, half); + nn = expand_bits(nn, 0); + d[0] = nn; + } else { + ARMPredicateReg tmp_n; + + /* We produce output faster than we consume input. + Therefore we must be mindful of possible overlap. 
*/ + if ((vn - vd) < (uintptr_t)oprsz) { + vn = memcpy(&tmp_n, vn, oprsz); + } + if (high) { + high = oprsz >> 1; + } + + if ((high & 3) == 0) { + uint32_t *n = vn; + high >>= 2; + + for (i = 0; i < DIV_ROUND_UP(oprsz, 8); i++) { + uint64_t nn = n[H4(high + i)]; + d[i] = expand_bits(nn, 0); + } + } else { + uint16_t *d16 = vd; + uint8_t *n = vn; + + for (i = 0; i < oprsz / 2; i++) { + uint16_t nn = n[H1(high + i)]; + d16[H2(i)] = expand_bits(nn, 0); + } + } + } +} diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 3724f6290c..45e1ea87bf 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -1932,6 +1932,116 @@ static void trans_UNPK(DisasContext *s, arg_UNPK *a, uint32_t insn) vsz, vsz, 0, fns[a->esz][a->u]); } +/* + *** SVE Permute - Predicates Group + */ + +static void do_perm_pred3(DisasContext *s, arg_rrr_esz *a, bool high_odd, + gen_helper_gvec_3 *fn) +{ + unsigned vsz = pred_full_reg_size(s); + + /* Predicate sizes may be smaller and cannot use simd_desc. + We cannot round up, as we do elsewhere, because we need + the exact size for ZIP2 and REV. We retain the style for + the other helpers for consistency. 
*/ + TCGv_ptr t_d = tcg_temp_new_ptr(); + TCGv_ptr t_n = tcg_temp_new_ptr(); + TCGv_ptr t_m = tcg_temp_new_ptr(); + TCGv_i32 t_desc; + int desc; + + desc = vsz - 2; + desc = deposit32(desc, SIMD_DATA_SHIFT, 2, a->esz); + desc = deposit32(desc, SIMD_DATA_SHIFT + 2, 2, high_odd); + + tcg_gen_addi_ptr(t_d, cpu_env, pred_full_reg_offset(s, a->rd)); + tcg_gen_addi_ptr(t_n, cpu_env, pred_full_reg_offset(s, a->rn)); + tcg_gen_addi_ptr(t_m, cpu_env, pred_full_reg_offset(s, a->rm)); + t_desc = tcg_const_i32(desc); + + fn(t_d, t_n, t_m, t_desc); + + tcg_temp_free_ptr(t_d); + tcg_temp_free_ptr(t_n); + tcg_temp_free_ptr(t_m); + tcg_temp_free_i32(t_desc); +} + +static void do_perm_pred2(DisasContext *s, arg_rr_esz *a, bool high_odd, + gen_helper_gvec_2 *fn) +{ + unsigned vsz = pred_full_reg_size(s); + TCGv_ptr t_d = tcg_temp_new_ptr(); + TCGv_ptr t_n = tcg_temp_new_ptr(); + TCGv_i32 t_desc; + int desc; + + tcg_gen_addi_ptr(t_d, cpu_env, pred_full_reg_offset(s, a->rd)); + tcg_gen_addi_ptr(t_n, cpu_env, pred_full_reg_offset(s, a->rn)); + + /* Predicate sizes may be smaller and cannot use simd_desc. + We cannot round up, as we do elsewhere, because we need + the exact size for ZIP2 and REV. We retain the style for + the other helpers for consistency. 
*/ + + desc = vsz - 2; + desc = deposit32(desc, SIMD_DATA_SHIFT, 2, a->esz); + desc = deposit32(desc, SIMD_DATA_SHIFT + 2, 2, high_odd); + t_desc = tcg_const_i32(desc); + + fn(t_d, t_n, t_desc); + + tcg_temp_free_i32(t_desc); + tcg_temp_free_ptr(t_d); + tcg_temp_free_ptr(t_n); +} + +static void trans_ZIP1_p(DisasContext *s, arg_rrr_esz *a, uint32_t insn) +{ + do_perm_pred3(s, a, 0, gen_helper_sve_zip_p); +} + +static void trans_ZIP2_p(DisasContext *s, arg_rrr_esz *a, uint32_t insn) +{ + do_perm_pred3(s, a, 1, gen_helper_sve_zip_p); +} + +static void trans_UZP1_p(DisasContext *s, arg_rrr_esz *a, uint32_t insn) +{ + do_perm_pred3(s, a, 0, gen_helper_sve_uzp_p); +} + +static void trans_UZP2_p(DisasContext *s, arg_rrr_esz *a, uint32_t insn) +{ + do_perm_pred3(s, a, 1, gen_helper_sve_uzp_p); +} + +static void trans_TRN1_p(DisasContext *s, arg_rrr_esz *a, uint32_t insn) +{ + do_perm_pred3(s, a, 0, gen_helper_sve_trn_p); +} + +static void trans_TRN2_p(DisasContext *s, arg_rrr_esz *a, uint32_t insn) +{ + do_perm_pred3(s, a, 1, gen_helper_sve_trn_p); +} + +static void trans_REV_p(DisasContext *s, arg_rr_esz *a, uint32_t insn) +{ + do_perm_pred2(s, a, 0, gen_helper_sve_rev_p); +} + +static void trans_PUNPKLO(DisasContext *s, arg_PUNPKLO *a, uint32_t insn) +{ + do_perm_pred2(s, a, 0, gen_helper_sve_punpk_p); +} + +static void trans_PUNPKHI(DisasContext *s, arg_PUNPKHI *a, uint32_t insn) +{ + do_perm_pred2(s, a, 1, gen_helper_sve_punpk_p); +} + /* *** SVE Memory - 32-bit Gather and Unsized Contiguous Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 8af47ad27b..bcbe84c3a6 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -87,6 +87,7 @@ # Three operand, vector element size @rd_rn_rm ........ esz:2 . rm:5 ... ... rn:5 rd:5 &rrr_esz +@pd_pn_pm ........ esz:2 .. rm:4 ....... rn:4 . rd:4 &rrr_esz @rdn_rm ........ esz:2 ...... ...... rm:5 rd:5 \ &rrr_esz rn=%reg_movprfx @@ -397,6 +398,23 @@ TBL 00000101 .. 1 ..... 001100 ..... ..... 
@rd_rn_rm # SVE unpack vector elements UNPK 00000101 esz:2 1100 u:1 h:1 001110 rn:5 rd:5 +### SVE Permute - Predicates Group + +# SVE permute predicate elements +ZIP1_p 00000101 .. 10 .... 010 000 0 .... 0 .... @pd_pn_pm +ZIP2_p 00000101 .. 10 .... 010 001 0 .... 0 .... @pd_pn_pm +UZP1_p 00000101 .. 10 .... 010 010 0 .... 0 .... @pd_pn_pm +UZP2_p 00000101 .. 10 .... 010 011 0 .... 0 .... @pd_pn_pm +TRN1_p 00000101 .. 10 .... 010 100 0 .... 0 .... @pd_pn_pm +TRN2_p 00000101 .. 10 .... 010 101 0 .... 0 .... @pd_pn_pm + +# SVE reverse predicate elements +REV_p 00000101 .. 11 0100 010 000 0 .... 0 .... @pd_pn + +# SVE unpack predicate elements +PUNPKLO 00000101 00 11 0000 010 000 0 .... 0 .... @pd_pn_e0 +PUNPKHI 00000101 00 11 0001 010 000 0 .... 0 .... @pd_pn_e0 + ### SVE Predicate Logical Operations Group # SVE predicate logical operations From patchwork Sat Feb 17 18:22:45 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 128712
From: Richard Henderson To:
qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:22:45 -0800 Message-Id: <20180217182323.25885-30-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c00::243 Subject: [Qemu-devel] [PATCH v2 29/67] target/arm: Implement SVE Permute - Interleaving Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 15 ++++++++++ target/arm/sve_helper.c | 72 ++++++++++++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 69 ++++++++++++++++++++++++++++++++++++++++++++ target/arm/sve.decode | 10 +++++++ 4 files changed, 166 insertions(+) -- 2.14.3 Reviewed-by: Peter Maydell diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index ff958fcebd..bab20345c6 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -445,6 +445,21 @@ DEF_HELPER_FLAGS_4(sve_trn_p, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve_rev_p, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve_punpk_p, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_zip_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_zip_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_zip_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_zip_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve_uzp_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_uzp_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_uzp_s, TCG_CALL_NO_RWG, 
void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_uzp_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve_trn_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_trn_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_trn_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_trn_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index c3a2706a16..62982bd099 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1944,3 +1944,75 @@ void HELPER(sve_punpk_p)(void *vd, void *vn, uint32_t pred_desc) } } } + +#define DO_ZIP(NAME, TYPE, H) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ +{ \ + intptr_t oprsz = simd_oprsz(desc); \ + intptr_t i, oprsz_2 = oprsz / 2; \ + ARMVectorReg tmp_n, tmp_m; \ + /* We produce output faster than we consume input. \ + Therefore we must be mindful of possible overlap. 
*/ \ + if (unlikely((vn - vd) < (uintptr_t)oprsz)) { \ + vn = memcpy(&tmp_n, vn, oprsz_2); \ + } \ + if (unlikely((vm - vd) < (uintptr_t)oprsz)) { \ + vm = memcpy(&tmp_m, vm, oprsz_2); \ + } \ + for (i = 0; i < oprsz_2; i += sizeof(TYPE)) { \ + *(TYPE *)(vd + H(2 * i + 0)) = *(TYPE *)(vn + H(i)); \ + *(TYPE *)(vd + H(2 * i + sizeof(TYPE))) = *(TYPE *)(vm + H(i)); \ + } \ +} + +DO_ZIP(sve_zip_b, uint8_t, H1) +DO_ZIP(sve_zip_h, uint16_t, H1_2) +DO_ZIP(sve_zip_s, uint32_t, H1_4) +DO_ZIP(sve_zip_d, uint64_t, ) + +#define DO_UZP(NAME, TYPE, H) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ +{ \ + intptr_t oprsz = simd_oprsz(desc); \ + intptr_t oprsz_2 = oprsz / 2; \ + intptr_t odd_ofs = simd_data(desc); \ + intptr_t i; \ + ARMVectorReg tmp_m; \ + if (unlikely((vm - vd) < (uintptr_t)oprsz)) { \ + vm = memcpy(&tmp_m, vm, oprsz); \ + } \ + for (i = 0; i < oprsz_2; i += sizeof(TYPE)) { \ + *(TYPE *)(vd + H(i)) = *(TYPE *)(vn + H(2 * i + odd_ofs)); \ + } \ + for (i = 0; i < oprsz_2; i += sizeof(TYPE)) { \ + *(TYPE *)(vd + H(oprsz_2 + i)) = *(TYPE *)(vm + H(2 * i + odd_ofs)); \ + } \ +} + +DO_UZP(sve_uzp_b, uint8_t, H1) +DO_UZP(sve_uzp_h, uint16_t, H1_2) +DO_UZP(sve_uzp_s, uint32_t, H1_4) +DO_UZP(sve_uzp_d, uint64_t, ) + +#define DO_TRN(NAME, TYPE, H) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ +{ \ + intptr_t oprsz = simd_oprsz(desc); \ + intptr_t odd_ofs = simd_data(desc); \ + intptr_t i; \ + for (i = 0; i < oprsz; i += 2 * sizeof(TYPE)) { \ + TYPE ae = *(TYPE *)(vn + H(i + odd_ofs)); \ + TYPE be = *(TYPE *)(vm + H(i + odd_ofs)); \ + *(TYPE *)(vd + H(i + 0)) = ae; \ + *(TYPE *)(vd + H(i + sizeof(TYPE))) = be; \ + } \ +} + +DO_TRN(sve_trn_b, uint8_t, H1) +DO_TRN(sve_trn_h, uint16_t, H1_2) +DO_TRN(sve_trn_s, uint32_t, H1_4) +DO_TRN(sve_trn_d, uint64_t, ) + +#undef DO_ZIP +#undef DO_UZP +#undef DO_TRN diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 45e1ea87bf..09ac955a36 100644 --- 
a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -2042,6 +2042,75 @@ static void trans_PUNPKHI(DisasContext *s, arg_PUNPKHI *a, uint32_t insn) do_perm_pred2(s, a, 1, gen_helper_sve_punpk_p); } +/* + *** SVE Permute - Interleaving Group + */ + +static void do_zip(DisasContext *s, arg_rrr_esz *a, bool high) +{ + static gen_helper_gvec_3 * const fns[4] = { + gen_helper_sve_zip_b, gen_helper_sve_zip_h, + gen_helper_sve_zip_s, gen_helper_sve_zip_d, + }; + unsigned vsz = vec_full_reg_size(s); + unsigned high_ofs = high ? vsz / 2 : 0; + + tcg_gen_gvec_3_ool(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn) + high_ofs, + vec_full_reg_offset(s, a->rm) + high_ofs, + vsz, vsz, 0, fns[a->esz]); +} + +static void do_zzz_data_ool(DisasContext *s, arg_rrr_esz *a, int data, + gen_helper_gvec_3 *fn) +{ + unsigned vsz = vec_full_reg_size(s); + tcg_gen_gvec_3_ool(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + vsz, vsz, data, fn); +} + +static void trans_ZIP1_z(DisasContext *s, arg_rrr_esz *a, uint32_t insn) +{ + do_zip(s, a, false); +} + +static void trans_ZIP2_z(DisasContext *s, arg_rrr_esz *a, uint32_t insn) +{ + do_zip(s, a, true); +} + +static gen_helper_gvec_3 * const uzp_fns[4] = { + gen_helper_sve_uzp_b, gen_helper_sve_uzp_h, + gen_helper_sve_uzp_s, gen_helper_sve_uzp_d, +}; + +static void trans_UZP1_z(DisasContext *s, arg_rrr_esz *a, uint32_t insn) +{ + do_zzz_data_ool(s, a, 0, uzp_fns[a->esz]); +} + +static void trans_UZP2_z(DisasContext *s, arg_rrr_esz *a, uint32_t insn) +{ + do_zzz_data_ool(s, a, 1 << a->esz, uzp_fns[a->esz]); +} + +static gen_helper_gvec_3 * const trn_fns[4] = { + gen_helper_sve_trn_b, gen_helper_sve_trn_h, + gen_helper_sve_trn_s, gen_helper_sve_trn_d, +}; + +static void trans_TRN1_z(DisasContext *s, arg_rrr_esz *a, uint32_t insn) +{ + do_zzz_data_ool(s, a, 0, trn_fns[a->esz]); +} + +static void trans_TRN2_z(DisasContext *s, arg_rrr_esz *a, uint32_t insn) +{ + 
do_zzz_data_ool(s, a, 1 << a->esz, trn_fns[a->esz]); +} + /* *** SVE Memory - 32-bit Gather and Unsized Contiguous Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index bcbe84c3a6..2efa3773fc 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -415,6 +415,16 @@ REV_p 00000101 .. 11 0100 010 000 0 .... 0 .... @pd_pn PUNPKLO 00000101 00 11 0000 010 000 0 .... 0 .... @pd_pn_e0 PUNPKHI 00000101 00 11 0001 010 000 0 .... 0 .... @pd_pn_e0 +### SVE Permute - Interleaving Group + +# SVE permute vector elements +ZIP1_z 00000101 .. 1 ..... 011 000 ..... ..... @rd_rn_rm +ZIP2_z 00000101 .. 1 ..... 011 001 ..... ..... @rd_rn_rm +UZP1_z 00000101 .. 1 ..... 011 010 ..... ..... @rd_rn_rm +UZP2_z 00000101 .. 1 ..... 011 011 ..... ..... @rd_rn_rm +TRN1_z 00000101 .. 1 ..... 011 100 ..... ..... @rd_rn_rm +TRN2_z 00000101 .. 1 ..... 011 101 ..... ..... @rd_rn_rm + ### SVE Predicate Logical Operations Group # SVE predicate logical operations From patchwork Sat Feb 17 18:22:46 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 128716
From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:22:46 -0800 Message-Id: <20180217182323.25885-31-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized.
X-Received-From: 2607:f8b0:400e:c05::241 Subject: [Qemu-devel] [PATCH v2 30/67] target/arm: Implement SVE compress active elements X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 3 +++ target/arm/sve_helper.c | 34 ++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 12 ++++++++++++ target/arm/sve.decode | 6 ++++++ 4 files changed, 55 insertions(+) -- 2.14.3 Reviewed-by: Peter Maydell diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index bab20345c6..d977aea00d 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -460,6 +460,9 @@ DEF_HELPER_FLAGS_4(sve_trn_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_trn_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_trn_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_compact_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_compact_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 62982bd099..87a1a32232 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -2016,3 +2016,37 @@ DO_TRN(sve_trn_d, uint64_t, ) #undef DO_ZIP #undef DO_UZP #undef DO_TRN + +void HELPER(sve_compact_s)(void *vd, void *vn, void *vg, uint32_t desc) +{ + intptr_t i, j, opr_sz = simd_oprsz(desc) / 4; + uint32_t *d = vd, *n = vn; + uint8_t *pg = vg; + + for (i = j = 0; i < opr_sz; i++) { + if (pg[H1(i / 2)] & (i & 1 ? 
0x10 : 0x01)) { + d[H4(j)] = n[H4(i)]; + j++; + } + } + for (; j < opr_sz; j++) { + d[H4(j)] = 0; + } +} + +void HELPER(sve_compact_d)(void *vd, void *vn, void *vg, uint32_t desc) +{ + intptr_t i, j, opr_sz = simd_oprsz(desc) / 8; + uint64_t *d = vd, *n = vn; + uint8_t *pg = vg; + + for (i = j = 0; i < opr_sz; i++) { + if (pg[H1(i)] & 1) { + d[j] = n[i]; + j++; + } + } + for (; j < opr_sz; j++) { + d[j] = 0; + } +} diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 09ac955a36..21531b259c 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -2111,6 +2111,18 @@ static void trans_TRN2_z(DisasContext *s, arg_rrr_esz *a, uint32_t insn) do_zzz_data_ool(s, a, 1 << a->esz, trn_fns[a->esz]); } +/* + *** SVE Permute Vector - Predicated Group + */ + +static void trans_COMPACT(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + static gen_helper_gvec_3 * const fns[4] = { + NULL, NULL, gen_helper_sve_compact_s, gen_helper_sve_compact_d + }; + do_zpz_ool(s, a, fns[a->esz]); +} + /* *** SVE Memory - 32-bit Gather and Unsized Contiguous Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 2efa3773fc..a89bd37eeb 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -425,6 +425,12 @@ UZP2_z 00000101 .. 1 ..... 011 011 ..... ..... @rd_rn_rm TRN1_z 00000101 .. 1 ..... 011 100 ..... ..... @rd_rn_rm TRN2_z 00000101 .. 1 ..... 011 101 ..... ..... @rd_rn_rm +### SVE Permute - Predicated Group + +# SVE compress active elements +# Note esz >= 2 +COMPACT 00000101 .. 100001 100 ... ..... ..... 
@rd_pg_rn + ### SVE Predicate Logical Operations Group # SVE predicate logical operations From patchwork Sat Feb 17 18:22:47 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 128719
From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:22:47 -0800 Message-Id: <20180217182323.25885-32-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized.
X-Received-From: 2607:f8b0:400e:c01::244 Subject: [Qemu-devel] [PATCH v2 31/67] target/arm: Implement SVE conditionally broadcast/extract element X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 2 + target/arm/sve_helper.c | 11 ++ target/arm/translate-sve.c | 299 +++++++++++++++++++++++++++++++++++++++++++++ target/arm/sve.decode | 20 +++ 4 files changed, 332 insertions(+) -- 2.14.3 Reviewed-by: Peter Maydell diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index d977aea00d..a58fb4ba01 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -463,6 +463,8 @@ DEF_HELPER_FLAGS_4(sve_trn_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_compact_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_compact_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_2(sve_last_active_element, TCG_CALL_NO_RWG, s32, ptr, i32) + DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 87a1a32232..ee289be642 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -2050,3 +2050,14 @@ void HELPER(sve_compact_d)(void *vd, void *vn, void *vg, uint32_t desc) d[j] = 0; } } + +/* Similar to the ARM LastActiveElement pseudocode function, except the + result is multiplied by the element size. This includes the not found + indication; e.g. not found for esz=3 is -8. 
*/ +int32_t HELPER(sve_last_active_element)(void *vg, uint32_t pred_desc) +{ + intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + intptr_t esz = extract32(pred_desc, SIMD_DATA_SHIFT, 2); + + return last_active_element(vg, DIV_ROUND_UP(oprsz, 8), esz); +} diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 21531b259c..207a22a0bc 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -2123,6 +2123,305 @@ static void trans_COMPACT(DisasContext *s, arg_rpr_esz *a, uint32_t insn) do_zpz_ool(s, a, fns[a->esz]); } +/* Call the helper that computes the ARM LastActiveElement pseudocode + function, scaled by the element size. This includes the not found + indication; e.g. not found for esz=3 is -8. */ +static void find_last_active(DisasContext *s, TCGv_i32 ret, int esz, int pg) +{ + /* Predicate sizes may be smaller and cannot use simd_desc. We cannot + round up, as we do elsewhere, because we need the exact size. */ + TCGv_ptr t_p = tcg_temp_new_ptr(); + TCGv_i32 t_desc; + unsigned vsz = pred_full_reg_size(s); + unsigned desc; + + desc = vsz - 2; + desc = deposit32(desc, SIMD_DATA_SHIFT, 2, esz); + + tcg_gen_addi_ptr(t_p, cpu_env, pred_full_reg_offset(s, pg)); + t_desc = tcg_const_i32(desc); + + gen_helper_sve_last_active_element(ret, t_p, t_desc); + + tcg_temp_free_i32(t_desc); + tcg_temp_free_ptr(t_p); +} + +/* Increment LAST to the offset of the next element in the vector, + wrapping around to 0. */ +static void incr_last_active(DisasContext *s, TCGv_i32 last, int esz) +{ + unsigned vsz = vec_full_reg_size(s); + + tcg_gen_addi_i32(last, last, 1 << esz); + if (is_power_of_2(vsz)) { + tcg_gen_andi_i32(last, last, vsz - 1); + } else { + TCGv_i32 max = tcg_const_i32(vsz); + TCGv_i32 zero = tcg_const_i32(0); + tcg_gen_movcond_i32(TCG_COND_GEU, last, last, max, zero, last); + tcg_temp_free_i32(max); + tcg_temp_free_i32(zero); + } +} + +/* If LAST < 0, set LAST to the offset of the last element in the vector. 
*/ +static void wrap_last_active(DisasContext *s, TCGv_i32 last, int esz) +{ + unsigned vsz = vec_full_reg_size(s); + + if (is_power_of_2(vsz)) { + tcg_gen_andi_i32(last, last, vsz - 1); + } else { + TCGv_i32 max = tcg_const_i32(vsz - (1 << esz)); + TCGv_i32 zero = tcg_const_i32(0); + tcg_gen_movcond_i32(TCG_COND_LT, last, last, zero, max, last); + tcg_temp_free_i32(max); + tcg_temp_free_i32(zero); + } +} + +/* Load an unsigned element of ESZ from BASE+OFS. */ +static TCGv_i64 load_esz(TCGv_ptr base, int ofs, int esz) +{ + TCGv_i64 r = tcg_temp_new_i64(); + + switch (esz) { + case 0: + tcg_gen_ld8u_i64(r, base, ofs); + break; + case 1: + tcg_gen_ld16u_i64(r, base, ofs); + break; + case 2: + tcg_gen_ld32u_i64(r, base, ofs); + break; + case 3: + tcg_gen_ld_i64(r, base, ofs); + break; + default: + g_assert_not_reached(); + } + return r; +} + +/* Load an unsigned element of ESZ from RM[LAST]. */ +static TCGv_i64 load_last_active(DisasContext *s, TCGv_i32 last, + int rm, int esz) +{ + TCGv_ptr p = tcg_temp_new_ptr(); + TCGv_i64 r; + + /* Convert offset into vector into offset into ENV. + The final adjustment for the vector register base + is added via constant offset to the load. */ +#ifdef HOST_WORDS_BIGENDIAN + /* Adjust for element ordering. See vec_reg_offset. */ + if (esz < 3) { + tcg_gen_xori_i32(last, last, 8 - (1 << esz)); + } +#endif + tcg_gen_ext_i32_ptr(p, last); + tcg_gen_add_ptr(p, p, cpu_env); + + r = load_esz(p, vec_full_reg_offset(s, rm), esz); + tcg_temp_free_ptr(p); + + return r; +} + +/* Compute CLAST for a Zreg. */ +static void do_clast_vector(DisasContext *s, arg_rprr_esz *a, bool before) +{ + TCGv_i32 last = tcg_temp_local_new_i32(); + TCGLabel *over = gen_new_label(); + TCGv_i64 ele; + unsigned vsz, esz = a->esz; + + find_last_active(s, last, esz, a->pg); + + /* There is of course no movcond for a 2048-bit vector, + so we must branch over the actual store. 
*/ + tcg_gen_brcondi_i32(TCG_COND_LT, last, 0, over); + + if (!before) { + incr_last_active(s, last, esz); + } + + ele = load_last_active(s, last, a->rm, esz); + tcg_temp_free_i32(last); + + vsz = vec_full_reg_size(s); + tcg_gen_gvec_dup_i64(esz, vec_full_reg_offset(s, a->rd), vsz, vsz, ele); + tcg_temp_free_i64(ele); + + /* If this insn used MOVPRFX, we may need a second move. */ + if (a->rd != a->rn) { + TCGLabel *done = gen_new_label(); + tcg_gen_br(done); + + gen_set_label(over); + do_mov_z(s, a->rd, a->rn); + + gen_set_label(done); + } else { + gen_set_label(over); + } +} + +static void trans_CLASTA_z(DisasContext *s, arg_rprr_esz *a, uint32_t insn) +{ + do_clast_vector(s, a, false); +} + +static void trans_CLASTB_z(DisasContext *s, arg_rprr_esz *a, uint32_t insn) +{ + do_clast_vector(s, a, true); +} + +/* Compute CLAST for a scalar. */ +static void do_clast_scalar(DisasContext *s, int esz, int pg, int rm, + bool before, TCGv_i64 reg_val) +{ + TCGv_i32 last = tcg_temp_new_i32(); + TCGv_i64 ele, cmp, zero; + + find_last_active(s, last, esz, pg); + + /* Extend the original value of last prior to incrementing. */ + cmp = tcg_temp_new_i64(); + tcg_gen_ext_i32_i64(cmp, last); + + if (!before) { + incr_last_active(s, last, esz); + } + + /* The conceit here is that while last < 0 indicates not found, after + adjusting for cpu_env->vfp.zregs[rm], it is still a valid address + from which we can load garbage. We then discard the garbage with + a conditional move. */ + ele = load_last_active(s, last, rm, esz); + tcg_temp_free_i32(last); + + zero = tcg_const_i64(0); + tcg_gen_movcond_i64(TCG_COND_GE, reg_val, cmp, zero, ele, reg_val); + + tcg_temp_free_i64(zero); + tcg_temp_free_i64(cmp); + tcg_temp_free_i64(ele); +} + +/* Compute CLAST for a Vreg. 
*/ +static void do_clast_fp(DisasContext *s, arg_rpr_esz *a, bool before) +{ + int esz = a->esz; + int ofs = vec_reg_offset(s, a->rd, 0, esz); + TCGv_i64 reg = load_esz(cpu_env, ofs, esz); + + do_clast_scalar(s, esz, a->pg, a->rn, before, reg); + write_fp_dreg(s, a->rd, reg); + tcg_temp_free_i64(reg); +} + +static void trans_CLASTA_v(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_clast_fp(s, a, false); +} + +static void trans_CLASTB_v(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_clast_fp(s, a, true); +} + +/* Compute CLAST for a Xreg. */ +static void do_clast_general(DisasContext *s, arg_rpr_esz *a, bool before) +{ + TCGv_i64 reg = cpu_reg(s, a->rd); + + switch (a->esz) { + case 0: + tcg_gen_ext8u_i64(reg, reg); + break; + case 1: + tcg_gen_ext16u_i64(reg, reg); + break; + case 2: + tcg_gen_ext32u_i64(reg, reg); + break; + case 3: + break; + default: + g_assert_not_reached(); + } + + do_clast_scalar(s, a->esz, a->pg, a->rn, before, cpu_reg(s, a->rd)); +} + +static void trans_CLASTA_r(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_clast_general(s, a, false); +} + +static void trans_CLASTB_r(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_clast_general(s, a, true); +} + +/* Compute LAST for a scalar. */ +static TCGv_i64 do_last_scalar(DisasContext *s, int esz, + int pg, int rm, bool before) +{ + TCGv_i32 last = tcg_temp_new_i32(); + TCGv_i64 ret; + + find_last_active(s, last, esz, pg); + if (before) { + wrap_last_active(s, last, esz); + } else { + incr_last_active(s, last, esz); + } + + ret = load_last_active(s, last, rm, esz); + tcg_temp_free_i32(last); + return ret; +} + +/* Compute LAST for a Vreg. 
*/ +static void do_last_fp(DisasContext *s, arg_rpr_esz *a, bool before) +{ + TCGv_i64 val = do_last_scalar(s, a->esz, a->pg, a->rn, before); + write_fp_dreg(s, a->rd, val); + tcg_temp_free_i64(val); +} + +static void trans_LASTA_v(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_last_fp(s, a, false); +} + +static void trans_LASTB_v(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_last_fp(s, a, true); +} + +/* Compute LAST for a Xreg. */ +static void do_last_general(DisasContext *s, arg_rpr_esz *a, bool before) +{ + TCGv_i64 val = do_last_scalar(s, a->esz, a->pg, a->rn, before); + tcg_gen_mov_i64(cpu_reg(s, a->rd), val); + tcg_temp_free_i64(val); +} + +static void trans_LASTA_r(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_last_general(s, a, false); +} + +static void trans_LASTB_r(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_last_general(s, a, true); +} + /* *** SVE Memory - 32-bit Gather and Unsized Contiguous Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index a89bd37eeb..1370802c12 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -431,6 +431,26 @@ TRN2_z 00000101 .. 1 ..... 011 101 ..... ..... @rd_rn_rm # Note esz >= 2 COMPACT 00000101 .. 100001 100 ... ..... ..... @rd_pg_rn +# SVE conditionally broadcast element to vector +CLASTA_z 00000101 .. 10100 0 100 ... ..... ..... @rdn_pg_rm +CLASTB_z 00000101 .. 10100 1 100 ... ..... ..... @rdn_pg_rm + +# SVE conditionally copy element to SIMD&FP scalar +CLASTA_v 00000101 .. 10101 0 100 ... ..... ..... @rd_pg_rn +CLASTB_v 00000101 .. 10101 1 100 ... ..... ..... @rd_pg_rn + +# SVE conditionally copy element to general register +CLASTA_r 00000101 .. 11000 0 101 ... ..... ..... @rd_pg_rn +CLASTB_r 00000101 .. 11000 1 101 ... ..... ..... @rd_pg_rn + +# SVE copy element to SIMD&FP scalar register +LASTA_v 00000101 .. 10001 0 100 ... ..... ..... @rd_pg_rn +LASTB_v 00000101 .. 10001 1 100 ... ..... ..... 
@rd_pg_rn + +# SVE copy element to general register +LASTA_r 00000101 .. 10000 0 101 ... ..... ..... @rd_pg_rn +LASTB_r 00000101 .. 10000 1 101 ... ..... ..... @rd_pg_rn + ### SVE Predicate Logical Operations Group # SVE predicate logical operations From patchwork Sat Feb 17 18:22:48 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 128709
From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:22:48 -0800 Message-Id: <20180217182323.25885-33-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH v2 32/67] target/arm: Implement SVE copy to vector (predicated) Cc: qemu-arm@nongnu.org Signed-off-by: Richard Henderson --- target/arm/translate-sve.c | 13 +++++++++++++ target/arm/sve.decode | 6 ++++++ 2 files changed, 19 insertions(+) -- 2.14.3 Reviewed-by: Peter Maydell diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 207a22a0bc..fc2a295ab7 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -2422,6 +2422,19 @@ static void trans_LASTB_r(DisasContext *s, arg_rpr_esz *a, uint32_t insn) do_last_general(s, a, true); } +static void trans_CPY_m_r(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_cpy_m(s, a->esz, a->rd, a->rd, a->pg, cpu_reg_sp(s, a->rn)); +} + +static void trans_CPY_m_v(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + int ofs = vec_reg_offset(s, a->rn, 0, a->esz); + TCGv_i64 t = load_esz(cpu_env, ofs, a->esz); + do_cpy_m(s, a->esz, a->rd, a->rd, a->pg, t); + tcg_temp_free_i64(t); +} + /* *** SVE Memory - 32-bit Gather and Unsized Contiguous Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 1370802c12..5e127de88c 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -451,6 +451,12 @@ LASTB_v 00000101 .. 10001 1 100 ... ..... ..... @rd_pg_rn LASTA_r 00000101 .. 10000 0 101 ... ..... ..... @rd_pg_rn LASTB_r 00000101 .. 10000 1 101 ... ..... ..... @rd_pg_rn +# SVE copy element from SIMD&FP scalar register +CPY_m_v 00000101 .. 100000 100 ... ..... ..... @rd_pg_rn + +# SVE copy element from general register to vector (predicated) +CPY_m_r 00000101 .. 101000 101 ... ..... .....
@rd_pg_rn + ### SVE Predicate Logical Operations Group # SVE predicate logical operations From patchwork Sat Feb 17 18:22:49 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 128690
From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:22:49 -0800 Message-Id: <20180217182323.25885-34-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH v2 33/67] target/arm: Implement SVE reverse within elements Cc: qemu-arm@nongnu.org Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 14 ++++++++++++++ target/arm/sve_helper.c | 41 ++++++++++++++++++++++++++++++++++------- target/arm/translate-sve.c | 38 ++++++++++++++++++++++++++++++++++++++ target/arm/sve.decode | 7 +++++++ 4 files changed, 93 insertions(+), 7 deletions(-) -- 2.14.3 Reviewed-by: Peter Maydell diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index a58fb4ba01..3b7c54905d 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -465,6 +465,20 @@ DEF_HELPER_FLAGS_4(sve_compact_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_2(sve_last_active_element, TCG_CALL_NO_RWG, s32, ptr, i32) +DEF_HELPER_FLAGS_4(sve_revb_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_revb_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_revb_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve_revh_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_revh_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve_revw_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve_rbit_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_rbit_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_rbit_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_rbit_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index ee289be642..a67bb579b8 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -237,6 +237,26 @@ static inline uint64_t expand_pred_s(uint8_t byte) return word[byte & 0x11]; } +/* Swap 16-bit words within a 32-bit word. */ +static inline uint32_t hswap32(uint32_t h) +{ + return rol32(h, 16); +} + +/* Swap 16-bit words within a 64-bit word. */ +static inline uint64_t hswap64(uint64_t h) +{ + uint64_t m = 0x0000ffff0000ffffull; + h = rol64(h, 32); + return ((h & m) << 16) | ((h >> 16) & m); +} + +/* Swap 32-bit words within a 64-bit word. */ +static inline uint64_t wswap64(uint64_t h) +{ + return rol64(h, 32); +} + #define LOGICAL_PPPP(NAME, FUNC) \ void HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, uint32_t desc) \ { \ @@ -615,6 +635,20 @@ DO_ZPZ(sve_neg_h, uint16_t, H1_2, DO_NEG) DO_ZPZ(sve_neg_s, uint32_t, H1_4, DO_NEG) DO_ZPZ_D(sve_neg_d, uint64_t, DO_NEG) +DO_ZPZ(sve_revb_h, uint16_t, H1_2, bswap16) +DO_ZPZ(sve_revb_s, uint32_t, H1_4, bswap32) +DO_ZPZ_D(sve_revb_d, uint64_t, bswap64) + +DO_ZPZ(sve_revh_s, uint32_t, H1_4, hswap32) +DO_ZPZ_D(sve_revh_d, uint64_t, hswap64) + +DO_ZPZ_D(sve_revw_d, uint64_t, wswap64) + +DO_ZPZ(sve_rbit_b, uint8_t, H1, revbit8) +DO_ZPZ(sve_rbit_h, uint16_t, H1_2, revbit16) +DO_ZPZ(sve_rbit_s, uint32_t, H1_4, revbit32) +DO_ZPZ_D(sve_rbit_d, uint64_t, revbit64) + /* Three-operand expander, unpredicated, in which the third operand is "wide". 
*/ #define DO_ZZW(NAME, TYPE, TYPEW, H, OP) \ @@ -1577,13 +1611,6 @@ void HELPER(sve_rev_b)(void *vd, void *vn, uint32_t desc) } } -static inline uint64_t hswap64(uint64_t h) -{ - uint64_t m = 0x0000ffff0000ffffull; - h = rol64(h, 32); - return ((h & m) << 16) | ((h >> 16) & m); -} - void HELPER(sve_rev_h)(void *vd, void *vn, uint32_t desc) { intptr_t i, j, opr_sz = simd_oprsz(desc); diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index fc2a295ab7..5a1ed379ad 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -2435,6 +2435,44 @@ static void trans_CPY_m_v(DisasContext *s, arg_rpr_esz *a, uint32_t insn) tcg_temp_free_i64(t); } +static void trans_REVB(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + static gen_helper_gvec_3 * const fns[4] = { + NULL, + gen_helper_sve_revb_h, + gen_helper_sve_revb_s, + gen_helper_sve_revb_d, + }; + do_zpz_ool(s, a, fns[a->esz]); +} + +static void trans_REVH(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + static gen_helper_gvec_3 * const fns[4] = { + NULL, + NULL, + gen_helper_sve_revh_s, + gen_helper_sve_revh_d, + }; + do_zpz_ool(s, a, fns[a->esz]); +} + +static void trans_REVW(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_zpz_ool(s, a, a->esz == 3 ? gen_helper_sve_revw_d : NULL); +} + +static void trans_RBIT(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + static gen_helper_gvec_3 * const fns[4] = { + gen_helper_sve_rbit_b, + gen_helper_sve_rbit_h, + gen_helper_sve_rbit_s, + gen_helper_sve_rbit_d, + }; + do_zpz_ool(s, a, fns[a->esz]); +} + /* *** SVE Memory - 32-bit Gather and Unsized Contiguous Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 5e127de88c..8903fb6592 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -457,6 +457,13 @@ CPY_m_v 00000101 .. 100000 100 ... ..... ..... @rd_pg_rn # SVE copy element from general register to vector (predicated) CPY_m_r 00000101 .. 101000 101 ... ..... ..... 
@rd_pg_rn +# SVE reverse within elements +# Note esz >= operation size +REVB 00000101 .. 1001 00 100 ... ..... ..... @rd_pg_rn +REVH 00000101 .. 1001 01 100 ... ..... ..... @rd_pg_rn +REVW 00000101 .. 1001 10 100 ... ..... ..... @rd_pg_rn +RBIT 00000101 .. 1001 11 100 ... ..... ..... @rd_pg_rn + ### SVE Predicate Logical Operations Group # SVE predicate logical operations From patchwork Sat Feb 17 18:22:50 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 128710
From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:22:50 -0800 Message-Id: <20180217182323.25885-35-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH v2 34/67] target/arm: Implement SVE vector splice (predicated)
Cc: qemu-arm@nongnu.org

Signed-off-by: Richard Henderson
---
 target/arm/helper-sve.h    |  2 ++
 target/arm/sve_helper.c    | 37 +++++++++++++++++++++++++++++++++++++
 target/arm/translate-sve.c | 10 ++++++++++
 target/arm/sve.decode      |  3 +++
 4 files changed, 52 insertions(+)

--
2.14.3

Reviewed-by: Peter Maydell

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index 3b7c54905d..c3f8a2b502 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -479,6 +479,8 @@ DEF_HELPER_FLAGS_4(sve_rbit_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(sve_rbit_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(sve_rbit_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_5(sve_splice, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+
 DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index a67bb579b8..f524a1ddce 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -2088,3 +2088,40 @@ int32_t HELPER(sve_last_active_element)(void *vg, uint32_t pred_desc)
 
     return last_active_element(vg, DIV_ROUND_UP(oprsz, 8), esz);
 }
+
+void HELPER(sve_splice)(void *vd, void *vn, void *vm, void *vg, uint32_t desc)
+{
+    intptr_t opr_sz = simd_oprsz(desc) / 8;
+    int esz = simd_data(desc);
+    uint64_t pg, first_g, last_g, len, mask = pred_esz_masks[esz];
+    intptr_t i, first_i, last_i;
+    ARMVectorReg tmp;
+
+    first_i = last_i = 0;
+    first_g = last_g = 0;
+
+    /* Find the extent of the active elements within VG. */
+    for (i = QEMU_ALIGN_UP(opr_sz, 8) - 8; i >= 0; i -= 8) {
+        pg = *(uint64_t *)(vg + i) & mask;
+        if (pg) {
+            if (last_g == 0) {
+                last_g = pg;
+                last_i = i;
+            }
+            first_g = pg;
+            first_i = i;
+        }
+    }
+
+    len = 0;
+    if (first_g != 0) {
+        first_i = first_i * 8 + ctz64(first_g);
+        last_i = last_i * 8 + 63 - clz64(last_g);
+        len = last_i - first_i + (1 << esz);
+        if (vd == vm) {
+            vm = memcpy(&tmp, vm, opr_sz * 8);
+        }
+        swap_memmove(vd, vn + first_i, len);
+    }
+    swap_memmove(vd + len, vm, opr_sz * 8 - len);
+}
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index 5a1ed379ad..559fb41fd6 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -2473,6 +2473,16 @@ static void trans_RBIT(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
     do_zpz_ool(s, a, fns[a->esz]);
 }
 
+static void trans_SPLICE(DisasContext *s, arg_rprr_esz *a, uint32_t insn)
+{
+    unsigned vsz = vec_full_reg_size(s);
+    tcg_gen_gvec_4_ool(vec_full_reg_offset(s, a->rd),
+                       vec_full_reg_offset(s, a->rn),
+                       vec_full_reg_offset(s, a->rm),
+                       pred_full_reg_offset(s, a->pg),
+                       vsz, vsz, a->esz, gen_helper_sve_splice);
+}
+
 /*
  *** SVE Memory - 32-bit Gather and Unsized Contiguous Group
  */
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index 8903fb6592..70feb448e6 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -464,6 +464,9 @@ REVH 00000101 .. 1001 01 100 ... ..... ..... @rd_pg_rn
 REVW 00000101 .. 1001 10 100 ... ..... ..... @rd_pg_rn
 RBIT 00000101 .. 1001 11 100 ... ..... ..... @rd_pg_rn
 
+# SVE vector splice (predicated)
+SPLICE 00000101 .. 101 100 100 ... ..... ..... @rdn_pg_rm
+
 ### SVE Predicate Logical Operations Group
 
 # SVE predicate logical operations

From patchwork Sat Feb 17 18:22:51 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 128715
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Sat, 17 Feb 2018 10:22:51 -0800
Message-Id: <20180217182323.25885-36-richard.henderson@linaro.org>
In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org>
References: <20180217182323.25885-1-richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH v2 35/67] target/arm: Implement SVE Select Vectors Group
Cc: qemu-arm@nongnu.org

Signed-off-by: Richard Henderson
---
 target/arm/helper-sve.h    |  9 ++++++++
 target/arm/sve_helper.c    | 55 ++++++++++++++++++++++++++++++++++++++++++++++
 target/arm/translate-sve.c |  2 ++
 target/arm/sve.decode      |  6 +++++
 4 files changed, 72 insertions(+)

--
2.14.3

Reviewed-by: Peter Maydell

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index c3f8a2b502..0f57f64895 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -195,6 +195,15 @@ DEF_HELPER_FLAGS_5(sve_lsl_zpzz_s, TCG_CALL_NO_RWG,
 DEF_HELPER_FLAGS_5(sve_lsl_zpzz_d, TCG_CALL_NO_RWG,
                    void, ptr, ptr, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_5(sve_sel_zpzz_b, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_sel_zpzz_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_sel_zpzz_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_sel_zpzz_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+
 DEF_HELPER_FLAGS_5(sve_asr_zpzw_b, TCG_CALL_NO_RWG,
                    void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(sve_asr_zpzw_h, TCG_CALL_NO_RWG,
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index f524a1ddce..86cd792cdf 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -2125,3 +2125,58 @@ void HELPER(sve_splice)(void *vd, void *vn, void *vm, void *vg, uint32_t desc)
     }
     swap_memmove(vd + len, vm, opr_sz * 8 - len);
 }
+
+void HELPER(sve_sel_zpzz_b)(void *vd, void *vn, void *vm,
+                            void *vg, uint32_t desc)
+{
+    intptr_t i, opr_sz = simd_oprsz(desc) / 8;
+    uint64_t *d = vd, *n = vn, *m = vm;
+    uint8_t *pg = vg;
+
+    for (i = 0; i < opr_sz; i += 1) {
+        uint64_t nn = n[i], mm = m[i];
+        uint64_t pp = expand_pred_b(pg[H1(i)]);
+        d[i] = (nn & pp) | (mm & ~pp);
+    }
+}
+
+void HELPER(sve_sel_zpzz_h)(void *vd, void *vn, void *vm,
+                            void *vg, uint32_t desc)
+{
+    intptr_t i, opr_sz = simd_oprsz(desc) / 8;
+    uint64_t *d = vd, *n = vn, *m = vm;
+    uint8_t *pg = vg;
+
+    for (i = 0; i < opr_sz; i += 1) {
+        uint64_t nn = n[i], mm = m[i];
+        uint64_t pp = expand_pred_h(pg[H1(i)]);
+        d[i] = (nn & pp) | (mm & ~pp);
+    }
+}
+
+void HELPER(sve_sel_zpzz_s)(void *vd, void *vn, void *vm,
+                            void *vg, uint32_t desc)
+{
+    intptr_t i, opr_sz = simd_oprsz(desc) / 8;
+    uint64_t *d = vd, *n = vn, *m = vm;
+    uint8_t *pg = vg;
+
+    for (i = 0; i < opr_sz; i += 1) {
+        uint64_t nn = n[i], mm = m[i];
+        uint64_t pp = expand_pred_s(pg[H1(i)]);
+        d[i] = (nn & pp) | (mm & ~pp);
+    }
+}
+
+void HELPER(sve_sel_zpzz_d)(void *vd, void *vn, void *vm,
+                            void *vg, uint32_t desc)
+{
+    intptr_t i, opr_sz = simd_oprsz(desc) / 8;
+    uint64_t *d = vd, *n = vn, *m = vm;
+    uint8_t *pg = vg;
+
+    for (i = 0; i < opr_sz; i += 1) {
+        uint64_t nn = n[i], mm = m[i];
+        d[i] = (pg[H1(i)] & 1 ? nn : mm);
+    }
+}
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index 559fb41fd6..021b33ced9 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -361,6 +361,8 @@ static void trans_UDIV_zpzz(DisasContext *s, arg_rprr_esz *a, uint32_t insn)
     do_zpzz_ool(s, a, fns[a->esz]);
 }
 
+DO_ZPZZ(SEL, sel)
+
 #undef DO_ZPZZ
 
 /*
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index 70feb448e6..7ec84fdd80 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -99,6 +99,7 @@
                 &rprr_esz rn=%reg_movprfx
 @rdm_pg_rn ........ esz:2 ... ... ... pg:3 rn:5 rd:5 \
                 &rprr_esz rm=%reg_movprfx
+@rd_pg4_rn_rm ........ esz:2 . rm:5 .. pg:4 rn:5 rd:5 &rprr_esz
 
 # Three register operand, with governing predicate, vector element size
 @rda_pg_rn_rm ........ esz:2 . rm:5 ... pg:3 rn:5 rd:5 \
@@ -467,6 +468,11 @@ RBIT 00000101 .. 1001 11 100 ... ..... ..... @rd_pg_rn
 # SVE vector splice (predicated)
 SPLICE 00000101 .. 101 100 100 ... ..... ..... @rdn_pg_rm
 
+### SVE Select Vectors Group
+
+# SVE select vector elements (predicated)
+SEL_zpzz 00000101 .. 1 ..... 11 .... ..... ..... @rd_pg4_rn_rm
+
 ### SVE Predicate Logical Operations Group
 
 # SVE predicate logical operations

From patchwork Sat Feb 17 18:22:52 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 128692
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Sat, 17 Feb 2018 10:22:52 -0800
Message-Id: <20180217182323.25885-37-richard.henderson@linaro.org>
In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org>
References: <20180217182323.25885-1-richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH v2 36/67] target/arm: Implement SVE Integer Compare - Vectors Group
Cc: qemu-arm@nongnu.org

Signed-off-by: Richard Henderson
---
 target/arm/helper-sve.h    | 115 +++++++++++++++++++++++++++
 target/arm/sve_helper.c    | 193 ++++++++++++++++++++++++++++++++++++++++++++-
 target/arm/translate-sve.c |  87 ++++++++++++++++++++
 target/arm/sve.decode      |  24 ++++++
 4 files changed, 416 insertions(+), 3 deletions(-)

--
2.14.3

Reviewed-by: Peter Maydell

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index 0f57f64895..6ffd1fbe8e 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -490,6 +490,121 @@ DEF_HELPER_FLAGS_4(sve_rbit_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 
 DEF_HELPER_FLAGS_5(sve_splice, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_5(sve_cmpeq_ppzz_b, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmpne_ppzz_b, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmpge_ppzz_b, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmpgt_ppzz_b, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmphi_ppzz_b, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmphs_ppzz_b, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(sve_cmpeq_ppzz_h, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmpne_ppzz_h, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmpge_ppzz_h, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmpgt_ppzz_h, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmphi_ppzz_h, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmphs_ppzz_h, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(sve_cmpeq_ppzz_s, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmpne_ppzz_s, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmpge_ppzz_s, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmpgt_ppzz_s, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmphi_ppzz_s, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmphs_ppzz_s, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(sve_cmpeq_ppzz_d, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmpne_ppzz_d, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmpge_ppzz_d, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmpgt_ppzz_d, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmphi_ppzz_d, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmphs_ppzz_d, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(sve_cmpeq_ppzw_b, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmpne_ppzw_b, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmpge_ppzw_b, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmpgt_ppzw_b, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmphi_ppzw_b, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmphs_ppzw_b, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmple_ppzw_b, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmplt_ppzw_b, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmplo_ppzw_b, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmpls_ppzw_b, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(sve_cmpeq_ppzw_h, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmpne_ppzw_h, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmpge_ppzw_h, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmpgt_ppzw_h, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmphi_ppzw_h, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmphs_ppzw_h, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmple_ppzw_h, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmplt_ppzw_h, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmplo_ppzw_h, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmpls_ppzw_h, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(sve_cmpeq_ppzw_s, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmpne_ppzw_s, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmpge_ppzw_s, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmpgt_ppzw_s, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmphi_ppzw_s, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmphs_ppzw_s, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmple_ppzw_s, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmplt_ppzw_s, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmplo_ppzw_s, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmpls_ppzw_s, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+
 DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index 86cd792cdf..ae433861f8 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -46,14 +46,14 @@
  *
  * The return value has bit 31 set if N is set, bit 1 set if Z is clear,
  * and bit 0 set if C is set.
- *
- * This is an iterative function, called for each Pd and Pg word
- * moving forward.
  */
 
 /* For no G bits set, NZCV = C. */
 #define PREDTEST_INIT 1
 
+/* This is an iterative function, called for each Pd and Pg word
+ * moving forward.
+ */
 static uint32_t iter_predtest_fwd(uint64_t d, uint64_t g, uint32_t flags)
 {
     if (likely(g)) {
@@ -73,6 +73,28 @@ static uint32_t iter_predtest_fwd(uint64_t d, uint64_t g, uint32_t flags)
     return flags;
 }
 
+/* This is an iterative function, called for each Pd and Pg word
+ * moving backward.
+ */
+static uint32_t iter_predtest_bwd(uint64_t d, uint64_t g, uint32_t flags)
+{
+    if (likely(g)) {
+        /* Compute C from first (i.e last) !(D & G).
+           Use bit 2 to signal first G bit seen. */
+        if (!(flags & 4)) {
+            flags += 4 - 1; /* add bit 2, subtract C from PREDTEST_INIT */
+            flags |= (d & pow2floor(g)) == 0;
+        }
+
+        /* Accumulate Z from each D & G. */
+        flags |= ((d & g) != 0) << 1;
+
+        /* Compute N from last (i.e first) D & G. Replace previous. */
+        flags = deposit32(flags, 31, 1, (d & (g & -g)) != 0);
+    }
+    return flags;
+}
+
 /* The same for a single word predicate. */
 uint32_t HELPER(sve_predtest1)(uint64_t d, uint64_t g)
 {
@@ -2180,3 +2202,168 @@ void HELPER(sve_sel_zpzz_d)(void *vd, void *vn, void *vm,
         d[i] = (pg[H1(i)] & 1 ? nn : mm);
     }
 }
+
+/* Two operand comparison controlled by a predicate.
+ * ??? It is very tempting to want to be able to expand this inline
+ * with x86 instructions, e.g.
+ *
+ *    vcmpeqw    zm, zn, %ymm0
+ *    vpmovmskb  %ymm0, %eax
+ *    and        $0x5555, %eax
+ *    and        pg, %eax
+ *
+ * or even aarch64, e.g.
+ *
+ *    // mask = 4000 1000 0400 0100 0040 0010 0004 0001
+ *    cmeq       v0.8h, zn, zm
+ *    and        v0.8h, v0.8h, mask
+ *    addv       h0, v0.8h
+ *    and        v0.8b, pg
+ *
+ * However, coming up with an abstraction that allows vector inputs and
+ * a scalar output, and also handles the byte-ordering of sub-uint64_t
+ * scalar outputs, is tricky.
+ */
+#define DO_CMP_PPZZ(NAME, TYPE, OP, H, MASK) \
+uint32_t HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, uint32_t desc) \
+{ \
+    intptr_t opr_sz = simd_oprsz(desc); \
+    uint32_t flags = PREDTEST_INIT; \
+    intptr_t i = opr_sz; \
+    do { \
+        uint64_t out = 0, pg; \
+        do { \
+            i -= sizeof(TYPE), out <<= sizeof(TYPE); \
+            TYPE nn = *(TYPE *)(vn + H(i)); \
+            TYPE mm = *(TYPE *)(vm + H(i)); \
+            out |= nn OP mm; \
+        } while (i & 63); \
+        pg = *(uint64_t *)(vg + (i >> 3)) & MASK; \
+        out &= pg; \
+        *(uint64_t *)(vd + (i >> 3)) = out; \
+        flags = iter_predtest_bwd(out, pg, flags); \
+    } while (i > 0); \
+    return flags; \
+}
+
+#define DO_CMP_PPZZ_B(NAME, TYPE, OP) \
+    DO_CMP_PPZZ(NAME, TYPE, OP, H1, 0xffffffffffffffffull)
+#define DO_CMP_PPZZ_H(NAME, TYPE, OP) \
+    DO_CMP_PPZZ(NAME, TYPE, OP, H1_2, 0x5555555555555555ull)
+#define DO_CMP_PPZZ_S(NAME, TYPE, OP) \
+    DO_CMP_PPZZ(NAME, TYPE, OP, H1_4, 0x1111111111111111ull)
+#define DO_CMP_PPZZ_D(NAME, TYPE, OP) \
+    DO_CMP_PPZZ(NAME, TYPE, OP, , 0x0101010101010101ull)
+
+DO_CMP_PPZZ_B(sve_cmpeq_ppzz_b, uint8_t, ==)
+DO_CMP_PPZZ_H(sve_cmpeq_ppzz_h, uint16_t, ==)
+DO_CMP_PPZZ_S(sve_cmpeq_ppzz_s, uint32_t, ==)
+DO_CMP_PPZZ_D(sve_cmpeq_ppzz_d, uint64_t, ==)
+
+DO_CMP_PPZZ_B(sve_cmpne_ppzz_b, uint8_t, !=)
+DO_CMP_PPZZ_H(sve_cmpne_ppzz_h, uint16_t, !=)
+DO_CMP_PPZZ_S(sve_cmpne_ppzz_s, uint32_t, !=)
+DO_CMP_PPZZ_D(sve_cmpne_ppzz_d, uint64_t, !=)
+
+DO_CMP_PPZZ_B(sve_cmpgt_ppzz_b, int8_t, >)
+DO_CMP_PPZZ_H(sve_cmpgt_ppzz_h, int16_t, >)
+DO_CMP_PPZZ_S(sve_cmpgt_ppzz_s, int32_t, >)
+DO_CMP_PPZZ_D(sve_cmpgt_ppzz_d, int64_t, >)
+
+DO_CMP_PPZZ_B(sve_cmpge_ppzz_b, int8_t, >=)
+DO_CMP_PPZZ_H(sve_cmpge_ppzz_h, int16_t, >=)
+DO_CMP_PPZZ_S(sve_cmpge_ppzz_s, int32_t, >=)
+DO_CMP_PPZZ_D(sve_cmpge_ppzz_d, int64_t, >=)
+
+DO_CMP_PPZZ_B(sve_cmphi_ppzz_b, uint8_t, >)
+DO_CMP_PPZZ_H(sve_cmphi_ppzz_h, uint16_t, >)
+DO_CMP_PPZZ_S(sve_cmphi_ppzz_s, uint32_t, >)
+DO_CMP_PPZZ_D(sve_cmphi_ppzz_d, uint64_t, >)
+
+DO_CMP_PPZZ_B(sve_cmphs_ppzz_b, uint8_t, >=)
+DO_CMP_PPZZ_H(sve_cmphs_ppzz_h, uint16_t, >=)
+DO_CMP_PPZZ_S(sve_cmphs_ppzz_s, uint32_t, >=)
+DO_CMP_PPZZ_D(sve_cmphs_ppzz_d, uint64_t, >=)
+
+#undef DO_CMP_PPZZ_B
+#undef DO_CMP_PPZZ_H
+#undef DO_CMP_PPZZ_S
+#undef DO_CMP_PPZZ_D
+#undef DO_CMP_PPZZ
+
+/* Similar, but the second source is "wide". */
+#define DO_CMP_PPZW(NAME, TYPE, TYPEW, OP, H, MASK) \
+uint32_t HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, uint32_t desc) \
+{ \
+    intptr_t opr_sz = simd_oprsz(desc); \
+    uint32_t flags = PREDTEST_INIT; \
+    intptr_t i = opr_sz; \
+    do { \
+        uint64_t out = 0, pg; \
+        do { \
+            TYPEW mm = *(TYPEW *)(vm + i - 8); \
+            do { \
+                i -= sizeof(TYPE), out <<= sizeof(TYPE); \
+                TYPE nn = *(TYPE *)(vn + H(i)); \
+                out |= nn OP mm; \
+            } while (i & 7); \
+        } while (i & 63); \
+        pg = *(uint64_t *)(vg + (i >> 3)) & MASK; \
+        out &= pg; \
+        *(uint64_t *)(vd + (i >> 3)) = out; \
+        flags = iter_predtest_bwd(out, pg, flags); \
+    } while (i > 0); \
+    return flags; \
+}
+
+#define DO_CMP_PPZW_B(NAME, TYPE, TYPEW, OP) \
+    DO_CMP_PPZW(NAME, TYPE, TYPEW, OP, H1, 0xffffffffffffffffull)
+#define DO_CMP_PPZW_H(NAME, TYPE, TYPEW, OP) \
+    DO_CMP_PPZW(NAME, TYPE, TYPEW, OP, H1_2, 0x5555555555555555ull)
+#define DO_CMP_PPZW_S(NAME, TYPE, TYPEW, OP) \
+    DO_CMP_PPZW(NAME, TYPE, TYPEW, OP, H1_4, 0x1111111111111111ull)
+
+DO_CMP_PPZW_B(sve_cmpeq_ppzw_b, uint8_t, uint64_t, ==)
+DO_CMP_PPZW_H(sve_cmpeq_ppzw_h, uint16_t, uint64_t, ==)
+DO_CMP_PPZW_S(sve_cmpeq_ppzw_s, uint32_t, uint64_t, ==)
+
+DO_CMP_PPZW_B(sve_cmpne_ppzw_b, uint8_t, uint64_t, !=)
+DO_CMP_PPZW_H(sve_cmpne_ppzw_h, uint16_t, uint64_t, !=)
+DO_CMP_PPZW_S(sve_cmpne_ppzw_s, uint32_t, uint64_t, !=)
+
+DO_CMP_PPZW_B(sve_cmpgt_ppzw_b, int8_t, int64_t, >) +DO_CMP_PPZW_H(sve_cmpgt_ppzw_h, int16_t, int64_t, >) +DO_CMP_PPZW_S(sve_cmpgt_ppzw_s, int32_t, int64_t, >) + +DO_CMP_PPZW_B(sve_cmpge_ppzw_b, int8_t, int64_t, >=) +DO_CMP_PPZW_H(sve_cmpge_ppzw_h, int16_t, int64_t, >=) +DO_CMP_PPZW_S(sve_cmpge_ppzw_s, int32_t, int64_t, >=) + +DO_CMP_PPZW_B(sve_cmphi_ppzw_b, uint8_t, uint64_t, >) +DO_CMP_PPZW_H(sve_cmphi_ppzw_h, uint16_t, uint64_t, >) +DO_CMP_PPZW_S(sve_cmphi_ppzw_s, uint32_t, uint64_t, >) + +DO_CMP_PPZW_B(sve_cmphs_ppzw_b, uint8_t, uint64_t, >=) +DO_CMP_PPZW_H(sve_cmphs_ppzw_h, uint16_t, uint64_t, >=) +DO_CMP_PPZW_S(sve_cmphs_ppzw_s, uint32_t, uint64_t, >=) + +DO_CMP_PPZW_B(sve_cmplt_ppzw_b, int8_t, int64_t, <) +DO_CMP_PPZW_H(sve_cmplt_ppzw_h, int16_t, int64_t, <) +DO_CMP_PPZW_S(sve_cmplt_ppzw_s, int32_t, int64_t, <) + +DO_CMP_PPZW_B(sve_cmple_ppzw_b, int8_t, int64_t, <=) +DO_CMP_PPZW_H(sve_cmple_ppzw_h, int16_t, int64_t, <=) +DO_CMP_PPZW_S(sve_cmple_ppzw_s, int32_t, int64_t, <=) + +DO_CMP_PPZW_B(sve_cmplo_ppzw_b, uint8_t, uint64_t, <) +DO_CMP_PPZW_H(sve_cmplo_ppzw_h, uint16_t, uint64_t, <) +DO_CMP_PPZW_S(sve_cmplo_ppzw_s, uint32_t, uint64_t, <) + +DO_CMP_PPZW_B(sve_cmpls_ppzw_b, uint8_t, uint64_t, <=) +DO_CMP_PPZW_H(sve_cmpls_ppzw_h, uint16_t, uint64_t, <=) +DO_CMP_PPZW_S(sve_cmpls_ppzw_s, uint32_t, uint64_t, <=) + +#undef DO_CMP_PPZW_B +#undef DO_CMP_PPZW_H +#undef DO_CMP_PPZW_S +#undef DO_CMP_PPZW diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 021b33ced9..cb54777108 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -39,6 +39,9 @@ typedef void GVecGen2iFn(unsigned, uint32_t, uint32_t, typedef void GVecGen3Fn(unsigned, uint32_t, uint32_t, uint32_t, uint32_t, uint32_t); +typedef void gen_helper_gvec_flags_4(TCGv_i32, TCGv_ptr, TCGv_ptr, + TCGv_ptr, TCGv_ptr, TCGv_i32); + /* * Helpers for extracting complex instruction fields. 
*/ @@ -2485,6 +2488,90 @@ static void trans_SPLICE(DisasContext *s, arg_rprr_esz *a, uint32_t insn) vsz, vsz, a->esz, gen_helper_sve_splice); } +/* + *** SVE Integer Compare - Vectors Group + */ + +static void do_ppzz_flags(DisasContext *s, arg_rprr_esz *a, + gen_helper_gvec_flags_4 *gen_fn) +{ + TCGv_ptr pd, zn, zm, pg; + unsigned vsz; + TCGv_i32 t; + + if (gen_fn == NULL) { + unallocated_encoding(s); + return; + } + + vsz = vec_full_reg_size(s); + t = tcg_const_i32(simd_desc(vsz, vsz, 0)); + pd = tcg_temp_new_ptr(); + zn = tcg_temp_new_ptr(); + zm = tcg_temp_new_ptr(); + pg = tcg_temp_new_ptr(); + + tcg_gen_addi_ptr(pd, cpu_env, pred_full_reg_offset(s, a->rd)); + tcg_gen_addi_ptr(zn, cpu_env, vec_full_reg_offset(s, a->rn)); + tcg_gen_addi_ptr(zm, cpu_env, vec_full_reg_offset(s, a->rm)); + tcg_gen_addi_ptr(pg, cpu_env, pred_full_reg_offset(s, a->pg)); + + gen_fn(t, pd, zn, zm, pg, t); + + tcg_temp_free_ptr(pd); + tcg_temp_free_ptr(zn); + tcg_temp_free_ptr(zm); + tcg_temp_free_ptr(pg); + + do_pred_flags(t); + + tcg_temp_free_i32(t); +} + +#define DO_PPZZ(NAME, name) \ +static void trans_##NAME##_ppzz(DisasContext *s, arg_rprr_esz *a, \ + uint32_t insn) \ +{ \ + static gen_helper_gvec_flags_4 * const fns[4] = { \ + gen_helper_sve_##name##_ppzz_b, gen_helper_sve_##name##_ppzz_h, \ + gen_helper_sve_##name##_ppzz_s, gen_helper_sve_##name##_ppzz_d, \ + }; \ + do_ppzz_flags(s, a, fns[a->esz]); \ +} + +DO_PPZZ(CMPEQ, cmpeq) +DO_PPZZ(CMPNE, cmpne) +DO_PPZZ(CMPGT, cmpgt) +DO_PPZZ(CMPGE, cmpge) +DO_PPZZ(CMPHI, cmphi) +DO_PPZZ(CMPHS, cmphs) + +#undef DO_PPZZ + +#define DO_PPZW(NAME, name) \ +static void trans_##NAME##_ppzw(DisasContext *s, arg_rprr_esz *a, \ + uint32_t insn) \ +{ \ + static gen_helper_gvec_flags_4 * const fns[4] = { \ + gen_helper_sve_##name##_ppzw_b, gen_helper_sve_##name##_ppzw_h, \ + gen_helper_sve_##name##_ppzw_s, NULL \ + }; \ + do_ppzz_flags(s, a, fns[a->esz]); \ +} + +DO_PPZW(CMPEQ, cmpeq) +DO_PPZW(CMPNE, cmpne) +DO_PPZW(CMPGT, cmpgt) +DO_PPZW(CMPGE, 
cmpge) +DO_PPZW(CMPHI, cmphi) +DO_PPZW(CMPHS, cmphs) +DO_PPZW(CMPLT, cmplt) +DO_PPZW(CMPLE, cmple) +DO_PPZW(CMPLO, cmplo) +DO_PPZW(CMPLS, cmpls) + +#undef DO_PPZW + /* *** SVE Memory - 32-bit Gather and Unsized Contiguous Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 7ec84fdd80..deedc9163b 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -100,6 +100,7 @@ @rdm_pg_rn ........ esz:2 ... ... ... pg:3 rn:5 rd:5 \ &rprr_esz rm=%reg_movprfx @rd_pg4_rn_rm ........ esz:2 . rm:5 .. pg:4 rn:5 rd:5 &rprr_esz +@pd_pg_rn_rm ........ esz:2 . rm:5 ... pg:3 rn:5 . rd:4 &rprr_esz # Three register operand, with governing predicate, vector element size @rda_pg_rn_rm ........ esz:2 . rm:5 ... pg:3 rn:5 rd:5 \ @@ -473,6 +474,29 @@ SPLICE 00000101 .. 101 100 100 ... ..... ..... @rdn_pg_rm # SVE select vector elements (predicated) SEL_zpzz 00000101 .. 1 ..... 11 .... ..... ..... @rd_pg4_rn_rm +### SVE Integer Compare - Vectors Group + +# SVE integer compare_vectors +CMPHS_ppzz 00100100 .. 0 ..... 000 ... ..... 0 .... @pd_pg_rn_rm +CMPHI_ppzz 00100100 .. 0 ..... 000 ... ..... 1 .... @pd_pg_rn_rm +CMPGE_ppzz 00100100 .. 0 ..... 100 ... ..... 0 .... @pd_pg_rn_rm +CMPGT_ppzz 00100100 .. 0 ..... 100 ... ..... 1 .... @pd_pg_rn_rm +CMPEQ_ppzz 00100100 .. 0 ..... 101 ... ..... 0 .... @pd_pg_rn_rm +CMPNE_ppzz 00100100 .. 0 ..... 101 ... ..... 1 .... @pd_pg_rn_rm + +# SVE integer compare with wide elements +# Note these require esz != 3. +CMPEQ_ppzw 00100100 .. 0 ..... 001 ... ..... 0 .... @pd_pg_rn_rm +CMPNE_ppzw 00100100 .. 0 ..... 001 ... ..... 1 .... @pd_pg_rn_rm +CMPGE_ppzw 00100100 .. 0 ..... 010 ... ..... 0 .... @pd_pg_rn_rm +CMPGT_ppzw 00100100 .. 0 ..... 010 ... ..... 1 .... @pd_pg_rn_rm +CMPLT_ppzw 00100100 .. 0 ..... 011 ... ..... 0 .... @pd_pg_rn_rm +CMPLE_ppzw 00100100 .. 0 ..... 011 ... ..... 1 .... @pd_pg_rn_rm +CMPHS_ppzw 00100100 .. 0 ..... 110 ... ..... 0 .... @pd_pg_rn_rm +CMPHI_ppzw 00100100 .. 0 ..... 110 ... ..... 1 .... 
@pd_pg_rn_rm +CMPLO_ppzw 00100100 .. 0 ..... 111 ... ..... 0 .... @pd_pg_rn_rm +CMPLS_ppzw 00100100 .. 0 ..... 111 ... ..... 1 .... @pd_pg_rn_rm + ### SVE Predicate Logical Operations Group # SVE predicate logical operations From patchwork Sat Feb 17 18:22:53 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 128717 Delivered-To: patch@linaro.org Received: by 10.46.124.24 with SMTP id x24csp1842826ljc; Sat, 17 Feb 2018 11:07:00 -0800 (PST) X-Google-Smtp-Source: AH8x227pZExFeByCuWYd/kKyTtzld5DUkO5YYxK+4v6VWF/1BAoMw57/5PCkG9wMA+1uwp36FQNR X-Received: by 10.129.37.14 with SMTP id l14mr7658642ywl.412.1518894420671; Sat, 17 Feb 2018 11:07:00 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1518894420; cv=none; d=google.com; s=arc-20160816; b=Vnqr8YNTkp7zGvSFGepF02D/+vmjtoqiD7B6ZBhWir92TB1FRuqgQLEJkQRLkXf0uQ 5Jes4qQHylKs8MNY9W488pZ0SjKs/W2IfPpT9TwvTdpEn8Kjk4iL1avywqfUYr941bfB aDRHwF+viX1GW1gs3qLDQkO13MhxJUWF7dOtaTkbb4Yo7fkICpHGU/MagACTLF6ucQqd k37sSyxDqDfChqVdl4Q4TruFwUdeCqeRkfTUtO1n2vsTXQ1K5TvxKHZdVKGTZyyiqRL4 PZFAMTpJ5acNvSkKzEEoFHHy1rANgET2ecdYU6XMxr0gtfcU0VKTjJSjhb/5DAwl4oHt R6cQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature:arc-authentication-results; bh=vgUeQVcNdAqSYVjf7X4B2XDy4dpJwgizl072wHry8oE=; b=tmy3djFoOjGBvevEFTgqYKTDuuOmFeBIvdk1G+gQHC9Wh57q7Z0jUYyn02/xn/zyJl Q89yy5XhHWvwHEiIEYT2g9U3NgGg0+LKsLQFrSGrhUKkRLpwtHf/sgREVSS3TMbJZCFi dYj+/qBJLJkORqS7TMi400R2TMjxyFX9kuuTdmTGiOsU4beo9r9NrwLWwZMnYPcW7RR1 SUl/3B51Z18MFONxJ0Po34AkjZZYWdUBDdlnfUMAF3/DEhal8ZKcdZUxR2vH1aNOnevh mkqaTwfIVL8ye9EoKQuY5t9gTk4Ewf6akJIQirfyKrOEZ1+9cAtmXz8DyWgR1+eIIQH0 4d7g== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google 
From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:22:53 -0800 Message-Id: <20180217182323.25885-38-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org>
X-Received-From: 2607:f8b0:400e:c01::242 Subject: [Qemu-devel] [PATCH v2 37/67] target/arm: Implement SVE Integer Compare - Immediate Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 44 +++++++++++++++++++++++ target/arm/sve_helper.c | 88 ++++++++++++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 63 +++++++++++++++++++++++++++++++++ target/arm/sve.decode | 23 ++++++++++++ 4 files changed, 218 insertions(+) -- 2.14.3 Reviewed-by: Peter Maydell diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 6ffd1fbe8e..ae38c0a4be 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -605,6 +605,50 @@ DEF_HELPER_FLAGS_5(sve_cmplo_ppzw_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(sve_cmpls_ppzw_s, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmpeq_ppzi_b, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmpne_ppzi_b, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmpgt_ppzi_b, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmpge_ppzi_b, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmplt_ppzi_b, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmple_ppzi_b, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmphs_ppzi_b, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmphi_ppzi_b, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmplo_ppzi_b, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmpls_ppzi_b, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve_cmpeq_ppzi_h, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmpne_ppzi_h, 
TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmpgt_ppzi_h, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmpge_ppzi_h, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmplt_ppzi_h, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmple_ppzi_h, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmphs_ppzi_h, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmphi_ppzi_h, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmplo_ppzi_h, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmpls_ppzi_h, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve_cmpeq_ppzi_s, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmpne_ppzi_s, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmpgt_ppzi_s, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmpge_ppzi_s, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmplt_ppzi_s, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmple_ppzi_s, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmphs_ppzi_s, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmphi_ppzi_s, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmplo_ppzi_s, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmpls_ppzi_s, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve_cmpeq_ppzi_d, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmpne_ppzi_d, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmpgt_ppzi_d, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmpge_ppzi_d, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmplt_ppzi_d, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmple_ppzi_d, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) 
+DEF_HELPER_FLAGS_4(sve_cmphs_ppzi_d, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmphi_ppzi_d, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmplo_ppzi_d, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmpls_ppzi_d, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index ae433861f8..b74db681f2 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -2367,3 +2367,91 @@ DO_CMP_PPZW_S(sve_cmpls_ppzw_s, uint32_t, uint64_t, <=) #undef DO_CMP_PPZW_H #undef DO_CMP_PPZW_S #undef DO_CMP_PPZW + +/* Similar, but the second source is immediate. */ +#define DO_CMP_PPZI(NAME, TYPE, OP, H, MASK) \ +uint32_t HELPER(NAME)(void *vd, void *vn, void *vg, uint32_t desc) \ +{ \ + intptr_t opr_sz = simd_oprsz(desc); \ + uint32_t flags = PREDTEST_INIT; \ + TYPE mm = simd_data(desc); \ + intptr_t i = opr_sz; \ + do { \ + uint64_t out = 0, pg; \ + do { \ + i -= sizeof(TYPE), out <<= sizeof(TYPE); \ + TYPE nn = *(TYPE *)(vn + H(i)); \ + out |= nn OP mm; \ + } while (i & 63); \ + pg = *(uint64_t *)(vg + (i >> 3)) & MASK; \ + out &= pg; \ + *(uint64_t *)(vd + (i >> 3)) = out; \ + flags = iter_predtest_bwd(out, pg, flags); \ + } while (i > 0); \ + return flags; \ +} + +#define DO_CMP_PPZI_B(NAME, TYPE, OP) \ + DO_CMP_PPZI(NAME, TYPE, OP, H1, 0xffffffffffffffffull) +#define DO_CMP_PPZI_H(NAME, TYPE, OP) \ + DO_CMP_PPZI(NAME, TYPE, OP, H1_2, 0x5555555555555555ull) +#define DO_CMP_PPZI_S(NAME, TYPE, OP) \ + DO_CMP_PPZI(NAME, TYPE, OP, H1_4, 0x1111111111111111ull) +#define DO_CMP_PPZI_D(NAME, TYPE, OP) \ + DO_CMP_PPZI(NAME, TYPE, OP, , 0x0101010101010101ull) + +DO_CMP_PPZI_B(sve_cmpeq_ppzi_b, uint8_t, ==) 
+DO_CMP_PPZI_H(sve_cmpeq_ppzi_h, uint16_t, ==) +DO_CMP_PPZI_S(sve_cmpeq_ppzi_s, uint32_t, ==) +DO_CMP_PPZI_D(sve_cmpeq_ppzi_d, uint64_t, ==) + +DO_CMP_PPZI_B(sve_cmpne_ppzi_b, uint8_t, !=) +DO_CMP_PPZI_H(sve_cmpne_ppzi_h, uint16_t, !=) +DO_CMP_PPZI_S(sve_cmpne_ppzi_s, uint32_t, !=) +DO_CMP_PPZI_D(sve_cmpne_ppzi_d, uint64_t, !=) + +DO_CMP_PPZI_B(sve_cmpgt_ppzi_b, int8_t, >) +DO_CMP_PPZI_H(sve_cmpgt_ppzi_h, int16_t, >) +DO_CMP_PPZI_S(sve_cmpgt_ppzi_s, int32_t, >) +DO_CMP_PPZI_D(sve_cmpgt_ppzi_d, int64_t, >) + +DO_CMP_PPZI_B(sve_cmpge_ppzi_b, int8_t, >=) +DO_CMP_PPZI_H(sve_cmpge_ppzi_h, int16_t, >=) +DO_CMP_PPZI_S(sve_cmpge_ppzi_s, int32_t, >=) +DO_CMP_PPZI_D(sve_cmpge_ppzi_d, int64_t, >=) + +DO_CMP_PPZI_B(sve_cmphi_ppzi_b, uint8_t, >) +DO_CMP_PPZI_H(sve_cmphi_ppzi_h, uint16_t, >) +DO_CMP_PPZI_S(sve_cmphi_ppzi_s, uint32_t, >) +DO_CMP_PPZI_D(sve_cmphi_ppzi_d, uint64_t, >) + +DO_CMP_PPZI_B(sve_cmphs_ppzi_b, uint8_t, >=) +DO_CMP_PPZI_H(sve_cmphs_ppzi_h, uint16_t, >=) +DO_CMP_PPZI_S(sve_cmphs_ppzi_s, uint32_t, >=) +DO_CMP_PPZI_D(sve_cmphs_ppzi_d, uint64_t, >=) + +DO_CMP_PPZI_B(sve_cmplt_ppzi_b, int8_t, <) +DO_CMP_PPZI_H(sve_cmplt_ppzi_h, int16_t, <) +DO_CMP_PPZI_S(sve_cmplt_ppzi_s, int32_t, <) +DO_CMP_PPZI_D(sve_cmplt_ppzi_d, int64_t, <) + +DO_CMP_PPZI_B(sve_cmple_ppzi_b, int8_t, <=) +DO_CMP_PPZI_H(sve_cmple_ppzi_h, int16_t, <=) +DO_CMP_PPZI_S(sve_cmple_ppzi_s, int32_t, <=) +DO_CMP_PPZI_D(sve_cmple_ppzi_d, int64_t, <=) + +DO_CMP_PPZI_B(sve_cmplo_ppzi_b, uint8_t, <) +DO_CMP_PPZI_H(sve_cmplo_ppzi_h, uint16_t, <) +DO_CMP_PPZI_S(sve_cmplo_ppzi_s, uint32_t, <) +DO_CMP_PPZI_D(sve_cmplo_ppzi_d, uint64_t, <) + +DO_CMP_PPZI_B(sve_cmpls_ppzi_b, uint8_t, <=) +DO_CMP_PPZI_H(sve_cmpls_ppzi_h, uint16_t, <=) +DO_CMP_PPZI_S(sve_cmpls_ppzi_s, uint32_t, <=) +DO_CMP_PPZI_D(sve_cmpls_ppzi_d, uint64_t, <=) + +#undef DO_CMP_PPZI_B +#undef DO_CMP_PPZI_H +#undef DO_CMP_PPZI_S +#undef DO_CMP_PPZI_D +#undef DO_CMP_PPZI diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 
cb54777108..a7eeb122e3 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -39,6 +39,8 @@ typedef void GVecGen2iFn(unsigned, uint32_t, uint32_t, typedef void GVecGen3Fn(unsigned, uint32_t, uint32_t, uint32_t, uint32_t, uint32_t); +typedef void gen_helper_gvec_flags_3(TCGv_i32, TCGv_ptr, TCGv_ptr, + TCGv_ptr, TCGv_i32); typedef void gen_helper_gvec_flags_4(TCGv_i32, TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i32); @@ -2572,6 +2574,67 @@ DO_PPZW(CMPLS, cmpls) #undef DO_PPZW +/* + *** SVE Integer Compare - Immediate Groups + */ + +static void do_ppzi_flags(DisasContext *s, arg_rpri_esz *a, + gen_helper_gvec_flags_3 *gen_fn) +{ + TCGv_ptr pd, zn, pg; + unsigned vsz; + TCGv_i32 t; + + if (gen_fn == NULL) { + unallocated_encoding(s); + return; + } + + vsz = vec_full_reg_size(s); + t = tcg_const_i32(simd_desc(vsz, vsz, a->imm)); + pd = tcg_temp_new_ptr(); + zn = tcg_temp_new_ptr(); + pg = tcg_temp_new_ptr(); + + tcg_gen_addi_ptr(pd, cpu_env, pred_full_reg_offset(s, a->rd)); + tcg_gen_addi_ptr(zn, cpu_env, vec_full_reg_offset(s, a->rn)); + tcg_gen_addi_ptr(pg, cpu_env, pred_full_reg_offset(s, a->pg)); + + gen_fn(t, pd, zn, pg, t); + + tcg_temp_free_ptr(pd); + tcg_temp_free_ptr(zn); + tcg_temp_free_ptr(pg); + + do_pred_flags(t); + + tcg_temp_free_i32(t); +} + +#define DO_PPZI(NAME, name) \ +static void trans_##NAME##_ppzi(DisasContext *s, arg_rpri_esz *a, \ + uint32_t insn) \ +{ \ + static gen_helper_gvec_flags_3 * const fns[4] = { \ + gen_helper_sve_##name##_ppzi_b, gen_helper_sve_##name##_ppzi_h, \ + gen_helper_sve_##name##_ppzi_s, gen_helper_sve_##name##_ppzi_d, \ + }; \ + do_ppzi_flags(s, a, fns[a->esz]); \ +} + +DO_PPZI(CMPEQ, cmpeq) +DO_PPZI(CMPNE, cmpne) +DO_PPZI(CMPGT, cmpgt) +DO_PPZI(CMPGE, cmpge) +DO_PPZI(CMPHI, cmphi) +DO_PPZI(CMPHS, cmphs) +DO_PPZI(CMPLT, cmplt) +DO_PPZI(CMPLE, cmple) +DO_PPZI(CMPLO, cmplo) +DO_PPZI(CMPLS, cmpls) + +#undef DO_PPZI + /* *** SVE Memory - 32-bit Gather and Unsized Contiguous Group */ diff --git 
a/target/arm/sve.decode b/target/arm/sve.decode index deedc9163b..0e317d7d48 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -132,6 +132,11 @@ @rdn_dbm ........ .. .... dbm:13 rd:5 \ &rr_dbm rn=%reg_movprfx +# Predicate output, vector and immediate input, +# controlling predicate, element size. +@pd_pg_rn_i7 ........ esz:2 . imm:7 . pg:3 rn:5 . rd:4 &rpri_esz +@pd_pg_rn_i5 ........ esz:2 . imm:s5 ... pg:3 rn:5 . rd:4 &rpri_esz + # Basic Load/Store with 9-bit immediate offset @pd_rn_i9 ........ ........ ...... rn:5 . rd:4 \ &rri imm=%imm9_16_10 @@ -497,6 +502,24 @@ CMPHI_ppzw 00100100 .. 0 ..... 110 ... ..... 1 .... @pd_pg_rn_rm CMPLO_ppzw 00100100 .. 0 ..... 111 ... ..... 0 .... @pd_pg_rn_rm CMPLS_ppzw 00100100 .. 0 ..... 111 ... ..... 1 .... @pd_pg_rn_rm +### SVE Integer Compare - Unsigned Immediate Group + +# SVE integer compare with unsigned immediate +CMPHS_ppzi 00100100 .. 1 ....... 0 ... ..... 0 .... @pd_pg_rn_i7 +CMPHI_ppzi 00100100 .. 1 ....... 0 ... ..... 1 .... @pd_pg_rn_i7 +CMPLO_ppzi 00100100 .. 1 ....... 1 ... ..... 0 .... @pd_pg_rn_i7 +CMPLS_ppzi 00100100 .. 1 ....... 1 ... ..... 1 .... @pd_pg_rn_i7 + +### SVE Integer Compare - Signed Immediate Group + +# SVE integer compare with signed immediate +CMPGE_ppzi 00100101 .. 0 ..... 000 ... ..... 0 .... @pd_pg_rn_i5 +CMPGT_ppzi 00100101 .. 0 ..... 000 ... ..... 1 .... @pd_pg_rn_i5 +CMPLT_ppzi 00100101 .. 0 ..... 001 ... ..... 0 .... @pd_pg_rn_i5 +CMPLE_ppzi 00100101 .. 0 ..... 001 ... ..... 1 .... @pd_pg_rn_i5 +CMPEQ_ppzi 00100101 .. 0 ..... 100 ... ..... 0 .... @pd_pg_rn_i5 +CMPNE_ppzi 00100101 .. 0 ..... 100 ... ..... 1 .... 
@pd_pg_rn_i5 + ### SVE Predicate Logical Operations Group # SVE predicate logical operations From patchwork Sat Feb 17 18:22:54 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 128724 Delivered-To: patch@linaro.org Received: by 10.46.124.24 with SMTP id x24csp1845369ljc; Sat, 17 Feb 2018 11:11:05 -0800 (PST) X-Google-Smtp-Source: AH8x2243t9WRNm6h1m/vDyy9ntbrl/7fOe5kZxaWaTMxfLwIrWQm+W1aE2ppBGdXY4BDL6I8+vfT X-Received: by 10.13.208.197 with SMTP id s188mr7393556ywd.217.1518894665034; Sat, 17 Feb 2018 11:11:05 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1518894665; cv=none; d=google.com; s=arc-20160816; b=K3DzFTDL99duDcbRmT4QFzwM8xE0QA28bjMOD3qmn0TcRTihAf//zc0SwxF7J7KXOc MOOHhR8p5SU1I/kReBoqBGuz40u+/VmquKtmggCwsXOGUI4Gn3vfk/0RLwgaA6G9Sbp1 2qJpRR/EHEz/dbixjcO18yRulStCYAtPn1yHfq6iG/Mhvl4AQXnWty1Pnxsq0VcZztBP 3ThMnnf+HffHn4w/yZs/x2UMSAACS0Y7ktzyAHCI4YYFQGDbgze9eD0xqbZpuJAWP4eQ tqgmAhX5neomEcKDYnjE9cGG8QSo1aBFkKO5Ev/UL80PDqGC5uRiarX6YdKSg7XS0sVp K7JA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature:arc-authentication-results; bh=bCA93J2gh2zptPDp0XGn339+9IiVcOPVbDJIhtHGNOw=; b=sQBENoFMKW5t5tWmcGbIZ2kv9xguVCQzYt+zrd0OMbSQPTmBzRBggds1ddLIfX24qM eIlyfwdrv53q9hi+LBAqDYUOMF3NJXBwQB4oV2osWE82J63+kfdTN4HVZA1Hc63l/PxT Q43fKCtzXrN9X5GbCDPbTh/CjCKXlBMRPaanYJ9+r2dW95eFfgDBB34dA4kymKt4Da0O bmC4HvdLBDmi7NHEWRv4Aqrmj07WRQaajpBzO5PkLAlAy5p5I6TKS6kJmivehI2bRTl+ lr+FGaBrt+5vfKK5sbnbcSegLUpYXVHk/FoEFrmE0lGwbqyh/i2loY/e8+M7rBhOeIlc afHQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=BfuUYEHP; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted 
From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:22:54 -0800 Message-Id: <20180217182323.25885-39-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org>
X-Received-From: 2607:f8b0:400e:c01::243 Subject: [Qemu-devel] [PATCH v2 38/67] target/arm: Implement SVE Partition Break Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 18 ++++ target/arm/sve_helper.c | 247 +++++++++++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 96 ++++++++++++++++++ target/arm/sve.decode | 19 ++++ 4 files changed, 380 insertions(+) -- 2.14.3 Reviewed-by: Peter Maydell diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index ae38c0a4be..f0a3ed3414 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -658,3 +658,21 @@ DEF_HELPER_FLAGS_5(sve_orn_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_nor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_nand_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_brkpa, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_brkpb, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_brkpas, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_brkpbs, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve_brka_z, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_brkb_z, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_brka_m, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_brkb_m, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve_brkas_z, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_brkbs_z, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_brkas_m, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_brkbs_m, 
TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve_brkn, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_brkns, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index b74db681f2..d6d2220f8b 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -2455,3 +2455,250 @@ DO_CMP_PPZI_D(sve_cmpls_ppzi_d, uint64_t, <=) #undef DO_CMP_PPZI_S #undef DO_CMP_PPZI_D #undef DO_CMP_PPZI + +/* Similar to the ARM LastActive pseudocode function. */ +static bool last_active_pred(void *vd, void *vg, intptr_t oprsz) +{ + intptr_t i; + + for (i = QEMU_ALIGN_UP(oprsz, 8) - 8; i >= 0; i -= 8) { + uint64_t pg = *(uint64_t *)(vg + i); + if (pg) { + return (pow2floor(pg) & *(uint64_t *)(vd + i)) != 0; + } + } + return 0; +} + +/* Compute a mask into RETB that is true for all G, up to and including + * (if after) or excluding (if !after) the first G & N. + * Return true if BRK found. + */ +static bool compute_brk(uint64_t *retb, uint64_t n, uint64_t g, + bool brk, bool after) +{ + uint64_t b; + + if (brk) { + b = 0; + } else if ((g & n) == 0) { + /* For all G, no N are set; break not found. */ + b = g; + } else { + /* Break somewhere in N. Locate it. */ + b = g & n; /* guard true, pred true*/ + b = b & -b; /* first such */ + if (after) { + b = b | (b - 1); /* break after same */ + } else { + b = b - 1; /* break before same */ + } + brk = true; + } + + *retb = b; + return brk; +} + +/* Compute a zeroing BRK. */ +static void compute_brk_z(uint64_t *d, uint64_t *n, uint64_t *g, + intptr_t oprsz, bool after) +{ + bool brk = false; + intptr_t i; + + for (i = 0; i < DIV_ROUND_UP(oprsz, 8); ++i) { + uint64_t this_b, this_g = g[i]; + + brk = compute_brk(&this_b, n[i], this_g, brk, after); + d[i] = this_b & this_g; + } +} + +/* Likewise, but also compute flags. 
*/ +static uint32_t compute_brks_z(uint64_t *d, uint64_t *n, uint64_t *g, + intptr_t oprsz, bool after) +{ + uint32_t flags = PREDTEST_INIT; + bool brk = false; + intptr_t i; + + for (i = 0; i < DIV_ROUND_UP(oprsz, 8); ++i) { + uint64_t this_b, this_d, this_g = g[i]; + + brk = compute_brk(&this_b, n[i], this_g, brk, after); + d[i] = this_d = this_b & this_g; + flags = iter_predtest_fwd(this_d, this_g, flags); + } + return flags; +} + +/* Compute a merging BRK. */ +static void compute_brk_m(uint64_t *d, uint64_t *n, uint64_t *g, + intptr_t oprsz, bool after) +{ + bool brk = false; + intptr_t i; + + for (i = 0; i < DIV_ROUND_UP(oprsz, 8); ++i) { + uint64_t this_b, this_g = g[i]; + + brk = compute_brk(&this_b, n[i], this_g, brk, after); + d[i] = (this_b & this_g) | (d[i] & ~this_g); + } +} + +/* Likewise, but also compute flags. */ +static uint32_t compute_brks_m(uint64_t *d, uint64_t *n, uint64_t *g, + intptr_t oprsz, bool after) +{ + uint32_t flags = PREDTEST_INIT; + bool brk = false; + intptr_t i; + + for (i = 0; i < DIV_ROUND_UP(oprsz, 8); ++i) { + uint64_t this_b, this_d = d[i], this_g = g[i]; + + brk = compute_brk(&this_b, n[i], this_g, brk, after); + d[i] = this_d = (this_b & this_g) | (this_d & ~this_g); + flags = iter_predtest_fwd(this_d, this_g, flags); + } + return flags; +} + +static uint32_t do_zero(ARMPredicateReg *d, intptr_t oprsz) +{ + /* It is quicker to zero the whole predicate than loop on OPRSZ. + The compiler should turn this into 4 64-bit integer stores.
*/ + memset(d, 0, sizeof(ARMPredicateReg)); + return PREDTEST_INIT; +} + +void HELPER(sve_brkpa)(void *vd, void *vn, void *vm, void *vg, + uint32_t pred_desc) +{ + intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + if (last_active_pred(vn, vg, oprsz)) { + compute_brk_z(vd, vm, vg, oprsz, true); + } else { + do_zero(vd, oprsz); + } +} + +uint32_t HELPER(sve_brkpas)(void *vd, void *vn, void *vm, void *vg, + uint32_t pred_desc) +{ + intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + if (last_active_pred(vn, vg, oprsz)) { + return compute_brks_z(vd, vm, vg, oprsz, true); + } else { + return do_zero(vd, oprsz); + } +} + +void HELPER(sve_brkpb)(void *vd, void *vn, void *vm, void *vg, + uint32_t pred_desc) +{ + intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + if (last_active_pred(vn, vg, oprsz)) { + compute_brk_z(vd, vm, vg, oprsz, false); + } else { + do_zero(vd, oprsz); + } +} + +uint32_t HELPER(sve_brkpbs)(void *vd, void *vn, void *vm, void *vg, + uint32_t pred_desc) +{ + intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + if (last_active_pred(vn, vg, oprsz)) { + return compute_brks_z(vd, vm, vg, oprsz, false); + } else { + return do_zero(vd, oprsz); + } +} + +void HELPER(sve_brka_z)(void *vd, void *vn, void *vg, uint32_t pred_desc) +{ + intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + compute_brk_z(vd, vn, vg, oprsz, true); +} + +uint32_t HELPER(sve_brkas_z)(void *vd, void *vn, void *vg, uint32_t pred_desc) +{ + intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + return compute_brks_z(vd, vn, vg, oprsz, true); +} + +void HELPER(sve_brkb_z)(void *vd, void *vn, void *vg, uint32_t pred_desc) +{ + intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + compute_brk_z(vd, vn, vg, oprsz, false); +} + +uint32_t HELPER(sve_brkbs_z)(void *vd, void *vn, void *vg, uint32_t pred_desc) +{ + intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + return compute_brks_z(vd, vn, vg, 
oprsz, false); +} + +void HELPER(sve_brka_m)(void *vd, void *vn, void *vg, uint32_t pred_desc) +{ + intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + compute_brk_m(vd, vn, vg, oprsz, true); +} + +uint32_t HELPER(sve_brkas_m)(void *vd, void *vn, void *vg, uint32_t pred_desc) +{ + intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + return compute_brks_m(vd, vn, vg, oprsz, true); +} + +void HELPER(sve_brkb_m)(void *vd, void *vn, void *vg, uint32_t pred_desc) +{ + intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + compute_brk_m(vd, vn, vg, oprsz, false); +} + +uint32_t HELPER(sve_brkbs_m)(void *vd, void *vn, void *vg, uint32_t pred_desc) +{ + intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + return compute_brks_m(vd, vn, vg, oprsz, false); +} + +void HELPER(sve_brkn)(void *vd, void *vn, void *vg, uint32_t pred_desc) +{ + intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + + if (!last_active_pred(vn, vg, oprsz)) { + do_zero(vd, oprsz); + } +} + +/* As if PredTest(Ones(PL), D, esz). 
*/ +static uint32_t predtest_ones(ARMPredicateReg *d, intptr_t oprsz, + uint64_t esz_mask) +{ + uint32_t flags = PREDTEST_INIT; + intptr_t i; + + for (i = 0; i < oprsz / 8; i++) { + flags = iter_predtest_fwd(d->p[i], esz_mask, flags); + } + if (oprsz & 7) { + uint64_t mask = ~(-1ULL << (8 * (oprsz & 7))); + flags = iter_predtest_fwd(d->p[i], esz_mask & mask, flags); + } + return flags; +} + +uint32_t HELPER(sve_brkns)(void *vd, void *vn, void *vg, uint32_t pred_desc) +{ + intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + + if (last_active_pred(vn, vg, oprsz)) { + return predtest_ones(vd, oprsz, -1); + } else { + return do_zero(vd, oprsz); + } +} diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index a7eeb122e3..dc95d68867 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -2635,6 +2635,102 @@ DO_PPZI(CMPLS, cmpls) #undef DO_PPZI +/* + *** SVE Partition Break Group + */ + +static void do_brk3(DisasContext *s, arg_rprr_s *a, + gen_helper_gvec_4 *fn, gen_helper_gvec_flags_4 *fn_s) +{ + unsigned vsz = pred_full_reg_size(s); + + /* Predicate sizes may be smaller and cannot use simd_desc. 
*/ + TCGv_ptr d = tcg_temp_new_ptr(); + TCGv_ptr n = tcg_temp_new_ptr(); + TCGv_ptr m = tcg_temp_new_ptr(); + TCGv_ptr g = tcg_temp_new_ptr(); + TCGv_i32 t = tcg_const_i32(vsz - 2); + + tcg_gen_addi_ptr(d, cpu_env, pred_full_reg_offset(s, a->rd)); + tcg_gen_addi_ptr(n, cpu_env, pred_full_reg_offset(s, a->rn)); + tcg_gen_addi_ptr(m, cpu_env, pred_full_reg_offset(s, a->rm)); + tcg_gen_addi_ptr(g, cpu_env, pred_full_reg_offset(s, a->pg)); + + if (a->s) { + fn_s(t, d, n, m, g, t); + do_pred_flags(t); + } else { + fn(d, n, m, g, t); + } + tcg_temp_free_ptr(d); + tcg_temp_free_ptr(n); + tcg_temp_free_ptr(m); + tcg_temp_free_ptr(g); + tcg_temp_free_i32(t); +} + +static void do_brk2(DisasContext *s, arg_rpr_s *a, + gen_helper_gvec_3 *fn, gen_helper_gvec_flags_3 *fn_s) +{ + unsigned vsz = pred_full_reg_size(s); + + /* Predicate sizes may be smaller and cannot use simd_desc. */ + TCGv_ptr d = tcg_temp_new_ptr(); + TCGv_ptr n = tcg_temp_new_ptr(); + TCGv_ptr g = tcg_temp_new_ptr(); + TCGv_i32 t = tcg_const_i32(vsz - 2); + + tcg_gen_addi_ptr(d, cpu_env, pred_full_reg_offset(s, a->rd)); + tcg_gen_addi_ptr(n, cpu_env, pred_full_reg_offset(s, a->rn)); + tcg_gen_addi_ptr(g, cpu_env, pred_full_reg_offset(s, a->pg)); + + if (a->s) { + fn_s(t, d, n, g, t); + do_pred_flags(t); + } else { + fn(d, n, g, t); + } + tcg_temp_free_ptr(d); + tcg_temp_free_ptr(n); + tcg_temp_free_ptr(g); + tcg_temp_free_i32(t); +} + +void trans_BRKPA(DisasContext *s, arg_rprr_s *a, uint32_t insn) +{ + do_brk3(s, a, gen_helper_sve_brkpa, gen_helper_sve_brkpas); +} + +void trans_BRKPB(DisasContext *s, arg_rprr_s *a, uint32_t insn) +{ + do_brk3(s, a, gen_helper_sve_brkpb, gen_helper_sve_brkpbs); +} + +void trans_BRKA_m(DisasContext *s, arg_rpr_s *a, uint32_t insn) +{ + do_brk2(s, a, gen_helper_sve_brka_m, gen_helper_sve_brkas_m); +} + +void trans_BRKB_m(DisasContext *s, arg_rpr_s *a, uint32_t insn) +{ + do_brk2(s, a, gen_helper_sve_brkb_m, gen_helper_sve_brkbs_m); +} + +void trans_BRKA_z(DisasContext *s, 
arg_rpr_s *a, uint32_t insn) +{ + do_brk2(s, a, gen_helper_sve_brka_z, gen_helper_sve_brkas_z); +} + +void trans_BRKB_z(DisasContext *s, arg_rpr_s *a, uint32_t insn) +{ + do_brk2(s, a, gen_helper_sve_brkb_z, gen_helper_sve_brkbs_z); +} + +void trans_BRKN(DisasContext *s, arg_rpr_s *a, uint32_t insn) +{ + do_brk2(s, a, gen_helper_sve_brkn, gen_helper_sve_brkns); +} + /* *** SVE Memory - 32-bit Gather and Unsized Contiguous Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 0e317d7d48..1c19129e55 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -60,6 +60,7 @@ &rri_esz rd rn imm esz &rrr_esz rd rn rm esz &rpr_esz rd pg rn esz +&rpr_s rd pg rn s &rprr_s rd pg rn rm s &rprr_esz rd pg rn rm esz &rprrr_esz rd pg rn rm ra esz @@ -79,6 +80,9 @@ @pd_pn ........ esz:2 .. .... ....... rn:4 . rd:4 &rr_esz @rd_rn ........ esz:2 ...... ...... rn:5 rd:5 &rr_esz +# Two operand with governing predicate, flags setting +@pd_pg_pn_s ........ . s:1 ...... .. pg:4 . rn:4 . rd:4 &rpr_s + # Three operand with unused vector element size @rd_rn_rm_e0 ........ ... rm:5 ... ... rn:5 rd:5 &rrr_esz esz=0 @@ -568,6 +572,21 @@ PFIRST 00100101 01 011 000 11000 00 .... 0 .... @pd_pn_e0 # SVE predicate next active PNEXT 00100101 .. 011 001 11000 10 .... 0 .... @pd_pn +### SVE Partition Break Group + +# SVE propagate break from previous partition +BRKPA 00100101 0. 00 .... 11 .... 0 .... 0 .... @pd_pg_pn_pm_s +BRKPB 00100101 0. 00 .... 11 .... 0 .... 1 .... @pd_pg_pn_pm_s + +# SVE partition break condition +BRKA_z 00100101 0. 01000001 .... 0 .... 0 .... @pd_pg_pn_s +BRKB_z 00100101 1. 01000001 .... 0 .... 0 .... @pd_pg_pn_s +BRKA_m 00100101 0. 01000001 .... 0 .... 1 .... @pd_pg_pn_s +BRKB_m 00100101 1. 01000001 .... 0 .... 1 .... @pd_pg_pn_s + +# SVE propagate break to next partition +BRKN 00100101 0. 01100001 .... 0 .... 0 .... 
@pd_pg_pn_s + ### SVE Memory - 32-bit Gather and Unsized Contiguous Group # SVE load predicate register
From patchwork Sat Feb 17 18:22:55 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 128695 From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:22:55 -0800 Message-Id: <20180217182323.25885-40-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org>
X-Received-From: 2607:f8b0:400e:c01::241 Subject: [Qemu-devel] [PATCH v2 39/67] target/arm: Implement SVE Predicate Count Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 2 + target/arm/sve_helper.c | 14 ++++++ target/arm/translate-sve.c | 116 +++++++++++++++++++++++++++++++++++++++++++++ target/arm/sve.decode | 27 +++++++++++ 4 files changed, 159 insertions(+) -- 2.14.3 Reviewed-by: Peter Maydell diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index f0a3ed3414..dd4f8f754d 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -676,3 +676,5 @@ DEF_HELPER_FLAGS_4(sve_brkbs_m, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_brkn, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_brkns, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(sve_cntp, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index d6d2220f8b..dd884bdd1c 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -2702,3 +2702,17 @@ uint32_t HELPER(sve_brkns)(void *vd, void *vn, void *vg, uint32_t pred_desc) return do_zero(vd, oprsz); } } + +uint64_t HELPER(sve_cntp)(void *vn, void *vg, uint32_t pred_desc) +{ + intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + intptr_t esz = extract32(pred_desc, SIMD_DATA_SHIFT, 2); + uint64_t *n = vn, *g = vg, sum = 0, mask = pred_esz_masks[esz]; + intptr_t i; + + for (i = 0; i < DIV_ROUND_UP(oprsz, 8); ++i) { + uint64_t t = n[i] & g[i] & mask; + sum += ctpop64(t); + } + return sum; +} diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index dc95d68867..038800cc86 100644 --- a/target/arm/translate-sve.c +++ 
b/target/arm/translate-sve.c @@ -36,6 +36,8 @@ typedef void GVecGen2Fn(unsigned, uint32_t, uint32_t, uint32_t, uint32_t); typedef void GVecGen2iFn(unsigned, uint32_t, uint32_t, int64_t, uint32_t, uint32_t); +typedef void GVecGen2sFn(unsigned, uint32_t, uint32_t, + TCGv_i64, uint32_t, uint32_t); typedef void GVecGen3Fn(unsigned, uint32_t, uint32_t, uint32_t, uint32_t, uint32_t); @@ -2731,6 +2733,120 @@ void trans_BRKN(DisasContext *s, arg_rpr_s *a, uint32_t insn) do_brk2(s, a, gen_helper_sve_brkn, gen_helper_sve_brkns); } +/* + *** SVE Predicate Count Group + */ + +static void do_cntp(DisasContext *s, TCGv_i64 val, int esz, int pn, int pg) +{ + unsigned psz = pred_full_reg_size(s); + + if (psz <= 8) { + uint64_t psz_mask; + + tcg_gen_ld_i64(val, cpu_env, pred_full_reg_offset(s, pn)); + if (pn != pg) { + TCGv_i64 g = tcg_temp_new_i64(); + tcg_gen_ld_i64(g, cpu_env, pred_full_reg_offset(s, pg)); + tcg_gen_and_i64(val, val, g); + tcg_temp_free_i64(g); + } + + /* Reduce the pred_esz_masks value simply to reduce the + size of the code generated here. 
*/ + psz_mask = deposit64(0, 0, psz * 8, -1); + tcg_gen_andi_i64(val, val, pred_esz_masks[esz] & psz_mask); + + tcg_gen_ctpop_i64(val, val); + } else { + TCGv_ptr t_pn = tcg_temp_new_ptr(); + TCGv_ptr t_pg = tcg_temp_new_ptr(); + unsigned desc; + TCGv_i32 t_desc; + + desc = psz - 2; + desc = deposit32(desc, SIMD_DATA_SHIFT, 2, esz); + + tcg_gen_addi_ptr(t_pn, cpu_env, pred_full_reg_offset(s, pn)); + tcg_gen_addi_ptr(t_pg, cpu_env, pred_full_reg_offset(s, pg)); + t_desc = tcg_const_i32(desc); + + gen_helper_sve_cntp(val, t_pn, t_pg, t_desc); + tcg_temp_free_ptr(t_pn); + tcg_temp_free_ptr(t_pg); + tcg_temp_free_i32(t_desc); + } +} + +static void trans_CNTP(DisasContext *s, arg_CNTP *a, uint32_t insn) +{ + do_cntp(s, cpu_reg(s, a->rd), a->esz, a->rn, a->pg); +} + +static void trans_INCDECP_r(DisasContext *s, arg_incdec_pred *a, + uint32_t insn) +{ + TCGv_i64 reg = cpu_reg(s, a->rd); + TCGv_i64 val = tcg_temp_new_i64(); + + do_cntp(s, val, a->esz, a->pg, a->pg); + if (a->d) { + tcg_gen_sub_i64(reg, reg, val); + } else { + tcg_gen_add_i64(reg, reg, val); + } + tcg_temp_free_i64(val); +} + +static void trans_INCDECP_z(DisasContext *s, arg_incdec2_pred *a, + uint32_t insn) +{ + unsigned vsz = vec_full_reg_size(s); + TCGv_i64 val = tcg_temp_new_i64(); + GVecGen2sFn *gvec_fn = a->d ? 
tcg_gen_gvec_subs : tcg_gen_gvec_adds; + + if (a->esz == 0) { + unallocated_encoding(s); + return; + } + do_cntp(s, val, a->esz, a->pg, a->pg); + gvec_fn(a->esz, vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), val, vsz, vsz); +} + +static void trans_SINCDECP_r_32(DisasContext *s, arg_incdec_pred *a, + uint32_t insn) +{ + TCGv_i64 reg = cpu_reg(s, a->rd); + TCGv_i64 val = tcg_temp_new_i64(); + + do_cntp(s, val, a->esz, a->pg, a->pg); + do_sat_addsub_32(reg, val, a->u, a->d); +} + +static void trans_SINCDECP_r_64(DisasContext *s, arg_incdec_pred *a, + uint32_t insn) +{ + TCGv_i64 reg = cpu_reg(s, a->rd); + TCGv_i64 val = tcg_temp_new_i64(); + + do_cntp(s, val, a->esz, a->pg, a->pg); + do_sat_addsub_64(reg, val, a->u, a->d); +} + +static void trans_SINCDECP_z(DisasContext *s, arg_incdec2_pred *a, + uint32_t insn) +{ + TCGv_i64 val = tcg_temp_new_i64(); + + if (a->esz == 0) { + unallocated_encoding(s); + return; + } + do_cntp(s, val, a->esz, a->pg, a->pg); + do_sat_addsub_vec(s, a->esz, a->rd, a->rn, val, a->u, a->d); +} + /* *** SVE Memory - 32-bit Gather and Unsized Contiguous Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 1c19129e55..76c084d43e 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -68,6 +68,8 @@ &ptrue rd esz pat s &incdec_cnt rd pat esz imm d u &incdec2_cnt rd rn pat esz imm d u +&incdec_pred rd pg esz d u +&incdec2_pred rd rn pg esz d u ########################################################################### # Named instruction formats. These are generally used to @@ -114,6 +116,7 @@ # One register operand, with governing predicate, vector element size @rd_pg_rn ........ esz:2 ... ... ... pg:3 rn:5 rd:5 &rpr_esz +@rd_pg4_pn ........ esz:2 ... ... .. pg:4 . rn:4 rd:5 &rpr_esz # Two register operands with a 6-bit signed immediate. @rd_rn_i6 ........ ... rn:5 ..... imm:s6 rd:5 &rri @@ -154,6 +157,12 @@ @incdec2_cnt ........ esz:2 .. .... ...... 
pat:5 rd:5 \ &incdec2_cnt imm=%imm4_16_p1 rn=%reg_movprfx +# One register, predicate. +# User must fill in U and D. +@incdec_pred ........ esz:2 .... .. ..... .. pg:4 rd:5 &incdec_pred +@incdec2_pred ........ esz:2 .... .. ..... .. pg:4 rd:5 \ + &incdec2_pred rn=%reg_movprfx + ########################################################################### # Instruction patterns. Grouped according to the SVE encodingindex.xhtml. @@ -587,6 +596,24 @@ BRKB_m 00100101 1. 01000001 .... 0 .... 1 .... @pd_pg_pn_s # SVE propagate break to next partition BRKN 00100101 0. 01100001 .... 0 .... 0 .... @pd_pg_pn_s +### SVE Predicate Count Group + +# SVE predicate count +CNTP 00100101 .. 100 000 10 .... 0 .... ..... @rd_pg4_pn + +# SVE inc/dec register by predicate count +INCDECP_r 00100101 .. 10110 d:1 10001 00 .... ..... @incdec_pred u=1 + +# SVE inc/dec vector by predicate count +INCDECP_z 00100101 .. 10110 d:1 10000 00 .... ..... @incdec2_pred u=1 + +# SVE saturating inc/dec register by predicate count +SINCDECP_r_32 00100101 .. 1010 d:1 u:1 10001 00 .... ..... @incdec_pred +SINCDECP_r_64 00100101 .. 1010 d:1 u:1 10001 10 .... ..... @incdec_pred + +# SVE saturating inc/dec vector by predicate count +SINCDECP_z 00100101 .. 1010 d:1 u:1 10000 00 .... ..... 
@incdec2_pred + ### SVE Memory - 32-bit Gather and Unsized Contiguous Group # SVE load predicate register
From patchwork Sat Feb 17 18:22:56 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 128721 From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:22:56 -0800 Message-Id: <20180217182323.25885-41-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org>
X-Received-From: 2607:f8b0:400e:c01::243 Subject: [Qemu-devel] [PATCH v2 40/67] target/arm: Implement SVE Integer Compare - Scalars Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 2 + target/arm/sve_helper.c | 31 ++++++++++++++++ target/arm/translate-sve.c | 92 ++++++++++++++++++++++++++++++++++++++++++++++ target/arm/sve.decode | 8 ++++ 4 files changed, 133 insertions(+) -- 2.14.3 diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index dd4f8f754d..1863106d0f 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -678,3 +678,5 @@ DEF_HELPER_FLAGS_4(sve_brkn, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_brkns, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve_cntp, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(sve_while, TCG_CALL_NO_RWG, i32, ptr, i32, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index dd884bdd1c..80b78da834 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -2716,3 +2716,34 @@ uint64_t HELPER(sve_cntp)(void *vn, void *vg, uint32_t pred_desc) } return sum; } + +uint32_t HELPER(sve_while)(void *vd, uint32_t count, uint32_t pred_desc) +{ + uintptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + intptr_t esz = extract32(pred_desc, SIMD_DATA_SHIFT, 2); + uint64_t esz_mask = pred_esz_masks[esz]; + ARMPredicateReg *d = vd; + uint32_t flags; + intptr_t i; + + /* Begin with a zero predicate register. */ + flags = do_zero(d, oprsz); + if (count == 0) { + return flags; + } + + /* Scale from predicate element count to bits. */ + count <<= esz; + /* Bound to the bits in the predicate. 
*/
+    count = MIN(count, oprsz * 8);
+
+    /* Set all of the requested bits. */
+    for (i = 0; i < count / 64; ++i) {
+        d->p[i] = esz_mask;
+    }
+    if (count & 63) {
+        d->p[i] = ~(-1ull << (count & 63)) & esz_mask;
+    }
+
+    return predtest_ones(d, oprsz, esz_mask);
+}
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index 038800cc86..4b92a55c21 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -2847,6 +2847,98 @@ static void trans_SINCDECP_z(DisasContext *s, arg_incdec2_pred *a,
     do_sat_addsub_vec(s, a->esz, a->rd, a->rn, val, a->u, a->d);
 }
 
+/*
+ *** SVE Integer Compare Scalars Group
+ */
+
+static void trans_CTERM(DisasContext *s, arg_CTERM *a, uint32_t insn)
+{
+    TCGCond cond = (a->ne ? TCG_COND_NE : TCG_COND_EQ);
+    TCGv_i64 rn = read_cpu_reg(s, a->rn, a->sf);
+    TCGv_i64 rm = read_cpu_reg(s, a->rm, a->sf);
+    TCGv_i64 cmp = tcg_temp_new_i64();
+
+    tcg_gen_setcond_i64(cond, cmp, rn, rm);
+    tcg_gen_extrl_i64_i32(cpu_NF, cmp);
+    tcg_temp_free_i64(cmp);
+
+    /* VF = !NF & !CF. */
+    tcg_gen_xori_i32(cpu_VF, cpu_NF, 1);
+    tcg_gen_andc_i32(cpu_VF, cpu_VF, cpu_CF);
+
+    /* Both NF and VF actually look at bit 31. */
+    tcg_gen_neg_i32(cpu_NF, cpu_NF);
+    tcg_gen_neg_i32(cpu_VF, cpu_VF);
+}
+
+static void trans_WHILE(DisasContext *s, arg_WHILE *a, uint32_t insn)
+{
+    TCGv_i64 op0 = read_cpu_reg(s, a->rn, 1);
+    TCGv_i64 op1 = read_cpu_reg(s, a->rm, 1);
+    TCGv_i64 t0 = tcg_temp_new_i64();
+    TCGv_i64 t1 = tcg_temp_new_i64();
+    TCGv_i32 t2, t3;
+    TCGv_ptr ptr;
+    unsigned desc, vsz = vec_full_reg_size(s);
+    TCGCond cond;
+
+    if (!a->sf) {
+        if (a->u) {
+            tcg_gen_ext32u_i64(op0, op0);
+            tcg_gen_ext32u_i64(op1, op1);
+        } else {
+            tcg_gen_ext32s_i64(op0, op0);
+            tcg_gen_ext32s_i64(op1, op1);
+        }
+    }
+
+    /* For the helper, compress the different conditions into a computation
+     * of how many iterations for which the condition is true.
+     *
+     * This is slightly complicated by 0 <= UINT64_MAX, which is nominally
+     * 2**64 iterations, overflowing to 0. Of course, predicate registers
+     * aren't that large, so any value >= predicate size is sufficient.
+     */
+    tcg_gen_sub_i64(t0, op1, op0);
+
+    /* t0 = MIN(op1 - op0, vsz). */
+    if (a->eq) {
+        /* Equality means one more iteration. */
+        tcg_gen_movi_i64(t1, vsz - 1);
+        tcg_gen_movcond_i64(TCG_COND_LTU, t0, t0, t1, t0, t1);
+        tcg_gen_addi_i64(t0, t0, 1);
+    } else {
+        tcg_gen_movi_i64(t1, vsz);
+        tcg_gen_movcond_i64(TCG_COND_LTU, t0, t0, t1, t0, t1);
+    }
+
+    /* t0 = (condition true ? t0 : 0). */
+    cond = (a->u
+            ? (a->eq ? TCG_COND_LEU : TCG_COND_LTU)
+            : (a->eq ? TCG_COND_LE : TCG_COND_LT));
+    tcg_gen_movi_i64(t1, 0);
+    tcg_gen_movcond_i64(cond, t0, op0, op1, t0, t1);
+
+    t2 = tcg_temp_new_i32();
+    tcg_gen_extrl_i64_i32(t2, t0);
+    tcg_temp_free_i64(t0);
+    tcg_temp_free_i64(t1);
+
+    desc = (vsz / 8) - 2;
+    desc = deposit32(desc, SIMD_DATA_SHIFT, 2, a->esz);
+    t3 = tcg_const_i32(desc);
+
+    ptr = tcg_temp_new_ptr();
+    tcg_gen_addi_ptr(ptr, cpu_env, pred_full_reg_offset(s, a->rd));
+
+    gen_helper_sve_while(t2, ptr, t2, t3);
+    do_pred_flags(t2);
+
+    tcg_temp_free_ptr(ptr);
+    tcg_temp_free_i32(t2);
+    tcg_temp_free_i32(t3);
+}
+
 /*
  *** SVE Memory - 32-bit Gather and Unsized Contiguous Group
  */
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index 76c084d43e..b5bc7e9546 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -614,6 +614,14 @@ SINCDECP_r_64 00100101 .. 1010 d:1 u:1 10001 10 .... ..... @incdec_pred
 # SVE saturating inc/dec vector by predicate count
 SINCDECP_z 00100101 .. 1010 d:1 u:1 10000 00 .... .....
@incdec2_pred
 
+### SVE Integer Compare - Scalars Group
+
+# SVE conditionally terminate scalars
+CTERM 00100101 1 sf:1 1 rm:5 001000 rn:5 ne:1 0000
+
+# SVE integer compare scalar count and limit
+WHILE 00100101 esz:2 1 rm:5 000 sf:1 u:1 1 rn:5 eq:1 rd:4
+
 ### SVE Memory - 32-bit Gather and Unsized Contiguous Group
 
 # SVE load predicate register

From patchwork Sat Feb 17 18:22:57 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 128706
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Sat, 17 Feb 2018 10:22:57 -0800
Message-Id: <20180217182323.25885-42-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.14.3
In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org>
References: <20180217182323.25885-1-richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH v2 41/67] target/arm: Implement FDUP/DUP
Cc: qemu-arm@nongnu.org

Signed-off-by: Richard Henderson
---
 target/arm/translate-sve.c | 35 +++++++++++++++++++++++++++++++++++
 target/arm/sve.decode      |  8 ++++++++
 2 files changed, 43 insertions(+)

-- 
2.14.3
Reviewed-by: Peter Maydell

diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index 4b92a55c21..7571d02237 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -2939,6 +2939,41 @@ static void trans_WHILE(DisasContext *s, arg_WHILE *a, uint32_t insn)
     tcg_temp_free_i32(t3);
 }
 
+/*
+ *** SVE Integer Wide Immediate - Unpredicated Group
+ */
+
+static void trans_FDUP(DisasContext *s, arg_FDUP *a, uint32_t insn)
+{
+    unsigned vsz = vec_full_reg_size(s);
+    int dofs = vec_full_reg_offset(s, a->rd);
+    uint64_t imm;
+
+    if (a->esz == 0) {
+        unallocated_encoding(s);
+        return;
+    }
+
+    /* Decode the VFP immediate.
*/
+    imm = vfp_expand_imm(a->esz, a->imm);
+    imm = dup_const(a->esz, imm);
+
+    tcg_gen_gvec_dup64i(dofs, vsz, vsz, imm);
+}
+
+static void trans_DUP_i(DisasContext *s, arg_DUP_i *a, uint32_t insn)
+{
+    unsigned vsz = vec_full_reg_size(s);
+    int dofs = vec_full_reg_offset(s, a->rd);
+
+    if (a->esz == 0 && extract32(insn, 13, 1)) {
+        unallocated_encoding(s);
+        return;
+    }
+
+    tcg_gen_gvec_dup64i(dofs, vsz, vsz, dup_const(a->esz, a->imm));
+}
+
 /*
  *** SVE Memory - 32-bit Gather and Unsized Contiguous Group
  */
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index b5bc7e9546..ea1bfe7579 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -622,6 +622,14 @@ CTERM 00100101 1 sf:1 1 rm:5 001000 rn:5 ne:1 0000
 # SVE integer compare scalar count and limit
 WHILE 00100101 esz:2 1 rm:5 000 sf:1 u:1 1 rn:5 eq:1 rd:4
 
+### SVE Integer Wide Immediate - Unpredicated Group
+
+# SVE broadcast floating-point immediate (unpredicated)
+FDUP 00100101 esz:2 111 00 1110 imm:8 rd:5
+
+# SVE broadcast integer immediate (unpredicated)
+DUP_i 00100101 esz:2 111 00 011 . ........
rd:5 imm=%sh8_i8s
+
 ### SVE Memory - 32-bit Gather and Unsized Contiguous Group
 
 # SVE load predicate register

From patchwork Sat Feb 17 18:22:58 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 128727
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Sat, 17 Feb 2018 10:22:58 -0800
Message-Id: <20180217182323.25885-43-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.14.3
In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org>
References: <20180217182323.25885-1-richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH v2 42/67] target/arm: Implement SVE Integer Wide Immediate - Unpredicated Group
Cc: qemu-arm@nongnu.org

Signed-off-by: Richard Henderson
---
 target/arm/helper-sve.h    |  25 +++++++++
 target/arm/sve_helper.c    |  41 ++++++++++++++
 target/arm/translate-sve.c | 135 +++++++++++++++++++++++++++++++++++++++++++++
 target/arm/sve.decode      |  26 +++++++++
 4 files changed, 227 insertions(+)

-- 
2.14.3
Reviewed-by: Peter Maydell

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index 1863106d0f..97bfe0f47b 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -680,3 +680,28 @@ DEF_HELPER_FLAGS_4(sve_brkns, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_3(sve_cntp, TCG_CALL_NO_RWG, i64, ptr, ptr, i32)
 
 DEF_HELPER_FLAGS_3(sve_while, TCG_CALL_NO_RWG, i32, ptr, i32, i32)
+
+DEF_HELPER_FLAGS_4(sve_subri_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+DEF_HELPER_FLAGS_4(sve_subri_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+DEF_HELPER_FLAGS_4(sve_subri_s, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+DEF_HELPER_FLAGS_4(sve_subri_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+
+DEF_HELPER_FLAGS_4(sve_smaxi_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+DEF_HELPER_FLAGS_4(sve_smaxi_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+DEF_HELPER_FLAGS_4(sve_smaxi_s, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+DEF_HELPER_FLAGS_4(sve_smaxi_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+
+DEF_HELPER_FLAGS_4(sve_smini_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+DEF_HELPER_FLAGS_4(sve_smini_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+DEF_HELPER_FLAGS_4(sve_smini_s, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+DEF_HELPER_FLAGS_4(sve_smini_d, TCG_CALL_NO_RWG,
void, ptr, ptr, i64, i32)
+
+DEF_HELPER_FLAGS_4(sve_umaxi_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+DEF_HELPER_FLAGS_4(sve_umaxi_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+DEF_HELPER_FLAGS_4(sve_umaxi_s, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+DEF_HELPER_FLAGS_4(sve_umaxi_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+
+DEF_HELPER_FLAGS_4(sve_umini_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+DEF_HELPER_FLAGS_4(sve_umini_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+DEF_HELPER_FLAGS_4(sve_umini_s, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+DEF_HELPER_FLAGS_4(sve_umini_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index 80b78da834..4f45f11bff 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -803,6 +803,46 @@ DO_VPZ_D(sve_uminv_d, uint64_t, uint64_t, -1, DO_MIN)
 #undef DO_VPZ
 #undef DO_VPZ_D
 
+/* Two vector operand, one scalar operand, unpredicated. */
+#define DO_ZZI(NAME, TYPE, OP)                                       \
+void HELPER(NAME)(void *vd, void *vn, uint64_t s64, uint32_t desc)   \
+{                                                                    \
+    intptr_t i, opr_sz = simd_oprsz(desc) / sizeof(TYPE);            \
+    TYPE s = s64, *d = vd, *n = vn;                                  \
+    for (i = 0; i < opr_sz; ++i) {                                   \
+        d[i] = OP(n[i], s);                                          \
+    }                                                                \
+}
+
+#define DO_SUBR(X, Y) (Y - X)
+
+DO_ZZI(sve_subri_b, uint8_t, DO_SUBR)
+DO_ZZI(sve_subri_h, uint16_t, DO_SUBR)
+DO_ZZI(sve_subri_s, uint32_t, DO_SUBR)
+DO_ZZI(sve_subri_d, uint64_t, DO_SUBR)
+
+DO_ZZI(sve_smaxi_b, int8_t, DO_MAX)
+DO_ZZI(sve_smaxi_h, int16_t, DO_MAX)
+DO_ZZI(sve_smaxi_s, int32_t, DO_MAX)
+DO_ZZI(sve_smaxi_d, int64_t, DO_MAX)
+
+DO_ZZI(sve_smini_b, int8_t, DO_MIN)
+DO_ZZI(sve_smini_h, int16_t, DO_MIN)
+DO_ZZI(sve_smini_s, int32_t, DO_MIN)
+DO_ZZI(sve_smini_d, int64_t, DO_MIN)
+
+DO_ZZI(sve_umaxi_b, uint8_t, DO_MAX)
+DO_ZZI(sve_umaxi_h, uint16_t, DO_MAX)
+DO_ZZI(sve_umaxi_s, uint32_t, DO_MAX)
+DO_ZZI(sve_umaxi_d, uint64_t, DO_MAX)
+
+DO_ZZI(sve_umini_b, uint8_t, DO_MIN)
+DO_ZZI(sve_umini_h, uint16_t, DO_MIN)
+DO_ZZI(sve_umini_s, uint32_t, DO_MIN)
+DO_ZZI(sve_umini_d, uint64_t, DO_MIN)
+
+#undef DO_ZZI
+
 #undef DO_AND
 #undef DO_ORR
 #undef DO_EOR
@@ -817,6 +857,7 @@ DO_VPZ_D(sve_uminv_d, uint64_t, uint64_t, -1, DO_MIN)
 #undef DO_ASR
 #undef DO_LSR
 #undef DO_LSL
+#undef DO_SUBR
 
 /* Similar to the ARM LastActiveElement pseudocode function, except the
    result is multiplied by the element size. This includes the not found
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index 7571d02237..72abcb543a 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -81,6 +81,11 @@ static inline int expand_imm_sh8s(int x)
     return (int8_t)x << (x & 0x100 ? 8 : 0);
 }
 
+static inline int expand_imm_sh8u(int x)
+{
+    return (uint8_t)x << (x & 0x100 ? 8 : 0);
+}
+
 /*
  * Include the generated decoder.
  */
@@ -2974,6 +2979,136 @@ static void trans_DUP_i(DisasContext *s, arg_DUP_i *a, uint32_t insn)
     tcg_gen_gvec_dup64i(dofs, vsz, vsz, dup_const(a->esz, a->imm));
 }
 
+static void trans_ADD_zzi(DisasContext *s, arg_rri_esz *a, uint32_t insn)
+{
+    unsigned vsz = vec_full_reg_size(s);
+
+    if (a->esz == 0 && extract32(insn, 13, 1)) {
+        unallocated_encoding(s);
+        return;
+    }
+    tcg_gen_gvec_addi(a->esz, vec_full_reg_offset(s, a->rd),
+                      vec_full_reg_offset(s, a->rn), a->imm, vsz, vsz);
+}
+
+static void trans_SUB_zzi(DisasContext *s, arg_rri_esz *a, uint32_t insn)
+{
+    a->imm = -a->imm;
+    trans_ADD_zzi(s, a, insn);
+}
+
+static void trans_SUBR_zzi(DisasContext *s, arg_rri_esz *a, uint32_t insn)
+{
+    static const GVecGen2s op[4] = {
+        { .fni8 = tcg_gen_vec_sub8_i64,
+          .fniv = tcg_gen_sub_vec,
+          .fno = gen_helper_sve_subri_b,
+          .opc = INDEX_op_sub_vec,
+          .vece = MO_8,
+          .scalar_first = true },
+        { .fni8 = tcg_gen_vec_sub16_i64,
+          .fniv = tcg_gen_sub_vec,
+          .fno = gen_helper_sve_subri_h,
+          .opc = INDEX_op_sub_vec,
+          .vece = MO_16,
+          .scalar_first = true },
+        { .fni4 = tcg_gen_sub_i32,
+          .fniv = tcg_gen_sub_vec,
+          .fno = gen_helper_sve_subri_s,
+          .opc = INDEX_op_sub_vec,
+          .vece = MO_32,
+          .scalar_first = true },
+        { .fni8 = tcg_gen_sub_i64,
+          .fniv = tcg_gen_sub_vec,
+          .fno = gen_helper_sve_subri_d,
+          .opc = INDEX_op_sub_vec,
+          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
+          .vece = MO_64,
+          .scalar_first = true }
+    };
+    unsigned vsz = vec_full_reg_size(s);
+    TCGv_i64 c;
+
+    if (a->esz == 0 && extract32(insn, 13, 1)) {
+        unallocated_encoding(s);
+        return;
+    }
+    c = tcg_const_i64(a->imm);
+    tcg_gen_gvec_2s(vec_full_reg_offset(s, a->rd),
+                    vec_full_reg_offset(s, a->rn), vsz, vsz, c, &op[a->esz]);
+    tcg_temp_free_i64(c);
+}
+
+static void trans_MUL_zzi(DisasContext *s, arg_rri_esz *a, uint32_t insn)
+{
+    unsigned vsz = vec_full_reg_size(s);
+    tcg_gen_gvec_muli(a->esz, vec_full_reg_offset(s, a->rd),
+                      vec_full_reg_offset(s, a->rn), a->imm, vsz, vsz);
+}
+
+static void do_zzi_sat(DisasContext *s, arg_rri_esz *a, uint32_t insn,
+                       bool u, bool d)
+{
+    TCGv_i64 val;
+
+    if (a->esz == 0 && extract32(insn, 13, 1)) {
+        unallocated_encoding(s);
+        return;
+    }
+    val = tcg_const_i64(a->imm);
+    do_sat_addsub_vec(s, a->esz, a->rd, a->rn, val, u, d);
+    tcg_temp_free_i64(val);
+}
+
+static void trans_SQADD_zzi(DisasContext *s, arg_rri_esz *a, uint32_t insn)
+{
+    do_zzi_sat(s, a, insn, false, false);
+}
+
+static void trans_UQADD_zzi(DisasContext *s, arg_rri_esz *a, uint32_t insn)
+{
+    do_zzi_sat(s, a, insn, true, false);
+}
+
+static void trans_SQSUB_zzi(DisasContext *s, arg_rri_esz *a, uint32_t insn)
+{
+    do_zzi_sat(s, a, insn, false, true);
+}
+
+static void trans_UQSUB_zzi(DisasContext *s, arg_rri_esz *a, uint32_t insn)
+{
+    do_zzi_sat(s, a, insn, true, true);
+}
+
+static void do_zzi_ool(DisasContext *s, arg_rri_esz *a, gen_helper_gvec_2i *fn)
+{
+    unsigned vsz = vec_full_reg_size(s);
+    TCGv_i64 c = tcg_const_i64(a->imm);
+
+    tcg_gen_gvec_2i_ool(vec_full_reg_offset(s, a->rd),
+                        vec_full_reg_offset(s, a->rn),
+                        c, vsz, vsz, 0, fn);
+    tcg_temp_free_i64(c);
+}
+
+#define DO_ZZI(NAME, name) \
+static void trans_##NAME##_zzi(DisasContext *s, arg_rri_esz *a,         \
+                               uint32_t insn)                           \
+{                                                                       \
+    static gen_helper_gvec_2i * const fns[4] = {                        \
+        gen_helper_sve_##name##i_b, gen_helper_sve_##name##i_h,         \
+        gen_helper_sve_##name##i_s, gen_helper_sve_##name##i_d,         \
+    };                                                                  \
+    do_zzi_ool(s, a, fns[a->esz]);                                      \
+}
+
+DO_ZZI(SMAX, smax)
+DO_ZZI(UMAX, umax)
+DO_ZZI(SMIN, smin)
+DO_ZZI(UMIN, umin)
+
+#undef DO_ZZI
+
 /*
  *** SVE Memory - 32-bit Gather and Unsized Contiguous Group
  */
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index ea1bfe7579..1ede152360 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -43,6 +43,8 @@
 
 # Signed 8-bit immediate, optionally shifted left by 8.
 %sh8_i8s 5:9 !function=expand_imm_sh8s
+# Unsigned 8-bit immediate, optionally shifted left by 8.
+%sh8_i8u 5:9 !function=expand_imm_sh8u
 
 # Either a copy of rd (at bit 0), or a different source
 # as propagated via the MOVPRFX instruction.
@@ -96,6 +98,12 @@
 @pd_pn_pm ........ esz:2 .. rm:4 ....... rn:4 . rd:4 &rrr_esz
 @rdn_rm ........ esz:2 ...... ...... rm:5 rd:5 \
         &rrr_esz rn=%reg_movprfx
+@rdn_sh_i8u ........ esz:2 ...... ...... ..... rd:5 \
+        &rri_esz rn=%reg_movprfx imm=%sh8_i8u
+@rdn_i8u ........ esz:2 ...... ... imm:8 rd:5 \
+        &rri_esz rn=%reg_movprfx
+@rdn_i8s ........ esz:2 ...... ... imm:s8 rd:5 \
+        &rri_esz rn=%reg_movprfx
 
 # Three operand with "memory" size, aka immediate left shift
 @rd_rn_msz_rm ........ ... rm:5 .... imm:2 rn:5 rd:5 &rrri
@@ -630,6 +638,24 @@ FDUP 00100101 esz:2 111 00 1110 imm:8 rd:5
 # SVE broadcast integer immediate (unpredicated)
 DUP_i 00100101 esz:2 111 00 011 . ........ rd:5 imm=%sh8_i8s
 
+# SVE integer add/subtract immediate (unpredicated)
+ADD_zzi 00100101 .. 100 000 11 . ........ ..... @rdn_sh_i8u
+SUB_zzi 00100101 .. 100 001 11 . ........ ..... @rdn_sh_i8u
+SUBR_zzi 00100101 .. 100 011 11 . ........ ..... @rdn_sh_i8u
+SQADD_zzi 00100101 .. 100 100 11 . ........ ..... @rdn_sh_i8u
+UQADD_zzi 00100101 .. 100 101 11 . ........ ..... @rdn_sh_i8u
+SQSUB_zzi 00100101 .. 100 110 11 . ........ .....
@rdn_sh_i8u
+UQSUB_zzi 00100101 .. 100 111 11 . ........ ..... @rdn_sh_i8u
+
+# SVE integer min/max immediate (unpredicated)
+SMAX_zzi 00100101 .. 101 000 110 ........ ..... @rdn_i8s
+UMAX_zzi 00100101 .. 101 001 110 ........ ..... @rdn_i8u
+SMIN_zzi 00100101 .. 101 010 110 ........ ..... @rdn_i8s
+UMIN_zzi 00100101 .. 101 011 110 ........ ..... @rdn_i8u
+
+# SVE integer multiply immediate (unpredicated)
+MUL_zzi 00100101 .. 110 000 110 ........ ..... @rdn_i8s
+
 ### SVE Memory - 32-bit Gather and Unsized Contiguous Group
 
 # SVE load predicate register

From patchwork Sat Feb 17 18:22:59 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 128697
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Sat, 17 Feb 2018 10:22:59 -0800
Message-Id: <20180217182323.25885-44-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.14.3
In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org>
References: <20180217182323.25885-1-richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH v2 43/67] target/arm: Implement SVE Floating Point Arithmetic - Unpredicated Group
Cc: qemu-arm@nongnu.org

Signed-off-by: Richard Henderson
---
 target/arm/helper-sve.h    | 14 +++++++
 target/arm/helper.h        | 19 ++++++++++
 target/arm/translate-sve.c | 41 ++++++++++++++++++++
 target/arm/vec_helper.c    | 94 ++++++++++++++++++++++++++++++++++++++++++++++
 target/arm/Makefile.objs   |  2 +-
 target/arm/sve.decode      | 10 +++++
 6 files changed, 179 insertions(+), 1 deletion(-)
 create mode 100644 target/arm/vec_helper.c

-- 
2.14.3
Reviewed-by: Peter Maydell

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index 97bfe0f47b..2e76084992 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -705,3 +705,17 @@ DEF_HELPER_FLAGS_4(sve_umini_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
 DEF_HELPER_FLAGS_4(sve_umini_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
 DEF_HELPER_FLAGS_4(sve_umini_s, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
 DEF_HELPER_FLAGS_4(sve_umini_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+
+DEF_HELPER_FLAGS_5(gvec_recps_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_recps_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_recps_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(gvec_rsqrts_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_rsqrts_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_rsqrts_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
diff --git a/target/arm/helper.h b/target/arm/helper.h
index be3c2fcdc0..f3ce58e276 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -565,6 +565,25 @@ DEF_HELPER_2(dc_zva, void, env, i64)
 DEF_HELPER_FLAGS_2(neon_pmull_64_lo, TCG_CALL_NO_RWG_SE, i64, i64, i64)
 DEF_HELPER_FLAGS_2(neon_pmull_64_hi, TCG_CALL_NO_RWG_SE, i64, i64, i64)
 
+DEF_HELPER_FLAGS_5(gvec_fadd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_fadd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_fadd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(gvec_fsub_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_fsub_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_fsub_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(gvec_fmul_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_fmul_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_fmul_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(gvec_ftsmul_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_ftsmul_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_ftsmul_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+
 #ifdef TARGET_AARCH64
 #include "helper-a64.h"
 #include "helper-sve.h"
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index 72abcb543a..f9a3ad1434 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -3109,6 +3109,47 @@ DO_ZZI(UMIN, umin)
 
 #undef DO_ZZI
 
+/*
+ *** SVE Floating Point Arithmetic - Unpredicated Group
+ */
+
+static void do_zzz_fp(DisasContext *s, arg_rrr_esz *a,
+                      gen_helper_gvec_3_ptr *fn)
+{
+    unsigned vsz = vec_full_reg_size(s);
+    TCGv_ptr status;
+
+    if (fn == NULL) {
+        unallocated_encoding(s);
+        return;
+    }
+    status = get_fpstatus_ptr(a->esz == MO_16);
+    tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, a->rd),
vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + status, vsz, vsz, 0, fn); +} + + +#define DO_FP3(NAME, name) \ +static void trans_##NAME(DisasContext *s, arg_rrr_esz *a, uint32_t insn) \ +{ \ + static gen_helper_gvec_3_ptr * const fns[4] = { \ + NULL, gen_helper_gvec_##name##_h, \ + gen_helper_gvec_##name##_s, gen_helper_gvec_##name##_d \ + }; \ + do_zzz_fp(s, a, fns[a->esz]); \ +} + +DO_FP3(FADD_zzz, fadd) +DO_FP3(FSUB_zzz, fsub) +DO_FP3(FMUL_zzz, fmul) +DO_FP3(FTSMUL, ftsmul) +DO_FP3(FRECPS, recps) +DO_FP3(FRSQRTS, rsqrts) + +#undef DO_FP3 + /* *** SVE Memory - 32-bit Gather and Unsized Contiguous Group */ diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c new file mode 100644 index 0000000000..ad5c29cdd5 --- /dev/null +++ b/target/arm/vec_helper.c @@ -0,0 +1,94 @@ +/* + * ARM Shared AdvSIMD / SVE Operations + * + * Copyright (c) 2018 Linaro + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library; if not, see . + */ + +#include "qemu/osdep.h" +#include "cpu.h" +#include "exec/helper-proto.h" +#include "tcg/tcg-gvec-desc.h" +#include "fpu/softfloat.h" + + +/* Floating-point trigonometric starting value. + * See the ARM ARM pseudocode function FPTrigSMul. 
+ */ +static float16 float16_ftsmul(float16 op1, uint16_t op2, float_status *stat) +{ + float16 result = float16_mul(op1, op1, stat); + if (!float16_is_any_nan(result)) { + result = float16_set_sign(result, op2 & 1); + } + return result; +} + +static float32 float32_ftsmul(float32 op1, uint32_t op2, float_status *stat) +{ + float32 result = float32_mul(op1, op1, stat); + if (!float32_is_any_nan(result)) { + result = float32_set_sign(result, op2 & 1); + } + return result; +} + +static float64 float64_ftsmul(float64 op1, uint64_t op2, float_status *stat) +{ + float64 result = float64_mul(op1, op1, stat); + if (!float64_is_any_nan(result)) { + result = float64_set_sign(result, op2 & 1); + } + return result; +} + +#define DO_3OP(NAME, FUNC, TYPE) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, void *stat, uint32_t desc) \ +{ \ + intptr_t i, oprsz = simd_oprsz(desc); \ + TYPE *d = vd, *n = vn, *m = vm; \ + for (i = 0; i < oprsz / sizeof(TYPE); i++) { \ + d[i] = FUNC(n[i], m[i], stat); \ + } \ +} + +DO_3OP(gvec_fadd_h, float16_add, float16) +DO_3OP(gvec_fadd_s, float32_add, float32) +DO_3OP(gvec_fadd_d, float64_add, float64) + +DO_3OP(gvec_fsub_h, float16_sub, float16) +DO_3OP(gvec_fsub_s, float32_sub, float32) +DO_3OP(gvec_fsub_d, float64_sub, float64) + +DO_3OP(gvec_fmul_h, float16_mul, float16) +DO_3OP(gvec_fmul_s, float32_mul, float32) +DO_3OP(gvec_fmul_d, float64_mul, float64) + +DO_3OP(gvec_ftsmul_h, float16_ftsmul, float16) +DO_3OP(gvec_ftsmul_s, float32_ftsmul, float32) +DO_3OP(gvec_ftsmul_d, float64_ftsmul, float64) + +#ifdef TARGET_AARCH64 + +DO_3OP(gvec_recps_h, helper_recpsf_f16, float16) +DO_3OP(gvec_recps_s, helper_recpsf_f32, float32) +DO_3OP(gvec_recps_d, helper_recpsf_f64, float64) + +DO_3OP(gvec_rsqrts_h, helper_rsqrtsf_f16, float16) +DO_3OP(gvec_rsqrts_s, helper_rsqrtsf_f32, float32) +DO_3OP(gvec_rsqrts_d, helper_rsqrtsf_f64, float64) + +#endif +#undef DO_3OP diff --git a/target/arm/Makefile.objs b/target/arm/Makefile.objs index 
452ac6f453..50a521876d 100644 --- a/target/arm/Makefile.objs +++ b/target/arm/Makefile.objs @@ -8,7 +8,7 @@ obj-y += translate.o op_helper.o helper.o cpu.o obj-y += neon_helper.o iwmmxt_helper.o obj-y += gdbstub.o obj-$(TARGET_AARCH64) += cpu64.o translate-a64.o helper-a64.o gdbstub64.o -obj-y += crypto_helper.o +obj-y += crypto_helper.o vec_helper.o obj-$(CONFIG_SOFTMMU) += arm-powerctl.o DECODETREE = $(SRC_PATH)/scripts/decodetree.py diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 1ede152360..42d14994a1 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -656,6 +656,16 @@ UMIN_zzi 00100101 .. 101 011 110 ........ ..... @rdn_i8u # SVE integer multiply immediate (unpredicated) MUL_zzi 00100101 .. 110 000 110 ........ ..... @rdn_i8s +### SVE Floating Point Arithmetic - Unpredicated Group + +# SVE floating-point arithmetic (unpredicated) +FADD_zzz 01100101 .. 0 ..... 000 000 ..... ..... @rd_rn_rm +FSUB_zzz 01100101 .. 0 ..... 000 001 ..... ..... @rd_rn_rm +FMUL_zzz 01100101 .. 0 ..... 000 010 ..... ..... @rd_rn_rm +FTSMUL 01100101 .. 0 ..... 000 011 ..... ..... @rd_rn_rm +FRECPS 01100101 .. 0 ..... 000 110 ..... ..... @rd_rn_rm +FRSQRTS 01100101 .. 0 ..... 000 111 ..... ..... 
@rd_rn_rm + ### SVE Memory - 32-bit Gather and Unsized Contiguous Group # SVE load predicate register From patchwork Sat Feb 17 18:23:00 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 128711 From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:23:00 -0800 Message-Id: <20180217182323.25885-45-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org>
X-Received-From: 2607:f8b0:400e:c01::243 Subject: [Qemu-devel] [PATCH v2 44/67] target/arm: Implement SVE Memory Contiguous Load Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 35 +++++++ target/arm/sve_helper.c | 235 +++++++++++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 130 +++++++++++++++++++++++++ target/arm/sve.decode | 44 ++++++++- 4 files changed, 442 insertions(+), 2 deletions(-) -- 2.14.3 Reviewed-by: Peter Maydell diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 2e76084992..fcc9ba5f50 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -719,3 +719,38 @@ DEF_HELPER_FLAGS_5(gvec_rsqrts_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(gvec_rsqrts_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve_ld1bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld2bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld3bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld4bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_ld1hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld2hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld3hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld4hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_ld1ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld2ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld3ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld4ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + 
+DEF_HELPER_FLAGS_4(sve_ld1dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld2dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld3dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld4dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_ld1bhu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld1bsu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld1bdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld1bhs_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld1bss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld1bds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_ld1hsu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld1hdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld1hss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld1hds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_ld1sdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_ld1sds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 4f45f11bff..e542725113 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -2788,3 +2788,238 @@ uint32_t HELPER(sve_while)(void *vd, uint32_t count, uint32_t pred_desc) return predtest_ones(d, oprsz, esz_mask); } + +/* + * Load contiguous data, protected by a governing predicate. 
+ */ +#define DO_LD1(NAME, FN, TYPEE, TYPEM, H) \ +void HELPER(NAME)(CPUARMState *env, void *vg, \ + target_ulong addr, uint32_t desc) \ +{ \ + intptr_t i, oprsz = simd_oprsz(desc); \ + intptr_t ra = GETPC(); \ + unsigned rd = simd_data(desc); \ + void *vd = &env->vfp.zregs[rd]; \ + for (i = 0; i < oprsz; ) { \ + uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \ + do { \ + TYPEM m = 0; \ + if (pg & 1) { \ + m = FN(env, addr, ra); \ + } \ + *(TYPEE *)(vd + H(i)) = m; \ + i += sizeof(TYPEE), pg >>= sizeof(TYPEE); \ + addr += sizeof(TYPEM); \ + } while (i & 15); \ + } \ +} + +#define DO_LD1_D(NAME, FN, TYPEM) \ +void HELPER(NAME)(CPUARMState *env, void *vg, \ + target_ulong addr, uint32_t desc) \ +{ \ + intptr_t i, oprsz = simd_oprsz(desc) / 8; \ + intptr_t ra = GETPC(); \ + unsigned rd = simd_data(desc); \ + uint64_t *d = &env->vfp.zregs[rd].d[0]; \ + uint8_t *pg = vg; \ + for (i = 0; i < oprsz; i += 1) { \ + TYPEM m = 0; \ + if (pg[H1(i)] & 1) { \ + m = FN(env, addr, ra); \ + } \ + d[i] = m; \ + addr += sizeof(TYPEM); \ + } \ +} + +#define DO_LD2(NAME, FN, TYPEE, TYPEM, H) \ +void HELPER(NAME)(CPUARMState *env, void *vg, \ + target_ulong addr, uint32_t desc) \ +{ \ + intptr_t i, oprsz = simd_oprsz(desc); \ + intptr_t ra = GETPC(); \ + unsigned rd = simd_data(desc); \ + void *d1 = &env->vfp.zregs[rd]; \ + void *d2 = &env->vfp.zregs[(rd + 1) & 31]; \ + for (i = 0; i < oprsz; ) { \ + uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \ + do { \ + TYPEM m1 = 0, m2 = 0; \ + if (pg & 1) { \ + m1 = FN(env, addr, ra); \ + m2 = FN(env, addr + sizeof(TYPEM), ra); \ + } \ + *(TYPEE *)(d1 + H(i)) = m1; \ + *(TYPEE *)(d2 + H(i)) = m2; \ + i += sizeof(TYPEE), pg >>= sizeof(TYPEE); \ + addr += 2 * sizeof(TYPEM); \ + } while (i & 15); \ + } \ +} + +#define DO_LD3(NAME, FN, TYPEE, TYPEM, H) \ +void HELPER(NAME)(CPUARMState *env, void *vg, \ + target_ulong addr, uint32_t desc) \ +{ \ + intptr_t i, oprsz = simd_oprsz(desc); \ + intptr_t ra = GETPC(); \ + unsigned rd = simd_data(desc); \ 
+ void *d1 = &env->vfp.zregs[rd]; \ + void *d2 = &env->vfp.zregs[(rd + 1) & 31]; \ + void *d3 = &env->vfp.zregs[(rd + 2) & 31]; \ + for (i = 0; i < oprsz; ) { \ + uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \ + do { \ + TYPEM m1 = 0, m2 = 0, m3 = 0; \ + if (pg & 1) { \ + m1 = FN(env, addr, ra); \ + m2 = FN(env, addr + sizeof(TYPEM), ra); \ + m3 = FN(env, addr + 2 * sizeof(TYPEM), ra); \ + } \ + *(TYPEE *)(d1 + H(i)) = m1; \ + *(TYPEE *)(d2 + H(i)) = m2; \ + *(TYPEE *)(d3 + H(i)) = m3; \ + i += sizeof(TYPEE), pg >>= sizeof(TYPEE); \ + addr += 3 * sizeof(TYPEM); \ + } while (i & 15); \ + } \ +} + +#define DO_LD4(NAME, FN, TYPEE, TYPEM, H) \ +void HELPER(NAME)(CPUARMState *env, void *vg, \ + target_ulong addr, uint32_t desc) \ +{ \ + intptr_t i, oprsz = simd_oprsz(desc); \ + intptr_t ra = GETPC(); \ + unsigned rd = simd_data(desc); \ + void *d1 = &env->vfp.zregs[rd]; \ + void *d2 = &env->vfp.zregs[(rd + 1) & 31]; \ + void *d3 = &env->vfp.zregs[(rd + 2) & 31]; \ + void *d4 = &env->vfp.zregs[(rd + 3) & 31]; \ + for (i = 0; i < oprsz; ) { \ + uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \ + do { \ + TYPEM m1 = 0, m2 = 0, m3 = 0, m4 = 0; \ + if (pg & 1) { \ + m1 = FN(env, addr, ra); \ + m2 = FN(env, addr + sizeof(TYPEM), ra); \ + m3 = FN(env, addr + 2 * sizeof(TYPEM), ra); \ + m4 = FN(env, addr + 3 * sizeof(TYPEM), ra); \ + } \ + *(TYPEE *)(d1 + H(i)) = m1; \ + *(TYPEE *)(d2 + H(i)) = m2; \ + *(TYPEE *)(d3 + H(i)) = m3; \ + *(TYPEE *)(d4 + H(i)) = m4; \ + i += sizeof(TYPEE), pg >>= sizeof(TYPEE); \ + addr += 4 * sizeof(TYPEM); \ + } while (i & 15); \ + } \ +} + +DO_LD1(sve_ld1bhu_r, cpu_ldub_data_ra, uint16_t, uint8_t, H1_2) +DO_LD1(sve_ld1bhs_r, cpu_ldsb_data_ra, uint16_t, int8_t, H1_2) +DO_LD1(sve_ld1bsu_r, cpu_ldub_data_ra, uint32_t, uint8_t, H1_4) +DO_LD1(sve_ld1bss_r, cpu_ldsb_data_ra, uint32_t, int8_t, H1_4) +DO_LD1_D(sve_ld1bdu_r, cpu_ldub_data_ra, uint8_t) +DO_LD1_D(sve_ld1bds_r, cpu_ldsb_data_ra, int8_t) + +DO_LD1(sve_ld1hsu_r, cpu_lduw_data_ra, uint32_t, 
uint16_t, H1_4) +DO_LD1(sve_ld1hss_r, cpu_ldsw_data_ra, uint32_t, int16_t, H1_4) +DO_LD1_D(sve_ld1hdu_r, cpu_lduw_data_ra, uint16_t) +DO_LD1_D(sve_ld1hds_r, cpu_ldsw_data_ra, int16_t) + +DO_LD1_D(sve_ld1sdu_r, cpu_ldl_data_ra, uint32_t) +DO_LD1_D(sve_ld1sds_r, cpu_ldl_data_ra, int32_t) + +DO_LD1(sve_ld1bb_r, cpu_ldub_data_ra, uint8_t, uint8_t, H1) +DO_LD2(sve_ld2bb_r, cpu_ldub_data_ra, uint8_t, uint8_t, H1) +DO_LD3(sve_ld3bb_r, cpu_ldub_data_ra, uint8_t, uint8_t, H1) +DO_LD4(sve_ld4bb_r, cpu_ldub_data_ra, uint8_t, uint8_t, H1) + +DO_LD1(sve_ld1hh_r, cpu_lduw_data_ra, uint16_t, uint16_t, H1_2) +DO_LD2(sve_ld2hh_r, cpu_lduw_data_ra, uint16_t, uint16_t, H1_2) +DO_LD3(sve_ld3hh_r, cpu_lduw_data_ra, uint16_t, uint16_t, H1_2) +DO_LD4(sve_ld4hh_r, cpu_lduw_data_ra, uint16_t, uint16_t, H1_2) + +DO_LD1(sve_ld1ss_r, cpu_ldl_data_ra, uint32_t, uint32_t, H1_4) +DO_LD2(sve_ld2ss_r, cpu_ldl_data_ra, uint32_t, uint32_t, H1_4) +DO_LD3(sve_ld3ss_r, cpu_ldl_data_ra, uint32_t, uint32_t, H1_4) +DO_LD4(sve_ld4ss_r, cpu_ldl_data_ra, uint32_t, uint32_t, H1_4) + +DO_LD1_D(sve_ld1dd_r, cpu_ldq_data_ra, uint64_t) + +void HELPER(sve_ld2dd_r)(CPUARMState *env, void *vg, + target_ulong addr, uint32_t desc) +{ + intptr_t i, oprsz = simd_oprsz(desc) / 8; + intptr_t ra = GETPC(); + unsigned rd = simd_data(desc); + uint64_t *d1 = &env->vfp.zregs[rd].d[0]; + uint64_t *d2 = &env->vfp.zregs[(rd + 1) & 31].d[0]; + uint8_t *pg = vg; + + for (i = 0; i < oprsz; i += 1) { + uint64_t m1 = 0, m2 = 0; + if (pg[H1(i)] & 1) { + m1 = cpu_ldq_data_ra(env, addr, ra); + m2 = cpu_ldq_data_ra(env, addr + 8, ra); + } + d1[i] = m1; + d2[i] = m2; + addr += 2 * 8; + } +} + +void HELPER(sve_ld3dd_r)(CPUARMState *env, void *vg, + target_ulong addr, uint32_t desc) +{ + intptr_t i, oprsz = simd_oprsz(desc) / 8; + intptr_t ra = GETPC(); + unsigned rd = simd_data(desc); + uint64_t *d1 = &env->vfp.zregs[rd].d[0]; + uint64_t *d2 = &env->vfp.zregs[(rd + 1) & 31].d[0]; + uint64_t *d3 = &env->vfp.zregs[(rd + 2) & 31].d[0]; + 
uint8_t *pg = vg; + + for (i = 0; i < oprsz; i += 1) { + uint64_t m1 = 0, m2 = 0, m3 = 0; + if (pg[H1(i)] & 1) { + m1 = cpu_ldq_data_ra(env, addr, ra); + m2 = cpu_ldq_data_ra(env, addr + 8, ra); + m3 = cpu_ldq_data_ra(env, addr + 16, ra); + } + d1[i] = m1; + d2[i] = m2; + d3[i] = m3; + addr += 3 * 8; + } +} + +void HELPER(sve_ld4dd_r)(CPUARMState *env, void *vg, + target_ulong addr, uint32_t desc) +{ + intptr_t i, oprsz = simd_oprsz(desc) / 8; + intptr_t ra = GETPC(); + unsigned rd = simd_data(desc); + uint64_t *d1 = &env->vfp.zregs[rd].d[0]; + uint64_t *d2 = &env->vfp.zregs[(rd + 1) & 31].d[0]; + uint64_t *d3 = &env->vfp.zregs[(rd + 2) & 31].d[0]; + uint64_t *d4 = &env->vfp.zregs[(rd + 3) & 31].d[0]; + uint8_t *pg = vg; + + for (i = 0; i < oprsz; i += 1) { + uint64_t m1 = 0, m2 = 0, m3 = 0, m4 = 0; + if (pg[H1(i)] & 1) { + m1 = cpu_ldq_data_ra(env, addr, ra); + m2 = cpu_ldq_data_ra(env, addr + 8, ra); + m3 = cpu_ldq_data_ra(env, addr + 16, ra); + m4 = cpu_ldq_data_ra(env, addr + 24, ra); + } + d1[i] = m1; + d2[i] = m2; + d3[i] = m3; + d4[i] = m4; + addr += 4 * 8; + } +} diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index f9a3ad1434..aa8bfd2ae7 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -46,6 +46,8 @@ typedef void gen_helper_gvec_flags_3(TCGv_i32, TCGv_ptr, TCGv_ptr, typedef void gen_helper_gvec_flags_4(TCGv_i32, TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i32); +typedef void gen_helper_gvec_mem(TCGv_env, TCGv_ptr, TCGv_i64, TCGv_i32); + /* * Helpers for extracting complex instruction fields. */ @@ -86,6 +88,15 @@ static inline int expand_imm_sh8u(int x) return (uint8_t)x << (x & 0x100 ? 8 : 0); } +/* Convert a 2-bit memory size (msz) to a 4-bit data type (dtype) + * with unsigned data. C.f. SVE Memory Contiguous Load Group. + */ +static inline int msz_dtype(int msz) +{ + static const uint8_t dtype[4] = { 0, 5, 10, 15 }; + return dtype[msz]; +} + /* * Include the generated decoder. 
*/ @@ -3268,3 +3279,122 @@ static void trans_LDR_pri(DisasContext *s, arg_rri *a, uint32_t insn) int size = pred_full_reg_size(s); do_ldr(s, pred_full_reg_offset(s, a->rd), size, a->rn, a->imm * size); } + +/* + *** SVE Memory - Contiguous Load Group + */ + +/* The memory element size of dtype. */ +static const TCGMemOp dtype_mop[16] = { + MO_UB, MO_UB, MO_UB, MO_UB, + MO_SL, MO_UW, MO_UW, MO_UW, + MO_SW, MO_SW, MO_UL, MO_UL, + MO_SB, MO_SB, MO_SB, MO_Q +}; + +#define dtype_msz(x) (dtype_mop[x] & MO_SIZE) + +/* The vector element size of dtype. */ +static const uint8_t dtype_esz[16] = { + 0, 1, 2, 3, + 3, 1, 2, 3, + 3, 2, 2, 3, + 3, 2, 1, 3 +}; + +static void do_mem_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr, + gen_helper_gvec_mem *fn) +{ + unsigned vsz = vec_full_reg_size(s); + TCGv_ptr t_pg; + TCGv_i32 desc; + + /* For e.g. LD4, there are not enough arguments to pass all 4 + registers as pointers, so encode the regno into the data field. + For consistency, do this even for LD1. 
*/ + desc = tcg_const_i32(simd_desc(vsz, vsz, zt)); + t_pg = tcg_temp_new_ptr(); + + tcg_gen_addi_ptr(t_pg, cpu_env, pred_full_reg_offset(s, pg)); + fn(cpu_env, t_pg, addr, desc); + + tcg_temp_free_ptr(t_pg); + tcg_temp_free_i32(desc); + tcg_temp_free_i64(addr); +} + +static void do_ld_zpa(DisasContext *s, int zt, int pg, + TCGv_i64 addr, int dtype, int nreg) +{ + static gen_helper_gvec_mem * const fns[16][4] = { + { gen_helper_sve_ld1bb_r, gen_helper_sve_ld2bb_r, + gen_helper_sve_ld3bb_r, gen_helper_sve_ld4bb_r }, + { gen_helper_sve_ld1bhu_r, NULL, NULL, NULL }, + { gen_helper_sve_ld1bsu_r, NULL, NULL, NULL }, + { gen_helper_sve_ld1bdu_r, NULL, NULL, NULL }, + + { gen_helper_sve_ld1sds_r, NULL, NULL, NULL }, + { gen_helper_sve_ld1hh_r, gen_helper_sve_ld2hh_r, + gen_helper_sve_ld3hh_r, gen_helper_sve_ld4hh_r }, + { gen_helper_sve_ld1hsu_r, NULL, NULL, NULL }, + { gen_helper_sve_ld1hdu_r, NULL, NULL, NULL }, + + { gen_helper_sve_ld1hds_r, NULL, NULL, NULL }, + { gen_helper_sve_ld1hss_r, NULL, NULL, NULL }, + { gen_helper_sve_ld1ss_r, gen_helper_sve_ld2ss_r, + gen_helper_sve_ld3ss_r, gen_helper_sve_ld4ss_r }, + { gen_helper_sve_ld1sdu_r, NULL, NULL, NULL }, + + { gen_helper_sve_ld1bds_r, NULL, NULL, NULL }, + { gen_helper_sve_ld1bss_r, NULL, NULL, NULL }, + { gen_helper_sve_ld1bhs_r, NULL, NULL, NULL }, + { gen_helper_sve_ld1dd_r, gen_helper_sve_ld2dd_r, + gen_helper_sve_ld3dd_r, gen_helper_sve_ld4dd_r }, + }; + gen_helper_gvec_mem *fn = fns[dtype][nreg]; + + /* While there are holes in the table, they are not + accessible via the instruction encoding. 
*/ + assert(fn != NULL); + do_mem_zpa(s, zt, pg, addr, fn); +} + +static void trans_LD_zprr(DisasContext *s, arg_rprr_load *a, uint32_t insn) +{ + TCGv_i64 addr; + + if (a->rm == 31) { + unallocated_encoding(s); + return; + } + + addr = tcg_temp_new_i64(); + tcg_gen_muli_i64(addr, cpu_reg(s, a->rm), + (a->nreg + 1) << dtype_msz(a->dtype)); + tcg_gen_add_i64(addr, addr, cpu_reg_sp(s, a->rn)); + do_ld_zpa(s, a->rd, a->pg, addr, a->dtype, a->nreg); +} + +static void trans_LD_zpri(DisasContext *s, arg_rpri_load *a, uint32_t insn) +{ + unsigned vsz = vec_full_reg_size(s); + unsigned elements = vsz >> dtype_esz[a->dtype]; + TCGv_i64 addr = tcg_temp_new_i64(); + + tcg_gen_addi_i64(addr, cpu_reg_sp(s, a->rn), + (a->imm * elements * (a->nreg + 1)) + << dtype_msz(a->dtype)); + do_ld_zpa(s, a->rd, a->pg, addr, a->dtype, a->nreg); +} + +static void trans_LDFF1_zprr(DisasContext *s, arg_rprr_load *a, uint32_t insn) +{ + /* FIXME */ + trans_LD_zprr(s, a, insn); +} + +static void trans_LDNF1_zpri(DisasContext *s, arg_rpri_load *a, uint32_t insn) +{ + /* FIXME */ + trans_LD_zpri(s, a, insn); +} diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 42d14994a1..d2b3869c58 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -42,9 +42,12 @@ %tszimm16_shl 22:2 16:5 !function=tszimm_shl # Signed 8-bit immediate, optionally shifted left by 8. -%sh8_i8s 5:9 !function=expand_imm_sh8s +%sh8_i8s 5:9 !function=expand_imm_sh8s # Unsigned 8-bit immediate, optionally shifted left by 8. -%sh8_i8u 5:9 !function=expand_imm_sh8u +%sh8_i8u 5:9 !function=expand_imm_sh8u + +# Unsigned load of msz into esz=2, represented as a dtype. +%msz_dtype 23:2 !function=msz_dtype # Either a copy of rd (at bit 0), or a different source # as propagated via the MOVPRFX instruction. 
@@ -72,6 +75,8 @@ &incdec2_cnt rd rn pat esz imm d u &incdec_pred rd pg esz d u &incdec2_pred rd rn pg esz d u +&rprr_load rd pg rn rm dtype nreg +&rpri_load rd pg rn imm dtype nreg ########################################################################### # Named instruction formats. These are generally used to @@ -171,6 +176,15 @@ @incdec2_pred ........ esz:2 .... .. ..... .. pg:4 rd:5 \ &incdec2_pred rn=%reg_movprfx +# Loads; user must fill in NREG. +@rprr_load_dt ....... dtype:4 rm:5 ... pg:3 rn:5 rd:5 &rprr_load +@rpri_load_dt ....... dtype:4 . imm:s4 ... pg:3 rn:5 rd:5 &rpri_load + +@rprr_load_msz ....... .... rm:5 ... pg:3 rn:5 rd:5 \ + &rprr_load dtype=%msz_dtype +@rpri_load_msz ....... .... . imm:s4 ... pg:3 rn:5 rd:5 \ + &rpri_load dtype=%msz_dtype + ########################################################################### # Instruction patterns. Grouped according to the SVE encodingindex.xhtml. @@ -673,3 +687,29 @@ LDR_pri 10000101 10 ...... 000 ... ..... 0 .... @pd_rn_i9 # SVE load vector register LDR_zri 10000101 10 ...... 010 ... ..... ..... @rd_rn_i9 + +### SVE Memory Contiguous Load Group + +# SVE contiguous load (scalar plus scalar) +LD_zprr 1010010 .... ..... 010 ... ..... ..... @rprr_load_dt nreg=0 + +# SVE contiguous first-fault load (scalar plus scalar) +LDFF1_zprr 1010010 .... ..... 011 ... ..... ..... @rprr_load_dt nreg=0 + +# SVE contiguous load (scalar plus immediate) +LD_zpri 1010010 .... 0.... 101 ... ..... ..... @rpri_load_dt nreg=0 + +# SVE contiguous non-fault load (scalar plus immediate) +LDNF1_zpri 1010010 .... 1.... 101 ... ..... ..... @rpri_load_dt nreg=0 + +# SVE contiguous non-temporal load (scalar plus scalar) +# LDNT1B, LDNT1H, LDNT1W, LDNT1D +# SVE load multiple structures (scalar plus scalar) +# LD2B, LD2H, LD2W, LD2D; etc. +LD_zprr 1010010 .. nreg:2 ..... 110 ... ..... ..... 
@rprr_load_msz + +# SVE contiguous non-temporal load (scalar plus immediate) +# LDNT1B, LDNT1H, LDNT1W, LDNT1D +# SVE load multiple structures (scalar plus immediate) +# LD2B, LD2H, LD2W, LD2D; etc. +LD_zpri 1010010 .. nreg:2 0.... 111 ... ..... ..... @rpri_load_msz From patchwork Sat Feb 17 18:23:01 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 128728 Delivered-To: patch@linaro.org Received: by 10.46.124.24 with SMTP id x24csp1847398ljc; Sat, 17 Feb 2018 11:14:12 -0800 (PST) X-Google-Smtp-Source: AH8x225IBD96dByGdE+Y2rwG+Ylpso0rNJLLTzHoy0v+TNe5nW658bHPBD8I+ae8BkiB4CsJUqdu X-Received: by 10.129.166.129 with SMTP id d123mr7535054ywh.173.1518894852397; Sat, 17 Feb 2018 11:14:12 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1518894852; cv=none; d=google.com; s=arc-20160816; b=Mz6K0swINHB5LchOwlMtatQCsuROxrdY8oDBPvW2hQ7A2cSTw6N0nfweJ5Jnv0zkJx n2cH6IRuQmteHgSlQz/gJHXophG8kCY7elNSjbTRjqQyk93E7YGSjMnWPZ0QyJ8e6bk+ vCTG9SF665iUsECaptxrYeMS+uQaluclLO/D9h9G9202SivG0zK4SyS6J04QJpjZ64zr /+lQn1s7IbZJIZLNQYbdXKfkLb36WXKre4zL0QiphBU+UW84O2JuA2OhJGpbjvISgRve XZI6vJfS8YjxyMYdTiB5xDm2ON21Ni3H62whZqTB5J2AcOifnjBAJMtqE5WtmqFVw3FB 8qmg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature:arc-authentication-results; bh=DlfGTdNWi8xMhHNSKulM72AN7w8vQPIOR0ZxZAqGW9M=; b=FnZFPrSJtyxxOf4zz4WQeJ9fx3gsOsGkEEXeNoxmOu5sb/sQBBEvI2NjHXjsHK1E0m hlbHQMI1h0wXtXGBiFNTkPoA6piH1OtelzrgrIn4fgFKQGLlhP/pRO7xuaXp6l/KeSMA 3A89yTZ9QEz30pf1a+TY4fhRUAZK51uvPtnv5IbG1MMoyzcmkTujobgNVWNjaOeX+3CJ o6kr+qmdIJ+yYxp/zONxD0xdqxdsJJiL3IPvTRzgNKbQ+NhE3dQDmql4Ja/PYvftaFAl pZAw1bXvEPBCZxpHNoSmwi90obPMDgjIZTeiHxjVg+Agxzhg/Mvx1Fk13CDK9fBaZmaa v8lg== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail 
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Sat, 17 Feb 2018 10:23:01 -0800
Message-Id: <20180217182323.25885-46-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.14.3
In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org>
References: <20180217182323.25885-1-richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH v2 45/67] target/arm: Implement SVE Memory Contiguous Store Group
Cc: qemu-arm@nongnu.org

Signed-off-by: Richard Henderson
---
 target/arm/helper-sve.h | 29 +++++++
 target/arm/sve_helper.c | 211 +++++++++++++++++++++++++++++++++++++++++++++
 target/arm/translate-sve.c | 68 ++++++++++++++-
 target/arm/sve.decode | 38 ++++++++
 4 files changed, 343 insertions(+), 3 deletions(-)

--
2.14.3
Reviewed-by: Peter Maydell

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index fcc9ba5f50..74c2d642a3 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -754,3 +754,32 @@ DEF_HELPER_FLAGS_4(sve_ld1hds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
 
 DEF_HELPER_FLAGS_4(sve_ld1sdu_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
 DEF_HELPER_FLAGS_4(sve_ld1sds_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_st1bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st2bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st3bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st4bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_st1hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st2hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st3hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st4hh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_4(sve_st1ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st2ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st3ss_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+DEF_HELPER_FLAGS_4(sve_st4ss_r, TCG_CALL_NO_WG, void,
env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_st1dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_st2dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_st3dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_st4dd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_st1bh_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_st1bs_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_st1bd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_st1hs_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve_st1hd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) + +DEF_HELPER_FLAGS_4(sve_st1sd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index e542725113..e259e910de 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -3023,3 +3023,214 @@ void HELPER(sve_ld4dd_r)(CPUARMState *env, void *vg, addr += 4 * 8; } } + +/* + * Store contiguous data, protected by a governing predicate. 
+ */ +#define DO_ST1(NAME, FN, TYPEE, TYPEM, H) \ +void HELPER(NAME)(CPUARMState *env, void *vg, \ + target_ulong addr, uint32_t desc) \ +{ \ + intptr_t i, oprsz = simd_oprsz(desc); \ + intptr_t ra = GETPC(); \ + unsigned rd = simd_data(desc); \ + void *vd = &env->vfp.zregs[rd]; \ + for (i = 0; i < oprsz; ) { \ + uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \ + do { \ + if (pg & 1) { \ + TYPEM m = *(TYPEE *)(vd + H(i)); \ + FN(env, addr, m, ra); \ + } \ + i += sizeof(TYPEE), pg >>= sizeof(TYPEE); \ + addr += sizeof(TYPEM); \ + } while (i & 15); \ + } \ +} + +#define DO_ST1_D(NAME, FN, TYPEM) \ +void HELPER(NAME)(CPUARMState *env, void *vg, \ + target_ulong addr, uint32_t desc) \ +{ \ + intptr_t i, oprsz = simd_oprsz(desc) / 8; \ + intptr_t ra = GETPC(); \ + unsigned rd = simd_data(desc); \ + uint64_t *d = &env->vfp.zregs[rd].d[0]; \ + uint8_t *pg = vg; \ + for (i = 0; i < oprsz; i += 1) { \ + if (pg[H1(i)] & 1) { \ + FN(env, addr, d[i], ra); \ + } \ + addr += sizeof(TYPEM); \ + } \ +} + +#define DO_ST2(NAME, FN, TYPEE, TYPEM, H) \ +void HELPER(NAME)(CPUARMState *env, void *vg, \ + target_ulong addr, uint32_t desc) \ +{ \ + intptr_t i, oprsz = simd_oprsz(desc); \ + intptr_t ra = GETPC(); \ + unsigned rd = simd_data(desc); \ + void *d1 = &env->vfp.zregs[rd]; \ + void *d2 = &env->vfp.zregs[(rd + 1) & 31]; \ + for (i = 0; i < oprsz; ) { \ + uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \ + do { \ + if (pg & 1) { \ + TYPEM m1 = *(TYPEE *)(d1 + H(i)); \ + TYPEM m2 = *(TYPEE *)(d2 + H(i)); \ + FN(env, addr, m1, ra); \ + FN(env, addr + sizeof(TYPEM), m2, ra); \ + } \ + i += sizeof(TYPEE), pg >>= sizeof(TYPEE); \ + addr += 2 * sizeof(TYPEM); \ + } while (i & 15); \ + } \ +} + +#define DO_ST3(NAME, FN, TYPEE, TYPEM, H) \ +void HELPER(NAME)(CPUARMState *env, void *vg, \ + target_ulong addr, uint32_t desc) \ +{ \ + intptr_t i, oprsz = simd_oprsz(desc); \ + intptr_t ra = GETPC(); \ + unsigned rd = simd_data(desc); \ + void *d1 = &env->vfp.zregs[rd]; \ + void *d2 = 
&env->vfp.zregs[(rd + 1) & 31]; \ + void *d3 = &env->vfp.zregs[(rd + 2) & 31]; \ + for (i = 0; i < oprsz; ) { \ + uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \ + do { \ + if (pg & 1) { \ + TYPEM m1 = *(TYPEE *)(d1 + H(i)); \ + TYPEM m2 = *(TYPEE *)(d2 + H(i)); \ + TYPEM m3 = *(TYPEE *)(d3 + H(i)); \ + FN(env, addr, m1, ra); \ + FN(env, addr + sizeof(TYPEM), m2, ra); \ + FN(env, addr + 2 * sizeof(TYPEM), m3, ra); \ + } \ + i += sizeof(TYPEE), pg >>= sizeof(TYPEE); \ + addr += 3 * sizeof(TYPEM); \ + } while (i & 15); \ + } \ +} + +#define DO_ST4(NAME, FN, TYPEE, TYPEM, H) \ +void HELPER(NAME)(CPUARMState *env, void *vg, \ + target_ulong addr, uint32_t desc) \ +{ \ + intptr_t i, oprsz = simd_oprsz(desc); \ + intptr_t ra = GETPC(); \ + unsigned rd = simd_data(desc); \ + void *d1 = &env->vfp.zregs[rd]; \ + void *d2 = &env->vfp.zregs[(rd + 1) & 31]; \ + void *d3 = &env->vfp.zregs[(rd + 2) & 31]; \ + void *d4 = &env->vfp.zregs[(rd + 3) & 31]; \ + for (i = 0; i < oprsz; ) { \ + uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \ + do { \ + if (pg & 1) { \ + TYPEM m1 = *(TYPEE *)(d1 + H(i)); \ + TYPEM m2 = *(TYPEE *)(d2 + H(i)); \ + TYPEM m3 = *(TYPEE *)(d3 + H(i)); \ + TYPEM m4 = *(TYPEE *)(d4 + H(i)); \ + FN(env, addr, m1, ra); \ + FN(env, addr + sizeof(TYPEM), m2, ra); \ + FN(env, addr + 2 * sizeof(TYPEM), m3, ra); \ + FN(env, addr + 3 * sizeof(TYPEM), m4, ra); \ + } \ + i += sizeof(TYPEE), pg >>= sizeof(TYPEE); \ + addr += 4 * sizeof(TYPEM); \ + } while (i & 15); \ + } \ +} + +DO_ST1(sve_st1bh_r, cpu_stb_data_ra, uint16_t, uint8_t, H1_2) +DO_ST1(sve_st1bs_r, cpu_stb_data_ra, uint32_t, uint8_t, H1_4) +DO_ST1_D(sve_st1bd_r, cpu_stb_data_ra, uint8_t) + +DO_ST1(sve_st1hs_r, cpu_stw_data_ra, uint32_t, uint16_t, H1_4) +DO_ST1_D(sve_st1hd_r, cpu_stw_data_ra, uint16_t) + +DO_ST1_D(sve_st1sd_r, cpu_stl_data_ra, uint32_t) + +DO_ST1(sve_st1bb_r, cpu_stb_data_ra, uint8_t, uint8_t, H1) +DO_ST2(sve_st2bb_r, cpu_stb_data_ra, uint8_t, uint8_t, H1) +DO_ST3(sve_st3bb_r, 
cpu_stb_data_ra, uint8_t, uint8_t, H1) +DO_ST4(sve_st4bb_r, cpu_stb_data_ra, uint8_t, uint8_t, H1) + +DO_ST1(sve_st1hh_r, cpu_stw_data_ra, uint16_t, uint16_t, H1_2) +DO_ST2(sve_st2hh_r, cpu_stw_data_ra, uint16_t, uint16_t, H1_2) +DO_ST3(sve_st3hh_r, cpu_stw_data_ra, uint16_t, uint16_t, H1_2) +DO_ST4(sve_st4hh_r, cpu_stw_data_ra, uint16_t, uint16_t, H1_2) + +DO_ST1(sve_st1ss_r, cpu_stl_data_ra, uint32_t, uint32_t, H1_4) +DO_ST2(sve_st2ss_r, cpu_stl_data_ra, uint32_t, uint32_t, H1_4) +DO_ST3(sve_st3ss_r, cpu_stl_data_ra, uint32_t, uint32_t, H1_4) +DO_ST4(sve_st4ss_r, cpu_stl_data_ra, uint32_t, uint32_t, H1_4) + +DO_ST1_D(sve_st1dd_r, cpu_stq_data_ra, uint64_t) + +void HELPER(sve_st2dd_r)(CPUARMState *env, void *vg, + target_ulong addr, uint32_t desc) +{ + intptr_t i, oprsz = simd_oprsz(desc) / 8; + intptr_t ra = GETPC(); + unsigned rd = simd_data(desc); + uint64_t *d1 = &env->vfp.zregs[rd].d[0]; + uint64_t *d2 = &env->vfp.zregs[(rd + 1) & 31].d[0]; + uint8_t *pg = vg; + + for (i = 0; i < oprsz; i += 1) { + if (pg[H1(i)] & 1) { + cpu_stq_data_ra(env, addr, d1[i], ra); + cpu_stq_data_ra(env, addr + 8, d2[i], ra); + } + addr += 2 * 8; + } +} + +void HELPER(sve_st3dd_r)(CPUARMState *env, void *vg, + target_ulong addr, uint32_t desc) +{ + intptr_t i, oprsz = simd_oprsz(desc) / 8; + intptr_t ra = GETPC(); + unsigned rd = simd_data(desc); + uint64_t *d1 = &env->vfp.zregs[rd].d[0]; + uint64_t *d2 = &env->vfp.zregs[(rd + 1) & 31].d[0]; + uint64_t *d3 = &env->vfp.zregs[(rd + 2) & 31].d[0]; + uint8_t *pg = vg; + + for (i = 0; i < oprsz; i += 1) { + if (pg[H1(i)] & 1) { + cpu_stq_data_ra(env, addr, d1[i], ra); + cpu_stq_data_ra(env, addr + 8, d2[i], ra); + cpu_stq_data_ra(env, addr + 16, d3[i], ra); + } + addr += 3 * 8; + } +} + +void HELPER(sve_st4dd_r)(CPUARMState *env, void *vg, + target_ulong addr, uint32_t desc) +{ + intptr_t i, oprsz = simd_oprsz(desc) / 8; + intptr_t ra = GETPC(); + unsigned rd = simd_data(desc); + uint64_t *d1 = &env->vfp.zregs[rd].d[0]; + uint64_t *d2 = 
&env->vfp.zregs[(rd + 1) & 31].d[0]; + uint64_t *d3 = &env->vfp.zregs[(rd + 2) & 31].d[0]; + uint64_t *d4 = &env->vfp.zregs[(rd + 3) & 31].d[0]; + uint8_t *pg = vg; + + for (i = 0; i < oprsz; i += 1) { + if (pg[H1(i)] & 1) { + cpu_stq_data_ra(env, addr, d1[i], ra); + cpu_stq_data_ra(env, addr + 8, d2[i], ra); + cpu_stq_data_ra(env, addr + 16, d3[i], ra); + cpu_stq_data_ra(env, addr + 24, d4[i], ra); + } + addr += 4 * 8; + } +} diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index aa8bfd2ae7..fda9a56fd5 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3320,7 +3320,6 @@ static void do_mem_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr, tcg_temp_free_ptr(t_pg); tcg_temp_free_i32(desc); - tcg_temp_free_i64(addr); } static void do_ld_zpa(DisasContext *s, int zt, int pg, @@ -3368,7 +3367,7 @@ static void trans_LD_zprr(DisasContext *s, arg_rprr_load *a, uint32_t insn) return; } - addr = tcg_temp_new_i64(); + addr = new_tmp_a64(s); tcg_gen_muli_i64(addr, cpu_reg(s, a->rm), (a->nreg + 1) << dtype_msz(a->dtype)); tcg_gen_add_i64(addr, addr, cpu_reg_sp(s, a->rn)); @@ -3379,7 +3378,7 @@ static void trans_LD_zpri(DisasContext *s, arg_rpri_load *a, uint32_t insn) { unsigned vsz = vec_full_reg_size(s); unsigned elements = vsz >> dtype_esz[a->dtype]; - TCGv_i64 addr = tcg_temp_new_i64(); + TCGv_i64 addr = new_tmp_a64(s); tcg_gen_addi_i64(addr, cpu_reg_sp(s, a->rn), (a->imm * elements * (a->nreg + 1)) @@ -3398,3 +3397,66 @@ static void trans_LDNF1_zpri(DisasContext *s, arg_rpri_load *a, uint32_t insn) /* FIXME */ trans_LD_zpri(s, a, insn); } + +static void do_st_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr, + int msz, int esz, int nreg) +{ + static gen_helper_gvec_mem * const fn_single[4][4] = { + { gen_helper_sve_st1bb_r, gen_helper_sve_st1bh_r, + gen_helper_sve_st1bs_r, gen_helper_sve_st1bd_r }, + { NULL, gen_helper_sve_st1hh_r, + gen_helper_sve_st1hs_r, gen_helper_sve_st1hd_r }, + { NULL, NULL, + gen_helper_sve_st1ss_r, 
gen_helper_sve_st1sd_r },
+        { NULL, NULL, NULL, gen_helper_sve_st1dd_r },
+    };
+    static gen_helper_gvec_mem * const fn_multiple[3][4] = {
+        { gen_helper_sve_st2bb_r, gen_helper_sve_st2hh_r,
+          gen_helper_sve_st2ss_r, gen_helper_sve_st2dd_r },
+        { gen_helper_sve_st3bb_r, gen_helper_sve_st3hh_r,
+          gen_helper_sve_st3ss_r, gen_helper_sve_st3dd_r },
+        { gen_helper_sve_st4bb_r, gen_helper_sve_st4hh_r,
+          gen_helper_sve_st4ss_r, gen_helper_sve_st4dd_r },
+    };
+    gen_helper_gvec_mem *fn;
+
+    if (nreg == 0) {
+        /* ST1 */
+        fn = fn_single[msz][esz];
+        if (fn == NULL) {
+            unallocated_encoding(s);
+            return;
+        }
+    } else {
+        /* ST2, ST3, ST4 -- msz == esz, enforced by encoding */
+        assert(msz == esz);
+        fn = fn_multiple[nreg - 1][msz];
+    }
+    do_mem_zpa(s, zt, pg, addr, fn);
+}
+
+static void trans_ST_zprr(DisasContext *s, arg_rprr_store *a, uint32_t insn)
+{
+    TCGv_i64 addr;
+
+    if (a->rm == 31) {
+        unallocated_encoding(s);
+        return;
+    }
+
+    addr = new_tmp_a64(s);
+    tcg_gen_muli_i64(addr, cpu_reg(s, a->rm), (a->nreg + 1) << a->msz);
+    tcg_gen_add_i64(addr, addr, cpu_reg_sp(s, a->rn));
+    do_st_zpa(s, a->rd, a->pg, addr, a->msz, a->esz, a->nreg);
+}
+
+static void trans_ST_zpri(DisasContext *s, arg_rpri_store *a, uint32_t insn)
+{
+    unsigned vsz = vec_full_reg_size(s);
+    unsigned elements = vsz >> a->esz;
+    TCGv_i64 addr = new_tmp_a64(s);
+
+    tcg_gen_addi_i64(addr, cpu_reg_sp(s, a->rn),
+                     (a->imm * elements * (a->nreg + 1)) << a->msz);
+    do_st_zpa(s, a->rd, a->pg, addr, a->msz, a->esz, a->nreg);
+}
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index d2b3869c58..41b8cd8746 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -28,6 +28,7 @@
 %imm8_16_10 16:5 10:3
 %imm9_16_10 16:s6 10:3
 %preg4_5 5:4
+%size_23 23:2
 
 # A combination of tsz:imm3 -- extract esize.
%tszimm_esz 22:2 5:5 !function=tszimm_esz @@ -77,6 +78,8 @@ &incdec2_pred rd rn pg esz d u &rprr_load rd pg rn rm dtype nreg &rpri_load rd pg rn imm dtype nreg +&rprr_store rd pg rn rm msz esz nreg +&rpri_store rd pg rn imm msz esz nreg ########################################################################### # Named instruction formats. These are generally used to @@ -185,6 +188,12 @@ @rpri_load_msz ....... .... . imm:s4 ... pg:3 rn:5 rd:5 \ &rpri_load dtype=%msz_dtype +# Stores; user must fill in ESZ, MSZ, NREG as needed. +@rprr_store ....... .. .. rm:5 ... pg:3 rn:5 rd:5 &rprr_store +@rpri_store_msz ....... msz:2 .. . imm:s4 ... pg:3 rn:5 rd:5 &rpri_store +@rprr_store_esz_n0 ....... .. esz:2 rm:5 ... pg:3 rn:5 rd:5 \ + &rprr_store nreg=0 + ########################################################################### # Instruction patterns. Grouped according to the SVE encodingindex.xhtml. @@ -713,3 +722,32 @@ LD_zprr 1010010 .. nreg:2 ..... 110 ... ..... ..... @rprr_load_msz # SVE load multiple structures (scalar plus immediate) # LD2B, LD2H, LD2W, LD2D; etc. LD_zpri 1010010 .. nreg:2 0.... 111 ... ..... ..... @rpri_load_msz + +### SVE Memory Store Group + +# SVE contiguous store (scalar plus immediate) +# ST1B, ST1H, ST1W, ST1D; require msz <= esz +ST_zpri 1110010 .. esz:2 0.... 111 ... ..... ..... \ + @rpri_store_msz nreg=0 + +# SVE contiguous store (scalar plus scalar) +# ST1B, ST1H, ST1W, ST1D; require msz <= esz +# Enumerate msz lest we conflict with STR_zri. +ST_zprr 1110010 00 .. ..... 010 ... ..... ..... \ + @rprr_store_esz_n0 msz=0 +ST_zprr 1110010 01 .. ..... 010 ... ..... ..... \ + @rprr_store_esz_n0 msz=1 +ST_zprr 1110010 10 .. ..... 010 ... ..... ..... \ + @rprr_store_esz_n0 msz=2 +ST_zprr 1110010 11 11 ..... 010 ... ..... ..... \ + @rprr_store msz=3 esz=3 nreg=0 + +# SVE contiguous non-temporal store (scalar plus immediate) (nreg == 0) +# SVE store multiple structures (scalar plus immediate) (nreg != 0) +ST_zpri 1110010 .. nreg:2 1.... 111 ... 
..... ..... \
+        @rpri_store_msz esz=%size_23
+
+# SVE contiguous non-temporal store (scalar plus scalar) (nreg == 0)
+# SVE store multiple structures (scalar plus scalar) (nreg != 0)
+ST_zprr 1110010 msz:2 nreg:2 ..... 011 ... ..... ..... \
+        @rprr_store esz=%size_23

From patchwork Sat Feb 17 18:23:02 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 128729
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Sat, 17 Feb 2018 10:23:02 -0800
Message-Id: <20180217182323.25885-47-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.14.3
In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org>
References: <20180217182323.25885-1-richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH v2 46/67] target/arm: Implement SVE load and broadcast quadword
Cc: qemu-arm@nongnu.org

Signed-off-by: Richard Henderson
---
 target/arm/translate-sve.c | 51 ++++++++++++++++++++++++++++++++++++++++++++++
 target/arm/sve.decode | 9 ++++++++
 2 files changed, 60 insertions(+)

--
2.14.3
Reviewed-by: Peter Maydell

diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index fda9a56fd5..7b21102b7e 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -3398,6 +3398,57 @@ static void trans_LDNF1_zpri(DisasContext *s, arg_rpri_load *a, uint32_t insn)
     trans_LD_zpri(s, a, insn);
 }
 
+static void do_ldrq(DisasContext *s, int zt, int pg, TCGv_i64 addr, int msz)
+{
+    static gen_helper_gvec_mem * const fns[4] = {
+        gen_helper_sve_ld1bb_r, gen_helper_sve_ld1hh_r,
+        gen_helper_sve_ld1ss_r, gen_helper_sve_ld1dd_r,
+    };
+    unsigned vsz = vec_full_reg_size(s);
+    TCGv_ptr t_pg;
+    TCGv_i32 desc;
+
+    /* Load the first quadword using the normal predicated load helpers. */
+    desc = tcg_const_i32(simd_desc(16, 16, zt));
+    t_pg = tcg_temp_new_ptr();
+
+    tcg_gen_addi_ptr(t_pg, cpu_env, pred_full_reg_offset(s, pg));
+    fns[msz](cpu_env, t_pg, addr, desc);
+
+    tcg_temp_free_ptr(t_pg);
+    tcg_temp_free_i32(desc);
+
+    /* Replicate that first quadword. */
+    if (vsz > 16) {
+        unsigned dofs = vec_full_reg_offset(s, zt);
+        tcg_gen_gvec_dup_mem(4, dofs + 16, dofs, vsz - 16, vsz - 16);
+    }
+}
+
+static void trans_LD1RQ_zprr(DisasContext *s, arg_rprr_load *a, uint32_t insn)
+{
+    TCGv_i64 addr;
+    int msz = dtype_msz(a->dtype);
+
+    if (a->rm == 31) {
+        unallocated_encoding(s);
+        return;
+    }
+
+    addr = new_tmp_a64(s);
+    tcg_gen_shli_i64(addr, cpu_reg(s, a->rm), msz);
+    tcg_gen_add_i64(addr, addr, cpu_reg_sp(s, a->rn));
+    do_ldrq(s, a->rd, a->pg, addr, msz);
+}
+
+static void trans_LD1RQ_zpri(DisasContext *s, arg_rpri_load *a, uint32_t insn)
+{
+    TCGv_i64 addr = new_tmp_a64(s);
+
+    tcg_gen_addi_i64(addr, cpu_reg_sp(s, a->rn), a->imm * 16);
+    do_ldrq(s, a->rd, a->pg, addr, dtype_msz(a->dtype));
+}
+
 static void do_st_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr,
                       int msz, int esz, int nreg)
 {
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index 41b8cd8746..6c906e25e9 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -723,6 +723,15 @@ LD_zprr 1010010 .. nreg:2 ..... 110 ... ..... ..... @rprr_load_msz
 # LD2B, LD2H, LD2W, LD2D; etc.
 LD_zpri 1010010 .. nreg:2 0.... 111 ... ..... ..... @rpri_load_msz
 
+# SVE load and broadcast quadword (scalar plus scalar)
+LD1RQ_zprr 1010010 .. 00 ..... 000 ... ..... ..... \
+        @rprr_load_msz nreg=0
+
+# SVE load and broadcast quadword (scalar plus immediate)
+# LD1RQB, LD1RQH, LD1RQW, LD1RQD
+LD1RQ_zpri 1010010 .. 00 0.... 001 ... ..... ..... \
+        @rpri_load_msz nreg=0
+
 ### SVE Memory Store Group
 
 # SVE contiguous store (scalar plus immediate)

From patchwork Sat Feb 17 18:23:03 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 128703
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Sat, 17 Feb 2018 10:23:03 -0800
Message-Id: <20180217182323.25885-48-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.14.3
In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org>
References: <20180217182323.25885-1-richard.henderson@linaro.org>
X-Received-From: 2607:f8b0:400e:c01::241 Subject: [Qemu-devel] [PATCH v2 47/67] target/arm: Implement SVE integer convert to floating-point X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 30 +++++++++++++++ target/arm/sve_helper.c | 52 ++++++++++++++++++++++++++ target/arm/translate-sve.c | 92 ++++++++++++++++++++++++++++++++++++++++++++++ target/arm/sve.decode | 22 +++++++++++ 4 files changed, 196 insertions(+) -- 2.14.3 Reviewed-by: Peter Maydell diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 74c2d642a3..fb7609f9ef 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -720,6 +720,36 @@ DEF_HELPER_FLAGS_5(gvec_rsqrts_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(gvec_rsqrts_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_scvt_hh, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_scvt_sh, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_scvt_dh, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_scvt_ss, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_scvt_sd, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_scvt_ds, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_scvt_dd, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_ucvt_hh, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_ucvt_sh, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_ucvt_dh, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_ucvt_ss, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_ucvt_sd, 
TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_ucvt_ds, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_ucvt_dd, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_4(sve_ld1bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) DEF_HELPER_FLAGS_4(sve_ld2bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) DEF_HELPER_FLAGS_4(sve_ld3bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index e259e910de..a1e0ceb5fb 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -2789,6 +2789,58 @@ uint32_t HELPER(sve_while)(void *vd, uint32_t count, uint32_t pred_desc) return predtest_ones(d, oprsz, esz_mask); } +/* Fully general two-operand expander, controlled by a predicate, + * With the extra float_status parameter. + */ +#define DO_ZPZ_FP(NAME, TYPE, H, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vg, void *status, uint32_t desc) \ +{ \ + intptr_t i, opr_sz = simd_oprsz(desc); \ + for (i = 0; i < opr_sz; ) { \ + uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \ + do { \ + if (pg & 1) { \ + TYPE nn = *(TYPE *)(vn + H(i)); \ + *(TYPE *)(vd + H(i)) = OP(nn, status); \ + } \ + i += sizeof(TYPE), pg >>= sizeof(TYPE); \ + } while (i & 15); \ + } \ +} + +/* Similarly, specialized for 64-bit operands. 
*/ +#define DO_ZPZ_FP_D(NAME, TYPE, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vg, void *status, uint32_t desc) \ +{ \ + intptr_t i, opr_sz = simd_oprsz(desc) / 8; \ + TYPE *d = vd, *n = vn; \ + uint8_t *pg = vg; \ + for (i = 0; i < opr_sz; i += 1) { \ + if (pg[H1(i)] & 1) { \ + d[i] = OP(n[i], status); \ + } \ + } \ +} + +DO_ZPZ_FP(sve_scvt_hh, uint16_t, H1_2, int16_to_float16) +DO_ZPZ_FP(sve_scvt_sh, uint32_t, H1_4, int32_to_float16) +DO_ZPZ_FP(sve_scvt_ss, uint32_t, H1_4, int32_to_float32) +DO_ZPZ_FP_D(sve_scvt_sd, uint64_t, int32_to_float64) +DO_ZPZ_FP_D(sve_scvt_dh, uint64_t, int64_to_float16) +DO_ZPZ_FP_D(sve_scvt_ds, uint64_t, int64_to_float32) +DO_ZPZ_FP_D(sve_scvt_dd, uint64_t, int64_to_float64) + +DO_ZPZ_FP(sve_ucvt_hh, uint16_t, H1_2, uint16_to_float16) +DO_ZPZ_FP(sve_ucvt_sh, uint32_t, H1_4, uint32_to_float16) +DO_ZPZ_FP(sve_ucvt_ss, uint32_t, H1_4, uint32_to_float32) +DO_ZPZ_FP_D(sve_ucvt_sd, uint64_t, uint32_to_float64) +DO_ZPZ_FP_D(sve_ucvt_dh, uint64_t, uint64_to_float16) +DO_ZPZ_FP_D(sve_ucvt_ds, uint64_t, uint64_to_float32) +DO_ZPZ_FP_D(sve_ucvt_dd, uint64_t, uint64_to_float64) + +#undef DO_ZPZ_FP +#undef DO_ZPZ_FP_D + /* * Load contiguous data, protected by a governing predicate. 
*/ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 7b21102b7e..05c684222e 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3161,6 +3161,98 @@ DO_FP3(FRSQRTS, rsqrts) #undef DO_FP3 + +/* + *** SVE Floating Point Unary Operations Prediated Group + */ + +static void do_zpz_ptr(DisasContext *s, int rd, int rn, int pg, + bool is_fp16, gen_helper_gvec_3_ptr *fn) +{ + unsigned vsz = vec_full_reg_size(s); + TCGv_ptr status; + + if (fn == NULL) { + unallocated_encoding(s); + return; + } + status = get_fpstatus_ptr(is_fp16); + tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, rd), + vec_full_reg_offset(s, rn), + pred_full_reg_offset(s, pg), + status, vsz, vsz, 0, fn); +} + +static void trans_SCVTF_hh(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_scvt_hh); +} + +static void trans_SCVTF_sh(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_scvt_sh); +} + +static void trans_SCVTF_dh(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_scvt_dh); +} + +static void trans_SCVTF_ss(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_scvt_ss); +} + +static void trans_SCVTF_ds(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_scvt_ds); +} + +static void trans_SCVTF_sd(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_scvt_sd); +} + +static void trans_SCVTF_dd(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_scvt_dd); +} + +static void trans_UCVTF_hh(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_ucvt_hh); +} + +static void trans_UCVTF_sh(DisasContext *s, arg_rpr_esz *a, 
uint32_t insn) +{ + do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_ucvt_sh); +} + +static void trans_UCVTF_dh(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_ucvt_dh); +} + +static void trans_UCVTF_ss(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_ucvt_ss); +} + +static void trans_UCVTF_ds(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_ucvt_ds); +} + +static void trans_UCVTF_sd(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_ucvt_sd); +} + +static void trans_UCVTF_dd(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_ucvt_dd); +} + /* *** SVE Memory - 32-bit Gather and Unsized Contiguous Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 6c906e25e9..b571b70050 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -134,6 +134,9 @@ @rd_pg_rn ........ esz:2 ... ... ... pg:3 rn:5 rd:5 &rpr_esz @rd_pg4_pn ........ esz:2 ... ... .. pg:4 . rn:4 rd:5 &rpr_esz +# One register operand, with governing predicate, no vector element size +@rd_pg_rn_e0 ........ .. ... ... ... pg:3 rn:5 rd:5 &rpr_esz esz=0 + # Two register operands with a 6-bit signed immediate. @rd_rn_i6 ........ ... rn:5 ..... imm:s6 rd:5 &rri @@ -689,6 +692,25 @@ FTSMUL 01100101 .. 0 ..... 000 011 ..... ..... @rd_rn_rm FRECPS 01100101 .. 0 ..... 000 110 ..... ..... @rd_rn_rm FRSQRTS 01100101 .. 0 ..... 000 111 ..... ..... @rd_rn_rm +### SVE FP Unary Operations Predicated Group + +# SVE integer convert to floating-point +SCVTF_hh 01100101 01 010 01 0 101 ... ..... ..... @rd_pg_rn_e0 +SCVTF_sh 01100101 01 010 10 0 101 ... ..... ..... @rd_pg_rn_e0 +SCVTF_dh 01100101 01 010 11 0 101 ... ..... ..... @rd_pg_rn_e0 +SCVTF_ss 01100101 10 010 10 0 101 ... ..... ..... 
@rd_pg_rn_e0 +SCVTF_sd 01100101 11 010 00 0 101 ... ..... ..... @rd_pg_rn_e0 +SCVTF_ds 01100101 11 010 10 0 101 ... ..... ..... @rd_pg_rn_e0 +SCVTF_dd 01100101 11 010 11 0 101 ... ..... ..... @rd_pg_rn_e0 + +UCVTF_hh 01100101 01 010 01 1 101 ... ..... ..... @rd_pg_rn_e0 +UCVTF_sh 01100101 01 010 10 1 101 ... ..... ..... @rd_pg_rn_e0 +UCVTF_dh 01100101 01 010 11 1 101 ... ..... ..... @rd_pg_rn_e0 +UCVTF_ss 01100101 10 010 10 1 101 ... ..... ..... @rd_pg_rn_e0 +UCVTF_sd 01100101 11 010 00 1 101 ... ..... ..... @rd_pg_rn_e0 +UCVTF_ds 01100101 11 010 10 1 101 ... ..... ..... @rd_pg_rn_e0 +UCVTF_dd 01100101 11 010 11 1 101 ... ..... ..... @rd_pg_rn_e0 + ### SVE Memory - 32-bit Gather and Unsized Contiguous Group # SVE load predicate register From patchwork Sat Feb 17 18:23:04 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 128704 
From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:23:04 
-0800 Message-Id: <20180217182323.25885-49-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c01::243 Subject: [Qemu-devel] [PATCH v2 48/67] target/arm: Implement SVE floating-point arithmetic (predicated) X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 77 ++++++++++++++++++++++++++++++++ target/arm/sve_helper.c | 107 +++++++++++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 47 ++++++++++++++++++++ target/arm/sve.decode | 17 +++++++ 4 files changed, 248 insertions(+) -- 2.14.3 Reviewed-by: Peter Maydell diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index fb7609f9ef..84d0a8978c 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -720,6 +720,83 @@ DEF_HELPER_FLAGS_5(gvec_rsqrts_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(gvec_rsqrts_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fadd_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fadd_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fadd_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fsub_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fsub_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fsub_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fmul_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) 
+DEF_HELPER_FLAGS_6(sve_fmul_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmul_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fdiv_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fdiv_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fdiv_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fmin_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmin_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmin_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fmax_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmax_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmax_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fminnum_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fminnum_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fminnum_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fmaxnum_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmaxnum_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmaxnum_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fabd_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fabd_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fabd_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fscalbn_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fscalbn_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fscalbn_d, 
TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve_fmulx_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmulx_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve_fmulx_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_scvt_hh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_scvt_sh, TCG_CALL_NO_RWG, diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index a1e0ceb5fb..d80babfae7 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -2789,6 +2789,113 @@ uint32_t HELPER(sve_while)(void *vd, uint32_t count, uint32_t pred_desc) return predtest_ones(d, oprsz, esz_mask); } +/* Fully general three-operand expander, controlled by a predicate, + * With the extra float_status parameter. + */ +#define DO_ZPZZ_FP(NAME, TYPE, H, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, \ + void *status, uint32_t desc) \ +{ \ + intptr_t i, opr_sz = simd_oprsz(desc); \ + for (i = 0; i < opr_sz; ) { \ + uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \ + do { \ + if (pg & 1) { \ + TYPE nn = *(TYPE *)(vn + H(i)); \ + TYPE mm = *(TYPE *)(vm + H(i)); \ + *(TYPE *)(vd + H(i)) = OP(nn, mm, status); \ + } \ + i += sizeof(TYPE), pg >>= sizeof(TYPE); \ + } while (i & 15); \ + } \ +} + +/* Similarly, specialized for 64-bit operands. 
*/ +#define DO_ZPZZ_FP_D(NAME, TYPE, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, \ + void *status, uint32_t desc) \ +{ \ + intptr_t i, opr_sz = simd_oprsz(desc) / 8; \ + TYPE *d = vd, *n = vn, *m = vm; \ + uint8_t *pg = vg; \ + for (i = 0; i < opr_sz; i += 1) { \ + if (pg[H1(i)] & 1) { \ + d[i] = OP(n[i], m[i], status); \ + } \ + } \ +} + +DO_ZPZZ_FP(sve_fadd_h, uint16_t, H1_2, float16_add) +DO_ZPZZ_FP(sve_fadd_s, uint32_t, H1_4, float32_add) +DO_ZPZZ_FP_D(sve_fadd_d, uint64_t, float64_add) + +DO_ZPZZ_FP(sve_fsub_h, uint16_t, H1_2, float16_sub) +DO_ZPZZ_FP(sve_fsub_s, uint32_t, H1_4, float32_sub) +DO_ZPZZ_FP_D(sve_fsub_d, uint64_t, float64_sub) + +DO_ZPZZ_FP(sve_fmul_h, uint16_t, H1_2, float16_mul) +DO_ZPZZ_FP(sve_fmul_s, uint32_t, H1_4, float32_mul) +DO_ZPZZ_FP_D(sve_fmul_d, uint64_t, float64_mul) + +DO_ZPZZ_FP(sve_fdiv_h, uint16_t, H1_2, float16_div) +DO_ZPZZ_FP(sve_fdiv_s, uint32_t, H1_4, float32_div) +DO_ZPZZ_FP_D(sve_fdiv_d, uint64_t, float64_div) + +DO_ZPZZ_FP(sve_fmin_h, uint16_t, H1_2, float16_min) +DO_ZPZZ_FP(sve_fmin_s, uint32_t, H1_4, float32_min) +DO_ZPZZ_FP_D(sve_fmin_d, uint64_t, float64_min) + +DO_ZPZZ_FP(sve_fmax_h, uint16_t, H1_2, float16_max) +DO_ZPZZ_FP(sve_fmax_s, uint32_t, H1_4, float32_max) +DO_ZPZZ_FP_D(sve_fmax_d, uint64_t, float64_max) + +DO_ZPZZ_FP(sve_fminnum_h, uint16_t, H1_2, float16_minnum) +DO_ZPZZ_FP(sve_fminnum_s, uint32_t, H1_4, float32_minnum) +DO_ZPZZ_FP_D(sve_fminnum_d, uint64_t, float64_minnum) + +DO_ZPZZ_FP(sve_fmaxnum_h, uint16_t, H1_2, float16_maxnum) +DO_ZPZZ_FP(sve_fmaxnum_s, uint32_t, H1_4, float32_maxnum) +DO_ZPZZ_FP_D(sve_fmaxnum_d, uint64_t, float64_maxnum) + +static inline uint16_t abd_h(float16 a, float16 b, float_status *s) +{ + return float16_abs(float16_sub(a, b, s)); + +} + +static inline uint32_t abd_s(float32 a, float32 b, float_status *s) +{ + return float32_abs(float32_sub(a, b, s)); + +} + +static inline uint64_t abd_d(float64 a, float64 b, float_status *s) +{ + return 
float64_abs(float64_sub(a, b, s)); + +} + +DO_ZPZZ_FP(sve_fabd_h, uint16_t, H1_2, abd_h) +DO_ZPZZ_FP(sve_fabd_s, uint32_t, H1_4, abd_s) +DO_ZPZZ_FP_D(sve_fabd_d, uint64_t, abd_d) + +static inline uint64_t scalbn_d(float64 a, int64_t b, float_status *s) +{ + int b_int = MIN(MAX(b, INT_MIN), INT_MAX); + return float64_scalbn(a, b_int, s); +} + +DO_ZPZZ_FP(sve_fscalbn_h, int16_t, H1_2, float16_scalbn) +DO_ZPZZ_FP(sve_fscalbn_s, int32_t, H1_4, float32_scalbn) +DO_ZPZZ_FP_D(sve_fscalbn_d, uint64_t, scalbn_d) + +DO_ZPZZ_FP(sve_fmulx_h, uint16_t, H1_2, helper_advsimd_mulxh) +DO_ZPZZ_FP(sve_fmulx_s, uint32_t, H1_4, helper_vfp_mulxs) +DO_ZPZZ_FP_D(sve_fmulx_d, uint64_t, helper_vfp_mulxd) + +#undef DO_ZPZZ_FP +#undef DO_ZPZZ_FP_D + /* Fully general two-operand expander, controlled by a predicate, * With the extra float_status parameter. */ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 05c684222e..1692980d20 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3161,6 +3161,52 @@ DO_FP3(FRSQRTS, rsqrts) #undef DO_FP3 +/* + *** SVE Floating Point Arithmetic - Predicated Group + */ + +static void do_zpzz_fp(DisasContext *s, arg_rprr_esz *a, + gen_helper_gvec_4_ptr *fn) +{ + unsigned vsz = vec_full_reg_size(s); + TCGv_ptr status; + + if (fn == NULL) { + unallocated_encoding(s); + return; + } + status = get_fpstatus_ptr(a->esz == MO_16); + tcg_gen_gvec_4_ptr(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + pred_full_reg_offset(s, a->pg), + status, vsz, vsz, 0, fn); + tcg_temp_free_ptr(status); +} + +#define DO_FP3(NAME, name) \ +static void trans_##NAME(DisasContext *s, arg_rprr_esz *a, uint32_t insn) \ +{ \ + static gen_helper_gvec_4_ptr * const fns[4] = { \ + NULL, gen_helper_sve_##name##_h, \ + gen_helper_sve_##name##_s, gen_helper_sve_##name##_d \ + }; \ + do_zpzz_fp(s, a, fns[a->esz]); \ +} + +DO_FP3(FADD_zpzz, fadd) +DO_FP3(FSUB_zpzz, fsub) +DO_FP3(FMUL_zpzz, fmul) 
+DO_FP3(FMIN_zpzz, fmin) +DO_FP3(FMAX_zpzz, fmax) +DO_FP3(FMINNM_zpzz, fminnum) +DO_FP3(FMAXNM_zpzz, fmaxnum) +DO_FP3(FABD, fabd) +DO_FP3(FSCALE, fscalbn) +DO_FP3(FDIV, fdiv) +DO_FP3(FMULX, fmulx) + +#undef DO_FP3 /* *** SVE Floating Point Unary Operations Prediated Group @@ -3181,6 +3227,7 @@ static void do_zpz_ptr(DisasContext *s, int rd, int rn, int pg, vec_full_reg_offset(s, rn), pred_full_reg_offset(s, pg), status, vsz, vsz, 0, fn); + tcg_temp_free_ptr(status); } static void trans_SCVTF_hh(DisasContext *s, arg_rpr_esz *a, uint32_t insn) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index b571b70050..1a13c603ff 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -692,6 +692,23 @@ FTSMUL 01100101 .. 0 ..... 000 011 ..... ..... @rd_rn_rm FRECPS 01100101 .. 0 ..... 000 110 ..... ..... @rd_rn_rm FRSQRTS 01100101 .. 0 ..... 000 111 ..... ..... @rd_rn_rm +### SVE FP Arithmetic Predicated Group + +# SVE floating-point arithmetic (predicated) +FADD_zpzz 01100101 .. 00 0000 100 ... ..... ..... @rdn_pg_rm +FSUB_zpzz 01100101 .. 00 0001 100 ... ..... ..... @rdn_pg_rm +FMUL_zpzz 01100101 .. 00 0010 100 ... ..... ..... @rdn_pg_rm +FSUB_zpzz 01100101 .. 00 0011 100 ... ..... ..... @rdm_pg_rn # FSUBR +FMAXNM_zpzz 01100101 .. 00 0100 100 ... ..... ..... @rdn_pg_rm +FMINNM_zpzz 01100101 .. 00 0101 100 ... ..... ..... @rdn_pg_rm +FMAX_zpzz 01100101 .. 00 0110 100 ... ..... ..... @rdn_pg_rm +FMIN_zpzz 01100101 .. 00 0111 100 ... ..... ..... @rdn_pg_rm +FABD 01100101 .. 00 1000 100 ... ..... ..... @rdn_pg_rm +FSCALE 01100101 .. 00 1001 100 ... ..... ..... @rdn_pg_rm +FMULX 01100101 .. 00 1010 100 ... ..... ..... @rdn_pg_rm +FDIV 01100101 .. 00 1100 100 ... ..... ..... @rdm_pg_rn # FDIVR +FDIV 01100101 .. 00 1101 100 ... ..... ..... 
@rdn_pg_rm + ### SVE FP Unary Operations Predicated Group # SVE integer convert to floating-point From patchwork Sat Feb 17 18:23:05 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 128733 
From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:23:05 -0800 Message-Id: <20180217182323.25885-50-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. 
X-Received-From: 2607:f8b0:400e:c01::243 Subject: [Qemu-devel] [PATCH v2 49/67] target/arm: Implement SVE FP Multiply-Add Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 16 ++++++++++++++ target/arm/sve_helper.c | 53 ++++++++++++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 41 +++++++++++++++++++++++++++++++++++ target/arm/sve.decode | 17 +++++++++++++++ 4 files changed, 127 insertions(+) -- 2.14.3 Reviewed-by: Peter Maydell diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 84d0a8978c..a95f077c7f 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -827,6 +827,22 @@ DEF_HELPER_FLAGS_5(sve_ucvt_ds, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(sve_ucvt_dd, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_fmla_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(sve_fmla_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(sve_fmla_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32) + +DEF_HELPER_FLAGS_3(sve_fmls_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(sve_fmls_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(sve_fmls_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32) + +DEF_HELPER_FLAGS_3(sve_fnmla_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(sve_fnmla_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(sve_fnmla_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32) + +DEF_HELPER_FLAGS_3(sve_fnmls_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(sve_fnmls_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32) +DEF_HELPER_FLAGS_3(sve_fnmls_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32) + DEF_HELPER_FLAGS_4(sve_ld1bb_r, 
TCG_CALL_NO_WG, void, env, ptr, tl, i32) DEF_HELPER_FLAGS_4(sve_ld2bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) DEF_HELPER_FLAGS_4(sve_ld3bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index d80babfae7..6622275b44 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -2948,6 +2948,59 @@ DO_ZPZ_FP_D(sve_ucvt_dd, uint64_t, uint64_to_float64) #undef DO_ZPZ_FP #undef DO_ZPZ_FP_D +/* 4-operand predicated multiply-add. This requires 7 operands to pass + * "properly", so we need to encode some of the registers into DESC. + */ +QEMU_BUILD_BUG_ON(SIMD_DATA_SHIFT + 20 > 32); + +#define DO_FMLA(NAME, N, H, NEG1, NEG3) \ +void HELPER(NAME)(CPUARMState *env, void *vg, uint32_t desc) \ +{ \ + intptr_t i = 0, opr_sz = simd_oprsz(desc); \ + unsigned rd = extract32(desc, SIMD_DATA_SHIFT, 5); \ + unsigned rn = extract32(desc, SIMD_DATA_SHIFT + 5, 5); \ + unsigned rm = extract32(desc, SIMD_DATA_SHIFT + 10, 5); \ + unsigned ra = extract32(desc, SIMD_DATA_SHIFT + 15, 5); \ + void *vd = &env->vfp.zregs[rd]; \ + void *vn = &env->vfp.zregs[rn]; \ + void *vm = &env->vfp.zregs[rm]; \ + void *va = &env->vfp.zregs[ra]; \ + do { \ + uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \ + do { \ + if (likely(pg & 1)) { \ + float##N e1 = *(uint##N##_t *)(vn + H(i)); \ + float##N e2 = *(uint##N##_t *)(vm + H(i)); \ + float##N e3 = *(uint##N##_t *)(va + H(i)); \ + float##N r; \ + if (NEG1) e1 = float##N##_chs(e1); \ + if (NEG3) e3 = float##N##_chs(e3); \ + r = float##N##_muladd(e1, e2, e3, 0, &env->vfp.fp_status); \ + *(uint##N##_t *)(vd + H(i)) = r; \ + } \ + i += sizeof(float##N), pg >>= sizeof(float##N); \ + } while (i & 15); \ + } while (i < opr_sz); \ +} + +DO_FMLA(sve_fmla_zpzzz_h, 16, H1_2, 0, 0) +DO_FMLA(sve_fmla_zpzzz_s, 32, H1_4, 0, 0) +DO_FMLA(sve_fmla_zpzzz_d, 64, , 0, 0) + +DO_FMLA(sve_fmls_zpzzz_h, 16, H1_2, 1, 0) +DO_FMLA(sve_fmls_zpzzz_s, 32, H1_4, 1, 0) +DO_FMLA(sve_fmls_zpzzz_d, 64, , 1, 0) +
+DO_FMLA(sve_fnmla_zpzzz_h, 16, H1_2, 1, 1) +DO_FMLA(sve_fnmla_zpzzz_s, 32, H1_4, 1, 1) +DO_FMLA(sve_fnmla_zpzzz_d, 64, , 1, 1) + +DO_FMLA(sve_fnmls_zpzzz_h, 16, H1_2, 0, 1) +DO_FMLA(sve_fnmls_zpzzz_s, 32, H1_4, 0, 1) +DO_FMLA(sve_fnmls_zpzzz_d, 64, , 0, 1) + +#undef DO_FMLA + /* * Load contiguous data, protected by a governing predicate. */ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 1692980d20..3124368fb5 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3208,6 +3208,47 @@ DO_FP3(FMULX, fmulx) #undef DO_FP3 +typedef void gen_helper_sve_fmla(TCGv_env, TCGv_ptr, TCGv_i32); + +static void do_fmla(DisasContext *s, arg_rprrr_esz *a, gen_helper_sve_fmla *fn) +{ + unsigned vsz = vec_full_reg_size(s); + unsigned desc; + TCGv_i32 t_desc; + TCGv_ptr pg = tcg_temp_new_ptr(); + + /* We would need 7 operands to pass these arguments "properly". + * So we encode all the register numbers into the descriptor. + */ + desc = deposit32(a->rd, 5, 5, a->rn); + desc = deposit32(desc, 10, 5, a->rm); + desc = deposit32(desc, 15, 5, a->ra); + desc = simd_desc(vsz, vsz, desc); + + t_desc = tcg_const_i32(desc); + tcg_gen_addi_ptr(pg, cpu_env, pred_full_reg_offset(s, a->pg)); + fn(cpu_env, pg, t_desc); + tcg_temp_free_i32(t_desc); + tcg_temp_free_ptr(pg); +} + +#define DO_FMLA(NAME, name) \ +static void trans_##NAME(DisasContext *s, arg_rprrr_esz *a, uint32_t insn) \ +{ \ + static gen_helper_sve_fmla * const fns[4] = { \ + NULL, gen_helper_sve_##name##_h, \ + gen_helper_sve_##name##_s, gen_helper_sve_##name##_d \ + }; \ + do_fmla(s, a, fns[a->esz]); \ +} + +DO_FMLA(FMLA_zpzzz, fmla_zpzzz) +DO_FMLA(FMLS_zpzzz, fmls_zpzzz) +DO_FMLA(FNMLA_zpzzz, fnmla_zpzzz) +DO_FMLA(FNMLS_zpzzz, fnmls_zpzzz) + +#undef DO_FMLA + /* *** SVE Floating Point Unary Operations Prediated Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 1a13c603ff..817833f96e 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -129,6 +129,8
@@ &rprrr_esz ra=%reg_movprfx @rdn_pg_ra_rm ........ esz:2 . rm:5 ... pg:3 ra:5 rd:5 \ &rprrr_esz rn=%reg_movprfx +@rdn_pg_rm_ra ........ esz:2 . ra:5 ... pg:3 rm:5 rd:5 \ + &rprrr_esz rn=%reg_movprfx # One register operand, with governing predicate, vector element size @rd_pg_rn ........ esz:2 ... ... ... pg:3 rn:5 rd:5 &rpr_esz @@ -709,6 +711,21 @@ FMULX 01100101 .. 00 1010 100 ... ..... ..... @rdn_pg_rm FDIV 01100101 .. 00 1100 100 ... ..... ..... @rdm_pg_rn # FDIVR FDIV 01100101 .. 00 1101 100 ... ..... ..... @rdn_pg_rm +### SVE FP Multiply-Add Group + +# SVE floating-point multiply-accumulate writing addend +FMLA_zpzzz 01100101 .. 1 ..... 000 ... ..... ..... @rda_pg_rn_rm +FMLS_zpzzz 01100101 .. 1 ..... 001 ... ..... ..... @rda_pg_rn_rm +FNMLA_zpzzz 01100101 .. 1 ..... 010 ... ..... ..... @rda_pg_rn_rm +FNMLS_zpzzz 01100101 .. 1 ..... 011 ... ..... ..... @rda_pg_rn_rm + +# SVE floating-point multiply-accumulate writing multiplicand +# FMAD, FMSB, FNMAD, FNMSB +FMLA_zpzzz 01100101 .. 1 ..... 100 ... ..... ..... @rdn_pg_rm_ra +FMLS_zpzzz 01100101 .. 1 ..... 101 ... ..... ..... @rdn_pg_rm_ra +FNMLA_zpzzz 01100101 .. 1 ..... 110 ... ..... ..... @rdn_pg_rm_ra +FNMLS_zpzzz 01100101 .. 1 ..... 111 ... ..... .....
@rdn_pg_rm_ra + ### SVE FP Unary Operations Predicated Group # SVE integer convert to floating-point
From patchwork Sat Feb 17 18:23:06 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 128734
Delivered-To: patch@linaro.org
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Sat, 17 Feb 2018 10:23:06 -0800
Message-Id: <20180217182323.25885-51-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.14.3
In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org>
References: <20180217182323.25885-1-richard.henderson@linaro.org>
X-Received-From: 2607:f8b0:400e:c01::244 Subject: [Qemu-devel] [PATCH v2 50/67] target/arm: Implement SVE Floating Point Accumulating Reduction Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 7 ++++++ target/arm/sve_helper.c | 56 ++++++++++++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 42 ++++++++++++++++++++++++++++++++++ target/arm/sve.decode | 5 +++++ 4 files changed, 110 insertions(+) -- 2.14.3 Reviewed-by: Peter Maydell diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index a95f077c7f..c4502256d5 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -720,6 +720,13 @@ DEF_HELPER_FLAGS_5(gvec_rsqrts_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(gvec_rsqrts_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fadda_h, TCG_CALL_NO_RWG, + i64, i64, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fadda_s, TCG_CALL_NO_RWG, + i64, i64, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fadda_d, TCG_CALL_NO_RWG, + i64, i64, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_6(sve_fadd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_6(sve_fadd_s, TCG_CALL_NO_RWG, diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 6622275b44..0e2b3091b0 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -2789,6 +2789,62 @@ uint32_t HELPER(sve_while)(void *vd, uint32_t count, uint32_t pred_desc) return predtest_ones(d, oprsz, esz_mask); } +uint64_t HELPER(sve_fadda_h)(uint64_t nn, void *vm, void *vg, + void *status, uint32_t desc) +{ + intptr_t i = 0, opr_sz = simd_oprsz(desc); + float16 result = nn; + + do { + uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); + do { + if (pg & 1) { + float16 mm = *(float16 
*)(vm + H1_2(i)); + result = float16_add(result, mm, status); + } + i += sizeof(float16), pg >>= sizeof(float16); + } while (i & 15); + } while (i < opr_sz); + + return result; +} + +uint64_t HELPER(sve_fadda_s)(uint64_t nn, void *vm, void *vg, + void *status, uint32_t desc) +{ + intptr_t i = 0, opr_sz = simd_oprsz(desc); + float32 result = nn; + + do { + uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); + do { + if (pg & 1) { + float32 mm = *(float32 *)(vm + H1_4(i)); + result = float32_add(result, mm, status); + } + i += sizeof(float32), pg >>= sizeof(float32); + } while (i & 15); + } while (i < opr_sz); + + return result; +} + +uint64_t HELPER(sve_fadda_d)(uint64_t nn, void *vm, void *vg, + void *status, uint32_t desc) +{ + intptr_t i = 0, opr_sz = simd_oprsz(desc) / 8; + uint64_t *m = vm; + uint8_t *pg = vg; + + for (i = 0; i < opr_sz; i++) { + if (pg[H1(i)] & 1) { + nn = float64_add(nn, m[i], status); + } + } + + return nn; +} + /* Fully general three-operand expander, controlled by a predicate, * With the extra float_status parameter.
*/ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 3124368fb5..32f0340738 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3120,6 +3120,48 @@ DO_ZZI(UMIN, umin) #undef DO_ZZI +/* + *** SVE Floating Point Accumulating Reduction Group + */ + +static void trans_FADDA(DisasContext *s, arg_rprr_esz *a, uint32_t insn) +{ + typedef void fadda_fn(TCGv_i64, TCGv_i64, TCGv_ptr, + TCGv_ptr, TCGv_ptr, TCGv_i32); + static fadda_fn * const fns[3] = { + gen_helper_sve_fadda_h, + gen_helper_sve_fadda_s, + gen_helper_sve_fadda_d, + }; + unsigned vsz = vec_full_reg_size(s); + TCGv_ptr t_rm, t_pg, t_fpst; + TCGv_i64 t_val; + TCGv_i32 t_desc; + + if (a->esz == 0) { + unallocated_encoding(s); + return; + } + + t_val = load_esz(cpu_env, vec_reg_offset(s, a->rn, 0, a->esz), a->esz); + t_rm = tcg_temp_new_ptr(); + t_pg = tcg_temp_new_ptr(); + tcg_gen_addi_ptr(t_rm, cpu_env, vec_full_reg_offset(s, a->rm)); + tcg_gen_addi_ptr(t_pg, cpu_env, pred_full_reg_offset(s, a->pg)); + t_fpst = get_fpstatus_ptr(a->esz == MO_16); + t_desc = tcg_const_i32(simd_desc(vsz, vsz, 0)); + + fns[a->esz - 1](t_val, t_val, t_rm, t_pg, t_fpst, t_desc); + + tcg_temp_free_i32(t_desc); + tcg_temp_free_ptr(t_fpst); + tcg_temp_free_ptr(t_pg); + tcg_temp_free_ptr(t_rm); + + write_fp_dreg(s, a->rd, t_val); + tcg_temp_free_i64(t_val); +} + /* *** SVE Floating Point Arithmetic - Unpredicated Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 817833f96e..95a290aed0 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -684,6 +684,11 @@ UMIN_zzi 00100101 .. 101 011 110 ........ ..... @rdn_i8u # SVE integer multiply immediate (unpredicated) MUL_zzi 00100101 .. 110 000 110 ........ ..... @rdn_i8s +### SVE FP Accumulating Reduction Group + +# SVE floating-point serial reduction (predicated) +FADDA 01100101 .. 011 000 001 ... ..... ..... 
@rdn_pg_rm + ### SVE Floating Point Arithmetic - Unpredicated Group # SVE floating-point arithmetic (unpredicated)
From patchwork Sat Feb 17 18:23:07 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 128731
Delivered-To: patch@linaro.org
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Sat, 17 Feb 2018 10:23:07 -0800
Message-Id: <20180217182323.25885-52-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.14.3
In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org>
References: <20180217182323.25885-1-richard.henderson@linaro.org>
X-Received-From: 2607:f8b0:400e:c01::241 Subject: [Qemu-devel] [PATCH v2 51/67] target/arm: Implement SVE load and broadcast element X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 5 +++++ target/arm/sve_helper.c | 43 ++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 55 +++++++++++++++++++++++++++++++++++++++++++++- target/arm/sve.decode | 5 +++++ 4 files changed, 107 insertions(+), 1 deletion(-) -- 2.14.3 Reviewed-by: Peter Maydell diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index c4502256d5..6c640a92ff 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -274,6 +274,11 @@ DEF_HELPER_FLAGS_3(sve_clr_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve_clr_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve_clr_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_clri_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_clri_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_clri_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_clri_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + DEF_HELPER_FLAGS_4(sve_asr_zpzi_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_asr_zpzi_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_asr_zpzi_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 0e2b3091b0..a7dc6f6164 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -994,6 +994,49 @@ void HELPER(sve_clr_d)(void *vd, void *vg, uint32_t desc) } } +/* Store zero into every inactive element of Zd. 
*/ +void HELPER(sve_clri_b)(void *vd, void *vg, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc) / 8; + uint64_t *d = vd; + uint8_t *pg = vg; + for (i = 0; i < opr_sz; i += 1) { + d[i] &= expand_pred_b(pg[H1(i)]); + } +} + +void HELPER(sve_clri_h)(void *vd, void *vg, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc) / 8; + uint64_t *d = vd; + uint8_t *pg = vg; + for (i = 0; i < opr_sz; i += 1) { + d[i] &= expand_pred_h(pg[H1(i)]); + } +} + +void HELPER(sve_clri_s)(void *vd, void *vg, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc) / 8; + uint64_t *d = vd; + uint8_t *pg = vg; + for (i = 0; i < opr_sz; i += 1) { + d[i] &= expand_pred_s(pg[H1(i)]); + } +} + +void HELPER(sve_clri_d)(void *vd, void *vg, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc) / 8; + uint64_t *d = vd; + uint8_t *pg = vg; + for (i = 0; i < opr_sz; i += 1) { + if (!(pg[H1(i)] & 1)) { + d[i] = 0; + } + } +} + /* Three-operand expander, immediate operand, controlled by a predicate. */ #define DO_ZPZI(NAME, TYPE, H, OP) \ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 32f0340738..b000a2482e 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -584,6 +584,19 @@ static void do_clr_zp(DisasContext *s, int rd, int pg, int esz) vsz, vsz, 0, fns[esz]); } +/* Store zero into every inactive element of Zd. 
*/ +static void do_clr_inactive_zp(DisasContext *s, int rd, int pg, int esz) +{ + static gen_helper_gvec_2 * const fns[4] = { + gen_helper_sve_clri_b, gen_helper_sve_clri_h, + gen_helper_sve_clri_s, gen_helper_sve_clri_d, + }; + unsigned vsz = vec_full_reg_size(s); + tcg_gen_gvec_2_ool(vec_full_reg_offset(s, rd), + pred_full_reg_offset(s, pg), + vsz, vsz, 0, fns[esz]); +} + static void do_zpzi_ool(DisasContext *s, arg_rpri_esz *a, gen_helper_gvec_3 *fn) { @@ -3506,7 +3519,7 @@ static void trans_LDR_pri(DisasContext *s, arg_rri *a, uint32_t insn) *** SVE Memory - Contiguous Load Group */ -/* The memory element size of dtype. */ +/* The memory mode of the dtype. */ static const TCGMemOp dtype_mop[16] = { MO_UB, MO_UB, MO_UB, MO_UB, MO_SL, MO_UW, MO_UW, MO_UW, @@ -3671,6 +3684,46 @@ static void trans_LD1RQ_zpri(DisasContext *s, arg_rpri_load *a, uint32_t insn) do_ldrq(s, a->rd, a->pg, addr, dtype_msz(a->dtype)); } +/* Load and broadcast element. */ +static void trans_LD1R_zpri(DisasContext *s, arg_rpri_load *a, uint32_t insn) +{ + unsigned vsz = vec_full_reg_size(s); + unsigned psz = pred_full_reg_size(s); + unsigned esz = dtype_esz[a->dtype]; + TCGLabel *over = gen_new_label(); + TCGv_i64 temp; + + /* If the guarding predicate has no bits set, no load occurs. */ + if (psz <= 8) { + temp = tcg_temp_new_i64(); + tcg_gen_ld_i64(temp, cpu_env, pred_full_reg_offset(s, a->pg)); + tcg_gen_andi_i64(temp, temp, + deposit64(0, 0, psz * 8, pred_esz_masks[esz])); + tcg_gen_brcondi_i64(TCG_COND_EQ, temp, 0, over); + tcg_temp_free_i64(temp); + } else { + TCGv_i32 t32 = tcg_temp_new_i32(); + find_last_active(s, t32, esz, a->pg); + tcg_gen_brcondi_i32(TCG_COND_LT, t32, 0, over); + tcg_temp_free_i32(t32); + } + + /* Load the data. */ + temp = tcg_temp_new_i64(); + tcg_gen_addi_i64(temp, cpu_reg_sp(s, a->rn), a->imm << dtype_msz(a->dtype)); + tcg_gen_qemu_ld_i64(temp, temp, get_mem_index(s), + s->be_data | dtype_mop[a->dtype]); + + /* Broadcast to *all* elements.
*/ + tcg_gen_gvec_dup_i64(esz, vec_full_reg_offset(s, a->rd), + vsz, vsz, temp); + tcg_temp_free_i64(temp); + + /* Zero the inactive elements. */ + gen_set_label(over); + do_clr_inactive_zp(s, a->rd, a->pg, esz); +} + static void do_st_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr, int msz, int esz, int nreg) { diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 95a290aed0..3e30985a09 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -29,6 +29,7 @@ %imm9_16_10 16:s6 10:3 %preg4_5 5:4 %size_23 23:2 +%dtype_23_13 23:2 13:2 # A combination of tsz:imm3 -- extract esize. %tszimm_esz 22:2 5:5 !function=tszimm_esz @@ -758,6 +759,10 @@ LDR_pri 10000101 10 ...... 000 ... ..... 0 .... @pd_rn_i9 # SVE load vector register LDR_zri 10000101 10 ...... 010 ... ..... ..... @rd_rn_i9 +# SVE load and broadcast element +LD1R_zpri 1000010 .. 1 imm:6 1.. pg:3 rn:5 rd:5 \ + &rpri_load dtype=%dtype_23_13 nreg=0 + ### SVE Memory Contiguous Load Group # SVE contiguous load (scalar plus scalar)
From patchwork Sat Feb 17 18:23:08 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 128713
Delivered-To: patch@linaro.org
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Sat, 17 Feb 2018 10:23:08 -0800
Message-Id: <20180217182323.25885-53-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.14.3
In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org>
References: <20180217182323.25885-1-richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH v2 52/67] target/arm: Implement SVE store vector/predicate register
Cc: qemu-arm@nongnu.org

Signed-off-by: Richard Henderson
---
 target/arm/translate-sve.c | 101 +++++++++++++++++++++++++++++++++++++++++++++
 target/arm/sve.decode      |   6 +++
 2 files changed, 107 insertions(+)

-- 
2.14.3

diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index b000a2482e..9c724980a0 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -3501,6 +3501,95 @@ static void do_ldr(DisasContext *s, uint32_t vofs, uint32_t len,
     tcg_temp_free_i64(t0);
 }
 
+/* Similarly for stores. */
+static void do_str(DisasContext *s, uint32_t vofs, uint32_t len,
+                   int rn, int imm)
+{
+    uint32_t len_align = QEMU_ALIGN_DOWN(len, 8);
+    uint32_t len_remain = len % 8;
+    uint32_t nparts = len / 8 + ctpop8(len_remain);
+    int midx = get_mem_index(s);
+    TCGv_i64 addr, t0;
+
+    addr = tcg_temp_new_i64();
+    t0 = tcg_temp_new_i64();
+
+    /* Note that unpredicated load/store of vector/predicate registers
+     * are defined as a stream of bytes, which equates to little-endian
+     * operations on larger quantities. There is no nice way to force
+     * a little-endian store for aarch64_be-linux-user out of line.
+     *
+     * Attempt to keep code expansion to a minimum by limiting the
+     * amount of unrolling done.
+     */
+    if (nparts <= 4) {
+        int i;
+
+        for (i = 0; i < len_align; i += 8) {
+            tcg_gen_ld_i64(t0, cpu_env, vofs + i);
+            tcg_gen_addi_i64(addr, cpu_reg_sp(s, rn), imm + i);
+            tcg_gen_qemu_st_i64(t0, addr, midx, MO_LEQ);
+        }
+    } else {
+        TCGLabel *loop = gen_new_label();
+        TCGv_ptr i = TCGV_NAT_TO_PTR(glue(tcg_const_local_, ptr)(0));
+        TCGv_ptr src;
+
+        gen_set_label(loop);
+
+        src = tcg_temp_new_ptr();
+        tcg_gen_add_ptr(src, cpu_env, i);
+        tcg_gen_ld_i64(t0, src, vofs);
+
+        /* Minimize the number of local temps that must be re-read from
+         * the stack each iteration. Instead, re-compute values other
+         * than the loop counter.
+         */
+        tcg_gen_addi_ptr(src, i, imm);
+#if UINTPTR_MAX == UINT32_MAX
+        tcg_gen_extu_i32_i64(addr, TCGV_PTR_TO_NAT(src));
+        tcg_gen_add_i64(addr, addr, cpu_reg_sp(s, rn));
+#else
+        tcg_gen_add_i64(addr, TCGV_PTR_TO_NAT(src), cpu_reg_sp(s, rn));
+#endif
+        tcg_temp_free_ptr(src);
+
+        tcg_gen_qemu_st_i64(t0, addr, midx, MO_LEQ);
+
+        tcg_gen_addi_ptr(i, i, 8);
+
+        glue(tcg_gen_brcondi_, ptr)(TCG_COND_LTU, TCGV_PTR_TO_NAT(i),
+                                   len_align, loop);
+        tcg_temp_free_ptr(i);
+    }
+
+    /* Predicate register stores can be any multiple of 2. */
+    if (len_remain) {
+        tcg_gen_ld_i64(t0, cpu_env, vofs + len_align);
+        tcg_gen_addi_i64(addr, cpu_reg_sp(s, rn), imm + len_align);
+
+        switch (len_remain) {
+        case 2:
+        case 4:
+        case 8:
+            tcg_gen_qemu_st_i64(t0, addr, midx, MO_LE | ctz32(len_remain));
+            break;
+
+        case 6:
+            /* Store the low 4 bytes, then the next 2 bytes of the data. */
+            tcg_gen_qemu_st_i64(t0, addr, midx, MO_LEUL);
+            tcg_gen_addi_i64(addr, addr, 4);
+            tcg_gen_shri_i64(t0, t0, 32);
+            tcg_gen_qemu_st_i64(t0, addr, midx, MO_LEUW);
+            break;
+
+        default:
+            g_assert_not_reached();
+        }
+    }
+    tcg_temp_free_i64(addr);
+    tcg_temp_free_i64(t0);
+}
+
 #undef ptr
 
 static void trans_LDR_zri(DisasContext *s, arg_rri *a, uint32_t insn)
@@ -3515,6 +3604,18 @@ static void trans_LDR_pri(DisasContext *s, arg_rri *a, uint32_t insn)
     do_ldr(s, pred_full_reg_offset(s, a->rd), size, a->rn, a->imm * size);
 }
 
+static void trans_STR_zri(DisasContext *s, arg_rri *a, uint32_t insn)
+{
+    int size = vec_full_reg_size(s);
+    do_str(s, vec_full_reg_offset(s, a->rd), size, a->rn, a->imm * size);
+}
+
+static void trans_STR_pri(DisasContext *s, arg_rri *a, uint32_t insn)
+{
+    int size = pred_full_reg_size(s);
+    do_str(s, pred_full_reg_offset(s, a->rd), size, a->rn, a->imm * size);
+}
+
 /*
  *** SVE Memory - Contiguous Load Group
  */
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index 3e30985a09..5d8e1481d7 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -800,6 +800,12 @@ LD1RQ_zpri 1010010 .. 00 0.... 001 ... ..... ..... \
 
 ### SVE Memory Store Group
 
+# SVE store predicate register
+STR_pri 1110010 11 0. ..... 000 ... ..... 0 .... @pd_rn_i9
+
+# SVE store vector register
+STR_zri 1110010 11 0. ..... 010 ... ..... ..... @rd_rn_i9
+
 # SVE contiguous store (scalar plus immediate)
 # ST1B, ST1H, ST1W, ST1D; require msz <= esz
 ST_zpri 1110010 .. esz:2 0.... 111 ... ..... .....
\

From patchwork Sat Feb 17 18:23:09 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 128708
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Sat, 17 Feb 2018 10:23:09 -0800
Message-Id: <20180217182323.25885-54-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.14.3
In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org>
References: <20180217182323.25885-1-richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH v2 53/67] target/arm: Implement SVE scatter stores
Cc: qemu-arm@nongnu.org

Signed-off-by: Richard Henderson
---
 target/arm/helper-sve.h    | 41 ++++++++++++++++++++++++++
 target/arm/sve_helper.c    | 62 ++++++++++++++++++++++++++++++++++++++++
 target/arm/translate-sve.c | 71 ++++++++++++++++++++++++++++++++++++++++++++++
 target/arm/sve.decode      | 39 +++++++++++++++++++++++++
 4 files changed, 213 insertions(+)

-- 
2.14.3

Reviewed-by: Peter Maydell

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index 6c640a92ff..b5c093f2fd 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -918,3 +918,44 @@ DEF_HELPER_FLAGS_4(sve_st1hs_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
 
 DEF_HELPER_FLAGS_4(sve_st1hd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
 DEF_HELPER_FLAGS_4(sve_st1sd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_6(sve_stbs_zsu, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_sths_zsu, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_stss_zsu, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_6(sve_stbs_zss, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_sths_zss, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_stss_zss, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_6(sve_stbd_zsu, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_sthd_zsu, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_stsd_zsu, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_stdd_zsu, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_6(sve_stbd_zss, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_sthd_zss, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_stsd_zss, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_stdd_zss, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_6(sve_stbd_zd, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_sthd_zd, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_stsd_zd, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_stdd_zd, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index a7dc6f6164..07b3d285f2 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -3545,3 +3545,65 @@ void HELPER(sve_st4dd_r)(CPUARMState *env, void *vg,
         addr += 4 * 8;
     }
 }
+
+/* Stores with a vector index. */
+
+#define DO_ST1_ZPZ_S(NAME, TYPEI, FN) \
+void HELPER(NAME)(CPUARMState *env, void *vd, void *vg, void *vm, \
+                  target_ulong base, uint32_t desc) \
+{ \
+    intptr_t i, oprsz = simd_oprsz(desc) / 8; \
+    unsigned scale = simd_data(desc); \
+    uintptr_t ra = GETPC(); \
+    uint32_t *d = vd; TYPEI *m = vm; uint8_t *pg = vg; \
+    for (i = 0; i < oprsz; i++) { \
+        uint8_t pp = pg[H1(i)]; \
+        if (pp & 0x01) { \
+            target_ulong off = (target_ulong)m[H4(i * 2)] << scale; \
+            FN(env, base + off, d[H4(i * 2)], ra); \
+        } \
+        if (pp & 0x10) { \
+            target_ulong off = (target_ulong)m[H4(i * 2 + 1)] << scale; \
+            FN(env, base + off, d[H4(i * 2 + 1)], ra); \
+        } \
+    } \
+}
+
+#define DO_ST1_ZPZ_D(NAME, TYPEI, FN) \
+void HELPER(NAME)(CPUARMState *env, void *vd, void *vg, void *vm, \
+                  target_ulong base, uint32_t desc) \
+{ \
+    intptr_t i, oprsz = simd_oprsz(desc) / 8; \
+    unsigned scale = simd_data(desc); \
+    uintptr_t ra = GETPC(); \
+    uint64_t *d = vd, *m = vm; uint8_t *pg = vg; \
+    for (i = 0; i < oprsz; i++) { \
+        if (pg[H1(i)] & 1) { \
+            target_ulong off = (target_ulong)(TYPEI)m[i] << scale; \
+            FN(env, base + off, d[i], ra); \
+        } \
+    } \
+}
+
+DO_ST1_ZPZ_S(sve_stbs_zsu, uint32_t, cpu_stb_data_ra)
+DO_ST1_ZPZ_S(sve_sths_zsu, uint32_t, cpu_stw_data_ra)
+DO_ST1_ZPZ_S(sve_stss_zsu, uint32_t, cpu_stl_data_ra)
+
+DO_ST1_ZPZ_S(sve_stbs_zss, int32_t, cpu_stb_data_ra)
+DO_ST1_ZPZ_S(sve_sths_zss, int32_t, cpu_stw_data_ra)
+DO_ST1_ZPZ_S(sve_stss_zss, int32_t, cpu_stl_data_ra)
+
+DO_ST1_ZPZ_D(sve_stbd_zsu, uint32_t, cpu_stb_data_ra)
+DO_ST1_ZPZ_D(sve_sthd_zsu, uint32_t, cpu_stw_data_ra)
+DO_ST1_ZPZ_D(sve_stsd_zsu, uint32_t, cpu_stl_data_ra)
+DO_ST1_ZPZ_D(sve_stdd_zsu, uint32_t, cpu_stq_data_ra)
+
+DO_ST1_ZPZ_D(sve_stbd_zss, int32_t, cpu_stb_data_ra)
+DO_ST1_ZPZ_D(sve_sthd_zss, int32_t, cpu_stw_data_ra)
+DO_ST1_ZPZ_D(sve_stsd_zss, int32_t, cpu_stl_data_ra)
+DO_ST1_ZPZ_D(sve_stdd_zss, int32_t, cpu_stq_data_ra)
+
+DO_ST1_ZPZ_D(sve_stbd_zd, uint64_t, cpu_stb_data_ra)
+DO_ST1_ZPZ_D(sve_sthd_zd, uint64_t, cpu_stw_data_ra)
+DO_ST1_ZPZ_D(sve_stsd_zd, uint64_t, cpu_stl_data_ra)
+DO_ST1_ZPZ_D(sve_stdd_zd, uint64_t, cpu_stq_data_ra)
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index 9c724980a0..ca49b94924 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -47,6 +47,8 @@ typedef void gen_helper_gvec_flags_4(TCGv_i32, TCGv_ptr, TCGv_ptr,
                                      TCGv_ptr, TCGv_ptr, TCGv_i32);
 
 typedef void gen_helper_gvec_mem(TCGv_env, TCGv_ptr, TCGv_i64, TCGv_i32);
+typedef void gen_helper_gvec_mem_scatter(TCGv_env, TCGv_ptr, TCGv_ptr,
+                                         TCGv_ptr, TCGv_i64, TCGv_i32);
 
 /*
  * Helpers for extracting complex instruction fields.
@@ -3887,3 +3889,72 @@ static void trans_ST_zpri(DisasContext *s, arg_rpri_store *a, uint32_t insn)
                      (a->imm * elements * (a->nreg + 1)) << a->msz);
     do_st_zpa(s, a->rd, a->pg, addr, a->msz, a->esz, a->nreg);
 }
+
+/*
+ *** SVE gather loads / scatter stores
+ */
+
+static void do_mem_zpz(DisasContext *s, int zt, int pg, int zm, int scale,
+                       TCGv_i64 scalar, gen_helper_gvec_mem_scatter *fn)
+{
+    unsigned vsz = vec_full_reg_size(s);
+    TCGv_i32 desc = tcg_const_i32(simd_desc(vsz, vsz, scale));
+    TCGv_ptr t_zm = tcg_temp_new_ptr();
+    TCGv_ptr t_pg = tcg_temp_new_ptr();
+    TCGv_ptr t_zt = tcg_temp_new_ptr();
+
+    tcg_gen_addi_ptr(t_pg, cpu_env, pred_full_reg_offset(s, pg));
+    tcg_gen_addi_ptr(t_zm, cpu_env, vec_full_reg_offset(s, zm));
+    tcg_gen_addi_ptr(t_zt, cpu_env, vec_full_reg_offset(s, zt));
+    fn(cpu_env, t_zt, t_pg, t_zm, scalar, desc);
+
+    tcg_temp_free_ptr(t_zt);
+    tcg_temp_free_ptr(t_zm);
+    tcg_temp_free_ptr(t_pg);
+    tcg_temp_free_i32(desc);
+}
+
+static void trans_ST1_zprz(DisasContext *s, arg_ST1_zprz *a, uint32_t insn)
+{
+    /* Indexed by [xs][msz]. */
+    static gen_helper_gvec_mem_scatter * const fn32[2][3] = {
+        { gen_helper_sve_stbs_zsu,
+          gen_helper_sve_sths_zsu,
+          gen_helper_sve_stss_zsu, },
+        { gen_helper_sve_stbs_zss,
+          gen_helper_sve_sths_zss,
+          gen_helper_sve_stss_zss, },
+    };
+    static gen_helper_gvec_mem_scatter * const fn64[3][4] = {
+        { gen_helper_sve_stbd_zsu,
+          gen_helper_sve_sthd_zsu,
+          gen_helper_sve_stsd_zsu,
+          gen_helper_sve_stdd_zsu, },
+        { gen_helper_sve_stbd_zss,
+          gen_helper_sve_sthd_zss,
+          gen_helper_sve_stsd_zss,
+          gen_helper_sve_stdd_zss, },
+        { gen_helper_sve_stbd_zd,
+          gen_helper_sve_sthd_zd,
+          gen_helper_sve_stsd_zd,
+          gen_helper_sve_stdd_zd, },
+    };
+    gen_helper_gvec_mem_scatter *fn;
+
+    if (a->esz < a->msz || (a->msz == 0 && a->scale)) {
+        unallocated_encoding(s);
+        return;
+    }
+    switch (a->esz) {
+    case MO_32:
+        fn = fn32[a->xs][a->msz];
+        break;
+    case MO_64:
+        fn = fn64[a->xs][a->msz];
+        break;
+    default:
+        g_assert_not_reached();
+    }
+    do_mem_zpz(s, a->rd, a->pg, a->rm, a->scale * a->msz,
+               cpu_reg_sp(s, a->rn), fn);
+}
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index 5d8e1481d7..edd9340c02 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -81,6 +81,7 @@
 &rpri_load rd pg rn imm dtype nreg
 &rprr_store rd pg rn rm msz esz nreg
 &rpri_store rd pg rn imm msz esz nreg
+&rprr_scatter_store rd pg rn rm esz msz xs scale
 
 ###########################################################################
 # Named instruction formats. These are generally used to
@@ -199,6 +200,8 @@
 @rpri_store_msz ....... msz:2 .. . imm:s4 ... pg:3 rn:5 rd:5 &rpri_store
 @rprr_store_esz_n0 ....... .. esz:2 rm:5 ... pg:3 rn:5 rd:5 \
  &rprr_store nreg=0
+@rprr_scatter_store ....... msz:2 .. rm:5 ... pg:3 rn:5 rd:5 \
+ &rprr_scatter_store
 
 ###########################################################################
 # Instruction patterns. Grouped according to the SVE encodingindex.xhtml.
@@ -832,3 +835,39 @@ ST_zpri 1110010 .. nreg:2 1.... 111 ... ..... ..... \
 # SVE store multiple structures (scalar plus scalar) (nreg != 0)
 ST_zprr 1110010 msz:2 nreg:2 ..... 011 ... ..... ..... \
  @rprr_store esz=%size_23
+
+# SVE 32-bit scatter store (scalar plus 32-bit scaled offsets)
+# Require msz > 0 && msz <= esz.
+ST1_zprz 1110010 .. 11 ..... 100 ... ..... ..... \
+ @rprr_scatter_store xs=0 esz=2 scale=1
+ST1_zprz 1110010 .. 11 ..... 110 ... ..... ..... \
+ @rprr_scatter_store xs=1 esz=2 scale=1
+
+# SVE 32-bit scatter store (scalar plus 32-bit unscaled offsets)
+# Require msz <= esz.
+ST1_zprz 1110010 .. 10 ..... 100 ... ..... ..... \
+ @rprr_scatter_store xs=0 esz=2 scale=0
+ST1_zprz 1110010 .. 10 ..... 110 ... ..... ..... \
+ @rprr_scatter_store xs=1 esz=2 scale=0
+
+# SVE 64-bit scatter store (scalar plus 64-bit scaled offset)
+# Require msz > 0
+ST1_zprz 1110010 .. 01 ..... 101 ... ..... ..... \
+ @rprr_scatter_store xs=2 esz=3 scale=1
+
+# SVE 64-bit scatter store (scalar plus 64-bit unscaled offset)
+ST1_zprz 1110010 .. 00 ..... 101 ... ..... ..... \
+ @rprr_scatter_store xs=2 esz=3 scale=0
+
+# SVE 64-bit scatter store (scalar plus unpacked 32-bit scaled offset)
+# Require msz > 0
+ST1_zprz 1110010 .. 01 ..... 100 ... ..... ..... \
+ @rprr_scatter_store xs=0 esz=3 scale=1
+ST1_zprz 1110010 .. 01 ..... 110 ... ..... ..... \
+ @rprr_scatter_store xs=1 esz=3 scale=1
+
+# SVE 64-bit scatter store (scalar plus unpacked 32-bit unscaled offset)
+ST1_zprz 1110010 .. 00 ..... 100 ... ..... ..... \
+ @rprr_scatter_store xs=0 esz=3 scale=0
+ST1_zprz 1110010 .. 00 ..... 110 ... ..... .....
\
+ @rprr_scatter_store xs=1 esz=3 scale=0

From patchwork Sat Feb 17 18:23:10 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 128720
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Sat, 17 Feb 2018 10:23:10 -0800
Message-Id: <20180217182323.25885-55-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.14.3
In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org>
References: <20180217182323.25885-1-richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH v2 54/67] target/arm: Implement SVE prefetches
Cc: qemu-arm@nongnu.org

Signed-off-by: Richard Henderson
---
 target/arm/translate-sve.c |  9 +++++++++
 target/arm/sve.decode      | 23 +++++++++++++++++++++++
 2 files changed, 32 insertions(+)

-- 
2.14.3

diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index ca49b94924..63c7a0e8d8 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -3958,3 +3958,12 @@ static void trans_ST1_zprz(DisasContext *s, arg_ST1_zprz *a, uint32_t insn)
     do_mem_zpz(s, a->rd, a->pg, a->rm, a->scale * a->msz,
                cpu_reg_sp(s, a->rn), fn);
 }
+
+/*
+ * Prefetches
+ */
+
+static void trans_PRF(DisasContext *s, arg_PRF *a, uint32_t insn)
+{
+    /* Prefetch is a nop within QEMU. */
+}
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index edd9340c02..f0144aa2d0 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -801,6 +801,29 @@ LD1RQ_zprr 1010010 .. 00 ..... 000 ... ..... ..... \
 LD1RQ_zpri 1010010 .. 00 0.... 001 ... ..... .....
\
 @rpri_load_msz nreg=0
 
+# SVE 32-bit gather prefetch (scalar plus 32-bit scaled offsets)
+PRF 1000010 00 -1 ----- 0-- --- ----- 0 ----
+
+# SVE 32-bit gather prefetch (vector plus immediate)
+PRF 1000010 -- 00 ----- 111 --- ----- 0 ----
+
+# SVE contiguous prefetch (scalar plus immediate)
+PRF 1000010 11 1- ----- 0-- --- ----- 0 ----
+
+# SVE contiguous prefetch (scalar plus scalar)
+PRF 1000010 -- 00 ----- 110 --- ----- 0 ----
+
+### SVE Memory 64-bit Gather Group
+
+# SVE 64-bit gather prefetch (scalar plus 64-bit scaled offsets)
+PRF 1100010 00 11 ----- 1-- --- ----- 0 ----
+
+# SVE 64-bit gather prefetch (scalar plus unpacked 32-bit scaled offsets)
+PRF 1100010 00 -1 ----- 0-- --- ----- 0 ----
+
+# SVE 64-bit gather prefetch (vector plus immediate)
+PRF 1100010 -- 00 ----- 111 --- ----- 0 ----
+
 ### SVE Memory Store Group
 
 # SVE store predicate register

From patchwork Sat Feb 17 18:23:11 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 128736
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Sat, 17 Feb 2018 10:23:11 -0800
Message-Id: <20180217182323.25885-56-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.14.3
In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org>
References: <20180217182323.25885-1-richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH v2 55/67] target/arm: Implement SVE gather loads
Cc: qemu-arm@nongnu.org

Signed-off-by: Richard Henderson
---
 target/arm/helper-sve.h    | 67 ++++++++++++++++++++++++++++++++
 target/arm/sve_helper.c    | 75 +++++++++++++++++++++++++++++++++++
 target/arm/translate-sve.c | 97 ++++++++++++++++++++++++++++++++++++++++++++++
 target/arm/sve.decode      | 53 +++++++++++++++++++++++++
 4 files changed, 292 insertions(+)

--
2.14.3

Reviewed-by: Peter Maydell

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index b5c093f2fd..3cb7ab9ef2 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -919,6 +919,73 @@ DEF_HELPER_FLAGS_4(sve_st1hd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)
 DEF_HELPER_FLAGS_4(sve_st1sd_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32)

+DEF_HELPER_FLAGS_6(sve_ldbsu_zsu, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldhsu_zsu, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldssu_zsu, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldbss_zsu, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldhss_zsu, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_6(sve_ldbsu_zss, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldhsu_zss, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldssu_zss, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldbss_zss, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldhss_zss, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_6(sve_ldbdu_zsu, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldhdu_zsu, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldsdu_zsu, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldddu_zsu, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldbds_zsu, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldhds_zsu, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldsds_zsu, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_6(sve_ldbdu_zss, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldhdu_zss, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldsdu_zss, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldddu_zss, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldbds_zss, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldhds_zss, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldsds_zss, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+
+DEF_HELPER_FLAGS_6(sve_ldbdu_zd, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldhdu_zd, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldsdu_zd, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldddu_zd, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldbds_zd, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldhds_zd, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+DEF_HELPER_FLAGS_6(sve_ldsds_zd, TCG_CALL_NO_WG,
+                   void, env, ptr, ptr, ptr, tl, i32)
+
 DEF_HELPER_FLAGS_6(sve_stbs_zsu, TCG_CALL_NO_WG,
                    void, env, ptr, ptr, ptr, tl, i32)
 DEF_HELPER_FLAGS_6(sve_sths_zsu, TCG_CALL_NO_WG,
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index 07b3d285f2..4edd3d4367 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -3546,6 +3546,81 @@ void HELPER(sve_st4dd_r)(CPUARMState *env, void *vg,
     }
 }

+/* Loads with a vector index. */
+
+#define DO_LD1_ZPZ_S(NAME, TYPEI, TYPEM, FN) \
+void HELPER(NAME)(CPUARMState *env, void *vd, void *vg, void *vm, \
+                  target_ulong base, uint32_t desc) \
+{ \
+    intptr_t i, oprsz = simd_oprsz(desc) / 8; \
+    unsigned scale = simd_data(desc); \
+    uintptr_t ra = GETPC(); \
+    uint32_t *d = vd; TYPEI *m = vm; uint8_t *pg = vg; \
+    for (i = 0; i < oprsz; i++) { \
+        uint8_t pp = pg[H1(i)]; \
+        if (pp & 0x01) { \
+            target_ulong off = (target_ulong)m[H4(i * 2)] << scale; \
+            d[H4(i * 2)] = (TYPEM)FN(env, base + off, ra); \
+        } \
+        if (pp & 0x10) { \
+            target_ulong off = (target_ulong)m[H4(i * 2 + 1)] << scale; \
+            d[H4(i * 2 + 1)] = (TYPEM)FN(env, base + off, ra); \
+        } \
+    } \
+}
+
+#define DO_LD1_ZPZ_D(NAME, TYPEI, TYPEM, FN) \
+void HELPER(NAME)(CPUARMState *env, void *vd, void *vg, void *vm, \
+                  target_ulong base, uint32_t desc) \
+{ \
+    intptr_t i, oprsz = simd_oprsz(desc) / 8; \
+    unsigned scale = simd_data(desc); \
+    uintptr_t ra = GETPC(); \
+    uint64_t *d = vd, *m = vm; uint8_t *pg = vg; \
+    for (i = 0; i < oprsz; i++) { \
+        if (pg[H1(i)] & 1) { \
+            target_ulong off = (target_ulong)(TYPEI)m[i] << scale; \
+            d[i] = (TYPEM)FN(env, base + off, ra); \
+        } \
+    } \
+}
+
+DO_LD1_ZPZ_S(sve_ldbsu_zsu, uint32_t, uint8_t, cpu_ldub_data_ra)
+DO_LD1_ZPZ_S(sve_ldhsu_zsu, uint32_t, uint16_t, cpu_lduw_data_ra)
+DO_LD1_ZPZ_S(sve_ldssu_zsu, uint32_t, uint32_t, cpu_ldl_data_ra)
+DO_LD1_ZPZ_S(sve_ldbss_zsu, uint32_t, int8_t, cpu_ldub_data_ra)
+DO_LD1_ZPZ_S(sve_ldhss_zsu, uint32_t, int16_t, cpu_lduw_data_ra)
+
+DO_LD1_ZPZ_S(sve_ldbsu_zss, int32_t, uint8_t, cpu_ldub_data_ra)
+DO_LD1_ZPZ_S(sve_ldhsu_zss, int32_t, uint16_t, cpu_lduw_data_ra)
+DO_LD1_ZPZ_S(sve_ldssu_zss, int32_t, uint32_t, cpu_ldl_data_ra)
+DO_LD1_ZPZ_S(sve_ldbss_zss, int32_t, int8_t, cpu_ldub_data_ra)
+DO_LD1_ZPZ_S(sve_ldhss_zss, int32_t, int16_t, cpu_lduw_data_ra)
+
+DO_LD1_ZPZ_D(sve_ldbdu_zsu, uint32_t, uint8_t, cpu_ldub_data_ra)
+DO_LD1_ZPZ_D(sve_ldhdu_zsu, uint32_t, uint16_t, cpu_lduw_data_ra)
+DO_LD1_ZPZ_D(sve_ldsdu_zsu, uint32_t, uint32_t, cpu_ldl_data_ra)
+DO_LD1_ZPZ_D(sve_ldddu_zsu, uint32_t, uint64_t, cpu_ldq_data_ra)
+DO_LD1_ZPZ_D(sve_ldbds_zsu, uint32_t, int8_t, cpu_ldub_data_ra)
+DO_LD1_ZPZ_D(sve_ldhds_zsu, uint32_t, int16_t, cpu_lduw_data_ra)
+DO_LD1_ZPZ_D(sve_ldsds_zsu, uint32_t, int32_t, cpu_ldl_data_ra)
+
+DO_LD1_ZPZ_D(sve_ldbdu_zss, int32_t, uint8_t, cpu_ldub_data_ra)
+DO_LD1_ZPZ_D(sve_ldhdu_zss, int32_t, uint16_t, cpu_lduw_data_ra)
+DO_LD1_ZPZ_D(sve_ldsdu_zss, int32_t, uint32_t, cpu_ldl_data_ra)
+DO_LD1_ZPZ_D(sve_ldddu_zss, int32_t, uint64_t, cpu_ldq_data_ra)
+DO_LD1_ZPZ_D(sve_ldbds_zss, int32_t, int8_t, cpu_ldub_data_ra)
+DO_LD1_ZPZ_D(sve_ldhds_zss, int32_t, int16_t, cpu_lduw_data_ra)
+DO_LD1_ZPZ_D(sve_ldsds_zss, int32_t, int32_t, cpu_ldl_data_ra)
+
+DO_LD1_ZPZ_D(sve_ldbdu_zd, uint64_t, uint8_t, cpu_ldub_data_ra)
+DO_LD1_ZPZ_D(sve_ldhdu_zd, uint64_t, uint16_t, cpu_lduw_data_ra)
+DO_LD1_ZPZ_D(sve_ldsdu_zd, uint64_t, uint32_t, cpu_ldl_data_ra)
+DO_LD1_ZPZ_D(sve_ldddu_zd, uint64_t, uint64_t, cpu_ldq_data_ra)
+DO_LD1_ZPZ_D(sve_ldbds_zd, uint64_t, int8_t, cpu_ldub_data_ra)
+DO_LD1_ZPZ_D(sve_ldhds_zd, uint64_t, int16_t, cpu_lduw_data_ra)
+DO_LD1_ZPZ_D(sve_ldsds_zd, uint64_t, int32_t, cpu_ldl_data_ra)
+
 /* Stores with a vector index. */

 #define DO_ST1_ZPZ_S(NAME, TYPEI, FN) \
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index 63c7a0e8d8..6484ecd257 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -3914,6 +3914,103 @@ static void do_mem_zpz(DisasContext *s, int zt, int pg, int zm, int scale,
     tcg_temp_free_i32(desc);
 }

+/* Indexed by [xs][u][msz]. */
+static gen_helper_gvec_mem_scatter * const gather_load_fn32[2][2][3] = {
+    { { gen_helper_sve_ldbss_zsu,
+        gen_helper_sve_ldhss_zsu,
+        NULL, },
+      { gen_helper_sve_ldbsu_zsu,
+        gen_helper_sve_ldhsu_zsu,
+        gen_helper_sve_ldssu_zsu, } },
+    { { gen_helper_sve_ldbss_zss,
+        gen_helper_sve_ldhss_zss,
+        NULL, },
+      { gen_helper_sve_ldbsu_zss,
+        gen_helper_sve_ldhsu_zss,
+        gen_helper_sve_ldssu_zss, } },
+};
+
+static gen_helper_gvec_mem_scatter * const gather_load_fn64[3][2][4] = {
+    { { gen_helper_sve_ldbds_zsu,
+        gen_helper_sve_ldhds_zsu,
+        gen_helper_sve_ldsds_zsu,
+        NULL, },
+      { gen_helper_sve_ldbdu_zsu,
+        gen_helper_sve_ldhdu_zsu,
+        gen_helper_sve_ldsdu_zsu,
+        gen_helper_sve_ldddu_zsu, } },
+    { { gen_helper_sve_ldbds_zss,
+        gen_helper_sve_ldhds_zss,
+        gen_helper_sve_ldsds_zss,
+        NULL, },
+      { gen_helper_sve_ldbdu_zss,
+        gen_helper_sve_ldhdu_zss,
+        gen_helper_sve_ldsdu_zss,
+        gen_helper_sve_ldddu_zss, } },
+    { { gen_helper_sve_ldbds_zd,
+        gen_helper_sve_ldhds_zd,
+        gen_helper_sve_ldsds_zd,
+        NULL, },
+      { gen_helper_sve_ldbdu_zd,
+        gen_helper_sve_ldhdu_zd,
+        gen_helper_sve_ldsdu_zd,
+        gen_helper_sve_ldddu_zd, } },
+};
+
+static void trans_LD1_zprz(DisasContext *s, arg_LD1_zprz *a, uint32_t insn)
+{
+    gen_helper_gvec_mem_scatter *fn = NULL;
+
+    if (a->esz < a->msz
+        || (a->msz == 0 && a->scale)
+        || (a->esz == a->msz && !a->u)) {
+        unallocated_encoding(s);
+        return;
+    }
+
+    /* TODO: handle LDFF1. */
+    switch (a->esz) {
+    case MO_32:
+        fn = gather_load_fn32[a->xs][a->u][a->msz];
+        break;
+    case MO_64:
+        fn = gather_load_fn64[a->xs][a->u][a->msz];
+        break;
+    }
+    assert(fn != NULL);
+
+    do_mem_zpz(s, a->rd, a->pg, a->rm, a->scale * a->msz,
+               cpu_reg_sp(s, a->rn), fn);
+}
+
+static void trans_LD1_zpiz(DisasContext *s, arg_LD1_zpiz *a, uint32_t insn)
+{
+    gen_helper_gvec_mem_scatter *fn = NULL;
+    TCGv_i64 imm;
+
+    if (a->esz < a->msz || (a->esz == a->msz && !a->u)) {
+        unallocated_encoding(s);
+        return;
+    }
+
+    /* TODO: handle LDFF1. */
+    switch (a->esz) {
+    case MO_32:
+        fn = gather_load_fn32[0][a->u][a->msz];
+        break;
+    case MO_64:
+        fn = gather_load_fn64[2][a->u][a->msz];
+        break;
+    }
+    assert(fn != NULL);
+
+    /* Treat LD1_zpiz (zn[x] + imm) the same way as LD1_zprz (rn + zm[x])
+       by loading the immediate into the scalar parameter. */
+    imm = tcg_const_i64(a->imm << a->msz);
+    do_mem_zpz(s, a->rd, a->pg, a->rn, 0, imm, fn);
+    tcg_temp_free_i64(imm);
+}
+
 static void trans_ST1_zprz(DisasContext *s, arg_ST1_zprz *a, uint32_t insn)
 {
     /* Indexed by [xs][msz]. */
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index f0144aa2d0..f85d82e009 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -81,6 +81,8 @@
 &rpri_load rd pg rn imm dtype nreg
 &rprr_store rd pg rn rm msz esz nreg
 &rpri_store rd pg rn imm msz esz nreg
+&rprr_gather_load rd pg rn rm esz msz u ff xs scale
+&rpri_gather_load rd pg rn imm esz msz u ff
 &rprr_scatter_store rd pg rn rm esz msz xs scale

 ###########################################################################
@@ -195,6 +197,18 @@
 @rpri_load_msz ....... .... . imm:s4 ... pg:3 rn:5 rd:5 \
         &rpri_load dtype=%msz_dtype

+# Gather Loads.
+@rprr_g_load_u ....... .. . . rm:5 . u:1 ff:1 pg:3 rn:5 rd:5 \
+        &rprr_gather_load xs=2
+@rprr_g_load_xs_u ....... .. xs:1 . rm:5 . u:1 ff:1 pg:3 rn:5 rd:5 \
+        &rprr_gather_load
+@rprr_g_load_xs_u_sc ....... .. xs:1 scale:1 rm:5 . u:1 ff:1 pg:3 rn:5 rd:5 \
+        &rprr_gather_load
+@rprr_g_load_u_sc ....... .. . scale:1 rm:5 . u:1 ff:1 pg:3 rn:5 rd:5 \
+        &rprr_gather_load xs=2
+@rpri_g_load ....... msz:2 .. imm:5 . u:1 ff:1 pg:3 rn:5 rd:5 \
+        &rpri_gather_load
+
 # Stores; user must fill in ESZ, MSZ, NREG as needed.
 @rprr_store ....... .. .. rm:5 ... pg:3 rn:5 rd:5 &rprr_store
 @rpri_store_msz ....... msz:2 .. . imm:s4 ... pg:3 rn:5 rd:5 &rpri_store
@@ -766,6 +780,19 @@ LDR_zri 10000101 10 ...... 010 ... ..... ..... @rd_rn_i9
 LD1R_zpri 1000010 .. 1 imm:6 1.. pg:3 rn:5 rd:5 \
         &rpri_load dtype=%dtype_23_13 nreg=0

+# SVE 32-bit gather load (scalar plus 32-bit unscaled offsets)
+# SVE 32-bit gather load (scalar plus 32-bit scaled offsets)
+LD1_zprz 1000010 00 .0 ..... 0.. ... ..... ..... \
+        @rprr_g_load_xs_u esz=2 msz=0 scale=0
+LD1_zprz 1000010 01 .. ..... 0.. ... ..... ..... \
+        @rprr_g_load_xs_u_sc esz=2 msz=1
+LD1_zprz 1000010 10 .. ..... 0.. ... ..... ..... \
+        @rprr_g_load_xs_u_sc esz=2 msz=2
+
+# SVE 32-bit gather load (vector plus immediate)
+LD1_zpiz 1000010 .. 01 ..... 1.. ... ..... ..... \
+        @rpri_g_load esz=2
+
 ### SVE Memory Contiguous Load Group

 # SVE contiguous load (scalar plus scalar)
@@ -815,6 +842,32 @@ PRF 1000010 -- 00 ----- 110 --- ----- 0 ----

 ### SVE Memory 64-bit Gather Group

+# SVE 64-bit gather load (scalar plus 32-bit unpacked unscaled offsets)
+# SVE 64-bit gather load (scalar plus 32-bit unpacked scaled offsets)
+LD1_zprz 1100010 00 .0 ..... 0.. ... ..... ..... \
+        @rprr_g_load_xs_u esz=3 msz=0 scale=0
+LD1_zprz 1100010 01 .. ..... 0.. ... ..... ..... \
+        @rprr_g_load_xs_u_sc esz=3 msz=1
+LD1_zprz 1100010 10 .. ..... 0.. ... ..... ..... \
+        @rprr_g_load_xs_u_sc esz=3 msz=2
+LD1_zprz 1100010 11 .. ..... 0.. ... ..... ..... \
+        @rprr_g_load_xs_u_sc esz=3 msz=3
+
+# SVE 64-bit gather load (scalar plus 64-bit unscaled offsets)
+# SVE 64-bit gather load (scalar plus 64-bit scaled offsets)
+LD1_zprz 1100010 00 10 ..... 1.. ... ..... ..... \
+        @rprr_g_load_u esz=3 msz=0 scale=0
+LD1_zprz 1100010 01 1. ..... 1.. ... ..... ..... \
+        @rprr_g_load_u_sc esz=3 msz=1
+LD1_zprz 1100010 10 1. ..... 1.. ... ..... ..... \
+        @rprr_g_load_u_sc esz=3 msz=2
+LD1_zprz 1100010 11 1. ..... 1.. ... ..... ..... \
+        @rprr_g_load_u_sc esz=3 msz=3
+
+# SVE 64-bit gather load (vector plus immediate)
+LD1_zpiz 1100010 .. 01 ..... 1.. ... ..... ..... \
+        @rpri_g_load esz=3
+
 # SVE 64-bit gather prefetch (scalar plus 64-bit scaled offsets)
 PRF 1100010 00 11 ----- 1-- --- ----- 0 ----

From patchwork Sat Feb 17 18:23:12 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 128738
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Sat, 17 Feb 2018 10:23:12 -0800
Message-Id: <20180217182323.25885-57-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.14.3
In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org>
References: <20180217182323.25885-1-richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH v2 56/67] target/arm: Implement SVE scatter store vector immediate
Cc: qemu-arm@nongnu.org

Signed-off-by: Richard Henderson
---
 target/arm/translate-sve.c | 79 +++++++++++++++++++++++++++++++---------------
 target/arm/sve.decode      | 11 +++++++
 2 files changed, 65 insertions(+), 25 deletions(-)

--
2.14.3

Reviewed-by: Peter Maydell

diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index 6484ecd257..0241e8e707 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -4011,31 +4011,33 @@ static void trans_LD1_zpiz(DisasContext *s, arg_LD1_zpiz *a, uint32_t insn)
     tcg_temp_free_i64(imm);
 }

+/* Indexed by [xs][msz]. */
+static gen_helper_gvec_mem_scatter * const scatter_store_fn32[2][3] = {
+    { gen_helper_sve_stbs_zsu,
+      gen_helper_sve_sths_zsu,
+      gen_helper_sve_stss_zsu, },
+    { gen_helper_sve_stbs_zss,
+      gen_helper_sve_sths_zss,
+      gen_helper_sve_stss_zss, },
+};
+
+static gen_helper_gvec_mem_scatter * const scatter_store_fn64[3][4] = {
+    { gen_helper_sve_stbd_zsu,
+      gen_helper_sve_sthd_zsu,
+      gen_helper_sve_stsd_zsu,
+      gen_helper_sve_stdd_zsu, },
+    { gen_helper_sve_stbd_zss,
+      gen_helper_sve_sthd_zss,
+      gen_helper_sve_stsd_zss,
+      gen_helper_sve_stdd_zss, },
+    { gen_helper_sve_stbd_zd,
+      gen_helper_sve_sthd_zd,
+      gen_helper_sve_stsd_zd,
+      gen_helper_sve_stdd_zd, },
+};
+
 static void trans_ST1_zprz(DisasContext *s, arg_ST1_zprz *a, uint32_t insn)
 {
-    /* Indexed by [xs][msz]. */
-    static gen_helper_gvec_mem_scatter * const fn32[2][3] = {
-        { gen_helper_sve_stbs_zsu,
-          gen_helper_sve_sths_zsu,
-          gen_helper_sve_stss_zsu, },
-        { gen_helper_sve_stbs_zss,
-          gen_helper_sve_sths_zss,
-          gen_helper_sve_stss_zss, },
-    };
-    static gen_helper_gvec_mem_scatter * const fn64[3][4] = {
-        { gen_helper_sve_stbd_zsu,
-          gen_helper_sve_sthd_zsu,
-          gen_helper_sve_stsd_zsu,
-          gen_helper_sve_stdd_zsu, },
-        { gen_helper_sve_stbd_zss,
-          gen_helper_sve_sthd_zss,
-          gen_helper_sve_stsd_zss,
-          gen_helper_sve_stdd_zss, },
-        { gen_helper_sve_stbd_zd,
-          gen_helper_sve_sthd_zd,
-          gen_helper_sve_stsd_zd,
-          gen_helper_sve_stdd_zd, },
-    };
     gen_helper_gvec_mem_scatter *fn;

     if (a->esz < a->msz || (a->msz == 0 && a->scale)) {
@@ -4044,10 +4046,10 @@ static void trans_ST1_zprz(DisasContext *s, arg_ST1_zprz *a, uint32_t insn)
     }
     switch (a->esz) {
     case MO_32:
-        fn = fn32[a->xs][a->msz];
+        fn = scatter_store_fn32[a->xs][a->msz];
         break;
     case MO_64:
-        fn = fn64[a->xs][a->msz];
+        fn = scatter_store_fn64[a->xs][a->msz];
         break;
     default:
         g_assert_not_reached();
@@ -4056,6 +4058,33 @@ static void trans_ST1_zprz(DisasContext *s, arg_ST1_zprz *a, uint32_t insn)
                cpu_reg_sp(s, a->rn), fn);
 }

+static void trans_ST1_zpiz(DisasContext *s, arg_ST1_zpiz *a, uint32_t insn)
+{
+    gen_helper_gvec_mem_scatter *fn = NULL;
+    TCGv_i64 imm;
+
+    if (a->esz < a->msz) {
+        unallocated_encoding(s);
+        return;
+    }
+
+    switch (a->esz) {
+    case MO_32:
+        fn = scatter_store_fn32[0][a->msz];
+        break;
+    case MO_64:
+        fn = scatter_store_fn64[2][a->msz];
+        break;
+    }
+    assert(fn != NULL);
+
+    /* Treat ST1_zpiz (zn[x] + imm) the same way as ST1_zprz (rn + zm[x])
+       by loading the immediate into the scalar parameter. */
+    imm = tcg_const_i64(a->imm << a->msz);
+    do_mem_zpz(s, a->rd, a->pg, a->rn, 0, imm, fn);
+    tcg_temp_free_i64(imm);
+}
+
 /*
  * Prefetches
  */
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index f85d82e009..6ccb4289fc 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -84,6 +84,7 @@
 &rprr_gather_load rd pg rn rm esz msz u ff xs scale
 &rpri_gather_load rd pg rn imm esz msz u ff
 &rprr_scatter_store rd pg rn rm esz msz xs scale
+&rpri_scatter_store rd pg rn imm esz msz

 ###########################################################################
 # Named instruction formats. These are generally used to
@@ -216,6 +217,8 @@
         &rprr_store nreg=0
 @rprr_scatter_store ....... msz:2 .. rm:5 ... pg:3 rn:5 rd:5 \
         &rprr_scatter_store
+@rpri_scatter_store ....... msz:2 .. imm:5 ... pg:3 rn:5 rd:5 \
+        &rpri_scatter_store

 ###########################################################################
 # Instruction patterns. Grouped according to the SVE encodingindex.xhtml.
@@ -935,6 +938,14 @@ ST1_zprz 1110010 .. 01 ..... 101 ... ..... ..... \
 ST1_zprz 1110010 .. 00 ..... 101 ... ..... ..... \
         @rprr_scatter_store xs=2 esz=3 scale=0

+# SVE 64-bit scatter store (vector plus immediate)
+ST1_zpiz 1110010 .. 10 ..... 101 ... ..... ..... \
+        @rpri_scatter_store esz=3
+
+# SVE 32-bit scatter store (vector plus immediate)
+ST1_zpiz 1110010 .. 11 ..... 101 ... ..... ..... \
+        @rpri_scatter_store esz=2
+
 # SVE 64-bit scatter store (scalar plus unpacked 32-bit scaled offset)
 # Require msz > 0
 ST1_zprz 1110010 .. 01 ..... 100 ... ..... ..... \

From patchwork Sat Feb 17 18:23:13 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 128722
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Sat, 17 Feb 2018 10:23:13 -0800
Message-Id: <20180217182323.25885-58-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.14.3
In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org>
References: <20180217182323.25885-1-richard.henderson@linaro.org>
X-Received-From: 2607:f8b0:400e:c00::244
Subject: [Qemu-devel] [PATCH v2 57/67] target/arm: Implement SVE floating-point compare vectors
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.21
Precedence: list
Cc: qemu-arm@nongnu.org
Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org
Sender: "Qemu-devel"

Signed-off-by: Richard Henderson
---
 target/arm/helper-sve.h    | 49 +++++++++++++++++++++++++++++++++++
 target/arm/sve_helper.c    | 64 ++++++++++++++++++++++++++++++++++++++++++++++
 target/arm/translate-sve.c | 41 +++++++++++++++++++++++++++++
 target/arm/sve.decode      | 11 ++++++++
 4 files changed, 165 insertions(+)

--
2.14.3

Reviewed-by: Peter Maydell

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index 3cb7ab9ef2..30373e3fc7 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -839,6 +839,55 @@ DEF_HELPER_FLAGS_5(sve_ucvt_ds, TCG_CALL_NO_RWG,
 DEF_HELPER_FLAGS_5(sve_ucvt_dd, TCG_CALL_NO_RWG,
                    void, ptr, ptr, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_6(sve_fcmge_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fcmge_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fcmge_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_6(sve_fcmgt_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fcmgt_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fcmgt_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_6(sve_fcmeq_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fcmeq_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fcmeq_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_6(sve_fcmne_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fcmne_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fcmne_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_6(sve_fcmuo_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fcmuo_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fcmuo_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_6(sve_facge_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_facge_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_facge_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_6(sve_facgt_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_facgt_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_facgt_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, ptr, i32)
+
 DEF_HELPER_FLAGS_3(sve_fmla_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32)
 DEF_HELPER_FLAGS_3(sve_fmla_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32)
 DEF_HELPER_FLAGS_3(sve_fmla_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32)
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index 4edd3d4367..ace613684d 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -3100,6 +3100,70 @@ DO_FMLA(sve_fnmls_zpzzz_d, 64, , 1, 1)
 
 #undef DO_FMLA
 
+/* Two operand floating-point comparison controlled by a predicate.
+ * Unlike the integer version, we are not allowed to optimistically
+ * compare operands, since the comparison may have side effects wrt
+ * the FPSR.
+ */
+#define DO_FPCMP_PPZZ(NAME, TYPE, H, OP) \
+void HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, \
+                  void *status, uint32_t desc) \
+{ \
+    intptr_t opr_sz = simd_oprsz(desc); \
+    intptr_t i = opr_sz, j = ((opr_sz - 1) & -64) >> 3; \
+    do { \
+        uint64_t out = 0; \
+        uint64_t pg = *(uint64_t *)(vg + j); \
+        do { \
+            i -= sizeof(TYPE), out <<= sizeof(TYPE); \
+            if ((pg >> (i & 63)) & 1) { \
+                TYPE nn = *(TYPE *)(vn + H(i)); \
+                TYPE mm = *(TYPE *)(vm + H(i)); \
+                out |= OP(TYPE, nn, mm, status); \
+            } \
+        } while (i & 63); \
+        *(uint64_t *)(vd + j) = out; \
+        j -= 8; \
+    } while (i > 0); \
+}
+
+#define DO_FPCMP_PPZZ_H(NAME, OP) \
+    DO_FPCMP_PPZZ(NAME##_h, float16, H1_2, OP)
+#define DO_FPCMP_PPZZ_S(NAME, OP) \
+    DO_FPCMP_PPZZ(NAME##_s, float32, H1_4, OP)
+#define DO_FPCMP_PPZZ_D(NAME, OP) \
+    DO_FPCMP_PPZZ(NAME##_d, float64, , OP)
+
+#define DO_FPCMP_PPZZ_ALL(NAME, OP) \
+    DO_FPCMP_PPZZ_H(NAME, OP) \
+    DO_FPCMP_PPZZ_S(NAME, OP) \
+    DO_FPCMP_PPZZ_D(NAME, OP)
+
+#define DO_FCMGE(TYPE, X, Y, ST) TYPE##_compare(Y, X, ST) <= 0
+#define DO_FCMGT(TYPE, X, Y, ST) TYPE##_compare(Y, X, ST) < 0
+#define DO_FCMEQ(TYPE, X, Y, ST) TYPE##_compare_quiet(X, Y, ST) == 0
+#define DO_FCMNE(TYPE, X, Y, ST) TYPE##_compare_quiet(X, Y, ST) != 0
+#define DO_FCMUO(TYPE, X, Y, ST) \
+    TYPE##_compare_quiet(X, Y, ST) == float_relation_unordered
+#define DO_FACGE(TYPE, X, Y, ST) \
+    TYPE##_compare(TYPE##_abs(Y), TYPE##_abs(X), ST) <= 0
+#define DO_FACGT(TYPE, X, Y, ST) \
+    TYPE##_compare(TYPE##_abs(Y), TYPE##_abs(X), ST) < 0
+
+DO_FPCMP_PPZZ_ALL(sve_fcmge, DO_FCMGE)
+DO_FPCMP_PPZZ_ALL(sve_fcmgt, DO_FCMGT)
+DO_FPCMP_PPZZ_ALL(sve_fcmeq, DO_FCMEQ)
+DO_FPCMP_PPZZ_ALL(sve_fcmne, DO_FCMNE)
+DO_FPCMP_PPZZ_ALL(sve_fcmuo, DO_FCMUO)
+DO_FPCMP_PPZZ_ALL(sve_facge, DO_FACGE)
+DO_FPCMP_PPZZ_ALL(sve_facgt, DO_FACGT)
+
+#undef DO_FPCMP_PPZZ_ALL
+#undef DO_FPCMP_PPZZ_D
+#undef DO_FPCMP_PPZZ_S
+#undef DO_FPCMP_PPZZ_H
+#undef DO_FPCMP_PPZZ
+
 /*
  * Load contiguous data, protected by a governing predicate.
  */
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index 0241e8e707..8fcb9dd2be 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -3265,6 +3265,47 @@ DO_FP3(FMULX, fmulx)
 
 #undef DO_FP3
 
+static void do_fp_cmp(DisasContext *s, arg_rprr_esz *a,
+                      gen_helper_gvec_4_ptr *fn)
+{
+    unsigned vsz = vec_full_reg_size(s);
+    TCGv_ptr status;
+
+    if (fn == NULL) {
+        unallocated_encoding(s);
+        return;
+    }
+
+    status = get_fpstatus_ptr(a->esz == MO_16);
+    tcg_gen_gvec_4_ptr(pred_full_reg_offset(s, a->rd),
+                       vec_full_reg_offset(s, a->rn),
+                       vec_full_reg_offset(s, a->rm),
+                       pred_full_reg_offset(s, a->pg),
+                       status, vsz, vsz, 0, fn);
+    tcg_temp_free_ptr(status);
+}
+
+#define DO_FPCMP(NAME, name) \
+static void trans_##NAME##_ppzz(DisasContext *s, arg_rprr_esz *a, \
+                                uint32_t insn) \
+{ \
+    static gen_helper_gvec_4_ptr * const fns[4] = { \
+        NULL, gen_helper_sve_##name##_h, \
+        gen_helper_sve_##name##_s, gen_helper_sve_##name##_d \
+    }; \
+    do_fp_cmp(s, a, fns[a->esz]); \
+}
+
+DO_FPCMP(FCMGE, fcmge)
+DO_FPCMP(FCMGT, fcmgt)
+DO_FPCMP(FCMEQ, fcmeq)
+DO_FPCMP(FCMNE, fcmne)
+DO_FPCMP(FCMUO, fcmuo)
+DO_FPCMP(FACGE, facge)
+DO_FPCMP(FACGT, facgt)
+
+#undef DO_FPCMP
+
 typedef void gen_helper_sve_fmla(TCGv_env, TCGv_ptr, TCGv_i32);
 
 static void do_fmla(DisasContext *s, arg_rprrr_esz *a, gen_helper_sve_fmla *fn)
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index 6ccb4289fc..f82cef2d7e 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -321,6 +321,17 @@ UXTH 00000100 .. 010 011 101 ... ..... ..... @rd_pg_rn
 SXTW            00000100 .. 010 100 101 ... ..... .....  @rd_pg_rn
 UXTW            00000100 .. 010 101 101 ... ..... .....  @rd_pg_rn
 
+### SVE Floating Point Compare - Vectors Group
+
+# SVE floating-point compare vectors
+FCMGE_ppzz      01100101 .. 0 ..... 010 ... ..... 0 ....  @pd_pg_rn_rm
+FCMGT_ppzz      01100101 .. 0 ..... 010 ... ..... 1 ....  @pd_pg_rn_rm
+FCMEQ_ppzz      01100101 .. 0 ..... 011 ... ..... 0 ....  @pd_pg_rn_rm
+FCMNE_ppzz      01100101 .. 0 ..... 011 ... ..... 1 ....  @pd_pg_rn_rm
+FCMUO_ppzz      01100101 .. 0 ..... 110 ... ..... 0 ....  @pd_pg_rn_rm
+FACGE_ppzz      01100101 .. 0 ..... 110 ... ..... 1 ....  @pd_pg_rn_rm
+FACGT_ppzz      01100101 .. 0 ..... 111 ... ..... 1 ....  @pd_pg_rn_rm
+
 ### SVE Integer Multiply-Add Group
 
 # SVE integer multiply-add writing addend (predicated)

From patchwork Sat Feb 17 18:23:14 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 128714
Delivered-To: patch@linaro.org
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Sat, 17 Feb 2018 10:23:14 -0800
Message-Id: <20180217182323.25885-59-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.14.3
In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org>
References: <20180217182323.25885-1-richard.henderson@linaro.org>
X-Received-From: 2607:f8b0:400e:c05::243
Subject: [Qemu-devel] [PATCH v2 58/67] target/arm: Implement SVE floating-point arithmetic with immediate
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.21
Precedence: list
Cc: qemu-arm@nongnu.org
Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org
Sender: "Qemu-devel"

Signed-off-by: Richard Henderson
---
 target/arm/helper-sve.h    | 56 +++++++++++++++++++++++++++++++++++
 target/arm/sve_helper.c    | 68 ++++++++++++++++++++++++++++++++++++++++++
 target/arm/translate-sve.c | 73 ++++++++++++++++++++++++++++++++++++++++++++++
 target/arm/sve.decode      | 14 +++++++++
 4 files changed, 211 insertions(+)

--
2.14.3

Reviewed-by: Peter Maydell

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index 30373e3fc7..7ada12687b 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -809,6 +809,62 @@ DEF_HELPER_FLAGS_6(sve_fmulx_s, TCG_CALL_NO_RWG,
 DEF_HELPER_FLAGS_6(sve_fmulx_d, TCG_CALL_NO_RWG,
                    void, ptr, ptr, ptr, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_6(sve_fadds_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, i64, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fadds_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, i64, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fadds_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, i64, ptr, i32)
+
+DEF_HELPER_FLAGS_6(sve_fsubs_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, i64, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fsubs_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, i64, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fsubs_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, i64, ptr, i32)
+
+DEF_HELPER_FLAGS_6(sve_fmuls_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, i64, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fmuls_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, i64, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fmuls_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, i64, ptr, i32)
+
+DEF_HELPER_FLAGS_6(sve_fsubrs_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, i64, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fsubrs_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, i64, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fsubrs_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, i64, ptr, i32)
+
+DEF_HELPER_FLAGS_6(sve_fmaxnms_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, i64, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fmaxnms_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, i64, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fmaxnms_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, i64, ptr, i32)
+
+DEF_HELPER_FLAGS_6(sve_fminnms_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, i64, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fminnms_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, i64, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fminnms_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, i64, ptr, i32)
+
+DEF_HELPER_FLAGS_6(sve_fmaxs_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, i64, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fmaxs_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, i64, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fmaxs_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, i64, ptr, i32)
+
+DEF_HELPER_FLAGS_6(sve_fmins_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, i64, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fmins_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, i64, ptr, i32)
+DEF_HELPER_FLAGS_6(sve_fmins_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, i64, ptr, i32)
+
 DEF_HELPER_FLAGS_5(sve_scvt_hh, TCG_CALL_NO_RWG,
                    void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(sve_scvt_sh, TCG_CALL_NO_RWG,
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index ace613684d..9378c8f0b2 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -2995,6 +2995,74 @@ DO_ZPZZ_FP_D(sve_fmulx_d, uint64_t, helper_vfp_mulxd)
 #undef DO_ZPZZ_FP
 #undef DO_ZPZZ_FP_D
 
+/* Three-operand expander, with one scalar operand, controlled by
+ * a predicate, with the extra float_status parameter.
+ */
+#define DO_ZPZS_FP(NAME, TYPE, H, OP) \
+void HELPER(NAME)(void *vd, void *vn, void *vg, uint64_t scalar, \
+                  void *status, uint32_t desc) \
+{ \
+    intptr_t i, opr_sz = simd_oprsz(desc); \
+    TYPE mm = scalar; \
+    for (i = 0; i < opr_sz; ) { \
+        uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \
+        do { \
+            if (pg & 1) { \
+                TYPE nn = *(TYPE *)(vn + H(i)); \
+                *(TYPE *)(vd + H(i)) = OP(nn, mm, status); \
+            } \
+            i += sizeof(TYPE), pg >>= sizeof(TYPE); \
+        } while (i & 15); \
+    } \
+}
+
+DO_ZPZS_FP(sve_fadds_h, float16, H1_2, float16_add)
+DO_ZPZS_FP(sve_fadds_s, float32, H1_4, float32_add)
+DO_ZPZS_FP(sve_fadds_d, float64, , float64_add)
+
+DO_ZPZS_FP(sve_fsubs_h, float16, H1_2, float16_sub)
+DO_ZPZS_FP(sve_fsubs_s, float32, H1_4, float32_sub)
+DO_ZPZS_FP(sve_fsubs_d, float64, , float64_sub)
+
+DO_ZPZS_FP(sve_fmuls_h, float16, H1_2, float16_mul)
+DO_ZPZS_FP(sve_fmuls_s, float32, H1_4, float32_mul)
+DO_ZPZS_FP(sve_fmuls_d, float64, , float64_mul)
+
+static inline float16 subr_h(float16 a, float16 b, float_status *s)
+{
+    return float16_sub(b, a, s);
+}
+
+static inline float32 subr_s(float32 a, float32 b, float_status *s)
+{
+    return float32_sub(b, a, s);
+}
+
+static inline float64 subr_d(float64 a, float64 b, float_status *s)
+{
+    return float64_sub(b, a, s);
+}
+
+DO_ZPZS_FP(sve_fsubrs_h, float16, H1_2, subr_h)
+DO_ZPZS_FP(sve_fsubrs_s, float32, H1_4, subr_s)
+DO_ZPZS_FP(sve_fsubrs_d, float64, , subr_d)
+
+DO_ZPZS_FP(sve_fmaxnms_h, float16, H1_2, float16_maxnum)
+DO_ZPZS_FP(sve_fmaxnms_s, float32, H1_4, float32_maxnum)
+DO_ZPZS_FP(sve_fmaxnms_d, float64, , float64_maxnum)
+
+DO_ZPZS_FP(sve_fminnms_h, float16, H1_2, float16_minnum)
+DO_ZPZS_FP(sve_fminnms_s, float32, H1_4, float32_minnum)
+DO_ZPZS_FP(sve_fminnms_d, float64, , float64_minnum)
+
+DO_ZPZS_FP(sve_fmaxs_h, float16, H1_2, float16_max)
+DO_ZPZS_FP(sve_fmaxs_s, float32, H1_4, float32_max)
+DO_ZPZS_FP(sve_fmaxs_d, float64, , float64_max)
+
+DO_ZPZS_FP(sve_fmins_h, float16, H1_2, float16_min)
+DO_ZPZS_FP(sve_fmins_s, float32, H1_4, float32_min)
+DO_ZPZS_FP(sve_fmins_d, float64, , float64_min)
+
 /* Fully general two-operand expander, controlled by a predicate,
  * With the extra float_status parameter.
  */
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index 8fcb9dd2be..6ce1b01b9a 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -32,6 +32,7 @@
 #include "exec/log.h"
 #include "trace-tcg.h"
 #include "translate-a64.h"
+#include "fpu/softfloat.h"
 
 typedef void GVecGen2Fn(unsigned, uint32_t, uint32_t, uint32_t, uint32_t);
 typedef void GVecGen2iFn(unsigned, uint32_t, uint32_t,
@@ -3265,6 +3266,78 @@ DO_FP3(FMULX, fmulx)
 
 #undef DO_FP3
 
+typedef void gen_helper_sve_fp2scalar(TCGv_ptr, TCGv_ptr, TCGv_ptr,
+                                      TCGv_i64, TCGv_ptr, TCGv_i32);
+
+static void do_fp_scalar(DisasContext *s, int zd, int zn, int pg, bool is_fp16,
+                         TCGv_i64 scalar, gen_helper_sve_fp2scalar *fn)
+{
+    unsigned vsz = vec_full_reg_size(s);
+    TCGv_ptr t_zd, t_zn, t_pg, status;
+    TCGv_i32 desc;
+
+    t_zd = tcg_temp_new_ptr();
+    t_zn = tcg_temp_new_ptr();
+    t_pg = tcg_temp_new_ptr();
+    tcg_gen_addi_ptr(t_zd, cpu_env, vec_full_reg_offset(s, zd));
+    tcg_gen_addi_ptr(t_zn, cpu_env, vec_full_reg_offset(s, zn));
+    tcg_gen_addi_ptr(t_pg, cpu_env, vec_full_reg_offset(s, pg));
+
+    status = get_fpstatus_ptr(is_fp16);
+    desc = tcg_const_i32(simd_desc(vsz, vsz, 0));
+    fn(t_zd, t_zn, t_pg, scalar, status, desc);
+
+    tcg_temp_free_i32(desc);
+    tcg_temp_free_ptr(status);
+    tcg_temp_free_ptr(t_pg);
+    tcg_temp_free_ptr(t_zn);
+    tcg_temp_free_ptr(t_zd);
+}
+
+static void do_fp_imm(DisasContext *s, arg_rpri_esz *a, uint64_t imm,
+                      gen_helper_sve_fp2scalar *fn)
+{
+    TCGv_i64 temp = tcg_const_i64(imm);
+    do_fp_scalar(s, a->rd, a->rn, a->pg, a->esz == MO_16, temp, fn);
+    tcg_temp_free_i64(temp);
+}
+
+#define DO_FP_IMM(NAME, name, const0, const1) \
+static void trans_##NAME##_zpzi(DisasContext *s, arg_rpri_esz *a, \
+                                uint32_t insn) \
+{ \
+    static gen_helper_sve_fp2scalar * const fns[3] = { \
+        gen_helper_sve_##name##_h, \
+        gen_helper_sve_##name##_s, \
+        gen_helper_sve_##name##_d \
+    }; \
+    static uint64_t const val[3][2] = { \
+        { float16_##const0, float16_##const1 }, \
+        { float32_##const0, float32_##const1 }, \
+        { float64_##const0, float64_##const1 }, \
+    }; \
+    if (a->esz == 0) { \
+        unallocated_encoding(s); \
+        return; \
+    } \
+    do_fp_imm(s, a, val[a->esz - 1][a->imm], fns[a->esz - 1]); \
+}
+
+#define float16_two make_float16(0x4000)
+#define float32_two make_float32(0x40000000)
+#define float64_two make_float64(0x4000000000000000ULL)
+
+DO_FP_IMM(FADD, fadds, half, one)
+DO_FP_IMM(FSUB, fsubs, half, one)
+DO_FP_IMM(FMUL, fmuls, half, two)
+DO_FP_IMM(FSUBR, fsubrs, half, one)
+DO_FP_IMM(FMAXNM, fmaxnms, zero, one)
+DO_FP_IMM(FMINNM, fminnms, zero, one)
+DO_FP_IMM(FMAX, fmaxs, zero, one)
+DO_FP_IMM(FMIN, fmins, zero, one)
+
+#undef DO_FP_IMM
+
 static void do_fp_cmp(DisasContext *s, arg_rprr_esz *a,
                       gen_helper_gvec_4_ptr *fn)
 {
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index f82cef2d7e..258d14b729 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -161,6 +161,10 @@
 @rdn_pg4        ........ esz:2 .. pg:4 ... ........ rd:5 \
                 &rpri_esz rn=%reg_movprfx
 
+# Two register operand, one one-bit floating-point operand.
+@rdn_i1         ........ esz:2 ......... pg:3 .... imm:1 rd:5 \
+                &rpri_esz rn=%reg_movprfx
+
 # Two register operand, one encoded bitmask.
 @rdn_dbm        ........ .. .... dbm:13 rd:5 \
                 &rr_dbm rn=%reg_movprfx
@@ -748,6 +752,16 @@ FMULX 01100101 .. 00 1010 100 ... ..... ..... @rdn_pg_rm
 FDIV            01100101 .. 00 1100 100 ... ..... .....  @rdm_pg_rn # FDIVR
 FDIV            01100101 .. 00 1101 100 ... ..... .....  @rdn_pg_rm
 
+# SVE floating-point arithmetic with immediate (predicated)
+FADD_zpzi       01100101 .. 011 000 100 ... 0000 . .....  @rdn_i1
+FSUB_zpzi       01100101 .. 011 001 100 ... 0000 . .....  @rdn_i1
+FMUL_zpzi       01100101 .. 011 010 100 ... 0000 . .....  @rdn_i1
+FSUBR_zpzi      01100101 .. 011 011 100 ... 0000 . .....  @rdn_i1
+FMAXNM_zpzi     01100101 .. 011 100 100 ... 0000 . .....  @rdn_i1
+FMINNM_zpzi     01100101 .. 011 101 100 ... 0000 . .....  @rdn_i1
+FMAX_zpzi       01100101 .. 011 110 100 ... 0000 . .....  @rdn_i1
+FMIN_zpzi       01100101 .. 011 111 100 ... 0000 . .....  @rdn_i1
+
 ### SVE FP Multiply-Add Group
 
 # SVE floating-point multiply-accumulate writing addend

From patchwork Sat Feb 17 18:23:15 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 128718
Delivered-To: patch@linaro.org
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Sat, 17 Feb 2018 10:23:15 -0800
Message-Id: <20180217182323.25885-60-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.14.3
In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org>
References: <20180217182323.25885-1-richard.henderson@linaro.org>
X-Received-From: 2607:f8b0:400e:c00::243 Subject: [Qemu-devel] [PATCH v2 59/67] target/arm: Implement SVE Floating Point Multiply Indexed Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper.h | 14 ++++++++++ target/arm/translate-sve.c | 44 +++++++++++++++++++++++++++++++ target/arm/vec_helper.c | 64 ++++++++++++++++++++++++++++++++++++++++++++++ target/arm/sve.decode | 19 ++++++++++++++ 4 files changed, 141 insertions(+) -- 2.14.3 Reviewed-by: Peter Maydell diff --git a/target/arm/helper.h b/target/arm/helper.h index f3ce58e276..a8d824b085 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -584,6 +584,20 @@ DEF_HELPER_FLAGS_5(gvec_ftsmul_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(gvec_ftsmul_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_fmul_idx_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_fmul_idx_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_fmul_idx_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(gvec_fmla_idx_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(gvec_fmla_idx_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(gvec_fmla_idx_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + #ifdef TARGET_AARCH64 #include "helper-a64.h" #include "helper-sve.h" diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 6ce1b01b9a..cf2a4d3284 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3136,6 +3136,50 @@ DO_ZZI(UMIN, umin) #undef DO_ZZI +/* + *** SVE Floating Point Multiply-Add Indexed Group + */ + +static void trans_FMLA_zzxz(DisasContext *s, arg_FMLA_zzxz *a, 
uint32_t insn) +{ + static gen_helper_gvec_4_ptr * const fns[3] = { + gen_helper_gvec_fmla_idx_h, + gen_helper_gvec_fmla_idx_s, + gen_helper_gvec_fmla_idx_d, + }; + unsigned vsz = vec_full_reg_size(s); + TCGv_ptr status = get_fpstatus_ptr(a->esz == MO_16); + + tcg_gen_gvec_4_ptr(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + vec_full_reg_offset(s, a->ra), + status, vsz, vsz, a->index * 2 + a->sub, + fns[a->esz - 1]); + tcg_temp_free_ptr(status); +} + +/* + *** SVE Floating Point Multiply Indexed Group + */ + +static void trans_FMUL_zzx(DisasContext *s, arg_FMUL_zzx *a, uint32_t insn) +{ + static gen_helper_gvec_3_ptr * const fns[3] = { + gen_helper_gvec_fmul_idx_h, + gen_helper_gvec_fmul_idx_s, + gen_helper_gvec_fmul_idx_d, + }; + unsigned vsz = vec_full_reg_size(s); + TCGv_ptr status = get_fpstatus_ptr(a->esz == MO_16); + + tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + status, vsz, vsz, a->index, fns[a->esz - 1]); + tcg_temp_free_ptr(status); +} + /* *** SVE Floating Point Accumulating Reduction Group */ diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index ad5c29cdd5..e711a3217d 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -24,6 +24,22 @@ #include "fpu/softfloat.h" +/* Note that vector data is stored in host-endian 64-bit chunks, + so addressing units smaller than that needs a host-endian fixup. */ +#ifdef HOST_WORDS_BIGENDIAN +#define H1(x) ((x) ^ 7) +#define H1_2(x) ((x) ^ 6) +#define H1_4(x) ((x) ^ 4) +#define H2(x) ((x) ^ 3) +#define H4(x) ((x) ^ 1) +#else +#define H1(x) (x) +#define H1_2(x) (x) +#define H1_4(x) (x) +#define H2(x) (x) +#define H4(x) (x) +#endif + /* Floating-point trigonometric starting value. * See the ARM ARM pseudocode function FPTrigSMul. 
*/ @@ -92,3 +108,51 @@ DO_3OP(gvec_rsqrts_d, helper_rsqrtsf_f64, float64) #endif #undef DO_3OP + +/* For the indexed ops, SVE applies the index per 128-bit vector segment. + * For AdvSIMD, there is of course only one such vector segment. + */ + +#define DO_MUL_IDX(NAME, TYPE, H) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, void *stat, uint32_t desc) \ +{ \ + intptr_t i, j, oprsz = simd_oprsz(desc), segment = 16 / sizeof(TYPE); \ + intptr_t idx = simd_data(desc); \ + TYPE *d = vd, *n = vn, *m = vm; \ + for (i = 0; i < oprsz / sizeof(TYPE); i += segment) { \ + TYPE mm = m[H(i + idx)]; \ + for (j = 0; j < segment; j++) { \ + d[i + j] = TYPE##_mul(n[i + j], mm, stat); \ + } \ + } \ +} + +DO_MUL_IDX(gvec_fmul_idx_h, float16, H2) +DO_MUL_IDX(gvec_fmul_idx_s, float32, H4) +DO_MUL_IDX(gvec_fmul_idx_d, float64, ) + +#undef DO_MUL_IDX + +#define DO_FMLA_IDX(NAME, TYPE, H) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, void *va, \ + void *stat, uint32_t desc) \ +{ \ + intptr_t i, j, oprsz = simd_oprsz(desc), segment = 16 / sizeof(TYPE); \ + TYPE op1_neg = extract32(desc, SIMD_DATA_SHIFT, 1); \ + intptr_t idx = desc >> (SIMD_DATA_SHIFT + 1); \ + TYPE *d = vd, *n = vn, *m = vm, *a = va; \ + op1_neg <<= (8 * sizeof(TYPE) - 1); \ + for (i = 0; i < oprsz / sizeof(TYPE); i += segment) { \ + TYPE mm = m[H(i + idx)]; \ + for (j = 0; j < segment; j++) { \ + d[i + j] = TYPE##_muladd(n[i + j] ^ op1_neg, \ + mm, a[i + j], 0, stat); \ + } \ + } \ +} + +DO_FMLA_IDX(gvec_fmla_idx_h, float16, H2) +DO_FMLA_IDX(gvec_fmla_idx_s, float32, H4) +DO_FMLA_IDX(gvec_fmla_idx_d, float64, ) + +#undef DO_FMLA_IDX diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 258d14b729..d16e733aa3 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -30,6 +30,7 @@ %preg4_5 5:4 %size_23 23:2 %dtype_23_13 23:2 13:2 +%index3_22_19 22:1 19:2 # A combination of tsz:imm3 -- extract esize. %tszimm_esz 22:2 5:5 !function=tszimm_esz @@ -720,6 +721,24 @@ UMIN_zzi 00100101 .. 
101 011 110 ........ ..... @rdn_i8u # SVE integer multiply immediate (unpredicated) MUL_zzi 00100101 .. 110 000 110 ........ ..... @rdn_i8s +### SVE FP Multiply-Add Indexed Group + +# SVE floating-point multiply-add (indexed) +FMLA_zzxz 01100100 0.1 .. rm:3 00000 sub:1 rn:5 rd:5 \ + ra=%reg_movprfx index=%index3_22_19 esz=1 +FMLA_zzxz 01100100 101 index:2 rm:3 00000 sub:1 rn:5 rd:5 \ + ra=%reg_movprfx esz=2 +FMLA_zzxz 01100100 111 index:1 rm:4 00000 sub:1 rn:5 rd:5 \ + ra=%reg_movprfx esz=3 + +### SVE FP Multiply Indexed Group + +# SVE floating-point multiply (indexed) +FMUL_zzx 01100100 0.1 .. rm:3 001000 rn:5 rd:5 \ + index=%index3_22_19 esz=1 +FMUL_zzx 01100100 101 index:2 rm:3 001000 rn:5 rd:5 esz=2 +FMUL_zzx 01100100 111 index:1 rm:4 001000 rn:5 rd:5 esz=3 + ### SVE FP Accumulating Reduction Group # SVE floating-point serial reduction (predicated)
From patchwork Sat Feb 17 18:23:16 2018
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 128732
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Sat, 17 Feb 2018 10:23:16 -0800
Message-Id: <20180217182323.25885-61-richard.henderson@linaro.org>
In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org>
References: <20180217182323.25885-1-richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH v2 60/67] target/arm: Implement SVE FP Fast Reduction Group
Cc: qemu-arm@nongnu.org
Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 35 ++++++++++++++++++++++++++ target/arm/sve_helper.c | 61 ++++++++++++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 55 +++++++++++++++++++++++++++++++++++++++++ target/arm/sve.decode | 8 ++++++ 4 files changed, 159 insertions(+) -- 2.14.3 Reviewed-by: Peter Maydell diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 7ada12687b..c07b2245ba 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -725,6 +725,41 @@ DEF_HELPER_FLAGS_5(gvec_rsqrts_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(gvec_rsqrts_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_faddv_h, TCG_CALL_NO_RWG, + i64, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_faddv_s, TCG_CALL_NO_RWG, + i64, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_faddv_d, TCG_CALL_NO_RWG, + i64, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve_fmaxnmv_h, TCG_CALL_NO_RWG, + i64, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_fmaxnmv_s, TCG_CALL_NO_RWG, + i64, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_fmaxnmv_d, TCG_CALL_NO_RWG, + i64, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve_fminnmv_h, TCG_CALL_NO_RWG, + i64, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_fminnmv_s, TCG_CALL_NO_RWG, + i64, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_fminnmv_d, TCG_CALL_NO_RWG, + i64, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve_fmaxv_h, TCG_CALL_NO_RWG, + i64, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_fmaxv_s, TCG_CALL_NO_RWG, + i64, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_fmaxv_d, TCG_CALL_NO_RWG, + i64, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve_fminv_h, TCG_CALL_NO_RWG, + i64, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_fminv_s, TCG_CALL_NO_RWG, + i64, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_fminv_d, TCG_CALL_NO_RWG, + i64, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_fadda_h, TCG_CALL_NO_RWG, i64, i64, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_fadda_s, TCG_CALL_NO_RWG, diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 9378c8f0b2..29deefcd86 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -2832,6 +2832,67 @@ uint32_t HELPER(sve_while)(void *vd, uint32_t count, uint32_t pred_desc) return predtest_ones(d, oprsz, esz_mask); } +/* Recursive reduction on a function; + * C.f. the ARM ARM function ReducePredicated. + * + * While it would be possible to write this without the DATA temporary, + * it is much simpler to process the predicate register this way. + * The recursion is bounded to depth 7 (128 fp16 elements), so there's + * little to gain with a more complex non-recursive form. 
+ */ +#define DO_REDUCE(NAME, TYPE, H, FUNC, IDENT) \ +static TYPE NAME##_reduce(TYPE *data, float_status *status, uintptr_t n) \ +{ \ + if (n == 1) { \ + return *data; \ + } else { \ + uintptr_t half = n / 2; \ + TYPE lo = NAME##_reduce(data, status, half); \ + TYPE hi = NAME##_reduce(data + half, status, half); \ + return TYPE##_##FUNC(lo, hi, status); \ + } \ +} \ +uint64_t HELPER(NAME)(void *vn, void *vg, void *vs, uint32_t desc) \ +{ \ + uintptr_t i, oprsz = simd_oprsz(desc), maxsz = simd_maxsz(desc); \ + TYPE data[sizeof(ARMVectorReg) / sizeof(TYPE)]; \ + for (i = 0; i < oprsz; ) { \ + uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \ + do { \ + TYPE nn = *(TYPE *)(vn + H(i)); \ + *(TYPE *)((void *)data + i) = (pg & 1 ? nn : IDENT); \ + i += sizeof(TYPE), pg >>= sizeof(TYPE); \ + } while (i & 15); \ + } \ + for (; i < maxsz; i += sizeof(TYPE)) { \ + *(TYPE *)((void *)data + i) = IDENT; \ + } \ + return NAME##_reduce(data, vs, maxsz / sizeof(TYPE)); \ +} + +DO_REDUCE(sve_faddv_h, float16, H1_2, add, float16_zero) +DO_REDUCE(sve_faddv_s, float32, H1_4, add, float32_zero) +DO_REDUCE(sve_faddv_d, float64, , add, float64_zero) + +/* Identity is floatN_default_nan, without the function call. 
*/ +DO_REDUCE(sve_fminnmv_h, float16, H1_2, minnum, 0x7E00) +DO_REDUCE(sve_fminnmv_s, float32, H1_4, minnum, 0x7FC00000) +DO_REDUCE(sve_fminnmv_d, float64, , minnum, 0x7FF8000000000000ULL) + +DO_REDUCE(sve_fmaxnmv_h, float16, H1_2, maxnum, 0x7E00) +DO_REDUCE(sve_fmaxnmv_s, float32, H1_4, maxnum, 0x7FC00000) +DO_REDUCE(sve_fmaxnmv_d, float64, , maxnum, 0x7FF8000000000000ULL) + +DO_REDUCE(sve_fminv_h, float16, H1_2, min, float16_infinity) +DO_REDUCE(sve_fminv_s, float32, H1_4, min, float32_infinity) +DO_REDUCE(sve_fminv_d, float64, , min, float64_infinity) + +DO_REDUCE(sve_fmaxv_h, float16, H1_2, max, float16_chs(float16_infinity)) +DO_REDUCE(sve_fmaxv_s, float32, H1_4, max, float32_chs(float32_infinity)) +DO_REDUCE(sve_fmaxv_d, float64, , max, float64_chs(float64_infinity)) + +#undef DO_REDUCE + uint64_t HELPER(sve_fadda_h)(uint64_t nn, void *vm, void *vg, void *status, uint32_t desc) { diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index cf2a4d3284..a77ddf0f4b 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3180,6 +3180,61 @@ static void trans_FMUL_zzx(DisasContext *s, arg_FMUL_zzx *a, uint32_t insn) tcg_temp_free_ptr(status); } +/* + *** SVE Floating Point Fast Reduction Group + */ + +typedef void gen_helper_fp_reduce(TCGv_i64, TCGv_ptr, TCGv_ptr, + TCGv_ptr, TCGv_i32); + +static void do_reduce(DisasContext *s, arg_rpr_esz *a, + gen_helper_fp_reduce *fn) +{ + unsigned vsz = vec_full_reg_size(s); + unsigned p2vsz = pow2ceil(vsz); + TCGv_i32 t_desc = tcg_const_i32(simd_desc(vsz, p2vsz, 0)); + TCGv_ptr t_zn, t_pg, status; + TCGv_i64 temp; + + temp = tcg_temp_new_i64(); + t_zn = tcg_temp_new_ptr(); + t_pg = tcg_temp_new_ptr(); + + tcg_gen_addi_ptr(t_zn, cpu_env, vec_full_reg_offset(s, a->rn)); + tcg_gen_addi_ptr(t_pg, cpu_env, pred_full_reg_offset(s, a->pg)); + status = get_fpstatus_ptr(a->esz == MO_16); + + fn(temp, t_zn, t_pg, status, t_desc); + tcg_temp_free_ptr(t_zn); + tcg_temp_free_ptr(t_pg); + 
tcg_temp_free_ptr(status); + tcg_temp_free_i32(t_desc); + + write_fp_dreg(s, a->rd, temp); + tcg_temp_free_i64(temp); +} + +#define DO_VPZ(NAME, name) \ +static void trans_##NAME(DisasContext *s, arg_rpr_esz *a, uint32_t insn) \ +{ \ + static gen_helper_fp_reduce * const fns[3] = { \ + gen_helper_sve_##name##_h, \ + gen_helper_sve_##name##_s, \ + gen_helper_sve_##name##_d, \ + }; \ + if (a->esz == 0) { \ + unallocated_encoding(s); \ + return; \ + } \ + do_reduce(s, a, fns[a->esz - 1]); \ +} + +DO_VPZ(FADDV, faddv) +DO_VPZ(FMINNMV, fminnmv) +DO_VPZ(FMAXNMV, fmaxnmv) +DO_VPZ(FMINV, fminv) +DO_VPZ(FMAXV, fmaxv) + /* *** SVE Floating Point Accumulating Reduction Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index d16e733aa3..feb8c65e89 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -739,6 +739,14 @@ FMUL_zzx 01100100 0.1 .. rm:3 001000 rn:5 rd:5 \ FMUL_zzx 01100100 101 index:2 rm:3 001000 rn:5 rd:5 esz=2 FMUL_zzx 01100100 111 index:1 rm:4 001000 rn:5 rd:5 esz=3 +### SVE FP Fast Reduction Group + +FADDV 01100101 .. 000 000 001 ... ..... ..... @rd_pg_rn +FMAXNMV 01100101 .. 000 100 001 ... ..... ..... @rd_pg_rn +FMINNMV 01100101 .. 000 101 001 ... ..... ..... @rd_pg_rn +FMAXV 01100101 .. 000 110 001 ... ..... ..... @rd_pg_rn +FMINV 01100101 .. 000 111 001 ... ..... ..... 
@rd_pg_rn + ### SVE FP Accumulating Reduction Group # SVE floating-point serial reduction (predicated)
From patchwork Sat Feb 17 18:23:17 2018
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 128726
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Sat, 17 Feb 2018 10:23:17 -0800
Message-Id: <20180217182323.25885-62-richard.henderson@linaro.org>
In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org>
References: <20180217182323.25885-1-richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH v2 61/67] target/arm: Implement SVE Floating Point Unary Operations - Unpredicated Group
Cc: qemu-arm@nongnu.org
Signed-off-by: Richard Henderson --- target/arm/helper.h | 8 ++++++++ target/arm/translate-sve.c | 43 +++++++++++++++++++++++++++++++++++++++++++ target/arm/vec_helper.c | 20 ++++++++++++++++++++ target/arm/sve.decode | 5 +++++ 4 files changed, 76 insertions(+) -- 2.14.3 Reviewed-by: Peter Maydell diff --git a/target/arm/helper.h b/target/arm/helper.h index a8d824b085..4bfefe42b2 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -565,6 +565,14 @@ DEF_HELPER_2(dc_zva, void, env, i64) DEF_HELPER_FLAGS_2(neon_pmull_64_lo, TCG_CALL_NO_RWG_SE, i64, i64, i64) DEF_HELPER_FLAGS_2(neon_pmull_64_hi, TCG_CALL_NO_RWG_SE, i64, i64, i64) +DEF_HELPER_FLAGS_4(gvec_frecpe_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_frecpe_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_frecpe_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(gvec_frsqrte_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_frsqrte_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_frsqrte_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(gvec_fadd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(gvec_fadd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(gvec_fadd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index a77ddf0f4b..463ff7b690 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3235,6 +3235,49 @@ DO_VPZ(FMAXNMV,
fmaxnmv) DO_VPZ(FMINV, fminv) DO_VPZ(FMAXV, fmaxv) +/* + *** SVE Floating Point Unary Operations - Unpredicated Group + */ + +static void do_zz_fp(DisasContext *s, arg_rr_esz *a, gen_helper_gvec_2_ptr *fn) +{ + unsigned vsz = vec_full_reg_size(s); + TCGv_ptr status = get_fpstatus_ptr(a->esz == MO_16); + + tcg_gen_gvec_2_ptr(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + status, vsz, vsz, 0, fn); + tcg_temp_free_ptr(status); +} + +static void trans_FRECPE(DisasContext *s, arg_rr_esz *a, uint32_t insn) +{ + static gen_helper_gvec_2_ptr * const fns[3] = { + gen_helper_gvec_frecpe_h, + gen_helper_gvec_frecpe_s, + gen_helper_gvec_frecpe_d, + }; + if (a->esz == 0) { + unallocated_encoding(s); + } else { + do_zz_fp(s, a, fns[a->esz - 1]); + } +} + +static void trans_FRSQRTE(DisasContext *s, arg_rr_esz *a, uint32_t insn) +{ + static gen_helper_gvec_2_ptr * const fns[3] = { + gen_helper_gvec_frsqrte_h, + gen_helper_gvec_frsqrte_s, + gen_helper_gvec_frsqrte_d, + }; + if (a->esz == 0) { + unallocated_encoding(s); + } else { + do_zz_fp(s, a, fns[a->esz - 1]); + } +} + /* *** SVE Floating Point Accumulating Reduction Group */ diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index e711a3217d..60dc07cf87 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -40,6 +40,26 @@ #define H4(x) (x) #endif +#define DO_2OP(NAME, FUNC, TYPE) \ +void HELPER(NAME)(void *vd, void *vn, void *stat, uint32_t desc) \ +{ \ + intptr_t i, oprsz = simd_oprsz(desc); \ + TYPE *d = vd, *n = vn; \ + for (i = 0; i < oprsz / sizeof(TYPE); i++) { \ + d[i] = FUNC(n[i], stat); \ + } \ +} + +DO_2OP(gvec_frecpe_h, helper_recpe_f16, float16) +DO_2OP(gvec_frecpe_s, helper_recpe_f32, float32) +DO_2OP(gvec_frecpe_d, helper_recpe_f64, float64) + +DO_2OP(gvec_frsqrte_h, helper_rsqrte_f16, float16) +DO_2OP(gvec_frsqrte_s, helper_rsqrte_f32, float32) +DO_2OP(gvec_frsqrte_d, helper_rsqrte_f64, float64) + +#undef DO_2OP + /* Floating-point trigonometric starting value. 
 * See the ARM ARM pseudocode function FPTrigSMul. */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index feb8c65e89..112e85174c 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -747,6 +747,11 @@ FMINNMV 01100101 .. 000 101 001 ... ..... ..... @rd_pg_rn FMAXV 01100101 .. 000 110 001 ... ..... ..... @rd_pg_rn FMINV 01100101 .. 000 111 001 ... ..... ..... @rd_pg_rn +## SVE Floating Point Unary Operations - Unpredicated Group + +FRECPE 01100101 .. 001 110 001110 ..... ..... @rd_rn +FRSQRTE 01100101 .. 001 111 001110 ..... ..... @rd_rn + ### SVE FP Accumulating Reduction Group # SVE floating-point serial reduction (predicated)
From patchwork Sat Feb 17 18:23:18 2018
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 128730
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Sat, 17 Feb 2018 10:23:18 -0800
Message-Id: <20180217182323.25885-63-richard.henderson@linaro.org>
In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org>
References: <20180217182323.25885-1-richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH v2 62/67] target/arm: Implement SVE FP Compare with Zero Group
Cc: qemu-arm@nongnu.org
Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 42 ++++++++++++++++++++++++++++++++++++++++++ target/arm/sve_helper.c | 45 +++++++++++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 41 +++++++++++++++++++++++++++++++++++++++++ target/arm/sve.decode | 10 ++++++++++ 4 files changed, 138 insertions(+) -- 2.14.3 Reviewed-by: Peter Maydell diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index c07b2245ba..696c97648b 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -767,6 +767,48 @@ DEF_HELPER_FLAGS_5(sve_fadda_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(sve_fadda_d, TCG_CALL_NO_RWG, i64, i64, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcmge0_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcmge0_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcmge0_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_fcmgt0_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcmgt0_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcmgt0_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_fcmlt0_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcmlt0_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcmlt0_d,
TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_fcmle0_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcmle0_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcmle0_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_fcmeq0_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcmeq0_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcmeq0_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_fcmne0_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcmne0_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcmne0_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_6(sve_fadd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_6(sve_fadd_s, TCG_CALL_NO_RWG, diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 29deefcd86..6a052ce9ad 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -3270,6 +3270,8 @@ void HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, \ #define DO_FCMGE(TYPE, X, Y, ST) TYPE##_compare(Y, X, ST) <= 0 #define DO_FCMGT(TYPE, X, Y, ST) TYPE##_compare(Y, X, ST) < 0 +#define DO_FCMLE(TYPE, X, Y, ST) TYPE##_compare(X, Y, ST) <= 0 +#define DO_FCMLT(TYPE, X, Y, ST) TYPE##_compare(X, Y, ST) < 0 #define DO_FCMEQ(TYPE, X, Y, ST) TYPE##_compare_quiet(X, Y, ST) == 0 #define DO_FCMNE(TYPE, X, Y, ST) TYPE##_compare_quiet(X, Y, ST) != 0 #define DO_FCMUO(TYPE, X, Y, ST) \ @@ -3293,6 +3295,49 @@ DO_FPCMP_PPZZ_ALL(sve_facgt, DO_FACGT) #undef DO_FPCMP_PPZZ_H #undef DO_FPCMP_PPZZ +/* One operand floating-point comparison against zero, controlled + * by a predicate. 
+ */ +#define DO_FPCMP_PPZ0(NAME, TYPE, H, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vg, \ + void *status, uint32_t desc) \ +{ \ + intptr_t opr_sz = simd_oprsz(desc); \ + intptr_t i = opr_sz, j = ((opr_sz - 1) & -64) >> 3; \ + do { \ + uint64_t out = 0; \ + uint64_t pg = *(uint64_t *)(vg + j); \ + do { \ + i -= sizeof(TYPE), out <<= sizeof(TYPE); \ + if ((pg >> (i & 63)) & 1) { \ + TYPE nn = *(TYPE *)(vn + H(i)); \ + out |= OP(TYPE, nn, 0, status); \ + } \ + } while (i & 63); \ + *(uint64_t *)(vd + j) = out; \ + j -= 8; \ + } while (i > 0); \ +} + +#define DO_FPCMP_PPZ0_H(NAME, OP) \ + DO_FPCMP_PPZ0(NAME##_h, float16, H1_2, OP) +#define DO_FPCMP_PPZ0_S(NAME, OP) \ + DO_FPCMP_PPZ0(NAME##_s, float32, H1_4, OP) +#define DO_FPCMP_PPZ0_D(NAME, OP) \ + DO_FPCMP_PPZ0(NAME##_d, float64, , OP) + +#define DO_FPCMP_PPZ0_ALL(NAME, OP) \ + DO_FPCMP_PPZ0_H(NAME, OP) \ + DO_FPCMP_PPZ0_S(NAME, OP) \ + DO_FPCMP_PPZ0_D(NAME, OP) + +DO_FPCMP_PPZ0_ALL(sve_fcmge0, DO_FCMGE) +DO_FPCMP_PPZ0_ALL(sve_fcmgt0, DO_FCMGT) +DO_FPCMP_PPZ0_ALL(sve_fcmle0, DO_FCMLE) +DO_FPCMP_PPZ0_ALL(sve_fcmlt0, DO_FCMLT) +DO_FPCMP_PPZ0_ALL(sve_fcmeq0, DO_FCMEQ) +DO_FPCMP_PPZ0_ALL(sve_fcmne0, DO_FCMNE) + /* * Load contiguous data, protected by a governing predicate. 
*/ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 463ff7b690..02655bff03 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3278,6 +3278,47 @@ static void trans_FRSQRTE(DisasContext *s, arg_rr_esz *a, uint32_t insn) } } +/* + *** SVE Floating Point Compare with Zero Group + */ + +static void do_ppz_fp(DisasContext *s, arg_rpr_esz *a, + gen_helper_gvec_3_ptr *fn) +{ + unsigned vsz = vec_full_reg_size(s); + TCGv_ptr status = get_fpstatus_ptr(a->esz == MO_16); + + tcg_gen_gvec_3_ptr(pred_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + pred_full_reg_offset(s, a->pg), + status, vsz, vsz, 0, fn); + tcg_temp_free_ptr(status); +} + +#define DO_PPZ(NAME, name) \ +static void trans_##NAME(DisasContext *s, arg_rpr_esz *a, uint32_t insn) \ +{ \ + static gen_helper_gvec_3_ptr * const fns[3] = { \ + gen_helper_sve_##name##_h, \ + gen_helper_sve_##name##_s, \ + gen_helper_sve_##name##_d, \ + }; \ + if (a->esz == 0) { \ + unallocated_encoding(s); \ + return; \ + } \ + do_ppz_fp(s, a, fns[a->esz - 1]); \ +} + +DO_PPZ(FCMGE_ppz0, fcmge0) +DO_PPZ(FCMGT_ppz0, fcmgt0) +DO_PPZ(FCMLE_ppz0, fcmle0) +DO_PPZ(FCMLT_ppz0, fcmlt0) +DO_PPZ(FCMEQ_ppz0, fcmeq0) +DO_PPZ(FCMNE_ppz0, fcmne0) + +#undef DO_PPZ + /* *** SVE Floating Point Accumulating Reduction Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 112e85174c..f4505ad0bf 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -141,6 +141,7 @@ # One register operand, with governing predicate, vector element size @rd_pg_rn ........ esz:2 ... ... ... pg:3 rn:5 rd:5 &rpr_esz @rd_pg4_pn ........ esz:2 ... ... .. pg:4 . rn:4 rd:5 &rpr_esz +@pd_pg_rn ........ esz:2 ... ... ... pg:3 rn:5 . rd:4 &rpr_esz # One register operand, with governing predicate, no vector element size @rd_pg_rn_e0 ........ .. ... ... ... pg:3 rn:5 rd:5 &rpr_esz esz=0 @@ -752,6 +753,15 @@ FMINV 01100101 .. 000 111 001 ... ..... ..... @rd_pg_rn FRECPE 01100101 .. 
001 110 001110 ..... ..... @rd_rn FRSQRTE 01100101 .. 001 111 001110 ..... ..... @rd_rn +### SVE FP Compare with Zero Group + +FCMGE_ppz0 01100101 .. 0100 00 001 ... ..... 0 .... @pd_pg_rn +FCMGT_ppz0 01100101 .. 0100 00 001 ... ..... 1 .... @pd_pg_rn +FCMLT_ppz0 01100101 .. 0100 01 001 ... ..... 0 .... @pd_pg_rn +FCMLE_ppz0 01100101 .. 0100 01 001 ... ..... 1 .... @pd_pg_rn +FCMEQ_ppz0 01100101 .. 0100 10 001 ... ..... 0 .... @pd_pg_rn +FCMNE_ppz0 01100101 .. 0100 11 001 ... ..... 0 .... @pd_pg_rn + ### SVE FP Accumulating Reduction Group # SVE floating-point serial reduction (predicated)
From patchwork Sat Feb 17 18:23:19 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 128739
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Sat, 17 Feb 2018 10:23:19 -0800
Message-Id: <20180217182323.25885-64-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.14.3
In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org>
References: <20180217182323.25885-1-richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH v2 63/67] target/arm: Implement SVE floating-point trig multiply-add coefficient
Cc: qemu-arm@nongnu.org
Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 4 +++ target/arm/sve_helper.c | 70 ++++++++++++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 26 +++++++++++++++++ target/arm/sve.decode | 3 ++ 4 files changed, 103 insertions(+) -- 2.14.3 Reviewed-by: Peter Maydell diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 696c97648b..ce5fe24dc2 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -1037,6 +1037,10 @@ DEF_HELPER_FLAGS_3(sve_fnmls_zpzzz_h, TCG_CALL_NO_RWG, void, env, ptr, i32) DEF_HELPER_FLAGS_3(sve_fnmls_zpzzz_s, TCG_CALL_NO_RWG, void, env, ptr, i32) DEF_HELPER_FLAGS_3(sve_fnmls_zpzzz_d, TCG_CALL_NO_RWG, void, env, ptr, i32) +DEF_HELPER_FLAGS_5(sve_ftmad_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_ftmad_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_ftmad_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_4(sve_ld1bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) DEF_HELPER_FLAGS_4(sve_ld2bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) DEF_HELPER_FLAGS_4(sve_ld3bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 6a052ce9ad..53e3516f47 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -3338,6 +3338,76 @@ DO_FPCMP_PPZ0_ALL(sve_fcmlt0, DO_FCMLT) DO_FPCMP_PPZ0_ALL(sve_fcmeq0, DO_FCMEQ) 
DO_FPCMP_PPZ0_ALL(sve_fcmne0, DO_FCMNE) +/* FP Trig Multiply-Add. */ + +void HELPER(sve_ftmad_h)(void *vd, void *vn, void *vm, void *vs, uint32_t desc) +{ + static const float16 coeff[16] = { + 0x3c00, 0xb155, 0x2030, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, + 0x3c00, 0xb800, 0x293a, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, + }; + intptr_t i, opr_sz = simd_oprsz(desc) / sizeof(float16); + intptr_t x = simd_data(desc); + float16 *d = vd, *n = vn, *m = vm; + for (i = 0; i < opr_sz; i++) { + float16 mm = m[i]; + intptr_t xx = x; + if (float16_is_neg(mm)) { + mm = float16_abs(mm); + xx += 8; + } + d[i] = float16_muladd(n[i], mm, coeff[xx], 0, vs); + } +} + +void HELPER(sve_ftmad_s)(void *vd, void *vn, void *vm, void *vs, uint32_t desc) +{ + static const float32 coeff[16] = { + 0x3f800000, 0xbe2aaaab, 0x3c088886, 0xb95008b9, + 0x36369d6d, 0x00000000, 0x00000000, 0x00000000, + 0x3f800000, 0xbf000000, 0x3d2aaaa6, 0xbab60705, + 0x37cd37cc, 0x00000000, 0x00000000, 0x00000000, + }; + intptr_t i, opr_sz = simd_oprsz(desc) / sizeof(float32); + intptr_t x = simd_data(desc); + float32 *d = vd, *n = vn, *m = vm; + for (i = 0; i < opr_sz; i++) { + float32 mm = m[i]; + intptr_t xx = x; + if (float32_is_neg(mm)) { + mm = float32_abs(mm); + xx += 8; + } + d[i] = float32_muladd(n[i], mm, coeff[xx], 0, vs); + } +} + +void HELPER(sve_ftmad_d)(void *vd, void *vn, void *vm, void *vs, uint32_t desc) +{ + static const float64 coeff[16] = { + 0x3ff0000000000000ull, 0xbfc5555555555543ull, + 0x3f8111111110f30cull, 0xbf2a01a019b92fc6ull, + 0x3ec71de351f3d22bull, 0xbe5ae5e2b60f7b91ull, + 0x3de5d8408868552full, 0x0000000000000000ull, + 0x3ff0000000000000ull, 0xbfe0000000000000ull, + 0x3fa5555555555536ull, 0xbf56c16c16c13a0bull, + 0x3efa01a019b1e8d8ull, 0xbe927e4f7282f468ull, + 0x3e21ee96d2641b13ull, 0xbda8f76380fbb401ull, + }; + intptr_t i, opr_sz = simd_oprsz(desc) / sizeof(float64); + intptr_t x = simd_data(desc); + float64 *d = vd, *n = vn, *m = vm; + for (i = 0; i < opr_sz; i++) { + float64 mm 
= m[i]; + intptr_t xx = x; + if (float64_is_neg(mm)) { + mm = float64_abs(mm); + xx += 8; + } + d[i] = float64_muladd(n[i], mm, coeff[xx], 0, vs); + } +} + /* * Load contiguous data, protected by a governing predicate. */ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 02655bff03..e185af29e3 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3319,6 +3319,32 @@ DO_PPZ(FCMNE_ppz0, fcmne0) #undef DO_PPZ +/* + *** SVE floating-point trig multiply-add coefficient + */ + +static void trans_FTMAD(DisasContext *s, arg_FTMAD *a, uint32_t insn) +{ + static gen_helper_gvec_3_ptr * const fns[3] = { + gen_helper_sve_ftmad_h, + gen_helper_sve_ftmad_s, + gen_helper_sve_ftmad_d, + }; + unsigned vsz = vec_full_reg_size(s); + TCGv_ptr status; + + if (a->esz == 0) { + unallocated_encoding(s); + return; + } + status = get_fpstatus_ptr(a->esz == MO_16); + tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + status, vsz, vsz, a->imm, fns[a->esz - 1]); + tcg_temp_free_ptr(status); +} + /* *** SVE Floating Point Accumulating Reduction Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index f4505ad0bf..ca54895900 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -804,6 +804,9 @@ FMINNM_zpzi 01100101 .. 011 101 100 ... 0000 . ..... @rdn_i1 FMAX_zpzi 01100101 .. 011 110 100 ... 0000 . ..... @rdn_i1 FMIN_zpzi 01100101 .. 011 111 100 ... 0000 . ..... 
@rdn_i1 +# SVE floating-point trig multiply-add coefficient +FTMAD 01100101 esz:2 010 imm:3 100000 rm:5 rd:5 rn=%reg_movprfx + ### SVE FP Multiply-Add Group # SVE floating-point multiply-accumulate writing addend
From patchwork Sat Feb 17 18:23:20 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 128723
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Sat, 17 Feb 2018 10:23:20 -0800
Message-Id: <20180217182323.25885-65-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.14.3
In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org>
References: <20180217182323.25885-1-richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH v2 64/67] target/arm: Implement SVE floating-point convert precision
Cc: qemu-arm@nongnu.org
Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 13 +++++++++++++ target/arm/sve_helper.c | 27 +++++++++++++++++++++++++++ target/arm/translate-sve.c | 30 ++++++++++++++++++++++++++++++ target/arm/sve.decode | 8 ++++++++ 4 files changed, 78 insertions(+) -- 2.14.3 Reviewed-by: Peter Maydell diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index ce5fe24dc2..bac4bfdc60 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -942,6 +942,19 @@ DEF_HELPER_FLAGS_6(sve_fmins_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_6(sve_fmins_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i64, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvt_sh, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvt_dh, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvt_hs, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvt_ds, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvt_hd, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvt_sd, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_scvt_hh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_scvt_sh, TCG_CALL_NO_RWG, diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 53e3516f47..9db01ac2f2 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -3157,6 +3157,33 @@ void HELPER(NAME)(void *vd, void *vn, void *vg, void *status, uint32_t desc) \ } \ } +static inline float32 float16_to_float32_ieee(float16 f, float_status *s) +{ + 
return float16_to_float32(f, true, s); +} + +static inline float64 float16_to_float64_ieee(float16 f, float_status *s) +{ + return float16_to_float64(f, true, s); +} + +static inline float16 float32_to_float16_ieee(float32 f, float_status *s) +{ + return float32_to_float16(f, true, s); +} + +static inline float16 float64_to_float16_ieee(float64 f, float_status *s) +{ + return float64_to_float16(f, true, s); +} + +DO_ZPZ_FP(sve_fcvt_sh, uint32_t, H1_4, float32_to_float16_ieee) +DO_ZPZ_FP(sve_fcvt_hs, uint32_t, H1_4, float16_to_float32_ieee) +DO_ZPZ_FP_D(sve_fcvt_dh, uint64_t, float64_to_float16_ieee) +DO_ZPZ_FP_D(sve_fcvt_hd, uint64_t, float16_to_float64_ieee) +DO_ZPZ_FP_D(sve_fcvt_ds, uint64_t, float64_to_float32) +DO_ZPZ_FP_D(sve_fcvt_sd, uint64_t, float32_to_float64) + DO_ZPZ_FP(sve_scvt_hh, uint16_t, H1_2, int16_to_float16) DO_ZPZ_FP(sve_scvt_sh, uint32_t, H1_4, int32_to_float16) DO_ZPZ_FP(sve_scvt_ss, uint32_t, H1_4, int32_to_float32) diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index e185af29e3..361d545965 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3651,6 +3651,36 @@ static void do_zpz_ptr(DisasContext *s, int rd, int rn, int pg, tcg_temp_free_ptr(status); } +static void trans_FCVT_sh(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_fcvt_sh); +} + +static void trans_FCVT_hs(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvt_hs); +} + +static void trans_FCVT_dh(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_fcvt_dh); +} + +static void trans_FCVT_hd(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvt_hd); +} + +static void trans_FCVT_ds(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvt_ds); +} + 
+static void trans_FCVT_sd(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvt_sd); +} + static void trans_SCVTF_hh(DisasContext *s, arg_rpr_esz *a, uint32_t insn) { do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_scvt_hh); diff --git a/target/arm/sve.decode b/target/arm/sve.decode index ca54895900..d44cf17fc8 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -824,6 +824,14 @@ FNMLS_zpzzz 01100101 .. 1 ..... 111 ... ..... ..... @rdn_pg_rm_ra ### SVE FP Unary Operations Predicated Group +# SVE floating-point convert precision +FCVT_sh 01100101 10 0010 00 101 ... ..... ..... @rd_pg_rn_e0 +FCVT_hs 01100101 10 0010 01 101 ... ..... ..... @rd_pg_rn_e0 +FCVT_dh 01100101 11 0010 00 101 ... ..... ..... @rd_pg_rn_e0 +FCVT_hd 01100101 11 0010 01 101 ... ..... ..... @rd_pg_rn_e0 +FCVT_ds 01100101 11 0010 10 101 ... ..... ..... @rd_pg_rn_e0 +FCVT_sd 01100101 11 0010 11 101 ... ..... ..... @rd_pg_rn_e0 + # SVE integer convert to floating-point SCVTF_hh 01100101 01 010 01 0 101 ... ..... ..... @rd_pg_rn_e0 SCVTF_sh 01100101 01 010 10 0 101 ... ..... ..... 
@rd_pg_rn_e0
From patchwork Sat Feb 17 18:23:21 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 128735
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Sat, 17 Feb 2018 10:23:21 -0800
Message-Id: <20180217182323.25885-66-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.14.3
In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org>
References: <20180217182323.25885-1-richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH v2 65/67] target/arm: Implement SVE floating-point convert to integer
Cc: qemu-arm@nongnu.org
Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 30 ++++++++++++++++++++ target/arm/sve_helper.c | 16 +++++++++++ target/arm/translate-sve.c | 70 ++++++++++++++++++++++++++++++++++++++++++++++ target/arm/sve.decode | 16 +++++++++++ 4 files changed, 132 insertions(+) -- 2.14.3 Reviewed-by: Peter Maydell diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index bac4bfdc60..0f5fea9045 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -955,6 +955,36 @@ DEF_HELPER_FLAGS_5(sve_fcvt_hd, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(sve_fcvt_sd, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvtzs_hh, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvtzs_hs, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvtzs_ss, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvtzs_ds, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvtzs_hd, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvtzs_sd, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvtzs_dd, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_fcvtzu_hh, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvtzu_hs, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvtzu_ss, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvtzu_ds, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) 
+DEF_HELPER_FLAGS_5(sve_fcvtzu_hd, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvtzu_sd, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_fcvtzu_dd, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_scvt_hh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_scvt_sh, TCG_CALL_NO_RWG, diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 9db01ac2f2..09f5c77254 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -3184,6 +3184,22 @@ DO_ZPZ_FP_D(sve_fcvt_hd, uint64_t, float16_to_float64_ieee) DO_ZPZ_FP_D(sve_fcvt_ds, uint64_t, float64_to_float32) DO_ZPZ_FP_D(sve_fcvt_sd, uint64_t, float32_to_float64) +DO_ZPZ_FP(sve_fcvtzs_hh, uint16_t, H1_2, float16_to_int16_round_to_zero) +DO_ZPZ_FP(sve_fcvtzs_hs, uint32_t, H1_4, float16_to_int32_round_to_zero) +DO_ZPZ_FP(sve_fcvtzs_ss, uint32_t, H1_4, float32_to_int32_round_to_zero) +DO_ZPZ_FP_D(sve_fcvtzs_hd, uint64_t, float16_to_int64_round_to_zero) +DO_ZPZ_FP_D(sve_fcvtzs_sd, uint64_t, float32_to_int64_round_to_zero) +DO_ZPZ_FP_D(sve_fcvtzs_ds, uint64_t, float64_to_int32_round_to_zero) +DO_ZPZ_FP_D(sve_fcvtzs_dd, uint64_t, float64_to_int64_round_to_zero) + +DO_ZPZ_FP(sve_fcvtzu_hh, uint16_t, H1_2, float16_to_uint16_round_to_zero) +DO_ZPZ_FP(sve_fcvtzu_hs, uint32_t, H1_4, float16_to_uint32_round_to_zero) +DO_ZPZ_FP(sve_fcvtzu_ss, uint32_t, H1_4, float32_to_uint32_round_to_zero) +DO_ZPZ_FP_D(sve_fcvtzu_hd, uint64_t, float16_to_uint64_round_to_zero) +DO_ZPZ_FP_D(sve_fcvtzu_sd, uint64_t, float32_to_uint64_round_to_zero) +DO_ZPZ_FP_D(sve_fcvtzu_ds, uint64_t, float64_to_uint32_round_to_zero) +DO_ZPZ_FP_D(sve_fcvtzu_dd, uint64_t, float64_to_uint64_round_to_zero) + DO_ZPZ_FP(sve_scvt_hh, uint16_t, H1_2, int16_to_float16) DO_ZPZ_FP(sve_scvt_sh, uint32_t, H1_4, int32_to_float16) DO_ZPZ_FP(sve_scvt_ss, uint32_t, H1_4, int32_to_float32) diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 
361d545965..bc865dfd15 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3681,6 +3681,76 @@ static void trans_FCVT_sd(DisasContext *s, arg_rpr_esz *a, uint32_t insn) do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvt_sd); } +static void trans_FCVTZS_hh(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_fcvtzs_hh); +} + +static void trans_FCVTZU_hh(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_fcvtzu_hh); +} + +static void trans_FCVTZS_hs(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_fcvtzs_hs); +} + +static void trans_FCVTZU_hs(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_fcvtzu_hs); +} + +static void trans_FCVTZS_hd(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_fcvtzs_hd); +} + +static void trans_FCVTZU_hd(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_fcvtzu_hd); +} + +static void trans_FCVTZS_ss(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvtzs_ss); +} + +static void trans_FCVTZU_ss(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvtzu_ss); +} + +static void trans_FCVTZS_sd(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvtzs_sd); +} + +static void trans_FCVTZU_sd(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvtzu_sd); +} + +static void trans_FCVTZS_ds(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvtzs_ds); +} + +static void 
trans_FCVTZU_ds(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvtzu_ds); +} + +static void trans_FCVTZS_dd(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvtzs_dd); +} + +static void trans_FCVTZU_dd(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvtzu_dd); +} + static void trans_SCVTF_hh(DisasContext *s, arg_rpr_esz *a, uint32_t insn) { do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_scvt_hh); diff --git a/target/arm/sve.decode b/target/arm/sve.decode index d44cf17fc8..92dda3a241 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -832,6 +832,22 @@ FCVT_hd 01100101 11 0010 01 101 ... ..... ..... @rd_pg_rn_e0 FCVT_ds 01100101 11 0010 10 101 ... ..... ..... @rd_pg_rn_e0 FCVT_sd 01100101 11 0010 11 101 ... ..... ..... @rd_pg_rn_e0 +# SVE floating-point convert to integer +FCVTZS_hh 01100101 01 011 01 0 101 ... ..... ..... @rd_pg_rn_e0 +FCVTZU_hh 01100101 01 011 01 1 101 ... ..... ..... @rd_pg_rn_e0 +FCVTZS_hs 01100101 01 011 10 0 101 ... ..... ..... @rd_pg_rn_e0 +FCVTZU_hs 01100101 01 011 10 1 101 ... ..... ..... @rd_pg_rn_e0 +FCVTZS_hd 01100101 01 011 11 0 101 ... ..... ..... @rd_pg_rn_e0 +FCVTZU_hd 01100101 01 011 11 1 101 ... ..... ..... @rd_pg_rn_e0 +FCVTZS_ss 01100101 10 011 10 0 101 ... ..... ..... @rd_pg_rn_e0 +FCVTZU_ss 01100101 10 011 10 1 101 ... ..... ..... @rd_pg_rn_e0 +FCVTZS_ds 01100101 11 011 00 0 101 ... ..... ..... @rd_pg_rn_e0 +FCVTZU_ds 01100101 11 011 00 1 101 ... ..... ..... @rd_pg_rn_e0 +FCVTZS_sd 01100101 11 011 10 0 101 ... ..... ..... @rd_pg_rn_e0 +FCVTZU_sd 01100101 11 011 10 1 101 ... ..... ..... @rd_pg_rn_e0 +FCVTZS_dd 01100101 11 011 11 0 101 ... ..... ..... @rd_pg_rn_e0 +FCVTZU_dd 01100101 11 011 11 1 101 ... ..... ..... @rd_pg_rn_e0 + # SVE integer convert to floating-point SCVTF_hh 01100101 01 010 01 0 101 ... ..... ..... 
@rd_pg_rn_e0 SCVTF_sh 01100101 01 010 10 0 101 ... ..... ..... @rd_pg_rn_e0 From patchwork Sat Feb 17 18:23:22 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 128737 Delivered-To: patch@linaro.org Received: by 10.46.124.24 with SMTP id x24csp1856128ljc; Sat, 17 Feb 2018 11:28:09 -0800 (PST) X-Google-Smtp-Source: AH8x224gJ4rl1stn252/u2fnhm5nHk22T9eJStscJhClnDsWnWeTTONXiCLHuI3qEhepP6LeN5Pi X-Received: by 10.129.41.19 with SMTP id p19mr6434213ywp.233.1518895688979; Sat, 17 Feb 2018 11:28:08 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1518895688; cv=none; d=google.com; s=arc-20160816; b=wA4jRuPHVUax7XMT+SACn5xCNXjNPq1kNoHlDLYV7OQrLxWcGLveK0l/auN241Sav6 kN5fU82ki7JDQ4ZxsXhYwcUqUDLGr1GuuOROk1kdPcz/oQl7qN4Xia/SqcYJRZTBKL9R wSWUZVqms/QSAG2emXUObm4L585WvnvpnesL1swXaC7yn0axE4sGFOFenDNPqX52A0GG Yhmri3TVxbXjdZwwTgFK2ATrCNZhmDi6EFXNgnIg/XxjHaBjjpuoX28skFLKl7Am5QbI ZPha6uzlTt9LMRE9HtSM3s+/zuj3PDiU36J2SFDD4gLq663U5LtTVQZK2GRt5t3Lql38 QNww== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature:arc-authentication-results; bh=TYEkgvQ2lsVeGMILIUd+qlGVon1fvaQFWVsi7XMK5X8=; b=OPw95+fc3mrCsyQMjKtYsb0Re7dEbkqmU1qh6IMm5l+D7gKn7WOe20AHQXh098tDo7 /GEn9v4JmNaYj70Cn8+ifVwakDuZ79wuR7cHH/2xJ4w5OHokosDG+HjFEV+NDKKmBBGE ZXmer03bdTYIPzq6fdSMQacPoJEF72AXOFjKNRofNzAUnarr5Rr09ZdH5YEw/aOljUcF ocYip2ZOIhLFfYXr60VdmLBG4qcSIUbwip3QKrQV52w+idjQNHMPX+ALpbobTh3F7/0X QCniZOu1VD7pqROZ7paKWLLLjTiBVGwvY7fUIjfRVFrqbPv7Ek5zuK3OOiJ95A/xsvyC I4xA== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=OENPRN3p; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) 
smtp.mailfrom=qemu-devel-bounces+patch=linaro.org@nongnu.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [2001:4830:134:3::11]) by mx.google.com with ESMTPS id q10si3526476ybj.227.2018.02.17.11.28.08 for (version=TLS1 cipher=AES128-SHA bits=128/128); Sat, 17 Feb 2018 11:28:08 -0800 (PST) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) client-ip=2001:4830:134:3::11; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=OENPRN3p; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom=qemu-devel-bounces+patch=linaro.org@nongnu.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:50397 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en89Y-0005tI-8m for patch@linaro.org; Sat, 17 Feb 2018 14:28:08 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:41143) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1en7Ai-00025v-8T for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:25:19 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1en7Ag-0002Nl-Rc for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:25:16 -0500 Received: from mail-pg0-x241.google.com ([2607:f8b0:400e:c05::241]:39714) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1en7Ag-0002NM-6v for qemu-devel@nongnu.org; Sat, 17 Feb 2018 13:25:14 -0500 Received: by mail-pg0-x241.google.com with SMTP id w17so4356781pgv.6 for ; Sat, 17 Feb 2018 10:25:14 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=TYEkgvQ2lsVeGMILIUd+qlGVon1fvaQFWVsi7XMK5X8=; 
b=OENPRN3pDjBKPI8A6Li+DqhGVkcnN9QW73M2G8HKfYT4u72VfuhXjXtj/acxisha3p uld55SKBHwa3xpQbopJgeFGwDMFhZbfngDXiMGwRy4KnqqKGuRZZ13HTiLrtFVU7z1rC /yeQrfM72KiWHSsLT1Rjdp9PZZtjQPOQZQigk= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=TYEkgvQ2lsVeGMILIUd+qlGVon1fvaQFWVsi7XMK5X8=; b=uhcRc7JCllQfDsulEr6AmpxjvyUzYP0aPrGGyj63N0WU3bV3onufPlpeT+6Jy8jTRk rQDBY+hTeHSPiKgF4gk0CnXHf93N9skQ1VOQoeNj4AnOy/6zTspiFwvy56olPRiJ7+9R Al6gOH4Ch3r71EPePVK8NShP5w5kBeCBY1llsb7QMitt3NMBrqO2PfUY8tGBTbbBhu2K UCJyLNE/JCeXuhpCfu33TGYF5/QB7LidgS76RerSwmuyBS7C8V7aNNNic5hw4h2apYJr LkAf5cOuVA+SSeXb4O3nyJpLvtj22oB0p+jx5kN9dMa1qkLekzNweS04E5hbpe+EoWiq 5UYw== X-Gm-Message-State: APf1xPCVHuyyTKn/HC5vFbEvrLdqLry+stIRlyRhGLm66xf3fA9OgRnV XRS7EPgC2NsFl64TyxREUmo/K2gbx8w= X-Received: by 10.99.146.3 with SMTP id o3mr8292115pgd.309.1518891912866; Sat, 17 Feb 2018 10:25:12 -0800 (PST) Received: from cloudburst.twiddle.net ([50.0.192.64]) by smtp.gmail.com with ESMTPSA id h15sm13466712pfi.56.2018.02.17.10.25.11 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Sat, 17 Feb 2018 10:25:11 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Sat, 17 Feb 2018 10:23:22 -0800 Message-Id: <20180217182323.25885-67-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org> References: <20180217182323.25885-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. 
Subject: [Qemu-devel] [PATCH v2 66/67] target/arm: Implement SVE floating-point round to integral value
Cc: qemu-arm@nongnu.org

Signed-off-by: Richard Henderson
---
 target/arm/helper-sve.h    | 14 ++++++++
 target/arm/sve_helper.c    |  8 +++++
 target/arm/translate-sve.c | 80 ++++++++++++++++++++++++++++++++++++++++++++++
 target/arm/sve.decode      |  9 ++++++
 4 files changed, 111 insertions(+)

--
2.14.3

Reviewed-by: Peter Maydell

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index 0f5fea9045..749bab0b38 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -985,6 +985,20 @@ DEF_HELPER_FLAGS_5(sve_fcvtzu_sd, TCG_CALL_NO_RWG,
 DEF_HELPER_FLAGS_5(sve_fcvtzu_dd, TCG_CALL_NO_RWG,
                    void, ptr, ptr, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_5(sve_frint_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_frint_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_frint_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(sve_frintx_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_frintx_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_frintx_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+
 DEF_HELPER_FLAGS_5(sve_scvt_hh, TCG_CALL_NO_RWG,
                    void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(sve_scvt_sh, TCG_CALL_NO_RWG,
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index 09f5c77254..7950710be7 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -3200,6 +3200,14 @@ DO_ZPZ_FP_D(sve_fcvtzu_sd, uint64_t, float32_to_uint64_round_to_zero)
 DO_ZPZ_FP_D(sve_fcvtzu_ds, uint64_t, float64_to_uint32_round_to_zero)
 DO_ZPZ_FP_D(sve_fcvtzu_dd, uint64_t, float64_to_uint64_round_to_zero)
 
+DO_ZPZ_FP(sve_frint_h, uint16_t, H1_2, helper_advsimd_rinth)
+DO_ZPZ_FP(sve_frint_s, uint32_t, H1_4, helper_rints)
+DO_ZPZ_FP_D(sve_frint_d, uint64_t, helper_rintd)
+
+DO_ZPZ_FP(sve_frintx_h, uint16_t, H1_2, float16_round_to_int)
+DO_ZPZ_FP(sve_frintx_s, uint32_t, H1_4, float32_round_to_int)
+DO_ZPZ_FP_D(sve_frintx_d, uint64_t, float64_round_to_int)
+
 DO_ZPZ_FP(sve_scvt_hh, uint16_t, H1_2, int16_to_float16)
 DO_ZPZ_FP(sve_scvt_sh, uint32_t, H1_4, int32_to_float16)
 DO_ZPZ_FP(sve_scvt_ss, uint32_t, H1_4, int32_to_float32)
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index bc865dfd15..5f1c4984b8 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -3751,6 +3751,86 @@ static void trans_FCVTZU_dd(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
     do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve_fcvtzu_dd);
 }
 
+static gen_helper_gvec_3_ptr * const frint_fns[3] = {
+    gen_helper_sve_frint_h,
+    gen_helper_sve_frint_s,
+    gen_helper_sve_frint_d
+};
+
+static void trans_FRINTI(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    if (a->esz == 0) {
+        unallocated_encoding(s);
+    } else {
+        do_zpz_ptr(s, a->rd, a->rn, a->pg, a->esz == MO_16,
+                   frint_fns[a->esz - 1]);
+    }
+}
+
+static void trans_FRINTX(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    static gen_helper_gvec_3_ptr * const fns[3] = {
+        gen_helper_sve_frintx_h,
+        gen_helper_sve_frintx_s,
+        gen_helper_sve_frintx_d
+    };
+    if (a->esz == 0) {
+        unallocated_encoding(s);
+    } else {
+        do_zpz_ptr(s, a->rd, a->rn, a->pg, a->esz == MO_16, fns[a->esz - 1]);
+    }
+}
+
+static void do_frint_mode(DisasContext *s, arg_rpr_esz *a, int mode)
+{
+    unsigned vsz = vec_full_reg_size(s);
+    TCGv_i32 tmode;
+    TCGv_ptr status;
+
+    if (a->esz == 0) {
+        unallocated_encoding(s);
+        return;
+    }
+
+    tmode = tcg_const_i32(mode);
+    status = get_fpstatus_ptr(a->esz == MO_16);
+    gen_helper_set_rmode(tmode, tmode, status);
+
+    tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, a->rd),
+                       vec_full_reg_offset(s, a->rn),
+                       pred_full_reg_offset(s, a->pg),
+                       status, vsz, vsz, 0, frint_fns[a->esz - 1]);
+
+    gen_helper_set_rmode(tmode, tmode, status);
+    tcg_temp_free_i32(tmode);
+    tcg_temp_free_ptr(status);
+}
+
+static void trans_FRINTN(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    do_frint_mode(s, a, float_round_nearest_even);
+}
+
+static void trans_FRINTP(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    do_frint_mode(s, a, float_round_up);
+}
+
+static void trans_FRINTM(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    do_frint_mode(s, a, float_round_down);
+}
+
+static void trans_FRINTZ(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    do_frint_mode(s, a, float_round_to_zero);
+}
+
+static void trans_FRINTA(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    do_frint_mode(s, a, float_round_ties_away);
+}
+
 static void trans_SCVTF_hh(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
 {
     do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_scvt_hh);
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index 92dda3a241..e06c0c5279 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -848,6 +848,15 @@ FCVTZU_sd 01100101 11 011 10 1 101 ... ..... ..... @rd_pg_rn_e0
 FCVTZS_dd       01100101 11 011 11 0 101 ... ..... .....    @rd_pg_rn_e0
 FCVTZU_dd       01100101 11 011 11 1 101 ... ..... .....    @rd_pg_rn_e0
 
+# SVE floating-point round to integral value
+FRINTN          01100101 .. 000 000 101 ... ..... .....     @rd_pg_rn
+FRINTP          01100101 .. 000 001 101 ... ..... .....     @rd_pg_rn
+FRINTM          01100101 .. 000 010 101 ... ..... .....     @rd_pg_rn
+FRINTZ          01100101 .. 000 011 101 ... ..... .....     @rd_pg_rn
+FRINTA          01100101 .. 000 100 101 ... ..... .....     @rd_pg_rn
+FRINTX          01100101 .. 000 110 101 ... ..... .....     @rd_pg_rn
+FRINTI          01100101 .. 000 111 101 ... ..... .....     @rd_pg_rn
+
 # SVE integer convert to floating-point
 SCVTF_hh        01100101 01 010 01 0 101 ... ..... .....    @rd_pg_rn_e0
 SCVTF_sh        01100101 01 010 10 0 101 ... ..... .....    @rd_pg_rn_e0

From patchwork Sat Feb 17 18:23:23 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 128725
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Sat, 17 Feb 2018 10:23:23 -0800
Message-Id: <20180217182323.25885-68-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.14.3
In-Reply-To: <20180217182323.25885-1-richard.henderson@linaro.org>
References: <20180217182323.25885-1-richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH v2 67/67] target/arm: Implement SVE floating-point unary operations
Cc: qemu-arm@nongnu.org

Signed-off-by: Richard Henderson
---
 target/arm/helper-sve.h    | 14 ++++++++++++++
 target/arm/sve_helper.c    |  8 ++++++++
 target/arm/translate-sve.c | 28 ++++++++++++++++++++++++++++
 target/arm/sve.decode      |  4 ++++
 4 files changed, 54 insertions(+)

--
2.14.3

Reviewed-by: Peter Maydell

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index 749bab0b38..5cebc9121d 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -999,6 +999,20 @@ DEF_HELPER_FLAGS_5(sve_frintx_s, TCG_CALL_NO_RWG,
 DEF_HELPER_FLAGS_5(sve_frintx_d, TCG_CALL_NO_RWG,
                    void, ptr, ptr, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_5(sve_frecpx_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_frecpx_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_frecpx_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(sve_fsqrt_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_fsqrt_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_fsqrt_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+
 DEF_HELPER_FLAGS_5(sve_scvt_hh, TCG_CALL_NO_RWG,
                    void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(sve_scvt_sh, TCG_CALL_NO_RWG,
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index 7950710be7..4f0985a29e 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -3208,6 +3208,14 @@ DO_ZPZ_FP(sve_frintx_h, uint16_t, H1_2, float16_round_to_int)
 DO_ZPZ_FP(sve_frintx_s, uint32_t, H1_4, float32_round_to_int)
 DO_ZPZ_FP_D(sve_frintx_d, uint64_t, float64_round_to_int)
 
+DO_ZPZ_FP(sve_frecpx_h, uint16_t, H1_2, helper_frecpx_f16)
+DO_ZPZ_FP(sve_frecpx_s, uint32_t, H1_4, helper_frecpx_f32)
+DO_ZPZ_FP_D(sve_frecpx_d, uint64_t, helper_frecpx_f64)
+
+DO_ZPZ_FP(sve_fsqrt_h, uint16_t, H1_2, float16_sqrt)
+DO_ZPZ_FP(sve_fsqrt_s, uint32_t, H1_4, float32_sqrt)
+DO_ZPZ_FP_D(sve_fsqrt_d, uint64_t, float64_sqrt)
+
 DO_ZPZ_FP(sve_scvt_hh, uint16_t, H1_2, int16_to_float16)
 DO_ZPZ_FP(sve_scvt_sh, uint32_t, H1_4, int32_to_float16)
 DO_ZPZ_FP(sve_scvt_ss, uint32_t, H1_4, int32_to_float32)
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index 5f1c4984b8..f1ff033333 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -3831,6 +3831,34 @@ static void trans_FRINTA(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
     do_frint_mode(s, a, float_round_ties_away);
 }
 
+static void trans_FRECPX(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    static gen_helper_gvec_3_ptr * const fns[3] = {
+        gen_helper_sve_frecpx_h,
+        gen_helper_sve_frecpx_s,
+        gen_helper_sve_frecpx_d
+    };
+    if (a->esz == 0) {
+        unallocated_encoding(s);
+    } else {
+        do_zpz_ptr(s, a->rd, a->rn, a->pg, a->esz == MO_16, fns[a->esz - 1]);
+    }
+}
+
+static void trans_FSQRT(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    static gen_helper_gvec_3_ptr * const fns[3] = {
+        gen_helper_sve_fsqrt_h,
+        gen_helper_sve_fsqrt_s,
+        gen_helper_sve_fsqrt_d
+    };
+    if (a->esz == 0) {
+        unallocated_encoding(s);
+    } else {
+        do_zpz_ptr(s, a->rd, a->rn, a->pg, a->esz == MO_16, fns[a->esz - 1]);
+    }
+}
+
 static void trans_SCVTF_hh(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
 {
     do_zpz_ptr(s, a->rd, a->rn, a->pg, true, gen_helper_sve_scvt_hh);
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index e06c0c5279..fbd9cf1384 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -857,6 +857,10 @@ FRINTA 01100101 .. 000 100 101 ... ..... ..... @rd_pg_rn
 FRINTX          01100101 .. 000 110 101 ... ..... .....     @rd_pg_rn
 FRINTI          01100101 .. 000 111 101 ... ..... .....     @rd_pg_rn
 
+# SVE floating-point unary operations
+FRECPX          01100101 .. 001 100 101 ... ..... .....     @rd_pg_rn
+FSQRT           01100101 .. 001 101 101 ... ..... .....     @rd_pg_rn
+
 # SVE integer convert to floating-point
 SCVTF_hh        01100101 01 010 01 0 101 ... ..... .....    @rd_pg_rn_e0
 SCVTF_sh        01100101 01 010 10 0 101 ... ..... .....    @rd_pg_rn_e0
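
Reader's note, not part of the series: the FRECPX and FSQRT patterns above, together with the `a->esz == 0` check in their trans functions, can be modelled in a few lines of standalone C. This is only a rough sketch under stated assumptions. `ArgRprEsz`, `UnaryOp`, `decode_unary`, `select_helper`, `extract_bits` and the mask constants are hypothetical names invented here. The field positions (esz at bits 23:22, pg at 12:10, rn at 9:5, rd at 4:0) are read off the dot groups in the patterns, since the `@rd_pg_rn` format definition itself is not shown in these patches. Helper names are returned as strings purely for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the decoder's arg_rpr_esz argument set. */
typedef struct {
    unsigned rd, rn, pg, esz;
} ArgRprEsz;

typedef enum { OP_NONE, OP_FRECPX, OP_FSQRT } UnaryOp;

/*
 * Fixed bits of "01100101 .. 001 10x 101 ... ..... .....":
 * [31:24] and [21:13] are constant, the remaining bits are fields.
 */
#define UNARY_MASK   0xff3fe000u
#define FRECPX_BITS  0x650ca000u   /* 01100101 .. 001 100 101 */
#define FSQRT_BITS   0x650da000u   /* 01100101 .. 001 101 101 */

/* Extract len bits of insn starting at bit pos (bit 0 is the LSB). */
static uint32_t extract_bits(uint32_t insn, unsigned pos, unsigned len)
{
    assert(len > 0 && len < 32 && pos + len <= 32);
    return (insn >> pos) & ((1u << len) - 1);
}

/* Match the two patterns and fill in the fields; OP_NONE if neither fits. */
static UnaryOp decode_unary(uint32_t insn, ArgRprEsz *a)
{
    UnaryOp op;

    switch (insn & UNARY_MASK) {
    case FRECPX_BITS:
        op = OP_FRECPX;
        break;
    case FSQRT_BITS:
        op = OP_FSQRT;
        break;
    default:
        return OP_NONE;
    }
    /* Assumed field layout for the @rd_pg_rn format. */
    a->esz = extract_bits(insn, 22, 2);
    a->pg = extract_bits(insn, 10, 3);
    a->rn = extract_bits(insn, 5, 5);
    a->rd = extract_bits(insn, 0, 5);
    return op;
}

/*
 * Mirror of the fns[a->esz - 1] lookup: esz == 0 has no floating-point
 * form, so it maps to NULL (the unallocated_encoding() path in the patch).
 */
static const char *select_helper(UnaryOp op, const ArgRprEsz *a)
{
    static const char * const fns[2][3] = {
        { "sve_frecpx_h", "sve_frecpx_s", "sve_frecpx_d" },
        { "sve_fsqrt_h", "sve_fsqrt_s", "sve_fsqrt_d" },
    };

    if (op == OP_NONE || a->esz == 0) {
        return NULL;
    }
    return fns[op - 1][a->esz - 1];
}
```

For instance, the word 0x658da020 (bits 01100101 10 001 101 101 000 00001 00000) matches the FSQRT pattern with esz = 2, pg = 0, rn = 1, rd = 0, so `select_helper` picks `sve_fsqrt_s`. The same opcode bits with esz = 0 give NULL, which corresponds to the unallocated-encoding branch in trans_FSQRT.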