From patchwork Wed May 30 18:01:03 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 137266 Delivered-To: patch@linaro.org Received: by 2002:a2e:9706:0:0:0:0:0 with SMTP id r6-v6csp5631757lji; Wed, 30 May 2018 11:05:12 -0700 (PDT) X-Google-Smtp-Source: ADUXVKLH5CRxasZ2WaNfre1LG2rvYPEKJToiSbMOfR2DlfXk6GJqt2zZhkubXaqoMMFLKMu2AMny X-Received: by 2002:a37:7b42:: with SMTP id w63-v6mr3580987qkc.396.1527703512161; Wed, 30 May 2018 11:05:12 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1527703512; cv=none; d=google.com; s=arc-20160816; b=J/fOwkbP2oYu1zNb21cGFu8nDZFcvapx0EvOLclXkdgLMJCP3rRB7AbKRDp6AzJZ+c jSaxgTXt0uStkLuWfinzocdhiil0gb7p2EpHofPZFXhkBceZl2Y6hDn8t4GePjY3pZ9v TAvxByfK7SG4hgq6+TX7XSGR8FuEbvYgYBEmZoI6GIOStrGRP1nqiomN7Y+HNrRAket0 CMIXB7V7N1lr1Kyb3t0TrvYTCmPGNw6+0RJBJ8SQtROqsZ9EOqes2OXz0sS+goXW3ncF jg3h2aJkmFWDY8XpdHWe0bLSNzU4GDWJloFY3rNjr/In1PoTa95FscPbdmajZMRswX2E P2yA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature:arc-authentication-results; bh=oyxFM5ORC2u3ySgE1d8lEI9lCFBcoWljxqJhU6kL95s=; b=ura848vn48WqScCuh/ab2CrgPiT+GdrLnaZpzlVmVAOfOv0sNJBGjaI8WHWUM8EqSh FC/Y4xNImX0KJGM/AiOWk977UO30T5aDKc6leI5X60hS/Ha7E0EpRh2+ue+DWGir0bHl e8vfjPwCTQRERk5R9Z09b13pDtTPUk7PPv3la5g9t8FaYdJP2IQhejy4ZegtTl7Xl5KP CLR6gmmpBJ/vtKmuC5NNf254nMl4ma2kGhAq7wd2vT0ye2JsLoPQUAkz7dhhtBf667HR 5ySyYsZ8bpZJ7DrpaCf7IYQmYnTePFEKj35FP/HszxFTfgubrvwwrqHW6NehrInRZ92N 2xYg== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=YU0eCKMC; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom=qemu-devel-bounces+patch=linaro.org@nongnu.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [2001:4830:134:3::11]) by mx.google.com with ESMTPS id u14-v6si3940128qvb.224.2018.05.30.11.05.11 for (version=TLS1 cipher=AES128-SHA bits=128/128); Wed, 30 May 2018 11:05:12 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) client-ip=2001:4830:134:3::11; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=YU0eCKMC; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom=qemu-devel-bounces+patch=linaro.org@nongnu.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:40048 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fO5TD-0004An-JG for patch@linaro.org; Wed, 30 May 2018 14:05:11 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:33044) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fO5Pb-0001kP-7J for qemu-devel@nongnu.org; Wed, 30 May 2018 14:01:28 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fO5PZ-0004LC-T5 for qemu-devel@nongnu.org; Wed, 30 May 2018 14:01:27 -0400 Received: from mail-pl0-x22c.google.com ([2607:f8b0:400e:c01::22c]:43406) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fO5PZ-0004KB-OE for qemu-devel@nongnu.org; Wed, 30 May 2018 14:01:25 -0400 Received: by mail-pl0-x22c.google.com with SMTP id c41-v6so11532918plj.10 for ; Wed, 30 May 2018 11:01:25 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=oyxFM5ORC2u3ySgE1d8lEI9lCFBcoWljxqJhU6kL95s=; b=YU0eCKMCqeRO/43VLApx+GDVJ7sHLA90x5kPJSaMKiZtL0eVOH1fgasbgNmGeIVdwi R4VyHCDF+iQsNKQH6MQqgdnnbt5qFQ0kviuhOtqGErPsZK+lu26o0e9np7uoHUWXSNAo QO87XdBhz4vDJI04N369VzbQgmCvjF7ZYLmrU= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=oyxFM5ORC2u3ySgE1d8lEI9lCFBcoWljxqJhU6kL95s=; b=kuqsqqENnzzre5Zk/zYobnl9vTC40mDShSuDS6e6oiBk0arBSgJHeJxwRSkMpUVSiZ Wj9EAiklS1CiGDKYicRCcK1ZFhzzeaQjJGBzlQfOjDMxLwY8YgQ5XDc5WJntEG1Td0XI nMxT65P4EsT/1RfS0CrXHjeiR6PP0CCgQBOsUzFUsjDk+B1G/Y6UWtvbD1J3ho9SoIGF FshYbCK4PuJ8uyFHMAlmNhJnkILS26UGQ1NmamlNH82MgfMxi7ATcBsDcqzV8NqGQTyr gZasj3erXO3sKfYHZqUnEzmWjuvLtTE8fQenh9MOUd7OQO5cDljFx1wnO/b7PMKXa2mF W6IQ== X-Gm-Message-State: ALKqPwdNFhcCrrz7pYYi8vJTyEPc66Vvp10GCgoYSQL6zy2bbUwZAxDE Xwjj5iSx8+chft365fUVH1+teeoQ9jI= X-Received: by 2002:a17:902:6b09:: with SMTP id o9-v6mr3850823plk.256.1527703284384; Wed, 30 May 2018 11:01:24 -0700 (PDT) Received: from cloudburst.twiddle.net (97-126-112-211.tukw.qwest.net. [97.126.112.211]) by smtp.gmail.com with ESMTPSA id b84-v6sm28179157pfm.123.2018.05.30.11.01.22 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 30 May 2018 11:01:22 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 30 May 2018 11:01:03 -0700 Message-Id: <20180530180120.13355-2-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.0 In-Reply-To: <20180530180120.13355-1-richard.henderson@linaro.org> References: <20180530180120.13355-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c01::22c Subject: [Qemu-devel] [PATCH v3b 01/18] target/arm: Extend vec_reg_offset to larger sizes X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Rearrange the arithmetic so that we are agnostic about the total size of the vector and the size of the element. This will allow us to index up to the 32nd byte and with 16-byte elements. Signed-off-by: Richard Henderson --- target/arm/translate-a64.h | 26 +++++++++++++++++--------- 1 file changed, 17 insertions(+), 9 deletions(-) -- 2.17.0 Reviewed-by: Peter Maydell diff --git a/target/arm/translate-a64.h b/target/arm/translate-a64.h index dd9c09f89b..63d958cf50 100644 --- a/target/arm/translate-a64.h +++ b/target/arm/translate-a64.h @@ -67,18 +67,26 @@ static inline void assert_fp_access_checked(DisasContext *s) static inline int vec_reg_offset(DisasContext *s, int regno, int element, TCGMemOp size) { - int offs = 0; + int element_size = 1 << size; + int offs = element * element_size; #ifdef HOST_WORDS_BIGENDIAN /* This is complicated slightly because vfp.zregs[n].d[0] is - * still the low half and vfp.zregs[n].d[1] the high half - * of the 128 bit vector, even on big endian systems. - * Calculate the offset assuming a fully bigendian 128 bits, - * then XOR to account for the order of the two 64 bit halves. + * still the lowest and vfp.zregs[n].d[15] the highest of the + * 256 byte vector, even on big endian systems. + * + * Calculate the offset assuming fully little-endian, + * then XOR to account for the order of the 8-byte units. + * + * For 16 byte elements, the two 8 byte halves will not form a + * host int128 if the host is bigendian, since they're in the + * wrong order. However the only 16 byte operation we have is + * a move, so we can ignore this for the moment. More complicated + * operations will have to special case loading and storing from + * the zregs array. */ - offs += (16 - ((element + 1) * (1 << size))); - offs ^= 8; -#else - offs += element * (1 << size); + if (element_size < 8) { + offs ^= 8 - element_size; + } #endif offs += offsetof(CPUARMState, vfp.zregs[regno]); assert_fp_access_checked(s); From patchwork Wed May 30 18:01:04 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 137265 Delivered-To: patch@linaro.org Received: by 2002:a2e:9706:0:0:0:0:0 with SMTP id r6-v6csp5631286lji; Wed, 30 May 2018 11:04:47 -0700 (PDT) X-Google-Smtp-Source: ADUXVKK1QW4IzOF4YfKNSZ4nGFiFnpkbHnfczVg1GWL7wNYmOqEdfjdBYKWQKsWD22jRTRYyny0R X-Received: by 2002:ae9:e105:: with SMTP id g5-v6mr3339402qkm.236.1527703487685; Wed, 30 May 2018 11:04:47 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1527703487; cv=none; d=google.com; s=arc-20160816; b=OVm3sWxWBLXSTx0Uim4ZnLQBix6Qy1rm06SIVpEpHDea/GPKNmFaP3g6CuRKWon6fY jrPqY6eoIinaWNWVE7IvTQoqp4euj68UbgsPiE0KZWtjKNC7JS1QKhUCUXyqFNDbgf++ //KIELZCOt/Q+PzNe/0wrPPNJkgjDdwtQe+I77yHhmNJUfSVRsqfFui0SQsDeotVsvjI k2Fofp7RDnMUps/7nidrYQ+jO5Gm7hNWVJtMKrGunIPiOwr9GbHcAce6ndCFRJONt4FU eqAkE5E0t+bURZ8PeZ9zWRB82oZ1G+mVCQ1TsRI8zIxs7v4YlY8S3hka1/FZ3mAhVOk2 N+qQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature:arc-authentication-results; bh=8/cK1CKnbZW3yoUTbzCwG4b0r7T0TlVA6vycAYRYfkI=; b=VpnS3LuCI9L2je3NzA+7b55DYzd3d3rYuKMUe0TV3v8Gn3D//Z8I07Y5bWJKo7K8x1 vHEZ0gwhlKuVzd9JOYva32Aj9BM+mEfxxY+0Eq2c81O6sqQXujPQl65DCgJ1Z3QCdfLE LEZ0cN8SZJ38EkkQA1oGwk1smrtD8HAL8h+qS2M8n74vpeuUitknQf9nf8WzBWxo0as/ GZ6z35yTpKiBxKZmw0STgEBaM9XDlgpxnhWdoRKrz8fgwu1jRMchgSrgFAxj5vi79mCt gIjNgT2v/ceP8P7cgYQQseZIi6jne/v3gLv8I98p+VUvx8XW/vouSVsoh+A8hdFOENEy ZXyA== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=QiqRs/Fh; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom=qemu-devel-bounces+patch=linaro.org@nongnu.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [2001:4830:134:3::11]) by mx.google.com with ESMTPS id r27-v6si4514436qvj.213.2018.05.30.11.04.47 for (version=TLS1 cipher=AES128-SHA bits=128/128); Wed, 30 May 2018 11:04:47 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) client-ip=2001:4830:134:3::11; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=QiqRs/Fh; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom=qemu-devel-bounces+patch=linaro.org@nongnu.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:40045 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fO5Sp-00037T-5G for patch@linaro.org; Wed, 30 May 2018 14:04:47 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:33096) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fO5Pd-0001ls-7H for qemu-devel@nongnu.org; Wed, 30 May 2018 14:01:31 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fO5Pb-0004MZ-Ip for qemu-devel@nongnu.org; Wed, 30 May 2018 14:01:29 -0400 Received: from mail-pl0-x229.google.com ([2607:f8b0:400e:c01::229]:38153) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fO5Pb-0004Lj-Bf for qemu-devel@nongnu.org; Wed, 30 May 2018 14:01:27 -0400 Received: by mail-pl0-x229.google.com with SMTP id c11-v6so11553860plr.5 for ; Wed, 30 May 2018 11:01:27 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=8/cK1CKnbZW3yoUTbzCwG4b0r7T0TlVA6vycAYRYfkI=; b=QiqRs/Fhk8zwn/ZpUNKL0g/wyudymA5JpjRtUBRbWJ7Ctb6+BHkIHMl3ZQkLxU3xCe ZdTNb9AVQ3W/lkz8nzhv2y8VxiOM6Bcir4IkZef8v6b4R+XT13fXdAUy47A3oOYSwphw XmPIpqY0voPa+pCiew5HWjTohc2hlwwnmK+gY= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=8/cK1CKnbZW3yoUTbzCwG4b0r7T0TlVA6vycAYRYfkI=; b=ApTHt10DSPl+Fglqx6WeRsOGt5oGoVrO6oS2Knjg6M19jftqJ+n0zwTgovxniA2uZj AJEQQgvPaZNX7yCxk6hoGivFeCNhEJHpszFAV+AR1HylwYwBYXjAndbXJ79DQvPb470C bxg6Mv3P2OOy3vZzhTfD8Ci2wVaqWDrS4BdYEhzoXs1qAITaH6kj/QfTPeUX9zyijtIS A47dFa7GpinfKZOK+IEO6eEKvQ/EfWBQKiicVpSzG5HXHuKkIeYPSiqmeR4GhvmpiMEi Ajz9yA8Uhp71YQadFulyw4E6XYaU/I2Eus/OToLTz+EoVNCMTtcLzgP//jLGdmBXv0oQ KhNA== X-Gm-Message-State: ALKqPwfzfOaXF0T871py8Bq8TeGtYJV1jWHoEJBb/Z47vE2lWPL69TRP Z309pO44DCVH4SUXCWGwtezG1WPjgdM= X-Received: by 2002:a17:902:ba87:: with SMTP id k7-v6mr3783601pls.271.1527703285890; Wed, 30 May 2018 11:01:25 -0700 (PDT) Received: from cloudburst.twiddle.net (97-126-112-211.tukw.qwest.net. [97.126.112.211]) by smtp.gmail.com with ESMTPSA id b84-v6sm28179157pfm.123.2018.05.30.11.01.24 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 30 May 2018 11:01:24 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 30 May 2018 11:01:04 -0700 Message-Id: <20180530180120.13355-3-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.0 In-Reply-To: <20180530180120.13355-1-richard.henderson@linaro.org> References: <20180530180120.13355-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c01::229 Subject: [Qemu-devel] [PATCH v3b 02/18] target/arm: Implement SVE Permute - Unpredicated Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 23 +++++++ target/arm/sve_helper.c | 114 +++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 133 +++++++++++++++++++++++++++++++++++++ target/arm/sve.decode | 27 ++++++++ 4 files changed, 297 insertions(+) -- 2.17.0 diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 94f4356ce9..0c9aad575e 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -416,6 +416,29 @@ DEF_HELPER_FLAGS_4(sve_cpy_z_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) DEF_HELPER_FLAGS_4(sve_ext, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_insr_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(sve_insr_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(sve_insr_s, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(sve_insr_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) + +DEF_HELPER_FLAGS_3(sve_rev_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_rev_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_rev_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_rev_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve_tbl_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_tbl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_tbl_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_tbl_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(sve_sunpk_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_sunpk_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_sunpk_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(sve_uunpk_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_uunpk_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_uunpk_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index b825e44cb5..58c0fda333 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1560,3 +1560,117 @@ void HELPER(sve_ext)(void *vd, void *vn, void *vm, uint32_t desc) memcpy(vd + n_siz, &tmp, n_ofs); } } + +#define DO_INSR(NAME, TYPE, H) \ +void HELPER(NAME)(void *vd, void *vn, uint64_t val, uint32_t desc) \ +{ \ + intptr_t opr_sz = simd_oprsz(desc); \ + swap_memmove(vd + sizeof(TYPE), vn, opr_sz - sizeof(TYPE)); \ + *(TYPE *)(vd + H(0)) = val; \ +} + +DO_INSR(sve_insr_b, uint8_t, H1) +DO_INSR(sve_insr_h, uint16_t, H1_2) +DO_INSR(sve_insr_s, uint32_t, H1_4) +DO_INSR(sve_insr_d, uint64_t, ) + +#undef DO_INSR + +void HELPER(sve_rev_b)(void *vd, void *vn, uint32_t desc) +{ + intptr_t i, j, opr_sz = simd_oprsz(desc); + for (i = 0, j = opr_sz - 8; i < opr_sz / 2; i += 8, j -= 8) { + uint64_t f = *(uint64_t *)(vn + i); + uint64_t b = *(uint64_t *)(vn + j); + *(uint64_t *)(vd + i) = bswap64(b); + *(uint64_t *)(vd + j) = bswap64(f); + } +} + +static inline uint64_t hswap64(uint64_t h) +{ + uint64_t m = 0x0000ffff0000ffffull; + h = rol64(h, 32); + return ((h & m) << 16) | ((h >> 16) & m); +} + +void HELPER(sve_rev_h)(void *vd, void *vn, uint32_t desc) +{ + intptr_t i, j, opr_sz = simd_oprsz(desc); + for (i = 0, j = opr_sz - 8; i < opr_sz / 2; i += 8, j -= 8) { + uint64_t f = *(uint64_t *)(vn + i); + uint64_t b = *(uint64_t *)(vn + j); + *(uint64_t *)(vd + i) = hswap64(b); + *(uint64_t *)(vd + j) = hswap64(f); + } +} + +void HELPER(sve_rev_s)(void *vd, void *vn, uint32_t desc) +{ + intptr_t i, j, opr_sz = simd_oprsz(desc); + for (i = 0, j = opr_sz - 8; i < opr_sz / 2; i += 8, j -= 8) { + uint64_t f = *(uint64_t *)(vn + i); + uint64_t b = *(uint64_t *)(vn + j); + *(uint64_t *)(vd + i) = rol64(b, 32); + *(uint64_t *)(vd + j) = rol64(f, 32); + } +} + +void HELPER(sve_rev_d)(void *vd, void *vn, uint32_t desc) +{ + intptr_t i, j, opr_sz = simd_oprsz(desc); + for (i = 0, j = opr_sz - 8; i < opr_sz / 2; i += 8, j -= 8) { + uint64_t f = *(uint64_t *)(vn + i); + uint64_t b = *(uint64_t *)(vn + j); + *(uint64_t *)(vd + i) = b; + *(uint64_t *)(vd + j) = f; + } +} + +#define DO_TBL(NAME, TYPE, H) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ +{ \ + intptr_t i, opr_sz = simd_oprsz(desc); \ + uintptr_t elem = opr_sz / sizeof(TYPE); \ + TYPE *d = vd, *n = vn, *m = vm; \ + ARMVectorReg tmp; \ + if (unlikely(vd == vn)) { \ + n = memcpy(&tmp, vn, opr_sz); \ + } \ + for (i = 0; i < elem; i++) { \ + TYPE j = m[H(i)]; \ + d[H(i)] = j < elem ? n[H(j)] : 0; \ + } \ +} + +DO_TBL(sve_tbl_b, uint8_t, H1) +DO_TBL(sve_tbl_h, uint16_t, H2) +DO_TBL(sve_tbl_s, uint32_t, H4) +DO_TBL(sve_tbl_d, uint64_t, ) + +#undef TBL + +#define DO_UNPK(NAME, TYPED, TYPES, HD, HS) \ +void HELPER(NAME)(void *vd, void *vn, uint32_t desc) \ +{ \ + intptr_t i, opr_sz = simd_oprsz(desc); \ + TYPED *d = vd; \ + TYPES *n = vn; \ + ARMVectorReg tmp; \ + if (unlikely(vn - vd < opr_sz)) { \ + n = memcpy(&tmp, n, opr_sz / 2); \ + } \ + for (i = 0; i < opr_sz / sizeof(TYPED); i++) { \ + d[HD(i)] = n[HS(i)]; \ + } \ +} + +DO_UNPK(sve_sunpk_h, int16_t, int8_t, H2, H1) +DO_UNPK(sve_sunpk_s, int32_t, int16_t, H4, H2) +DO_UNPK(sve_sunpk_d, int64_t, int32_t, , H4) + +DO_UNPK(sve_uunpk_h, uint16_t, uint8_t, H2, H1) +DO_UNPK(sve_uunpk_s, uint32_t, uint16_t, H4, H2) +DO_UNPK(sve_uunpk_d, uint64_t, uint32_t, , H4) + +#undef DO_UNPK diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index c48d4b530a..388cce9924 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -1956,6 +1956,139 @@ static bool trans_EXT(DisasContext *s, arg_EXT *a, uint32_t insn) return true; } +/* + *** SVE Permute - Unpredicated Group + */ + +static bool trans_DUP_s(DisasContext *s, arg_DUP_s *a, uint32_t insn) +{ + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + tcg_gen_gvec_dup_i64(a->esz, vec_full_reg_offset(s, a->rd), + vsz, vsz, cpu_reg_sp(s, a->rn)); + } + return true; +} + +static bool trans_DUP_x(DisasContext *s, arg_DUP_x *a, uint32_t insn) +{ + if ((a->imm & 0x1f) == 0) { + return false; + } + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + unsigned dofs = vec_full_reg_offset(s, a->rd); + unsigned esz, index; + + esz = ctz32(a->imm); + index = a->imm >> (esz + 1); + + if ((index << esz) < vsz) { + unsigned nofs = vec_reg_offset(s, a->rn, index, esz); + tcg_gen_gvec_dup_mem(esz, dofs, nofs, vsz, vsz); + } else { + tcg_gen_gvec_dup64i(dofs, vsz, vsz, 0); + } + } + return true; +} + +static void do_insr_i64(DisasContext *s, arg_rrr_esz *a, TCGv_i64 val) +{ + typedef void gen_insr(TCGv_ptr, TCGv_ptr, TCGv_i64, TCGv_i32); + static gen_insr * const fns[4] = { + gen_helper_sve_insr_b, gen_helper_sve_insr_h, + gen_helper_sve_insr_s, gen_helper_sve_insr_d, + }; + unsigned vsz = vec_full_reg_size(s); + TCGv_i32 desc = tcg_const_i32(simd_desc(vsz, vsz, 0)); + TCGv_ptr t_zd = tcg_temp_new_ptr(); + TCGv_ptr t_zn = tcg_temp_new_ptr(); + + tcg_gen_addi_ptr(t_zd, cpu_env, vec_full_reg_offset(s, a->rd)); + tcg_gen_addi_ptr(t_zn, cpu_env, vec_full_reg_offset(s, a->rn)); + + fns[a->esz](t_zd, t_zn, val, desc); + + tcg_temp_free_ptr(t_zd); + tcg_temp_free_ptr(t_zn); + tcg_temp_free_i32(desc); +} + +static bool trans_INSR_f(DisasContext *s, arg_rrr_esz *a, uint32_t insn) +{ + if (sve_access_check(s)) { + TCGv_i64 t = tcg_temp_new_i64(); + tcg_gen_ld_i64(t, cpu_env, vec_reg_offset(s, a->rm, 0, MO_64)); + do_insr_i64(s, a, t); + tcg_temp_free_i64(t); + } + return true; +} + +static bool trans_INSR_r(DisasContext *s, arg_rrr_esz *a, uint32_t insn) +{ + if (sve_access_check(s)) { + do_insr_i64(s, a, cpu_reg(s, a->rm)); + } + return true; +} + +static bool trans_REV_v(DisasContext *s, arg_rr_esz *a, uint32_t insn) +{ + static gen_helper_gvec_2 * const fns[4] = { + gen_helper_sve_rev_b, gen_helper_sve_rev_h, + gen_helper_sve_rev_s, gen_helper_sve_rev_d + }; + + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + tcg_gen_gvec_2_ool(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vsz, vsz, 0, fns[a->esz]); + } + return true; +} + +static bool trans_TBL(DisasContext *s, arg_rrr_esz *a, uint32_t insn) +{ + static gen_helper_gvec_3 * const fns[4] = { + gen_helper_sve_tbl_b, gen_helper_sve_tbl_h, + gen_helper_sve_tbl_s, gen_helper_sve_tbl_d + }; + + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + tcg_gen_gvec_3_ool(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + vsz, vsz, 0, fns[a->esz]); + } + return true; +} + +static bool trans_UNPK(DisasContext *s, arg_UNPK *a, uint32_t insn) +{ + static gen_helper_gvec_2 * const fns[4][2] = { + { NULL, NULL }, + { gen_helper_sve_sunpk_h, gen_helper_sve_uunpk_h }, + { gen_helper_sve_sunpk_s, gen_helper_sve_uunpk_s }, + { gen_helper_sve_sunpk_d, gen_helper_sve_uunpk_d }, + }; + + if (a->esz == 0) { + return false; + } + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + tcg_gen_gvec_2_ool(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn) + + (a->h ? vsz / 2 : 0), + vsz, vsz, 0, fns[a->esz][a->u]); + } + return true; +} + /* *** SVE Memory - 32-bit Gather and Unsized Contiguous Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 4761d1921e..7ffd7962c8 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -24,6 +24,7 @@ %imm4_16_p1 16:4 !function=plus1 %imm6_22_5 22:1 5:5 +%imm7_22_16 22:2 16:5 %imm8_16_10 16:5 10:3 %imm9_16_10 16:s6 10:3 @@ -85,6 +86,8 @@ # Three operand, vector element size @rd_rn_rm ........ esz:2 . rm:5 ... ... rn:5 rd:5 &rrr_esz +@rdn_rm ........ esz:2 ...... ...... rm:5 rd:5 \ + &rrr_esz rn=%reg_movprfx # Three operand with "memory" size, aka immediate left shift @rd_rn_msz_rm ........ ... rm:5 .... imm:2 rn:5 rd:5 &rrri @@ -369,6 +372,30 @@ CPY_z_i 00000101 .. 01 .... 00 . ........ ..... @rdn_pg4 imm=%sh8_i8s EXT 00000101 001 ..... 000 ... rm:5 rd:5 \ &rrri rn=%reg_movprfx imm=%imm8_16_10 +### SVE Permute - Unpredicated Group + +# SVE broadcast general register +DUP_s 00000101 .. 1 00000 001110 ..... ..... @rd_rn + +# SVE broadcast indexed element +DUP_x 00000101 .. 1 ..... 001000 rn:5 rd:5 \ + &rri imm=%imm7_22_16 + +# SVE insert SIMD&FP scalar register +INSR_f 00000101 .. 1 10100 001110 ..... ..... @rdn_rm + +# SVE insert general register +INSR_r 00000101 .. 1 00100 001110 ..... ..... @rdn_rm + +# SVE reverse vector elements +REV_v 00000101 .. 1 11000 001110 ..... ..... @rd_rn + +# SVE vector table lookup +TBL 00000101 .. 1 ..... 001100 ..... ..... @rd_rn_rm + +# SVE unpack vector elements +UNPK 00000101 esz:2 1100 u:1 h:1 001110 rn:5 rd:5 + ### SVE Predicate Logical Operations Group # SVE predicate logical operations From patchwork Wed May 30 18:01:05 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 137270 Delivered-To: patch@linaro.org Received: by 2002:a2e:9706:0:0:0:0:0 with SMTP id r6-v6csp5635319lji; Wed, 30 May 2018 11:08:41 -0700 (PDT) X-Google-Smtp-Source: ADUXVKJwe9araumLWq3XwrzG8BiHrV4Cr3zIFFGgMsd3EIgTDY2QGCDJyvLaj5X/z+v6sdZXYGWO X-Received: by 2002:ac8:29a5:: with SMTP id 34-v6mr3674894qts.383.1527703720923; Wed, 30 May 2018 11:08:40 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1527703720; cv=none; d=google.com; s=arc-20160816; b=fXj9Yn5iOhPvpxCVB8DOTOWpPIf205G+/BnjSNNF31EBsAhyEJp7Z+n88yA4xeoFG/ /pbBb3V+gbpIuMLoNvRlC21gMhPF3bXAuQt+SqGaQ+47Q+oa9jg02uwoh3G+RdfaEDYV HeeA11okCzvSg7PTOJbcBxeroFx6yFCXcxccQ1HrA1ts5vdacO9jH17vdwcE1PBIokGS jt5lI2xs9rVnc2gqPD54JPBjRNZOEUdVf35vdQWa/oHuC/lzLrUMZxF3NAw6zIerFyF9 I6B34Fq0efB73pSc43DbCnEQMX1gcPj8hbuXclphqFWsUqPB7+wkMtb0wxIhU09tzGaG wZKg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature:arc-authentication-results; bh=20EZoQ8QPU8yFGAo8rpV97QFYZBSCVksGfxx17wN4pk=; b=ls0SCzEiDNZAplxRccRzZiScGUcfeLlvyeyIPlzgYxE7d4DuEIUIKAJvxVRKwZ2QUu CQsm7Njpa20qOzqAS1v9DART/ryXdDzl49p7jisI2TkvBstnJGyKjxaiZLQ7VCHqYG1N H2nU8r+lhbMOLbpEQViYofrljDoikkWHDmbKGD6/RxsG1bvtzPHzY7Nw11wNh0f9c/F5 7sWgmwVdesgRnUhlqmam6Ir4iiLLKYUJxNagg61LS7g9NDH2IqudS0OWee5tlXBFLe6l tvZ+ydL161rvCaFPQmKuhPN6kcGfkUDCpEybVZs9cTfcGoMr0piZZ33Dbzg8l60wVrul aqIA== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=XPvSo7W2; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom=qemu-devel-bounces+patch=linaro.org@nongnu.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [2001:4830:134:3::11]) by mx.google.com with ESMTPS id 28-v6si2498066qtq.115.2018.05.30.11.08.40 for (version=TLS1 cipher=AES128-SHA bits=128/128); Wed, 30 May 2018 11:08:40 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) client-ip=2001:4830:134:3::11; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=XPvSo7W2; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom=qemu-devel-bounces+patch=linaro.org@nongnu.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:40080 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fO5Wa-00076m-9V for patch@linaro.org; Wed, 30 May 2018 14:08:40 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:33229) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fO5Ph-0001r4-IQ for qemu-devel@nongnu.org; Wed, 30 May 2018 14:01:36 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fO5Pe-0004Rg-Lx for qemu-devel@nongnu.org; Wed, 30 May 2018 14:01:33 -0400 Received: from mail-pl0-x243.google.com ([2607:f8b0:400e:c01::243]:41846) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fO5Pe-0004PU-CF for qemu-devel@nongnu.org; Wed, 30 May 2018 14:01:30 -0400 Received: by mail-pl0-x243.google.com with SMTP id az12-v6so11548194plb.8 for ; Wed, 30 May 2018 11:01:30 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=20EZoQ8QPU8yFGAo8rpV97QFYZBSCVksGfxx17wN4pk=; b=XPvSo7W2DlD5+yKtl8Jnpj7xvmGrY4TtqP+dXnfcuf06PSUvX4HHH6JmziQVs3kslG ypxMwprfCD5kRWzBO05jxCbXT1+BuSJBPIPq4RllBXOfYrB3Qbo3WbwsLhso/yEd6n3L NXdN6V4qBYInRF3UA1udX98i9cVXyOAbZuySY= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=20EZoQ8QPU8yFGAo8rpV97QFYZBSCVksGfxx17wN4pk=; b=ZOJy095RDKc3BxpeODPBF+duvHjiNzMosdV08Y554VVr+PzAYYyDp4nFLuvhI4e4dX j/8X1OxH06s6OQ7WC2kFEHg8NbcasQjQ48wfxDlC1PfpEl4SZVcMYiKDbcuhuqHuIFSZ f8QQSuiyWoXpvF6LwxQArhKOlAdlSUtqRs35iniIFzVvJP6Y4Y86DMiLQ7M7IdIf0d0K QJsLQwwqo2cxjyt/o8MF9ybtug/LbDfupujYHJwlib7Bzckv0HJyl20XNP1RkxMh2Wgh brMZ5OIkwASmDvBcMHL5lSN+7j4++h8U5x/0L68UQ20jg37n46UDmSK35XAgdQfsPpZ6 Y3uw== X-Gm-Message-State: ALKqPwdu40R2ESWrZpWAzzD3mWr3MTreGYHpvzisMmNQ/OdAZn+55q1j 3BrkV+p9vE6xgmmhpJ0W+4hlMuwOG18= X-Received: by 2002:a17:902:64cf:: with SMTP id y15-v6mr3840781pli.53.1527703287507; Wed, 30 May 2018 11:01:27 -0700 (PDT) Received: from cloudburst.twiddle.net (97-126-112-211.tukw.qwest.net. [97.126.112.211]) by smtp.gmail.com with ESMTPSA id b84-v6sm28179157pfm.123.2018.05.30.11.01.25 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 30 May 2018 11:01:26 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 30 May 2018 11:01:05 -0700 Message-Id: <20180530180120.13355-4-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.0 In-Reply-To: <20180530180120.13355-1-richard.henderson@linaro.org> References: <20180530180120.13355-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c01::243 Subject: [Qemu-devel] [PATCH v3b 03/18] target/arm: Implement SVE Permute - Predicates Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 6 + target/arm/sve_helper.c | 290 +++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 120 +++++++++++++++ target/arm/sve.decode | 18 +++ 4 files changed, 434 insertions(+) -- 2.17.0 diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 0c9aad575e..ff958fcebd 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -439,6 +439,12 @@ DEF_HELPER_FLAGS_3(sve_uunpk_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve_uunpk_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve_uunpk_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_zip_p, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_uzp_p, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_trn_p, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_rev_p, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_punpk_p, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 58c0fda333..f4d49d4aff 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1674,3 +1674,293 @@ DO_UNPK(sve_uunpk_s, uint32_t, uint16_t, H4, H2) DO_UNPK(sve_uunpk_d, uint64_t, uint32_t, , H4) #undef DO_UNPK + +/* Mask of bits included in the even numbered predicates of width esz. + * We also use this for expand_bits/compress_bits, and so extend the + * same pattern out to 16-bit units. + */ +static const uint64_t even_bit_esz_masks[5] = { + 0x5555555555555555ull, + 0x3333333333333333ull, + 0x0f0f0f0f0f0f0f0full, + 0x00ff00ff00ff00ffull, + 0x0000ffff0000ffffull, +}; + +/* Zero-extend units of 2**N bits to units of 2**(N+1) bits. + * For N==0, this corresponds to the operation that in qemu/bitops.h + * we call half_shuffle64; this algorithm is from Hacker's Delight, + * section 7-2 Shuffling Bits. + */ +static uint64_t expand_bits(uint64_t x, int n) +{ + int i; + + x &= 0xffffffffu; + for (i = 4; i >= n; i--) { + int sh = 1 << i; + x = ((x << sh) | x) & even_bit_esz_masks[i]; + } + return x; +} + +/* Compress units of 2**(N+1) bits to units of 2**N bits. + * For N==0, this corresponds to the operation that in qemu/bitops.h + * we call half_unshuffle64; this algorithm is from Hacker's Delight, + * section 7-2 Shuffling Bits, where it is called an inverse half shuffle. + */ +static uint64_t compress_bits(uint64_t x, int n) +{ + int i; + + for (i = n; i <= 4; i++) { + int sh = 1 << i; + x &= even_bit_esz_masks[i]; + x = (x >> sh) | x; + } + return x & 0xffffffffu; +} + +void HELPER(sve_zip_p)(void *vd, void *vn, void *vm, uint32_t pred_desc) +{ + intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + int esz = extract32(pred_desc, SIMD_DATA_SHIFT, 2); + intptr_t high = extract32(pred_desc, SIMD_DATA_SHIFT + 2, 1); + uint64_t *d = vd; + intptr_t i; + + if (oprsz <= 8) { + uint64_t nn = *(uint64_t *)vn; + uint64_t mm = *(uint64_t *)vm; + int half = 4 * oprsz; + + nn = extract64(nn, high * half, half); + mm = extract64(mm, high * half, half); + nn = expand_bits(nn, esz); + mm = expand_bits(mm, esz); + d[0] = nn + (mm << (1 << esz)); + } else { + ARMPredicateReg tmp_n, tmp_m; + + /* We produce output faster than we consume input. + Therefore we must be mindful of possible overlap. */ + if ((vn - vd) < (uintptr_t)oprsz) { + vn = memcpy(&tmp_n, vn, oprsz); + } + if ((vm - vd) < (uintptr_t)oprsz) { + vm = memcpy(&tmp_m, vm, oprsz); + } + if (high) { + high = oprsz >> 1; + } + + if ((high & 3) == 0) { + uint32_t *n = vn, *m = vm; + high >>= 2; + + for (i = 0; i < DIV_ROUND_UP(oprsz, 8); i++) { + uint64_t nn = n[H4(high + i)]; + uint64_t mm = m[H4(high + i)]; + + nn = expand_bits(nn, esz); + mm = expand_bits(mm, esz); + d[i] = nn + (mm << (1 << esz)); + } + } else { + uint8_t *n = vn, *m = vm; + uint16_t *d16 = vd; + + for (i = 0; i < oprsz / 2; i++) { + uint16_t nn = n[H1(high + i)]; + uint16_t mm = m[H1(high + i)]; + + nn = expand_bits(nn, esz); + mm = expand_bits(mm, esz); + d16[H2(i)] = nn + (mm << (1 << esz)); + } + } + } +} + +void HELPER(sve_uzp_p)(void *vd, void *vn, void *vm, uint32_t pred_desc) +{ + intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + int esz = extract32(pred_desc, SIMD_DATA_SHIFT, 2); + int odd = extract32(pred_desc, SIMD_DATA_SHIFT + 2, 1) << esz; + uint64_t *d = vd, *n = vn, *m = vm; + uint64_t l, h; + intptr_t i; + + if (oprsz <= 8) { + l = compress_bits(n[0] >> odd, esz); + h = compress_bits(m[0] >> odd, esz); + d[0] = extract64(l + (h << (4 * oprsz)), 0, 8 * oprsz); + } else { + ARMPredicateReg tmp_m; + intptr_t oprsz_16 = oprsz / 16; + + if ((vm - vd) < (uintptr_t)oprsz) { + m = memcpy(&tmp_m, vm, oprsz); + } + + for (i = 0; i < oprsz_16; i++) { + l = n[2 * i + 0]; + h = n[2 * i + 1]; + l = compress_bits(l >> odd, esz); + h = compress_bits(h >> odd, esz); + d[i] = l + (h << 32); + } + + /* For VL which is not a power of 2, the results from M do not + align nicely with the uint64_t for D. Put the aligned results + from M into TMP_M and then copy it into place afterward. */ + if (oprsz & 15) { + d[i] = compress_bits(n[2 * i] >> odd, esz); + + for (i = 0; i < oprsz_16; i++) { + l = m[2 * i + 0]; + h = m[2 * i + 1]; + l = compress_bits(l >> odd, esz); + h = compress_bits(h >> odd, esz); + tmp_m.p[i] = l + (h << 32); + } + tmp_m.p[i] = compress_bits(m[2 * i] >> odd, esz); + + swap_memmove(vd + oprsz / 2, &tmp_m, oprsz / 2); + } else { + for (i = 0; i < oprsz_16; i++) { + l = m[2 * i + 0]; + h = m[2 * i + 1]; + l = compress_bits(l >> odd, esz); + h = compress_bits(h >> odd, esz); + d[oprsz_16 + i] = l + (h << 32); + } + } + } +} + +void HELPER(sve_trn_p)(void *vd, void *vn, void *vm, uint32_t pred_desc) +{ + intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + uintptr_t esz = extract32(pred_desc, SIMD_DATA_SHIFT, 2); + bool odd = extract32(pred_desc, SIMD_DATA_SHIFT + 2, 1); + uint64_t *d = vd, *n = vn, *m = vm; + uint64_t mask; + int shr, shl; + intptr_t i; + + shl = 1 << esz; + shr = 0; + mask = even_bit_esz_masks[esz]; + if (odd) { + mask <<= shl; + shr = shl; + shl = 0; + } + + for (i = 0; i < DIV_ROUND_UP(oprsz, 8); i++) { + uint64_t nn = (n[i] & mask) >> shr; + uint64_t mm = (m[i] & mask) << shl; + d[i] = nn + mm; + } +} + +/* Reverse units of 2**N bits. */ +static uint64_t reverse_bits_64(uint64_t x, int n) +{ + int i, sh; + + x = bswap64(x); + for (i = 2, sh = 4; i >= n; i--, sh >>= 1) { + uint64_t mask = even_bit_esz_masks[i]; + x = ((x & mask) << sh) | ((x >> sh) & mask); + } + return x; +} + +static uint8_t reverse_bits_8(uint8_t x, int n) +{ + static const uint8_t mask[3] = { 0x55, 0x33, 0x0f }; + int i, sh; + + for (i = 2, sh = 4; i >= n; i--, sh >>= 1) { + x = ((x & mask[i]) << sh) | ((x >> sh) & mask[i]); + } + return x; +} + +void HELPER(sve_rev_p)(void *vd, void *vn, uint32_t pred_desc) +{ + intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + int esz = extract32(pred_desc, SIMD_DATA_SHIFT, 2); + intptr_t i, oprsz_2 = oprsz / 2; + + if (oprsz <= 8) { + uint64_t l = *(uint64_t *)vn; + l = reverse_bits_64(l << (64 - 8 * oprsz), esz); + *(uint64_t *)vd = l; + } else if ((oprsz & 15) == 0) { + for (i = 0; i < oprsz_2; i += 8) { + intptr_t ih = oprsz - 8 - i; + uint64_t l = reverse_bits_64(*(uint64_t *)(vn + i), esz); + uint64_t h = reverse_bits_64(*(uint64_t *)(vn + ih), esz); + *(uint64_t *)(vd + i) = h; + *(uint64_t *)(vd + ih) = l; + } + } else { + for (i = 0; i < oprsz_2; i += 1) { + intptr_t il = H1(i); + intptr_t ih = H1(oprsz - 1 - i); + uint8_t l = reverse_bits_8(*(uint8_t *)(vn + il), esz); + uint8_t h = reverse_bits_8(*(uint8_t *)(vn + ih), esz); + *(uint8_t *)(vd + il) = h; + *(uint8_t *)(vd + ih) = l; + } + } +} + +void HELPER(sve_punpk_p)(void *vd, void *vn, uint32_t pred_desc) +{ + intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + intptr_t high = extract32(pred_desc, SIMD_DATA_SHIFT + 2, 1); + uint64_t *d = vd; + intptr_t i; + + if (oprsz <= 8) { + uint64_t nn = *(uint64_t *)vn; + int half = 4 * oprsz; + + nn = extract64(nn, high * half, half); + nn = expand_bits(nn, 0); + d[0] = nn; + } else { + ARMPredicateReg tmp_n; + + /* We produce output faster than we consume input. + Therefore we must be mindful of possible overlap. */ + if ((vn - vd) < (uintptr_t)oprsz) { + vn = memcpy(&tmp_n, vn, oprsz); + } + if (high) { + high = oprsz >> 1; + } + + if ((high & 3) == 0) { + uint32_t *n = vn; + high >>= 2; + + for (i = 0; i < DIV_ROUND_UP(oprsz, 8); i++) { + uint64_t nn = n[H4(high + i)]; + d[i] = expand_bits(nn, 0); + } + } else { + uint16_t *d16 = vd; + uint8_t *n = vn; + + for (i = 0; i < oprsz / 2; i++) { + uint16_t nn = n[H1(high + i)]; + d16[H2(i)] = expand_bits(nn, 0); + } + } + } +} diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 388cce9924..0160d06915 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -2089,6 +2089,126 @@ static bool trans_UNPK(DisasContext *s, arg_UNPK *a, uint32_t insn) return true; } +/* + *** SVE Permute - Predicates Group + */ + +static bool do_perm_pred3(DisasContext *s, arg_rrr_esz *a, bool high_odd, + gen_helper_gvec_3 *fn) +{ + if (!sve_access_check(s)) { + return true; + } + + unsigned vsz = pred_full_reg_size(s); + + /* Predicate sizes may be smaller and cannot use simd_desc. + We cannot round up, as we do elsewhere, because we need + the exact size for ZIP2 and REV. We retain the style for + the other helpers for consistency. */ + TCGv_ptr t_d = tcg_temp_new_ptr(); + TCGv_ptr t_n = tcg_temp_new_ptr(); + TCGv_ptr t_m = tcg_temp_new_ptr(); + TCGv_i32 t_desc; + int desc; + + desc = vsz - 2; + desc = deposit32(desc, SIMD_DATA_SHIFT, 2, a->esz); + desc = deposit32(desc, SIMD_DATA_SHIFT + 2, 2, high_odd); + + tcg_gen_addi_ptr(t_d, cpu_env, pred_full_reg_offset(s, a->rd)); + tcg_gen_addi_ptr(t_n, cpu_env, pred_full_reg_offset(s, a->rn)); + tcg_gen_addi_ptr(t_m, cpu_env, pred_full_reg_offset(s, a->rm)); + t_desc = tcg_const_i32(desc); + + fn(t_d, t_n, t_m, t_desc); + + tcg_temp_free_ptr(t_d); + tcg_temp_free_ptr(t_n); + tcg_temp_free_ptr(t_m); + tcg_temp_free_i32(t_desc); + return true; +} + +static bool do_perm_pred2(DisasContext *s, arg_rr_esz *a, bool high_odd, + gen_helper_gvec_2 *fn) +{ + if (!sve_access_check(s)) { + return true; + } + + unsigned vsz = pred_full_reg_size(s); + TCGv_ptr t_d = tcg_temp_new_ptr(); + TCGv_ptr t_n = tcg_temp_new_ptr(); + TCGv_i32 t_desc; + int desc; + + tcg_gen_addi_ptr(t_d, cpu_env, pred_full_reg_offset(s, a->rd)); + tcg_gen_addi_ptr(t_n, cpu_env, pred_full_reg_offset(s, a->rn)); + + /* Predicate sizes may be smaller and cannot use simd_desc. + We cannot round up, as we do elsewhere, because we need + the exact size for ZIP2 and REV. We retain the style for + the other helpers for consistency. */ + + desc = vsz - 2; + desc = deposit32(desc, SIMD_DATA_SHIFT, 2, a->esz); + desc = deposit32(desc, SIMD_DATA_SHIFT + 2, 2, high_odd); + t_desc = tcg_const_i32(desc); + + fn(t_d, t_n, t_desc); + + tcg_temp_free_i32(t_desc); + tcg_temp_free_ptr(t_d); + tcg_temp_free_ptr(t_n); + return true; +} + +static bool trans_ZIP1_p(DisasContext *s, arg_rrr_esz *a, uint32_t insn) +{ + return do_perm_pred3(s, a, 0, gen_helper_sve_zip_p); +} + +static bool trans_ZIP2_p(DisasContext *s, arg_rrr_esz *a, uint32_t insn) +{ + return do_perm_pred3(s, a, 1, gen_helper_sve_zip_p); +} + +static bool trans_UZP1_p(DisasContext *s, arg_rrr_esz *a, uint32_t insn) +{ + return do_perm_pred3(s, a, 0, gen_helper_sve_uzp_p); +} + +static bool trans_UZP2_p(DisasContext *s, arg_rrr_esz *a, uint32_t insn) +{ + return do_perm_pred3(s, a, 1, gen_helper_sve_uzp_p); +} + +static bool trans_TRN1_p(DisasContext *s, arg_rrr_esz *a, uint32_t insn) +{ + return do_perm_pred3(s, a, 0, gen_helper_sve_trn_p); +} + +static bool trans_TRN2_p(DisasContext *s, arg_rrr_esz *a, uint32_t insn) +{ + return do_perm_pred3(s, a, 1, gen_helper_sve_trn_p); +} + +static bool trans_REV_p(DisasContext *s, arg_rr_esz *a, uint32_t insn) +{ + return do_perm_pred2(s, a, 0, gen_helper_sve_rev_p); +} + +static bool trans_PUNPKLO(DisasContext *s, arg_PUNPKLO *a, uint32_t insn) +{ + return do_perm_pred2(s, a, 0, gen_helper_sve_punpk_p); +} + +static bool trans_PUNPKHI(DisasContext *s, arg_PUNPKHI *a, uint32_t insn) +{ + return do_perm_pred2(s, a, 1, gen_helper_sve_punpk_p); +} + /* *** SVE Memory - 32-bit Gather and Unsized Contiguous Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 7ffd7962c8..26fe1608c4 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -86,6 +86,7 @@ # Three operand, vector element size @rd_rn_rm ........ esz:2 . rm:5 ... ... rn:5 rd:5 &rrr_esz +@pd_pn_pm ........ esz:2 .. rm:4 ....... rn:4 . rd:4 &rrr_esz @rdn_rm ........ esz:2 ...... ...... rm:5 rd:5 \ &rrr_esz rn=%reg_movprfx @@ -396,6 +397,23 @@ TBL 00000101 .. 1 ..... 001100 ..... ..... @rd_rn_rm # SVE unpack vector elements UNPK 00000101 esz:2 1100 u:1 h:1 001110 rn:5 rd:5 +### SVE Permute - Predicates Group + +# SVE permute predicate elements +ZIP1_p 00000101 .. 10 .... 010 000 0 .... 0 .... @pd_pn_pm +ZIP2_p 00000101 .. 10 .... 010 001 0 .... 0 .... @pd_pn_pm +UZP1_p 00000101 .. 10 .... 010 010 0 .... 0 .... @pd_pn_pm +UZP2_p 00000101 .. 10 .... 010 011 0 .... 0 .... @pd_pn_pm +TRN1_p 00000101 .. 10 .... 010 100 0 .... 0 .... @pd_pn_pm +TRN2_p 00000101 .. 10 .... 010 101 0 .... 0 .... @pd_pn_pm + +# SVE reverse predicate elements +REV_p 00000101 .. 11 0100 010 000 0 .... 0 .... @pd_pn + +# SVE unpack predicate elements +PUNPKLO 00000101 00 11 0000 010 000 0 .... 0 .... @pd_pn_e0 +PUNPKHI 00000101 00 11 0001 010 000 0 .... 0 .... @pd_pn_e0 + ### SVE Predicate Logical Operations Group # SVE predicate logical operations From patchwork Wed May 30 18:01:06 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 137267 Delivered-To: patch@linaro.org Received: by 2002:a2e:9706:0:0:0:0:0 with SMTP id r6-v6csp5632011lji; Wed, 30 May 2018 11:05:23 -0700 (PDT) X-Google-Smtp-Source: ADUXVKJk54P6X+rcUauvjnAxIIh+6PcUh1DcGhuzfcqz/lFBshq7geC8MUy2xPTvkcPo8kJmej57 X-Received: by 2002:aed:22b6:: with SMTP id p51-v6mr3746546qtc.222.1527703523151; Wed, 30 May 2018 11:05:23 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1527703523; cv=none; d=google.com; s=arc-20160816; b=dd46JECVVzeN1dOr8JxmE2GtX8NRdimU4WJtEVWGOY/49pvnJm67iYINtAT9JpYgPH JG8Qmyu9I5BScTg86gAt5zTxFI9yJIBzvwaV0UKbKmPYewcm8oW+JhjhqwflIAhNVPst podG6VxFAvIh03vjA+546a56k5J76dZm3VVuUFq009EWtBMzjI8MsT7N5anjTXEf43pn AbtZC3K90hcxzFJkFZ8Ezpz/bsBzqlD5GMQ/+yBr2UaNvvL3g/fppMR7N7KzFAAGX7zF FK9vUcJPpDhUJIzl+WDYX0CJemifiMbtYYz/SKJwk2LTgcJJJ5CmLUUHDw4egzGpyc2P /pJw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature:arc-authentication-results; bh=1Amlgc5th51H+/Lj0wMJTGxFXQDR4w362i1SW6kVP98=; b=GJs8UeyCdMEUMiZH5bAcmHc7WZunroIWvDogyWp9RX/P+DMOlYXLH7noBJGFSPtOpI ZZGV42GUuicqwXZCqatOH7RU5fenTFBtRJb9vU54eyNECdDr9VyVzUi8ZOO0HrWO2HkS OyWEHcIyhAOXDcSeJ3qNSjzW5nse/4UEf4pJRlVndLczH+9HoZ03+tyndgETYTGVsBWK xubKHpvmsUWFbC9WUl1+0HlZm0uECM2fDjBCv6j3B9CzssncOU6noV9DeJjjmTI32nq4 gak41CLXtF/OU0I7AJxaiI4iky/xz6H2Z3k9I4iYSqnjrBtNmSviimbouj2xD6+kXqbL OqOQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=PlcYEP4P; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom=qemu-devel-bounces+patch=linaro.org@nongnu.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [2001:4830:134:3::11]) by mx.google.com with ESMTPS id t3-v6si3563907qvl.124.2018.05.30.11.05.22 for (version=TLS1 cipher=AES128-SHA bits=128/128); Wed, 30 May 2018 11:05:23 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) client-ip=2001:4830:134:3::11; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=PlcYEP4P; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom=qemu-devel-bounces+patch=linaro.org@nongnu.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:40053 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fO5TO-0004Kq-Kh for patch@linaro.org; Wed, 30 May 2018 14:05:22 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:33224) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fO5Ph-0001qo-G6 for qemu-devel@nongnu.org; Wed, 30 May 2018 14:01:36 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fO5Pf-0004SD-4p for qemu-devel@nongnu.org; Wed, 30 May 2018 14:01:33 -0400 Received: from mail-pf0-x236.google.com ([2607:f8b0:400e:c00::236]:37246) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fO5Pe-0004Qr-Tl for qemu-devel@nongnu.org; Wed, 30 May 2018 14:01:31 -0400 Received: by mail-pf0-x236.google.com with SMTP id e9-v6so9408226pfi.4 for ; Wed, 30 May 2018 11:01:30 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=1Amlgc5th51H+/Lj0wMJTGxFXQDR4w362i1SW6kVP98=; b=PlcYEP4P7iUsgbFDpp1ZXvz4lkXTDrb7PPuefGN3AMfeQkbzisjz1z29yNrcRdz2Tg n8TcBkLR1Pddl6gwGsDv4ySfUJpuWlC7NLqLXlk6mfMe2ak2oDevdBgr7nBIQqQuMcVr 7i8fRvIn0q4xapU8e9s/SUxFw3SLKXIEMMItk= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=1Amlgc5th51H+/Lj0wMJTGxFXQDR4w362i1SW6kVP98=; b=cNybos0kh2XvUhGaE8qD/0/PuqGAVFnrnicyzfBndkF2mnhRrP3ulV6V1hJ6+pii28 PT8+nB90ptvDiloWX+UNE3ggk7oTxRRwi+9MX8LOC6yTMZlpdLyVSamNvykL1PXBxgKu WfOxJqSgVHvg5oJymaU4uGFvGRNYWIYWwUsRENtdPdxOcwHjVvK5KKVZhYhV/Ro6Bc3s zVKvKPzLiJG6PsXldziLiO30za9jhM1X4z9pqEtTDYkMDddHQK8aIEbi59Xor0vZ+MnT Ld7QEWJVwevfE3aC38eoaAUqV6Et6tTeghcPWc+8nTN/rYJtCtoYKuQ4S9bzEb245kyr rWpA== X-Gm-Message-State: ALKqPwfeI1yB/hkI3a0zXGAQ2UTyA+5rQFRLf7ndhy0pdTyaG4AlvXSR 3xA8WHDTM3bE7LRN1DqDKfNBx0UDwFw= X-Received: by 2002:a62:3c15:: with SMTP id j21-v6mr3717488pfa.7.1527703288929; Wed, 30 May 2018 11:01:28 -0700 (PDT) Received: from cloudburst.twiddle.net (97-126-112-211.tukw.qwest.net. [97.126.112.211]) by smtp.gmail.com with ESMTPSA id b84-v6sm28179157pfm.123.2018.05.30.11.01.27 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 30 May 2018 11:01:27 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 30 May 2018 11:01:06 -0700 Message-Id: <20180530180120.13355-5-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.0 In-Reply-To: <20180530180120.13355-1-richard.henderson@linaro.org> References: <20180530180120.13355-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c00::236 Subject: [Qemu-devel] [PATCH v3b 04/18] target/arm: Implement SVE Permute - Interleaving Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 15 ++++++++ target/arm/sve_helper.c | 72 ++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 75 ++++++++++++++++++++++++++++++++++++++ target/arm/sve.decode | 10 +++++ 4 files changed, 172 insertions(+) -- 2.17.0 diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index ff958fcebd..bab20345c6 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -445,6 +445,21 @@ DEF_HELPER_FLAGS_4(sve_trn_p, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve_rev_p, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve_punpk_p, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_zip_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_zip_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_zip_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_zip_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve_uzp_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_uzp_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_uzp_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_uzp_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve_trn_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_trn_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_trn_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_trn_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index f4d49d4aff..f114e9ab63 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1964,3 +1964,75 @@ void HELPER(sve_punpk_p)(void *vd, void *vn, uint32_t pred_desc) } } } + +#define DO_ZIP(NAME, TYPE, H) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ +{ \ + intptr_t oprsz = simd_oprsz(desc); \ + intptr_t i, oprsz_2 = oprsz / 2; \ + ARMVectorReg tmp_n, tmp_m; \ + /* We produce output faster than we consume input. \ + Therefore we must be mindful of possible overlap. */ \ + if (unlikely((vn - vd) < (uintptr_t)oprsz)) { \ + vn = memcpy(&tmp_n, vn, oprsz_2); \ + } \ + if (unlikely((vm - vd) < (uintptr_t)oprsz)) { \ + vm = memcpy(&tmp_m, vm, oprsz_2); \ + } \ + for (i = 0; i < oprsz_2; i += sizeof(TYPE)) { \ + *(TYPE *)(vd + H(2 * i + 0)) = *(TYPE *)(vn + H(i)); \ + *(TYPE *)(vd + H(2 * i + sizeof(TYPE))) = *(TYPE *)(vm + H(i)); \ + } \ +} + +DO_ZIP(sve_zip_b, uint8_t, H1) +DO_ZIP(sve_zip_h, uint16_t, H1_2) +DO_ZIP(sve_zip_s, uint32_t, H1_4) +DO_ZIP(sve_zip_d, uint64_t, ) + +#define DO_UZP(NAME, TYPE, H) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ +{ \ + intptr_t oprsz = simd_oprsz(desc); \ + intptr_t oprsz_2 = oprsz / 2; \ + intptr_t odd_ofs = simd_data(desc); \ + intptr_t i; \ + ARMVectorReg tmp_m; \ + if (unlikely((vm - vd) < (uintptr_t)oprsz)) { \ + vm = memcpy(&tmp_m, vm, oprsz); \ + } \ + for (i = 0; i < oprsz_2; i += sizeof(TYPE)) { \ + *(TYPE *)(vd + H(i)) = *(TYPE *)(vn + H(2 * i + odd_ofs)); \ + } \ + for (i = 0; i < oprsz_2; i += sizeof(TYPE)) { \ + *(TYPE *)(vd + H(oprsz_2 + i)) = *(TYPE *)(vm + H(2 * i + odd_ofs)); \ + } \ +} + +DO_UZP(sve_uzp_b, uint8_t, H1) +DO_UZP(sve_uzp_h, uint16_t, H1_2) +DO_UZP(sve_uzp_s, uint32_t, H1_4) +DO_UZP(sve_uzp_d, uint64_t, ) + +#define DO_TRN(NAME, TYPE, H) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ +{ \ + intptr_t oprsz = simd_oprsz(desc); \ + intptr_t odd_ofs = simd_data(desc); \ + intptr_t i; \ + for (i = 0; i < oprsz; i += 2 * sizeof(TYPE)) { \ + TYPE ae = *(TYPE *)(vn + H(i + odd_ofs)); \ + TYPE be = *(TYPE *)(vm + H(i + odd_ofs)); \ + *(TYPE *)(vd + H(i + 0)) = ae; \ + *(TYPE *)(vd + H(i + sizeof(TYPE))) = be; \ + } \ +} + +DO_TRN(sve_trn_b, uint8_t, H1) +DO_TRN(sve_trn_h, uint16_t, H1_2) +DO_TRN(sve_trn_s, uint32_t, H1_4) +DO_TRN(sve_trn_d, uint64_t, ) + +#undef DO_ZIP +#undef DO_UZP +#undef DO_TRN diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 0160d06915..21319518d7 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -2209,6 +2209,81 @@ static bool trans_PUNPKHI(DisasContext *s, arg_PUNPKHI *a, uint32_t insn) return do_perm_pred2(s, a, 1, gen_helper_sve_punpk_p); } +/* + *** SVE Permute - Interleaving Group + */ + +static bool do_zip(DisasContext *s, arg_rrr_esz *a, bool high) +{ + static gen_helper_gvec_3 * const fns[4] = { + gen_helper_sve_zip_b, gen_helper_sve_zip_h, + gen_helper_sve_zip_s, gen_helper_sve_zip_d, + }; + + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + unsigned high_ofs = high ? vsz / 2 : 0; + tcg_gen_gvec_3_ool(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn) + high_ofs, + vec_full_reg_offset(s, a->rm) + high_ofs, + vsz, vsz, 0, fns[a->esz]); + } + return true; +} + +static bool do_zzz_data_ool(DisasContext *s, arg_rrr_esz *a, int data, + gen_helper_gvec_3 *fn) +{ + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + tcg_gen_gvec_3_ool(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + vsz, vsz, data, fn); + } + return true; +} + +static bool trans_ZIP1_z(DisasContext *s, arg_rrr_esz *a, uint32_t insn) +{ + return do_zip(s, a, false); +} + +static bool trans_ZIP2_z(DisasContext *s, arg_rrr_esz *a, uint32_t insn) +{ + return do_zip(s, a, true); +} + +static gen_helper_gvec_3 * const uzp_fns[4] = { + gen_helper_sve_uzp_b, gen_helper_sve_uzp_h, + gen_helper_sve_uzp_s, gen_helper_sve_uzp_d, +}; + +static bool trans_UZP1_z(DisasContext *s, arg_rrr_esz *a, uint32_t insn) +{ + return do_zzz_data_ool(s, a, 0, uzp_fns[a->esz]); +} + +static bool trans_UZP2_z(DisasContext *s, arg_rrr_esz *a, uint32_t insn) +{ + return do_zzz_data_ool(s, a, 1 << a->esz, uzp_fns[a->esz]); +} + +static gen_helper_gvec_3 * const trn_fns[4] = { + gen_helper_sve_trn_b, gen_helper_sve_trn_h, + gen_helper_sve_trn_s, gen_helper_sve_trn_d, +}; + +static bool trans_TRN1_z(DisasContext *s, arg_rrr_esz *a, uint32_t insn) +{ + return do_zzz_data_ool(s, a, 0, trn_fns[a->esz]); +} + +static bool trans_TRN2_z(DisasContext *s, arg_rrr_esz *a, uint32_t insn) +{ + return do_zzz_data_ool(s, a, 1 << a->esz, trn_fns[a->esz]); +} + /* *** SVE Memory - 32-bit Gather and Unsized Contiguous Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 26fe1608c4..df2b94dc0a 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -414,6 +414,16 @@ REV_p 00000101 .. 11 0100 010 000 0 .... 0 .... @pd_pn PUNPKLO 00000101 00 11 0000 010 000 0 .... 0 .... @pd_pn_e0 PUNPKHI 00000101 00 11 0001 010 000 0 .... 0 .... @pd_pn_e0 +### SVE Permute - Interleaving Group + +# SVE permute vector elements +ZIP1_z 00000101 .. 1 ..... 011 000 ..... ..... @rd_rn_rm +ZIP2_z 00000101 .. 1 ..... 011 001 ..... ..... @rd_rn_rm +UZP1_z 00000101 .. 1 ..... 011 010 ..... ..... @rd_rn_rm +UZP2_z 00000101 .. 1 ..... 011 011 ..... ..... @rd_rn_rm +TRN1_z 00000101 .. 1 ..... 011 100 ..... ..... @rd_rn_rm +TRN2_z 00000101 .. 1 ..... 011 101 ..... ..... @rd_rn_rm + ### SVE Predicate Logical Operations Group # SVE predicate logical operations From patchwork Wed May 30 18:01:07 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 137269 Delivered-To: patch@linaro.org Received: by 2002:a2e:9706:0:0:0:0:0 with SMTP id r6-v6csp5634759lji; Wed, 30 May 2018 11:08:11 -0700 (PDT) X-Google-Smtp-Source: ADUXVKLCWV8x4om5WWIz1g/K0K+eCkCtJdzScxBCMZ6SzkOsP5O0tCIOf1zKq32k4jq+Ro5xrprA X-Received: by 2002:aed:226d:: with SMTP id o42-v6mr3802475qtc.258.1527703690978; Wed, 30 May 2018 11:08:10 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1527703690; cv=none; d=google.com; s=arc-20160816; b=xjRMjb17/8tmmKvULCtJAK0KfBtO+WxMGJb060LpG24Dlo5c8be3nxyC8d01mhokWD RxS6BWH38ji0VnmJ+qmiYl+Uxy6UWPZDZmu0HRGvlcpjrlH7MxPT3MlKetYmPB1x3rh/ MYEoCO+DS2/E7xL5Xo+UnZIE2HDj1pVBlH2pSM8BVuLtG2NTwdnaQW9XiWoO3E2ea3df I3A5CZErx/ihFWESQGHuenONQXRq8QZVFofSmkfk2a7aQxwrlCK3Au2QAQ3Gj741Zecc K4On/NlSkOTnSRzqRZY9hSastSRju/087FV4h2tQ+jhCrxR+CMkyBSNl1T4AWUyUxZjD NexQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature:arc-authentication-results; bh=PDpmyfb4bOVCnOHan4I6CC+vGr/XXSD5EaKpHePia6k=; b=WZmUphdFMZjeAoUNg8CF9vQvX9J1QyL6oRBg1CaJqzABRjjt623Bup8LOwYfcJ5M4y k/nlBK1FTVjWy9U+I4lCABuVyNBmLwQH3OjegUMH9MbkePGLTtSh1n+mIxohFMKXMqP6 xdPVnIvx05inY+qCAI229EX/yAUSrgCJCpcd9KtkoVllcyuj7Xi4Nw2WW/O1FTvYlVEF KlVQTjpZwGg8iB9CoETe8GFS0vCCbdaAVtbpHIounfTYA8mYQJlNVs7tbjbOjy6agKxJ EnXpYESH9QxgVt8gj9evtYOBAjKyGwlb9e7q6lILcurHz1wIBuaZ1QJs4QfJuLZ9ZruC BF/A== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=T2rP/jr9; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom=qemu-devel-bounces+patch=linaro.org@nongnu.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [2001:4830:134:3::11]) by mx.google.com with ESMTPS id m82-v6si4232946qki.56.2018.05.30.11.08.10 for (version=TLS1 cipher=AES128-SHA bits=128/128); Wed, 30 May 2018 11:08:10 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) client-ip=2001:4830:134:3::11; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=T2rP/jr9; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom=qemu-devel-bounces+patch=linaro.org@nongnu.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:40071 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fO5W6-0005r9-E1 for patch@linaro.org; Wed, 30 May 2018 14:08:10 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:33262) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fO5Pj-0001sS-1Z for qemu-devel@nongnu.org; Wed, 30 May 2018 14:01:36 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fO5Pg-0004U2-El for qemu-devel@nongnu.org; Wed, 30 May 2018 14:01:35 -0400 Received: from mail-pf0-x22c.google.com ([2607:f8b0:400e:c00::22c]:39944) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fO5Pg-0004SZ-6X for qemu-devel@nongnu.org; Wed, 30 May 2018 14:01:32 -0400 Received: by mail-pf0-x22c.google.com with SMTP id f189-v6so9413967pfa.7 for ; Wed, 30 May 2018 11:01:32 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=PDpmyfb4bOVCnOHan4I6CC+vGr/XXSD5EaKpHePia6k=; b=T2rP/jr9pMBVLTBhLlIzTy4Hz93hEpBngAC11ZoOnBYVocQbYox1ygyFiacP5HdKOO Zx3HAwg8A/Kna/RSVsO/87t9zEKJYZGV6Fnsk/pI0CCMRo5JPz/2A8fV7W5vwGXHTAhT oN41BvewirQW5PekPLVit2OpQJwlKFGPY7orE= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=PDpmyfb4bOVCnOHan4I6CC+vGr/XXSD5EaKpHePia6k=; b=oAgI8XpiJaGtSe0k8NJgT8w7Iz3bTZmLjsXD3TpyYP1pHVwdLO33v+/3pb4MXNHwJF 9V+zDDijLNr5FxWGmwjMU6WHA4BXp7V/HjQ/Kpf+vJo/UvdmkpxYFNZJysQfvb8G/0AG 4V0zRUmUzh+4lr5L/Zd7YQJoWDqIjuMdPrFy0su5V00vJPUb5B/IhV2lBMe1xxZKQtfc 8Y8PtTA7RengRJOPi5bJ4HhMOO/62g86570vc2znNWWzB32aaHSFcTOp88cHAb16zRTl vll7O/ma3qOgyZ4ZajkeuYboN1g1ndP0EkDcaVUgKXIBy84eAX0d/JQQef6cnTY1ZRCj jirw== X-Gm-Message-State: ALKqPwcY3KAKP9pyZtEX/sTpNcyLA/OcI8WsoXw5DCbUxSTdT0jog5Wg lU6gnfPXKckZpi2BcTjd/XAqQUiKI9M= X-Received: by 2002:a62:930c:: with SMTP id b12-v6mr3680072pfe.193.1527703290385; Wed, 30 May 2018 11:01:30 -0700 (PDT) Received: from cloudburst.twiddle.net (97-126-112-211.tukw.qwest.net. [97.126.112.211]) by smtp.gmail.com with ESMTPSA id b84-v6sm28179157pfm.123.2018.05.30.11.01.29 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 30 May 2018 11:01:29 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 30 May 2018 11:01:07 -0700 Message-Id: <20180530180120.13355-6-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.0 In-Reply-To: <20180530180120.13355-1-richard.henderson@linaro.org> References: <20180530180120.13355-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c00::22c Subject: [Qemu-devel] [PATCH v3b 05/18] target/arm: Implement SVE compress active elements X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 3 +++ target/arm/sve_helper.c | 34 ++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 12 ++++++++++++ target/arm/sve.decode | 6 ++++++ 4 files changed, 55 insertions(+) -- 2.17.0 diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index bab20345c6..d977aea00d 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -460,6 +460,9 @@ DEF_HELPER_FLAGS_4(sve_trn_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_trn_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_trn_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_compact_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_compact_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index f114e9ab63..ed3c6d4ca9 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -2036,3 +2036,37 @@ DO_TRN(sve_trn_d, uint64_t, ) #undef DO_ZIP #undef DO_UZP #undef DO_TRN + +void HELPER(sve_compact_s)(void *vd, void *vn, void *vg, uint32_t desc) +{ + intptr_t i, j, opr_sz = simd_oprsz(desc) / 4; + uint32_t *d = vd, *n = vn; + uint8_t *pg = vg; + + for (i = j = 0; i < opr_sz; i++) { + if (pg[H1(i / 2)] & (i & 1 ? 0x10 : 0x01)) { + d[H4(j)] = n[H4(i)]; + j++; + } + } + for (; j < opr_sz; j++) { + d[H4(j)] = 0; + } +} + +void HELPER(sve_compact_d)(void *vd, void *vn, void *vg, uint32_t desc) +{ + intptr_t i, j, opr_sz = simd_oprsz(desc) / 8; + uint64_t *d = vd, *n = vn; + uint8_t *pg = vg; + + for (i = j = 0; i < opr_sz; i++) { + if (pg[H1(i)] & 1) { + d[j] = n[i]; + j++; + } + } + for (; j < opr_sz; j++) { + d[j] = 0; + } +} diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 21319518d7..ed0f48a927 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -2284,6 +2284,18 @@ static bool trans_TRN2_z(DisasContext *s, arg_rrr_esz *a, uint32_t insn) return do_zzz_data_ool(s, a, 1 << a->esz, trn_fns[a->esz]); } +/* + *** SVE Permute Vector - Predicated Group + */ + +static bool trans_COMPACT(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + static gen_helper_gvec_3 * const fns[4] = { + NULL, NULL, gen_helper_sve_compact_s, gen_helper_sve_compact_d + }; + return do_zpz_ool(s, a, fns[a->esz]); +} + /* *** SVE Memory - 32-bit Gather and Unsized Contiguous Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index df2b94dc0a..9da6566d32 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -424,6 +424,12 @@ UZP2_z 00000101 .. 1 ..... 011 011 ..... ..... @rd_rn_rm TRN1_z 00000101 .. 1 ..... 011 100 ..... ..... @rd_rn_rm TRN2_z 00000101 .. 1 ..... 011 101 ..... ..... @rd_rn_rm +### SVE Permute - Predicated Group + +# SVE compress active elements +# Note esz >= 2 +COMPACT 00000101 .. 100001 100 ... ..... ..... @rd_pg_rn + ### SVE Predicate Logical Operations Group # SVE predicate logical operations From patchwork Wed May 30 18:01:08 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 137274 Delivered-To: patch@linaro.org Received: by 2002:a2e:9706:0:0:0:0:0 with SMTP id r6-v6csp5638603lji; Wed, 30 May 2018 11:11:58 -0700 (PDT) X-Google-Smtp-Source: ADUXVKL/ndE0f+QSeEbRA4ip+N4vVZO06FF1ZjvjcBKnVGoplxNM057AO4Eb2sIrq06Hax8kG6BV X-Received: by 2002:ac8:7159:: with SMTP id h25-v6mr3590321qtp.206.1527703918113; Wed, 30 May 2018 11:11:58 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1527703918; cv=none; d=google.com; s=arc-20160816; b=th5/2U9oXugAgvubXuG4S/Xp3DtLDbGPq9L7mcimPzFQ/7vaDV+HKyxGPXxcrgrBe1 f970xo1M6Ex+EdTujIq0y8cjoGuiaiIeXVoSHQiQ+n81UK+AoLy9pA4ank5TdTYdS1om 78kz0bALVloLo4WFsUe03gEdx9f0e0DBFkstr8BE/jmAjo7xwPDarq9sVIyEkGXheB6B t5FDdqXVtqMZuqwKMTq6oGerbG03pZGWJG/NCsNJN/quSvqkrTzKAuphRoDjp1HZC6/K WCnu16Ot1E3i4fDKX9c9IcmTlBvzwbaTlVHhX2yWfUGobONgbabOPvtlWc/raKoAkcqd aSog== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature:arc-authentication-results; bh=B9ekVgR/sp50y+9jw/muU0H/7AM7yNdC4FiG/uIbNMA=; b=KdcgRlhUJe06eCGNnHLsdiVefTHqTGLEY34o3w3qDjwlWM9t1ro+XFglcIPF2MAzLj IdR/iSn0SNnPKi6eOZAsr8oFmVtiCx5MvHFL2PCoPH7I3tRw8o+HwA4oSLV/PN7Ic1U5 jrmirDUFBm6ekHO+jWptLPOyZup1CEiY3aEAcBEhKsP2ryaJ9CbOa+a+4cBFhK8G6Lls m5CEOxu30FJQNHxbtMKYNh76ilYOkJJRU0XyxaVt9KhIJ9pt4jb15/O35o2KEBIh7jx5 EaECPebA5Xk3clUbbcPTHUvtPC/N/qhNUUVPmR30v226Ez70/msminMW65dYIUW+JEpD +Lyw== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=W9blBZ0N; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom=qemu-devel-bounces+patch=linaro.org@nongnu.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [2001:4830:134:3::11]) by mx.google.com with ESMTPS id e65-v6si72786qkj.238.2018.05.30.11.11.57 for (version=TLS1 cipher=AES128-SHA bits=128/128); Wed, 30 May 2018 11:11:58 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) client-ip=2001:4830:134:3::11; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=W9blBZ0N; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom=qemu-devel-bounces+patch=linaro.org@nongnu.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:40096 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fO5Zl-0000he-E0 for patch@linaro.org; Wed, 30 May 2018 14:11:57 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:33295) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fO5Pk-0001tm-CC for qemu-devel@nongnu.org; Wed, 30 May 2018 14:01:43 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fO5Ph-0004Vi-NN for qemu-devel@nongnu.org; Wed, 30 May 2018 14:01:36 -0400 Received: from mail-pl0-x229.google.com ([2607:f8b0:400e:c01::229]:33979) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fO5Ph-0004Uv-Cb for qemu-devel@nongnu.org; Wed, 30 May 2018 14:01:33 -0400 Received: by mail-pl0-x229.google.com with SMTP id ay10-v6so11557962plb.1 for ; Wed, 30 May 2018 11:01:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=B9ekVgR/sp50y+9jw/muU0H/7AM7yNdC4FiG/uIbNMA=; b=W9blBZ0ND+ZWQSRMaHBGYuNOmX8KIzh+nUWXcNElUrRtIypq9SXB4pCsrqjjJHadJE PLhSQxraDrivO+CACeYhIiNl11b+nz/PRE/adE9Zwm0ZTZu7AAQyu/gazutmYkqq/ow0 hpl9WRTEuDkoO83bFX8E1g7u63jTZI3dAvR+w= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=B9ekVgR/sp50y+9jw/muU0H/7AM7yNdC4FiG/uIbNMA=; b=ZFVBp8XCRGY/tha1wyBsOapKl+/Hduk+Eo5y7IRaZIM6uiQGPsei8mh7N/SmjQR/mo V6FaIHORUn3RTf/VCvQ5jr5tEgLf+EG89QZuqwZm2M8Nz0LJ4JwoB4IEaML81viARSrt voYExeheipTUsYODICHWv/mCo4TCRWhPbxzLgZa51yVKV5XFgVMil7R2qI+M5qGPG4J8 7mLhe/bV5ii+5eZ+/JwUVqhIbDDpw+peL6xsldz8N7Yam3P+7EE4dzcRfT4G0W7B3j4C cdvr63Y6e48czI9EOiktXr6fZCF/AAlkRKf9bZYOhFDVWfv7T2F/pXEsmJjBue0E1jui vPig== X-Gm-Message-State: ALKqPwclJ5M3pbwq8IBkOHGFP7FLrle3LTFFV7CPXUAOkPmhNs2Wqc1o AJ+aXcxyDXp4qgxR1T4CF91i+L6ah/c= X-Received: by 2002:a17:902:7244:: with SMTP id c4-v6mr3771026pll.265.1527703291688; Wed, 30 May 2018 11:01:31 -0700 (PDT) Received: from cloudburst.twiddle.net (97-126-112-211.tukw.qwest.net. [97.126.112.211]) by smtp.gmail.com with ESMTPSA id b84-v6sm28179157pfm.123.2018.05.30.11.01.30 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 30 May 2018 11:01:30 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 30 May 2018 11:01:08 -0700 Message-Id: <20180530180120.13355-7-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.0 In-Reply-To: <20180530180120.13355-1-richard.henderson@linaro.org> References: <20180530180120.13355-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c01::229 Subject: [Qemu-devel] [PATCH v3b 06/18] target/arm: Implement SVE conditionally broadcast/extract element X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 2 + target/arm/sve_helper.c | 11 ++ target/arm/translate-sve.c | 318 +++++++++++++++++++++++++++++++++++++ target/arm/sve.decode | 20 +++ 4 files changed, 351 insertions(+) -- 2.17.0 Reviewed-by: Peter Maydell diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index d977aea00d..a58fb4ba01 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -463,6 +463,8 @@ DEF_HELPER_FLAGS_4(sve_trn_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_compact_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_compact_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_2(sve_last_active_element, TCG_CALL_NO_RWG, s32, ptr, i32) + DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index ed3c6d4ca9..941a098234 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -2070,3 +2070,14 @@ void HELPER(sve_compact_d)(void *vd, void *vn, void *vg, uint32_t desc) d[j] = 0; } } + +/* Similar to the ARM LastActiveElement pseudocode function, except the + result is multiplied by the element size. This includes the not found + indication; e.g. not found for esz=3 is -8. */ +int32_t HELPER(sve_last_active_element)(void *vg, uint32_t pred_desc) +{ + intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + intptr_t esz = extract32(pred_desc, SIMD_DATA_SHIFT, 2); + + return last_active_element(vg, DIV_ROUND_UP(oprsz, 8), esz); +} diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index ed0f48a927..edcef277f8 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -2296,6 +2296,324 @@ static bool trans_COMPACT(DisasContext *s, arg_rpr_esz *a, uint32_t insn) return do_zpz_ool(s, a, fns[a->esz]); } +/* Call the helper that computes the ARM LastActiveElement pseudocode + function, scaled by the element size. This includes the not found + indication; e.g. not found for esz=3 is -8. */ +static void find_last_active(DisasContext *s, TCGv_i32 ret, int esz, int pg) +{ + /* Predicate sizes may be smaller and cannot use simd_desc. We cannot + round up, as we do elsewhere, because we need the exact size. */ + TCGv_ptr t_p = tcg_temp_new_ptr(); + TCGv_i32 t_desc; + unsigned vsz = pred_full_reg_size(s); + unsigned desc; + + desc = vsz - 2; + desc = deposit32(desc, SIMD_DATA_SHIFT, 2, esz); + + tcg_gen_addi_ptr(t_p, cpu_env, pred_full_reg_offset(s, pg)); + t_desc = tcg_const_i32(desc); + + gen_helper_sve_last_active_element(ret, t_p, t_desc); + + tcg_temp_free_i32(t_desc); + tcg_temp_free_ptr(t_p); +} + +/* Increment LAST to the offset of the next element in the vector, + wrapping around to 0. */ +static void incr_last_active(DisasContext *s, TCGv_i32 last, int esz) +{ + unsigned vsz = vec_full_reg_size(s); + + tcg_gen_addi_i32(last, last, 1 << esz); + if (is_power_of_2(vsz)) { + tcg_gen_andi_i32(last, last, vsz - 1); + } else { + TCGv_i32 max = tcg_const_i32(vsz); + TCGv_i32 zero = tcg_const_i32(0); + tcg_gen_movcond_i32(TCG_COND_GEU, last, last, max, zero, last); + tcg_temp_free_i32(max); + tcg_temp_free_i32(zero); + } +} + +/* If LAST < 0, set LAST to the offset of the last element in the vector. */ +static void wrap_last_active(DisasContext *s, TCGv_i32 last, int esz) +{ + unsigned vsz = vec_full_reg_size(s); + + if (is_power_of_2(vsz)) { + tcg_gen_andi_i32(last, last, vsz - 1); + } else { + TCGv_i32 max = tcg_const_i32(vsz - (1 << esz)); + TCGv_i32 zero = tcg_const_i32(0); + tcg_gen_movcond_i32(TCG_COND_LT, last, last, zero, max, last); + tcg_temp_free_i32(max); + tcg_temp_free_i32(zero); + } +} + +/* Load an unsigned element of ESZ from BASE+OFS. */ +static TCGv_i64 load_esz(TCGv_ptr base, int ofs, int esz) +{ + TCGv_i64 r = tcg_temp_new_i64(); + + switch (esz) { + case 0: + tcg_gen_ld8u_i64(r, base, ofs); + break; + case 1: + tcg_gen_ld16u_i64(r, base, ofs); + break; + case 2: + tcg_gen_ld32u_i64(r, base, ofs); + break; + case 3: + tcg_gen_ld_i64(r, base, ofs); + break; + default: + g_assert_not_reached(); + } + return r; +} + +/* Load an unsigned element of ESZ from RM[LAST]. */ +static TCGv_i64 load_last_active(DisasContext *s, TCGv_i32 last, + int rm, int esz) +{ + TCGv_ptr p = tcg_temp_new_ptr(); + TCGv_i64 r; + + /* Convert offset into vector into offset into ENV. + The final adjustment for the vector register base + is added via constant offset to the load. */ +#ifdef HOST_WORDS_BIGENDIAN + /* Adjust for element ordering. See vec_reg_offset. */ + if (esz < 3) { + tcg_gen_xori_i32(last, last, 8 - (1 << esz)); + } +#endif + tcg_gen_ext_i32_ptr(p, last); + tcg_gen_add_ptr(p, p, cpu_env); + + r = load_esz(p, vec_full_reg_offset(s, rm), esz); + tcg_temp_free_ptr(p); + + return r; +} + +/* Compute CLAST for a Zreg. */ +static bool do_clast_vector(DisasContext *s, arg_rprr_esz *a, bool before) +{ + if (!sve_access_check(s)) { + return true; + } + + TCGv_i32 last = tcg_temp_local_new_i32(); + TCGLabel *over = gen_new_label(); + TCGv_i64 ele; + unsigned vsz, esz = a->esz; + + find_last_active(s, last, esz, a->pg); + + /* There is of course no movcond for a 2048-bit vector, + so we must branch over the actual store. */ + tcg_gen_brcondi_i32(TCG_COND_LT, last, 0, over); + + if (!before) { + incr_last_active(s, last, esz); + } + + ele = load_last_active(s, last, a->rm, esz); + tcg_temp_free_i32(last); + + vsz = vec_full_reg_size(s); + tcg_gen_gvec_dup_i64(esz, vec_full_reg_offset(s, a->rd), vsz, vsz, ele); + tcg_temp_free_i64(ele); + + /* If this insn used MOVPRFX, we may need a second move. */ + if (a->rd != a->rn) { + TCGLabel *done = gen_new_label(); + tcg_gen_br(done); + + gen_set_label(over); + do_mov_z(s, a->rd, a->rn); + + gen_set_label(done); + } else { + gen_set_label(over); + } + return true; +} + +static bool trans_CLASTA_z(DisasContext *s, arg_rprr_esz *a, uint32_t insn) +{ + return do_clast_vector(s, a, false); +} + +static bool trans_CLASTB_z(DisasContext *s, arg_rprr_esz *a, uint32_t insn) +{ + return do_clast_vector(s, a, true); +} + +/* Compute CLAST for a scalar. */ +static void do_clast_scalar(DisasContext *s, int esz, int pg, int rm, + bool before, TCGv_i64 reg_val) +{ + TCGv_i32 last = tcg_temp_new_i32(); + TCGv_i64 ele, cmp, zero; + + find_last_active(s, last, esz, pg); + + /* Extend the original value of last prior to incrementing. */ + cmp = tcg_temp_new_i64(); + tcg_gen_ext_i32_i64(cmp, last); + + if (!before) { + incr_last_active(s, last, esz); + } + + /* The conceit here is that while last < 0 indicates not found, after + adjusting for cpu_env->vfp.zregs[rm], it is still a valid address + from which we can load garbage. We then discard the garbage with + a conditional move. */ + ele = load_last_active(s, last, rm, esz); + tcg_temp_free_i32(last); + + zero = tcg_const_i64(0); + tcg_gen_movcond_i64(TCG_COND_GE, reg_val, cmp, zero, ele, reg_val); + + tcg_temp_free_i64(zero); + tcg_temp_free_i64(cmp); + tcg_temp_free_i64(ele); +} + +/* Compute CLAST for a Vreg. */ +static bool do_clast_fp(DisasContext *s, arg_rpr_esz *a, bool before) +{ + if (sve_access_check(s)) { + int esz = a->esz; + int ofs = vec_reg_offset(s, a->rd, 0, esz); + TCGv_i64 reg = load_esz(cpu_env, ofs, esz); + + do_clast_scalar(s, esz, a->pg, a->rn, before, reg); + write_fp_dreg(s, a->rd, reg); + tcg_temp_free_i64(reg); + } + return true; +} + +static bool trans_CLASTA_v(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_clast_fp(s, a, false); +} + +static bool trans_CLASTB_v(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_clast_fp(s, a, true); +} + +/* Compute CLAST for a Xreg. */ +static bool do_clast_general(DisasContext *s, arg_rpr_esz *a, bool before) +{ + if (!sve_access_check(s)) { + return true; + } + + TCGv_i64 reg = cpu_reg(s, a->rd); + + switch (a->esz) { + case 0: + tcg_gen_ext8u_i64(reg, reg); + break; + case 1: + tcg_gen_ext16u_i64(reg, reg); + break; + case 2: + tcg_gen_ext32u_i64(reg, reg); + break; + case 3: + break; + default: + g_assert_not_reached(); + } + + do_clast_scalar(s, a->esz, a->pg, a->rn, before, cpu_reg(s, a->rd)); + return true; +} + +static bool trans_CLASTA_r(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_clast_general(s, a, false); +} + +static bool trans_CLASTB_r(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_clast_general(s, a, true); +} + +/* Compute LAST for a scalar. */ +static TCGv_i64 do_last_scalar(DisasContext *s, int esz, + int pg, int rm, bool before) +{ + TCGv_i32 last = tcg_temp_new_i32(); + TCGv_i64 ret; + + find_last_active(s, last, esz, pg); + if (before) { + wrap_last_active(s, last, esz); + } else { + incr_last_active(s, last, esz); + } + + ret = load_last_active(s, last, rm, esz); + tcg_temp_free_i32(last); + return ret; +} + +/* Compute LAST for a Vreg. */ +static bool do_last_fp(DisasContext *s, arg_rpr_esz *a, bool before) +{ + if (sve_access_check(s)) { + TCGv_i64 val = do_last_scalar(s, a->esz, a->pg, a->rn, before); + write_fp_dreg(s, a->rd, val); + tcg_temp_free_i64(val); + } + return true; +} + +static bool trans_LASTA_v(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_last_fp(s, a, false); +} + +static bool trans_LASTB_v(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_last_fp(s, a, true); +} + +/* Compute LAST for a Xreg. */ +static bool do_last_general(DisasContext *s, arg_rpr_esz *a, bool before) +{ + if (sve_access_check(s)) { + TCGv_i64 val = do_last_scalar(s, a->esz, a->pg, a->rn, before); + tcg_gen_mov_i64(cpu_reg(s, a->rd), val); + tcg_temp_free_i64(val); + } + return true; +} + +static bool trans_LASTA_r(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_last_general(s, a, false); +} + +static bool trans_LASTB_r(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_last_general(s, a, true); +} + /* *** SVE Memory - 32-bit Gather and Unsized Contiguous Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 9da6566d32..1226867f69 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -430,6 +430,26 @@ TRN2_z 00000101 .. 1 ..... 011 101 ..... ..... @rd_rn_rm # Note esz >= 2 COMPACT 00000101 .. 100001 100 ... ..... ..... @rd_pg_rn +# SVE conditionally broadcast element to vector +CLASTA_z 00000101 .. 10100 0 100 ... ..... ..... @rdn_pg_rm +CLASTB_z 00000101 .. 10100 1 100 ... ..... ..... @rdn_pg_rm + +# SVE conditionally copy element to SIMD&FP scalar +CLASTA_v 00000101 .. 10101 0 100 ... ..... ..... @rd_pg_rn +CLASTB_v 00000101 .. 10101 1 100 ... ..... ..... @rd_pg_rn + +# SVE conditionally copy element to general register +CLASTA_r 00000101 .. 11000 0 101 ... ..... ..... @rd_pg_rn +CLASTB_r 00000101 .. 11000 1 101 ... ..... ..... @rd_pg_rn + +# SVE copy element to SIMD&FP scalar register +LASTA_v 00000101 .. 10001 0 100 ... ..... ..... @rd_pg_rn +LASTB_v 00000101 .. 10001 1 100 ... ..... ..... @rd_pg_rn + +# SVE copy element to general register +LASTA_r 00000101 .. 10000 0 101 ... ..... ..... @rd_pg_rn +LASTB_r 00000101 .. 10000 1 101 ... ..... ..... @rd_pg_rn + ### SVE Predicate Logical Operations Group # SVE predicate logical operations From patchwork Wed May 30 18:01:09 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 137275 Delivered-To: patch@linaro.org Received: by 2002:a2e:9706:0:0:0:0:0 with SMTP id r6-v6csp5640034lji; Wed, 30 May 2018 11:13:19 -0700 (PDT) X-Google-Smtp-Source: ADUXVKJRV/Qdb0TzsIynyHMmtcv9bWa4QgXM22Y+lbdJZ0LlxZ/8H/XWaVKiqR4b5SM96pJtq/cg X-Received: by 2002:a37:7882:: with SMTP id t124-v6mr3581413qkc.285.1527703999131; Wed, 30 May 2018 11:13:19 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1527703999; cv=none; d=google.com; s=arc-20160816; b=qc6PrchJq+qB5sUBJMdmGnwzci8MzsLQyNLFj82QV0j/94J4+JjdI7WLrIo559Hjuv 9qvxH+JW6FU8kn2TuU7naCdxMA12j1X8uGkq+E6WbJENKsFWXVVSssCNxf3AnqqLNFOx XreK7ZkQ5qleNtQB2dZnftQfpO62fXPFknhlWuTThM1xRAUfNI/Bvv5BiQHOr0dPKm3O E/llP4NkMlopUvyL4GrWfctzhc7o9TVOOauFFVqqhF0+A5rjBzlWx5itKc1TmvQNyiUr 5Nm8FwcOqR4IqASJZcm/jSD7nEjFFPH6ZHv8vtI/sLUhTlxxUWhJ5RAcq9uj/EnWUe1w G2lA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature:arc-authentication-results; bh=TxLQPQ03rX5hBDGAZo1RxOuWce7FbsjPjTcpn7mZN/Q=; b=bYDRlwzM94t1jGjp6Dr89IvlHwcYa1M/VmkRxpRq382q5KnEQzp/eHSDVpf3PqqkSo dyw7JH6Bgx0vbzLvipnyveLcNoWkGre0aouWfvgAW80xT0yE+oYnwhYn7XcAG+lHFP7b EirRXc1bLw4Omdkcf2g2Mpy0ERHm0ic2ToOKZg/ajh4WERCG/Hvk+kWI8gXXzePOCr4f 7Fo7jUzGz2dsO8KpehbCwkR+B1v3dhrBSCql6jK8ZlRk5r5vOhz/H1Iy+kGtHNtJ30Qz ncXAARD/087BM3moahWDZJScNYp+ARiqajFln7nX3YhMhdnbmcVczmk9lmw9BAItp3BB c0vQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=e5ZE+7rV; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom=qemu-devel-bounces+patch=linaro.org@nongnu.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [2001:4830:134:3::11]) by mx.google.com with ESMTPS id k16-v6si367702qkl.269.2018.05.30.11.13.18 for (version=TLS1 cipher=AES128-SHA bits=128/128); Wed, 30 May 2018 11:13:19 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) client-ip=2001:4830:134:3::11; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=e5ZE+7rV; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom=qemu-devel-bounces+patch=linaro.org@nongnu.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:40110 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fO5b4-000212-Hj for patch@linaro.org; Wed, 30 May 2018 14:13:18 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:33294) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fO5Pk-0001tl-CL for qemu-devel@nongnu.org; Wed, 30 May 2018 14:01:37 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fO5Pj-0004Wo-6O for qemu-devel@nongnu.org; Wed, 30 May 2018 14:01:36 -0400 Received: from mail-pf0-x235.google.com ([2607:f8b0:400e:c00::235]:41817) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fO5Pj-0004WJ-1K for qemu-devel@nongnu.org; Wed, 30 May 2018 14:01:35 -0400 Received: by mail-pf0-x235.google.com with SMTP id v63-v6so9403242pfk.8 for ; Wed, 30 May 2018 11:01:34 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=TxLQPQ03rX5hBDGAZo1RxOuWce7FbsjPjTcpn7mZN/Q=; b=e5ZE+7rVlzrzBoalokZTuW7woB0WZWyUnpifqFOmTwXn1WhXLZiPCyBpev0rv4/rzz Y6BFrbTwH3qcLb7zS7jCprhWtbkWA7wUVSXcPgg6GigV2WHl82LQw7hSVtNjWrT40lwq R40VLpq0jRdfBCdvOqHWTj7m5QImy91XQ/pu4= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=TxLQPQ03rX5hBDGAZo1RxOuWce7FbsjPjTcpn7mZN/Q=; b=EkW0VvkLbO17ai/j6R/zHsjNp4r7JPHIGEQcTvSTPeodts32WVSsaD0wQs4KSLdQIx pCqNpyD2zSH09FSAcmmGtkD6TECmJGrLrOJzTxRTl/EJhUR3/MmZITgDD+nVPOBa/Bjc lrQHJNPWVzTz5sa+EXZ4z8AufG9TREDWgazr7XG84x4KD8hiMhAQJ0jwzwZmh/gKszVI +QtKCKG2ZdC9pncXTF6iu1YsvsomVu8REO7E/bPQuRvEdkcjiZ3n/KwBS67T9VWX6IG2 0MjaXp6CsDaneSU4+2n7pKkpVUUyloqSuBFplghAFatYqMvbI/euBcIUluJkX61DVubk NH1A== X-Gm-Message-State: ALKqPwcrvQVSnZQA7i46/6hS8h6YxTvJ8EIXtLlaiVRSx4XWFjLtVu9Q +/NHzHBBQGrSjdQhbMfxWxn2pwS/Ju0= X-Received: by 2002:a65:5546:: with SMTP id t6-v6mr2981260pgr.363.1527703293484; Wed, 30 May 2018 11:01:33 -0700 (PDT) Received: from cloudburst.twiddle.net (97-126-112-211.tukw.qwest.net. [97.126.112.211]) by smtp.gmail.com with ESMTPSA id b84-v6sm28179157pfm.123.2018.05.30.11.01.32 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 30 May 2018 11:01:32 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 30 May 2018 11:01:09 -0700 Message-Id: <20180530180120.13355-8-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.0 In-Reply-To: <20180530180120.13355-1-richard.henderson@linaro.org> References: <20180530180120.13355-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c00::235 Subject: [Qemu-devel] [PATCH v3b 07/18] target/arm: Implement SVE copy to vector (predicated) X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/translate-sve.c | 19 +++++++++++++++++++ target/arm/sve.decode | 6 ++++++ 2 files changed, 25 insertions(+) -- 2.17.0 Reviewed-by: Peter Maydell diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index edcef277f8..a6f85de358 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -2614,6 +2614,25 @@ static bool trans_LASTB_r(DisasContext *s, arg_rpr_esz *a, uint32_t insn) return do_last_general(s, a, true); } +static bool trans_CPY_m_r(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + if (sve_access_check(s)) { + do_cpy_m(s, a->esz, a->rd, a->rd, a->pg, cpu_reg_sp(s, a->rn)); + } + return true; +} + +static bool trans_CPY_m_v(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + if (sve_access_check(s)) { + int ofs = vec_reg_offset(s, a->rn, 0, a->esz); + TCGv_i64 t = load_esz(cpu_env, ofs, a->esz); + do_cpy_m(s, a->esz, a->rd, a->rd, a->pg, t); + tcg_temp_free_i64(t); + } + return true; +} + /* *** SVE Memory - 32-bit Gather and Unsized Contiguous Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 1226867f69..519139f684 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -450,6 +450,12 @@ LASTB_v 00000101 .. 10001 1 100 ... ..... ..... @rd_pg_rn LASTA_r 00000101 .. 10000 0 101 ... ..... ..... @rd_pg_rn LASTB_r 00000101 .. 10000 1 101 ... ..... ..... @rd_pg_rn +# SVE copy element from SIMD&FP scalar register +CPY_m_v 00000101 .. 100000 100 ... ..... ..... @rd_pg_rn + +# SVE copy element from general register to vector (predicated) +CPY_m_r 00000101 .. 101000 101 ... ..... ..... @rd_pg_rn + ### SVE Predicate Logical Operations Group # SVE predicate logical operations From patchwork Wed May 30 18:01:10 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 137273 Delivered-To: patch@linaro.org Received: by 2002:a2e:9706:0:0:0:0:0 with SMTP id r6-v6csp5637815lji; Wed, 30 May 2018 11:11:06 -0700 (PDT) X-Google-Smtp-Source: ADUXVKKCQES8Q/BkHg1a326OThtXTkyoKr6+jmy1Qv2DICJZMJeyrPmx/bRw+RgXKsMzOI6azuon X-Received: by 2002:aed:3dcb:: with SMTP id j11-v6mr3587318qtf.357.1527703865982; Wed, 30 May 2018 11:11:05 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1527703865; cv=none; d=google.com; s=arc-20160816; b=SVipQS8eir1dNA9Hex2g9oS1mgR3nEuSm8UcjN7V1m9KyJEn0+aQSoI29mZ0izWxgV TR+K+UjOjqMfGsSnbtK+Pzk5m1l44wzEXZEXAjz5k5Uq2U/PjeQg3i7qYBdnxElXGWDi iblqD1u/GgzTqTMkN2R1Dei3TZGQJpmeaMhDs4pxa2BStHQl1niv5+okV45Lec/VLFd1 m8V76KoE8rwE6QXoD7/Ef68RAvZX8ChHQYjjfS3EJ3fnABvY1++veA8hEcDjVBREZN+Q 7CxXiIKM+TYnjWRyw+sgSoYdpmIAo/0SUb0wYhKy/RTzNm0t/SBL4o38Ol9fWEn5f3QN I3/Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature:arc-authentication-results; bh=FGG+f4HCs97Xhb/BWjMpKJ77+RPKNLTeJw2dvqrMLyM=; b=Z5cBoVCjQ9thVEPD6MjkQBAQIEsJn1nZJYOjuaFmYn5haBe+v6C5wt7CuG5alkTKe1 yx/PVssIDwPHqKYAHOOnwWX8l/JqJdW/+ZW0Kz8cAiBeYGR3qgmIp+4J1unA2o4YRwMp bF8tLTg5okregyFSHMXpjrOWF3VB4/pSXptBDlUBrz1aUhOZ34nvTnr7qW5aKpUAzmdQ nX8FaHAyFka77pO2h9S+3aBMDqkkURIpielxwjPCCvIA9TmofC8+MnqFm0zwicdfAIDw UzTe/vYb/pJ0mIW/v4tiMTDgZ2WDWhlDkRSrk3l4J1pd35CXk7fB09FkZMcbA4hwphEA gE1Q== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=XWOO7t4V; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom=qemu-devel-bounces+patch=linaro.org@nongnu.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [2001:4830:134:3::11]) by mx.google.com with ESMTPS id w72-v6si6093358qkg.362.2018.05.30.11.11.05 for (version=TLS1 cipher=AES128-SHA bits=128/128); Wed, 30 May 2018 11:11:05 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) client-ip=2001:4830:134:3::11; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=XWOO7t4V; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom=qemu-devel-bounces+patch=linaro.org@nongnu.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:40089 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fO5Yv-0008JQ-97 for patch@linaro.org; Wed, 30 May 2018 14:11:05 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:33345) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fO5Pm-0001vZ-7v for qemu-devel@nongnu.org; Wed, 30 May 2018 14:01:40 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fO5Pl-0004Z7-0i for qemu-devel@nongnu.org; Wed, 30 May 2018 14:01:38 -0400 Received: from mail-pl0-x241.google.com ([2607:f8b0:400e:c01::241]:36265) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fO5Pk-0004YN-QU for qemu-devel@nongnu.org; Wed, 30 May 2018 14:01:36 -0400 Received: by mail-pl0-x241.google.com with SMTP id v24-v6so11553241plo.3 for ; Wed, 30 May 2018 11:01:36 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=FGG+f4HCs97Xhb/BWjMpKJ77+RPKNLTeJw2dvqrMLyM=; b=XWOO7t4VtH+hEXvRTvVze2E0mvFHsZ0ztljpFByhewrsSC7gI1hqAJWlpiw8yIQC/7 3JTPkbuiniVGWcGJW3JxCiTqeAGMi+PkcmH9YSVHef3VQr8PVcoO+Ql27mZhSmsuf1tc CN6YxuATDxc0/gZwgeIib7juTVTqH1mnSEdsk= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=FGG+f4HCs97Xhb/BWjMpKJ77+RPKNLTeJw2dvqrMLyM=; b=RpkIwpjbpoQjU44cn1GBPd0pCHtgw8Lk+lhV+gl+jQD/hK4bgX9veD13j67YUwELRm 1btV428Phws39zwQDq3Sv1YeO88uIanf5PCe+4ZtOwqaS/hh/XnwrNH/B/3aVvTveh1S OdIOxMkwLkGJLDPOjzpjuv10YbYP7eIupU/HNzMvkmqmg7fpvCMfVQ7V/Di4tl7UErwY Vzcc1I7A2DIfIJf+FqMOxGkVY6D3J90rGV+Y/iqZ69dZ1w5osTfYQm46dtdMFa3b9UmZ Qu8z3lBBsouTiu7mCGXINJRGa0pylLNtilGsy6CnC6vTUeM9pQ3oddmrlblfd8F99JB2 Sz0Q== X-Gm-Message-State: ALKqPwd5n3WoWFxb+5RQut+xIGj3QvkcF5g3KsqclhGH4L+DNB9aCHmQ KTzKzjKPQUnq6vKenSSfMj38NKC61J0= X-Received: by 2002:a17:902:7c84:: with SMTP id y4-v6mr3864682pll.262.1527703295093; Wed, 30 May 2018 11:01:35 -0700 (PDT) Received: from cloudburst.twiddle.net (97-126-112-211.tukw.qwest.net. [97.126.112.211]) by smtp.gmail.com with ESMTPSA id b84-v6sm28179157pfm.123.2018.05.30.11.01.33 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 30 May 2018 11:01:33 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 30 May 2018 11:01:10 -0700 Message-Id: <20180530180120.13355-9-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.0 In-Reply-To: <20180530180120.13355-1-richard.henderson@linaro.org> References: <20180530180120.13355-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c01::241 Subject: [Qemu-devel] [PATCH v3b 08/18] target/arm: Implement SVE reverse within elements X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 14 +++++++++++++ target/arm/sve_helper.c | 41 +++++++++++++++++++++++++++++++------- target/arm/translate-sve.c | 38 +++++++++++++++++++++++++++++++++++ target/arm/sve.decode | 7 +++++++ 4 files changed, 93 insertions(+), 7 deletions(-) -- 2.17.0 Reviewed-by: Peter Maydell diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index a58fb4ba01..3b7c54905d 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -465,6 +465,20 @@ DEF_HELPER_FLAGS_4(sve_compact_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_2(sve_last_active_element, TCG_CALL_NO_RWG, s32, ptr, i32) +DEF_HELPER_FLAGS_4(sve_revb_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_revb_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_revb_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve_revh_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_revh_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve_revw_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve_rbit_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_rbit_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_rbit_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_rbit_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 941a098234..f8579a25e3 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -238,6 +238,26 @@ static inline uint64_t expand_pred_s(uint8_t byte) return word[byte & 0x11]; } +/* Swap 16-bit words within a 32-bit word. */ +static inline uint32_t hswap32(uint32_t h) +{ + return rol32(h, 16); +} + +/* Swap 16-bit words within a 64-bit word. */ +static inline uint64_t hswap64(uint64_t h) +{ + uint64_t m = 0x0000ffff0000ffffull; + h = rol64(h, 32); + return ((h & m) << 16) | ((h >> 16) & m); +} + +/* Swap 32-bit words within a 64-bit word. */ +static inline uint64_t wswap64(uint64_t h) +{ + return rol64(h, 32); +} + #define LOGICAL_PPPP(NAME, FUNC) \ void HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, uint32_t desc) \ { \ @@ -616,6 +636,20 @@ DO_ZPZ(sve_neg_h, uint16_t, H1_2, DO_NEG) DO_ZPZ(sve_neg_s, uint32_t, H1_4, DO_NEG) DO_ZPZ_D(sve_neg_d, uint64_t, DO_NEG) +DO_ZPZ(sve_revb_h, uint16_t, H1_2, bswap16) +DO_ZPZ(sve_revb_s, uint32_t, H1_4, bswap32) +DO_ZPZ_D(sve_revb_d, uint64_t, bswap64) + +DO_ZPZ(sve_revh_s, uint32_t, H1_4, hswap32) +DO_ZPZ_D(sve_revh_d, uint64_t, hswap64) + +DO_ZPZ_D(sve_revw_d, uint64_t, wswap64) + +DO_ZPZ(sve_rbit_b, uint8_t, H1, revbit8) +DO_ZPZ(sve_rbit_h, uint16_t, H1_2, revbit16) +DO_ZPZ(sve_rbit_s, uint32_t, H1_4, revbit32) +DO_ZPZ_D(sve_rbit_d, uint64_t, revbit64) + /* Three-operand expander, unpredicated, in which the third operand is "wide". */ #define DO_ZZW(NAME, TYPE, TYPEW, H, OP) \ @@ -1587,13 +1621,6 @@ void HELPER(sve_rev_b)(void *vd, void *vn, uint32_t desc) } } -static inline uint64_t hswap64(uint64_t h) -{ - uint64_t m = 0x0000ffff0000ffffull; - h = rol64(h, 32); - return ((h & m) << 16) | ((h >> 16) & m); -} - void HELPER(sve_rev_h)(void *vd, void *vn, uint32_t desc) { intptr_t i, j, opr_sz = simd_oprsz(desc); diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index a6f85de358..90561a8b50 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -2633,6 +2633,44 @@ static bool trans_CPY_m_v(DisasContext *s, arg_rpr_esz *a, uint32_t insn) return true; } +static bool trans_REVB(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + static gen_helper_gvec_3 * const fns[4] = { + NULL, + gen_helper_sve_revb_h, + gen_helper_sve_revb_s, + gen_helper_sve_revb_d, + }; + return do_zpz_ool(s, a, fns[a->esz]); +} + +static bool trans_REVH(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + static gen_helper_gvec_3 * const fns[4] = { + NULL, + NULL, + gen_helper_sve_revh_s, + gen_helper_sve_revh_d, + }; + return do_zpz_ool(s, a, fns[a->esz]); +} + +static bool trans_REVW(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + return do_zpz_ool(s, a, a->esz == 3 ? gen_helper_sve_revw_d : NULL); +} + +static bool trans_RBIT(DisasContext *s, arg_rpr_esz *a, uint32_t insn) +{ + static gen_helper_gvec_3 * const fns[4] = { + gen_helper_sve_rbit_b, + gen_helper_sve_rbit_h, + gen_helper_sve_rbit_s, + gen_helper_sve_rbit_d, + }; + return do_zpz_ool(s, a, fns[a->esz]); +} + /* *** SVE Memory - 32-bit Gather and Unsized Contiguous Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 519139f684..95eb4968a9 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -456,6 +456,13 @@ CPY_m_v 00000101 .. 100000 100 ... ..... ..... @rd_pg_rn # SVE copy element from general register to vector (predicated) CPY_m_r 00000101 .. 101000 101 ... ..... ..... @rd_pg_rn +# SVE reverse within elements +# Note esz >= operation size +REVB 00000101 .. 1001 00 100 ... ..... ..... @rd_pg_rn +REVH 00000101 .. 1001 01 100 ... ..... ..... @rd_pg_rn +REVW 00000101 .. 1001 10 100 ... ..... ..... @rd_pg_rn +RBIT 00000101 .. 1001 11 100 ... ..... ..... @rd_pg_rn + ### SVE Predicate Logical Operations Group # SVE predicate logical operations From patchwork Wed May 30 18:01:11 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 137268 Delivered-To: patch@linaro.org Received: by 2002:a2e:9706:0:0:0:0:0 with SMTP id r6-v6csp5632125lji; Wed, 30 May 2018 11:05:29 -0700 (PDT) X-Google-Smtp-Source: ADUXVKKEENWJQtp0DKjeWsHrPRFA+mm7XkYD91R60BCLfBsNmfjtWsidQiuTpTWTeG4olPWJx8/8 X-Received: by 2002:aed:37e6:: with SMTP id j93-v6mr3747015qtb.111.1527703529822; Wed, 30 May 2018 11:05:29 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1527703529; cv=none; d=google.com; s=arc-20160816; b=xQVbUtr6U2YXF7qb41nsHO1Qk+ePFwMbE5hmTPAM1qqz5t7PEJLBV7T4FvsV+648Hk LGDm5CwJsR31KwCGWs6/t3CrQ+u+N5j6MapstTk9XzfOJOhs28roVDd0LtbSmb6mbwCy nOxYcGCdibFvUo6JrAOEYgWFiR3/VbcJ7rO35foHUa2l7HYa/LdfGON5TlUg9KQge21f +1ZucXgbdqsS06VDzm08zA3IoxkbLevekRiQ+xfAW+fRVm/1JxnRjjjrJAVw6uLb98hs OfNvFqSmUeGANWuygrata5ekW1U0t3/2JQHKo5M19NQtBDRbgFtE7ab4tMqJ3k5IQiv4 jzgg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature:arc-authentication-results; bh=n4+8RuZr2vOITB8cmfVyUiCIxF+yQylQx7PTF/yIixM=; b=OoJW2rNVxEvDI6xPxlvJzkOKv4vPLy7ZxAjN8leuJklG9yWKkfQqTbbTLbObz2AvrD venospOey5ftBFJSjrxQHtlpOX3Fnd3elWD3ISLIiKv3zEC7CjbvjxhEXJVSJFCTpKYN c11FIXsrBmZsv9+jypKez5RX6sujnknOWgU9w3w2FShxNh1WI1RAwpDG6MVZDVJL/9jP ymM+s4L0JYRP9tKtd4c4wvR4uGneJV/jmQyOPHmi0+5cdONC7WMRFQAkn8Q6YKNsHQhv gdHcaKwzo09PexBMkJbxavLDnsh4SX1dn3CGJqCufzOo7tMGM4ddNWHVAF04WYKVSNUO gpjQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=Dl5Y5W4D; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom=qemu-devel-bounces+patch=linaro.org@nongnu.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [2001:4830:134:3::11]) by mx.google.com with ESMTPS id j7-v6si921269qth.269.2018.05.30.11.05.29 for (version=TLS1 cipher=AES128-SHA bits=128/128); Wed, 30 May 2018 11:05:29 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) client-ip=2001:4830:134:3::11; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=Dl5Y5W4D; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom=qemu-devel-bounces+patch=linaro.org@nongnu.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:40046 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fO5TV-0003eV-6u for patch@linaro.org; Wed, 30 May 2018 14:05:29 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:33431) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fO5Pp-0001xd-9D for qemu-devel@nongnu.org; Wed, 30 May 2018 14:01:45 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fO5Pn-0004cJ-FE for qemu-devel@nongnu.org; Wed, 30 May 2018 14:01:41 -0400 Received: from mail-pl0-x236.google.com ([2607:f8b0:400e:c01::236]:44389) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fO5Pn-0004bT-6i for qemu-devel@nongnu.org; Wed, 30 May 2018 14:01:39 -0400 Received: by mail-pl0-x236.google.com with SMTP id z9-v6so8206977plk.11 for ; Wed, 30 May 2018 11:01:39 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=n4+8RuZr2vOITB8cmfVyUiCIxF+yQylQx7PTF/yIixM=; b=Dl5Y5W4DHJgZqkXnw+IqhcCknC/CFQSYMHTdosnJwOfDzIymQPTunAiyApmPDMuuiW kldJ+zEslXfkuPJ7LIoVjd5EsWdzTab+wpDBsFgnOt0Ky5TnpriczwGZ2uOEQNl6tJ9R 8v43liJ1U8ka4JXlez0Bb7+Z5NUuRoYgt2oZ4= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=n4+8RuZr2vOITB8cmfVyUiCIxF+yQylQx7PTF/yIixM=; b=Uyi0NvV/r126q5nNk/jI9UBJ1twx/StxHgiogwGuxI2U9REcZ+czpMjLcvLkz3KGQb 5oATw4dmSRp+YMJhV+NCyvp9nkrdWl+dQqmUmg4bIrafpr4QCIbatmHhgTFNwVk3STaf 2IZzPq/IsTePFvBExH+74gb4/CpFmgSl7LsMhSddFzETo9UhM3pBvSsU1Ap0JYJTt5Wo kcTsqoSkS27mapHc9j2h57kwfFzZ/q/5P9y0DbtyCU4R4m4onRERF2lOpLJqY6oBHPtb enzvxlhCfKXx6fS/IGaq9Y8BQKPbi7MlB/9qaUYwmgQJwI7p4yoDZVUjNTuWp5MYJuak Zj0g== X-Gm-Message-State: ALKqPwdmeSzk66XbCfIM02WKeLdjST/Vaysuv5QK0pmh5wsrlAhGQpy8 D014A+Yj3d8WOIJ+/65zFjw/1ttgSvc= X-Received: by 2002:a17:902:74c8:: with SMTP id f8-v6mr3832181plt.317.1527703297024; Wed, 30 May 2018 11:01:37 -0700 (PDT) Received: from cloudburst.twiddle.net (97-126-112-211.tukw.qwest.net. [97.126.112.211]) by smtp.gmail.com with ESMTPSA id b84-v6sm28179157pfm.123.2018.05.30.11.01.35 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 30 May 2018 11:01:35 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 30 May 2018 11:01:11 -0700 Message-Id: <20180530180120.13355-10-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.0 In-Reply-To: <20180530180120.13355-1-richard.henderson@linaro.org> References: <20180530180120.13355-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c01::236 Subject: [Qemu-devel] [PATCH v3b 09/18] target/arm: Implement SVE vector splice (predicated) X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 2 ++ target/arm/sve_helper.c | 37 +++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 13 +++++++++++++ target/arm/sve.decode | 3 +++ 4 files changed, 55 insertions(+) -- 2.17.0 Reviewed-by: Peter Maydell diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 3b7c54905d..c3f8a2b502 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -479,6 +479,8 @@ DEF_HELPER_FLAGS_4(sve_rbit_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_rbit_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_rbit_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_splice, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index f8579a25e3..f6d4b2139a 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -2108,3 +2108,40 @@ int32_t HELPER(sve_last_active_element)(void *vg, uint32_t pred_desc) return last_active_element(vg, DIV_ROUND_UP(oprsz, 8), esz); } + +void HELPER(sve_splice)(void *vd, void *vn, void *vm, void *vg, uint32_t desc) +{ + intptr_t opr_sz = simd_oprsz(desc) / 8; + int esz = simd_data(desc); + uint64_t pg, first_g, last_g, len, mask = pred_esz_masks[esz]; + intptr_t i, first_i, last_i; + ARMVectorReg tmp; + + first_i = last_i = 0; + first_g = last_g = 0; + + /* Find the extent of the active elements within VG. */ + for (i = QEMU_ALIGN_UP(opr_sz, 8) - 8; i >= 0; i -= 8) { + pg = *(uint64_t *)(vg + i) & mask; + if (pg) { + if (last_g == 0) { + last_g = pg; + last_i = i; + } + first_g = pg; + first_i = i; + } + } + + len = 0; + if (first_g != 0) { + first_i = first_i * 8 + ctz64(first_g); + last_i = last_i * 8 + 63 - clz64(last_g); + len = last_i - first_i + (1 << esz); + if (vd == vm) { + vm = memcpy(&tmp, vm, opr_sz * 8); + } + swap_memmove(vd, vn + first_i, len); + } + swap_memmove(vd + len, vm, opr_sz * 8 - len); +} diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 90561a8b50..cf3624b439 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -2671,6 +2671,19 @@ static bool trans_RBIT(DisasContext *s, arg_rpr_esz *a, uint32_t insn) return do_zpz_ool(s, a, fns[a->esz]); } +static bool trans_SPLICE(DisasContext *s, arg_rprr_esz *a, uint32_t insn) +{ + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + tcg_gen_gvec_4_ool(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + pred_full_reg_offset(s, a->pg), + vsz, vsz, a->esz, gen_helper_sve_splice); + } + return true; +} + /* *** SVE Memory - 32-bit Gather and Unsized Contiguous Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 95eb4968a9..a9fa631252 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -463,6 +463,9 @@ REVH 00000101 .. 1001 01 100 ... ..... ..... @rd_pg_rn REVW 00000101 .. 1001 10 100 ... ..... ..... @rd_pg_rn RBIT 00000101 .. 1001 11 100 ... ..... ..... @rd_pg_rn +# SVE vector splice (predicated) +SPLICE 00000101 .. 101 100 100 ... ..... ..... @rdn_pg_rm + ### SVE Predicate Logical Operations Group # SVE predicate logical operations From patchwork Wed May 30 18:01:12 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 137279 Delivered-To: patch@linaro.org Received: by 2002:a2e:9706:0:0:0:0:0 with SMTP id r6-v6csp5642731lji; Wed, 30 May 2018 11:15:59 -0700 (PDT) X-Google-Smtp-Source: ADUXVKKz+dHLHgLOrJF7uIKVx3Oz0Gb34BJwrCb+l8jTwIHZo/TRx0oXKk0EyWGizTbsXQ3hTdwQ X-Received: by 2002:ac8:3338:: with SMTP id t53-v6mr3505634qta.312.1527704159740; Wed, 30 May 2018 11:15:59 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1527704159; cv=none; d=google.com; s=arc-20160816; b=uSGM0xxa01daYYekyDJ87UwwOXovgrB4gRU18B6CLB899D1AFB3TpHBvxrPPA9vA4O oO0epMbhQi8/Kr4/9FPcYyAS1R2oGMnPw3k9pCMPiolohGw593Mz3dMI0dOlenGiQLzU vc7MgDDs5MwMzGnB2sdkC0x4FrwAGcCvn4z3mlvq8VUoBWd0MhbXf+wFE9JEHNagj27X W7KjbPf1Lte3hlPwT9w39AHAe5s1B7UrG7yR335HMbhoVPATlB52OyhvVkX3s2DdN4oz gDpU6GQ/nCZNet0LwPQkdD/wR8aoUSJm6R5CRWblFGasI6TMCzHh53TRZ8Hy/PbvLCgl 5azQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature:arc-authentication-results; bh=Kp4DBikk3yZbQpa3m9qbcNB6Ol8v1TI2EZIhxI1zmKk=; b=w83lLuSwZIiYUYDKWRTwBr/0SP9Leot/ZF8rII2Sj+oiSskUd/qwemPDRBX0wczCoQ dqg6QPe63hqISWwCfZWvIvfLvzIh0HFFw8wLNJCaCdmSQsDWDDCTfuiU+ZFlX4X+IXeh FBk9ay7v8Z2szdIHxRVyfbu3ixyEfrBlA0iPUViiKeO/aRMQ2fhB2I3wpOJPl6A4/r+F AQCrOU/6BlEHeCJvBfCzlZfnnNFKByDZF39tt8Jc78zfD2mr4T1og1onNpwl94YOjQ0I 64BP4SxSOuVCu2+7c/km3GmDb+sA1d6A/hJx+7EyBAPWy6o07c9Fb2W/RFHhHQxfti8S cr5w== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=OusqQE83; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom=qemu-devel-bounces+patch=linaro.org@nongnu.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [2001:4830:134:3::11]) by mx.google.com with ESMTPS id i189-v6si6308154qkd.225.2018.05.30.11.15.59 for (version=TLS1 cipher=AES128-SHA bits=128/128); Wed, 30 May 2018 11:15:59 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) client-ip=2001:4830:134:3::11; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=OusqQE83; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom=qemu-devel-bounces+patch=linaro.org@nongnu.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:40127 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fO5df-0004G8-7j for patch@linaro.org; Wed, 30 May 2018 14:15:59 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:33470) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fO5Pq-0001z4-Jo for qemu-devel@nongnu.org; Wed, 30 May 2018 14:01:48 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fO5Po-0004dh-DU for qemu-devel@nongnu.org; Wed, 30 May 2018 14:01:42 -0400 Received: from mail-pg0-x230.google.com ([2607:f8b0:400e:c05::230]:38425) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fO5Po-0004cT-2n for qemu-devel@nongnu.org; Wed, 30 May 2018 14:01:40 -0400 Received: by mail-pg0-x230.google.com with SMTP id c9-v6so5663057pgf.5 for ; Wed, 30 May 2018 11:01:39 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=Kp4DBikk3yZbQpa3m9qbcNB6Ol8v1TI2EZIhxI1zmKk=; b=OusqQE83iOZ44zYWr9iocXZ9i4npimEyaxFui/Z1NQvBbdtBFeYTlRuPicI1yVGstz o2AxjdBsYXJcqB9C25MneZQHdPdfMxWnE1lA5Guf1evyDCB9IKKFewSrpaG9dJZ1fB4Y BWb0SNd51KEyzDTWhp1kIsGFTisSax7eVjDlQ= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=Kp4DBikk3yZbQpa3m9qbcNB6Ol8v1TI2EZIhxI1zmKk=; b=GWYUfZDUIoQ2KaXkJLJ9O6/U94Qx9XEmfL6V2vl2kBu/Eta7Acf9HtdzdxXO97qpOd fvx7mktART97zckIP9Xguv2wwmlMauwUr2EsY0cMxpu5X8BiF9jXiuk/Xnu4evz7Z+yy kkl4gU9ExfncmUkUA3Vvcf1ZsKLOIbA+123LEWBMdWIYuXO3nFRJDRbEXDvIeZqbo2F+ qN5PwGLNXaEHv4OeHuAoEG6dM8TdQ74GPMEYLuLpVhOQtTfMmUsBIM3o/n1mfZ2WoJ0j lAvAZ9L42tXH1/marL4TXqKfo2o5yNzB+H2upguao9Qlh2/ODxf4VvHMR9XTGxYV5gPT ipIw== X-Gm-Message-State: ALKqPweI4S1C3qZueoqV5brXln1M91eww8rSwU39RuqOpcwvDP5g1f0B qsn227blq/IAcouFZKJXQ6TfIBQnzcg= X-Received: by 2002:a62:3a59:: with SMTP id h86-v6mr3693245pfa.209.1527703298450; Wed, 30 May 2018 11:01:38 -0700 (PDT) Received: from cloudburst.twiddle.net (97-126-112-211.tukw.qwest.net. [97.126.112.211]) by smtp.gmail.com with ESMTPSA id b84-v6sm28179157pfm.123.2018.05.30.11.01.37 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 30 May 2018 11:01:37 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 30 May 2018 11:01:12 -0700 Message-Id: <20180530180120.13355-11-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.0 In-Reply-To: <20180530180120.13355-1-richard.henderson@linaro.org> References: <20180530180120.13355-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c05::230 Subject: [Qemu-devel] [PATCH v3b 10/18] target/arm: Implement SVE Select Vectors Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 9 +++++++ target/arm/sve_helper.c | 55 ++++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 2 ++ target/arm/sve.decode | 6 +++++ 4 files changed, 72 insertions(+) -- 2.17.0 Reviewed-by: Peter Maydell diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index c3f8a2b502..0f57f64895 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -195,6 +195,15 @@ DEF_HELPER_FLAGS_5(sve_lsl_zpzz_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(sve_lsl_zpzz_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_sel_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_sel_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_sel_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_sel_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_asr_zpzw_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_asr_zpzw_h, TCG_CALL_NO_RWG, diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index f6d4b2139a..51cf9c47d9 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -2145,3 +2145,58 @@ void HELPER(sve_splice)(void *vd, void *vn, void *vm, void *vg, uint32_t desc) } swap_memmove(vd + len, vm, opr_sz * 8 - len); } + +void HELPER(sve_sel_zpzz_b)(void *vd, void *vn, void *vm, + void *vg, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc) / 8; + uint64_t *d = vd, *n = vn, *m = vm; + uint8_t *pg = vg; + + for (i = 0; i < opr_sz; i += 1) { + uint64_t nn = n[i], mm = m[i]; + uint64_t pp = expand_pred_b(pg[H1(i)]); + d[i] = (nn & pp) | (mm & ~pp); + } +} + +void HELPER(sve_sel_zpzz_h)(void *vd, void *vn, void *vm, + void *vg, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc) / 8; + uint64_t *d = vd, *n = vn, *m = vm; + uint8_t *pg = vg; + + for (i = 0; i < opr_sz; i += 1) { + uint64_t nn = n[i], mm = m[i]; + uint64_t pp = expand_pred_h(pg[H1(i)]); + d[i] = (nn & pp) | (mm & ~pp); + } +} + +void HELPER(sve_sel_zpzz_s)(void *vd, void *vn, void *vm, + void *vg, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc) / 8; + uint64_t *d = vd, *n = vn, *m = vm; + uint8_t *pg = vg; + + for (i = 0; i < opr_sz; i += 1) { + uint64_t nn = n[i], mm = m[i]; + uint64_t pp = expand_pred_s(pg[H1(i)]); + d[i] = (nn & pp) | (mm & ~pp); + } +} + +void HELPER(sve_sel_zpzz_d)(void *vd, void *vn, void *vm, + void *vg, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc) / 8; + uint64_t *d = vd, *n = vn, *m = vm; + uint8_t *pg = vg; + + for (i = 0; i < opr_sz; i += 1) { + uint64_t nn = n[i], mm = m[i]; + d[i] = (pg[H1(i)] & 1 ? nn : mm); + } +} diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index cf3624b439..53b86af94d 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -373,6 +373,8 @@ static bool trans_UDIV_zpzz(DisasContext *s, arg_rprr_esz *a, uint32_t insn) return do_zpzz_ool(s, a, fns[a->esz]); } +DO_ZPZZ(SEL, sel) + #undef DO_ZPZZ /* diff --git a/target/arm/sve.decode b/target/arm/sve.decode index a9fa631252..91522d8e13 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -98,6 +98,7 @@ &rprr_esz rn=%reg_movprfx @rdm_pg_rn ........ esz:2 ... ... ... pg:3 rn:5 rd:5 \ &rprr_esz rm=%reg_movprfx +@rd_pg4_rn_rm ........ esz:2 . rm:5 .. pg:4 rn:5 rd:5 &rprr_esz # Three register operand, with governing predicate, vector element size @rda_pg_rn_rm ........ esz:2 . rm:5 ... pg:3 rn:5 rd:5 \ @@ -466,6 +467,11 @@ RBIT 00000101 .. 1001 11 100 ... ..... ..... @rd_pg_rn # SVE vector splice (predicated) SPLICE 00000101 .. 101 100 100 ... ..... ..... @rdn_pg_rm +### SVE Select Vectors Group + +# SVE select vector elements (predicated) +SEL_zpzz 00000101 .. 1 ..... 11 .... ..... ..... @rd_pg4_rn_rm + ### SVE Predicate Logical Operations Group # SVE predicate logical operations From patchwork Wed May 30 18:01:13 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 137277 Delivered-To: patch@linaro.org Received: by 2002:a2e:9706:0:0:0:0:0 with SMTP id r6-v6csp5640353lji; Wed, 30 May 2018 11:13:37 -0700 (PDT) X-Google-Smtp-Source: ADUXVKImSTYZ7L9LoQHJY0bbCn7BSjk0vEPHw5jXQpCgdoN2lTfY4naBKCXNCTCaoRqVKdHEKpNQ X-Received: by 2002:a0c:c602:: with SMTP id v2-v6mr3601819qvi.11.1527704017343; Wed, 30 May 2018 11:13:37 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1527704017; cv=none; d=google.com; s=arc-20160816; b=dFj6/3/niHsNmaEF3lYzsYGtqpFSD4EKDfjAtK50OcpUTLUn0xZ6O0MlPr/pnlSyck H9RST5tA0peWAVv3yJX+8i1VjB0exBhgyO63VpnxO3KRTKQETdU2heLYxxaXRQLX5GUm RRr/iCpem3uz3+huYhn0AlYvB2S5IhO2cD4zLIYYPnOgpoQFEfq5xNtbXT9PeghU4347 w9dabNJ66fSDc5alR4E0Z1Fl2XDO9OWB9eKopMy1fmdPfflkZ5SLYbhvr8Ve2Y2KACyH B8OVC/tz/8Tlr09ltXo1u9e6V20lkPvZbc9RXo8wRmooEISKilzp0W8Syg6dLoKaZakh JMYQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature:arc-authentication-results; bh=SvrVYh2PWuwtdh6uLUs3LW+4B4XuI0Sq8omTmAY/Tvw=; b=No0lWmWRRudUFbpTDW2eVlKj7NMsDo55w4SVQ0j9TVDYKmypUn+mo98KkyElcvIVXb QonlSovYA+E90/0bDbaA4fuMeVSRpk59mSayiP/5G6nZIZuiW8StNC8t3S893VUUXzWe gjv/DoByhslBmSMAb07xNk9UToXbnxw6KtG0FrWqdPLlpYSD8pYxGbSlMPE6JK5Oebxp +fl8tT3YVo3xnIsqCzHuSpdYDmhvKJdpH3BsQckrVe63HZufToXLsLAGqloobtM85/sZ PkmgVKz6UUM++3vgYPcO6HixlWy47/ECoG7M1WocNXUlrX8y7vfkwcU7OCH4X7mhPSsr OJTA== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=JLgpw6SC; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom=qemu-devel-bounces+patch=linaro.org@nongnu.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [2001:4830:134:3::11]) by mx.google.com with ESMTPS id s1-v6si4057691qvn.119.2018.05.30.11.13.37 for (version=TLS1 cipher=AES128-SHA bits=128/128); Wed, 30 May 2018 11:13:37 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) client-ip=2001:4830:134:3::11; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=JLgpw6SC; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom=qemu-devel-bounces+patch=linaro.org@nongnu.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:40111 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fO5bM-0002Fn-Pl for patch@linaro.org; Wed, 30 May 2018 14:13:36 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:33606) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fO5Pv-00024I-DX for qemu-devel@nongnu.org; Wed, 30 May 2018 14:01:50 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fO5Pq-0004gE-2c for qemu-devel@nongnu.org; Wed, 30 May 2018 14:01:47 -0400 Received: from mail-pg0-x244.google.com ([2607:f8b0:400e:c05::244]:39225) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fO5Pp-0004ei-Pc for qemu-devel@nongnu.org; Wed, 30 May 2018 14:01:41 -0400 Received: by mail-pg0-x244.google.com with SMTP id w12-v6so7271815pgc.6 for ; Wed, 30 May 2018 11:01:41 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=SvrVYh2PWuwtdh6uLUs3LW+4B4XuI0Sq8omTmAY/Tvw=; b=JLgpw6SCTJLf8FfsV0Eg953oJWW0/yxSYXOctdCCpUYatWZt6Md9pvmrdKbTL6fHHx VpcBlzC7O5jKyN2uNtXBT2Kh1PeNkze/3QFqE6jjKhIKel2eYKS91UUUTTr11XY1qzxi v7/Ok/ghDc8dN5tWbGcgbgXmVYKX81rz0E024= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=SvrVYh2PWuwtdh6uLUs3LW+4B4XuI0Sq8omTmAY/Tvw=; b=kYvB+ULVHjgW6Nojyf4ZU49x3dYPT8GR2HaeoDFlykmQqwoun1kXvIdvj6yQSiWNl8 GA4P0RQHo5UtY04x5sXqW3/fLK/P/Lf258xOJk40PlQXxDT/3yRRL5Wv2mO1rQPYJIjp 68K1oj6QirmPu/MHcRM4bbluy7Qwl7lq2YQ8RL781crgTXwc4GD5np5BGnTQJUW3AERV 7XCR5hlqB6cgc6BQTjuAq/m8YlFIs8zxSn1nsOn85+N2YXZHy9dw2vERiC0nf+K4l4Zf RDRIUGHxD17+Kr6Pr7gz97yhm42MmkP3h/AU8zkNVKIciqOWr18Gq13cGj7JZG5G+Tup BIOw== X-Gm-Message-State: ALKqPwfI9DL5Mwb0o7i5ItKNS4HkI8O5DhdjcNmLvD92DLFEhiRxUCx/ 4HgVfwzuZhvQP+aGFgnR+rrV4sqBmGY= X-Received: by 2002:a63:bf49:: with SMTP id i9-v6mr3012613pgo.342.1527703300126; Wed, 30 May 2018 11:01:40 -0700 (PDT) Received: from cloudburst.twiddle.net (97-126-112-211.tukw.qwest.net. [97.126.112.211]) by smtp.gmail.com with ESMTPSA id b84-v6sm28179157pfm.123.2018.05.30.11.01.38 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 30 May 2018 11:01:38 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 30 May 2018 11:01:13 -0700 Message-Id: <20180530180120.13355-12-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.0 In-Reply-To: <20180530180120.13355-1-richard.henderson@linaro.org> References: <20180530180120.13355-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c05::244 Subject: [Qemu-devel] [PATCH v3b 11/18] target/arm: Implement SVE Integer Compare - Vectors Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 115 +++++++++++++++++++++++ target/arm/sve_helper.c | 187 +++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 91 ++++++++++++++++++ target/arm/sve.decode | 24 +++++ 4 files changed, 417 insertions(+) -- 2.17.0 Reviewed-by: Peter Maydell diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 0f57f64895..6ffd1fbe8e 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -490,6 +490,121 @@ DEF_HELPER_FLAGS_4(sve_rbit_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_splice, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_cmpeq_ppzz_b, TCG_CALL_NO_RWG, + i32, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_cmpne_ppzz_b, TCG_CALL_NO_RWG, + i32, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_cmpge_ppzz_b, TCG_CALL_NO_RWG, + i32, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_cmpgt_ppzz_b, TCG_CALL_NO_RWG, + i32, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_cmphi_ppzz_b, TCG_CALL_NO_RWG, + i32, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_cmphs_ppzz_b, TCG_CALL_NO_RWG, + i32, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_cmpeq_ppzz_h, TCG_CALL_NO_RWG, + i32, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_cmpne_ppzz_h, TCG_CALL_NO_RWG, + i32, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_cmpge_ppzz_h, TCG_CALL_NO_RWG, + i32, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_cmpgt_ppzz_h, TCG_CALL_NO_RWG, + i32, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_cmphi_ppzz_h, TCG_CALL_NO_RWG, + i32, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_cmphs_ppzz_h, TCG_CALL_NO_RWG, + i32, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_cmpeq_ppzz_s, TCG_CALL_NO_RWG, + i32, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_cmpne_ppzz_s, TCG_CALL_NO_RWG, + i32, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_cmpge_ppzz_s, TCG_CALL_NO_RWG, + i32, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_cmpgt_ppzz_s, TCG_CALL_NO_RWG, + i32, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_cmphi_ppzz_s, TCG_CALL_NO_RWG, + i32, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_cmphs_ppzz_s, TCG_CALL_NO_RWG, + i32, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_cmpeq_ppzz_d, TCG_CALL_NO_RWG, + i32, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_cmpne_ppzz_d, TCG_CALL_NO_RWG, + i32, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_cmpge_ppzz_d, TCG_CALL_NO_RWG, + i32, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_cmpgt_ppzz_d, TCG_CALL_NO_RWG, + i32, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_cmphi_ppzz_d, TCG_CALL_NO_RWG, + i32, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_cmphs_ppzz_d, TCG_CALL_NO_RWG, + i32, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_cmpeq_ppzw_b, TCG_CALL_NO_RWG, + i32, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_cmpne_ppzw_b, TCG_CALL_NO_RWG, + i32, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_cmpge_ppzw_b, TCG_CALL_NO_RWG, + i32, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_cmpgt_ppzw_b, TCG_CALL_NO_RWG, + i32, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_cmphi_ppzw_b, TCG_CALL_NO_RWG, + i32, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_cmphs_ppzw_b, TCG_CALL_NO_RWG, + i32, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_cmple_ppzw_b, TCG_CALL_NO_RWG, + i32, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_cmplt_ppzw_b, TCG_CALL_NO_RWG, + i32, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_cmplo_ppzw_b, TCG_CALL_NO_RWG, + i32, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_cmpls_ppzw_b, TCG_CALL_NO_RWG, + i32, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_cmpeq_ppzw_h, TCG_CALL_NO_RWG, + i32, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_cmpne_ppzw_h, TCG_CALL_NO_RWG, + i32, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_cmpge_ppzw_h, TCG_CALL_NO_RWG, + i32, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_cmpgt_ppzw_h, TCG_CALL_NO_RWG, + i32, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_cmphi_ppzw_h, TCG_CALL_NO_RWG, + i32, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_cmphs_ppzw_h, TCG_CALL_NO_RWG, + i32, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_cmple_ppzw_h, TCG_CALL_NO_RWG, + i32, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_cmplt_ppzw_h, TCG_CALL_NO_RWG, + i32, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_cmplo_ppzw_h, TCG_CALL_NO_RWG, + i32, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_cmpls_ppzw_h, TCG_CALL_NO_RWG, + i32, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_cmpeq_ppzw_s, TCG_CALL_NO_RWG, + i32, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_cmpne_ppzw_s, TCG_CALL_NO_RWG, + i32, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_cmpge_ppzw_s, TCG_CALL_NO_RWG, + i32, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_cmpgt_ppzw_s, TCG_CALL_NO_RWG, + i32, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_cmphi_ppzw_s, TCG_CALL_NO_RWG, + i32, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_cmphs_ppzw_s, TCG_CALL_NO_RWG, + i32, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_cmple_ppzw_s, TCG_CALL_NO_RWG, + i32, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_cmplt_ppzw_s, TCG_CALL_NO_RWG, + i32, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_cmplo_ppzw_s, TCG_CALL_NO_RWG, + i32, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_cmpls_ppzw_s, TCG_CALL_NO_RWG, + i32, ptr, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 51cf9c47d9..1dc2ec1e65 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -74,6 +74,28 @@ static uint32_t iter_predtest_fwd(uint64_t d, uint64_t g, uint32_t flags) return flags; } +/* This is an iterative function, called for each Pd and Pg word + * moving backward. + */ +static uint32_t iter_predtest_bwd(uint64_t d, uint64_t g, uint32_t flags) +{ + if (likely(g)) { + /* Compute C from first (i.e last) !(D & G). + Use bit 2 to signal first G bit seen. */ + if (!(flags & 4)) { + flags += 4 - 1; /* add bit 2, subtract C from PREDTEST_INIT */ + flags |= (d & pow2floor(g)) == 0; + } + + /* Accumulate Z from each D & G. */ + flags |= ((d & g) != 0) << 1; + + /* Compute N from last (i.e first) D & G. Replace previous. */ + flags = deposit32(flags, 31, 1, (d & (g & -g)) != 0); + } + return flags; +} + /* The same for a single word predicate. */ uint32_t HELPER(sve_predtest1)(uint64_t d, uint64_t g) { @@ -2200,3 +2222,168 @@ void HELPER(sve_sel_zpzz_d)(void *vd, void *vn, void *vm, d[i] = (pg[H1(i)] & 1 ? nn : mm); } } + +/* Two operand comparison controlled by a predicate. + * ??? It is very tempting to want to be able to expand this inline + * with x86 instructions, e.g. + * + * vcmpeqw zm, zn, %ymm0 + * vpmovmskb %ymm0, %eax + * and $0x5555, %eax + * and pg, %eax + * + * or even aarch64, e.g. + * + * // mask = 4000 1000 0400 0100 0040 0010 0004 0001 + * cmeq v0.8h, zn, zm + * and v0.8h, v0.8h, mask + * addv h0, v0.8h + * and v0.8b, pg + * + * However, coming up with an abstraction that allows vector inputs and + * a scalar output, and also handles the byte-ordering of sub-uint64_t + * scalar outputs, is tricky. + */ +#define DO_CMP_PPZZ(NAME, TYPE, OP, H, MASK) \ +uint32_t HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, uint32_t desc) \ +{ \ + intptr_t opr_sz = simd_oprsz(desc); \ + uint32_t flags = PREDTEST_INIT; \ + intptr_t i = opr_sz; \ + do { \ + uint64_t out = 0, pg; \ + do { \ + i -= sizeof(TYPE), out <<= sizeof(TYPE); \ + TYPE nn = *(TYPE *)(vn + H(i)); \ + TYPE mm = *(TYPE *)(vm + H(i)); \ + out |= nn OP mm; \ + } while (i & 63); \ + pg = *(uint64_t *)(vg + (i >> 3)) & MASK; \ + out &= pg; \ + *(uint64_t *)(vd + (i >> 3)) = out; \ + flags = iter_predtest_bwd(out, pg, flags); \ + } while (i > 0); \ + return flags; \ +} + +#define DO_CMP_PPZZ_B(NAME, TYPE, OP) \ + DO_CMP_PPZZ(NAME, TYPE, OP, H1, 0xffffffffffffffffull) +#define DO_CMP_PPZZ_H(NAME, TYPE, OP) \ + DO_CMP_PPZZ(NAME, TYPE, OP, H1_2, 0x5555555555555555ull) +#define DO_CMP_PPZZ_S(NAME, TYPE, OP) \ + DO_CMP_PPZZ(NAME, TYPE, OP, H1_4, 0x1111111111111111ull) +#define DO_CMP_PPZZ_D(NAME, TYPE, OP) \ + DO_CMP_PPZZ(NAME, TYPE, OP, , 0x0101010101010101ull) + +DO_CMP_PPZZ_B(sve_cmpeq_ppzz_b, uint8_t, ==) +DO_CMP_PPZZ_H(sve_cmpeq_ppzz_h, uint16_t, ==) +DO_CMP_PPZZ_S(sve_cmpeq_ppzz_s, uint32_t, ==) +DO_CMP_PPZZ_D(sve_cmpeq_ppzz_d, uint64_t, ==) + +DO_CMP_PPZZ_B(sve_cmpne_ppzz_b, uint8_t, !=) +DO_CMP_PPZZ_H(sve_cmpne_ppzz_h, uint16_t, !=) +DO_CMP_PPZZ_S(sve_cmpne_ppzz_s, uint32_t, !=) +DO_CMP_PPZZ_D(sve_cmpne_ppzz_d, uint64_t, !=) + +DO_CMP_PPZZ_B(sve_cmpgt_ppzz_b, int8_t, >) +DO_CMP_PPZZ_H(sve_cmpgt_ppzz_h, int16_t, >) +DO_CMP_PPZZ_S(sve_cmpgt_ppzz_s, int32_t, >) +DO_CMP_PPZZ_D(sve_cmpgt_ppzz_d, int64_t, >) + +DO_CMP_PPZZ_B(sve_cmpge_ppzz_b, int8_t, >=) +DO_CMP_PPZZ_H(sve_cmpge_ppzz_h, int16_t, >=) +DO_CMP_PPZZ_S(sve_cmpge_ppzz_s, int32_t, >=) +DO_CMP_PPZZ_D(sve_cmpge_ppzz_d, int64_t, >=) + +DO_CMP_PPZZ_B(sve_cmphi_ppzz_b, uint8_t, >) +DO_CMP_PPZZ_H(sve_cmphi_ppzz_h, uint16_t, >) +DO_CMP_PPZZ_S(sve_cmphi_ppzz_s, uint32_t, >) +DO_CMP_PPZZ_D(sve_cmphi_ppzz_d, uint64_t, >) + +DO_CMP_PPZZ_B(sve_cmphs_ppzz_b, uint8_t, >=) +DO_CMP_PPZZ_H(sve_cmphs_ppzz_h, uint16_t, >=) +DO_CMP_PPZZ_S(sve_cmphs_ppzz_s, uint32_t, >=) +DO_CMP_PPZZ_D(sve_cmphs_ppzz_d, uint64_t, >=) + +#undef DO_CMP_PPZZ_B +#undef DO_CMP_PPZZ_H +#undef DO_CMP_PPZZ_S +#undef DO_CMP_PPZZ_D +#undef DO_CMP_PPZZ + +/* Similar, but the second source is "wide". */ +#define DO_CMP_PPZW(NAME, TYPE, TYPEW, OP, H, MASK) \ +uint32_t HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, uint32_t desc) \ +{ \ + intptr_t opr_sz = simd_oprsz(desc); \ + uint32_t flags = PREDTEST_INIT; \ + intptr_t i = opr_sz; \ + do { \ + uint64_t out = 0, pg; \ + do { \ + TYPEW mm = *(TYPEW *)(vm + i - 8); \ + do { \ + i -= sizeof(TYPE), out <<= sizeof(TYPE); \ + TYPE nn = *(TYPE *)(vn + H(i)); \ + out |= nn OP mm; \ + } while (i & 7); \ + } while (i & 63); \ + pg = *(uint64_t *)(vg + (i >> 3)) & MASK; \ + out &= pg; \ + *(uint64_t *)(vd + (i >> 3)) = out; \ + flags = iter_predtest_bwd(out, pg, flags); \ + } while (i > 0); \ + return flags; \ +} + +#define DO_CMP_PPZW_B(NAME, TYPE, TYPEW, OP) \ + DO_CMP_PPZW(NAME, TYPE, TYPEW, OP, H1, 0xffffffffffffffffull) +#define DO_CMP_PPZW_H(NAME, TYPE, TYPEW, OP) \ + DO_CMP_PPZW(NAME, TYPE, TYPEW, OP, H1_2, 0x5555555555555555ull) +#define DO_CMP_PPZW_S(NAME, TYPE, TYPEW, OP) \ + DO_CMP_PPZW(NAME, TYPE, TYPEW, OP, H1_4, 0x1111111111111111ull) + +DO_CMP_PPZW_B(sve_cmpeq_ppzw_b, uint8_t, uint64_t, ==) +DO_CMP_PPZW_H(sve_cmpeq_ppzw_h, uint16_t, uint64_t, ==) +DO_CMP_PPZW_S(sve_cmpeq_ppzw_s, uint32_t, uint64_t, ==) + +DO_CMP_PPZW_B(sve_cmpne_ppzw_b, uint8_t, uint64_t, !=) +DO_CMP_PPZW_H(sve_cmpne_ppzw_h, uint16_t, uint64_t, !=) +DO_CMP_PPZW_S(sve_cmpne_ppzw_s, uint32_t, uint64_t, !=) + +DO_CMP_PPZW_B(sve_cmpgt_ppzw_b, int8_t, int64_t, >) +DO_CMP_PPZW_H(sve_cmpgt_ppzw_h, int16_t, int64_t, >) +DO_CMP_PPZW_S(sve_cmpgt_ppzw_s, int32_t, int64_t, >) + +DO_CMP_PPZW_B(sve_cmpge_ppzw_b, int8_t, int64_t, >=) +DO_CMP_PPZW_H(sve_cmpge_ppzw_h, int16_t, int64_t, >=) +DO_CMP_PPZW_S(sve_cmpge_ppzw_s, int32_t, int64_t, >=) + +DO_CMP_PPZW_B(sve_cmphi_ppzw_b, uint8_t, uint64_t, >) +DO_CMP_PPZW_H(sve_cmphi_ppzw_h, uint16_t, uint64_t, >) +DO_CMP_PPZW_S(sve_cmphi_ppzw_s, uint32_t, uint64_t, >) + +DO_CMP_PPZW_B(sve_cmphs_ppzw_b, uint8_t, uint64_t, >=) +DO_CMP_PPZW_H(sve_cmphs_ppzw_h, uint16_t, uint64_t, >=) +DO_CMP_PPZW_S(sve_cmphs_ppzw_s, uint32_t, uint64_t, >=) + +DO_CMP_PPZW_B(sve_cmplt_ppzw_b, int8_t, int64_t, <) +DO_CMP_PPZW_H(sve_cmplt_ppzw_h, int16_t, int64_t, <) +DO_CMP_PPZW_S(sve_cmplt_ppzw_s, int32_t, int64_t, <) + +DO_CMP_PPZW_B(sve_cmple_ppzw_b, int8_t, int64_t, <=) +DO_CMP_PPZW_H(sve_cmple_ppzw_h, int16_t, int64_t, <=) +DO_CMP_PPZW_S(sve_cmple_ppzw_s, int32_t, int64_t, <=) + +DO_CMP_PPZW_B(sve_cmplo_ppzw_b, uint8_t, uint64_t, <) +DO_CMP_PPZW_H(sve_cmplo_ppzw_h, uint16_t, uint64_t, <) +DO_CMP_PPZW_S(sve_cmplo_ppzw_s, uint32_t, uint64_t, <) + +DO_CMP_PPZW_B(sve_cmpls_ppzw_b, uint8_t, uint64_t, <=) +DO_CMP_PPZW_H(sve_cmpls_ppzw_h, uint16_t, uint64_t, <=) +DO_CMP_PPZW_S(sve_cmpls_ppzw_s, uint32_t, uint64_t, <=) + +#undef DO_CMP_PPZW_B +#undef DO_CMP_PPZW_H +#undef DO_CMP_PPZW_S +#undef DO_CMP_PPZW diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 53b86af94d..ed0e3c48b1 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -33,6 +33,10 @@ #include "trace-tcg.h" #include "translate-a64.h" + +typedef void gen_helper_gvec_flags_4(TCGv_i32, TCGv_ptr, TCGv_ptr, + TCGv_ptr, TCGv_ptr, TCGv_i32); + /* * Helpers for extracting complex instruction fields. */ @@ -2686,6 +2690,93 @@ static bool trans_SPLICE(DisasContext *s, arg_rprr_esz *a, uint32_t insn) return true; } +/* + *** SVE Integer Compare - Vectors Group + */ + +static bool do_ppzz_flags(DisasContext *s, arg_rprr_esz *a, + gen_helper_gvec_flags_4 *gen_fn) +{ + TCGv_ptr pd, zn, zm, pg; + unsigned vsz; + TCGv_i32 t; + + if (gen_fn == NULL) { + return false; + } + if (!sve_access_check(s)) { + return true; + } + + vsz = vec_full_reg_size(s); + t = tcg_const_i32(simd_desc(vsz, vsz, 0)); + pd = tcg_temp_new_ptr(); + zn = tcg_temp_new_ptr(); + zm = tcg_temp_new_ptr(); + pg = tcg_temp_new_ptr(); + + tcg_gen_addi_ptr(pd, cpu_env, pred_full_reg_offset(s, a->rd)); + tcg_gen_addi_ptr(zn, cpu_env, vec_full_reg_offset(s, a->rn)); + tcg_gen_addi_ptr(zm, cpu_env, vec_full_reg_offset(s, a->rm)); + tcg_gen_addi_ptr(pg, cpu_env, pred_full_reg_offset(s, a->pg)); + + gen_fn(t, pd, zn, zm, pg, t); + + tcg_temp_free_ptr(pd); + tcg_temp_free_ptr(zn); + tcg_temp_free_ptr(zm); + tcg_temp_free_ptr(pg); + + do_pred_flags(t); + + tcg_temp_free_i32(t); + return true; +} + +#define DO_PPZZ(NAME, name) \ +static bool trans_##NAME##_ppzz(DisasContext *s, arg_rprr_esz *a, \ + uint32_t insn) \ +{ \ + static gen_helper_gvec_flags_4 * const fns[4] = { \ + gen_helper_sve_##name##_ppzz_b, gen_helper_sve_##name##_ppzz_h, \ + gen_helper_sve_##name##_ppzz_s, gen_helper_sve_##name##_ppzz_d, \ + }; \ + return do_ppzz_flags(s, a, fns[a->esz]); \ +} + +DO_PPZZ(CMPEQ, cmpeq) +DO_PPZZ(CMPNE, cmpne) +DO_PPZZ(CMPGT, cmpgt) +DO_PPZZ(CMPGE, cmpge) +DO_PPZZ(CMPHI, cmphi) +DO_PPZZ(CMPHS, cmphs) + +#undef DO_PPZZ + +#define DO_PPZW(NAME, name) \ +static bool trans_##NAME##_ppzw(DisasContext *s, arg_rprr_esz *a, \ + uint32_t insn) \ +{ \ + static gen_helper_gvec_flags_4 * const fns[4] = { \ + gen_helper_sve_##name##_ppzw_b, gen_helper_sve_##name##_ppzw_h, \ + gen_helper_sve_##name##_ppzw_s, NULL \ + }; \ + return do_ppzz_flags(s, a, fns[a->esz]); \ +} + +DO_PPZW(CMPEQ, cmpeq) +DO_PPZW(CMPNE, cmpne) +DO_PPZW(CMPGT, cmpgt) +DO_PPZW(CMPGE, cmpge) +DO_PPZW(CMPHI, cmphi) +DO_PPZW(CMPHS, cmphs) +DO_PPZW(CMPLT, cmplt) +DO_PPZW(CMPLE, cmple) +DO_PPZW(CMPLO, cmplo) +DO_PPZW(CMPLS, cmpls) + +#undef DO_PPZW + /* *** SVE Memory - 32-bit Gather and Unsized Contiguous Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 91522d8e13..76a42193e4 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -99,6 +99,7 @@ @rdm_pg_rn ........ esz:2 ... ... ... pg:3 rn:5 rd:5 \ &rprr_esz rm=%reg_movprfx @rd_pg4_rn_rm ........ esz:2 . rm:5 .. pg:4 rn:5 rd:5 &rprr_esz +@pd_pg_rn_rm ........ esz:2 . rm:5 ... pg:3 rn:5 . rd:4 &rprr_esz # Three register operand, with governing predicate, vector element size @rda_pg_rn_rm ........ esz:2 . rm:5 ... pg:3 rn:5 rd:5 \ @@ -472,6 +473,29 @@ SPLICE 00000101 .. 101 100 100 ... ..... ..... @rdn_pg_rm # SVE select vector elements (predicated) SEL_zpzz 00000101 .. 1 ..... 11 .... ..... ..... @rd_pg4_rn_rm +### SVE Integer Compare - Vectors Group + +# SVE integer compare_vectors +CMPHS_ppzz 00100100 .. 0 ..... 000 ... ..... 0 .... @pd_pg_rn_rm +CMPHI_ppzz 00100100 .. 0 ..... 000 ... ..... 1 .... @pd_pg_rn_rm +CMPGE_ppzz 00100100 .. 0 ..... 100 ... ..... 0 .... @pd_pg_rn_rm +CMPGT_ppzz 00100100 .. 0 ..... 100 ... ..... 1 .... @pd_pg_rn_rm +CMPEQ_ppzz 00100100 .. 0 ..... 101 ... ..... 0 .... @pd_pg_rn_rm +CMPNE_ppzz 00100100 .. 0 ..... 101 ... ..... 1 .... @pd_pg_rn_rm + +# SVE integer compare with wide elements +# Note these require esz != 3. +CMPEQ_ppzw 00100100 .. 0 ..... 001 ... ..... 0 .... @pd_pg_rn_rm +CMPNE_ppzw 00100100 .. 0 ..... 001 ... ..... 1 .... @pd_pg_rn_rm +CMPGE_ppzw 00100100 .. 0 ..... 010 ... ..... 0 .... @pd_pg_rn_rm +CMPGT_ppzw 00100100 .. 0 ..... 010 ... ..... 1 .... @pd_pg_rn_rm +CMPLT_ppzw 00100100 .. 0 ..... 011 ... ..... 0 .... @pd_pg_rn_rm +CMPLE_ppzw 00100100 .. 0 ..... 011 ... ..... 1 .... @pd_pg_rn_rm +CMPHS_ppzw 00100100 .. 0 ..... 110 ... ..... 0 .... @pd_pg_rn_rm +CMPHI_ppzw 00100100 .. 0 ..... 110 ... ..... 1 .... @pd_pg_rn_rm +CMPLO_ppzw 00100100 .. 0 ..... 111 ... ..... 0 .... @pd_pg_rn_rm +CMPLS_ppzw 00100100 .. 0 ..... 111 ... ..... 1 .... @pd_pg_rn_rm + ### SVE Predicate Logical Operations Group # SVE predicate logical operations From patchwork Wed May 30 18:01:14 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 137286 Delivered-To: patch@linaro.org Received: by 2002:a2e:9706:0:0:0:0:0 with SMTP id r6-v6csp5646595lji; Wed, 30 May 2018 11:20:17 -0700 (PDT) X-Google-Smtp-Source: ADUXVKI6JCbW3Ozqrn9cPcC0dVyC7k1IydjBwfBCMObcXL2sP3+jZ4om2xw+C3YwKFlVEb1ki+F4 X-Received: by 2002:a37:ab04:: with SMTP id u4-v6mr3338302qke.92.1527704417014; Wed, 30 May 2018 11:20:17 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1527704417; cv=none; d=google.com; s=arc-20160816; b=tAM/vlV2kt82QWjOOE7aHG2+uETauPjtGRvPZ90NKf7eOENcfz0lx7BsXBSNBVslIz t9QtKVBKadiyc9dXqwe6Cf2AU5saysra6lPN71wdOg1Q5uyTqkOEMzewWKVeG/uWR1ad JVEvr/Cs3WcqJ4nYTWnbcGf2HLh9/UJyKPzPa2aDJ8WdZ17Ki8P1DhiY5aemLMaHMN9T hp1goBRIpdyzVKPBrFsU8fs8pYzDWR9U/wZ0cEJq8pPgL0OGPC4mSP1GRi6WqsouCPP/ KJbfh6ND0BRdCqMRpmReu1ZVFbNDj7E27w0k4RIzhXSaVse8rFaPwVhZgd0tt9/9tTz1 4LAg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature:arc-authentication-results; bh=9ZJ5MS9BNhn0KK0l8M8UKXBgrg+2w93FAOiLtmMVeCs=; b=WiUpGEMGsvzHFCQU5R6kwC067D8UYQpjhsTWt/GoeDHneOM4FAKXwReHbraBj4P88M LRMMR4Kkn68j9Od6L0cdIkgyiqciGw4YVwo15OL6PkEAw+dfyPZ88B5Fkunoyn1/lFqp NgmTze7ouMUJrf12es5R+Si5+uNOWQM4V2xVMW4vniTVfOlm4XOROWNa+IIQ0BEcLOcB DJ8PNLocUzqOb5SvTYwC4WB7YBtDUuD9uiqQicmoM47SuLwHFQCjBjb0x2nbvCuWqIWs yhN90Q396r49UVAOf2lLY7l29kj0AqMfp0Yo6vouj6E/PSFYP9hdAYi/M177Z0wcw2Vo P1rg== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=XWoW990Q; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom=qemu-devel-bounces+patch=linaro.org@nongnu.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [2001:4830:134:3::11]) by mx.google.com with ESMTPS id l17-v6si3248598qki.299.2018.05.30.11.20.16 for (version=TLS1 cipher=AES128-SHA bits=128/128); Wed, 30 May 2018 11:20:17 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) client-ip=2001:4830:134:3::11; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=XWoW990Q; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom=qemu-devel-bounces+patch=linaro.org@nongnu.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:40150 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fO5ho-00078O-Fa for patch@linaro.org; Wed, 30 May 2018 14:20:16 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:33735) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fO5Q0-00029f-9W for qemu-devel@nongnu.org; Wed, 30 May 2018 14:01:54 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fO5Pr-0004i4-Mc for qemu-devel@nongnu.org; Wed, 30 May 2018 14:01:52 -0400 Received: from mail-pg0-x22c.google.com ([2607:f8b0:400e:c05::22c]:40735) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fO5Pr-0004h2-E3 for qemu-devel@nongnu.org; Wed, 30 May 2018 14:01:43 -0400 Received: by mail-pg0-x22c.google.com with SMTP id l2-v6so8477913pgc.7 for ; Wed, 30 May 2018 11:01:43 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=9ZJ5MS9BNhn0KK0l8M8UKXBgrg+2w93FAOiLtmMVeCs=; b=XWoW990QEqujpOzMT8qvXCWY1nKSWU0FRRR/17eL8KvERtTBMo2H7KieG/ui5rLSkz FFiNjQJosxAY1CQF7Z/85p7CVzbicGiL3PZdZ53fW5enf7wtow1pAL372b4lO2TMexAD rNySsnEGxnUkYTGNh5owQjKiE577z94FoxaDU= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=9ZJ5MS9BNhn0KK0l8M8UKXBgrg+2w93FAOiLtmMVeCs=; b=qM0NQzRxmxY6mreTx8Uj1XelrH79f4Hq7lptZlEu0M/frF4vouQbw68sX2bSoJoQ0X GmgPgqqOFyaUIxq3IxAQaH1I0WmpYy3JIE0ah4tIsvEohto3EQDukkEq9AurH82hAQ83 EgVE/VsNi93tsngOwjNLx+8yQdpm6o+uwivH2NYZqfk1d4RYyKzn67EqnG0KiK2Ctmob FbwfrWmgUbJVP8encBC3A7aJVPH/n5jHK6PxCeTAziqVP/y6S1S1jaCMmKTypp5FWfo9 LZLp52bsEO26Apkq+XN7O9brw+KsgMs00ktAY3eUj4EhV2WCGF9qZ5kZto2cFzL+Ueka K9tg== X-Gm-Message-State: ALKqPwfPxNrOmwU2LzxqakSmLRDwcO+xeQ4MhsgpsP3CULIUL3PomGa+ OUAYxSF9BbFclFzvdLe4f1mkTyeIutI= X-Received: by 2002:a62:93c8:: with SMTP id r69-v6mr3693558pfk.59.1527703301807; Wed, 30 May 2018 11:01:41 -0700 (PDT) Received: from cloudburst.twiddle.net (97-126-112-211.tukw.qwest.net. [97.126.112.211]) by smtp.gmail.com with ESMTPSA id b84-v6sm28179157pfm.123.2018.05.30.11.01.40 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 30 May 2018 11:01:40 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 30 May 2018 11:01:14 -0700 Message-Id: <20180530180120.13355-13-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.0 In-Reply-To: <20180530180120.13355-1-richard.henderson@linaro.org> References: <20180530180120.13355-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c05::22c Subject: [Qemu-devel] [PATCH v3b 12/18] target/arm: Implement SVE Integer Compare - Immediate Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 44 +++++++++++++++++++ target/arm/sve_helper.c | 88 ++++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 66 ++++++++++++++++++++++++++++ target/arm/sve.decode | 23 ++++++++++ 4 files changed, 221 insertions(+) -- 2.17.0 Reviewed-by: Peter Maydell diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 6ffd1fbe8e..ae38c0a4be 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -605,6 +605,50 @@ DEF_HELPER_FLAGS_5(sve_cmplo_ppzw_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(sve_cmpls_ppzw_s, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmpeq_ppzi_b, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmpne_ppzi_b, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmpgt_ppzi_b, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmpge_ppzi_b, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmplt_ppzi_b, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmple_ppzi_b, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmphs_ppzi_b, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmphi_ppzi_b, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmplo_ppzi_b, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmpls_ppzi_b, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve_cmpeq_ppzi_h, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmpne_ppzi_h, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmpgt_ppzi_h, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmpge_ppzi_h, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmplt_ppzi_h, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmple_ppzi_h, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmphs_ppzi_h, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmphi_ppzi_h, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmplo_ppzi_h, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmpls_ppzi_h, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve_cmpeq_ppzi_s, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmpne_ppzi_s, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmpgt_ppzi_s, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmpge_ppzi_s, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmplt_ppzi_s, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmple_ppzi_s, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmphs_ppzi_s, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmphi_ppzi_s, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmplo_ppzi_s, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmpls_ppzi_s, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve_cmpeq_ppzi_d, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmpne_ppzi_d, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmpgt_ppzi_d, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmpge_ppzi_d, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmplt_ppzi_d, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmple_ppzi_d, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmphs_ppzi_d, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmphi_ppzi_d, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmplo_ppzi_d, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmpls_ppzi_d, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 1dc2ec1e65..f6c58f54ad 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -2387,3 +2387,91 @@ DO_CMP_PPZW_S(sve_cmpls_ppzw_s, uint32_t, uint64_t, <=) #undef DO_CMP_PPZW_H #undef DO_CMP_PPZW_S #undef DO_CMP_PPZW + +/* Similar, but the second source is immediate. */ +#define DO_CMP_PPZI(NAME, TYPE, OP, H, MASK) \ +uint32_t HELPER(NAME)(void *vd, void *vn, void *vg, uint32_t desc) \ +{ \ + intptr_t opr_sz = simd_oprsz(desc); \ + uint32_t flags = PREDTEST_INIT; \ + TYPE mm = simd_data(desc); \ + intptr_t i = opr_sz; \ + do { \ + uint64_t out = 0, pg; \ + do { \ + i -= sizeof(TYPE), out <<= sizeof(TYPE); \ + TYPE nn = *(TYPE *)(vn + H(i)); \ + out |= nn OP mm; \ + } while (i & 63); \ + pg = *(uint64_t *)(vg + (i >> 3)) & MASK; \ + out &= pg; \ + *(uint64_t *)(vd + (i >> 3)) = out; \ + flags = iter_predtest_bwd(out, pg, flags); \ + } while (i > 0); \ + return flags; \ +} + +#define DO_CMP_PPZI_B(NAME, TYPE, OP) \ + DO_CMP_PPZI(NAME, TYPE, OP, H1, 0xffffffffffffffffull) +#define DO_CMP_PPZI_H(NAME, TYPE, OP) \ + DO_CMP_PPZI(NAME, TYPE, OP, H1_2, 0x5555555555555555ull) +#define DO_CMP_PPZI_S(NAME, TYPE, OP) \ + DO_CMP_PPZI(NAME, TYPE, OP, H1_4, 0x1111111111111111ull) +#define DO_CMP_PPZI_D(NAME, TYPE, OP) \ + DO_CMP_PPZI(NAME, TYPE, OP, , 0x0101010101010101ull) + +DO_CMP_PPZI_B(sve_cmpeq_ppzi_b, uint8_t, ==) +DO_CMP_PPZI_H(sve_cmpeq_ppzi_h, uint16_t, ==) +DO_CMP_PPZI_S(sve_cmpeq_ppzi_s, uint32_t, ==) +DO_CMP_PPZI_D(sve_cmpeq_ppzi_d, uint64_t, ==) + +DO_CMP_PPZI_B(sve_cmpne_ppzi_b, uint8_t, !=) +DO_CMP_PPZI_H(sve_cmpne_ppzi_h, uint16_t, !=) +DO_CMP_PPZI_S(sve_cmpne_ppzi_s, uint32_t, !=) +DO_CMP_PPZI_D(sve_cmpne_ppzi_d, uint64_t, !=) + +DO_CMP_PPZI_B(sve_cmpgt_ppzi_b, int8_t, >) +DO_CMP_PPZI_H(sve_cmpgt_ppzi_h, int16_t, >) +DO_CMP_PPZI_S(sve_cmpgt_ppzi_s, int32_t, >) +DO_CMP_PPZI_D(sve_cmpgt_ppzi_d, int64_t, >) + +DO_CMP_PPZI_B(sve_cmpge_ppzi_b, int8_t, >=) +DO_CMP_PPZI_H(sve_cmpge_ppzi_h, int16_t, >=) +DO_CMP_PPZI_S(sve_cmpge_ppzi_s, int32_t, >=) +DO_CMP_PPZI_D(sve_cmpge_ppzi_d, int64_t, >=) + +DO_CMP_PPZI_B(sve_cmphi_ppzi_b, uint8_t, >) +DO_CMP_PPZI_H(sve_cmphi_ppzi_h, uint16_t, >) +DO_CMP_PPZI_S(sve_cmphi_ppzi_s, uint32_t, >) +DO_CMP_PPZI_D(sve_cmphi_ppzi_d, uint64_t, >) + +DO_CMP_PPZI_B(sve_cmphs_ppzi_b, uint8_t, >=) +DO_CMP_PPZI_H(sve_cmphs_ppzi_h, uint16_t, >=) +DO_CMP_PPZI_S(sve_cmphs_ppzi_s, uint32_t, >=) +DO_CMP_PPZI_D(sve_cmphs_ppzi_d, uint64_t, >=) + +DO_CMP_PPZI_B(sve_cmplt_ppzi_b, int8_t, <) +DO_CMP_PPZI_H(sve_cmplt_ppzi_h, int16_t, <) +DO_CMP_PPZI_S(sve_cmplt_ppzi_s, int32_t, <) +DO_CMP_PPZI_D(sve_cmplt_ppzi_d, int64_t, <) + +DO_CMP_PPZI_B(sve_cmple_ppzi_b, int8_t, <=) +DO_CMP_PPZI_H(sve_cmple_ppzi_h, int16_t, <=) +DO_CMP_PPZI_S(sve_cmple_ppzi_s, int32_t, <=) +DO_CMP_PPZI_D(sve_cmple_ppzi_d, int64_t, <=) + +DO_CMP_PPZI_B(sve_cmplo_ppzi_b, uint8_t, <) +DO_CMP_PPZI_H(sve_cmplo_ppzi_h, uint16_t, <) +DO_CMP_PPZI_S(sve_cmplo_ppzi_s, uint32_t, <) +DO_CMP_PPZI_D(sve_cmplo_ppzi_d, uint64_t, <) + +DO_CMP_PPZI_B(sve_cmpls_ppzi_b, uint8_t, <=) +DO_CMP_PPZI_H(sve_cmpls_ppzi_h, uint16_t, <=) +DO_CMP_PPZI_S(sve_cmpls_ppzi_s, uint32_t, <=) +DO_CMP_PPZI_D(sve_cmpls_ppzi_d, uint64_t, <=) + +#undef DO_CMP_PPZI_B +#undef DO_CMP_PPZI_H +#undef DO_CMP_PPZI_S +#undef DO_CMP_PPZI_D +#undef DO_CMP_PPZI diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index ed0e3c48b1..5054d4d91c 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -34,6 +34,8 @@ #include "translate-a64.h" +typedef void gen_helper_gvec_flags_3(TCGv_i32, TCGv_ptr, TCGv_ptr, + TCGv_ptr, TCGv_i32); typedef void gen_helper_gvec_flags_4(TCGv_i32, TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i32); @@ -2777,6 +2779,70 @@ DO_PPZW(CMPLS, cmpls) #undef DO_PPZW +/* + *** SVE Integer Compare - Immediate Groups + */ + +static bool do_ppzi_flags(DisasContext *s, arg_rpri_esz *a, + gen_helper_gvec_flags_3 *gen_fn) +{ + TCGv_ptr pd, zn, pg; + unsigned vsz; + TCGv_i32 t; + + if (gen_fn == NULL) { + return false; + } + if (!sve_access_check(s)) { + return true; + } + + vsz = vec_full_reg_size(s); + t = tcg_const_i32(simd_desc(vsz, vsz, a->imm)); + pd = tcg_temp_new_ptr(); + zn = tcg_temp_new_ptr(); + pg = tcg_temp_new_ptr(); + + tcg_gen_addi_ptr(pd, cpu_env, pred_full_reg_offset(s, a->rd)); + tcg_gen_addi_ptr(zn, cpu_env, vec_full_reg_offset(s, a->rn)); + tcg_gen_addi_ptr(pg, cpu_env, pred_full_reg_offset(s, a->pg)); + + gen_fn(t, pd, zn, pg, t); + + tcg_temp_free_ptr(pd); + tcg_temp_free_ptr(zn); + tcg_temp_free_ptr(pg); + + do_pred_flags(t); + + tcg_temp_free_i32(t); + return true; +} + +#define DO_PPZI(NAME, name) \ +static bool trans_##NAME##_ppzi(DisasContext *s, arg_rpri_esz *a, \ + uint32_t insn) \ +{ \ + static gen_helper_gvec_flags_3 * const fns[4] = { \ + gen_helper_sve_##name##_ppzi_b, gen_helper_sve_##name##_ppzi_h, \ + gen_helper_sve_##name##_ppzi_s, gen_helper_sve_##name##_ppzi_d, \ + }; \ + return do_ppzi_flags(s, a, fns[a->esz]); \ +} + +DO_PPZI(CMPEQ, cmpeq) +DO_PPZI(CMPNE, cmpne) +DO_PPZI(CMPGT, cmpgt) +DO_PPZI(CMPGE, cmpge) +DO_PPZI(CMPHI, cmphi) +DO_PPZI(CMPHS, cmphs) +DO_PPZI(CMPLT, cmplt) +DO_PPZI(CMPLE, cmple) +DO_PPZI(CMPLO, cmplo) +DO_PPZI(CMPLS, cmpls) + +#undef DO_PPZI + /* *** SVE Memory - 32-bit Gather and Unsized Contiguous Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 76a42193e4..9bc383b085 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -131,6 +131,11 @@ @rdn_dbm ........ .. .... dbm:13 rd:5 \ &rr_dbm rn=%reg_movprfx +# Predicate output, vector and immediate input, +# controlling predicate, element size. +@pd_pg_rn_i7 ........ esz:2 . imm:7 . pg:3 rn:5 . rd:4 &rpri_esz +@pd_pg_rn_i5 ........ esz:2 . imm:s5 ... pg:3 rn:5 . rd:4 &rpri_esz + # Basic Load/Store with 9-bit immediate offset @pd_rn_i9 ........ ........ ...... rn:5 . rd:4 \ &rri imm=%imm9_16_10 @@ -496,6 +501,24 @@ CMPHI_ppzw 00100100 .. 0 ..... 110 ... ..... 1 .... @pd_pg_rn_rm CMPLO_ppzw 00100100 .. 0 ..... 111 ... ..... 0 .... @pd_pg_rn_rm CMPLS_ppzw 00100100 .. 0 ..... 111 ... ..... 1 .... @pd_pg_rn_rm +### SVE Integer Compare - Unsigned Immediate Group + +# SVE integer compare with unsigned immediate +CMPHS_ppzi 00100100 .. 1 ....... 0 ... ..... 0 .... @pd_pg_rn_i7 +CMPHI_ppzi 00100100 .. 1 ....... 0 ... ..... 1 .... @pd_pg_rn_i7 +CMPLO_ppzi 00100100 .. 1 ....... 1 ... ..... 0 .... @pd_pg_rn_i7 +CMPLS_ppzi 00100100 .. 1 ....... 1 ... ..... 1 .... @pd_pg_rn_i7 + +### SVE Integer Compare - Signed Immediate Group + +# SVE integer compare with signed immediate +CMPGE_ppzi 00100101 .. 0 ..... 000 ... ..... 0 .... @pd_pg_rn_i5 +CMPGT_ppzi 00100101 .. 0 ..... 000 ... ..... 1 .... @pd_pg_rn_i5 +CMPLT_ppzi 00100101 .. 0 ..... 001 ... ..... 0 .... @pd_pg_rn_i5 +CMPLE_ppzi 00100101 .. 0 ..... 001 ... ..... 1 .... @pd_pg_rn_i5 +CMPEQ_ppzi 00100101 .. 0 ..... 100 ... ..... 0 .... @pd_pg_rn_i5 +CMPNE_ppzi 00100101 .. 0 ..... 100 ... ..... 1 .... @pd_pg_rn_i5 + ### SVE Predicate Logical Operations Group # SVE predicate logical operations From patchwork Wed May 30 18:01:15 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 137280 Delivered-To: patch@linaro.org Received: by 2002:a2e:9706:0:0:0:0:0 with SMTP id r6-v6csp5643017lji; Wed, 30 May 2018 11:16:18 -0700 (PDT) X-Google-Smtp-Source: ADUXVKLLUnq3Hp1DXwl82VsaCJ3KYEL8BIWjVnKfkfdHBzxGz5vsSdCtCYP5lwV7IhK2kldqiUCy X-Received: by 2002:ac8:3698:: with SMTP id a24-v6mr3710338qtc.88.1527704178593; Wed, 30 May 2018 11:16:18 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1527704178; cv=none; d=google.com; s=arc-20160816; b=RFzaQ0jKvc6p8gVCUmH3GA8dGiKct8/vRI/pGtW+teMmron7x2rA1BDP00JqUpIxu7 RGFqukRl/BSx3B81dajCxJnQgG7yUS8HW366ZMGvaDlxnUIiYtC8Vx/oJZFHLDlF1ii6 8GsOI7oVxHlHOgWqZWyhahtehiGyKlt85Yjt+poAeY3TjVIvW6RgJshd0KYblW1/sErJ yIVfOV+ZlTCXv+3IRbpKFhKjuwBaxOiALcWFsAcsO9xJ5aVY/ue/F8N9zbIPXlnl0t7K QlZCmo2lgLMWeQqbnuEaD6mMoIUlPHlocS8HpdefNFiNeYZaHDouA0UPdElagNrK10Yn /0UA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature:arc-authentication-results; bh=bpJBtH7g9+18qUWT8wVKCm8TrfDhueDFrWyVI+Ck1/U=; b=YK3CD7Rf40yrnuPfTVehdFUjZrdpGsi53uMvF2t/TMezAAPKUQij09chhXBo9M3ouc zxfeAeN0W5dInyfA5Hd0F9wT5cNpB9ifEtR8x0ERrN69Q0AQm/9WHRCD4fY2BATq+0aP zUg0mpv21LGbMrIwH65xzOI86dgVXsjct7o0XNuy8SXIrzvZS74uAvsMbcmK3DYo8cls DdOTB2TN+ahbb9ShyDagDVgje2RSe1f3ox8uFg2N5EPoY1jZMcIpQsjTPc0Ry2B7Jpak 9toHh546bVmkFGRLoTqquN6d6z6bOXnGr6tNsJW64Ft6tqJWEDvMfuFEUbLHl4pzVAMd ulBA== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=D5D6kapC; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom=qemu-devel-bounces+patch=linaro.org@nongnu.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [2001:4830:134:3::11]) by mx.google.com with ESMTPS id y126-v6si14578899qka.302.2018.05.30.11.16.18 for (version=TLS1 cipher=AES128-SHA bits=128/128); Wed, 30 May 2018 11:16:18 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) client-ip=2001:4830:134:3::11; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=D5D6kapC; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom=qemu-devel-bounces+patch=linaro.org@nongnu.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:40137 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fO5dy-0005Bh-3L for patch@linaro.org; Wed, 30 May 2018 14:16:18 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:33755) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fO5Q1-0002Al-5c for qemu-devel@nongnu.org; Wed, 30 May 2018 14:01:55 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fO5Pt-0004k6-9w for qemu-devel@nongnu.org; Wed, 30 May 2018 14:01:53 -0400 Received: from mail-pg0-x241.google.com ([2607:f8b0:400e:c05::241]:39223) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fO5Ps-0004jG-V5 for qemu-devel@nongnu.org; Wed, 30 May 2018 14:01:45 -0400 Received: by mail-pg0-x241.google.com with SMTP id w12-v6so7271895pgc.6 for ; Wed, 30 May 2018 11:01:44 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=bpJBtH7g9+18qUWT8wVKCm8TrfDhueDFrWyVI+Ck1/U=; b=D5D6kapCZineDWqSQMmZuOnsLBJVEw4LZMOt+Jpp/RWjBvA8gqGjFRidiAx8FImBKZ xTQak01VpoWjBcfno6+MnFk3jfWawQO5ldb1NTda+YnHktR2aZxIkTxruQrd24wXNrGZ D3RImxA+OPeLhqkYtCNB1seIaUVzTZkxu8hmw= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=bpJBtH7g9+18qUWT8wVKCm8TrfDhueDFrWyVI+Ck1/U=; b=Tzt2luIFyTHvpktFYrf3UDJWsRsuVJsWUq3gcodDK9aLteUPte4ssZ8A1wGbh+wlPj lx+7juan9pbgxkMMT/LuQnrcoNDaIGm6xtD8ri5dNJiLlaFbJ32d5VxgWlrOPO49a3ds a4GKPbcQVhaO5P0ivsBhBgB8czGM4GyRY3UAxlzP635nxD0EXK5iAOFU01zOGodgJEgM Cw5gwHBOVzSyAgLqERtADByHp3pCGp6vaGbgEjnsbBKtY2LXs+QUEVhT8s+BM8qZfQlP DbsN9E83yYHbuV+rUWkegDTdlEPbKf/qNrLQVZEyGLxyCSuI+eO20Ompb/uBDlI7zZK0 1Z3w== X-Gm-Message-State: ALKqPwfPZwkM8NZfTzI/ZP+wpqpEtLQbS2EC5ltdfg+8T8P3aNF0iuDi /7LjY4gPWGPlg9E2ECLydYgTNaMJutk= X-Received: by 2002:a62:e30f:: with SMTP id g15-v6mr3734240pfh.68.1527703303310; Wed, 30 May 2018 11:01:43 -0700 (PDT) Received: from cloudburst.twiddle.net (97-126-112-211.tukw.qwest.net. [97.126.112.211]) by smtp.gmail.com with ESMTPSA id b84-v6sm28179157pfm.123.2018.05.30.11.01.41 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 30 May 2018 11:01:42 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 30 May 2018 11:01:15 -0700 Message-Id: <20180530180120.13355-14-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.0 In-Reply-To: <20180530180120.13355-1-richard.henderson@linaro.org> References: <20180530180120.13355-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c05::241 Subject: [Qemu-devel] [PATCH v3b 13/18] target/arm: Implement SVE Partition Break Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 18 +++ target/arm/sve_helper.c | 247 +++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 106 ++++++++++++++++ target/arm/sve.decode | 19 +++ 4 files changed, 390 insertions(+) -- 2.17.0 Reviewed-by: Peter Maydell diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index ae38c0a4be..f0a3ed3414 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -658,3 +658,21 @@ DEF_HELPER_FLAGS_5(sve_orn_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_nor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_nand_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_brkpa, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_brkpb, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_brkpas, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_brkpbs, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve_brka_z, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_brkb_z, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_brka_m, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_brkb_m, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve_brkas_z, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_brkbs_z, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_brkas_m, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_brkbs_m, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve_brkn, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_brkns, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index f6c58f54ad..409ec0963f 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -2475,3 +2475,250 @@ DO_CMP_PPZI_D(sve_cmpls_ppzi_d, uint64_t, <=) #undef DO_CMP_PPZI_S #undef DO_CMP_PPZI_D #undef DO_CMP_PPZI + +/* Similar to the ARM LastActive pseudocode function. */ +static bool last_active_pred(void *vd, void *vg, intptr_t oprsz) +{ + intptr_t i; + + for (i = QEMU_ALIGN_UP(oprsz, 8) - 8; i >= 0; i -= 8) { + uint64_t pg = *(uint64_t *)(vg + i); + if (pg) { + return (pow2floor(pg) & *(uint64_t *)(vd + i)) != 0; + } + } + return 0; +} + +/* Compute a mask into RETB that is true for all G, up to and including + * (if after) or excluding (if !after) the first G & N. + * Return true if BRK found. + */ +static bool compute_brk(uint64_t *retb, uint64_t n, uint64_t g, + bool brk, bool after) +{ + uint64_t b; + + if (brk) { + b = 0; + } else if ((g & n) == 0) { + /* For all G, no N are set; break not found. */ + b = g; + } else { + /* Break somewhere in N. Locate it. */ + b = g & n; /* guard true, pred true*/ + b = b & -b; /* first such */ + if (after) { + b = b | (b - 1); /* break after same */ + } else { + b = b - 1; /* break before same */ + } + brk = true; + } + + *retb = b; + return brk; +} + +/* Compute a zeroing BRK. */ +static void compute_brk_z(uint64_t *d, uint64_t *n, uint64_t *g, + intptr_t oprsz, bool after) +{ + bool brk = false; + intptr_t i; + + for (i = 0; i < DIV_ROUND_UP(oprsz, 8); ++i) { + uint64_t this_b, this_g = g[i]; + + brk = compute_brk(&this_b, n[i], this_g, brk, after); + d[i] = this_b & this_g; + } +} + +/* Likewise, but also compute flags. */ +static uint32_t compute_brks_z(uint64_t *d, uint64_t *n, uint64_t *g, + intptr_t oprsz, bool after) +{ + uint32_t flags = PREDTEST_INIT; + bool brk = false; + intptr_t i; + + for (i = 0; i < DIV_ROUND_UP(oprsz, 8); ++i) { + uint64_t this_b, this_d, this_g = g[i]; + + brk = compute_brk(&this_b, n[i], this_g, brk, after); + d[i] = this_d = this_b & this_g; + flags = iter_predtest_fwd(this_d, this_g, flags); + } + return flags; +} + +/* Given a computation function, compute a merging BRK. */ +static void compute_brk_m(uint64_t *d, uint64_t *n, uint64_t *g, + intptr_t oprsz, bool after) +{ + bool brk = false; + intptr_t i; + + for (i = 0; i < DIV_ROUND_UP(oprsz, 8); ++i) { + uint64_t this_b, this_g = g[i]; + + brk = compute_brk(&this_b, n[i], this_g, brk, after); + d[i] = (this_b & this_g) | (d[i] & ~this_g); + } +} + +/* Likewise, but also compute flags. */ +static uint32_t compute_brks_m(uint64_t *d, uint64_t *n, uint64_t *g, + intptr_t oprsz, bool after) +{ + uint32_t flags = PREDTEST_INIT; + bool brk = false; + intptr_t i; + + for (i = 0; i < oprsz / 8; ++i) { + uint64_t this_b, this_d = d[i], this_g = g[i]; + + brk = compute_brk(&this_b, n[i], this_g, brk, after); + d[i] = this_d = (this_b & this_g) | (this_d & ~this_g); + flags = iter_predtest_fwd(this_d, this_g, flags); + } + return flags; +} + +static uint32_t do_zero(ARMPredicateReg *d, intptr_t oprsz) +{ + /* It is quicker to zero the whole predicate than loop on OPRSZ. + The compiler should turn this into 4 64-bit integer stores. */ + memset(d, 0, sizeof(ARMPredicateReg)); + return PREDTEST_INIT; +} + +void HELPER(sve_brkpa)(void *vd, void *vn, void *vm, void *vg, + uint32_t pred_desc) +{ + intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + if (last_active_pred(vn, vg, oprsz)) { + compute_brk_z(vd, vm, vg, oprsz, true); + } else { + do_zero(vd, oprsz); + } +} + +uint32_t HELPER(sve_brkpas)(void *vd, void *vn, void *vm, void *vg, + uint32_t pred_desc) +{ + intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + if (last_active_pred(vn, vg, oprsz)) { + return compute_brks_z(vd, vm, vg, oprsz, true); + } else { + return do_zero(vd, oprsz); + } +} + +void HELPER(sve_brkpb)(void *vd, void *vn, void *vm, void *vg, + uint32_t pred_desc) +{ + intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + if (last_active_pred(vn, vg, oprsz)) { + compute_brk_z(vd, vm, vg, oprsz, false); + } else { + do_zero(vd, oprsz); + } +} + +uint32_t HELPER(sve_brkpbs)(void *vd, void *vn, void *vm, void *vg, + uint32_t pred_desc) +{ + intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + if (last_active_pred(vn, vg, oprsz)) { + return compute_brks_z(vd, vm, vg, oprsz, false); + } else { + return do_zero(vd, oprsz); + } +} + +void HELPER(sve_brka_z)(void *vd, void *vn, void *vg, uint32_t pred_desc) +{ + intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + compute_brk_z(vd, vn, vg, oprsz, true); +} + +uint32_t HELPER(sve_brkas_z)(void *vd, void *vn, void *vg, uint32_t pred_desc) +{ + intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + return compute_brks_z(vd, vn, vg, oprsz, true); +} + +void HELPER(sve_brkb_z)(void *vd, void *vn, void *vg, uint32_t pred_desc) +{ + intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + compute_brk_z(vd, vn, vg, oprsz, false); +} + +uint32_t HELPER(sve_brkbs_z)(void *vd, void *vn, void *vg, uint32_t pred_desc) +{ + intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + return compute_brks_z(vd, vn, vg, oprsz, false); +} + +void HELPER(sve_brka_m)(void *vd, void *vn, void *vg, uint32_t pred_desc) +{ + intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + compute_brk_m(vd, vn, vg, oprsz, true); +} + +uint32_t HELPER(sve_brkas_m)(void *vd, void *vn, void *vg, uint32_t pred_desc) +{ + intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + return compute_brks_m(vd, vn, vg, oprsz, true); +} + +void HELPER(sve_brkb_m)(void *vd, void *vn, void *vg, uint32_t pred_desc) +{ + intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + compute_brk_m(vd, vn, vg, oprsz, false); +} + +uint32_t HELPER(sve_brkbs_m)(void *vd, void *vn, void *vg, uint32_t pred_desc) +{ + intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + return compute_brks_m(vd, vn, vg, oprsz, false); +} + +void HELPER(sve_brkn)(void *vd, void *vn, void *vg, uint32_t pred_desc) +{ + intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + + if (!last_active_pred(vn, vg, oprsz)) { + do_zero(vd, oprsz); + } +} + +/* As if PredTest(Ones(PL), D, esz). */ +static uint32_t predtest_ones(ARMPredicateReg *d, intptr_t oprsz, + uint64_t esz_mask) +{ + uint32_t flags = PREDTEST_INIT; + intptr_t i; + + for (i = 0; i < oprsz / 8; i++) { + flags = iter_predtest_fwd(d->p[i], esz_mask, flags); + } + if (oprsz & 7) { + uint64_t mask = ~(-1ULL << (8 * (oprsz & 7))); + flags = iter_predtest_fwd(d->p[i], esz_mask & mask, flags); + } + return flags; +} + +uint32_t HELPER(sve_brkns)(void *vd, void *vn, void *vg, uint32_t pred_desc) +{ + intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + + if (last_active_pred(vn, vg, oprsz)) { + return predtest_ones(vd, oprsz, -1); + } else { + return do_zero(vd, oprsz); + } +} diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 5054d4d91c..d5f33cbfe9 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -2843,6 +2843,112 @@ DO_PPZI(CMPLS, cmpls) #undef DO_PPZI +/* + *** SVE Partition Break Group + */ + +static bool do_brk3(DisasContext *s, arg_rprr_s *a, + gen_helper_gvec_4 *fn, gen_helper_gvec_flags_4 *fn_s) +{ + if (!sve_access_check(s)) { + return true; + } + + unsigned vsz = pred_full_reg_size(s); + + /* Predicate sizes may be smaller and cannot use simd_desc. */ + TCGv_ptr d = tcg_temp_new_ptr(); + TCGv_ptr n = tcg_temp_new_ptr(); + TCGv_ptr m = tcg_temp_new_ptr(); + TCGv_ptr g = tcg_temp_new_ptr(); + TCGv_i32 t = tcg_const_i32(vsz - 2); + + tcg_gen_addi_ptr(d, cpu_env, pred_full_reg_offset(s, a->rd)); + tcg_gen_addi_ptr(n, cpu_env, pred_full_reg_offset(s, a->rn)); + tcg_gen_addi_ptr(m, cpu_env, pred_full_reg_offset(s, a->rm)); + tcg_gen_addi_ptr(g, cpu_env, pred_full_reg_offset(s, a->pg)); + + if (a->s) { + fn_s(t, d, n, m, g, t); + do_pred_flags(t); + } else { + fn(d, n, m, g, t); + } + tcg_temp_free_ptr(d); + tcg_temp_free_ptr(n); + tcg_temp_free_ptr(m); + tcg_temp_free_ptr(g); + tcg_temp_free_i32(t); + return true; +} + +static bool do_brk2(DisasContext *s, arg_rpr_s *a, + gen_helper_gvec_3 *fn, gen_helper_gvec_flags_3 *fn_s) +{ + if (!sve_access_check(s)) { + return true; + } + + unsigned vsz = pred_full_reg_size(s); + + /* Predicate sizes may be smaller and cannot use simd_desc. */ + TCGv_ptr d = tcg_temp_new_ptr(); + TCGv_ptr n = tcg_temp_new_ptr(); + TCGv_ptr g = tcg_temp_new_ptr(); + TCGv_i32 t = tcg_const_i32(vsz - 2); + + tcg_gen_addi_ptr(d, cpu_env, pred_full_reg_offset(s, a->rd)); + tcg_gen_addi_ptr(n, cpu_env, pred_full_reg_offset(s, a->rn)); + tcg_gen_addi_ptr(g, cpu_env, pred_full_reg_offset(s, a->pg)); + + if (a->s) { + fn_s(t, d, n, g, t); + do_pred_flags(t); + } else { + fn(d, n, g, t); + } + tcg_temp_free_ptr(d); + tcg_temp_free_ptr(n); + tcg_temp_free_ptr(g); + tcg_temp_free_i32(t); + return true; +} + +static bool trans_BRKPA(DisasContext *s, arg_rprr_s *a, uint32_t insn) +{ + return do_brk3(s, a, gen_helper_sve_brkpa, gen_helper_sve_brkpas); +} + +static bool trans_BRKPB(DisasContext *s, arg_rprr_s *a, uint32_t insn) +{ + return do_brk3(s, a, gen_helper_sve_brkpb, gen_helper_sve_brkpbs); +} + +static bool trans_BRKA_m(DisasContext *s, arg_rpr_s *a, uint32_t insn) +{ + return do_brk2(s, a, gen_helper_sve_brka_m, gen_helper_sve_brkas_m); +} + +static bool trans_BRKB_m(DisasContext *s, arg_rpr_s *a, uint32_t insn) +{ + return do_brk2(s, a, gen_helper_sve_brkb_m, gen_helper_sve_brkbs_m); +} + +static bool trans_BRKA_z(DisasContext *s, arg_rpr_s *a, uint32_t insn) +{ + return do_brk2(s, a, gen_helper_sve_brka_z, gen_helper_sve_brkas_z); +} + +static bool trans_BRKB_z(DisasContext *s, arg_rpr_s *a, uint32_t insn) +{ + return do_brk2(s, a, gen_helper_sve_brkb_z, gen_helper_sve_brkbs_z); +} + +static bool trans_BRKN(DisasContext *s, arg_rpr_s *a, uint32_t insn) +{ + return do_brk2(s, a, gen_helper_sve_brkn, gen_helper_sve_brkns); +} + /* *** SVE Memory - 32-bit Gather and Unsized Contiguous Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 9bc383b085..66e1ee6b5c 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -59,6 +59,7 @@ &rri_esz rd rn imm esz &rrr_esz rd rn rm esz &rpr_esz rd pg rn esz +&rpr_s rd pg rn s &rprr_s rd pg rn rm s &rprr_esz rd pg rn rm esz &rprrr_esz rd pg rn rm ra esz @@ -78,6 +79,9 @@ @pd_pn ........ esz:2 .. .... ....... rn:4 . rd:4 &rr_esz @rd_rn ........ esz:2 ...... ...... rn:5 rd:5 &rr_esz +# Two operand with governing predicate, flags setting +@pd_pg_pn_s ........ . s:1 ...... .. pg:4 . rn:4 . rd:4 &rpr_s + # Three operand with unused vector element size @rd_rn_rm_e0 ........ ... rm:5 ... ... rn:5 rd:5 &rrr_esz esz=0 @@ -560,6 +564,21 @@ PFIRST 00100101 01 011 000 11000 00 .... 0 .... @pd_pn_e0 # SVE predicate next active PNEXT 00100101 .. 011 001 11000 10 .... 0 .... @pd_pn +### SVE Partition Break Group + +# SVE propagate break from previous partition +BRKPA 00100101 0. 00 .... 11 .... 0 .... 0 .... @pd_pg_pn_pm_s +BRKPB 00100101 0. 00 .... 11 .... 0 .... 1 .... @pd_pg_pn_pm_s + +# SVE partition break condition +BRKA_z 00100101 0. 01000001 .... 0 .... 0 .... @pd_pg_pn_s +BRKB_z 00100101 1. 01000001 .... 0 .... 0 .... @pd_pg_pn_s +BRKA_m 00100101 0. 01000001 .... 0 .... 1 .... @pd_pg_pn_s +BRKB_m 00100101 1. 01000001 .... 0 .... 1 .... @pd_pg_pn_s + +# SVE propagate break to next partition +BRKN 00100101 0. 01100001 .... 0 .... 0 .... @pd_pg_pn_s + ### SVE Memory - 32-bit Gather and Unsized Contiguous Group # SVE load predicate register From patchwork Wed May 30 18:01:16 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 137281 Delivered-To: patch@linaro.org Received: by 2002:a2e:9706:0:0:0:0:0 with SMTP id r6-v6csp5644883lji; Wed, 30 May 2018 11:18:15 -0700 (PDT) X-Google-Smtp-Source: ADUXVKLCmdnOFxWuWVRAEd/XRVAqfRRuteNfiAOJ9Mo2C+MJ8jwUXBRjtPz4IS1vjPXzRt/74UyT X-Received: by 2002:a0c:e043:: with SMTP id y3-v6mr17207qvk.148.1527704295852; Wed, 30 May 2018 11:18:15 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1527704295; cv=none; d=google.com; s=arc-20160816; b=OkN2RCO3ab2R0VArTgfPtSBk2Y0wCwik9ykFi7TkbN6R+ZuKNrXUNxtj0RgV5NSj31 JfU/bUj1MpABoX2Ew0RbYkZpSM5P7+MjAmL6UZ024+XHVTJYUJSPtLv/8hrRp0+rfuGq onGfS3sul50lNM9ixDJJauo9mahS+Ygiz7Up3zFjmY0/wIwv8gGKSyL8mIXZxtqz4huU a/CijTNY2uKzNgDT/oJTOnVeNLJ3ybrRb6y9LgEFAyxXPOo8Dj9Q/5/YBiZuLnLo+Dl0 JqFcL5U/JxiLG8wp/aKoV9KE6Qi6bAqtBShfUIuB8e66Pm3bdLps9hHgrPL6KEQ8kt+1 rLyA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature:arc-authentication-results; bh=Ld2xrQU82H6bGici9pXYUtbus0NOEmAaNaAWCDL2/k4=; b=LIUlW8/1LqTI6GcAdgV4M0erBSTnuujLDaR0OfcjA58zVNdeGxNkAUQjVBgULsFNix HJcq5qZONZDC/pfRh9GoTyntholcc0DP2fxCH3bVZshZfIk3P6hwmrDfG0RPmGV0DEW1 dnn0q8BeV6JORKcYucqhNscA+hABRzWqCUHXEXelmls36JFIT9z/c6I9PafortosK07t APvx8MG9gLVubHGQ/CcHcfz/LKYAQoE6ygHZUunuKWsYPpOtaNQejneJ1xSjZ5vD4ftc LCf8WtitBp1B0SAVXmbkHzI399gnaEbxPOZDuaIyfdUBEyL/tQLNWPb6M9hEyQYcOUs6 KC4Q== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=UwC3TtjR; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom=qemu-devel-bounces+patch=linaro.org@nongnu.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [2001:4830:134:3::11]) by mx.google.com with ESMTPS id t23-v6si9942899qve.232.2018.05.30.11.18.15 for (version=TLS1 cipher=AES128-SHA bits=128/128); Wed, 30 May 2018 11:18:15 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) client-ip=2001:4830:134:3::11; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=UwC3TtjR; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom=qemu-devel-bounces+patch=linaro.org@nongnu.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:40142 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fO5fr-0005v8-AA for patch@linaro.org; Wed, 30 May 2018 14:18:15 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:33648) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fO5Px-00025y-18 for qemu-devel@nongnu.org; Wed, 30 May 2018 14:01:50 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fO5Pv-0004ll-2E for qemu-devel@nongnu.org; Wed, 30 May 2018 14:01:49 -0400 Received: from mail-pf0-x243.google.com ([2607:f8b0:400e:c00::243]:34865) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fO5Pu-0004kw-Mp for qemu-devel@nongnu.org; Wed, 30 May 2018 14:01:46 -0400 Received: by mail-pf0-x243.google.com with SMTP id x9-v6so9409864pfm.2 for ; Wed, 30 May 2018 11:01:46 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=Ld2xrQU82H6bGici9pXYUtbus0NOEmAaNaAWCDL2/k4=; b=UwC3TtjRNT8CSMvhw0uP/p/Zu772c6Jv5L5Y82fNkoNrhexKLA+exq4rGep2lpIQjB ddASwHK6V+PeEfSPnHPMtE3kfyBARG+vvUzmdiJt+dlfmeCh2uar6lhQzTJOPCgO85l4 Y1dHkWbfl+bKck43VhVjiSaf7zml2JmBayH6s= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=Ld2xrQU82H6bGici9pXYUtbus0NOEmAaNaAWCDL2/k4=; b=W3Xq/5WNR6enpxMjDNOUlhFfKc8jKeQdDXr7hpmc7QLgoVeqMr+PQPg1HHcefRzh62 juoEIvwtkslQcnd0eO2q4ntazo0RL81vXpkNjTalofWJ0ljt5tIbh37YIIaNXwBT5h6F m0tDb4tUF7VQkUz7KfLJ+sf0MSZ++8fRAjlahBx3hB6Dp+ub951DmPkBhqRsM3MPyEDr dT+zpScZ6hBB50SEGAtXtAnsdXcZyA8+er6oAATp05bkyI6UhKdmuieArp/EvBu6O7r3 2627vcoT0E/gtjs+V99ayoxQ6f19A4FyPLQgBEP5EC4XGNpsvFqGLzpaMUoU1SnahC7/ iJOA== X-Gm-Message-State: ALKqPwfSBUUJ4D2LxCAarK86i8A1/5FcTIC1mRSogzy6btlwyLu/7HZE 3263uN1jRjAvCn7/th7O41eRZwtqCsM= X-Received: by 2002:a62:1656:: with SMTP id 83-v6mr3686176pfw.61.1527703305005; Wed, 30 May 2018 11:01:45 -0700 (PDT) Received: from cloudburst.twiddle.net (97-126-112-211.tukw.qwest.net. [97.126.112.211]) by smtp.gmail.com with ESMTPSA id b84-v6sm28179157pfm.123.2018.05.30.11.01.43 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 30 May 2018 11:01:43 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 30 May 2018 11:01:16 -0700 Message-Id: <20180530180120.13355-15-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.0 In-Reply-To: <20180530180120.13355-1-richard.henderson@linaro.org> References: <20180530180120.13355-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c00::243 Subject: [Qemu-devel] [PATCH v3b 14/18] target/arm: Implement SVE Predicate Count Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 2 + target/arm/sve_helper.c | 14 ++++ target/arm/translate-sve.c | 132 +++++++++++++++++++++++++++++++++++++ target/arm/sve.decode | 27 ++++++++ 4 files changed, 175 insertions(+) -- 2.17.0 Reviewed-by: Peter Maydell diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index f0a3ed3414..dd4f8f754d 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -676,3 +676,5 @@ DEF_HELPER_FLAGS_4(sve_brkbs_m, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_brkn, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_brkns, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(sve_cntp, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 409ec0963f..8d1631ea3c 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -2722,3 +2722,17 @@ uint32_t HELPER(sve_brkns)(void *vd, void *vn, void *vg, uint32_t pred_desc) return do_zero(vd, oprsz); } } + +uint64_t HELPER(sve_cntp)(void *vn, void *vg, uint32_t pred_desc) +{ + intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + intptr_t esz = extract32(pred_desc, SIMD_DATA_SHIFT, 2); + uint64_t *n = vn, *g = vg, sum = 0, mask = pred_esz_masks[esz]; + intptr_t i; + + for (i = 0; i < DIV_ROUND_UP(oprsz, 8); ++i) { + uint64_t t = n[i] & g[i] & mask; + sum += ctpop64(t); + } + return sum; +} diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index d5f33cbfe9..e4815e4912 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -34,6 +34,9 @@ #include "translate-a64.h" +typedef void GVecGen2sFn(unsigned, uint32_t, uint32_t, + TCGv_i64, uint32_t, uint32_t); + typedef void gen_helper_gvec_flags_3(TCGv_i32, TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i32); typedef void gen_helper_gvec_flags_4(TCGv_i32, TCGv_ptr, TCGv_ptr, @@ -2949,6 +2952,135 @@ static bool trans_BRKN(DisasContext *s, arg_rpr_s *a, uint32_t insn) return do_brk2(s, a, gen_helper_sve_brkn, gen_helper_sve_brkns); } +/* + *** SVE Predicate Count Group + */ + +static void do_cntp(DisasContext *s, TCGv_i64 val, int esz, int pn, int pg) +{ + unsigned psz = pred_full_reg_size(s); + + if (psz <= 8) { + uint64_t psz_mask; + + tcg_gen_ld_i64(val, cpu_env, pred_full_reg_offset(s, pn)); + if (pn != pg) { + TCGv_i64 g = tcg_temp_new_i64(); + tcg_gen_ld_i64(g, cpu_env, pred_full_reg_offset(s, pg)); + tcg_gen_and_i64(val, val, g); + tcg_temp_free_i64(g); + } + + /* Reduce the pred_esz_masks value simply to reduce the + size of the code generated here. */ + psz_mask = deposit64(0, 0, psz * 8, -1); + tcg_gen_andi_i64(val, val, pred_esz_masks[esz] & psz_mask); + + tcg_gen_ctpop_i64(val, val); + } else { + TCGv_ptr t_pn = tcg_temp_new_ptr(); + TCGv_ptr t_pg = tcg_temp_new_ptr(); + unsigned desc; + TCGv_i32 t_desc; + + desc = psz - 2; + desc = deposit32(desc, SIMD_DATA_SHIFT, 2, esz); + + tcg_gen_addi_ptr(t_pn, cpu_env, pred_full_reg_offset(s, pn)); + tcg_gen_addi_ptr(t_pg, cpu_env, pred_full_reg_offset(s, pg)); + t_desc = tcg_const_i32(desc); + + gen_helper_sve_cntp(val, t_pn, t_pg, t_desc); + tcg_temp_free_ptr(t_pn); + tcg_temp_free_ptr(t_pg); + tcg_temp_free_i32(t_desc); + } +} + +static bool trans_CNTP(DisasContext *s, arg_CNTP *a, uint32_t insn) +{ + if (sve_access_check(s)) { + do_cntp(s, cpu_reg(s, a->rd), a->esz, a->rn, a->pg); + } + return true; +} + +static bool trans_INCDECP_r(DisasContext *s, arg_incdec_pred *a, + uint32_t insn) +{ + if (sve_access_check(s)) { + TCGv_i64 reg = cpu_reg(s, a->rd); + TCGv_i64 val = tcg_temp_new_i64(); + + do_cntp(s, val, a->esz, a->pg, a->pg); + if (a->d) { + tcg_gen_sub_i64(reg, reg, val); + } else { + tcg_gen_add_i64(reg, reg, val); + } + tcg_temp_free_i64(val); + } + return true; +} + +static bool trans_INCDECP_z(DisasContext *s, arg_incdec2_pred *a, + uint32_t insn) +{ + if (a->esz == 0) { + return false; + } + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + TCGv_i64 val = tcg_temp_new_i64(); + GVecGen2sFn *gvec_fn = a->d ? tcg_gen_gvec_subs : tcg_gen_gvec_adds; + + do_cntp(s, val, a->esz, a->pg, a->pg); + gvec_fn(a->esz, vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), val, vsz, vsz); + } + return true; +} + +static bool trans_SINCDECP_r_32(DisasContext *s, arg_incdec_pred *a, + uint32_t insn) +{ + if (sve_access_check(s)) { + TCGv_i64 reg = cpu_reg(s, a->rd); + TCGv_i64 val = tcg_temp_new_i64(); + + do_cntp(s, val, a->esz, a->pg, a->pg); + do_sat_addsub_32(reg, val, a->u, a->d); + } + return true; +} + +static bool trans_SINCDECP_r_64(DisasContext *s, arg_incdec_pred *a, + uint32_t insn) +{ + if (sve_access_check(s)) { + TCGv_i64 reg = cpu_reg(s, a->rd); + TCGv_i64 val = tcg_temp_new_i64(); + + do_cntp(s, val, a->esz, a->pg, a->pg); + do_sat_addsub_64(reg, val, a->u, a->d); + } + return true; +} + +static bool trans_SINCDECP_z(DisasContext *s, arg_incdec2_pred *a, + uint32_t insn) +{ + if (a->esz == 0) { + return false; + } + if (sve_access_check(s)) { + TCGv_i64 val = tcg_temp_new_i64(); + do_cntp(s, val, a->esz, a->pg, a->pg); + do_sat_addsub_vec(s, a->esz, a->rd, a->rn, val, a->u, a->d); + } + return true; +} + /* *** SVE Memory - 32-bit Gather and Unsized Contiguous Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 66e1ee6b5c..62d51c252b 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -67,6 +67,8 @@ &ptrue rd esz pat s &incdec_cnt rd pat esz imm d u &incdec2_cnt rd rn pat esz imm d u +&incdec_pred rd pg esz d u +&incdec2_pred rd rn pg esz d u ########################################################################### # Named instruction formats. These are generally used to @@ -113,6 +115,7 @@ # One register operand, with governing predicate, vector element size @rd_pg_rn ........ esz:2 ... ... ... pg:3 rn:5 rd:5 &rpr_esz +@rd_pg4_pn ........ esz:2 ... ... .. pg:4 . rn:4 rd:5 &rpr_esz # Two register operands with a 6-bit signed immediate. @rd_rn_i6 ........ ... rn:5 ..... imm:s6 rd:5 &rri @@ -153,6 +156,12 @@ @incdec2_cnt ........ esz:2 .. .... ...... pat:5 rd:5 \ &incdec2_cnt imm=%imm4_16_p1 rn=%reg_movprfx +# One register, predicate. +# User must fill in U and D. +@incdec_pred ........ esz:2 .... .. ..... .. pg:4 rd:5 &incdec_pred +@incdec2_pred ........ esz:2 .... .. ..... .. pg:4 rd:5 \ + &incdec2_pred rn=%reg_movprfx + ########################################################################### # Instruction patterns. Grouped according to the SVE encodingindex.xhtml. @@ -579,6 +588,24 @@ BRKB_m 00100101 1. 01000001 .... 0 .... 1 .... @pd_pg_pn_s # SVE propagate break to next partition BRKN 00100101 0. 01100001 .... 0 .... 0 .... @pd_pg_pn_s +### SVE Predicate Count Group + +# SVE predicate count +CNTP 00100101 .. 100 000 10 .... 0 .... ..... @rd_pg4_pn + +# SVE inc/dec register by predicate count +INCDECP_r 00100101 .. 10110 d:1 10001 00 .... ..... @incdec_pred u=1 + +# SVE inc/dec vector by predicate count +INCDECP_z 00100101 .. 10110 d:1 10000 00 .... ..... @incdec2_pred u=1 + +# SVE saturating inc/dec register by predicate count +SINCDECP_r_32 00100101 .. 1010 d:1 u:1 10001 00 .... ..... @incdec_pred +SINCDECP_r_64 00100101 .. 1010 d:1 u:1 10001 10 .... ..... @incdec_pred + +# SVE saturating inc/dec vector by predicate count +SINCDECP_z 00100101 .. 1010 d:1 u:1 10000 00 .... ..... @incdec2_pred + ### SVE Memory - 32-bit Gather and Unsized Contiguous Group # SVE load predicate register From patchwork Wed May 30 18:01:17 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 137276 Delivered-To: patch@linaro.org Received: by 2002:a2e:9706:0:0:0:0:0 with SMTP id r6-v6csp5640064lji; Wed, 30 May 2018 11:13:21 -0700 (PDT) X-Google-Smtp-Source: ADUXVKJ/PuriLjj3nJBwhVMvxwDnkJJGaNjZutFrXWn8OyF9L/fhVCu3t5zKj+2HeIur62nfrQXY X-Received: by 2002:ac8:70c3:: with SMTP id g3-v6mr3482165qtp.95.1527704001013; Wed, 30 May 2018 11:13:21 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1527704001; cv=none; d=google.com; s=arc-20160816; b=cHhAUTDmsW/pxgBLvDzj85/HzAvzcTTLCiSjx/7Vie9+TCNRiT+B+eq5v4c4R9F4bX JXAU8RSBAV+qg/9/GQwPVIjfVgH9ZDp4QDEkW3oY3eSveE46zcYR4c2XQbe2/I/hKxa8 So4WPFSCyVoGo6wPxX4oiHo6dFG0JfUmbsPKObNfq1Hxo/JB8r9+qaAkrPeTgHvx/oVx 2Qrr0rEevaJ6aI/QaWCX9B9hD1adU0OYDP+XsR+HiMCduUZu3UHWT8+piQevCQNSQXAv h32t0JVuZ6686RC8/hJUIpoCzjEJ1FcP8PdIn0QmIaxUBdnQcY49CyrbXpjh7NWQBU9k 8z3A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature:arc-authentication-results; bh=aA/5S2B3GjqKGo0kl9R3P8OKdMyH3ZHUd87Da/paKlg=; b=x+Fr2k4BvUB+ETr00Lu0XMLY5OvxHbV0767qgXyD2v71lRP8Vm4hMw5y4USmkEIUN9 W8XnR/2a45hOIbAFsMSM/rrVIEB5T3p0lbhZEMVRWp0NOlFQfhC/d6JwIO9uOXURbGdC kC3lFaoX9tt0hGyL2h6xiJcUvGvoDhGc3xGpKiQPkO5H4TeunOxkJVUv2qbz+xc5/GLb uWH93ybHJV50EfSUSy/L6GUySaHbFpTYeVeIe4kT+udmW3pdNEYzkoglox3+9m+dYo2W NSGMVFabfNWs50AA5j7TcAOHdut/3aIqWoCnD4UZBPjO8fKuluOBhTp7afn6M5iufKzO ygbA== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=afMY4qfZ; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom=qemu-devel-bounces+patch=linaro.org@nongnu.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [2001:4830:134:3::11]) by mx.google.com with ESMTPS id l2-v6si7125367qvl.34.2018.05.30.11.13.20 for (version=TLS1 cipher=AES128-SHA bits=128/128); Wed, 30 May 2018 11:13:21 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) client-ip=2001:4830:134:3::11; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=afMY4qfZ; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom=qemu-devel-bounces+patch=linaro.org@nongnu.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:40112 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fO5b6-0002aj-FO for patch@linaro.org; Wed, 30 May 2018 14:13:20 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:33758) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fO5Q1-0002Ap-7F for qemu-devel@nongnu.org; Wed, 30 May 2018 14:01:59 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fO5Pw-0004o3-O0 for qemu-devel@nongnu.org; Wed, 30 May 2018 14:01:53 -0400 Received: from mail-pf0-x22a.google.com ([2607:f8b0:400e:c00::22a]:44035) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fO5Pw-0004nJ-Gh for qemu-devel@nongnu.org; Wed, 30 May 2018 14:01:48 -0400 Received: by mail-pf0-x22a.google.com with SMTP id q22-v6so9403936pff.11 for ; Wed, 30 May 2018 11:01:48 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=aA/5S2B3GjqKGo0kl9R3P8OKdMyH3ZHUd87Da/paKlg=; b=afMY4qfZxvqQvTpb00F2BJkqB7b30V/b04qbLBZWR9LXoy+2Nst7zVwfZZGJHwJ/c0 MyMkjzd6kInkOg+4KrIuoji+IFBfLQQRPy1DUSzys3aJ0g0FDah6bobVkyvgbuz+JLdI wejRE0BlSxfqi70lqEJG7WkswmrTZZDnSz9cc= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=aA/5S2B3GjqKGo0kl9R3P8OKdMyH3ZHUd87Da/paKlg=; b=IVRg8/rFF4h0J3q1ZwDTHNC2Z3PwNPHVb7uBA4Lcj9rXny+tl4ISyuMNMM2IpmG0gI lxP+IuLBXb9DWdsyzMwboZjvjlk4dkcu7vjkrin8kD/AvkCI4TyWi5UpLCxlydpRRYiA D6efxTGI5rqW9iZ7qVZn2XPiMYDHi/fP/Z8zcUiSpz9B0QmCGApxqr+W0fz/TpQd02is g4EG2sNow8Zi4wWi/UgF/hLAs1XVhVhAEuwaNOEsEnLZlfNLw2KMtL2U7QAZnCgYSTgh 6FujpmkMdYksIRxThemRthlLMrvmPuIpfn4YGmN5dHs09VyBCiGztAUv5LrWNmizUkh2 jvsA== X-Gm-Message-State: ALKqPwc8XzbsF0KgA7qgNmiu+ZO45rLqYJ/efB8hyJcX2Hy7h1cuFGMo 6TNitdQj7wSjE0pTBRYneTsRrRyDRx8= X-Received: by 2002:a62:5841:: with SMTP id m62-v6mr3657555pfb.116.1527703306707; Wed, 30 May 2018 11:01:46 -0700 (PDT) Received: from cloudburst.twiddle.net (97-126-112-211.tukw.qwest.net. [97.126.112.211]) by smtp.gmail.com with ESMTPSA id b84-v6sm28179157pfm.123.2018.05.30.11.01.45 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 30 May 2018 11:01:45 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 30 May 2018 11:01:17 -0700 Message-Id: <20180530180120.13355-16-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.0 In-Reply-To: <20180530180120.13355-1-richard.henderson@linaro.org> References: <20180530180120.13355-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c00::22a Subject: [Qemu-devel] [PATCH v3b 15/18] target/arm: Implement SVE Integer Compare - Scalars Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 2 + target/arm/sve_helper.c | 31 +++++++++++ target/arm/translate-sve.c | 102 +++++++++++++++++++++++++++++++++++++ target/arm/sve.decode | 8 +++ 4 files changed, 143 insertions(+) -- 2.17.0 diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index dd4f8f754d..1863106d0f 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -678,3 +678,5 @@ DEF_HELPER_FLAGS_4(sve_brkn, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_brkns, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve_cntp, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(sve_while, TCG_CALL_NO_RWG, i32, ptr, i32, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 8d1631ea3c..a65a06cc9e 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -2736,3 +2736,34 @@ uint64_t HELPER(sve_cntp)(void *vn, void *vg, uint32_t pred_desc) } return sum; } + +uint32_t HELPER(sve_while)(void *vd, uint32_t count, uint32_t pred_desc) +{ + uintptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + intptr_t esz = extract32(pred_desc, SIMD_DATA_SHIFT, 2); + uint64_t esz_mask = pred_esz_masks[esz]; + ARMPredicateReg *d = vd; + uint32_t flags; + intptr_t i; + + /* Begin with a zero predicate register. */ + flags = do_zero(d, oprsz); + if (count == 0) { + return flags; + } + + /* Scale from predicate element count to bits. */ + count <<= esz; + /* Bound to the bits in the predicate. */ + count = MIN(count, oprsz * 8); + + /* Set all of the requested bits. */ + for (i = 0; i < count / 64; ++i) { + d->p[i] = esz_mask; + } + if (count & 63) { + d->p[i] = ~(-1ull << (count & 63)) & esz_mask; + } + + return predtest_ones(d, oprsz, esz_mask); +} diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index e4815e4912..75eb36f110 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3081,6 +3081,108 @@ static bool trans_SINCDECP_z(DisasContext *s, arg_incdec2_pred *a, return true; } +/* + *** SVE Integer Compare Scalars Group + */ + +static bool trans_CTERM(DisasContext *s, arg_CTERM *a, uint32_t insn) +{ + if (!sve_access_check(s)) { + return true; + } + + TCGCond cond = (a->ne ? TCG_COND_NE : TCG_COND_EQ); + TCGv_i64 rn = read_cpu_reg(s, a->rn, a->sf); + TCGv_i64 rm = read_cpu_reg(s, a->rm, a->sf); + TCGv_i64 cmp = tcg_temp_new_i64(); + + tcg_gen_setcond_i64(cond, cmp, rn, rm); + tcg_gen_extrl_i64_i32(cpu_NF, cmp); + tcg_temp_free_i64(cmp); + + /* VF = !NF & !CF. */ + tcg_gen_xori_i32(cpu_VF, cpu_NF, 1); + tcg_gen_andc_i32(cpu_VF, cpu_VF, cpu_CF); + + /* Both NF and VF actually look at bit 31. */ + tcg_gen_neg_i32(cpu_NF, cpu_NF); + tcg_gen_neg_i32(cpu_VF, cpu_VF); + return true; +} + +static bool trans_WHILE(DisasContext *s, arg_WHILE *a, uint32_t insn) +{ + if (!sve_access_check(s)) { + return true; + } + + TCGv_i64 op0 = read_cpu_reg(s, a->rn, 1); + TCGv_i64 op1 = read_cpu_reg(s, a->rm, 1); + TCGv_i64 t0 = tcg_temp_new_i64(); + TCGv_i64 t1 = tcg_temp_new_i64(); + TCGv_i32 t2, t3; + TCGv_ptr ptr; + unsigned desc, vsz = vec_full_reg_size(s); + TCGCond cond; + + if (!a->sf) { + if (a->u) { + tcg_gen_ext32u_i64(op0, op0); + tcg_gen_ext32u_i64(op1, op1); + } else { + tcg_gen_ext32s_i64(op0, op0); + tcg_gen_ext32s_i64(op1, op1); + } + } + + /* For the helper, compress the different conditions into a computation + * of how many iterations for which the condition is true. + * + * This is slightly complicated by 0 <= UINT64_MAX, which is nominally + * 2**64 iterations, overflowing to 0. Of course, predicate registers + * aren't that large, so any value >= predicate size is sufficient. + */ + tcg_gen_sub_i64(t0, op1, op0); + + /* t0 = MIN(op1 - op0, vsz). */ + if (a->eq) { + /* Equality means one more iteration. */ + tcg_gen_movi_i64(t1, vsz - 1); + tcg_gen_movcond_i64(TCG_COND_LTU, t0, t0, t1, t0, t1); + tcg_gen_addi_i64(t0, t0, 1); + } else { + tcg_gen_movi_i64(t1, vsz); + tcg_gen_movcond_i64(TCG_COND_LTU, t0, t0, t1, t0, t1); + } + + /* t0 = (condition true ? t0 : 0). */ + cond = (a->u + ? (a->eq ? TCG_COND_LEU : TCG_COND_LTU) + : (a->eq ? TCG_COND_LE : TCG_COND_LT)); + tcg_gen_movi_i64(t1, 0); + tcg_gen_movcond_i64(cond, t0, op0, op1, t0, t1); + + t2 = tcg_temp_new_i32(); + tcg_gen_extrl_i64_i32(t2, t0); + tcg_temp_free_i64(t0); + tcg_temp_free_i64(t1); + + desc = (vsz / 8) - 2; + desc = deposit32(desc, SIMD_DATA_SHIFT, 2, a->esz); + t3 = tcg_const_i32(desc); + + ptr = tcg_temp_new_ptr(); + tcg_gen_addi_ptr(ptr, cpu_env, pred_full_reg_offset(s, a->rd)); + + gen_helper_sve_while(t2, ptr, t2, t3); + do_pred_flags(t2); + + tcg_temp_free_ptr(ptr); + tcg_temp_free_i32(t2); + tcg_temp_free_i32(t3); + return true; +} + /* *** SVE Memory - 32-bit Gather and Unsized Contiguous Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 62d51c252b..4b718060a9 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -606,6 +606,14 @@ SINCDECP_r_64 00100101 .. 1010 d:1 u:1 10001 10 .... ..... @incdec_pred # SVE saturating inc/dec vector by predicate count SINCDECP_z 00100101 .. 1010 d:1 u:1 10000 00 .... ..... @incdec2_pred +### SVE Integer Compare - Scalars Group + +# SVE conditionally terminate scalars +CTERM 00100101 1 sf:1 1 rm:5 001000 rn:5 ne:1 0000 + +# SVE integer compare scalar count and limit +WHILE 00100101 esz:2 1 rm:5 000 sf:1 u:1 1 rn:5 eq:1 rd:4 + ### SVE Memory - 32-bit Gather and Unsized Contiguous Group # SVE load predicate register From patchwork Wed May 30 18:01:18 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 137278 Delivered-To: patch@linaro.org Received: by 2002:a2e:9706:0:0:0:0:0 with SMTP id r6-v6csp5640950lji; Wed, 30 May 2018 11:14:14 -0700 (PDT) X-Google-Smtp-Source: ADUXVKIYEfgMXm5PgC1JPHi/K9sPzcGAlClN2ynPAw+UoYluKm0Zh2usps6076+zKbhtAf4Vv6NS X-Received: by 2002:a37:140f:: with SMTP id e15-v6mr3475215qkh.117.1527704054855; Wed, 30 May 2018 11:14:14 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1527704054; cv=none; d=google.com; s=arc-20160816; b=idX7W7Gve27qa3cU2RdrAAa0lhNIq0Qsuk98ec2I7ATIGgS0b1892dsU6/ygSF5Ps0 e+tnPCFdY7gHNvdxvyPYZRR51ODel1h/D4m5JoMjrYX0Xb9/ASRkHOc3ZtYEn9MRnx9s f8cVOdFeew2kWxw4QNYLj6bNgpzxsxNVoyE5mCpArVt0y4KP/bNiWYeSo2koIua/ob5q aruhpoavSHma8fZMB19h625gswKZq2OfLE0XS402is/kyJCN00QH4z/w3sDTliA8A2gM 1arTMgerbG85VPBu/X5Gz4aZXrK0wRRj9pblWGNBxsRHSzN4lYtHMxJYO5HN7ougbqkn iGnQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature:arc-authentication-results; bh=KjHaYoglkEEDpj+nxjmOWg1p1VMDQ/DgG8iObr4ysR8=; b=UyOtC+DbNAPlX2/5XPep5km3rJGmOHlAn1gmqZCviipRaM9l7AXIM8QczWWPY3V/p2 00wWkBEt+Ep9i8E5r4Yqb5g50sos8tp8kMIIV/obHVBOZl9vt3NTaNh3jD1LQeGRUd7Q mhJbGE4aCCt4R7ab6OrzuaoOubJdfA41J09CWAlO9MRK65b5aN8ZpZl4k9m3iM8mg/kx hamCrRK9nnNeGgZ6LuVQcqUO2BrbxSaMRDx7Y1N9I4wWQuIaT+nlGpo7knru5AaoK445 d8GyiRLAUaUy6Rs3jo83CRKTmdCTUzl5gFk3J2VSplXyL+u0JMfmKQ/2mmTp69w6cd0o RVsQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=L7GWtWiV; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom=qemu-devel-bounces+patch=linaro.org@nongnu.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [2001:4830:134:3::11]) by mx.google.com with ESMTPS id k67-v6si423773qke.84.2018.05.30.11.14.14 for (version=TLS1 cipher=AES128-SHA bits=128/128); Wed, 30 May 2018 11:14:14 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) client-ip=2001:4830:134:3::11; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=L7GWtWiV; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom=qemu-devel-bounces+patch=linaro.org@nongnu.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:40121 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fO5by-0003J4-B6 for patch@linaro.org; Wed, 30 May 2018 14:14:14 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:33764) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fO5Q1-0002B9-J8 for qemu-devel@nongnu.org; Wed, 30 May 2018 14:01:54 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fO5Px-0004on-LL for qemu-devel@nongnu.org; Wed, 30 May 2018 14:01:53 -0400 Received: from mail-pf0-x243.google.com ([2607:f8b0:400e:c00::243]:37307) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fO5Px-0004oC-E3 for qemu-devel@nongnu.org; Wed, 30 May 2018 14:01:49 -0400 Received: by mail-pf0-x243.google.com with SMTP id e9-v6so9408629pfi.4 for ; Wed, 30 May 2018 11:01:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=KjHaYoglkEEDpj+nxjmOWg1p1VMDQ/DgG8iObr4ysR8=; b=L7GWtWiVYPk9CDmUHdrzAlCUExEWGKPkp3KkfvnEk1tSSqkl2enwZPpmcR+a+DZB11 TON2xn1QiTNQQshRPVIv6cN6fY5kC9V3q6I+w7sm7+3BdzsykTsBcHcD2+RX8QQ75g84 vxXppW9QOs8SkeV/n1yu7RY0qxFUNaJoyv4pU= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=KjHaYoglkEEDpj+nxjmOWg1p1VMDQ/DgG8iObr4ysR8=; b=hOoE+ox6PQ2ed0WdHRVlGmDqu+6IsSyLauVud+eEt520RA9yKJvk8jq7wZkgJsJrxe 0jsibRbZCjzR4WxCUqTpDl2yka+vSC3RFu5IyyC95OH22XyTU19HDIti+Rq86mdTlT7f GjELrMeXoSDi1ivXfsWY4Yka3ghT66hOoQk1L0Xuq6ySO1uvxR5Hl6KFx3nF/YMWInCF Vyau5gwbC935LFDOh/lhCECOckAZVwohy3ffI4wJGCSs7DFp7cxxSBjuNrUUzRj6sxKh lrLUR1ytvQLVmmAW7Fffw0jyfEi/gTcs3nth/FqWGOQutJNfeheFUzl2VPJCXOFIVe+C VG6Q== X-Gm-Message-State: ALKqPwelXswx3LNRVDIG7i8yHy1lkEkaLJv79T8+Az/sjSjjU+Zm6fKn P2X+L9x0hIsQZCibEcLpQdBVcc/YH08= X-Received: by 2002:a62:1bc2:: with SMTP id b185-v6mr215234pfb.225.1527703308041; Wed, 30 May 2018 11:01:48 -0700 (PDT) Received: from cloudburst.twiddle.net (97-126-112-211.tukw.qwest.net. [97.126.112.211]) by smtp.gmail.com with ESMTPSA id b84-v6sm28179157pfm.123.2018.05.30.11.01.46 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 30 May 2018 11:01:47 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 30 May 2018 11:01:18 -0700 Message-Id: <20180530180120.13355-17-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.0 In-Reply-To: <20180530180120.13355-1-richard.henderson@linaro.org> References: <20180530180120.13355-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c00::243 Subject: [Qemu-devel] [PATCH v3b 16/18] target/arm: Implement FDUP/DUP X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/translate-sve.c | 37 +++++++++++++++++++++++++++++++++++++ target/arm/sve.decode | 8 ++++++++ 2 files changed, 45 insertions(+) -- 2.17.0 Reviewed-by: Peter Maydell diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 75eb36f110..e470f3b745 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3183,6 +3183,43 @@ static bool trans_WHILE(DisasContext *s, arg_WHILE *a, uint32_t insn) return true; } +/* + *** SVE Integer Wide Immediate - Unpredicated Group + */ + +static bool trans_FDUP(DisasContext *s, arg_FDUP *a, uint32_t insn) +{ + if (a->esz == 0) { + return false; + } + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + int dofs = vec_full_reg_offset(s, a->rd); + uint64_t imm; + + /* Decode the VFP immediate. */ + imm = vfp_expand_imm(a->esz, a->imm); + imm = dup_const(a->esz, imm); + + tcg_gen_gvec_dup64i(dofs, vsz, vsz, imm); + } + return true; +} + +static bool trans_DUP_i(DisasContext *s, arg_DUP_i *a, uint32_t insn) +{ + if (a->esz == 0 && extract32(insn, 13, 1)) { + return false; + } + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + int dofs = vec_full_reg_offset(s, a->rd); + + tcg_gen_gvec_dup64i(dofs, vsz, vsz, dup_const(a->esz, a->imm)); + } + return true; +} + /* *** SVE Memory - 32-bit Gather and Unsized Contiguous Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 4b718060a9..b8bd22aff7 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -614,6 +614,14 @@ CTERM 00100101 1 sf:1 1 rm:5 001000 rn:5 ne:1 0000 # SVE integer compare scalar count and limit WHILE 00100101 esz:2 1 rm:5 000 sf:1 u:1 1 rn:5 eq:1 rd:4 +### SVE Integer Wide Immediate - Unpredicated Group + +# SVE broadcast floating-point immediate (unpredicated) +FDUP 00100101 esz:2 111 00 1110 imm:8 rd:5 + +# SVE broadcast integer immediate (unpredicated) +DUP_i 00100101 esz:2 111 00 011 . ........ rd:5 imm=%sh8_i8s + ### SVE Memory - 32-bit Gather and Unsized Contiguous Group # SVE load predicate register From patchwork Wed May 30 18:01:19 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 137271 Delivered-To: patch@linaro.org Received: by 2002:a2e:9706:0:0:0:0:0 with SMTP id r6-v6csp5635520lji; Wed, 30 May 2018 11:08:50 -0700 (PDT) X-Google-Smtp-Source: ADUXVKKaHbPVTYx73it7s3KzoLg5Bz/mnd7KNpgCs9H1ybA17kTXqdQFJ2DWJMTvr2Ael/Gi022H X-Received: by 2002:aed:21c5:: with SMTP id m5-v6mr3772338qtc.211.1527703730264; Wed, 30 May 2018 11:08:50 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1527703730; cv=none; d=google.com; s=arc-20160816; b=XgSyitF7nv/10AoZpU+uDHiGxlo6m7c0O0g8+3Nv6C80BMKIU6xZm+WTndWTk/w4Ho 2t55zX0ejZkLCvOq6dyEy7zW4iJ8DDB8M136LMgzlLyhcVH5IoXVyqaTRspS5TKwy2tK d/Gz2qGWkDAV3/WsOO2/9RtyABC0KAzFUnhwyI+RYRQVgJLGgxO7XtX2uQzZaVc24FmM EscvUkeFzHHgibkQ1vpuug0TNT1PeZp9Uk+hn73SJiKRQ4oM3cAub9LFtxBCGHcISdR7 WpWrzFZ9dyc3nUuO7zqwkE80IxBtMoBO0SlaW3Rf7osmiBf/Xdz7HOKIaYSc/nEMPuFm Rzyw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature:arc-authentication-results; bh=l6E4fCqs1LmsFH2k1FgAVaHC3A49aq3TFkBVKssphx0=; b=Asava4TEka/8M7Cud9bMe8siQTSLDECfv4O9zNm6EosBldLd5Spdu1z/DAiefEcYmf xyZexvedikcrlgROai/A7nJaTEPRX0mME+kiMHIQ/yIS9Iu4WyEJBFqRxDifohzrnGom vLkr2MxTk9KO03+5ULEayGORYIYa1+YA5LluccxK6glB1tnTlh53Sbpu+tw3Rnu5ZXTO 47qT9wRuUG4eoLbNAeg1NHsmG4cDdRPZu6+yaJkPaROgFzLuZ9/pZKbM8gC0pqclqEDj fO5xldambFh1CCn58ogiUwQ1EMIx9M2GCTQD0tYKdSo9vn0cKCZhF0y10BQ+6sg6wwz3 bGJw== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=RvqrnFis; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom=qemu-devel-bounces+patch=linaro.org@nongnu.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [2001:4830:134:3::11]) by mx.google.com with ESMTPS id i27-v6si3712255qtc.138.2018.05.30.11.08.50 for (version=TLS1 cipher=AES128-SHA bits=128/128); Wed, 30 May 2018 11:08:50 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) client-ip=2001:4830:134:3::11; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=RvqrnFis; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom=qemu-devel-bounces+patch=linaro.org@nongnu.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:40072 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fO5Wj-0006N1-LT for patch@linaro.org; Wed, 30 May 2018 14:08:49 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:33782) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fO5Q2-0002Bs-4l for qemu-devel@nongnu.org; Wed, 30 May 2018 14:01:55 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fO5Pz-0004qx-E1 for qemu-devel@nongnu.org; Wed, 30 May 2018 14:01:54 -0400 Received: from mail-pg0-x229.google.com ([2607:f8b0:400e:c05::229]:35959) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fO5Pz-0004qQ-6Z for qemu-devel@nongnu.org; Wed, 30 May 2018 14:01:51 -0400 Received: by mail-pg0-x229.google.com with SMTP id m5-v6so4115976pgd.3 for ; Wed, 30 May 2018 11:01:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=l6E4fCqs1LmsFH2k1FgAVaHC3A49aq3TFkBVKssphx0=; b=RvqrnFisIS4tM7cq+RPMdiNU7qjidp9G2yTPaoipr4dXS08x/hHPNmVaqJ86/Yz81m qUFpflAvMnqEaOtPFrUNC3u0vyigQ7ZJxrjKdpCbbOA/MbI3rfsull2RDZoOTQic7O5s iC4edLUOaEINpuAW/IkriomDQsif2SK4bssRY= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=l6E4fCqs1LmsFH2k1FgAVaHC3A49aq3TFkBVKssphx0=; b=b7FwmemPfkZyRpzUa15nVfRv2c9PBUFn4Cq7QcwtTUEDSrHWAKaFdloPExycPIqUwc CazyFw+gXXBiPoo5yxeG73GlWtUjjJVJ+cWmjlznw/la6IkqYAc2TOPmGInRxswT5Gxv +uD6Sjj8MNZL0esk0T2mOBH4c4mEj41aOqaF/Rljv5ZcxaYcQXYP2zv+MY25lOwOHCpN WS7zmfjBdRu+agTCwllBSlSg8FAST+TdZTY+VbKlhG1PpQQ6/I75pGL1iFNsOqwoQpDX o57iloE25IzWxl0ENYI9ois4HclbqShQkT0YoL1057dvGVWw6V46QVXRB2hFPmgVQZc6 KU5Q== X-Gm-Message-State: ALKqPwcttAQAkEcpu4bkBZ1qXiFD0E2b2xAXfh4yRdqSIniwftKMwXCy p4cSIvEWqxaXSEnjCcmBj90yJLJijKE= X-Received: by 2002:aa7:8609:: with SMTP id p9-v6mr3670313pfn.123.1527703309484; Wed, 30 May 2018 11:01:49 -0700 (PDT) Received: from cloudburst.twiddle.net (97-126-112-211.tukw.qwest.net. [97.126.112.211]) by smtp.gmail.com with ESMTPSA id b84-v6sm28179157pfm.123.2018.05.30.11.01.48 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 30 May 2018 11:01:48 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 30 May 2018 11:01:19 -0700 Message-Id: <20180530180120.13355-18-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.0 In-Reply-To: <20180530180120.13355-1-richard.henderson@linaro.org> References: <20180530180120.13355-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c05::229 Subject: [Qemu-devel] [PATCH v3b 17/18] target/arm: Implement SVE Integer Wide Immediate - Unpredicated Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 25 +++++++ target/arm/sve_helper.c | 41 +++++++++++ target/arm/translate-sve.c | 144 +++++++++++++++++++++++++++++++++++++ target/arm/sve.decode | 26 +++++++ 4 files changed, 236 insertions(+) -- 2.17.0 Reviewed-by: Peter Maydell diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 1863106d0f..97bfe0f47b 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -680,3 +680,28 @@ DEF_HELPER_FLAGS_4(sve_brkns, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve_cntp, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve_while, TCG_CALL_NO_RWG, i32, ptr, i32, i32) + +DEF_HELPER_FLAGS_4(sve_subri_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(sve_subri_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(sve_subri_s, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(sve_subri_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) + +DEF_HELPER_FLAGS_4(sve_smaxi_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(sve_smaxi_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(sve_smaxi_s, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(sve_smaxi_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) + +DEF_HELPER_FLAGS_4(sve_smini_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(sve_smini_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(sve_smini_s, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(sve_smini_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) + +DEF_HELPER_FLAGS_4(sve_umaxi_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(sve_umaxi_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(sve_umaxi_s, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(sve_umaxi_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) + +DEF_HELPER_FLAGS_4(sve_umini_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(sve_umini_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(sve_umini_s, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(sve_umini_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index a65a06cc9e..ade954bfae 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -804,6 +804,46 @@ DO_VPZ_D(sve_uminv_d, uint64_t, uint64_t, -1, DO_MIN) #undef DO_VPZ #undef DO_VPZ_D +/* Two vector operand, one scalar operand, unpredicated. */ +#define DO_ZZI(NAME, TYPE, OP) \ +void HELPER(NAME)(void *vd, void *vn, uint64_t s64, uint32_t desc) \ +{ \ + intptr_t i, opr_sz = simd_oprsz(desc) / sizeof(TYPE); \ + TYPE s = s64, *d = vd, *n = vn; \ + for (i = 0; i < opr_sz; ++i) { \ + d[i] = OP(n[i], s); \ + } \ +} + +#define DO_SUBR(X, Y) (Y - X) + +DO_ZZI(sve_subri_b, uint8_t, DO_SUBR) +DO_ZZI(sve_subri_h, uint16_t, DO_SUBR) +DO_ZZI(sve_subri_s, uint32_t, DO_SUBR) +DO_ZZI(sve_subri_d, uint64_t, DO_SUBR) + +DO_ZZI(sve_smaxi_b, int8_t, DO_MAX) +DO_ZZI(sve_smaxi_h, int16_t, DO_MAX) +DO_ZZI(sve_smaxi_s, int32_t, DO_MAX) +DO_ZZI(sve_smaxi_d, int64_t, DO_MAX) + +DO_ZZI(sve_smini_b, int8_t, DO_MIN) +DO_ZZI(sve_smini_h, int16_t, DO_MIN) +DO_ZZI(sve_smini_s, int32_t, DO_MIN) +DO_ZZI(sve_smini_d, int64_t, DO_MIN) + +DO_ZZI(sve_umaxi_b, uint8_t, DO_MAX) +DO_ZZI(sve_umaxi_h, uint16_t, DO_MAX) +DO_ZZI(sve_umaxi_s, uint32_t, DO_MAX) +DO_ZZI(sve_umaxi_d, uint64_t, DO_MAX) + +DO_ZZI(sve_umini_b, uint8_t, DO_MIN) +DO_ZZI(sve_umini_h, uint16_t, DO_MIN) +DO_ZZI(sve_umini_s, uint32_t, DO_MIN) +DO_ZZI(sve_umini_d, uint64_t, DO_MIN) + +#undef DO_ZZI + #undef DO_AND #undef DO_ORR #undef DO_EOR @@ -818,6 +858,7 @@ DO_VPZ_D(sve_uminv_d, uint64_t, uint64_t, -1, DO_MIN) #undef DO_ASR #undef DO_LSR #undef DO_LSL +#undef DO_SUBR /* Similar to the ARM LastActiveElement pseudocode function, except the result is multiplied by the element size. This includes the not found diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index e470f3b745..78e74047c8 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -77,6 +77,11 @@ static inline int expand_imm_sh8s(int x) return (int8_t)x << (x & 0x100 ? 8 : 0); } +static inline int expand_imm_sh8u(int x) +{ + return (uint8_t)x << (x & 0x100 ? 8 : 0); +} + /* * Include the generated decoder. */ @@ -3220,6 +3225,145 @@ static bool trans_DUP_i(DisasContext *s, arg_DUP_i *a, uint32_t insn) return true; } +static bool trans_ADD_zzi(DisasContext *s, arg_rri_esz *a, uint32_t insn) +{ + if (a->esz == 0 && extract32(insn, 13, 1)) { + return false; + } + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + tcg_gen_gvec_addi(a->esz, vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), a->imm, vsz, vsz); + } + return true; +} + +static bool trans_SUB_zzi(DisasContext *s, arg_rri_esz *a, uint32_t insn) +{ + a->imm = -a->imm; + return trans_ADD_zzi(s, a, insn); +} + +static bool trans_SUBR_zzi(DisasContext *s, arg_rri_esz *a, uint32_t insn) +{ + static const GVecGen2s op[4] = { + { .fni8 = tcg_gen_vec_sub8_i64, + .fniv = tcg_gen_sub_vec, + .fno = gen_helper_sve_subri_b, + .opc = INDEX_op_sub_vec, + .vece = MO_8, + .scalar_first = true }, + { .fni8 = tcg_gen_vec_sub16_i64, + .fniv = tcg_gen_sub_vec, + .fno = gen_helper_sve_subri_h, + .opc = INDEX_op_sub_vec, + .vece = MO_16, + .scalar_first = true }, + { .fni4 = tcg_gen_sub_i32, + .fniv = tcg_gen_sub_vec, + .fno = gen_helper_sve_subri_s, + .opc = INDEX_op_sub_vec, + .vece = MO_32, + .scalar_first = true }, + { .fni8 = tcg_gen_sub_i64, + .fniv = tcg_gen_sub_vec, + .fno = gen_helper_sve_subri_d, + .opc = INDEX_op_sub_vec, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + .vece = MO_64, + .scalar_first = true } + }; + + if (a->esz == 0 && extract32(insn, 13, 1)) { + return false; + } + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + TCGv_i64 c = tcg_const_i64(a->imm); + tcg_gen_gvec_2s(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vsz, vsz, c, &op[a->esz]); + tcg_temp_free_i64(c); + } + return true; +} + +static bool trans_MUL_zzi(DisasContext *s, arg_rri_esz *a, uint32_t insn) +{ + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + tcg_gen_gvec_muli(a->esz, vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), a->imm, vsz, vsz); + } + return true; +} + +static bool do_zzi_sat(DisasContext *s, arg_rri_esz *a, uint32_t insn, + bool u, bool d) +{ + if (a->esz == 0 && extract32(insn, 13, 1)) { + return false; + } + if (sve_access_check(s)) { + TCGv_i64 val = tcg_const_i64(a->imm); + do_sat_addsub_vec(s, a->esz, a->rd, a->rn, val, u, d); + tcg_temp_free_i64(val); + } + return true; +} + +static bool trans_SQADD_zzi(DisasContext *s, arg_rri_esz *a, uint32_t insn) +{ + return do_zzi_sat(s, a, insn, false, false); +} + +static bool trans_UQADD_zzi(DisasContext *s, arg_rri_esz *a, uint32_t insn) +{ + return do_zzi_sat(s, a, insn, true, false); +} + +static bool trans_SQSUB_zzi(DisasContext *s, arg_rri_esz *a, uint32_t insn) +{ + return do_zzi_sat(s, a, insn, false, true); +} + +static bool trans_UQSUB_zzi(DisasContext *s, arg_rri_esz *a, uint32_t insn) +{ + return do_zzi_sat(s, a, insn, true, true); +} + +static bool do_zzi_ool(DisasContext *s, arg_rri_esz *a, gen_helper_gvec_2i *fn) +{ + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + TCGv_i64 c = tcg_const_i64(a->imm); + + tcg_gen_gvec_2i_ool(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + c, vsz, vsz, 0, fn); + tcg_temp_free_i64(c); + } + return true; +} + +#define DO_ZZI(NAME, name) \ +static bool trans_##NAME##_zzi(DisasContext *s, arg_rri_esz *a, \ + uint32_t insn) \ +{ \ + static gen_helper_gvec_2i * const fns[4] = { \ + gen_helper_sve_##name##i_b, gen_helper_sve_##name##i_h, \ + gen_helper_sve_##name##i_s, gen_helper_sve_##name##i_d, \ + }; \ + return do_zzi_ool(s, a, fns[a->esz]); \ +} + +DO_ZZI(SMAX, smax) +DO_ZZI(UMAX, umax) +DO_ZZI(SMIN, smin) +DO_ZZI(UMIN, umin) + +#undef DO_ZZI + /* *** SVE Memory - 32-bit Gather and Unsized Contiguous Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index b8bd22aff7..eee8726bdf 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -42,6 +42,8 @@ # Signed 8-bit immediate, optionally shifted left by 8. %sh8_i8s 5:9 !function=expand_imm_sh8s +# Unsigned 8-bit immediate, optionally shifted left by 8. +%sh8_i8u 5:9 !function=expand_imm_sh8u # Either a copy of rd (at bit 0), or a different source # as propagated via the MOVPRFX instruction. @@ -95,6 +97,12 @@ @pd_pn_pm ........ esz:2 .. rm:4 ....... rn:4 . rd:4 &rrr_esz @rdn_rm ........ esz:2 ...... ...... rm:5 rd:5 \ &rrr_esz rn=%reg_movprfx +@rdn_sh_i8u ........ esz:2 ...... ...... ..... rd:5 \ + &rri_esz rn=%reg_movprfx imm=%sh8_i8u +@rdn_i8u ........ esz:2 ...... ... imm:8 rd:5 \ + &rri_esz rn=%reg_movprfx +@rdn_i8s ........ esz:2 ...... ... imm:s8 rd:5 \ + &rri_esz rn=%reg_movprfx # Three operand with "memory" size, aka immediate left shift @rd_rn_msz_rm ........ ... rm:5 .... imm:2 rn:5 rd:5 &rrri @@ -622,6 +630,24 @@ FDUP 00100101 esz:2 111 00 1110 imm:8 rd:5 # SVE broadcast integer immediate (unpredicated) DUP_i 00100101 esz:2 111 00 011 . ........ rd:5 imm=%sh8_i8s +# SVE integer add/subtract immediate (unpredicated) +ADD_zzi 00100101 .. 100 000 11 . ........ ..... @rdn_sh_i8u +SUB_zzi 00100101 .. 100 001 11 . ........ ..... @rdn_sh_i8u +SUBR_zzi 00100101 .. 100 011 11 . ........ ..... @rdn_sh_i8u +SQADD_zzi 00100101 .. 100 100 11 . ........ ..... @rdn_sh_i8u +UQADD_zzi 00100101 .. 100 101 11 . ........ ..... @rdn_sh_i8u +SQSUB_zzi 00100101 .. 100 110 11 . ........ ..... @rdn_sh_i8u +UQSUB_zzi 00100101 .. 100 111 11 . ........ ..... @rdn_sh_i8u + +# SVE integer min/max immediate (unpredicated) +SMAX_zzi 00100101 .. 101 000 110 ........ ..... @rdn_i8s +UMAX_zzi 00100101 .. 101 001 110 ........ ..... @rdn_i8u +SMIN_zzi 00100101 .. 101 010 110 ........ ..... @rdn_i8s +UMIN_zzi 00100101 .. 101 011 110 ........ ..... @rdn_i8u + +# SVE integer multiply immediate (unpredicated) +MUL_zzi 00100101 .. 110 000 110 ........ ..... @rdn_i8s + ### SVE Memory - 32-bit Gather and Unsized Contiguous Group # SVE load predicate register From patchwork Wed May 30 18:01:20 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 137272 Delivered-To: patch@linaro.org Received: by 2002:a2e:9706:0:0:0:0:0 with SMTP id r6-v6csp5635941lji; Wed, 30 May 2018 11:09:16 -0700 (PDT) X-Google-Smtp-Source: ADUXVKLgdE1nsNJr2OyUTku+7CoW0qM+NSrmpGH98cm67YUKqc79Me9dapzN71x/DnF5AC+1HmDD X-Received: by 2002:aed:2317:: with SMTP id h23-v6mr3603535qtc.283.1527703756365; Wed, 30 May 2018 11:09:16 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1527703756; cv=none; d=google.com; s=arc-20160816; b=1JooVN6WlF/vyNqVPuRsRId1HjYdS2uz75iY7wzUB7n0dumHIb8f0J0glBqlgj9VTG 7tkXezPEfFmwU+sDt2OLcUZOeXjHbskU9iQlKehqJZ7rRz0DbpjbiCvXFN5RzNobTWGe xLqfkj6Dv4jMOsdNuwlbZHk6PFkB6BE/0dg7UBBaaV4SMWFoFeMW+QDMg2RGLma9jZWd cyiDgtIE7fKNhzHloxAjsYmpG9FpDxVNq7x3CLV/EO/KSckjt2J99avseNOgwHTPpOh4 W0r2MXufXge/52HGYQYIfZT/nie5DUq+Fzm783fORo0MrQSKPli2rnelGhv1PosVQMQR PBzg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature:arc-authentication-results; bh=fLlTkQZMBKLlfK4X9k246o+TLhUSnciff9gKTmqGISo=; b=QEwXIZe88wtnXH8V9WVj8h8Ne9h6nxWoHStjEu6MT7OuULSfVE5yjmFPcykMR8yV2l 3kZlt0cU5hzlmVx6qNAWfXHZkigh+gaUFUxqHUSZKcATHwx08dXU/alPS5jxk/0aqAUG U3ECQz9AWxso9E/ga1Ubx1sSUYRFyFI46ocfmYADtGOKiUfm1u+nMtV2uZ8SXVJFb5fv OaXZvxdZ4RO53PcBr23iOcyhYVC00nvVGnr2N0dLfOMEt2rsBnj83bkok6zBuA4XaBBj 2lvrJDekkpauMHgiOsF+Ef6KDZZCDhEr0QOh3A9bkC0VPCJJgurkTM5GzVhX+LjIC/aY yToA== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=QD/MRrQs; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom=qemu-devel-bounces+patch=linaro.org@nongnu.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [2001:4830:134:3::11]) by mx.google.com with ESMTPS id b80-v6si1844111qkg.25.2018.05.30.11.09.16 for (version=TLS1 cipher=AES128-SHA bits=128/128); Wed, 30 May 2018 11:09:16 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) client-ip=2001:4830:134:3::11; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=QD/MRrQs; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom=qemu-devel-bounces+patch=linaro.org@nongnu.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:40083 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fO5X9-0007bH-Op for patch@linaro.org; Wed, 30 May 2018 14:09:15 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:33781) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fO5Q2-0002Bq-4X for qemu-devel@nongnu.org; Wed, 30 May 2018 14:01:59 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fO5Q0-0004sM-Kl for qemu-devel@nongnu.org; Wed, 30 May 2018 14:01:54 -0400 Received: from mail-pl0-x22c.google.com ([2607:f8b0:400e:c01::22c]:38157) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fO5Q0-0004rA-2P for qemu-devel@nongnu.org; Wed, 30 May 2018 14:01:52 -0400 Received: by mail-pl0-x22c.google.com with SMTP id c11-v6so11554558plr.5 for ; Wed, 30 May 2018 11:01:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=fLlTkQZMBKLlfK4X9k246o+TLhUSnciff9gKTmqGISo=; b=QD/MRrQs/mRoxQOz0SRevXZxidLNSrLaU44VMTLX6tfUKhz0+0gX5KdCZp6PoyZFKZ MyikSPJRIRQqeumkj54dNFQ/5k8qibZtXLUCXYC+nT+iA5q0C27uK4XM2KhtPYwMUCbr G4WMWSdyPUh/jHiTPMlr7SxPjecMfzFlmCGcc= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=fLlTkQZMBKLlfK4X9k246o+TLhUSnciff9gKTmqGISo=; b=nHWq5xwIZi5idTgm699kGkqws+5yZwfozZCotoS1Jz11vt5WA75wRJhOpKp7hpjKJr b6Ir3taDHbseRU3tWVtjCjIHbetEk6W649nFfvA+MDIsO4jMjG7nxrRn71wRKFOv6Jke etiCvQ1VeoaoGJJGBJFT3Tlc2njQnBZTbA0E34xgYr2hXf/OyGoJ7bl8+plLx23iQJgs u9P069wyyEYse+MIpDZwR2sOtwXp1NlW/0NJHQCQJ1wIbtpYoImzNSyOklzdvWHuVdPX d3G+1f7QLVZDhovFnkzxnSdrSXMitFV2uCGffE3viZKdJaN+p/eS+77vx/XtReX76umu 4RCQ== X-Gm-Message-State: ALKqPweoh63LjPOdpzEkCDhGI8mzNQSkMUgb2+H0JMrEVSE7Nct0lKXI ZFxHZPjPOZ9itogZmGjdp3QCPVjYkxQ= X-Received: by 2002:a17:902:6ac7:: with SMTP id i7-v6mr3737493plt.288.1527703310718; Wed, 30 May 2018 11:01:50 -0700 (PDT) Received: from cloudburst.twiddle.net (97-126-112-211.tukw.qwest.net. [97.126.112.211]) by smtp.gmail.com with ESMTPSA id b84-v6sm28179157pfm.123.2018.05.30.11.01.49 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 30 May 2018 11:01:49 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Wed, 30 May 2018 11:01:20 -0700 Message-Id: <20180530180120.13355-19-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.0 In-Reply-To: <20180530180120.13355-1-richard.henderson@linaro.org> References: <20180530180120.13355-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c01::22c Subject: [Qemu-devel] [PATCH v3b 18/18] target/arm: Implement SVE Floating Point Arithmetic - Unpredicated Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, qemu-arm@nongnu.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 14 ++++++++ target/arm/helper.h | 19 +++++++++++ target/arm/translate-sve.c | 42 +++++++++++++++++++++++ target/arm/vec_helper.c | 69 ++++++++++++++++++++++++++++++++++++++ target/arm/sve.decode | 10 ++++++ 5 files changed, 154 insertions(+) -- 2.17.0 Reviewed-by: Peter Maydell diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 97bfe0f47b..2e76084992 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -705,3 +705,17 @@ DEF_HELPER_FLAGS_4(sve_umini_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) DEF_HELPER_FLAGS_4(sve_umini_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) DEF_HELPER_FLAGS_4(sve_umini_s, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) DEF_HELPER_FLAGS_4(sve_umini_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) + +DEF_HELPER_FLAGS_5(gvec_recps_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_recps_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_recps_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(gvec_rsqrts_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_rsqrts_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_rsqrts_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/helper.h b/target/arm/helper.h index 0c6a144458..879a7229e9 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -601,6 +601,25 @@ DEF_HELPER_FLAGS_5(gvec_fcmlas_idx, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(gvec_fcmlad, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_fadd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_fadd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_fadd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(gvec_fsub_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_fsub_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_fsub_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(gvec_fmul_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_fmul_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_fmul_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(gvec_ftsmul_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_ftsmul_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_ftsmul_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + #ifdef TARGET_AARCH64 #include "helper-a64.h" #include "helper-sve.h" diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 78e74047c8..cea64dfc18 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3364,6 +3364,48 @@ DO_ZZI(UMIN, umin) #undef DO_ZZI +/* + *** SVE Floating Point Arithmetic - Unpredicated Group + */ + +static bool do_zzz_fp(DisasContext *s, arg_rrr_esz *a, + gen_helper_gvec_3_ptr *fn) +{ + if (fn == NULL) { + return false; + } + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + TCGv_ptr status = get_fpstatus_ptr(a->esz == MO_16); + tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + status, vsz, vsz, 0, fn); + tcg_temp_free_ptr(status); + } + return true; +} + + +#define DO_FP3(NAME, name) \ +static bool trans_##NAME(DisasContext *s, arg_rrr_esz *a, uint32_t insn) \ +{ \ + static gen_helper_gvec_3_ptr * const fns[4] = { \ + NULL, gen_helper_gvec_##name##_h, \ + gen_helper_gvec_##name##_s, gen_helper_gvec_##name##_d \ + }; \ + return do_zzz_fp(s, a, fns[a->esz]); \ +} + +DO_FP3(FADD_zzz, fadd) +DO_FP3(FSUB_zzz, fsub) +DO_FP3(FMUL_zzz, fmul) +DO_FP3(FTSMUL, ftsmul) +DO_FP3(FRECPS, recps) +DO_FP3(FRSQRTS, rsqrts) + +#undef DO_FP3 + /* *** SVE Memory - 32-bit Gather and Unsized Contiguous Group */ diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index ec705cfca5..4b5ec94698 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -427,3 +427,72 @@ void HELPER(gvec_fcmlad)(void *vd, void *vn, void *vm, } clear_tail(d, opr_sz, simd_maxsz(desc)); } + +/* Floating-point trigonometric starting value. + * See the ARM ARM pseudocode function FPTrigSMul. + */ +static float16 float16_ftsmul(float16 op1, uint16_t op2, float_status *stat) +{ + float16 result = float16_mul(op1, op1, stat); + if (!float16_is_any_nan(result)) { + result = float16_set_sign(result, op2 & 1); + } + return result; +} + +static float32 float32_ftsmul(float32 op1, uint32_t op2, float_status *stat) +{ + float32 result = float32_mul(op1, op1, stat); + if (!float32_is_any_nan(result)) { + result = float32_set_sign(result, op2 & 1); + } + return result; +} + +static float64 float64_ftsmul(float64 op1, uint64_t op2, float_status *stat) +{ + float64 result = float64_mul(op1, op1, stat); + if (!float64_is_any_nan(result)) { + result = float64_set_sign(result, op2 & 1); + } + return result; +} + +#define DO_3OP(NAME, FUNC, TYPE) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, void *stat, uint32_t desc) \ +{ \ + intptr_t i, oprsz = simd_oprsz(desc); \ + TYPE *d = vd, *n = vn, *m = vm; \ + for (i = 0; i < oprsz / sizeof(TYPE); i++) { \ + d[i] = FUNC(n[i], m[i], stat); \ + } \ +} + +DO_3OP(gvec_fadd_h, float16_add, float16) +DO_3OP(gvec_fadd_s, float32_add, float32) +DO_3OP(gvec_fadd_d, float64_add, float64) + +DO_3OP(gvec_fsub_h, float16_sub, float16) +DO_3OP(gvec_fsub_s, float32_sub, float32) +DO_3OP(gvec_fsub_d, float64_sub, float64) + +DO_3OP(gvec_fmul_h, float16_mul, float16) +DO_3OP(gvec_fmul_s, float32_mul, float32) +DO_3OP(gvec_fmul_d, float64_mul, float64) + +DO_3OP(gvec_ftsmul_h, float16_ftsmul, float16) +DO_3OP(gvec_ftsmul_s, float32_ftsmul, float32) +DO_3OP(gvec_ftsmul_d, float64_ftsmul, float64) + +#ifdef TARGET_AARCH64 + +DO_3OP(gvec_recps_h, helper_recpsf_f16, float16) +DO_3OP(gvec_recps_s, helper_recpsf_f32, float32) +DO_3OP(gvec_recps_d, helper_recpsf_f64, float64) + +DO_3OP(gvec_rsqrts_h, helper_rsqrtsf_f16, float16) +DO_3OP(gvec_rsqrts_s, helper_rsqrtsf_f32, float32) +DO_3OP(gvec_rsqrts_d, helper_rsqrtsf_f64, float64) + +#endif +#undef DO_3OP diff --git a/target/arm/sve.decode b/target/arm/sve.decode index eee8726bdf..6f436f9096 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -648,6 +648,16 @@ UMIN_zzi 00100101 .. 101 011 110 ........ ..... @rdn_i8u # SVE integer multiply immediate (unpredicated) MUL_zzi 00100101 .. 110 000 110 ........ ..... @rdn_i8s +### SVE Floating Point Arithmetic - Unpredicated Group + +# SVE floating-point arithmetic (unpredicated) +FADD_zzz 01100101 .. 0 ..... 000 000 ..... ..... @rd_rn_rm +FSUB_zzz 01100101 .. 0 ..... 000 001 ..... ..... @rd_rn_rm +FMUL_zzz 01100101 .. 0 ..... 000 010 ..... ..... @rd_rn_rm +FTSMUL 01100101 .. 0 ..... 000 011 ..... ..... @rd_rn_rm +FRECPS 01100101 .. 0 ..... 000 110 ..... ..... @rd_rn_rm +FRSQRTS 01100101 .. 0 ..... 000 111 ..... ..... @rd_rn_rm + ### SVE Memory - 32-bit Gather and Unsized Contiguous Group # SVE load predicate register