From patchwork Wed Jun 13 01:56:24 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 138385 From: Richard Henderson To: qemu-devel@nongnu.org Date: Tue, 12 Jun 2018 15:56:24 -1000 Message-Id: <20180613015641.5667-2-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180613015641.5667-1-richard.henderson@linaro.org> References: <20180613015641.5667-1-richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH v4b 01/18] target/arm: Extend vec_reg_offset to larger sizes Cc: peter.maydell@linaro.org Rearrange the arithmetic so that we are agnostic about the total size of the vector and the size of the element. This will allow us to index up to the 32nd byte and with 16-byte elements. Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/translate-a64.h | 26 +++++++++++++++++--------- 1 file changed, 17 insertions(+), 9 deletions(-) -- 2.17.1 diff --git a/target/arm/translate-a64.h b/target/arm/translate-a64.h index dd9c09f89b..63d958cf50 100644 --- a/target/arm/translate-a64.h +++ b/target/arm/translate-a64.h @@ -67,18 +67,26 @@ static inline void assert_fp_access_checked(DisasContext *s) static inline int vec_reg_offset(DisasContext *s, int regno, int element, TCGMemOp size) { - int offs = 0; + int element_size = 1 << size; + int offs = element * element_size; #ifdef HOST_WORDS_BIGENDIAN /* This is complicated slightly because vfp.zregs[n].d[0] is - * still the low half and vfp.zregs[n].d[1] the high half - * of the 128 bit vector, even on big endian systems. - * Calculate the offset assuming a fully bigendian 128 bits, - * then XOR to account for the order of the two 64 bit halves. + * still the lowest and vfp.zregs[n].d[15] the highest of the + * 256 byte vector, even on big endian systems. + * + * Calculate the offset assuming fully little-endian, + * then XOR to account for the order of the 8-byte units. + * + * For 16 byte elements, the two 8 byte halves will not form a + * host int128 if the host is bigendian, since they're in the + * wrong order.
However the only 16 byte operation we have is + * a move, so we can ignore this for the moment. More complicated + * operations will have to special case loading and storing from + * the zregs array. */ - offs += (16 - ((element + 1) * (1 << size))); - offs ^= 8; -#else - offs += element * (1 << size); + if (element_size < 8) { + offs ^= 8 - element_size; + } #endif offs += offsetof(CPUARMState, vfp.zregs[regno]); assert_fp_access_checked(s);
From patchwork Wed Jun 13 01:56:25 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 138386 From: Richard Henderson To: qemu-devel@nongnu.org Date: Tue, 12 Jun 2018 15:56:25 -1000 Message-Id: <20180613015641.5667-3-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180613015641.5667-1-richard.henderson@linaro.org> References: <20180613015641.5667-1-richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH v4b 02/18] target/arm: Implement SVE Permute - Unpredicated Group Cc: peter.maydell@linaro.org Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 23 +++++++ target/arm/sve_helper.c | 114 +++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 133 +++++++++++++++++++++++++++++++++++++ target/arm/sve.decode | 27 ++++++++ 4 files changed, 297 insertions(+) -- 2.17.1 diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 94f4356ce9..0c9aad575e 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -416,6 +416,29 @@ DEF_HELPER_FLAGS_4(sve_cpy_z_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) DEF_HELPER_FLAGS_4(sve_ext, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_insr_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(sve_insr_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(sve_insr_s, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(sve_insr_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) + +DEF_HELPER_FLAGS_3(sve_rev_b, TCG_CALL_NO_RWG, void, 
ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_rev_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_rev_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_rev_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve_tbl_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_tbl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_tbl_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_tbl_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(sve_sunpk_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_sunpk_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_sunpk_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(sve_uunpk_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_uunpk_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_uunpk_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index b825e44cb5..58c0fda333 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1560,3 +1560,117 @@ void HELPER(sve_ext)(void *vd, void *vn, void *vm, uint32_t desc) memcpy(vd + n_siz, &tmp, n_ofs); } } + +#define DO_INSR(NAME, TYPE, H) \ +void HELPER(NAME)(void *vd, void *vn, uint64_t val, uint32_t desc) \ +{ \ + intptr_t opr_sz = simd_oprsz(desc); \ + swap_memmove(vd + sizeof(TYPE), vn, opr_sz - sizeof(TYPE)); \ + *(TYPE *)(vd + H(0)) = val; \ +} + +DO_INSR(sve_insr_b, uint8_t, H1) +DO_INSR(sve_insr_h, uint16_t, H1_2) +DO_INSR(sve_insr_s, uint32_t, H1_4) +DO_INSR(sve_insr_d, uint64_t, ) + +#undef DO_INSR + +void HELPER(sve_rev_b)(void *vd, void *vn, uint32_t desc) +{ + intptr_t i, j, opr_sz = simd_oprsz(desc); 
+ for (i = 0, j = opr_sz - 8; i < opr_sz / 2; i += 8, j -= 8) { + uint64_t f = *(uint64_t *)(vn + i); + uint64_t b = *(uint64_t *)(vn + j); + *(uint64_t *)(vd + i) = bswap64(b); + *(uint64_t *)(vd + j) = bswap64(f); + } +} + +static inline uint64_t hswap64(uint64_t h) +{ + uint64_t m = 0x0000ffff0000ffffull; + h = rol64(h, 32); + return ((h & m) << 16) | ((h >> 16) & m); +} + +void HELPER(sve_rev_h)(void *vd, void *vn, uint32_t desc) +{ + intptr_t i, j, opr_sz = simd_oprsz(desc); + for (i = 0, j = opr_sz - 8; i < opr_sz / 2; i += 8, j -= 8) { + uint64_t f = *(uint64_t *)(vn + i); + uint64_t b = *(uint64_t *)(vn + j); + *(uint64_t *)(vd + i) = hswap64(b); + *(uint64_t *)(vd + j) = hswap64(f); + } +} + +void HELPER(sve_rev_s)(void *vd, void *vn, uint32_t desc) +{ + intptr_t i, j, opr_sz = simd_oprsz(desc); + for (i = 0, j = opr_sz - 8; i < opr_sz / 2; i += 8, j -= 8) { + uint64_t f = *(uint64_t *)(vn + i); + uint64_t b = *(uint64_t *)(vn + j); + *(uint64_t *)(vd + i) = rol64(b, 32); + *(uint64_t *)(vd + j) = rol64(f, 32); + } +} + +void HELPER(sve_rev_d)(void *vd, void *vn, uint32_t desc) +{ + intptr_t i, j, opr_sz = simd_oprsz(desc); + for (i = 0, j = opr_sz - 8; i < opr_sz / 2; i += 8, j -= 8) { + uint64_t f = *(uint64_t *)(vn + i); + uint64_t b = *(uint64_t *)(vn + j); + *(uint64_t *)(vd + i) = b; + *(uint64_t *)(vd + j) = f; + } +} + +#define DO_TBL(NAME, TYPE, H) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ +{ \ + intptr_t i, opr_sz = simd_oprsz(desc); \ + uintptr_t elem = opr_sz / sizeof(TYPE); \ + TYPE *d = vd, *n = vn, *m = vm; \ + ARMVectorReg tmp; \ + if (unlikely(vd == vn)) { \ + n = memcpy(&tmp, vn, opr_sz); \ + } \ + for (i = 0; i < elem; i++) { \ + TYPE j = m[H(i)]; \ + d[H(i)] = j < elem ? 
n[H(j)] : 0; \ + } \ +} + +DO_TBL(sve_tbl_b, uint8_t, H1) +DO_TBL(sve_tbl_h, uint16_t, H2) +DO_TBL(sve_tbl_s, uint32_t, H4) +DO_TBL(sve_tbl_d, uint64_t, ) + +#undef DO_TBL + +#define DO_UNPK(NAME, TYPED, TYPES, HD, HS) \ +void HELPER(NAME)(void *vd, void *vn, uint32_t desc) \ +{ \ + intptr_t i, opr_sz = simd_oprsz(desc); \ + TYPED *d = vd; \ + TYPES *n = vn; \ + ARMVectorReg tmp; \ + if (unlikely(vn - vd < opr_sz)) { \ + n = memcpy(&tmp, n, opr_sz / 2); \ + } \ + for (i = 0; i < opr_sz / sizeof(TYPED); i++) { \ + d[HD(i)] = n[HS(i)]; \ + } \ +} + +DO_UNPK(sve_sunpk_h, int16_t, int8_t, H2, H1) +DO_UNPK(sve_sunpk_s, int32_t, int16_t, H4, H2) +DO_UNPK(sve_sunpk_d, int64_t, int32_t, , H4) + +DO_UNPK(sve_uunpk_h, uint16_t, uint8_t, H2, H1) +DO_UNPK(sve_uunpk_s, uint32_t, uint16_t, H4, H2) +DO_UNPK(sve_uunpk_d, uint64_t, uint32_t, , H4) + +#undef DO_UNPK diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index c48d4b530a..388cce9924 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -1956,6 +1956,139 @@ static bool trans_EXT(DisasContext *s, arg_EXT *a, uint32_t insn) return true; } +/* + *** SVE Permute - Unpredicated Group + */ + +static bool trans_DUP_s(DisasContext *s, arg_DUP_s *a, uint32_t insn) +{ + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + tcg_gen_gvec_dup_i64(a->esz, vec_full_reg_offset(s, a->rd), + vsz, vsz, cpu_reg_sp(s, a->rn)); + } + return true; +} + +static bool trans_DUP_x(DisasContext *s, arg_DUP_x *a, uint32_t insn) +{ + if ((a->imm & 0x1f) == 0) { + return false; + } + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + unsigned dofs = vec_full_reg_offset(s, a->rd); + unsigned esz, index; + + esz = ctz32(a->imm); + index = a->imm >> (esz + 1); + + if ((index << esz) < vsz) { + unsigned nofs = vec_reg_offset(s, a->rn, index, esz); + tcg_gen_gvec_dup_mem(esz, dofs, nofs, vsz, vsz); + } else { + tcg_gen_gvec_dup64i(dofs, vsz, vsz, 0); + } + } + return true; +} + +static void 
do_insr_i64(DisasContext *s, arg_rrr_esz *a, TCGv_i64 val) +{ + typedef void gen_insr(TCGv_ptr, TCGv_ptr, TCGv_i64, TCGv_i32); + static gen_insr * const fns[4] = { + gen_helper_sve_insr_b, gen_helper_sve_insr_h, + gen_helper_sve_insr_s, gen_helper_sve_insr_d, + }; + unsigned vsz = vec_full_reg_size(s); + TCGv_i32 desc = tcg_const_i32(simd_desc(vsz, vsz, 0)); + TCGv_ptr t_zd = tcg_temp_new_ptr(); + TCGv_ptr t_zn = tcg_temp_new_ptr(); + + tcg_gen_addi_ptr(t_zd, cpu_env, vec_full_reg_offset(s, a->rd)); + tcg_gen_addi_ptr(t_zn, cpu_env, vec_full_reg_offset(s, a->rn)); + + fns[a->esz](t_zd, t_zn, val, desc); + + tcg_temp_free_ptr(t_zd); + tcg_temp_free_ptr(t_zn); + tcg_temp_free_i32(desc); +} + +static bool trans_INSR_f(DisasContext *s, arg_rrr_esz *a, uint32_t insn) +{ + if (sve_access_check(s)) { + TCGv_i64 t = tcg_temp_new_i64(); + tcg_gen_ld_i64(t, cpu_env, vec_reg_offset(s, a->rm, 0, MO_64)); + do_insr_i64(s, a, t); + tcg_temp_free_i64(t); + } + return true; +} + +static bool trans_INSR_r(DisasContext *s, arg_rrr_esz *a, uint32_t insn) +{ + if (sve_access_check(s)) { + do_insr_i64(s, a, cpu_reg(s, a->rm)); + } + return true; +} + +static bool trans_REV_v(DisasContext *s, arg_rr_esz *a, uint32_t insn) +{ + static gen_helper_gvec_2 * const fns[4] = { + gen_helper_sve_rev_b, gen_helper_sve_rev_h, + gen_helper_sve_rev_s, gen_helper_sve_rev_d + }; + + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + tcg_gen_gvec_2_ool(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vsz, vsz, 0, fns[a->esz]); + } + return true; +} + +static bool trans_TBL(DisasContext *s, arg_rrr_esz *a, uint32_t insn) +{ + static gen_helper_gvec_3 * const fns[4] = { + gen_helper_sve_tbl_b, gen_helper_sve_tbl_h, + gen_helper_sve_tbl_s, gen_helper_sve_tbl_d + }; + + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + tcg_gen_gvec_3_ool(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + vsz, vsz, 0, 
fns[a->esz]); + } + return true; +} + +static bool trans_UNPK(DisasContext *s, arg_UNPK *a, uint32_t insn) +{ + static gen_helper_gvec_2 * const fns[4][2] = { + { NULL, NULL }, + { gen_helper_sve_sunpk_h, gen_helper_sve_uunpk_h }, + { gen_helper_sve_sunpk_s, gen_helper_sve_uunpk_s }, + { gen_helper_sve_sunpk_d, gen_helper_sve_uunpk_d }, + }; + + if (a->esz == 0) { + return false; + } + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + tcg_gen_gvec_2_ool(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn) + + (a->h ? vsz / 2 : 0), + vsz, vsz, 0, fns[a->esz][a->u]); + } + return true; +} + /* *** SVE Memory - 32-bit Gather and Unsized Contiguous Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 4761d1921e..7ffd7962c8 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -24,6 +24,7 @@ %imm4_16_p1 16:4 !function=plus1 %imm6_22_5 22:1 5:5 +%imm7_22_16 22:2 16:5 %imm8_16_10 16:5 10:3 %imm9_16_10 16:s6 10:3 @@ -85,6 +86,8 @@ # Three operand, vector element size @rd_rn_rm ........ esz:2 . rm:5 ... ... rn:5 rd:5 &rrr_esz +@rdn_rm ........ esz:2 ...... ...... rm:5 rd:5 \ + &rrr_esz rn=%reg_movprfx # Three operand with "memory" size, aka immediate left shift @rd_rn_msz_rm ........ ... rm:5 .... imm:2 rn:5 rd:5 &rrri @@ -369,6 +372,30 @@ CPY_z_i 00000101 .. 01 .... 00 . ........ ..... @rdn_pg4 imm=%sh8_i8s EXT 00000101 001 ..... 000 ... rm:5 rd:5 \ &rrri rn=%reg_movprfx imm=%imm8_16_10 +### SVE Permute - Unpredicated Group + +# SVE broadcast general register +DUP_s 00000101 .. 1 00000 001110 ..... ..... @rd_rn + +# SVE broadcast indexed element +DUP_x 00000101 .. 1 ..... 001000 rn:5 rd:5 \ + &rri imm=%imm7_22_16 + +# SVE insert SIMD&FP scalar register +INSR_f 00000101 .. 1 10100 001110 ..... ..... @rdn_rm + +# SVE insert general register +INSR_r 00000101 .. 1 00100 001110 ..... ..... @rdn_rm + +# SVE reverse vector elements +REV_v 00000101 .. 1 11000 001110 ..... ..... 
@rd_rn + +# SVE vector table lookup +TBL 00000101 .. 1 ..... 001100 ..... ..... @rd_rn_rm + +# SVE unpack vector elements +UNPK 00000101 esz:2 1100 u:1 h:1 001110 rn:5 rd:5 + ### SVE Predicate Logical Operations Group # SVE predicate logical operations
From patchwork Wed Jun 13 01:56:26 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 138390 From: Richard Henderson To: qemu-devel@nongnu.org Date: Tue, 12 Jun 2018 15:56:26 -1000 Message-Id: <20180613015641.5667-4-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180613015641.5667-1-richard.henderson@linaro.org> References: <20180613015641.5667-1-richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH v4b 03/18] target/arm: Implement SVE Permute - Predicates Group Cc: peter.maydell@linaro.org Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 6 + target/arm/sve_helper.c | 290 +++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 120 +++++++++++++++ target/arm/sve.decode | 18 +++ 4 files changed, 434 insertions(+) -- 2.17.1 diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 0c9aad575e..ff958fcebd 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -439,6 +439,12 @@ DEF_HELPER_FLAGS_3(sve_uunpk_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve_uunpk_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve_uunpk_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_zip_p, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_uzp_p, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_trn_p, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_rev_p, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve_punpk_p, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 58c0fda333..f4d49d4aff 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1674,3 +1674,293 @@ DO_UNPK(sve_uunpk_s, uint32_t, uint16_t, H4, H2) DO_UNPK(sve_uunpk_d, uint64_t, uint32_t, , H4) #undef DO_UNPK + +/* Mask of bits 
included in the even numbered predicates of width esz. + * We also use this for expand_bits/compress_bits, and so extend the + * same pattern out to 16-bit units. + */ +static const uint64_t even_bit_esz_masks[5] = { + 0x5555555555555555ull, + 0x3333333333333333ull, + 0x0f0f0f0f0f0f0f0full, + 0x00ff00ff00ff00ffull, + 0x0000ffff0000ffffull, +}; + +/* Zero-extend units of 2**N bits to units of 2**(N+1) bits. + * For N==0, this corresponds to the operation that in qemu/bitops.h + * we call half_shuffle64; this algorithm is from Hacker's Delight, + * section 7-2 Shuffling Bits. + */ +static uint64_t expand_bits(uint64_t x, int n) +{ + int i; + + x &= 0xffffffffu; + for (i = 4; i >= n; i--) { + int sh = 1 << i; + x = ((x << sh) | x) & even_bit_esz_masks[i]; + } + return x; +} + +/* Compress units of 2**(N+1) bits to units of 2**N bits. + * For N==0, this corresponds to the operation that in qemu/bitops.h + * we call half_unshuffle64; this algorithm is from Hacker's Delight, + * section 7-2 Shuffling Bits, where it is called an inverse half shuffle. + */ +static uint64_t compress_bits(uint64_t x, int n) +{ + int i; + + for (i = n; i <= 4; i++) { + int sh = 1 << i; + x &= even_bit_esz_masks[i]; + x = (x >> sh) | x; + } + return x & 0xffffffffu; +} + +void HELPER(sve_zip_p)(void *vd, void *vn, void *vm, uint32_t pred_desc) +{ + intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + int esz = extract32(pred_desc, SIMD_DATA_SHIFT, 2); + intptr_t high = extract32(pred_desc, SIMD_DATA_SHIFT + 2, 1); + uint64_t *d = vd; + intptr_t i; + + if (oprsz <= 8) { + uint64_t nn = *(uint64_t *)vn; + uint64_t mm = *(uint64_t *)vm; + int half = 4 * oprsz; + + nn = extract64(nn, high * half, half); + mm = extract64(mm, high * half, half); + nn = expand_bits(nn, esz); + mm = expand_bits(mm, esz); + d[0] = nn + (mm << (1 << esz)); + } else { + ARMPredicateReg tmp_n, tmp_m; + + /* We produce output faster than we consume input. + Therefore we must be mindful of possible overlap. 
*/ + if ((vn - vd) < (uintptr_t)oprsz) { + vn = memcpy(&tmp_n, vn, oprsz); + } + if ((vm - vd) < (uintptr_t)oprsz) { + vm = memcpy(&tmp_m, vm, oprsz); + } + if (high) { + high = oprsz >> 1; + } + + if ((high & 3) == 0) { + uint32_t *n = vn, *m = vm; + high >>= 2; + + for (i = 0; i < DIV_ROUND_UP(oprsz, 8); i++) { + uint64_t nn = n[H4(high + i)]; + uint64_t mm = m[H4(high + i)]; + + nn = expand_bits(nn, esz); + mm = expand_bits(mm, esz); + d[i] = nn + (mm << (1 << esz)); + } + } else { + uint8_t *n = vn, *m = vm; + uint16_t *d16 = vd; + + for (i = 0; i < oprsz / 2; i++) { + uint16_t nn = n[H1(high + i)]; + uint16_t mm = m[H1(high + i)]; + + nn = expand_bits(nn, esz); + mm = expand_bits(mm, esz); + d16[H2(i)] = nn + (mm << (1 << esz)); + } + } + } +} + +void HELPER(sve_uzp_p)(void *vd, void *vn, void *vm, uint32_t pred_desc) +{ + intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + int esz = extract32(pred_desc, SIMD_DATA_SHIFT, 2); + int odd = extract32(pred_desc, SIMD_DATA_SHIFT + 2, 1) << esz; + uint64_t *d = vd, *n = vn, *m = vm; + uint64_t l, h; + intptr_t i; + + if (oprsz <= 8) { + l = compress_bits(n[0] >> odd, esz); + h = compress_bits(m[0] >> odd, esz); + d[0] = extract64(l + (h << (4 * oprsz)), 0, 8 * oprsz); + } else { + ARMPredicateReg tmp_m; + intptr_t oprsz_16 = oprsz / 16; + + if ((vm - vd) < (uintptr_t)oprsz) { + m = memcpy(&tmp_m, vm, oprsz); + } + + for (i = 0; i < oprsz_16; i++) { + l = n[2 * i + 0]; + h = n[2 * i + 1]; + l = compress_bits(l >> odd, esz); + h = compress_bits(h >> odd, esz); + d[i] = l + (h << 32); + } + + /* For VL which is not a power of 2, the results from M do not + align nicely with the uint64_t for D. Put the aligned results + from M into TMP_M and then copy it into place afterward. 
*/ + if (oprsz & 15) { + d[i] = compress_bits(n[2 * i] >> odd, esz); + + for (i = 0; i < oprsz_16; i++) { + l = m[2 * i + 0]; + h = m[2 * i + 1]; + l = compress_bits(l >> odd, esz); + h = compress_bits(h >> odd, esz); + tmp_m.p[i] = l + (h << 32); + } + tmp_m.p[i] = compress_bits(m[2 * i] >> odd, esz); + + swap_memmove(vd + oprsz / 2, &tmp_m, oprsz / 2); + } else { + for (i = 0; i < oprsz_16; i++) { + l = m[2 * i + 0]; + h = m[2 * i + 1]; + l = compress_bits(l >> odd, esz); + h = compress_bits(h >> odd, esz); + d[oprsz_16 + i] = l + (h << 32); + } + } + } +} + +void HELPER(sve_trn_p)(void *vd, void *vn, void *vm, uint32_t pred_desc) +{ + intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + uintptr_t esz = extract32(pred_desc, SIMD_DATA_SHIFT, 2); + bool odd = extract32(pred_desc, SIMD_DATA_SHIFT + 2, 1); + uint64_t *d = vd, *n = vn, *m = vm; + uint64_t mask; + int shr, shl; + intptr_t i; + + shl = 1 << esz; + shr = 0; + mask = even_bit_esz_masks[esz]; + if (odd) { + mask <<= shl; + shr = shl; + shl = 0; + } + + for (i = 0; i < DIV_ROUND_UP(oprsz, 8); i++) { + uint64_t nn = (n[i] & mask) >> shr; + uint64_t mm = (m[i] & mask) << shl; + d[i] = nn + mm; + } +} + +/* Reverse units of 2**N bits. 
*/ +static uint64_t reverse_bits_64(uint64_t x, int n) +{ + int i, sh; + + x = bswap64(x); + for (i = 2, sh = 4; i >= n; i--, sh >>= 1) { + uint64_t mask = even_bit_esz_masks[i]; + x = ((x & mask) << sh) | ((x >> sh) & mask); + } + return x; +} + +static uint8_t reverse_bits_8(uint8_t x, int n) +{ + static const uint8_t mask[3] = { 0x55, 0x33, 0x0f }; + int i, sh; + + for (i = 2, sh = 4; i >= n; i--, sh >>= 1) { + x = ((x & mask[i]) << sh) | ((x >> sh) & mask[i]); + } + return x; +} + +void HELPER(sve_rev_p)(void *vd, void *vn, uint32_t pred_desc) +{ + intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + int esz = extract32(pred_desc, SIMD_DATA_SHIFT, 2); + intptr_t i, oprsz_2 = oprsz / 2; + + if (oprsz <= 8) { + uint64_t l = *(uint64_t *)vn; + l = reverse_bits_64(l << (64 - 8 * oprsz), esz); + *(uint64_t *)vd = l; + } else if ((oprsz & 15) == 0) { + for (i = 0; i < oprsz_2; i += 8) { + intptr_t ih = oprsz - 8 - i; + uint64_t l = reverse_bits_64(*(uint64_t *)(vn + i), esz); + uint64_t h = reverse_bits_64(*(uint64_t *)(vn + ih), esz); + *(uint64_t *)(vd + i) = h; + *(uint64_t *)(vd + ih) = l; + } + } else { + for (i = 0; i < oprsz_2; i += 1) { + intptr_t il = H1(i); + intptr_t ih = H1(oprsz - 1 - i); + uint8_t l = reverse_bits_8(*(uint8_t *)(vn + il), esz); + uint8_t h = reverse_bits_8(*(uint8_t *)(vn + ih), esz); + *(uint8_t *)(vd + il) = h; + *(uint8_t *)(vd + ih) = l; + } + } +} + +void HELPER(sve_punpk_p)(void *vd, void *vn, uint32_t pred_desc) +{ + intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + intptr_t high = extract32(pred_desc, SIMD_DATA_SHIFT + 2, 1); + uint64_t *d = vd; + intptr_t i; + + if (oprsz <= 8) { + uint64_t nn = *(uint64_t *)vn; + int half = 4 * oprsz; + + nn = extract64(nn, high * half, half); + nn = expand_bits(nn, 0); + d[0] = nn; + } else { + ARMPredicateReg tmp_n; + + /* We produce output faster than we consume input. + Therefore we must be mindful of possible overlap. 
*/ + if ((vn - vd) < (uintptr_t)oprsz) { + vn = memcpy(&tmp_n, vn, oprsz); + } + if (high) { + high = oprsz >> 1; + } + + if ((high & 3) == 0) { + uint32_t *n = vn; + high >>= 2; + + for (i = 0; i < DIV_ROUND_UP(oprsz, 8); i++) { + uint64_t nn = n[H4(high + i)]; + d[i] = expand_bits(nn, 0); + } + } else { + uint16_t *d16 = vd; + uint8_t *n = vn; + + for (i = 0; i < oprsz / 2; i++) { + uint16_t nn = n[H1(high + i)]; + d16[H2(i)] = expand_bits(nn, 0); + } + } + } +} diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 388cce9924..0160d06915 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -2089,6 +2089,126 @@ static bool trans_UNPK(DisasContext *s, arg_UNPK *a, uint32_t insn) return true; } +/* + *** SVE Permute - Predicates Group + */ + +static bool do_perm_pred3(DisasContext *s, arg_rrr_esz *a, bool high_odd, + gen_helper_gvec_3 *fn) +{ + if (!sve_access_check(s)) { + return true; + } + + unsigned vsz = pred_full_reg_size(s); + + /* Predicate sizes may be smaller and cannot use simd_desc. + We cannot round up, as we do elsewhere, because we need + the exact size for ZIP2 and REV. We retain the style for + the other helpers for consistency. 
*/ + TCGv_ptr t_d = tcg_temp_new_ptr(); + TCGv_ptr t_n = tcg_temp_new_ptr(); + TCGv_ptr t_m = tcg_temp_new_ptr(); + TCGv_i32 t_desc; + int desc; + + desc = vsz - 2; + desc = deposit32(desc, SIMD_DATA_SHIFT, 2, a->esz); + desc = deposit32(desc, SIMD_DATA_SHIFT + 2, 2, high_odd); + + tcg_gen_addi_ptr(t_d, cpu_env, pred_full_reg_offset(s, a->rd)); + tcg_gen_addi_ptr(t_n, cpu_env, pred_full_reg_offset(s, a->rn)); + tcg_gen_addi_ptr(t_m, cpu_env, pred_full_reg_offset(s, a->rm)); + t_desc = tcg_const_i32(desc); + + fn(t_d, t_n, t_m, t_desc); + + tcg_temp_free_ptr(t_d); + tcg_temp_free_ptr(t_n); + tcg_temp_free_ptr(t_m); + tcg_temp_free_i32(t_desc); + return true; +} + +static bool do_perm_pred2(DisasContext *s, arg_rr_esz *a, bool high_odd, + gen_helper_gvec_2 *fn) +{ + if (!sve_access_check(s)) { + return true; + } + + unsigned vsz = pred_full_reg_size(s); + TCGv_ptr t_d = tcg_temp_new_ptr(); + TCGv_ptr t_n = tcg_temp_new_ptr(); + TCGv_i32 t_desc; + int desc; + + tcg_gen_addi_ptr(t_d, cpu_env, pred_full_reg_offset(s, a->rd)); + tcg_gen_addi_ptr(t_n, cpu_env, pred_full_reg_offset(s, a->rn)); + + /* Predicate sizes may be smaller and cannot use simd_desc. + We cannot round up, as we do elsewhere, because we need + the exact size for ZIP2 and REV. We retain the style for + the other helpers for consistency. 
*/ + + desc = vsz - 2; + desc = deposit32(desc, SIMD_DATA_SHIFT, 2, a->esz); + desc = deposit32(desc, SIMD_DATA_SHIFT + 2, 2, high_odd); + t_desc = tcg_const_i32(desc); + + fn(t_d, t_n, t_desc); + + tcg_temp_free_i32(t_desc); + tcg_temp_free_ptr(t_d); + tcg_temp_free_ptr(t_n); + return true; +} + +static bool trans_ZIP1_p(DisasContext *s, arg_rrr_esz *a, uint32_t insn) +{ + return do_perm_pred3(s, a, 0, gen_helper_sve_zip_p); +} + +static bool trans_ZIP2_p(DisasContext *s, arg_rrr_esz *a, uint32_t insn) +{ + return do_perm_pred3(s, a, 1, gen_helper_sve_zip_p); +} + +static bool trans_UZP1_p(DisasContext *s, arg_rrr_esz *a, uint32_t insn) +{ + return do_perm_pred3(s, a, 0, gen_helper_sve_uzp_p); +} + +static bool trans_UZP2_p(DisasContext *s, arg_rrr_esz *a, uint32_t insn) +{ + return do_perm_pred3(s, a, 1, gen_helper_sve_uzp_p); +} + +static bool trans_TRN1_p(DisasContext *s, arg_rrr_esz *a, uint32_t insn) +{ + return do_perm_pred3(s, a, 0, gen_helper_sve_trn_p); +} + +static bool trans_TRN2_p(DisasContext *s, arg_rrr_esz *a, uint32_t insn) +{ + return do_perm_pred3(s, a, 1, gen_helper_sve_trn_p); +} + +static bool trans_REV_p(DisasContext *s, arg_rr_esz *a, uint32_t insn) +{ + return do_perm_pred2(s, a, 0, gen_helper_sve_rev_p); +} + +static bool trans_PUNPKLO(DisasContext *s, arg_PUNPKLO *a, uint32_t insn) +{ + return do_perm_pred2(s, a, 0, gen_helper_sve_punpk_p); +} + +static bool trans_PUNPKHI(DisasContext *s, arg_PUNPKHI *a, uint32_t insn) +{ + return do_perm_pred2(s, a, 1, gen_helper_sve_punpk_p); +} + /* *** SVE Memory - 32-bit Gather and Unsized Contiguous Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 7ffd7962c8..26fe1608c4 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -86,6 +86,7 @@ # Three operand, vector element size @rd_rn_rm ........ esz:2 . rm:5 ... ... rn:5 rd:5 &rrr_esz +@pd_pn_pm ........ esz:2 .. rm:4 ....... rn:4 . rd:4 &rrr_esz @rdn_rm ........ esz:2 ...... ...... 
rm:5 rd:5 \ &rrr_esz rn=%reg_movprfx @@ -396,6 +397,23 @@ TBL 00000101 .. 1 ..... 001100 ..... ..... @rd_rn_rm # SVE unpack vector elements UNPK 00000101 esz:2 1100 u:1 h:1 001110 rn:5 rd:5 +### SVE Permute - Predicates Group + +# SVE permute predicate elements +ZIP1_p 00000101 .. 10 .... 010 000 0 .... 0 .... @pd_pn_pm +ZIP2_p 00000101 .. 10 .... 010 001 0 .... 0 .... @pd_pn_pm +UZP1_p 00000101 .. 10 .... 010 010 0 .... 0 .... @pd_pn_pm +UZP2_p 00000101 .. 10 .... 010 011 0 .... 0 .... @pd_pn_pm +TRN1_p 00000101 .. 10 .... 010 100 0 .... 0 .... @pd_pn_pm +TRN2_p 00000101 .. 10 .... 010 101 0 .... 0 .... @pd_pn_pm + +# SVE reverse predicate elements +REV_p 00000101 .. 11 0100 010 000 0 .... 0 .... @pd_pn + +# SVE unpack predicate elements +PUNPKLO 00000101 00 11 0000 010 000 0 .... 0 .... @pd_pn_e0 +PUNPKHI 00000101 00 11 0001 010 000 0 .... 0 .... @pd_pn_e0 + ### SVE Predicate Logical Operations Group # SVE predicate logical operations From patchwork Wed Jun 13 01:56:27 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 138387 Delivered-To: patch@linaro.org Received: by 2002:a2e:970d:0:0:0:0:0 with SMTP id r13-v6csp98473lji; Tue, 12 Jun 2018 19:00:23 -0700 (PDT) X-Google-Smtp-Source: ADUXVKLdMNWud3d5vJ+85txeJ8+L0p8C8ot2EBTtOg7ffkvVFpxym/Jn2MADKh85K5JoBUlfLzcR X-Received: by 2002:a37:b506:: with SMTP id e6-v6mr2721046qkf.80.1528855223594; Tue, 12 Jun 2018 19:00:23 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1528855223; cv=none; d=google.com; s=arc-20160816; b=PpJV8RHgnZBc+AJQvzyevDnUvjsjwIGWDoXeavd/uSyfVx0/EllAI8ejaGI2AZ0TPR DqZ5nS8kc/p+xYZwKnXXv26qVxcbqh2hh+lhPQE4Vj4/bQ9U+/b3UGBDmLbOibPu7ieJ 1fBgXqAGPZrr4m3QtHIbSmTJ3cB8jQVzPkXv/rozYmF8qhUxUZNm4Ujv6tlA+4G3M8EC QNGcm2ReHOoyy2yd+iXOiz88RYVSpcqhSFLWR3r+skJ8kf26NoLZysLTpG/y3uLCTAt0 FPMb+Y+Mqyrx4Ea4MZSBeAZylNn6CydaBBur9F6H7zsBXK+uhrWEMutMXan6i061+scl sY+g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; 
d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature:arc-authentication-results; bh=ANk7IyD8I0Q+Ruy1oxuydyXHODwQqVjYqLQEOtCCTWI=; b=HRKjsTkYLnzlWx04MzCJrckL3+AMAv7aeQx076PjNNKLWbsbr69vDjo4wl79YVCbtu 3COyDjAFwZ40lTh7kxveTfXoxH+9NHdjtyiZMYPHLBVpQe0fsjbClnKtHzVEe0ceJCA9 XIxk1E+WX0Ei4upm4Ds0P589srV8vxsRimlf7kFZ1VQD5/NwMPVWWywSUNaIWop8C0vv GeKbq2gjEFg+SmouUGVG5hAkzxkLTTywHSFxoZehBRx57oztySXPHDU8BV/Es3WFV3HQ jrRAewZOmA+lvfppw+CTzuvRlnJgJ+XuJIw4Ts/CvgrJz6Ir3lfdyv5ebwTf5tw8YlNy R0ig== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=FwJLCQiC; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. 
[2001:4830:134:3::11]) by mx.google.com with ESMTPS id s123-v6si1598071qkd.40.2018.06.12.19.00.23 for (version=TLS1 cipher=AES128-SHA bits=128/128); Tue, 12 Jun 2018 19:00:23 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) client-ip=2001:4830:134:3::11; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=FwJLCQiC; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:59274 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fSv5D-0004e4-3g for patch@linaro.org; Tue, 12 Jun 2018 22:00:23 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:44015) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fSv21-0002DA-At for qemu-devel@nongnu.org; Tue, 12 Jun 2018 21:57:07 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fSv1x-0006EG-CU for qemu-devel@nongnu.org; Tue, 12 Jun 2018 21:57:02 -0400 Received: from mail-pl0-x234.google.com ([2607:f8b0:400e:c01::234]:37062) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fSv1x-0006Dz-3V for qemu-devel@nongnu.org; Tue, 12 Jun 2018 21:57:01 -0400 Received: by mail-pl0-x234.google.com with SMTP id 31-v6so554533plc.4 for ; Tue, 12 Jun 2018 18:57:00 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=ANk7IyD8I0Q+Ruy1oxuydyXHODwQqVjYqLQEOtCCTWI=; b=FwJLCQiCqmgFuN7fHrg+lGZN4n0lAONvXm4G3/y/IUlnocV4xmweBmqC2CSjal0gVR gy1+lc4upK7daALUGoOWJK34MPFC/oXDhKf7N7F7Yk97hH7ybyS1g+eS/8Vt2P+4PHdE pZIRTUdt/58YOqjjno8ciiTjYiu47UmZsNFL8= 
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=ANk7IyD8I0Q+Ruy1oxuydyXHODwQqVjYqLQEOtCCTWI=; b=InVpgxpPFoHjtbC4eVDacq/K+NAII8bRXNdqGAVa/yqovzRPlicXzXiry5VEZbQ56O x+uYfAqVGigBN0nK4RBYdwVuIgbqv+yOYdOf8qDEkzJarLqN+CEc2CGD/x8a5aOTxzBJ 2j4CYpvcIO+q7s1Sh9++tzxBYF+hlfLpfTNZjcZbId2upYznm4mCcWM58wzaYE0z9uQF I+Jkh/afvl4KE8KOea+FFnDuNUIMEb/AGrkpGQpEhwPWABodmZAXvmW0uuNzY337ZVJ1 wxmxYG+d3Ox2U+FJvUM2UmLNDqwagEP+mgkzWNjkOu0QCxfkYL6e+I5G5Hu3F5I94YdT jI+g== X-Gm-Message-State: APt69E2gfksUR8j4DlR7JjOjA/e0z0Zosexe1qXLEcq83QuzXbv/6/u5 aiQ29/qFWlD8hZNFyD2pks3mXoVq1H0= X-Received: by 2002:a17:902:8685:: with SMTP id g5-v6mr2992980plo.180.1528855019724; Tue, 12 Jun 2018 18:56:59 -0700 (PDT) Received: from cloudburst.twiddle.net (rrcs-173-198-77-219.west.biz.rr.com. [173.198.77.219]) by smtp.gmail.com with ESMTPSA id g10-v6sm1647287pfi.148.2018.06.12.18.56.57 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Tue, 12 Jun 2018 18:56:58 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Tue, 12 Jun 2018 15:56:27 -1000 Message-Id: <20180613015641.5667-5-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180613015641.5667-1-richard.henderson@linaro.org> References: <20180613015641.5667-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. 
Subject: [Qemu-devel] [PATCH v4b 04/18] target/arm: Implement SVE Permute - Interleaving Group
Cc: peter.maydell@linaro.org

Reviewed-by: Peter Maydell
Signed-off-by: Richard Henderson
---
 target/arm/helper-sve.h    | 15 ++++++++
 target/arm/sve_helper.c    | 72 ++++++++++++++++++++++++++++++++++++
 target/arm/translate-sve.c | 75 ++++++++++++++++++++++++++++++++++++++
 target/arm/sve.decode      | 10 +++++
 4 files changed, 172 insertions(+)

--
2.17.1

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index ff958fcebd..bab20345c6 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -445,6 +445,21 @@ DEF_HELPER_FLAGS_4(sve_trn_p, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_3(sve_rev_p, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 DEF_HELPER_FLAGS_3(sve_punpk_p, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_4(sve_zip_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_zip_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_zip_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_zip_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(sve_uzp_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_uzp_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_uzp_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_uzp_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(sve_trn_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_trn_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_trn_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_trn_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
 DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index f4d49d4aff..f114e9ab63 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -1964,3 +1964,75 @@ void HELPER(sve_punpk_p)(void *vd, void *vn, uint32_t pred_desc)
         }
     }
 }
+
+#define DO_ZIP(NAME, TYPE, H) \
+void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \
+{ \
+    intptr_t oprsz = simd_oprsz(desc); \
+    intptr_t i, oprsz_2 = oprsz / 2; \
+    ARMVectorReg tmp_n, tmp_m; \
+    /* We produce output faster than we consume input. \
+       Therefore we must be mindful of possible overlap. */ \
+    if (unlikely((vn - vd) < (uintptr_t)oprsz)) { \
+        vn = memcpy(&tmp_n, vn, oprsz_2); \
+    } \
+    if (unlikely((vm - vd) < (uintptr_t)oprsz)) { \
+        vm = memcpy(&tmp_m, vm, oprsz_2); \
+    } \
+    for (i = 0; i < oprsz_2; i += sizeof(TYPE)) { \
+        *(TYPE *)(vd + H(2 * i + 0)) = *(TYPE *)(vn + H(i)); \
+        *(TYPE *)(vd + H(2 * i + sizeof(TYPE))) = *(TYPE *)(vm + H(i)); \
+    } \
+}
+
+DO_ZIP(sve_zip_b, uint8_t, H1)
+DO_ZIP(sve_zip_h, uint16_t, H1_2)
+DO_ZIP(sve_zip_s, uint32_t, H1_4)
+DO_ZIP(sve_zip_d, uint64_t, )
+
+#define DO_UZP(NAME, TYPE, H) \
+void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \
+{ \
+    intptr_t oprsz = simd_oprsz(desc); \
+    intptr_t oprsz_2 = oprsz / 2; \
+    intptr_t odd_ofs = simd_data(desc); \
+    intptr_t i; \
+    ARMVectorReg tmp_m; \
+    if (unlikely((vm - vd) < (uintptr_t)oprsz)) { \
+        vm = memcpy(&tmp_m, vm, oprsz); \
+    } \
+    for (i = 0; i < oprsz_2; i += sizeof(TYPE)) { \
+        *(TYPE *)(vd + H(i)) = *(TYPE *)(vn + H(2 * i + odd_ofs)); \
+    } \
+    for (i = 0; i < oprsz_2; i += sizeof(TYPE)) { \
+        *(TYPE *)(vd + H(oprsz_2 + i)) = *(TYPE *)(vm + H(2 * i + odd_ofs)); \
+    } \
+}
+
+DO_UZP(sve_uzp_b, uint8_t, H1)
+DO_UZP(sve_uzp_h, uint16_t, H1_2)
+DO_UZP(sve_uzp_s, uint32_t, H1_4)
+DO_UZP(sve_uzp_d, uint64_t, )
+
+#define DO_TRN(NAME, TYPE, H) \
+void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \
+{ \
+    intptr_t oprsz = simd_oprsz(desc); \
+    intptr_t odd_ofs = simd_data(desc); \
+    intptr_t i; \
+    for (i = 0; i < oprsz; i += 2 * sizeof(TYPE)) { \
+        TYPE ae = *(TYPE *)(vn + H(i + odd_ofs)); \
+        TYPE be = *(TYPE *)(vm + H(i + odd_ofs)); \
+        *(TYPE *)(vd + H(i + 0)) = ae; \
+        *(TYPE *)(vd + H(i + sizeof(TYPE))) = be; \
+    } \
+}
+
+DO_TRN(sve_trn_b, uint8_t, H1)
+DO_TRN(sve_trn_h, uint16_t, H1_2)
+DO_TRN(sve_trn_s, uint32_t, H1_4)
+DO_TRN(sve_trn_d, uint64_t, )
+
+#undef DO_ZIP
+#undef DO_UZP
+#undef DO_TRN
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index 0160d06915..21319518d7 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -2209,6 +2209,81 @@ static bool trans_PUNPKHI(DisasContext *s, arg_PUNPKHI *a, uint32_t insn)
     return do_perm_pred2(s, a, 1, gen_helper_sve_punpk_p);
 }
 
+/*
+ *** SVE Permute - Interleaving Group
+ */
+
+static bool do_zip(DisasContext *s, arg_rrr_esz *a, bool high)
+{
+    static gen_helper_gvec_3 * const fns[4] = {
+        gen_helper_sve_zip_b, gen_helper_sve_zip_h,
+        gen_helper_sve_zip_s, gen_helper_sve_zip_d,
+    };
+
+    if (sve_access_check(s)) {
+        unsigned vsz = vec_full_reg_size(s);
+        unsigned high_ofs = high ? vsz / 2 : 0;
+        tcg_gen_gvec_3_ool(vec_full_reg_offset(s, a->rd),
+                           vec_full_reg_offset(s, a->rn) + high_ofs,
+                           vec_full_reg_offset(s, a->rm) + high_ofs,
+                           vsz, vsz, 0, fns[a->esz]);
+    }
+    return true;
+}
+
+static bool do_zzz_data_ool(DisasContext *s, arg_rrr_esz *a, int data,
+                            gen_helper_gvec_3 *fn)
+{
+    if (sve_access_check(s)) {
+        unsigned vsz = vec_full_reg_size(s);
+        tcg_gen_gvec_3_ool(vec_full_reg_offset(s, a->rd),
+                           vec_full_reg_offset(s, a->rn),
+                           vec_full_reg_offset(s, a->rm),
+                           vsz, vsz, data, fn);
+    }
+    return true;
+}
+
+static bool trans_ZIP1_z(DisasContext *s, arg_rrr_esz *a, uint32_t insn)
+{
+    return do_zip(s, a, false);
+}
+
+static bool trans_ZIP2_z(DisasContext *s, arg_rrr_esz *a, uint32_t insn)
+{
+    return do_zip(s, a, true);
+}
+
+static gen_helper_gvec_3 * const uzp_fns[4] = {
+    gen_helper_sve_uzp_b, gen_helper_sve_uzp_h,
+    gen_helper_sve_uzp_s, gen_helper_sve_uzp_d,
+};
+
+static bool trans_UZP1_z(DisasContext *s, arg_rrr_esz *a, uint32_t insn)
+{
+    return do_zzz_data_ool(s, a, 0, uzp_fns[a->esz]);
+}
+
+static bool trans_UZP2_z(DisasContext *s, arg_rrr_esz *a, uint32_t insn)
+{
+    return do_zzz_data_ool(s, a, 1 << a->esz, uzp_fns[a->esz]);
+}
+
+static gen_helper_gvec_3 * const trn_fns[4] = {
+    gen_helper_sve_trn_b, gen_helper_sve_trn_h,
+    gen_helper_sve_trn_s, gen_helper_sve_trn_d,
+};
+
+static bool trans_TRN1_z(DisasContext *s, arg_rrr_esz *a, uint32_t insn)
+{
+    return do_zzz_data_ool(s, a, 0, trn_fns[a->esz]);
+}
+
+static bool trans_TRN2_z(DisasContext *s, arg_rrr_esz *a, uint32_t insn)
+{
+    return do_zzz_data_ool(s, a, 1 << a->esz, trn_fns[a->esz]);
+}
+
 /*
  *** SVE Memory - 32-bit Gather and Unsized Contiguous Group
  */
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index 26fe1608c4..df2b94dc0a 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -414,6 +414,16 @@ REV_p 00000101 .. 11 0100 010 000 0 .... 0 .... @pd_pn
 PUNPKLO         00000101 00 11 0000 010 000 0 .... 0 ....       @pd_pn_e0
 PUNPKHI         00000101 00 11 0001 010 000 0 .... 0 ....       @pd_pn_e0
 
+### SVE Permute - Interleaving Group
+
+# SVE permute vector elements
+ZIP1_z          00000101 .. 1 ..... 011 000 ..... .....         @rd_rn_rm
+ZIP2_z          00000101 .. 1 ..... 011 001 ..... .....         @rd_rn_rm
+UZP1_z          00000101 .. 1 ..... 011 010 ..... .....         @rd_rn_rm
+UZP2_z          00000101 .. 1 ..... 011 011 ..... .....         @rd_rn_rm
+TRN1_z          00000101 .. 1 ..... 011 100 ..... .....         @rd_rn_rm
+TRN2_z          00000101 .. 1 ..... 011 101 ..... .....         @rd_rn_rm
+
 ### SVE Predicate Logical Operations Group
 
 # SVE predicate logical operations

From patchwork Wed Jun 13 01:56:28 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 138391
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Tue, 12 Jun 2018 15:56:28 -1000
Message-Id: <20180613015641.5667-6-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20180613015641.5667-1-richard.henderson@linaro.org>
References: <20180613015641.5667-1-richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH v4b 05/18] target/arm: Implement SVE compress active elements
Cc: peter.maydell@linaro.org

Reviewed-by: Peter Maydell
Signed-off-by: Richard Henderson
---
 target/arm/helper-sve.h    |  3 +++
 target/arm/sve_helper.c    | 34 ++++++++++++++++++++++++++++++++++
 target/arm/translate-sve.c | 12 ++++++++++++
 target/arm/sve.decode      |  6 ++++++
 4 files changed, 55 insertions(+)

--
2.17.1

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index bab20345c6..d977aea00d 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -460,6 +460,9 @@ DEF_HELPER_FLAGS_4(sve_trn_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(sve_trn_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(sve_trn_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_4(sve_compact_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_compact_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
 DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index f114e9ab63..ed3c6d4ca9 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -2036,3 +2036,37 @@ DO_TRN(sve_trn_d, uint64_t, )
 #undef DO_ZIP
 #undef DO_UZP
 #undef DO_TRN
+
+void HELPER(sve_compact_s)(void *vd, void *vn, void *vg, uint32_t desc)
+{
+    intptr_t i, j, opr_sz = simd_oprsz(desc) / 4;
+    uint32_t *d = vd, *n = vn;
+    uint8_t *pg = vg;
+
+    for (i = j = 0; i < opr_sz; i++) {
+        if (pg[H1(i / 2)] & (i & 1 ? 0x10 : 0x01)) {
+            d[H4(j)] = n[H4(i)];
+            j++;
+        }
+    }
+    for (; j < opr_sz; j++) {
+        d[H4(j)] = 0;
+    }
+}
+
+void HELPER(sve_compact_d)(void *vd, void *vn, void *vg, uint32_t desc)
+{
+    intptr_t i, j, opr_sz = simd_oprsz(desc) / 8;
+    uint64_t *d = vd, *n = vn;
+    uint8_t *pg = vg;
+
+    for (i = j = 0; i < opr_sz; i++) {
+        if (pg[H1(i)] & 1) {
+            d[j] = n[i];
+            j++;
+        }
+    }
+    for (; j < opr_sz; j++) {
+        d[j] = 0;
+    }
+}
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index 21319518d7..ed0f48a927 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -2284,6 +2284,18 @@ static bool trans_TRN2_z(DisasContext *s, arg_rrr_esz *a, uint32_t insn)
     return do_zzz_data_ool(s, a, 1 << a->esz, trn_fns[a->esz]);
 }
 
+/*
+ *** SVE Permute Vector - Predicated Group
+ */
+
+static bool trans_COMPACT(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    static gen_helper_gvec_3 * const fns[4] = {
+        NULL, NULL, gen_helper_sve_compact_s, gen_helper_sve_compact_d
+    };
+    return do_zpz_ool(s, a, fns[a->esz]);
+}
+
 /*
  *** SVE Memory - 32-bit Gather and Unsized Contiguous Group
  */
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index df2b94dc0a..9da6566d32 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -424,6 +424,12 @@ UZP2_z          00000101 .. 1 ..... 011 011 ..... .....         @rd_rn_rm
 TRN1_z          00000101 .. 1 ..... 011 100 ..... .....         @rd_rn_rm
 TRN2_z          00000101 .. 1 ..... 011 101 ..... .....         @rd_rn_rm
 
+### SVE Permute - Predicated Group
+
+# SVE compress active elements
+# Note esz >= 2
+COMPACT         00000101 .. 100001 100 ... ..... .....          @rd_pg_rn
+
 ### SVE Predicate Logical Operations Group
 
 # SVE predicate logical operations

From patchwork Wed Jun 13 01:56:29 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 138392
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Tue, 12 Jun 2018 15:56:29 -1000
Message-Id: <20180613015641.5667-7-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20180613015641.5667-1-richard.henderson@linaro.org>
References: <20180613015641.5667-1-richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH v4b 06/18] target/arm: Implement SVE conditionally broadcast/extract element
Cc: peter.maydell@linaro.org

Reviewed-by: Peter Maydell
Signed-off-by: Richard Henderson
---
 target/arm/helper-sve.h    |   2 +
 target/arm/sve_helper.c    |  12 ++
 target/arm/translate-sve.c | 328 +++++++++++++++++++++++++++++++++++++
 target/arm/sve.decode      |  20 +++
 4 files changed, 362 insertions(+)

-- 
2.17.1

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index d977aea00d..a58fb4ba01 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -463,6 +463,8 @@ DEF_HELPER_FLAGS_4(sve_trn_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(sve_compact_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(sve_compact_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_2(sve_last_active_element, TCG_CALL_NO_RWG, s32, ptr, i32)
+
 DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index ed3c6d4ca9..cb7d101b57 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -2070,3 +2070,15 @@ void HELPER(sve_compact_d)(void *vd, void *vn, void *vg, uint32_t desc)
         d[j] = 0;
     }
 }
+
+/* Similar to the ARM LastActiveElement pseudocode function, except the
+ * result is multiplied by the element size. This includes the not found
+ * indication; e.g. not found for esz=3 is -8.
+ */
+int32_t HELPER(sve_last_active_element)(void *vg, uint32_t pred_desc)
+{
+    intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2;
+    intptr_t esz = extract32(pred_desc, SIMD_DATA_SHIFT, 2);
+
+    return last_active_element(vg, DIV_ROUND_UP(oprsz, 8), esz);
+}
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index ed0f48a927..feb4c09f1b 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -2296,6 +2296,334 @@ static bool trans_COMPACT(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
     return do_zpz_ool(s, a, fns[a->esz]);
 }
 
+/* Call the helper that computes the ARM LastActiveElement pseudocode
+ * function, scaled by the element size. This includes the not found
+ * indication; e.g. not found for esz=3 is -8.
+ */
+static void find_last_active(DisasContext *s, TCGv_i32 ret, int esz, int pg)
+{
+    /* Predicate sizes may be smaller and cannot use simd_desc. We cannot
+     * round up, as we do elsewhere, because we need the exact size.
+     */
+    TCGv_ptr t_p = tcg_temp_new_ptr();
+    TCGv_i32 t_desc;
+    unsigned vsz = pred_full_reg_size(s);
+    unsigned desc;
+
+    desc = vsz - 2;
+    desc = deposit32(desc, SIMD_DATA_SHIFT, 2, esz);
+
+    tcg_gen_addi_ptr(t_p, cpu_env, pred_full_reg_offset(s, pg));
+    t_desc = tcg_const_i32(desc);
+
+    gen_helper_sve_last_active_element(ret, t_p, t_desc);
+
+    tcg_temp_free_i32(t_desc);
+    tcg_temp_free_ptr(t_p);
+}
+
+/* Increment LAST to the offset of the next element in the vector,
+ * wrapping around to 0.
+ */
+static void incr_last_active(DisasContext *s, TCGv_i32 last, int esz)
+{
+    unsigned vsz = vec_full_reg_size(s);
+
+    tcg_gen_addi_i32(last, last, 1 << esz);
+    if (is_power_of_2(vsz)) {
+        tcg_gen_andi_i32(last, last, vsz - 1);
+    } else {
+        TCGv_i32 max = tcg_const_i32(vsz);
+        TCGv_i32 zero = tcg_const_i32(0);
+        tcg_gen_movcond_i32(TCG_COND_GEU, last, last, max, zero, last);
+        tcg_temp_free_i32(max);
+        tcg_temp_free_i32(zero);
+    }
+}
+
+/* If LAST < 0, set LAST to the offset of the last element in the vector. */
+static void wrap_last_active(DisasContext *s, TCGv_i32 last, int esz)
+{
+    unsigned vsz = vec_full_reg_size(s);
+
+    if (is_power_of_2(vsz)) {
+        tcg_gen_andi_i32(last, last, vsz - 1);
+    } else {
+        TCGv_i32 max = tcg_const_i32(vsz - (1 << esz));
+        TCGv_i32 zero = tcg_const_i32(0);
+        tcg_gen_movcond_i32(TCG_COND_LT, last, last, zero, max, last);
+        tcg_temp_free_i32(max);
+        tcg_temp_free_i32(zero);
+    }
+}
+
+/* Load an unsigned element of ESZ from BASE+OFS. */
+static TCGv_i64 load_esz(TCGv_ptr base, int ofs, int esz)
+{
+    TCGv_i64 r = tcg_temp_new_i64();
+
+    switch (esz) {
+    case 0:
+        tcg_gen_ld8u_i64(r, base, ofs);
+        break;
+    case 1:
+        tcg_gen_ld16u_i64(r, base, ofs);
+        break;
+    case 2:
+        tcg_gen_ld32u_i64(r, base, ofs);
+        break;
+    case 3:
+        tcg_gen_ld_i64(r, base, ofs);
+        break;
+    default:
+        g_assert_not_reached();
+    }
+    return r;
+}
+
+/* Load an unsigned element of ESZ from RM[LAST]. */
+static TCGv_i64 load_last_active(DisasContext *s, TCGv_i32 last,
+                                 int rm, int esz)
+{
+    TCGv_ptr p = tcg_temp_new_ptr();
+    TCGv_i64 r;
+
+    /* Convert offset into vector into offset into ENV.
+     * The final adjustment for the vector register base
+     * is added via constant offset to the load.
+     */
+#ifdef HOST_WORDS_BIGENDIAN
+    /* Adjust for element ordering. See vec_reg_offset. */
+    if (esz < 3) {
+        tcg_gen_xori_i32(last, last, 8 - (1 << esz));
+    }
+#endif
+    tcg_gen_ext_i32_ptr(p, last);
+    tcg_gen_add_ptr(p, p, cpu_env);
+
+    r = load_esz(p, vec_full_reg_offset(s, rm), esz);
+    tcg_temp_free_ptr(p);
+
+    return r;
+}
+
+/* Compute CLAST for a Zreg. */
+static bool do_clast_vector(DisasContext *s, arg_rprr_esz *a, bool before)
+{
+    TCGv_i32 last;
+    TCGLabel *over;
+    TCGv_i64 ele;
+    unsigned vsz, esz = a->esz;
+
+    if (!sve_access_check(s)) {
+        return true;
+    }
+
+    last = tcg_temp_local_new_i32();
+    over = gen_new_label();
+
+    find_last_active(s, last, esz, a->pg);
+
+    /* There is of course no movcond for a 2048-bit vector,
+     * so we must branch over the actual store.
+     */
+    tcg_gen_brcondi_i32(TCG_COND_LT, last, 0, over);
+
+    if (!before) {
+        incr_last_active(s, last, esz);
+    }
+
+    ele = load_last_active(s, last, a->rm, esz);
+    tcg_temp_free_i32(last);
+
+    vsz = vec_full_reg_size(s);
+    tcg_gen_gvec_dup_i64(esz, vec_full_reg_offset(s, a->rd), vsz, vsz, ele);
+    tcg_temp_free_i64(ele);
+
+    /* If this insn used MOVPRFX, we may need a second move. */
+    if (a->rd != a->rn) {
+        TCGLabel *done = gen_new_label();
+        tcg_gen_br(done);
+
+        gen_set_label(over);
+        do_mov_z(s, a->rd, a->rn);
+
+        gen_set_label(done);
+    } else {
+        gen_set_label(over);
+    }
+    return true;
+}
+
+static bool trans_CLASTA_z(DisasContext *s, arg_rprr_esz *a, uint32_t insn)
+{
+    return do_clast_vector(s, a, false);
+}
+
+static bool trans_CLASTB_z(DisasContext *s, arg_rprr_esz *a, uint32_t insn)
+{
+    return do_clast_vector(s, a, true);
+}
+
+/* Compute CLAST for a scalar. */
+static void do_clast_scalar(DisasContext *s, int esz, int pg, int rm,
+                            bool before, TCGv_i64 reg_val)
+{
+    TCGv_i32 last = tcg_temp_new_i32();
+    TCGv_i64 ele, cmp, zero;
+
+    find_last_active(s, last, esz, pg);
+
+    /* Extend the original value of last prior to incrementing. */
+    cmp = tcg_temp_new_i64();
+    tcg_gen_ext_i32_i64(cmp, last);
+
+    if (!before) {
+        incr_last_active(s, last, esz);
+    }
+
+    /* The conceit here is that while last < 0 indicates not found, after
+     * adjusting for cpu_env->vfp.zregs[rm], it is still a valid address
+     * from which we can load garbage. We then discard the garbage with
+     * a conditional move.
+     */
+    ele = load_last_active(s, last, rm, esz);
+    tcg_temp_free_i32(last);
+
+    zero = tcg_const_i64(0);
+    tcg_gen_movcond_i64(TCG_COND_GE, reg_val, cmp, zero, ele, reg_val);
+
+    tcg_temp_free_i64(zero);
+    tcg_temp_free_i64(cmp);
+    tcg_temp_free_i64(ele);
+}
+
+/* Compute CLAST for a Vreg. */
+static bool do_clast_fp(DisasContext *s, arg_rpr_esz *a, bool before)
+{
+    if (sve_access_check(s)) {
+        int esz = a->esz;
+        int ofs = vec_reg_offset(s, a->rd, 0, esz);
+        TCGv_i64 reg = load_esz(cpu_env, ofs, esz);
+
+        do_clast_scalar(s, esz, a->pg, a->rn, before, reg);
+        write_fp_dreg(s, a->rd, reg);
+        tcg_temp_free_i64(reg);
+    }
+    return true;
+}
+
+static bool trans_CLASTA_v(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    return do_clast_fp(s, a, false);
+}
+
+static bool trans_CLASTB_v(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    return do_clast_fp(s, a, true);
+}
+
+/* Compute CLAST for a Xreg. */
+static bool do_clast_general(DisasContext *s, arg_rpr_esz *a, bool before)
+{
+    TCGv_i64 reg;
+
+    if (!sve_access_check(s)) {
+        return true;
+    }
+
+    reg = cpu_reg(s, a->rd);
+    switch (a->esz) {
+    case 0:
+        tcg_gen_ext8u_i64(reg, reg);
+        break;
+    case 1:
+        tcg_gen_ext16u_i64(reg, reg);
+        break;
+    case 2:
+        tcg_gen_ext32u_i64(reg, reg);
+        break;
+    case 3:
+        break;
+    default:
+        g_assert_not_reached();
+    }
+
+    do_clast_scalar(s, a->esz, a->pg, a->rn, before, reg);
+    return true;
+}
+
+static bool trans_CLASTA_r(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    return do_clast_general(s, a, false);
+}
+
+static bool trans_CLASTB_r(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    return do_clast_general(s, a, true);
+}
+
+/* Compute LAST for a scalar. */
+static TCGv_i64 do_last_scalar(DisasContext *s, int esz,
+                               int pg, int rm, bool before)
+{
+    TCGv_i32 last = tcg_temp_new_i32();
+    TCGv_i64 ret;
+
+    find_last_active(s, last, esz, pg);
+    if (before) {
+        wrap_last_active(s, last, esz);
+    } else {
+        incr_last_active(s, last, esz);
+    }
+
+    ret = load_last_active(s, last, rm, esz);
+    tcg_temp_free_i32(last);
+    return ret;
+}
+
+/* Compute LAST for a Vreg. */
+static bool do_last_fp(DisasContext *s, arg_rpr_esz *a, bool before)
+{
+    if (sve_access_check(s)) {
+        TCGv_i64 val = do_last_scalar(s, a->esz, a->pg, a->rn, before);
+        write_fp_dreg(s, a->rd, val);
+        tcg_temp_free_i64(val);
+    }
+    return true;
+}
+
+static bool trans_LASTA_v(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    return do_last_fp(s, a, false);
+}
+
+static bool trans_LASTB_v(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    return do_last_fp(s, a, true);
+}
+
+/* Compute LAST for a Xreg. */
+static bool do_last_general(DisasContext *s, arg_rpr_esz *a, bool before)
+{
+    if (sve_access_check(s)) {
+        TCGv_i64 val = do_last_scalar(s, a->esz, a->pg, a->rn, before);
+        tcg_gen_mov_i64(cpu_reg(s, a->rd), val);
+        tcg_temp_free_i64(val);
+    }
+    return true;
+}
+
+static bool trans_LASTA_r(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    return do_last_general(s, a, false);
+}
+
+static bool trans_LASTB_r(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    return do_last_general(s, a, true);
+}
+
 /*
  *** SVE Memory - 32-bit Gather and Unsized Contiguous Group
  */
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index 9da6566d32..1226867f69 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -430,6 +430,26 @@ TRN2_z          00000101 .. 1 ..... 011 101 ..... .....         @rd_rn_rm
 # Note esz >= 2
 COMPACT         00000101 .. 100001 100 ... ..... .....          @rd_pg_rn
 
+# SVE conditionally broadcast element to vector
+CLASTA_z        00000101 .. 10100 0 100 ... ..... .....         @rdn_pg_rm
+CLASTB_z        00000101 .. 10100 1 100 ... ..... .....         @rdn_pg_rm
+
+# SVE conditionally copy element to SIMD&FP scalar
+CLASTA_v        00000101 .. 10101 0 100 ... ..... .....         @rd_pg_rn
+CLASTB_v        00000101 .. 10101 1 100 ... ..... .....         @rd_pg_rn
+
+# SVE conditionally copy element to general register
+CLASTA_r        00000101 .. 11000 0 101 ... ..... .....         @rd_pg_rn
+CLASTB_r        00000101 .. 11000 1 101 ... ..... .....         @rd_pg_rn
+
+# SVE copy element to SIMD&FP scalar register
+LASTA_v         00000101 .. 10001 0 100 ... ..... .....         @rd_pg_rn
+LASTB_v         00000101 .. 10001 1 100 ... ..... .....         @rd_pg_rn
+
+# SVE copy element to general register
+LASTA_r         00000101 .. 10000 0 101 ... ..... .....         @rd_pg_rn
+LASTB_r         00000101 .. 10000 1 101 ... ..... .....         @rd_pg_rn
+
 ### SVE Predicate Logical Operations Group
 
 # SVE predicate logical operations

From patchwork Wed Jun 13 01:56:30 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 138383
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Tue, 12 Jun 2018 15:56:30 -1000
Message-Id: <20180613015641.5667-8-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20180613015641.5667-1-richard.henderson@linaro.org>
References: <20180613015641.5667-1-richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH v4b 07/18] target/arm: Implement SVE copy to vector (predicated)
Cc: peter.maydell@linaro.org

Reviewed-by: Peter Maydell
Signed-off-by: Richard Henderson
---
 target/arm/translate-sve.c | 19 +++++++++++++++++++
 target/arm/sve.decode      |  6 ++++++
 2 files changed, 25 insertions(+)

-- 
2.17.1

diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index feb4c09f1b..eed59524a9 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -2624,6 +2624,25 @@ static bool trans_LASTB_r(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
     return do_last_general(s, a, true);
 }
 
+static bool trans_CPY_m_r(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    if (sve_access_check(s)) {
+        do_cpy_m(s, a->esz, a->rd, a->rd, a->pg, cpu_reg_sp(s, a->rn));
+    }
+    return true;
+}
+
+static bool trans_CPY_m_v(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    if (sve_access_check(s)) {
+        int ofs = vec_reg_offset(s, a->rn, 0, a->esz);
+        TCGv_i64 t = load_esz(cpu_env, ofs, a->esz);
+        do_cpy_m(s, a->esz, a->rd, a->rd, a->pg, t);
+        tcg_temp_free_i64(t);
+    }
+    return true;
+}
+
 /*
  *** SVE Memory - 32-bit Gather and Unsized Contiguous Group
  */
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index 1226867f69..519139f684 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -450,6 +450,12 @@ LASTB_v         00000101 .. 10001 1 100 ... ..... .....         @rd_pg_rn
 LASTA_r         00000101 .. 10000 0 101 ... ..... .....         @rd_pg_rn
 LASTB_r         00000101 .. 10000 1 101 ... ..... .....         @rd_pg_rn
 
+# SVE copy element from SIMD&FP scalar register
+CPY_m_v         00000101 .. 100000 100 ... ..... .....          @rd_pg_rn
+
+# SVE copy element from general register to vector (predicated)
+CPY_m_r         00000101 .. 101000 101 ... ..... .....          @rd_pg_rn
+
 ### SVE Predicate Logical Operations Group
 
 # SVE predicate logical operations

From patchwork Wed Jun 13 01:56:31 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 138394
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Tue, 12 Jun 2018 15:56:31 -1000
Message-Id: <20180613015641.5667-9-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20180613015641.5667-1-richard.henderson@linaro.org>
References: <20180613015641.5667-1-richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH v4b 08/18] target/arm: Implement SVE reverse within elements
Cc: peter.maydell@linaro.org

Reviewed-by: Peter Maydell
Signed-off-by: Richard Henderson
---
 target/arm/helper-sve.h    | 14 +++++++++++++
 target/arm/sve_helper.c    | 41 +++++++++++++++++++++++++++++++-------
 target/arm/translate-sve.c | 38 +++++++++++++++++++++++++++++++++++
 target/arm/sve.decode      |  7 +++++++
 4 files changed, 93 insertions(+), 7 deletions(-)

-- 
2.17.1

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index a58fb4ba01..3b7c54905d 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -465,6 +465,20 @@ DEF_HELPER_FLAGS_4(sve_compact_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 
 DEF_HELPER_FLAGS_2(sve_last_active_element, TCG_CALL_NO_RWG, s32, ptr, i32)
 
+DEF_HELPER_FLAGS_4(sve_revb_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_revb_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_revb_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(sve_revh_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_revh_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(sve_revw_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(sve_rbit_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_rbit_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_rbit_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve_rbit_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
 DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index cb7d101b57..4017b9eed1 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -238,6 +238,26 @@ static inline uint64_t expand_pred_s(uint8_t byte)
     return word[byte & 0x11];
 }
 
+/* Swap 16-bit words within a 32-bit word. */
+static inline uint32_t hswap32(uint32_t h)
+{
+    return rol32(h, 16);
+}
+
+/* Swap 16-bit words within a 64-bit word. */
+static inline uint64_t hswap64(uint64_t h)
+{
+    uint64_t m = 0x0000ffff0000ffffull;
+    h = rol64(h, 32);
+    return ((h & m) << 16) | ((h >> 16) & m);
+}
+
+/* Swap 32-bit words within a 64-bit word. */
+static inline uint64_t wswap64(uint64_t h)
+{
+    return rol64(h, 32);
+}
+
 #define LOGICAL_PPPP(NAME, FUNC) \
 void HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, uint32_t desc) \
 { \
@@ -616,6 +636,20 @@ DO_ZPZ(sve_neg_h, uint16_t, H1_2, DO_NEG)
 DO_ZPZ(sve_neg_s, uint32_t, H1_4, DO_NEG)
 DO_ZPZ_D(sve_neg_d, uint64_t, DO_NEG)
 
+DO_ZPZ(sve_revb_h, uint16_t, H1_2, bswap16)
+DO_ZPZ(sve_revb_s, uint32_t, H1_4, bswap32)
+DO_ZPZ_D(sve_revb_d, uint64_t, bswap64)
+
+DO_ZPZ(sve_revh_s, uint32_t, H1_4, hswap32)
+DO_ZPZ_D(sve_revh_d, uint64_t, hswap64)
+
+DO_ZPZ_D(sve_revw_d, uint64_t, wswap64)
+
+DO_ZPZ(sve_rbit_b, uint8_t, H1, revbit8)
+DO_ZPZ(sve_rbit_h, uint16_t, H1_2, revbit16)
+DO_ZPZ(sve_rbit_s, uint32_t, H1_4, revbit32)
+DO_ZPZ_D(sve_rbit_d, uint64_t, revbit64)
+
 /* Three-operand expander, unpredicated, in which the third operand is "wide".
  */
 #define DO_ZZW(NAME, TYPE, TYPEW, H, OP) \
@@ -1587,13 +1621,6 @@ void HELPER(sve_rev_b)(void *vd, void *vn, uint32_t desc)
     }
 }
 
-static inline uint64_t hswap64(uint64_t h)
-{
-    uint64_t m = 0x0000ffff0000ffffull;
-    h = rol64(h, 32);
-    return ((h & m) << 16) | ((h >> 16) & m);
-}
-
 void HELPER(sve_rev_h)(void *vd, void *vn, uint32_t desc)
 {
     intptr_t i, j, opr_sz = simd_oprsz(desc);
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index eed59524a9..f8d8cf1547 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -2643,6 +2643,44 @@ static bool trans_CPY_m_v(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
     return true;
 }
 
+static bool trans_REVB(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    static gen_helper_gvec_3 * const fns[4] = {
+        NULL,
+        gen_helper_sve_revb_h,
+        gen_helper_sve_revb_s,
+        gen_helper_sve_revb_d,
+    };
+    return do_zpz_ool(s, a, fns[a->esz]);
+}
+
+static bool trans_REVH(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    static gen_helper_gvec_3 * const fns[4] = {
+        NULL,
+        NULL,
+        gen_helper_sve_revh_s,
+        gen_helper_sve_revh_d,
+    };
+    return do_zpz_ool(s, a, fns[a->esz]);
+}
+
+static bool trans_REVW(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    return do_zpz_ool(s, a, a->esz == 3 ? gen_helper_sve_revw_d : NULL);
+}
+
+static bool trans_RBIT(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
+{
+    static gen_helper_gvec_3 * const fns[4] = {
+        gen_helper_sve_rbit_b,
+        gen_helper_sve_rbit_h,
+        gen_helper_sve_rbit_s,
+        gen_helper_sve_rbit_d,
+    };
+    return do_zpz_ool(s, a, fns[a->esz]);
+}
+
 /*
  *** SVE Memory - 32-bit Gather and Unsized Contiguous Group
  */
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index 519139f684..95eb4968a9 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -456,6 +456,13 @@ CPY_m_v 00000101 .. 100000 100 ... ..... ..... @rd_pg_rn
 # SVE copy element from general register to vector (predicated)
 CPY_m_r 00000101 .. 101000 101 ... ..... ..... @rd_pg_rn
 
+# SVE reverse within elements
+# Note esz >= operation size
+REVB 00000101 .. 1001 00 100 ... ..... ..... @rd_pg_rn
+REVH 00000101 .. 1001 01 100 ... ..... ..... @rd_pg_rn
+REVW 00000101 .. 1001 10 100 ... ..... ..... @rd_pg_rn
+RBIT 00000101 .. 1001 11 100 ... ..... ..... @rd_pg_rn
+
 ### SVE Predicate Logical Operations Group
 
 # SVE predicate logical operations

From patchwork Wed Jun 13 01:56:32 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 138389
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Tue, 12 Jun 2018 15:56:32 -1000
Message-Id: <20180613015641.5667-10-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20180613015641.5667-1-richard.henderson@linaro.org>
References: <20180613015641.5667-1-richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH v4b 09/18] target/arm: Implement SVE vector splice (predicated)
Cc: peter.maydell@linaro.org

Reviewed-by: Peter Maydell
Signed-off-by: Richard Henderson
---
 target/arm/helper-sve.h    |  2 ++
 target/arm/sve_helper.c    | 37 +++++++++++++++++++++++++++++++++++++
 target/arm/translate-sve.c | 13 +++++++++++++
 target/arm/sve.decode      |  3 +++
 4 files changed, 55 insertions(+)

-- 
2.17.1

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index 3b7c54905d..c3f8a2b502 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -479,6 +479,8 @@ DEF_HELPER_FLAGS_4(sve_rbit_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(sve_rbit_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(sve_rbit_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_5(sve_splice, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+
 DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index 4017b9eed1..8da7baad76 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -2109,3 +2109,40 @@ int32_t HELPER(sve_last_active_element)(void *vg, uint32_t pred_desc)
 
     return last_active_element(vg, DIV_ROUND_UP(oprsz, 8), esz);
 }
+
+void HELPER(sve_splice)(void *vd, void *vn, void *vm, void *vg, uint32_t desc)
+{
+    intptr_t opr_sz = simd_oprsz(desc) / 8;
+    int esz = simd_data(desc);
+    uint64_t pg, first_g, last_g, len, mask = pred_esz_masks[esz];
+    intptr_t i, first_i, last_i;
+    ARMVectorReg tmp;
+
+    first_i = last_i = 0;
+    first_g = last_g = 0;
+
+    /* Find the extent of the active elements within VG. */
+    for (i = QEMU_ALIGN_UP(opr_sz, 8) - 8; i >= 0; i -= 8) {
+        pg = *(uint64_t *)(vg + i) & mask;
+        if (pg) {
+            if (last_g == 0) {
+                last_g = pg;
+                last_i = i;
+            }
+            first_g = pg;
+            first_i = i;
+        }
+    }
+
+    len = 0;
+    if (first_g != 0) {
+        first_i = first_i * 8 + ctz64(first_g);
+        last_i = last_i * 8 + 63 - clz64(last_g);
+        len = last_i - first_i + (1 << esz);
+        if (vd == vm) {
+            vm = memcpy(&tmp, vm, opr_sz * 8);
+        }
+        swap_memmove(vd, vn + first_i, len);
+    }
+    swap_memmove(vd + len, vm, opr_sz * 8 - len);
+}
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index f8d8cf1547..1517d82468 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -2681,6 +2681,19 @@ static bool trans_RBIT(DisasContext *s, arg_rpr_esz *a, uint32_t insn)
     return do_zpz_ool(s, a, fns[a->esz]);
 }
 
+static bool trans_SPLICE(DisasContext *s, arg_rprr_esz *a, uint32_t insn)
+{
+    if (sve_access_check(s)) {
+        unsigned vsz = vec_full_reg_size(s);
+        tcg_gen_gvec_4_ool(vec_full_reg_offset(s, a->rd),
+                           vec_full_reg_offset(s, a->rn),
+                           vec_full_reg_offset(s, a->rm),
+                           pred_full_reg_offset(s, a->pg),
+                           vsz, vsz, a->esz, gen_helper_sve_splice);
+    }
+    return true;
+}
+
 /*
  *** SVE Memory - 32-bit Gather and Unsized Contiguous Group
  */
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index 95eb4968a9..a9fa631252 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -463,6 +463,9 @@ REVH 00000101 .. 1001 01 100 ... ..... ..... @rd_pg_rn
 REVW 00000101 .. 1001 10 100 ... ..... ..... @rd_pg_rn
 RBIT 00000101 .. 1001 11 100 ... ..... ..... @rd_pg_rn
 
+# SVE vector splice (predicated)
+SPLICE 00000101 .. 101 100 100 ... ..... ..... @rdn_pg_rm
+
 ### SVE Predicate Logical Operations Group
 
 # SVE predicate logical operations

From patchwork Wed Jun 13 01:56:33 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 138397
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Tue, 12 Jun 2018 15:56:33 -1000
Message-Id: <20180613015641.5667-11-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20180613015641.5667-1-richard.henderson@linaro.org>
References: <20180613015641.5667-1-richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH v4b 10/18] target/arm: Implement SVE Select Vectors Group
Cc: peter.maydell@linaro.org

Reviewed-by: Peter Maydell
Signed-off-by: Richard Henderson
---
 target/arm/helper-sve.h    |  9 +++++++
 target/arm/sve_helper.c    | 55 ++++++++++++++++++++++++++++++++++++++
 target/arm/translate-sve.c |  2 ++
 target/arm/sve.decode      |  6 +++++
 4 files changed, 72 insertions(+)

-- 
2.17.1

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index c3f8a2b502..0f57f64895 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -195,6 +195,15 @@ DEF_HELPER_FLAGS_5(sve_lsl_zpzz_s, TCG_CALL_NO_RWG,
 DEF_HELPER_FLAGS_5(sve_lsl_zpzz_d, TCG_CALL_NO_RWG,
                    void, ptr, ptr, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_5(sve_sel_zpzz_b, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_sel_zpzz_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_sel_zpzz_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_sel_zpzz_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+
 DEF_HELPER_FLAGS_5(sve_asr_zpzw_b, TCG_CALL_NO_RWG,
                    void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(sve_asr_zpzw_h, TCG_CALL_NO_RWG,
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index 8da7baad76..f55fdc7dbe 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -2146,3 +2146,58 @@ void HELPER(sve_splice)(void *vd, void *vn, void *vm, void *vg, uint32_t desc)
     }
     swap_memmove(vd + len, vm, opr_sz * 8 - len);
 }
+
+void HELPER(sve_sel_zpzz_b)(void *vd, void *vn, void *vm,
+                            void *vg, uint32_t desc)
+{
+    intptr_t i, opr_sz = simd_oprsz(desc) / 8;
+    uint64_t *d = vd, *n = vn, *m = vm;
+    uint8_t *pg = vg;
+
+    for (i = 0; i < opr_sz; i += 1) {
+        uint64_t nn = n[i], mm = m[i];
+        uint64_t pp = expand_pred_b(pg[H1(i)]);
+        d[i] = (nn & pp) | (mm & ~pp);
+    }
+}
+
+void HELPER(sve_sel_zpzz_h)(void *vd, void *vn, void *vm,
+                            void *vg, uint32_t desc)
+{
+    intptr_t i, opr_sz = simd_oprsz(desc) / 8;
+    uint64_t *d = vd, *n = vn, *m = vm;
+    uint8_t *pg = vg;
+
+    for (i = 0; i < opr_sz; i += 1) {
+        uint64_t nn = n[i], mm = m[i];
+        uint64_t pp = expand_pred_h(pg[H1(i)]);
+        d[i] = (nn & pp) | (mm & ~pp);
+    }
+}
+
+void HELPER(sve_sel_zpzz_s)(void *vd, void *vn, void *vm,
+                            void *vg, uint32_t desc)
+{
+    intptr_t i, opr_sz = simd_oprsz(desc) / 8;
+    uint64_t *d = vd, *n = vn, *m = vm;
+    uint8_t *pg = vg;
+
+    for (i = 0; i < opr_sz; i += 1) {
+        uint64_t nn = n[i], mm = m[i];
+        uint64_t pp = expand_pred_s(pg[H1(i)]);
+        d[i] = (nn & pp) | (mm & ~pp);
+    }
+}
+
+void HELPER(sve_sel_zpzz_d)(void *vd, void *vn, void *vm,
+                            void *vg, uint32_t desc)
+{
+    intptr_t i, opr_sz = simd_oprsz(desc) / 8;
+    uint64_t *d = vd, *n = vn, *m = vm;
+    uint8_t *pg = vg;
+
+    for (i = 0; i < opr_sz; i += 1) {
+        uint64_t nn = n[i], mm = m[i];
+        d[i] = (pg[H1(i)] & 1 ? nn : mm);
+    }
+}
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index 1517d82468..0de9118fdf 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -373,6 +373,8 @@ static bool trans_UDIV_zpzz(DisasContext *s, arg_rprr_esz *a, uint32_t insn)
     return do_zpzz_ool(s, a, fns[a->esz]);
 }
 
+DO_ZPZZ(SEL, sel)
+
 #undef DO_ZPZZ
 
 /*
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index a9fa631252..91522d8e13 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -98,6 +98,7 @@
                 &rprr_esz rn=%reg_movprfx
 @rdm_pg_rn ........ esz:2 ... ... ... pg:3 rn:5 rd:5 \
                 &rprr_esz rm=%reg_movprfx
+@rd_pg4_rn_rm ........ esz:2 . rm:5 .. pg:4 rn:5 rd:5 &rprr_esz
 
 # Three register operand, with governing predicate, vector element size
 @rda_pg_rn_rm ........ esz:2 . rm:5 ... pg:3 rn:5 rd:5 \
@@ -466,6 +467,11 @@ RBIT 00000101 .. 1001 11 100 ... ..... ..... @rd_pg_rn
 # SVE vector splice (predicated)
 SPLICE 00000101 .. 101 100 100 ... ..... ..... @rdn_pg_rm
 
+### SVE Select Vectors Group
+
+# SVE select vector elements (predicated)
+SEL_zpzz 00000101 .. 1 ..... 11 .... ..... ..... @rd_pg4_rn_rm
+
 ### SVE Predicate Logical Operations Group
 
 # SVE predicate logical operations

From patchwork Wed Jun 13 01:56:34 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 138388
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Tue, 12 Jun 2018 15:56:34 -1000
Message-Id: <20180613015641.5667-12-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20180613015641.5667-1-richard.henderson@linaro.org>
References: <20180613015641.5667-1-richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH v4b 11/18] target/arm: Implement SVE Integer Compare - Vectors Group
Cc: peter.maydell@linaro.org

Reviewed-by: Peter Maydell
Signed-off-by: Richard Henderson
---
 target/arm/helper-sve.h    | 115 +++++++++++++++++++++++
 target/arm/sve_helper.c    | 187 +++++++++++++++++++++++++++++++++++++
 target/arm/translate-sve.c |  91 ++++++++++++++++++
 target/arm/sve.decode      |  24 +++++
 4 files changed, 417 insertions(+)

-- 
2.17.1

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index 0f57f64895..6ffd1fbe8e 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -490,6 +490,121 @@ DEF_HELPER_FLAGS_4(sve_rbit_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 
 DEF_HELPER_FLAGS_5(sve_splice, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_5(sve_cmpeq_ppzz_b, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmpne_ppzz_b, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmpge_ppzz_b, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmpgt_ppzz_b, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmphi_ppzz_b, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmphs_ppzz_b, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(sve_cmpeq_ppzz_h, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmpne_ppzz_h, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmpge_ppzz_h, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmpgt_ppzz_h, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmphi_ppzz_h, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmphs_ppzz_h, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(sve_cmpeq_ppzz_s, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmpne_ppzz_s, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmpge_ppzz_s, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmpgt_ppzz_s, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmphi_ppzz_s, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmphs_ppzz_s, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(sve_cmpeq_ppzz_d, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmpne_ppzz_d, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmpge_ppzz_d, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmpgt_ppzz_d, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmphi_ppzz_d, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmphs_ppzz_d, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(sve_cmpeq_ppzw_b, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmpne_ppzw_b, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmpge_ppzw_b, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmpgt_ppzw_b, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmphi_ppzw_b, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmphs_ppzw_b, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmple_ppzw_b, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmplt_ppzw_b, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmplo_ppzw_b, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmpls_ppzw_b, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(sve_cmpeq_ppzw_h, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmpne_ppzw_h, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmpge_ppzw_h, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmpgt_ppzw_h, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmphi_ppzw_h, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmphs_ppzw_h, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmple_ppzw_h, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmplt_ppzw_h, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmplo_ppzw_h, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmpls_ppzw_h, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(sve_cmpeq_ppzw_s, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmpne_ppzw_s, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmpge_ppzw_s, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmpgt_ppzw_s, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmphi_ppzw_s, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmphs_ppzw_s, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmple_ppzw_s, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmplt_ppzw_s, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmplo_ppzw_s, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve_cmpls_ppzw_s, TCG_CALL_NO_RWG,
+                   i32, ptr, ptr, ptr, ptr, i32)
+
 DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr,
ptr, ptr, ptr, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index f55fdc7dbe..d11f591661 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -74,6 +74,28 @@ static uint32_t iter_predtest_fwd(uint64_t d, uint64_t g, uint32_t flags) return flags; } +/* This is an iterative function, called for each Pd and Pg word + * moving backward. + */ +static uint32_t iter_predtest_bwd(uint64_t d, uint64_t g, uint32_t flags) +{ + if (likely(g)) { + /* Compute C from first (i.e. last) !(D & G). + Use bit 2 to signal first G bit seen. */ + if (!(flags & 4)) { + flags += 4 - 1; /* add bit 2, subtract C from PREDTEST_INIT */ + flags |= (d & pow2floor(g)) == 0; + } + + /* Accumulate Z from each D & G. */ + flags |= ((d & g) != 0) << 1; + + /* Compute N from last (i.e. first) D & G. Replace previous. */ + flags = deposit32(flags, 31, 1, (d & (g & -g)) != 0); + } + return flags; +} + /* The same for a single word predicate. */ uint32_t HELPER(sve_predtest1)(uint64_t d, uint64_t g) { @@ -2201,3 +2223,168 @@ void HELPER(sve_sel_zpzz_d)(void *vd, void *vn, void *vm, d[i] = (pg[H1(i)] & 1 ? nn : mm); } } + +/* Two operand comparison controlled by a predicate. + * ??? It is very tempting to want to be able to expand this inline + * with x86 instructions, e.g. + * + * vcmpeqw zm, zn, %ymm0 + * vpmovmskb %ymm0, %eax + * and $0x5555, %eax + * and pg, %eax + * + * or even aarch64, e.g. + * + * // mask = 4000 1000 0400 0100 0040 0010 0004 0001 + * cmeq v0.8h, zn, zm + * and v0.8h, v0.8h, mask + * addv h0, v0.8h + * and v0.8b, pg + * + * However, coming up with an abstraction that allows vector inputs and + * a scalar output, and also handles the byte-ordering of sub-uint64_t + * scalar outputs, is tricky.
+ */ +#define DO_CMP_PPZZ(NAME, TYPE, OP, H, MASK) \ +uint32_t HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, uint32_t desc) \ +{ \ + intptr_t opr_sz = simd_oprsz(desc); \ + uint32_t flags = PREDTEST_INIT; \ + intptr_t i = opr_sz; \ + do { \ + uint64_t out = 0, pg; \ + do { \ + i -= sizeof(TYPE), out <<= sizeof(TYPE); \ + TYPE nn = *(TYPE *)(vn + H(i)); \ + TYPE mm = *(TYPE *)(vm + H(i)); \ + out |= nn OP mm; \ + } while (i & 63); \ + pg = *(uint64_t *)(vg + (i >> 3)) & MASK; \ + out &= pg; \ + *(uint64_t *)(vd + (i >> 3)) = out; \ + flags = iter_predtest_bwd(out, pg, flags); \ + } while (i > 0); \ + return flags; \ +} + +#define DO_CMP_PPZZ_B(NAME, TYPE, OP) \ + DO_CMP_PPZZ(NAME, TYPE, OP, H1, 0xffffffffffffffffull) +#define DO_CMP_PPZZ_H(NAME, TYPE, OP) \ + DO_CMP_PPZZ(NAME, TYPE, OP, H1_2, 0x5555555555555555ull) +#define DO_CMP_PPZZ_S(NAME, TYPE, OP) \ + DO_CMP_PPZZ(NAME, TYPE, OP, H1_4, 0x1111111111111111ull) +#define DO_CMP_PPZZ_D(NAME, TYPE, OP) \ + DO_CMP_PPZZ(NAME, TYPE, OP, , 0x0101010101010101ull) + +DO_CMP_PPZZ_B(sve_cmpeq_ppzz_b, uint8_t, ==) +DO_CMP_PPZZ_H(sve_cmpeq_ppzz_h, uint16_t, ==) +DO_CMP_PPZZ_S(sve_cmpeq_ppzz_s, uint32_t, ==) +DO_CMP_PPZZ_D(sve_cmpeq_ppzz_d, uint64_t, ==) + +DO_CMP_PPZZ_B(sve_cmpne_ppzz_b, uint8_t, !=) +DO_CMP_PPZZ_H(sve_cmpne_ppzz_h, uint16_t, !=) +DO_CMP_PPZZ_S(sve_cmpne_ppzz_s, uint32_t, !=) +DO_CMP_PPZZ_D(sve_cmpne_ppzz_d, uint64_t, !=) + +DO_CMP_PPZZ_B(sve_cmpgt_ppzz_b, int8_t, >) +DO_CMP_PPZZ_H(sve_cmpgt_ppzz_h, int16_t, >) +DO_CMP_PPZZ_S(sve_cmpgt_ppzz_s, int32_t, >) +DO_CMP_PPZZ_D(sve_cmpgt_ppzz_d, int64_t, >) + +DO_CMP_PPZZ_B(sve_cmpge_ppzz_b, int8_t, >=) +DO_CMP_PPZZ_H(sve_cmpge_ppzz_h, int16_t, >=) +DO_CMP_PPZZ_S(sve_cmpge_ppzz_s, int32_t, >=) +DO_CMP_PPZZ_D(sve_cmpge_ppzz_d, int64_t, >=) + +DO_CMP_PPZZ_B(sve_cmphi_ppzz_b, uint8_t, >) +DO_CMP_PPZZ_H(sve_cmphi_ppzz_h, uint16_t, >) +DO_CMP_PPZZ_S(sve_cmphi_ppzz_s, uint32_t, >) +DO_CMP_PPZZ_D(sve_cmphi_ppzz_d, uint64_t, >) + +DO_CMP_PPZZ_B(sve_cmphs_ppzz_b, 
uint8_t, >=) +DO_CMP_PPZZ_H(sve_cmphs_ppzz_h, uint16_t, >=) +DO_CMP_PPZZ_S(sve_cmphs_ppzz_s, uint32_t, >=) +DO_CMP_PPZZ_D(sve_cmphs_ppzz_d, uint64_t, >=) + +#undef DO_CMP_PPZZ_B +#undef DO_CMP_PPZZ_H +#undef DO_CMP_PPZZ_S +#undef DO_CMP_PPZZ_D +#undef DO_CMP_PPZZ + +/* Similar, but the second source is "wide". */ +#define DO_CMP_PPZW(NAME, TYPE, TYPEW, OP, H, MASK) \ +uint32_t HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, uint32_t desc) \ +{ \ + intptr_t opr_sz = simd_oprsz(desc); \ + uint32_t flags = PREDTEST_INIT; \ + intptr_t i = opr_sz; \ + do { \ + uint64_t out = 0, pg; \ + do { \ + TYPEW mm = *(TYPEW *)(vm + i - 8); \ + do { \ + i -= sizeof(TYPE), out <<= sizeof(TYPE); \ + TYPE nn = *(TYPE *)(vn + H(i)); \ + out |= nn OP mm; \ + } while (i & 7); \ + } while (i & 63); \ + pg = *(uint64_t *)(vg + (i >> 3)) & MASK; \ + out &= pg; \ + *(uint64_t *)(vd + (i >> 3)) = out; \ + flags = iter_predtest_bwd(out, pg, flags); \ + } while (i > 0); \ + return flags; \ +} + +#define DO_CMP_PPZW_B(NAME, TYPE, TYPEW, OP) \ + DO_CMP_PPZW(NAME, TYPE, TYPEW, OP, H1, 0xffffffffffffffffull) +#define DO_CMP_PPZW_H(NAME, TYPE, TYPEW, OP) \ + DO_CMP_PPZW(NAME, TYPE, TYPEW, OP, H1_2, 0x5555555555555555ull) +#define DO_CMP_PPZW_S(NAME, TYPE, TYPEW, OP) \ + DO_CMP_PPZW(NAME, TYPE, TYPEW, OP, H1_4, 0x1111111111111111ull) + +DO_CMP_PPZW_B(sve_cmpeq_ppzw_b, uint8_t, uint64_t, ==) +DO_CMP_PPZW_H(sve_cmpeq_ppzw_h, uint16_t, uint64_t, ==) +DO_CMP_PPZW_S(sve_cmpeq_ppzw_s, uint32_t, uint64_t, ==) + +DO_CMP_PPZW_B(sve_cmpne_ppzw_b, uint8_t, uint64_t, !=) +DO_CMP_PPZW_H(sve_cmpne_ppzw_h, uint16_t, uint64_t, !=) +DO_CMP_PPZW_S(sve_cmpne_ppzw_s, uint32_t, uint64_t, !=) + +DO_CMP_PPZW_B(sve_cmpgt_ppzw_b, int8_t, int64_t, >) +DO_CMP_PPZW_H(sve_cmpgt_ppzw_h, int16_t, int64_t, >) +DO_CMP_PPZW_S(sve_cmpgt_ppzw_s, int32_t, int64_t, >) + +DO_CMP_PPZW_B(sve_cmpge_ppzw_b, int8_t, int64_t, >=) +DO_CMP_PPZW_H(sve_cmpge_ppzw_h, int16_t, int64_t, >=) +DO_CMP_PPZW_S(sve_cmpge_ppzw_s, int32_t, int64_t, >=) + 
+DO_CMP_PPZW_B(sve_cmphi_ppzw_b, uint8_t, uint64_t, >) +DO_CMP_PPZW_H(sve_cmphi_ppzw_h, uint16_t, uint64_t, >) +DO_CMP_PPZW_S(sve_cmphi_ppzw_s, uint32_t, uint64_t, >) + +DO_CMP_PPZW_B(sve_cmphs_ppzw_b, uint8_t, uint64_t, >=) +DO_CMP_PPZW_H(sve_cmphs_ppzw_h, uint16_t, uint64_t, >=) +DO_CMP_PPZW_S(sve_cmphs_ppzw_s, uint32_t, uint64_t, >=) + +DO_CMP_PPZW_B(sve_cmplt_ppzw_b, int8_t, int64_t, <) +DO_CMP_PPZW_H(sve_cmplt_ppzw_h, int16_t, int64_t, <) +DO_CMP_PPZW_S(sve_cmplt_ppzw_s, int32_t, int64_t, <) + +DO_CMP_PPZW_B(sve_cmple_ppzw_b, int8_t, int64_t, <=) +DO_CMP_PPZW_H(sve_cmple_ppzw_h, int16_t, int64_t, <=) +DO_CMP_PPZW_S(sve_cmple_ppzw_s, int32_t, int64_t, <=) + +DO_CMP_PPZW_B(sve_cmplo_ppzw_b, uint8_t, uint64_t, <) +DO_CMP_PPZW_H(sve_cmplo_ppzw_h, uint16_t, uint64_t, <) +DO_CMP_PPZW_S(sve_cmplo_ppzw_s, uint32_t, uint64_t, <) + +DO_CMP_PPZW_B(sve_cmpls_ppzw_b, uint8_t, uint64_t, <=) +DO_CMP_PPZW_H(sve_cmpls_ppzw_h, uint16_t, uint64_t, <=) +DO_CMP_PPZW_S(sve_cmpls_ppzw_s, uint32_t, uint64_t, <=) + +#undef DO_CMP_PPZW_B +#undef DO_CMP_PPZW_H +#undef DO_CMP_PPZW_S +#undef DO_CMP_PPZW diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 0de9118fdf..1510af6ece 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -33,6 +33,10 @@ #include "trace-tcg.h" #include "translate-a64.h" + +typedef void gen_helper_gvec_flags_4(TCGv_i32, TCGv_ptr, TCGv_ptr, + TCGv_ptr, TCGv_ptr, TCGv_i32); + /* * Helpers for extracting complex instruction fields. 
*/ @@ -2696,6 +2700,93 @@ static bool trans_SPLICE(DisasContext *s, arg_rprr_esz *a, uint32_t insn) return true; } +/* + *** SVE Integer Compare - Vectors Group + */ + +static bool do_ppzz_flags(DisasContext *s, arg_rprr_esz *a, + gen_helper_gvec_flags_4 *gen_fn) +{ + TCGv_ptr pd, zn, zm, pg; + unsigned vsz; + TCGv_i32 t; + + if (gen_fn == NULL) { + return false; + } + if (!sve_access_check(s)) { + return true; + } + + vsz = vec_full_reg_size(s); + t = tcg_const_i32(simd_desc(vsz, vsz, 0)); + pd = tcg_temp_new_ptr(); + zn = tcg_temp_new_ptr(); + zm = tcg_temp_new_ptr(); + pg = tcg_temp_new_ptr(); + + tcg_gen_addi_ptr(pd, cpu_env, pred_full_reg_offset(s, a->rd)); + tcg_gen_addi_ptr(zn, cpu_env, vec_full_reg_offset(s, a->rn)); + tcg_gen_addi_ptr(zm, cpu_env, vec_full_reg_offset(s, a->rm)); + tcg_gen_addi_ptr(pg, cpu_env, pred_full_reg_offset(s, a->pg)); + + gen_fn(t, pd, zn, zm, pg, t); + + tcg_temp_free_ptr(pd); + tcg_temp_free_ptr(zn); + tcg_temp_free_ptr(zm); + tcg_temp_free_ptr(pg); + + do_pred_flags(t); + + tcg_temp_free_i32(t); + return true; +} + +#define DO_PPZZ(NAME, name) \ +static bool trans_##NAME##_ppzz(DisasContext *s, arg_rprr_esz *a, \ + uint32_t insn) \ +{ \ + static gen_helper_gvec_flags_4 * const fns[4] = { \ + gen_helper_sve_##name##_ppzz_b, gen_helper_sve_##name##_ppzz_h, \ + gen_helper_sve_##name##_ppzz_s, gen_helper_sve_##name##_ppzz_d, \ + }; \ + return do_ppzz_flags(s, a, fns[a->esz]); \ +} + +DO_PPZZ(CMPEQ, cmpeq) +DO_PPZZ(CMPNE, cmpne) +DO_PPZZ(CMPGT, cmpgt) +DO_PPZZ(CMPGE, cmpge) +DO_PPZZ(CMPHI, cmphi) +DO_PPZZ(CMPHS, cmphs) + +#undef DO_PPZZ + +#define DO_PPZW(NAME, name) \ +static bool trans_##NAME##_ppzw(DisasContext *s, arg_rprr_esz *a, \ + uint32_t insn) \ +{ \ + static gen_helper_gvec_flags_4 * const fns[4] = { \ + gen_helper_sve_##name##_ppzw_b, gen_helper_sve_##name##_ppzw_h, \ + gen_helper_sve_##name##_ppzw_s, NULL \ + }; \ + return do_ppzz_flags(s, a, fns[a->esz]); \ +} + +DO_PPZW(CMPEQ, cmpeq) +DO_PPZW(CMPNE, cmpne) 
+DO_PPZW(CMPGT, cmpgt) +DO_PPZW(CMPGE, cmpge) +DO_PPZW(CMPHI, cmphi) +DO_PPZW(CMPHS, cmphs) +DO_PPZW(CMPLT, cmplt) +DO_PPZW(CMPLE, cmple) +DO_PPZW(CMPLO, cmplo) +DO_PPZW(CMPLS, cmpls) + +#undef DO_PPZW + /* *** SVE Memory - 32-bit Gather and Unsized Contiguous Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 91522d8e13..76a42193e4 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -99,6 +99,7 @@ @rdm_pg_rn ........ esz:2 ... ... ... pg:3 rn:5 rd:5 \ &rprr_esz rm=%reg_movprfx @rd_pg4_rn_rm ........ esz:2 . rm:5 .. pg:4 rn:5 rd:5 &rprr_esz +@pd_pg_rn_rm ........ esz:2 . rm:5 ... pg:3 rn:5 . rd:4 &rprr_esz # Three register operand, with governing predicate, vector element size @rda_pg_rn_rm ........ esz:2 . rm:5 ... pg:3 rn:5 rd:5 \ @@ -472,6 +473,29 @@ SPLICE 00000101 .. 101 100 100 ... ..... ..... @rdn_pg_rm # SVE select vector elements (predicated) SEL_zpzz 00000101 .. 1 ..... 11 .... ..... ..... @rd_pg4_rn_rm +### SVE Integer Compare - Vectors Group + +# SVE integer compare_vectors +CMPHS_ppzz 00100100 .. 0 ..... 000 ... ..... 0 .... @pd_pg_rn_rm +CMPHI_ppzz 00100100 .. 0 ..... 000 ... ..... 1 .... @pd_pg_rn_rm +CMPGE_ppzz 00100100 .. 0 ..... 100 ... ..... 0 .... @pd_pg_rn_rm +CMPGT_ppzz 00100100 .. 0 ..... 100 ... ..... 1 .... @pd_pg_rn_rm +CMPEQ_ppzz 00100100 .. 0 ..... 101 ... ..... 0 .... @pd_pg_rn_rm +CMPNE_ppzz 00100100 .. 0 ..... 101 ... ..... 1 .... @pd_pg_rn_rm + +# SVE integer compare with wide elements +# Note these require esz != 3. +CMPEQ_ppzw 00100100 .. 0 ..... 001 ... ..... 0 .... @pd_pg_rn_rm +CMPNE_ppzw 00100100 .. 0 ..... 001 ... ..... 1 .... @pd_pg_rn_rm +CMPGE_ppzw 00100100 .. 0 ..... 010 ... ..... 0 .... @pd_pg_rn_rm +CMPGT_ppzw 00100100 .. 0 ..... 010 ... ..... 1 .... @pd_pg_rn_rm +CMPLT_ppzw 00100100 .. 0 ..... 011 ... ..... 0 .... @pd_pg_rn_rm +CMPLE_ppzw 00100100 .. 0 ..... 011 ... ..... 1 .... @pd_pg_rn_rm +CMPHS_ppzw 00100100 .. 0 ..... 110 ... ..... 0 .... @pd_pg_rn_rm +CMPHI_ppzw 00100100 .. 
0 ..... 110 ... ..... 1 .... @pd_pg_rn_rm +CMPLO_ppzw 00100100 .. 0 ..... 111 ... ..... 0 .... @pd_pg_rn_rm +CMPLS_ppzw 00100100 .. 0 ..... 111 ... ..... 1 .... @pd_pg_rn_rm + ### SVE Predicate Logical Operations Group # SVE predicate logical operations From patchwork Wed Jun 13 01:56:35 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 138395 Delivered-To: patch@linaro.org Received: by 2002:a2e:970d:0:0:0:0:0 with SMTP id r13-v6csp104975lji; Tue, 12 Jun 2018 19:08:13 -0700 (PDT) X-Google-Smtp-Source: ADUXVKJn4XwOw2R5cW0N8nDjL6xWb6n3SL5019nK7tU3bL33pMdPFJDXWyDRRnVPZV4GCN/fgELL X-Received: by 2002:a0c:f9ca:: with SMTP id j10-v6mr2832377qvo.25.1528855693201; Tue, 12 Jun 2018 19:08:13 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1528855693; cv=none; d=google.com; s=arc-20160816; b=XiiJbNx7aNR9uXqtXMqu2m5fiBaqfr0oVNyIC37Pb9msh17+f18m1a3JkoA5p4mCXe gm90lK+xYQFnH9srX7ihDIoizR2gf5WN4VPVjuiVyf+hBLHlFKyLNCR8/5rzS5rxe7jM QKDZ7Gow0vYn6lFd3xwvsm0cAap2sS8csRjBorRH924BYpNy10ICLBjVoD+MM6mBrD2u QYlRhLud+fDiJwo82b+GNPCJ2p1l4p8qZfJxP4rXddB8qLY8WhNUQ3O2JD5UxUhocQ3Z lRn+c/w4I3Hvdg/0eV647qMRM27mbkoUqlYXagP7OYj+iUPXbffoQMh9XeudhnOFkLWL dAiQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature:arc-authentication-results; bh=geb2TdvbZ/FmTKQrziVC2OweTFP7OrYLltPJtPcKaOw=; b=igNusZmTJlBGX7MfpLOmdKsjc0JoCLFdmArsEvjl9/VKGs9MU8/3gqijOFtSELO3Hx Z87izt1R64Nb7iwlBW3I/iP8oi5lyGmi5QdnaJXpIoC6AKBZ6NZ/78cwYAsEexIgAwTe GxHYfRQTH2rWQzLrClOpssnkojd5IJA3SJYqmSPVlBoE3Ni6D43NObTbJOCmAukSBqGU loY10hRr+Dip+t/mCselCgROLC42xgVHd7EDhliy19wTSUsSDE6T3zaMCFJ3uey78VRA eEvRYFsGU8XENz2jKlocAKzqMRrxr3ejCajpkZtG6EOyOjsN1Fyw1EADM/8Duzb2LfCR h1fA== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail 
header.i=@linaro.org header.s=google header.b=HDzNTpLz; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [2001:4830:134:3::11]) by mx.google.com with ESMTPS id p5-v6si1539504qkc.230.2018.06.12.19.08.13 for (version=TLS1 cipher=AES128-SHA bits=128/128); Tue, 12 Jun 2018 19:08:13 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) client-ip=2001:4830:134:3::11; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=HDzNTpLz; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:59329 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fSvCm-0002ay-JD for patch@linaro.org; Tue, 12 Jun 2018 22:08:12 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:44115) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fSv2L-0002Vx-3Z for qemu-devel@nongnu.org; Tue, 12 Jun 2018 21:57:27 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fSv2H-0006IY-Un for qemu-devel@nongnu.org; Tue, 12 Jun 2018 21:57:25 -0400 Received: from mail-pf0-x231.google.com ([2607:f8b0:400e:c00::231]:46043) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fSv2H-0006IN-Lb for qemu-devel@nongnu.org; Tue, 12 Jun 2018 21:57:21 -0400 Received: by mail-pf0-x231.google.com with SMTP id a22-v6so497781pfo.12 for ; Tue, 12 Jun 2018 18:57:21 -0700 (PDT) 
From: Richard Henderson To: qemu-devel@nongnu.org Date: Tue, 12 Jun 2018 15:56:35 -1000 Message-Id: <20180613015641.5667-13-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180613015641.5667-1-richard.henderson@linaro.org> References: <20180613015641.5667-1-richard.henderson@linaro.org>
X-Received-From: 2607:f8b0:400e:c00::231 Subject: [Qemu-devel] [PATCH v4b 12/18] target/arm: Implement SVE Integer Compare - Immediate Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 44 +++++++++++++++++++ target/arm/sve_helper.c | 88 ++++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 66 ++++++++++++++++++++++++++++ target/arm/sve.decode | 23 ++++++++++ 4 files changed, 221 insertions(+) -- 2.17.1 diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 6ffd1fbe8e..ae38c0a4be 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -605,6 +605,50 @@ DEF_HELPER_FLAGS_5(sve_cmplo_ppzw_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(sve_cmpls_ppzw_s, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmpeq_ppzi_b, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmpne_ppzi_b, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmpgt_ppzi_b, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmpge_ppzi_b, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmplt_ppzi_b, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmple_ppzi_b, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmphs_ppzi_b, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmphi_ppzi_b, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmplo_ppzi_b, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmpls_ppzi_b, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve_cmpeq_ppzi_h, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmpne_ppzi_h, TCG_CALL_NO_RWG, i32, ptr, 
ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmpgt_ppzi_h, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmpge_ppzi_h, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmplt_ppzi_h, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmple_ppzi_h, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmphs_ppzi_h, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmphi_ppzi_h, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmplo_ppzi_h, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmpls_ppzi_h, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve_cmpeq_ppzi_s, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmpne_ppzi_s, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmpgt_ppzi_s, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmpge_ppzi_s, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmplt_ppzi_s, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmple_ppzi_s, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmphs_ppzi_s, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmphi_ppzi_s, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmplo_ppzi_s, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmpls_ppzi_s, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve_cmpeq_ppzi_d, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmpne_ppzi_d, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmpgt_ppzi_d, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmpge_ppzi_d, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmplt_ppzi_d, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmple_ppzi_d, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmphs_ppzi_d, TCG_CALL_NO_RWG, i32, 
ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmphi_ppzi_d, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmplo_ppzi_d, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_cmpls_ppzi_d, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_and_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_bic_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_eor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index d11f591661..c1d95edfca 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -2388,3 +2388,91 @@ DO_CMP_PPZW_S(sve_cmpls_ppzw_s, uint32_t, uint64_t, <=) #undef DO_CMP_PPZW_H #undef DO_CMP_PPZW_S #undef DO_CMP_PPZW + +/* Similar, but the second source is immediate. */ +#define DO_CMP_PPZI(NAME, TYPE, OP, H, MASK) \ +uint32_t HELPER(NAME)(void *vd, void *vn, void *vg, uint32_t desc) \ +{ \ + intptr_t opr_sz = simd_oprsz(desc); \ + uint32_t flags = PREDTEST_INIT; \ + TYPE mm = simd_data(desc); \ + intptr_t i = opr_sz; \ + do { \ + uint64_t out = 0, pg; \ + do { \ + i -= sizeof(TYPE), out <<= sizeof(TYPE); \ + TYPE nn = *(TYPE *)(vn + H(i)); \ + out |= nn OP mm; \ + } while (i & 63); \ + pg = *(uint64_t *)(vg + (i >> 3)) & MASK; \ + out &= pg; \ + *(uint64_t *)(vd + (i >> 3)) = out; \ + flags = iter_predtest_bwd(out, pg, flags); \ + } while (i > 0); \ + return flags; \ +} + +#define DO_CMP_PPZI_B(NAME, TYPE, OP) \ + DO_CMP_PPZI(NAME, TYPE, OP, H1, 0xffffffffffffffffull) +#define DO_CMP_PPZI_H(NAME, TYPE, OP) \ + DO_CMP_PPZI(NAME, TYPE, OP, H1_2, 0x5555555555555555ull) +#define DO_CMP_PPZI_S(NAME, TYPE, OP) \ + DO_CMP_PPZI(NAME, TYPE, OP, H1_4, 0x1111111111111111ull) +#define DO_CMP_PPZI_D(NAME, TYPE, OP) \ + DO_CMP_PPZI(NAME, TYPE, OP, , 0x0101010101010101ull) + +DO_CMP_PPZI_B(sve_cmpeq_ppzi_b, uint8_t, ==) +DO_CMP_PPZI_H(sve_cmpeq_ppzi_h, uint16_t, ==) +DO_CMP_PPZI_S(sve_cmpeq_ppzi_s, 
uint32_t, ==) +DO_CMP_PPZI_D(sve_cmpeq_ppzi_d, uint64_t, ==) + +DO_CMP_PPZI_B(sve_cmpne_ppzi_b, uint8_t, !=) +DO_CMP_PPZI_H(sve_cmpne_ppzi_h, uint16_t, !=) +DO_CMP_PPZI_S(sve_cmpne_ppzi_s, uint32_t, !=) +DO_CMP_PPZI_D(sve_cmpne_ppzi_d, uint64_t, !=) + +DO_CMP_PPZI_B(sve_cmpgt_ppzi_b, int8_t, >) +DO_CMP_PPZI_H(sve_cmpgt_ppzi_h, int16_t, >) +DO_CMP_PPZI_S(sve_cmpgt_ppzi_s, int32_t, >) +DO_CMP_PPZI_D(sve_cmpgt_ppzi_d, int64_t, >) + +DO_CMP_PPZI_B(sve_cmpge_ppzi_b, int8_t, >=) +DO_CMP_PPZI_H(sve_cmpge_ppzi_h, int16_t, >=) +DO_CMP_PPZI_S(sve_cmpge_ppzi_s, int32_t, >=) +DO_CMP_PPZI_D(sve_cmpge_ppzi_d, int64_t, >=) + +DO_CMP_PPZI_B(sve_cmphi_ppzi_b, uint8_t, >) +DO_CMP_PPZI_H(sve_cmphi_ppzi_h, uint16_t, >) +DO_CMP_PPZI_S(sve_cmphi_ppzi_s, uint32_t, >) +DO_CMP_PPZI_D(sve_cmphi_ppzi_d, uint64_t, >) + +DO_CMP_PPZI_B(sve_cmphs_ppzi_b, uint8_t, >=) +DO_CMP_PPZI_H(sve_cmphs_ppzi_h, uint16_t, >=) +DO_CMP_PPZI_S(sve_cmphs_ppzi_s, uint32_t, >=) +DO_CMP_PPZI_D(sve_cmphs_ppzi_d, uint64_t, >=) + +DO_CMP_PPZI_B(sve_cmplt_ppzi_b, int8_t, <) +DO_CMP_PPZI_H(sve_cmplt_ppzi_h, int16_t, <) +DO_CMP_PPZI_S(sve_cmplt_ppzi_s, int32_t, <) +DO_CMP_PPZI_D(sve_cmplt_ppzi_d, int64_t, <) + +DO_CMP_PPZI_B(sve_cmple_ppzi_b, int8_t, <=) +DO_CMP_PPZI_H(sve_cmple_ppzi_h, int16_t, <=) +DO_CMP_PPZI_S(sve_cmple_ppzi_s, int32_t, <=) +DO_CMP_PPZI_D(sve_cmple_ppzi_d, int64_t, <=) + +DO_CMP_PPZI_B(sve_cmplo_ppzi_b, uint8_t, <) +DO_CMP_PPZI_H(sve_cmplo_ppzi_h, uint16_t, <) +DO_CMP_PPZI_S(sve_cmplo_ppzi_s, uint32_t, <) +DO_CMP_PPZI_D(sve_cmplo_ppzi_d, uint64_t, <) + +DO_CMP_PPZI_B(sve_cmpls_ppzi_b, uint8_t, <=) +DO_CMP_PPZI_H(sve_cmpls_ppzi_h, uint16_t, <=) +DO_CMP_PPZI_S(sve_cmpls_ppzi_s, uint32_t, <=) +DO_CMP_PPZI_D(sve_cmpls_ppzi_d, uint64_t, <=) + +#undef DO_CMP_PPZI_B +#undef DO_CMP_PPZI_H +#undef DO_CMP_PPZI_S +#undef DO_CMP_PPZI_D +#undef DO_CMP_PPZI diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 1510af6ece..00481e92de 100644 --- a/target/arm/translate-sve.c +++ 
b/target/arm/translate-sve.c @@ -34,6 +34,8 @@ #include "translate-a64.h" +typedef void gen_helper_gvec_flags_3(TCGv_i32, TCGv_ptr, TCGv_ptr, + TCGv_ptr, TCGv_i32); typedef void gen_helper_gvec_flags_4(TCGv_i32, TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i32); @@ -2787,6 +2789,70 @@ DO_PPZW(CMPLS, cmpls) #undef DO_PPZW +/* + *** SVE Integer Compare - Immediate Groups + */ + +static bool do_ppzi_flags(DisasContext *s, arg_rpri_esz *a, + gen_helper_gvec_flags_3 *gen_fn) +{ + TCGv_ptr pd, zn, pg; + unsigned vsz; + TCGv_i32 t; + + if (gen_fn == NULL) { + return false; + } + if (!sve_access_check(s)) { + return true; + } + + vsz = vec_full_reg_size(s); + t = tcg_const_i32(simd_desc(vsz, vsz, a->imm)); + pd = tcg_temp_new_ptr(); + zn = tcg_temp_new_ptr(); + pg = tcg_temp_new_ptr(); + + tcg_gen_addi_ptr(pd, cpu_env, pred_full_reg_offset(s, a->rd)); + tcg_gen_addi_ptr(zn, cpu_env, vec_full_reg_offset(s, a->rn)); + tcg_gen_addi_ptr(pg, cpu_env, pred_full_reg_offset(s, a->pg)); + + gen_fn(t, pd, zn, pg, t); + + tcg_temp_free_ptr(pd); + tcg_temp_free_ptr(zn); + tcg_temp_free_ptr(pg); + + do_pred_flags(t); + + tcg_temp_free_i32(t); + return true; +} + +#define DO_PPZI(NAME, name) \ +static bool trans_##NAME##_ppzi(DisasContext *s, arg_rpri_esz *a, \ + uint32_t insn) \ +{ \ + static gen_helper_gvec_flags_3 * const fns[4] = { \ + gen_helper_sve_##name##_ppzi_b, gen_helper_sve_##name##_ppzi_h, \ + gen_helper_sve_##name##_ppzi_s, gen_helper_sve_##name##_ppzi_d, \ + }; \ + return do_ppzi_flags(s, a, fns[a->esz]); \ +} + +DO_PPZI(CMPEQ, cmpeq) +DO_PPZI(CMPNE, cmpne) +DO_PPZI(CMPGT, cmpgt) +DO_PPZI(CMPGE, cmpge) +DO_PPZI(CMPHI, cmphi) +DO_PPZI(CMPHS, cmphs) +DO_PPZI(CMPLT, cmplt) +DO_PPZI(CMPLE, cmple) +DO_PPZI(CMPLO, cmplo) +DO_PPZI(CMPLS, cmpls) + +#undef DO_PPZI + /* *** SVE Memory - 32-bit Gather and Unsized Contiguous Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 76a42193e4..9bc383b085 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode 
@@ -131,6 +131,11 @@ @rdn_dbm ........ .. .... dbm:13 rd:5 \ &rr_dbm rn=%reg_movprfx +# Predicate output, vector and immediate input, +# controlling predicate, element size. +@pd_pg_rn_i7 ........ esz:2 . imm:7 . pg:3 rn:5 . rd:4 &rpri_esz +@pd_pg_rn_i5 ........ esz:2 . imm:s5 ... pg:3 rn:5 . rd:4 &rpri_esz + # Basic Load/Store with 9-bit immediate offset @pd_rn_i9 ........ ........ ...... rn:5 . rd:4 \ &rri imm=%imm9_16_10 @@ -496,6 +501,24 @@ CMPHI_ppzw 00100100 .. 0 ..... 110 ... ..... 1 .... @pd_pg_rn_rm CMPLO_ppzw 00100100 .. 0 ..... 111 ... ..... 0 .... @pd_pg_rn_rm CMPLS_ppzw 00100100 .. 0 ..... 111 ... ..... 1 .... @pd_pg_rn_rm +### SVE Integer Compare - Unsigned Immediate Group + +# SVE integer compare with unsigned immediate +CMPHS_ppzi 00100100 .. 1 ....... 0 ... ..... 0 .... @pd_pg_rn_i7 +CMPHI_ppzi 00100100 .. 1 ....... 0 ... ..... 1 .... @pd_pg_rn_i7 +CMPLO_ppzi 00100100 .. 1 ....... 1 ... ..... 0 .... @pd_pg_rn_i7 +CMPLS_ppzi 00100100 .. 1 ....... 1 ... ..... 1 .... @pd_pg_rn_i7 + +### SVE Integer Compare - Signed Immediate Group + +# SVE integer compare with signed immediate +CMPGE_ppzi 00100101 .. 0 ..... 000 ... ..... 0 .... @pd_pg_rn_i5 +CMPGT_ppzi 00100101 .. 0 ..... 000 ... ..... 1 .... @pd_pg_rn_i5 +CMPLT_ppzi 00100101 .. 0 ..... 001 ... ..... 0 .... @pd_pg_rn_i5 +CMPLE_ppzi 00100101 .. 0 ..... 001 ... ..... 1 .... @pd_pg_rn_i5 +CMPEQ_ppzi 00100101 .. 0 ..... 100 ... ..... 0 .... @pd_pg_rn_i5 +CMPNE_ppzi 00100101 .. 0 ..... 100 ... ..... 1 .... 
@pd_pg_rn_i5
+
 ### SVE Predicate Logical Operations Group
 
 # SVE predicate logical operations

From patchwork Wed Jun 13 01:56:36 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 138396
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Tue, 12 Jun 2018 15:56:36 -1000
Message-Id: <20180613015641.5667-14-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20180613015641.5667-1-richard.henderson@linaro.org>
References: <20180613015641.5667-1-richard.henderson@linaro.org>
X-Received-From: 2607:f8b0:400e:c01::241 Subject: [Qemu-devel] [PATCH v4b 13/18] target/arm: Implement SVE Partition Break Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 18 +++ target/arm/sve_helper.c | 248 +++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 106 ++++++++++++++++ target/arm/sve.decode | 19 +++ 4 files changed, 391 insertions(+) -- 2.17.1 diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index ae38c0a4be..f0a3ed3414 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -658,3 +658,21 @@ DEF_HELPER_FLAGS_5(sve_orn_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_nor_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_nand_pppp, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve_brkpa, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_brkpb, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_brkpas, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve_brkpbs, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve_brka_z, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_brkb_z, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_brka_m, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_brkb_m, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve_brkas_z, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_brkbs_z, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_brkas_m, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_brkbs_m, 
TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve_brkn, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve_brkns, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index c1d95edfca..b27b502d75 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -2476,3 +2476,251 @@ DO_CMP_PPZI_D(sve_cmpls_ppzi_d, uint64_t, <=) #undef DO_CMP_PPZI_S #undef DO_CMP_PPZI_D #undef DO_CMP_PPZI + +/* Similar to the ARM LastActive pseudocode function. */ +static bool last_active_pred(void *vd, void *vg, intptr_t oprsz) +{ + intptr_t i; + + for (i = QEMU_ALIGN_UP(oprsz, 8) - 8; i >= 0; i -= 8) { + uint64_t pg = *(uint64_t *)(vg + i); + if (pg) { + return (pow2floor(pg) & *(uint64_t *)(vd + i)) != 0; + } + } + return 0; +} + +/* Compute a mask into RETB that is true for all G, up to and including + * (if after) or excluding (if !after) the first G & N. + * Return true if BRK found. + */ +static bool compute_brk(uint64_t *retb, uint64_t n, uint64_t g, + bool brk, bool after) +{ + uint64_t b; + + if (brk) { + b = 0; + } else if ((g & n) == 0) { + /* For all G, no N are set; break not found. */ + b = g; + } else { + /* Break somewhere in N. Locate it. */ + b = g & n; /* guard true, pred true */ + b = b & -b; /* first such */ + if (after) { + b = b | (b - 1); /* break after same */ + } else { + b = b - 1; /* break before same */ + } + brk = true; + } + + *retb = b; + return brk; +} + +/* Compute a zeroing BRK. */ +static void compute_brk_z(uint64_t *d, uint64_t *n, uint64_t *g, + intptr_t oprsz, bool after) +{ + bool brk = false; + intptr_t i; + + for (i = 0; i < DIV_ROUND_UP(oprsz, 8); ++i) { + uint64_t this_b, this_g = g[i]; + + brk = compute_brk(&this_b, n[i], this_g, brk, after); + d[i] = this_b & this_g; + } +} + +/* Likewise, but also compute flags. 
*/ +static uint32_t compute_brks_z(uint64_t *d, uint64_t *n, uint64_t *g, + intptr_t oprsz, bool after) +{ + uint32_t flags = PREDTEST_INIT; + bool brk = false; + intptr_t i; + + for (i = 0; i < DIV_ROUND_UP(oprsz, 8); ++i) { + uint64_t this_b, this_d, this_g = g[i]; + + brk = compute_brk(&this_b, n[i], this_g, brk, after); + d[i] = this_d = this_b & this_g; + flags = iter_predtest_fwd(this_d, this_g, flags); + } + return flags; +} + +/* Compute a merging BRK. */ +static void compute_brk_m(uint64_t *d, uint64_t *n, uint64_t *g, + intptr_t oprsz, bool after) +{ + bool brk = false; + intptr_t i; + + for (i = 0; i < DIV_ROUND_UP(oprsz, 8); ++i) { + uint64_t this_b, this_g = g[i]; + + brk = compute_brk(&this_b, n[i], this_g, brk, after); + d[i] = (this_b & this_g) | (d[i] & ~this_g); + } +} + +/* Likewise, but also compute flags. */ +static uint32_t compute_brks_m(uint64_t *d, uint64_t *n, uint64_t *g, + intptr_t oprsz, bool after) +{ + uint32_t flags = PREDTEST_INIT; + bool brk = false; + intptr_t i; + + for (i = 0; i < oprsz / 8; ++i) { + uint64_t this_b, this_d = d[i], this_g = g[i]; + + brk = compute_brk(&this_b, n[i], this_g, brk, after); + d[i] = this_d = (this_b & this_g) | (this_d & ~this_g); + flags = iter_predtest_fwd(this_d, this_g, flags); + } + return flags; +} + +static uint32_t do_zero(ARMPredicateReg *d, intptr_t oprsz) +{ + /* It is quicker to zero the whole predicate than loop on OPRSZ. + * The compiler should turn this into 4 64-bit integer stores. 
+ */ + memset(d, 0, sizeof(ARMPredicateReg)); + return PREDTEST_INIT; +} + +void HELPER(sve_brkpa)(void *vd, void *vn, void *vm, void *vg, + uint32_t pred_desc) +{ + intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + if (last_active_pred(vn, vg, oprsz)) { + compute_brk_z(vd, vm, vg, oprsz, true); + } else { + do_zero(vd, oprsz); + } +} + +uint32_t HELPER(sve_brkpas)(void *vd, void *vn, void *vm, void *vg, + uint32_t pred_desc) +{ + intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + if (last_active_pred(vn, vg, oprsz)) { + return compute_brks_z(vd, vm, vg, oprsz, true); + } else { + return do_zero(vd, oprsz); + } +} + +void HELPER(sve_brkpb)(void *vd, void *vn, void *vm, void *vg, + uint32_t pred_desc) +{ + intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + if (last_active_pred(vn, vg, oprsz)) { + compute_brk_z(vd, vm, vg, oprsz, false); + } else { + do_zero(vd, oprsz); + } +} + +uint32_t HELPER(sve_brkpbs)(void *vd, void *vn, void *vm, void *vg, + uint32_t pred_desc) +{ + intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + if (last_active_pred(vn, vg, oprsz)) { + return compute_brks_z(vd, vm, vg, oprsz, false); + } else { + return do_zero(vd, oprsz); + } +} + +void HELPER(sve_brka_z)(void *vd, void *vn, void *vg, uint32_t pred_desc) +{ + intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + compute_brk_z(vd, vn, vg, oprsz, true); +} + +uint32_t HELPER(sve_brkas_z)(void *vd, void *vn, void *vg, uint32_t pred_desc) +{ + intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + return compute_brks_z(vd, vn, vg, oprsz, true); +} + +void HELPER(sve_brkb_z)(void *vd, void *vn, void *vg, uint32_t pred_desc) +{ + intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + compute_brk_z(vd, vn, vg, oprsz, false); +} + +uint32_t HELPER(sve_brkbs_z)(void *vd, void *vn, void *vg, uint32_t pred_desc) +{ + intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + return compute_brks_z(vd, vn, 
vg, oprsz, false); +} + +void HELPER(sve_brka_m)(void *vd, void *vn, void *vg, uint32_t pred_desc) +{ + intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + compute_brk_m(vd, vn, vg, oprsz, true); +} + +uint32_t HELPER(sve_brkas_m)(void *vd, void *vn, void *vg, uint32_t pred_desc) +{ + intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + return compute_brks_m(vd, vn, vg, oprsz, true); +} + +void HELPER(sve_brkb_m)(void *vd, void *vn, void *vg, uint32_t pred_desc) +{ + intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + compute_brk_m(vd, vn, vg, oprsz, false); +} + +uint32_t HELPER(sve_brkbs_m)(void *vd, void *vn, void *vg, uint32_t pred_desc) +{ + intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + return compute_brks_m(vd, vn, vg, oprsz, false); +} + +void HELPER(sve_brkn)(void *vd, void *vn, void *vg, uint32_t pred_desc) +{ + intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + + if (!last_active_pred(vn, vg, oprsz)) { + do_zero(vd, oprsz); + } +} + +/* As if PredTest(Ones(PL), D, esz). 
*/ +static uint32_t predtest_ones(ARMPredicateReg *d, intptr_t oprsz, + uint64_t esz_mask) +{ + uint32_t flags = PREDTEST_INIT; + intptr_t i; + + for (i = 0; i < oprsz / 8; i++) { + flags = iter_predtest_fwd(d->p[i], esz_mask, flags); + } + if (oprsz & 7) { + uint64_t mask = ~(-1ULL << (8 * (oprsz & 7))); + flags = iter_predtest_fwd(d->p[i], esz_mask & mask, flags); + } + return flags; +} + +uint32_t HELPER(sve_brkns)(void *vd, void *vn, void *vg, uint32_t pred_desc) +{ + intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + + if (last_active_pred(vn, vg, oprsz)) { + return predtest_ones(vd, oprsz, -1); + } else { + return do_zero(vd, oprsz); + } +} diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 00481e92de..c381240b6f 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -2853,6 +2853,112 @@ DO_PPZI(CMPLS, cmpls) #undef DO_PPZI +/* + *** SVE Partition Break Group + */ + +static bool do_brk3(DisasContext *s, arg_rprr_s *a, + gen_helper_gvec_4 *fn, gen_helper_gvec_flags_4 *fn_s) +{ + if (!sve_access_check(s)) { + return true; + } + + unsigned vsz = pred_full_reg_size(s); + + /* Predicate sizes may be smaller and cannot use simd_desc. 
*/ + TCGv_ptr d = tcg_temp_new_ptr(); + TCGv_ptr n = tcg_temp_new_ptr(); + TCGv_ptr m = tcg_temp_new_ptr(); + TCGv_ptr g = tcg_temp_new_ptr(); + TCGv_i32 t = tcg_const_i32(vsz - 2); + + tcg_gen_addi_ptr(d, cpu_env, pred_full_reg_offset(s, a->rd)); + tcg_gen_addi_ptr(n, cpu_env, pred_full_reg_offset(s, a->rn)); + tcg_gen_addi_ptr(m, cpu_env, pred_full_reg_offset(s, a->rm)); + tcg_gen_addi_ptr(g, cpu_env, pred_full_reg_offset(s, a->pg)); + + if (a->s) { + fn_s(t, d, n, m, g, t); + do_pred_flags(t); + } else { + fn(d, n, m, g, t); + } + tcg_temp_free_ptr(d); + tcg_temp_free_ptr(n); + tcg_temp_free_ptr(m); + tcg_temp_free_ptr(g); + tcg_temp_free_i32(t); + return true; +} + +static bool do_brk2(DisasContext *s, arg_rpr_s *a, + gen_helper_gvec_3 *fn, gen_helper_gvec_flags_3 *fn_s) +{ + if (!sve_access_check(s)) { + return true; + } + + unsigned vsz = pred_full_reg_size(s); + + /* Predicate sizes may be smaller and cannot use simd_desc. */ + TCGv_ptr d = tcg_temp_new_ptr(); + TCGv_ptr n = tcg_temp_new_ptr(); + TCGv_ptr g = tcg_temp_new_ptr(); + TCGv_i32 t = tcg_const_i32(vsz - 2); + + tcg_gen_addi_ptr(d, cpu_env, pred_full_reg_offset(s, a->rd)); + tcg_gen_addi_ptr(n, cpu_env, pred_full_reg_offset(s, a->rn)); + tcg_gen_addi_ptr(g, cpu_env, pred_full_reg_offset(s, a->pg)); + + if (a->s) { + fn_s(t, d, n, g, t); + do_pred_flags(t); + } else { + fn(d, n, g, t); + } + tcg_temp_free_ptr(d); + tcg_temp_free_ptr(n); + tcg_temp_free_ptr(g); + tcg_temp_free_i32(t); + return true; +} + +static bool trans_BRKPA(DisasContext *s, arg_rprr_s *a, uint32_t insn) +{ + return do_brk3(s, a, gen_helper_sve_brkpa, gen_helper_sve_brkpas); +} + +static bool trans_BRKPB(DisasContext *s, arg_rprr_s *a, uint32_t insn) +{ + return do_brk3(s, a, gen_helper_sve_brkpb, gen_helper_sve_brkpbs); +} + +static bool trans_BRKA_m(DisasContext *s, arg_rpr_s *a, uint32_t insn) +{ + return do_brk2(s, a, gen_helper_sve_brka_m, gen_helper_sve_brkas_m); +} + +static bool trans_BRKB_m(DisasContext *s, arg_rpr_s *a, 
uint32_t insn) +{ + return do_brk2(s, a, gen_helper_sve_brkb_m, gen_helper_sve_brkbs_m); +} + +static bool trans_BRKA_z(DisasContext *s, arg_rpr_s *a, uint32_t insn) +{ + return do_brk2(s, a, gen_helper_sve_brka_z, gen_helper_sve_brkas_z); +} + +static bool trans_BRKB_z(DisasContext *s, arg_rpr_s *a, uint32_t insn) +{ + return do_brk2(s, a, gen_helper_sve_brkb_z, gen_helper_sve_brkbs_z); +} + +static bool trans_BRKN(DisasContext *s, arg_rpr_s *a, uint32_t insn) +{ + return do_brk2(s, a, gen_helper_sve_brkn, gen_helper_sve_brkns); +} + /* *** SVE Memory - 32-bit Gather and Unsized Contiguous Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 9bc383b085..66e1ee6b5c 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -59,6 +59,7 @@ &rri_esz rd rn imm esz &rrr_esz rd rn rm esz &rpr_esz rd pg rn esz +&rpr_s rd pg rn s &rprr_s rd pg rn rm s &rprr_esz rd pg rn rm esz &rprrr_esz rd pg rn rm ra esz @@ -78,6 +79,9 @@ @pd_pn ........ esz:2 .. .... ....... rn:4 . rd:4 &rr_esz @rd_rn ........ esz:2 ...... ...... rn:5 rd:5 &rr_esz +# Two operand with governing predicate, flags setting +@pd_pg_pn_s ........ . s:1 ...... .. pg:4 . rn:4 . rd:4 &rpr_s + # Three operand with unused vector element size @rd_rn_rm_e0 ........ ... rm:5 ... ... rn:5 rd:5 &rrr_esz esz=0 @@ -560,6 +564,21 @@ PFIRST 00100101 01 011 000 11000 00 .... 0 .... @pd_pn_e0 # SVE predicate next active PNEXT 00100101 .. 011 001 11000 10 .... 0 .... @pd_pn +### SVE Partition Break Group + +# SVE propagate break from previous partition +BRKPA 00100101 0. 00 .... 11 .... 0 .... 0 .... @pd_pg_pn_pm_s +BRKPB 00100101 0. 00 .... 11 .... 0 .... 1 .... @pd_pg_pn_pm_s + +# SVE partition break condition +BRKA_z 00100101 0. 01000001 .... 0 .... 0 .... @pd_pg_pn_s +BRKB_z 00100101 1. 01000001 .... 0 .... 0 .... @pd_pg_pn_s +BRKA_m 00100101 0. 01000001 .... 0 .... 1 .... @pd_pg_pn_s +BRKB_m 00100101 1. 01000001 .... 0 .... 1 .... 
@pd_pg_pn_s
+
+# SVE propagate break to next partition
+BRKN 00100101 0. 01100001 .... 0 .... 0 .... @pd_pg_pn_s
+
 ### SVE Memory - 32-bit Gather and Unsized Contiguous Group
 
 # SVE load predicate register

From patchwork Wed Jun 13 01:56:37 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 138398
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Tue, 12 Jun 2018 15:56:37 -1000
Message-Id: <20180613015641.5667-15-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20180613015641.5667-1-richard.henderson@linaro.org>
References: <20180613015641.5667-1-richard.henderson@linaro.org>
X-Received-From: 2607:f8b0:400e:c01::241 Subject: [Qemu-devel] [PATCH v4b 14/18] target/arm: Implement SVE Predicate Count Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 2 + target/arm/sve_helper.c | 14 ++++ target/arm/translate-sve.c | 133 +++++++++++++++++++++++++++++++++++++ target/arm/sve.decode | 27 ++++++++ 4 files changed, 176 insertions(+) -- 2.17.1 diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index f0a3ed3414..dd4f8f754d 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -676,3 +676,5 @@ DEF_HELPER_FLAGS_4(sve_brkbs_m, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_brkn, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_brkns, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(sve_cntp, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index b27b502d75..a4ecd653c1 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -2724,3 +2724,17 @@ uint32_t HELPER(sve_brkns)(void *vd, void *vn, void *vg, uint32_t pred_desc) return do_zero(vd, oprsz); } } + +uint64_t HELPER(sve_cntp)(void *vn, void *vg, uint32_t pred_desc) +{ + intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + intptr_t esz = extract32(pred_desc, SIMD_DATA_SHIFT, 2); + uint64_t *n = vn, *g = vg, sum = 0, mask = pred_esz_masks[esz]; + intptr_t i; + + for (i = 0; i < DIV_ROUND_UP(oprsz, 8); ++i) { + uint64_t t = n[i] & g[i] & mask; + sum += ctpop64(t); + } + return sum; +} diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index c381240b6f..6b0b8c55d0 100644 --- a/target/arm/translate-sve.c +++ 
b/target/arm/translate-sve.c @@ -34,6 +34,9 @@ #include "translate-a64.h" +typedef void GVecGen2sFn(unsigned, uint32_t, uint32_t, + TCGv_i64, uint32_t, uint32_t); + typedef void gen_helper_gvec_flags_3(TCGv_i32, TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_i32); typedef void gen_helper_gvec_flags_4(TCGv_i32, TCGv_ptr, TCGv_ptr, @@ -2959,6 +2962,136 @@ static bool trans_BRKN(DisasContext *s, arg_rpr_s *a, uint32_t insn) return do_brk2(s, a, gen_helper_sve_brkn, gen_helper_sve_brkns); } +/* + *** SVE Predicate Count Group + */ + +static void do_cntp(DisasContext *s, TCGv_i64 val, int esz, int pn, int pg) +{ + unsigned psz = pred_full_reg_size(s); + + if (psz <= 8) { + uint64_t psz_mask; + + tcg_gen_ld_i64(val, cpu_env, pred_full_reg_offset(s, pn)); + if (pn != pg) { + TCGv_i64 g = tcg_temp_new_i64(); + tcg_gen_ld_i64(g, cpu_env, pred_full_reg_offset(s, pg)); + tcg_gen_and_i64(val, val, g); + tcg_temp_free_i64(g); + } + + /* Reduce the pred_esz_masks value simply to reduce the + * size of the code generated here. 
+ */ + psz_mask = MAKE_64BIT_MASK(0, psz * 8); + tcg_gen_andi_i64(val, val, pred_esz_masks[esz] & psz_mask); + + tcg_gen_ctpop_i64(val, val); + } else { + TCGv_ptr t_pn = tcg_temp_new_ptr(); + TCGv_ptr t_pg = tcg_temp_new_ptr(); + unsigned desc; + TCGv_i32 t_desc; + + desc = psz - 2; + desc = deposit32(desc, SIMD_DATA_SHIFT, 2, esz); + + tcg_gen_addi_ptr(t_pn, cpu_env, pred_full_reg_offset(s, pn)); + tcg_gen_addi_ptr(t_pg, cpu_env, pred_full_reg_offset(s, pg)); + t_desc = tcg_const_i32(desc); + + gen_helper_sve_cntp(val, t_pn, t_pg, t_desc); + tcg_temp_free_ptr(t_pn); + tcg_temp_free_ptr(t_pg); + tcg_temp_free_i32(t_desc); + } +} + +static bool trans_CNTP(DisasContext *s, arg_CNTP *a, uint32_t insn) +{ + if (sve_access_check(s)) { + do_cntp(s, cpu_reg(s, a->rd), a->esz, a->rn, a->pg); + } + return true; +} + +static bool trans_INCDECP_r(DisasContext *s, arg_incdec_pred *a, + uint32_t insn) +{ + if (sve_access_check(s)) { + TCGv_i64 reg = cpu_reg(s, a->rd); + TCGv_i64 val = tcg_temp_new_i64(); + + do_cntp(s, val, a->esz, a->pg, a->pg); + if (a->d) { + tcg_gen_sub_i64(reg, reg, val); + } else { + tcg_gen_add_i64(reg, reg, val); + } + tcg_temp_free_i64(val); + } + return true; +} + +static bool trans_INCDECP_z(DisasContext *s, arg_incdec2_pred *a, + uint32_t insn) +{ + if (a->esz == 0) { + return false; + } + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + TCGv_i64 val = tcg_temp_new_i64(); + GVecGen2sFn *gvec_fn = a->d ? 
tcg_gen_gvec_subs : tcg_gen_gvec_adds; + + do_cntp(s, val, a->esz, a->pg, a->pg); + gvec_fn(a->esz, vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), val, vsz, vsz); + } + return true; +} + +static bool trans_SINCDECP_r_32(DisasContext *s, arg_incdec_pred *a, + uint32_t insn) +{ + if (sve_access_check(s)) { + TCGv_i64 reg = cpu_reg(s, a->rd); + TCGv_i64 val = tcg_temp_new_i64(); + + do_cntp(s, val, a->esz, a->pg, a->pg); + do_sat_addsub_32(reg, val, a->u, a->d); + } + return true; +} + +static bool trans_SINCDECP_r_64(DisasContext *s, arg_incdec_pred *a, + uint32_t insn) +{ + if (sve_access_check(s)) { + TCGv_i64 reg = cpu_reg(s, a->rd); + TCGv_i64 val = tcg_temp_new_i64(); + + do_cntp(s, val, a->esz, a->pg, a->pg); + do_sat_addsub_64(reg, val, a->u, a->d); + } + return true; +} + +static bool trans_SINCDECP_z(DisasContext *s, arg_incdec2_pred *a, + uint32_t insn) +{ + if (a->esz == 0) { + return false; + } + if (sve_access_check(s)) { + TCGv_i64 val = tcg_temp_new_i64(); + do_cntp(s, val, a->esz, a->pg, a->pg); + do_sat_addsub_vec(s, a->esz, a->rd, a->rn, val, a->u, a->d); + } + return true; +} + /* *** SVE Memory - 32-bit Gather and Unsized Contiguous Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 66e1ee6b5c..62d51c252b 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -67,6 +67,8 @@ &ptrue rd esz pat s &incdec_cnt rd pat esz imm d u &incdec2_cnt rd rn pat esz imm d u +&incdec_pred rd pg esz d u +&incdec2_pred rd rn pg esz d u ########################################################################### # Named instruction formats. These are generally used to @@ -113,6 +115,7 @@ # One register operand, with governing predicate, vector element size @rd_pg_rn ........ esz:2 ... ... ... pg:3 rn:5 rd:5 &rpr_esz +@rd_pg4_pn ........ esz:2 ... ... .. pg:4 . rn:4 rd:5 &rpr_esz # Two register operands with a 6-bit signed immediate. @rd_rn_i6 ........ ... rn:5 ..... 
imm:s6 rd:5 &rri @@ -153,6 +156,12 @@ @incdec2_cnt ........ esz:2 .. .... ...... pat:5 rd:5 \ &incdec2_cnt imm=%imm4_16_p1 rn=%reg_movprfx +# One register, predicate. +# User must fill in U and D. +@incdec_pred ........ esz:2 .... .. ..... .. pg:4 rd:5 &incdec_pred +@incdec2_pred ........ esz:2 .... .. ..... .. pg:4 rd:5 \ + &incdec2_pred rn=%reg_movprfx + ########################################################################### # Instruction patterns. Grouped according to the SVE encodingindex.xhtml. @@ -579,6 +588,24 @@ BRKB_m 00100101 1. 01000001 .... 0 .... 1 .... @pd_pg_pn_s # SVE propagate break to next partition BRKN 00100101 0. 01100001 .... 0 .... 0 .... @pd_pg_pn_s +### SVE Predicate Count Group + +# SVE predicate count +CNTP 00100101 .. 100 000 10 .... 0 .... ..... @rd_pg4_pn + +# SVE inc/dec register by predicate count +INCDECP_r 00100101 .. 10110 d:1 10001 00 .... ..... @incdec_pred u=1 + +# SVE inc/dec vector by predicate count +INCDECP_z 00100101 .. 10110 d:1 10000 00 .... ..... @incdec2_pred u=1 + +# SVE saturating inc/dec register by predicate count +SINCDECP_r_32 00100101 .. 1010 d:1 u:1 10001 00 .... ..... @incdec_pred +SINCDECP_r_64 00100101 .. 1010 d:1 u:1 10001 10 .... ..... @incdec_pred + +# SVE saturating inc/dec vector by predicate count +SINCDECP_z 00100101 .. 1010 d:1 u:1 10000 00 .... ..... 
@incdec2_pred + ### SVE Memory - 32-bit Gather and Unsized Contiguous Group # SVE load predicate register From patchwork Wed Jun 13 01:56:38 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 138399
From: Richard Henderson To: qemu-devel@nongnu.org Date: Tue, 12 Jun 2018 15:56:38 -1000 Message-Id: <20180613015641.5667-16-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180613015641.5667-1-richard.henderson@linaro.org> References: <20180613015641.5667-1-richard.henderson@linaro.org>
X-Received-From: 2607:f8b0:400e:c00::22d Subject: [Qemu-devel] [PATCH v4b 15/18] target/arm: Implement SVE Integer Compare - Scalars Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 2 + target/arm/sve_helper.c | 31 ++++++++++++ target/arm/translate-sve.c | 99 ++++++++++++++++++++++++++++++++++++++ target/arm/sve.decode | 8 +++ 4 files changed, 140 insertions(+) -- 2.17.1 diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index dd4f8f754d..1863106d0f 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -678,3 +678,5 @@ DEF_HELPER_FLAGS_4(sve_brkn, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_brkns, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve_cntp, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(sve_while, TCG_CALL_NO_RWG, i32, ptr, i32, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index a4ecd653c1..8539595bd7 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -2738,3 +2738,34 @@ uint64_t HELPER(sve_cntp)(void *vn, void *vg, uint32_t pred_desc) } return sum; } + +uint32_t HELPER(sve_while)(void *vd, uint32_t count, uint32_t pred_desc) +{ + uintptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + intptr_t esz = extract32(pred_desc, SIMD_DATA_SHIFT, 2); + uint64_t esz_mask = pred_esz_masks[esz]; + ARMPredicateReg *d = vd; + uint32_t flags; + intptr_t i; + + /* Begin with a zero predicate register. */ + flags = do_zero(d, oprsz); + if (count == 0) { + return flags; + } + + /* Scale from predicate element count to bits. */ + count <<= esz; + /* Bound to the bits in the predicate. */ + count = MIN(count, oprsz * 8); + + /* Set all of the requested bits. 
*/ + for (i = 0; i < count / 64; ++i) { + d->p[i] = esz_mask; + } + if (count & 63) { + d->p[i] = MAKE_64BIT_MASK(0, count & 63) & esz_mask; + } + + return predtest_ones(d, oprsz, esz_mask); +} diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 6b0b8c55d0..ae6a504f61 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3092,6 +3092,105 @@ static bool trans_SINCDECP_z(DisasContext *s, arg_incdec2_pred *a, return true; } +/* + *** SVE Integer Compare Scalars Group + */ + +static bool trans_CTERM(DisasContext *s, arg_CTERM *a, uint32_t insn) +{ + if (!sve_access_check(s)) { + return true; + } + + TCGCond cond = (a->ne ? TCG_COND_NE : TCG_COND_EQ); + TCGv_i64 rn = read_cpu_reg(s, a->rn, a->sf); + TCGv_i64 rm = read_cpu_reg(s, a->rm, a->sf); + TCGv_i64 cmp = tcg_temp_new_i64(); + + tcg_gen_setcond_i64(cond, cmp, rn, rm); + tcg_gen_extrl_i64_i32(cpu_NF, cmp); + tcg_temp_free_i64(cmp); + + /* VF = !NF & !CF. */ + tcg_gen_xori_i32(cpu_VF, cpu_NF, 1); + tcg_gen_andc_i32(cpu_VF, cpu_VF, cpu_CF); + + /* Both NF and VF actually look at bit 31. */ + tcg_gen_neg_i32(cpu_NF, cpu_NF); + tcg_gen_neg_i32(cpu_VF, cpu_VF); + return true; +} + +static bool trans_WHILE(DisasContext *s, arg_WHILE *a, uint32_t insn) +{ + if (!sve_access_check(s)) { + return true; + } + + TCGv_i64 op0 = read_cpu_reg(s, a->rn, 1); + TCGv_i64 op1 = read_cpu_reg(s, a->rm, 1); + TCGv_i64 t0 = tcg_temp_new_i64(); + TCGv_i64 t1 = tcg_temp_new_i64(); + TCGv_i32 t2, t3; + TCGv_ptr ptr; + unsigned desc, vsz = vec_full_reg_size(s); + TCGCond cond; + + if (!a->sf) { + if (a->u) { + tcg_gen_ext32u_i64(op0, op0); + tcg_gen_ext32u_i64(op1, op1); + } else { + tcg_gen_ext32s_i64(op0, op0); + tcg_gen_ext32s_i64(op1, op1); + } + } + + /* For the helper, compress the different conditions into a computation + * of how many iterations for which the condition is true. 
+ * + * This is slightly complicated by 0 <= UINT64_MAX, which is nominally + * 2**64 iterations, overflowing to 0. Of course, predicate registers + * aren't that large, so any value >= predicate size is sufficient. + */ + tcg_gen_sub_i64(t0, op1, op0); + + /* t0 = MIN(op1 - op0, vsz). */ + tcg_gen_movi_i64(t1, vsz); + tcg_gen_umin_i64(t0, t0, t1); + if (a->eq) { + /* Equality means one more iteration. */ + tcg_gen_addi_i64(t0, t0, 1); + } + + /* t0 = (condition true ? t0 : 0). */ + cond = (a->u + ? (a->eq ? TCG_COND_LEU : TCG_COND_LTU) + : (a->eq ? TCG_COND_LE : TCG_COND_LT)); + tcg_gen_movi_i64(t1, 0); + tcg_gen_movcond_i64(cond, t0, op0, op1, t0, t1); + + t2 = tcg_temp_new_i32(); + tcg_gen_extrl_i64_i32(t2, t0); + tcg_temp_free_i64(t0); + tcg_temp_free_i64(t1); + + desc = (vsz / 8) - 2; + desc = deposit32(desc, SIMD_DATA_SHIFT, 2, a->esz); + t3 = tcg_const_i32(desc); + + ptr = tcg_temp_new_ptr(); + tcg_gen_addi_ptr(ptr, cpu_env, pred_full_reg_offset(s, a->rd)); + + gen_helper_sve_while(t2, ptr, t2, t3); + do_pred_flags(t2); + + tcg_temp_free_ptr(ptr); + tcg_temp_free_i32(t2); + tcg_temp_free_i32(t3); + return true; +} + /* *** SVE Memory - 32-bit Gather and Unsized Contiguous Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 62d51c252b..4b718060a9 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -606,6 +606,14 @@ SINCDECP_r_64 00100101 .. 1010 d:1 u:1 10001 10 .... ..... @incdec_pred # SVE saturating inc/dec vector by predicate count SINCDECP_z 00100101 .. 1010 d:1 u:1 10000 00 .... ..... 
@incdec2_pred +### SVE Integer Compare - Scalars Group + +# SVE conditionally terminate scalars +CTERM 00100101 1 sf:1 1 rm:5 001000 rn:5 ne:1 0000 + +# SVE integer compare scalar count and limit +WHILE 00100101 esz:2 1 rm:5 000 sf:1 u:1 1 rn:5 eq:1 rd:4 + ### SVE Memory - 32-bit Gather and Unsized Contiguous Group # SVE load predicate register From patchwork Wed Jun 13 01:56:39 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 138393
From: Richard Henderson To: qemu-devel@nongnu.org Date: Tue, 12 Jun 2018 15:56:39 -1000 Message-Id: <20180613015641.5667-17-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180613015641.5667-1-richard.henderson@linaro.org> References: <20180613015641.5667-1-richard.henderson@linaro.org>
X-Received-From: 2607:f8b0:400e:c00::242 Subject: [Qemu-devel] [PATCH v4b 16/18] target/arm: Implement FDUP/DUP X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/translate-sve.c | 37 +++++++++++++++++++++++++++++++++++++ target/arm/sve.decode | 8 ++++++++ 2 files changed, 45 insertions(+) -- 2.17.1 diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index ae6a504f61..13d5effff1 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3191,6 +3191,43 @@ static bool trans_WHILE(DisasContext *s, arg_WHILE *a, uint32_t insn) return true; } +/* + *** SVE Integer Wide Immediate - Unpredicated Group + */ + +static bool trans_FDUP(DisasContext *s, arg_FDUP *a, uint32_t insn) +{ + if (a->esz == 0) { + return false; + } + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + int dofs = vec_full_reg_offset(s, a->rd); + uint64_t imm; + + /* Decode the VFP immediate. 
*/ + imm = vfp_expand_imm(a->esz, a->imm); + imm = dup_const(a->esz, imm); + + tcg_gen_gvec_dup64i(dofs, vsz, vsz, imm); + } + return true; +} + +static bool trans_DUP_i(DisasContext *s, arg_DUP_i *a, uint32_t insn) +{ + if (a->esz == 0 && extract32(insn, 13, 1)) { + return false; + } + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + int dofs = vec_full_reg_offset(s, a->rd); + + tcg_gen_gvec_dup64i(dofs, vsz, vsz, dup_const(a->esz, a->imm)); + } + return true; +} + /* *** SVE Memory - 32-bit Gather and Unsized Contiguous Group */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 4b718060a9..b8bd22aff7 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -614,6 +614,14 @@ CTERM 00100101 1 sf:1 1 rm:5 001000 rn:5 ne:1 0000 # SVE integer compare scalar count and limit WHILE 00100101 esz:2 1 rm:5 000 sf:1 u:1 1 rn:5 eq:1 rd:4 +### SVE Integer Wide Immediate - Unpredicated Group + +# SVE broadcast floating-point immediate (unpredicated) +FDUP 00100101 esz:2 111 00 1110 imm:8 rd:5 + +# SVE broadcast integer immediate (unpredicated) +DUP_i 00100101 esz:2 111 00 011 . ........ 
rd:5 imm=%sh8_i8s + ### SVE Memory - 32-bit Gather and Unsized Contiguous Group # SVE load predicate register From patchwork Wed Jun 13 01:56:40 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 138401
From: Richard Henderson To: qemu-devel@nongnu.org Date: Tue, 12 Jun 2018 15:56:40 -1000 Message-Id: <20180613015641.5667-18-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180613015641.5667-1-richard.henderson@linaro.org> References: <20180613015641.5667-1-richard.henderson@linaro.org>
X-Received-From: 2607:f8b0:400e:c05::22a Subject: [Qemu-devel] [PATCH v4b 17/18] target/arm: Implement SVE Integer Wide Immediate - Unpredicated Group X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 25 +++++++ target/arm/sve_helper.c | 41 +++++++++++ target/arm/translate-sve.c | 144 +++++++++++++++++++++++++++++++++++++ target/arm/sve.decode | 26 +++++++ 4 files changed, 236 insertions(+) -- 2.17.1 diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 1863106d0f..97bfe0f47b 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -680,3 +680,28 @@ DEF_HELPER_FLAGS_4(sve_brkns, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve_cntp, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve_while, TCG_CALL_NO_RWG, i32, ptr, i32, i32) + +DEF_HELPER_FLAGS_4(sve_subri_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(sve_subri_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(sve_subri_s, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(sve_subri_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) + +DEF_HELPER_FLAGS_4(sve_smaxi_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(sve_smaxi_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(sve_smaxi_s, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(sve_smaxi_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) + +DEF_HELPER_FLAGS_4(sve_smini_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(sve_smini_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(sve_smini_s, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(sve_smini_d, TCG_CALL_NO_RWG, void, 
ptr, ptr, i64, i32) + +DEF_HELPER_FLAGS_4(sve_umaxi_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(sve_umaxi_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(sve_umaxi_s, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(sve_umaxi_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) + +DEF_HELPER_FLAGS_4(sve_umini_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(sve_umini_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(sve_umini_s, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) +DEF_HELPER_FLAGS_4(sve_umini_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 8539595bd7..128bbf9b04 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -804,6 +804,46 @@ DO_VPZ_D(sve_uminv_d, uint64_t, uint64_t, -1, DO_MIN) #undef DO_VPZ #undef DO_VPZ_D +/* Two vector operand, one scalar operand, unpredicated. */ +#define DO_ZZI(NAME, TYPE, OP) \ +void HELPER(NAME)(void *vd, void *vn, uint64_t s64, uint32_t desc) \ +{ \ + intptr_t i, opr_sz = simd_oprsz(desc) / sizeof(TYPE); \ + TYPE s = s64, *d = vd, *n = vn; \ + for (i = 0; i < opr_sz; ++i) { \ + d[i] = OP(n[i], s); \ + } \ +} + +#define DO_SUBR(X, Y) (Y - X) + +DO_ZZI(sve_subri_b, uint8_t, DO_SUBR) +DO_ZZI(sve_subri_h, uint16_t, DO_SUBR) +DO_ZZI(sve_subri_s, uint32_t, DO_SUBR) +DO_ZZI(sve_subri_d, uint64_t, DO_SUBR) + +DO_ZZI(sve_smaxi_b, int8_t, DO_MAX) +DO_ZZI(sve_smaxi_h, int16_t, DO_MAX) +DO_ZZI(sve_smaxi_s, int32_t, DO_MAX) +DO_ZZI(sve_smaxi_d, int64_t, DO_MAX) + +DO_ZZI(sve_smini_b, int8_t, DO_MIN) +DO_ZZI(sve_smini_h, int16_t, DO_MIN) +DO_ZZI(sve_smini_s, int32_t, DO_MIN) +DO_ZZI(sve_smini_d, int64_t, DO_MIN) + +DO_ZZI(sve_umaxi_b, uint8_t, DO_MAX) +DO_ZZI(sve_umaxi_h, uint16_t, DO_MAX) +DO_ZZI(sve_umaxi_s, uint32_t, DO_MAX) +DO_ZZI(sve_umaxi_d, uint64_t, DO_MAX) + +DO_ZZI(sve_umini_b, uint8_t, DO_MIN) +DO_ZZI(sve_umini_h, uint16_t, DO_MIN) +DO_ZZI(sve_umini_s, uint32_t, 
DO_MIN) +DO_ZZI(sve_umini_d, uint64_t, DO_MIN) + +#undef DO_ZZI + #undef DO_AND #undef DO_ORR #undef DO_EOR @@ -818,6 +858,7 @@ DO_VPZ_D(sve_uminv_d, uint64_t, uint64_t, -1, DO_MIN) #undef DO_ASR #undef DO_LSR #undef DO_LSL +#undef DO_SUBR /* Similar to the ARM LastActiveElement pseudocode function, except the result is multiplied by the element size. This includes the not found diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 13d5effff1..afd0b1638d 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -77,6 +77,11 @@ static inline int expand_imm_sh8s(int x) return (int8_t)x << (x & 0x100 ? 8 : 0); } +static inline int expand_imm_sh8u(int x) +{ + return (uint8_t)x << (x & 0x100 ? 8 : 0); +} + /* * Include the generated decoder. */ @@ -3228,6 +3233,145 @@ static bool trans_DUP_i(DisasContext *s, arg_DUP_i *a, uint32_t insn) return true; } +static bool trans_ADD_zzi(DisasContext *s, arg_rri_esz *a, uint32_t insn) +{ + if (a->esz == 0 && extract32(insn, 13, 1)) { + return false; + } + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + tcg_gen_gvec_addi(a->esz, vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), a->imm, vsz, vsz); + } + return true; +} + +static bool trans_SUB_zzi(DisasContext *s, arg_rri_esz *a, uint32_t insn) +{ + a->imm = -a->imm; + return trans_ADD_zzi(s, a, insn); +} + +static bool trans_SUBR_zzi(DisasContext *s, arg_rri_esz *a, uint32_t insn) +{ + static const GVecGen2s op[4] = { + { .fni8 = tcg_gen_vec_sub8_i64, + .fniv = tcg_gen_sub_vec, + .fno = gen_helper_sve_subri_b, + .opc = INDEX_op_sub_vec, + .vece = MO_8, + .scalar_first = true }, + { .fni8 = tcg_gen_vec_sub16_i64, + .fniv = tcg_gen_sub_vec, + .fno = gen_helper_sve_subri_h, + .opc = INDEX_op_sub_vec, + .vece = MO_16, + .scalar_first = true }, + { .fni4 = tcg_gen_sub_i32, + .fniv = tcg_gen_sub_vec, + .fno = gen_helper_sve_subri_s, + .opc = INDEX_op_sub_vec, + .vece = MO_32, + .scalar_first = true }, + { .fni8 = 
tcg_gen_sub_i64,
+            .fniv = tcg_gen_sub_vec,
+            .fno = gen_helper_sve_subri_d,
+            .opc = INDEX_op_sub_vec,
+            .prefer_i64 = TCG_TARGET_REG_BITS == 64,
+            .vece = MO_64,
+            .scalar_first = true }
+    };
+
+    if (a->esz == 0 && extract32(insn, 13, 1)) {
+        return false;
+    }
+    if (sve_access_check(s)) {
+        unsigned vsz = vec_full_reg_size(s);
+        TCGv_i64 c = tcg_const_i64(a->imm);
+        tcg_gen_gvec_2s(vec_full_reg_offset(s, a->rd),
+                        vec_full_reg_offset(s, a->rn),
+                        vsz, vsz, c, &op[a->esz]);
+        tcg_temp_free_i64(c);
+    }
+    return true;
+}
+
+static bool trans_MUL_zzi(DisasContext *s, arg_rri_esz *a, uint32_t insn)
+{
+    if (sve_access_check(s)) {
+        unsigned vsz = vec_full_reg_size(s);
+        tcg_gen_gvec_muli(a->esz, vec_full_reg_offset(s, a->rd),
+                          vec_full_reg_offset(s, a->rn), a->imm, vsz, vsz);
+    }
+    return true;
+}
+
+static bool do_zzi_sat(DisasContext *s, arg_rri_esz *a, uint32_t insn,
+                       bool u, bool d)
+{
+    if (a->esz == 0 && extract32(insn, 13, 1)) {
+        return false;
+    }
+    if (sve_access_check(s)) {
+        TCGv_i64 val = tcg_const_i64(a->imm);
+        do_sat_addsub_vec(s, a->esz, a->rd, a->rn, val, u, d);
+        tcg_temp_free_i64(val);
+    }
+    return true;
+}
+
+static bool trans_SQADD_zzi(DisasContext *s, arg_rri_esz *a, uint32_t insn)
+{
+    return do_zzi_sat(s, a, insn, false, false);
+}
+
+static bool trans_UQADD_zzi(DisasContext *s, arg_rri_esz *a, uint32_t insn)
+{
+    return do_zzi_sat(s, a, insn, true, false);
+}
+
+static bool trans_SQSUB_zzi(DisasContext *s, arg_rri_esz *a, uint32_t insn)
+{
+    return do_zzi_sat(s, a, insn, false, true);
+}
+
+static bool trans_UQSUB_zzi(DisasContext *s, arg_rri_esz *a, uint32_t insn)
+{
+    return do_zzi_sat(s, a, insn, true, true);
+}
+
+static bool do_zzi_ool(DisasContext *s, arg_rri_esz *a, gen_helper_gvec_2i *fn)
+{
+    if (sve_access_check(s)) {
+        unsigned vsz = vec_full_reg_size(s);
+        TCGv_i64 c = tcg_const_i64(a->imm);
+
+        tcg_gen_gvec_2i_ool(vec_full_reg_offset(s, a->rd),
+                            vec_full_reg_offset(s, a->rn),
+                            c, vsz, vsz, 0, fn);
+        tcg_temp_free_i64(c);
+    }
+    return true;
+}
+
+#define DO_ZZI(NAME, name) \
+static bool trans_##NAME##_zzi(DisasContext *s, arg_rri_esz *a, \
+                               uint32_t insn) \
+{ \
+    static gen_helper_gvec_2i * const fns[4] = { \
+        gen_helper_sve_##name##i_b, gen_helper_sve_##name##i_h, \
+        gen_helper_sve_##name##i_s, gen_helper_sve_##name##i_d, \
+    }; \
+    return do_zzi_ool(s, a, fns[a->esz]); \
+}
+
+DO_ZZI(SMAX, smax)
+DO_ZZI(UMAX, umax)
+DO_ZZI(SMIN, smin)
+DO_ZZI(UMIN, umin)
+
+#undef DO_ZZI
+
 /*
  *** SVE Memory - 32-bit Gather and Unsized Contiguous Group
  */
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index b8bd22aff7..eee8726bdf 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -42,6 +42,8 @@
 
 # Signed 8-bit immediate, optionally shifted left by 8.
 %sh8_i8s        5:9 !function=expand_imm_sh8s
+# Unsigned 8-bit immediate, optionally shifted left by 8.
+%sh8_i8u        5:9 !function=expand_imm_sh8u
 
 # Either a copy of rd (at bit 0), or a different source
 # as propagated via the MOVPRFX instruction.
@@ -95,6 +97,12 @@
 @pd_pn_pm       ........ esz:2 .. rm:4 ....... rn:4 . rd:4      &rrr_esz
 @rdn_rm         ........ esz:2 ...... ...... rm:5 rd:5 \
                 &rrr_esz rn=%reg_movprfx
+@rdn_sh_i8u     ........ esz:2 ...... ...... ..... rd:5 \
+                &rri_esz rn=%reg_movprfx imm=%sh8_i8u
+@rdn_i8u        ........ esz:2 ...... ... imm:8 rd:5 \
+                &rri_esz rn=%reg_movprfx
+@rdn_i8s        ........ esz:2 ...... ... imm:s8 rd:5 \
+                &rri_esz rn=%reg_movprfx
 
 # Three operand with "memory" size, aka immediate left shift
 @rd_rn_msz_rm   ........ ... rm:5 .... imm:2 rn:5 rd:5          &rrri
@@ -622,6 +630,24 @@ FDUP            00100101 esz:2 111 00 1110 imm:8 rd:5
 # SVE broadcast integer immediate (unpredicated)
 DUP_i           00100101 esz:2 111 00 011 . ........ rd:5       imm=%sh8_i8s
 
+# SVE integer add/subtract immediate (unpredicated)
+ADD_zzi         00100101 .. 100 000 11 . ........ .....         @rdn_sh_i8u
+SUB_zzi         00100101 .. 100 001 11 . ........ .....         @rdn_sh_i8u
+SUBR_zzi        00100101 .. 100 011 11 . ........ .....         @rdn_sh_i8u
+SQADD_zzi       00100101 .. 100 100 11 . ........ .....         @rdn_sh_i8u
+UQADD_zzi       00100101 .. 100 101 11 . ........ .....         @rdn_sh_i8u
+SQSUB_zzi       00100101 .. 100 110 11 . ........ .....         @rdn_sh_i8u
+UQSUB_zzi       00100101 .. 100 111 11 . ........ .....         @rdn_sh_i8u
+
+# SVE integer min/max immediate (unpredicated)
+SMAX_zzi        00100101 .. 101 000 110 ........ .....          @rdn_i8s
+UMAX_zzi        00100101 .. 101 001 110 ........ .....          @rdn_i8u
+SMIN_zzi        00100101 .. 101 010 110 ........ .....          @rdn_i8s
+UMIN_zzi        00100101 .. 101 011 110 ........ .....          @rdn_i8u
+
+# SVE integer multiply immediate (unpredicated)
+MUL_zzi         00100101 .. 110 000 110 ........ .....          @rdn_i8s
+
 ### SVE Memory - 32-bit Gather and Unsized Contiguous Group
 
 # SVE load predicate register

From patchwork Wed Jun 13 01:56:41 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 138400
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Tue, 12 Jun 2018 15:56:41 -1000
Message-Id: <20180613015641.5667-19-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20180613015641.5667-1-richard.henderson@linaro.org>
References: <20180613015641.5667-1-richard.henderson@linaro.org>
Subject: [Qemu-devel] [PATCH v4b 18/18] target/arm: Implement SVE Floating Point Arithmetic - Unpredicated Group
Cc: peter.maydell@linaro.org

Reviewed-by: Peter Maydell
Signed-off-by: Richard Henderson
---
 target/arm/helper-sve.h    | 14 ++++++++
 target/arm/helper.h        | 19 +++++++++++
 target/arm/translate-sve.c | 42 +++++++++++++++++++++++
 target/arm/vec_helper.c    | 69 ++++++++++++++++++++++++++++++++++++++
 target/arm/sve.decode      | 10 ++++++
 5 files changed, 154 insertions(+)

--
2.17.1

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index 97bfe0f47b..2e76084992 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -705,3 +705,17 @@ DEF_HELPER_FLAGS_4(sve_umini_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
 DEF_HELPER_FLAGS_4(sve_umini_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
 DEF_HELPER_FLAGS_4(sve_umini_s, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
 DEF_HELPER_FLAGS_4(sve_umini_d, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
+
+DEF_HELPER_FLAGS_5(gvec_recps_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_recps_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_recps_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(gvec_rsqrts_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_rsqrts_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_rsqrts_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
diff --git a/target/arm/helper.h b/target/arm/helper.h
index 0c6a144458..879a7229e9 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -601,6 +601,25 @@ DEF_HELPER_FLAGS_5(gvec_fcmlas_idx, TCG_CALL_NO_RWG,
 DEF_HELPER_FLAGS_5(gvec_fcmlad, TCG_CALL_NO_RWG,
                    void, ptr, ptr, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_5(gvec_fadd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_fadd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_fadd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(gvec_fsub_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_fsub_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_fsub_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(gvec_fmul_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_fmul_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_fmul_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(gvec_ftsmul_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_ftsmul_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_ftsmul_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+
 #ifdef TARGET_AARCH64
 #include "helper-a64.h"
 #include "helper-sve.h"
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index afd0b1638d..226c97579c 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -3372,6 +3372,48 @@ DO_ZZI(UMIN, umin)
 
 #undef DO_ZZI
 
+/*
+ *** SVE Floating Point Arithmetic - Unpredicated Group
+ */
+
+static bool do_zzz_fp(DisasContext *s, arg_rrr_esz *a,
+                      gen_helper_gvec_3_ptr *fn)
+{
+    if (fn == NULL) {
+        return false;
+    }
+    if (sve_access_check(s)) {
+        unsigned vsz = vec_full_reg_size(s);
+        TCGv_ptr status = get_fpstatus_ptr(a->esz == MO_16);
+        tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, a->rd),
+                           vec_full_reg_offset(s, a->rn),
+                           vec_full_reg_offset(s, a->rm),
+                           status, vsz, vsz, 0, fn);
+        tcg_temp_free_ptr(status);
+    }
+    return true;
+}
+
+
+#define DO_FP3(NAME, name) \
+static bool trans_##NAME(DisasContext *s, arg_rrr_esz *a, uint32_t insn) \
+{ \
+    static gen_helper_gvec_3_ptr * const fns[4] = { \
+        NULL, gen_helper_gvec_##name##_h, \
+        gen_helper_gvec_##name##_s, gen_helper_gvec_##name##_d \
+    }; \
+    return do_zzz_fp(s, a, fns[a->esz]); \
+}
+
+DO_FP3(FADD_zzz, fadd)
+DO_FP3(FSUB_zzz, fsub)
+DO_FP3(FMUL_zzz, fmul)
+DO_FP3(FTSMUL, ftsmul)
+DO_FP3(FRECPS, recps)
+DO_FP3(FRSQRTS, rsqrts)
+
+#undef DO_FP3
+
 /*
  *** SVE Memory - 32-bit Gather and Unsized Contiguous Group
  */
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
index 25e209da31..f504dd53c8 100644
--- a/target/arm/vec_helper.c
+++ b/target/arm/vec_helper.c
@@ -426,3 +426,72 @@ void HELPER(gvec_fcmlad)(void *vd, void *vn, void *vm,
     }
     clear_tail(d, opr_sz, simd_maxsz(desc));
 }
+
+/* Floating-point trigonometric starting value.
+ * See the ARM ARM pseudocode function FPTrigSMul.
+ */
+static float16 float16_ftsmul(float16 op1, uint16_t op2, float_status *stat)
+{
+    float16 result = float16_mul(op1, op1, stat);
+    if (!float16_is_any_nan(result)) {
+        result = float16_set_sign(result, op2 & 1);
+    }
+    return result;
+}
+
+static float32 float32_ftsmul(float32 op1, uint32_t op2, float_status *stat)
+{
+    float32 result = float32_mul(op1, op1, stat);
+    if (!float32_is_any_nan(result)) {
+        result = float32_set_sign(result, op2 & 1);
+    }
+    return result;
+}
+
+static float64 float64_ftsmul(float64 op1, uint64_t op2, float_status *stat)
+{
+    float64 result = float64_mul(op1, op1, stat);
+    if (!float64_is_any_nan(result)) {
+        result = float64_set_sign(result, op2 & 1);
+    }
+    return result;
+}
+
+#define DO_3OP(NAME, FUNC, TYPE) \
+void HELPER(NAME)(void *vd, void *vn, void *vm, void *stat, uint32_t desc) \
+{ \
+    intptr_t i, oprsz = simd_oprsz(desc); \
+    TYPE *d = vd, *n = vn, *m = vm; \
+    for (i = 0; i < oprsz / sizeof(TYPE); i++) { \
+        d[i] = FUNC(n[i], m[i], stat); \
+    } \
+}
+
+DO_3OP(gvec_fadd_h, float16_add, float16)
+DO_3OP(gvec_fadd_s, float32_add, float32)
+DO_3OP(gvec_fadd_d, float64_add, float64)
+
+DO_3OP(gvec_fsub_h, float16_sub, float16)
+DO_3OP(gvec_fsub_s, float32_sub, float32)
+DO_3OP(gvec_fsub_d, float64_sub, float64)
+
+DO_3OP(gvec_fmul_h, float16_mul, float16)
+DO_3OP(gvec_fmul_s, float32_mul, float32)
+DO_3OP(gvec_fmul_d, float64_mul, float64)
+
+DO_3OP(gvec_ftsmul_h, float16_ftsmul, float16)
+DO_3OP(gvec_ftsmul_s, float32_ftsmul, float32)
+DO_3OP(gvec_ftsmul_d, float64_ftsmul, float64)
+
+#ifdef TARGET_AARCH64
+
+DO_3OP(gvec_recps_h, helper_recpsf_f16, float16)
+DO_3OP(gvec_recps_s, helper_recpsf_f32, float32)
+DO_3OP(gvec_recps_d, helper_recpsf_f64, float64)
+
+DO_3OP(gvec_rsqrts_h, helper_rsqrtsf_f16, float16)
+DO_3OP(gvec_rsqrts_s, helper_rsqrtsf_f32, float32)
+DO_3OP(gvec_rsqrts_d, helper_rsqrtsf_f64, float64)
+
+#endif
+#undef DO_3OP
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index eee8726bdf..6f436f9096 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -648,6 +648,16 @@ UMIN_zzi        00100101 .. 101 011 110 ........ .....          @rdn_i8u
 # SVE integer multiply immediate (unpredicated)
 MUL_zzi         00100101 .. 110 000 110 ........ .....          @rdn_i8s
 
+### SVE Floating Point Arithmetic - Unpredicated Group
+
+# SVE floating-point arithmetic (unpredicated)
+FADD_zzz        01100101 .. 0 ..... 000 000 ..... .....         @rd_rn_rm
+FSUB_zzz        01100101 .. 0 ..... 000 001 ..... .....         @rd_rn_rm
+FMUL_zzz        01100101 .. 0 ..... 000 010 ..... .....         @rd_rn_rm
+FTSMUL          01100101 .. 0 ..... 000 011 ..... .....         @rd_rn_rm
+FRECPS          01100101 .. 0 ..... 000 110 ..... .....         @rd_rn_rm
+FRSQRTS         01100101 .. 0 ..... 000 111 ..... .....         @rd_rn_rm
+
 ### SVE Memory - 32-bit Gather and Unsized Contiguous Group
 
 # SVE load predicate register
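
Note on FTSMUL: the float*_ftsmul helpers in vec_helper.c square op1 and, unless the product is a NaN, replace its sign bit with bit 0 of op2. The sketch below restates that rule on host `float` values so it can be compiled outside QEMU. It is an illustration only, not the patch's code: `ftsmul_f32` and `vec_ftsmul_f32` are hypothetical names, host arithmetic stands in for softfloat and `float_status`, and the loop only mirrors the per-element shape of DO_3OP.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-in for float32_ftsmul: uses host float arithmetic
 * instead of softfloat, so rounding modes and exception flags are ignored.
 */
static float ftsmul_f32(float op1, uint32_t op2)
{
    float result = op1 * op1;

    if (!isnan(result)) {
        uint32_t bits;

        /* Replace the sign bit with bit 0 of op2, as float32_set_sign does. */
        memcpy(&bits, &result, sizeof(bits));
        bits = (bits & 0x7fffffffu) | ((op2 & 1u) << 31);
        memcpy(&result, &bits, sizeof(result));
    }
    return result;
}

/*
 * Element loop in the style of DO_3OP: oprsz is the operation size in bytes,
 * and m holds the raw 32-bit patterns of the second operand.
 */
static void vec_ftsmul_f32(float *d, const float *n, const uint32_t *m,
                           size_t oprsz)
{
    assert(oprsz % sizeof(float) == 0);
    for (size_t i = 0; i < oprsz / sizeof(float); i++) {
        d[i] = ftsmul_f32(n[i], m[i]);
    }
}
```

With n = {3.0f, -2.0f} and m = {1, 0}, this yields {-9.0f, 4.0f}: the square is always non-negative, and only bit 0 of the second operand decides the sign of the result.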