From patchwork Thu Mar 26 23:08:08 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 184940
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH 01/31] target/arm: Add ID_AA64ZFR0 fields and isar_feature_aa64_sve2
Date: Thu, 26 Mar 2020 16:08:08 -0700
Message-Id: <20200326230838.31112-2-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20200326230838.31112-1-richard.henderson@linaro.org>
References: <20200326230838.31112-1-richard.henderson@linaro.org>
Cc: rajav@quicinc.com, qemu-arm@nongnu.org, apazos@quicinc.com

Will be used for SVE2 isa subset enablement.

Signed-off-by: Richard Henderson
---
 target/arm/cpu.h    | 16 ++++++++++++++++
 target/arm/helper.c |  3 +--
 target/arm/kvm64.c  |  2 ++
 3 files changed, 19 insertions(+), 2 deletions(-)

--
2.20.1

Reviewed-by: Alex Bennée

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index e9f049c8d8..2314e3c18c 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -900,6 +900,7 @@ struct ARMCPU {
         uint64_t id_aa64mmfr2;
         uint64_t id_aa64dfr0;
         uint64_t id_aa64dfr1;
+        uint64_t id_aa64zfr0;
     } isar;
     uint32_t midr;
     uint32_t revidr;
@@ -1860,6 +1861,16 @@ FIELD(ID_AA64DFR0, PMSVER, 32, 4)
 FIELD(ID_AA64DFR0, DOUBLELOCK, 36, 4)
 FIELD(ID_AA64DFR0, TRACEFILT, 40, 4)
 
+FIELD(ID_AA64ZFR0, SVEVER, 0, 4)
+FIELD(ID_AA64ZFR0, AES, 4, 4)
+FIELD(ID_AA64ZFR0, BITPERM, 16, 4)
+FIELD(ID_AA64ZFR0, BFLOAT16, 20, 4)
+FIELD(ID_AA64ZFR0, SHA3, 32, 4)
+FIELD(ID_AA64ZFR0, SM4, 40, 4)
+FIELD(ID_AA64ZFR0, I8MM, 44, 4)
+FIELD(ID_AA64ZFR0, F32MM, 52, 4)
+FIELD(ID_AA64ZFR0, F64MM, 56, 4)
+
 FIELD(ID_DFR0, COPDBG, 0, 4)
 FIELD(ID_DFR0, COPSDBG, 4, 4)
 FIELD(ID_DFR0, MMAPDBG, 8, 4)
@@ -3839,6 +3850,11 @@ static inline bool isar_feature_aa64_ccidx(const ARMISARegisters *id)
     return FIELD_EX64(id->id_aa64mmfr2, ID_AA64MMFR2, CCIDX) != 0;
 }
 
+static inline bool isar_feature_aa64_sve2(const ARMISARegisters *id)
+{
+    return FIELD_EX64(id->id_aa64zfr0, ID_AA64ZFR0, SVEVER) != 0;
+}
+
 /*
  * Feature tests for "does this exist in either 32-bit or 64-bit?"
  */
diff --git a/target/arm/helper.c b/target/arm/helper.c
index b3bc33db41..3767002995 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -7500,8 +7500,7 @@ void register_cp_regs_for_features(ARMCPU *cpu)
               .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 4, .opc2 = 4,
               .access = PL1_R, .type = ARM_CP_CONST,
               .accessfn = access_aa64_tid3,
-              /* At present, only SVEver == 0 is defined anyway. */
-              .resetvalue = 0 },
+              .resetvalue = cpu->isar.id_aa64zfr0 },
             { .name = "ID_AA64PFR5_EL1_RESERVED", .state = ARM_CP_STATE_AA64,
               .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 4, .opc2 = 5,
               .access = PL1_R, .type = ARM_CP_CONST,
diff --git a/target/arm/kvm64.c b/target/arm/kvm64.c
index be5b31c2b0..eda4679fcd 100644
--- a/target/arm/kvm64.c
+++ b/target/arm/kvm64.c
@@ -555,6 +555,8 @@ bool kvm_arm_get_host_cpu_features(ARMHostCPUFeatures *ahcf)
                               ARM64_SYS_REG(3, 0, 0, 7, 1));
         err |= read_sys_reg64(fdarray[2], &ahcf->isar.id_aa64mmfr2,
                               ARM64_SYS_REG(3, 0, 0, 7, 2));
+        err |= read_sys_reg64(fdarray[2], &ahcf->isar.id_aa64zfr0,
+                              ARM64_SYS_REG(3, 0, 0, 4, 4));
 
         /*
          * Note that if AArch32 support is not present in the host,

From patchwork Thu Mar 26 23:08:09 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 184939
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH 02/31] target/arm: Implement SVE2 Integer Multiply - Unpredicated
Date: Thu, 26 Mar 2020 16:08:09 -0700
Message-Id: <20200326230838.31112-3-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20200326230838.31112-1-richard.henderson@linaro.org>
References: <20200326230838.31112-1-richard.henderson@linaro.org>
Cc: rajav@quicinc.com, qemu-arm@nongnu.org, apazos@quicinc.com

For MUL, we can rely on generic support. For SMULH and UMULH,
create some trivial helpers. For PMUL, back in a21bb78e5817,
we organized helper_gvec_pmul_b in preparation for this use.

Signed-off-by: Richard Henderson
---
 target/arm/helper.h        | 10 ++++
 target/arm/sve.decode      |  9 ++++
 target/arm/translate-sve.c | 51 ++++++++++++++++++++
 target/arm/vec_helper.c    | 96 ++++++++++++++++++++++++++++++++++++++
 4 files changed, 166 insertions(+)

--
2.20.1

diff --git a/target/arm/helper.h b/target/arm/helper.h
index d5f1c87192..80bc129763 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -688,6 +688,16 @@ DEF_HELPER_FLAGS_2(frint64_s, TCG_CALL_NO_RWG, f32, f32, ptr)
 DEF_HELPER_FLAGS_2(frint32_d, TCG_CALL_NO_RWG, f64, f64, ptr)
 DEF_HELPER_FLAGS_2(frint64_d, TCG_CALL_NO_RWG, f64, f64, ptr)
 
+DEF_HELPER_FLAGS_4(gvec_smulh_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_smulh_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_smulh_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_smulh_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(gvec_umulh_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_umulh_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_umulh_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_umulh_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
 DEF_HELPER_FLAGS_4(gvec_sshl_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(gvec_sshl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(gvec_ushl_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index 4f580a25e7..58e0b808e9 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -1093,3 +1093,12 @@ ST1_zprz 1110010 .. 00 ..... 100 ... ..... ..... \
                 @rprr_scatter_store xs=0 esz=3 scale=0
 ST1_zprz        1110010 .. 00 ..... 110 ... ..... ..... \
                 @rprr_scatter_store xs=1 esz=3 scale=0
+
+#### SVE2 Support
+
+### SVE2 Integer Multiply - Unpredicated
+
+MUL_zzz         00000100 .. 1 ..... 0110 00 ..... .....  @rd_rn_rm
+SMULH_zzz       00000100 .. 1 ..... 0110 10 ..... .....  @rd_rn_rm
+UMULH_zzz       00000100 .. 1 ..... 0110 11 ..... .....  @rd_rn_rm
+PMUL_zzz        00000100 00 1 ..... 0110 01 ..... .....  @rd_rn_rm_e0
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index acf962b6b0..e962f45b32 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -5829,3 +5829,54 @@ static bool trans_MOVPRFX_z(DisasContext *s, arg_rpr_esz *a)
     }
     return true;
 }
+
+/*
+ * SVE2 Integer Multiply - Unpredicated
+ */
+
+static bool trans_MUL_zzz(DisasContext *s, arg_rrr_esz *a)
+{
+    if (!dc_isar_feature(aa64_sve2, s)) {
+        return false;
+    }
+    return do_vector3_z(s, tcg_gen_gvec_mul, a->esz, a->rd, a->rn, a->rm);
+}
+
+static bool do_sve2_zzz_ool(DisasContext *s, arg_rrr_esz *a,
+                            gen_helper_gvec_3 *fn)
+{
+    if (fn == NULL || !dc_isar_feature(aa64_sve2, s)) {
+        return false;
+    }
+    if (sve_access_check(s)) {
+        unsigned vsz = vec_full_reg_size(s);
+        tcg_gen_gvec_3_ool(vec_full_reg_offset(s, a->rd),
+                           vec_full_reg_offset(s, a->rn),
+                           vec_full_reg_offset(s, a->rm),
+                           vsz, vsz, 0, fn);
+    }
+    return true;
+}
+
+static bool trans_SMULH_zzz(DisasContext *s, arg_rrr_esz *a)
+{
+    static gen_helper_gvec_3 * const fns[4] = {
+        gen_helper_gvec_smulh_b, gen_helper_gvec_smulh_h,
+        gen_helper_gvec_smulh_s, gen_helper_gvec_smulh_d,
+    };
+    return do_sve2_zzz_ool(s, a, fns[a->esz]);
+}
+
+static bool trans_UMULH_zzz(DisasContext *s, arg_rrr_esz *a)
+{
+    static gen_helper_gvec_3 * const fns[4] = {
+        gen_helper_gvec_umulh_b, gen_helper_gvec_umulh_h,
+        gen_helper_gvec_umulh_s, gen_helper_gvec_umulh_d,
+    };
+    return do_sve2_zzz_ool(s, a, fns[a->esz]);
+}
+
+static bool trans_PMUL_zzz(DisasContext *s, arg_rrr_esz *a)
+{
+    return do_sve2_zzz_ool(s, a, gen_helper_gvec_pmul_b);
+}
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
index 8017bd88c4..00dc38c9db 100644
--- a/target/arm/vec_helper.c
+++ b/target/arm/vec_helper.c
@@ -1257,3 +1257,99 @@ void HELPER(sve2_pmull_h)(void *vd, void *vn, void *vm, uint32_t desc)
     }
 }
 #endif
+
+/*
+ * NxN -> N highpart multiply
+ *
+ * TODO: expose this as a generic vector operation.
+ */
+
+void HELPER(gvec_smulh_b)(void *vd, void *vn, void *vm, uint32_t desc)
+{
+    intptr_t i, opr_sz = simd_oprsz(desc);
+    int8_t *d = vd, *n = vn, *m = vm;
+
+    for (i = 0; i < opr_sz; ++i) {
+        d[i] = ((int32_t)n[i] * m[i]) >> 8;
+    }
+    clear_tail(d, opr_sz, simd_maxsz(desc));
+}
+
+void HELPER(gvec_smulh_h)(void *vd, void *vn, void *vm, uint32_t desc)
+{
+    intptr_t i, opr_sz = simd_oprsz(desc);
+    int16_t *d = vd, *n = vn, *m = vm;
+
+    for (i = 0; i < opr_sz / 2; ++i) {
+        d[i] = ((int32_t)n[i] * m[i]) >> 16;
+    }
+    clear_tail(d, opr_sz, simd_maxsz(desc));
+}
+
+void HELPER(gvec_smulh_s)(void *vd, void *vn, void *vm, uint32_t desc)
+{
+    intptr_t i, opr_sz = simd_oprsz(desc);
+    int32_t *d = vd, *n = vn, *m = vm;
+
+    for (i = 0; i < opr_sz / 4; ++i) {
+        d[i] = ((int64_t)n[i] * m[i]) >> 32;
+    }
+    clear_tail(d, opr_sz, simd_maxsz(desc));
+}
+
+void HELPER(gvec_smulh_d)(void *vd, void *vn, void *vm, uint32_t desc)
+{
+    intptr_t i, opr_sz = simd_oprsz(desc);
+    uint64_t *d = vd, *n = vn, *m = vm;
+    uint64_t discard;
+
+    for (i = 0; i < opr_sz / 8; ++i) {
+        muls64(&discard, &d[i], n[i], m[i]);
+    }
+    clear_tail(d, opr_sz, simd_maxsz(desc));
+}
+
+void HELPER(gvec_umulh_b)(void *vd, void *vn, void *vm, uint32_t desc)
+{
+    intptr_t i, opr_sz = simd_oprsz(desc);
+    uint8_t *d = vd, *n = vn, *m = vm;
+
+    for (i = 0; i < opr_sz; ++i) {
+        d[i] = ((uint32_t)n[i] * m[i]) >> 8;
+    }
+    clear_tail(d, opr_sz, simd_maxsz(desc));
+}
+
+void HELPER(gvec_umulh_h)(void *vd, void *vn, void *vm, uint32_t desc)
+{
+    intptr_t i, opr_sz = simd_oprsz(desc);
+    uint16_t *d = vd, *n = vn, *m = vm;
+
+    for (i = 0; i < opr_sz / 2; ++i) {
+        d[i] = ((uint32_t)n[i] * m[i]) >> 16;
+    }
+    clear_tail(d, opr_sz, simd_maxsz(desc));
+}
+
+void HELPER(gvec_umulh_s)(void *vd, void *vn, void *vm, uint32_t desc)
+{
+    intptr_t i, opr_sz = simd_oprsz(desc);
+    uint32_t *d = vd, *n = vn, *m = vm;
+
+    for (i = 0; i < opr_sz / 4; ++i) {
+        d[i] = ((uint64_t)n[i] * m[i]) >> 32;
+    }
+    clear_tail(d, opr_sz, simd_maxsz(desc));
+}
+
+void HELPER(gvec_umulh_d)(void *vd, void *vn, void *vm, uint32_t desc)
+{
+    intptr_t i, opr_sz = simd_oprsz(desc);
+    uint64_t *d = vd, *n = vn, *m = vm;
+    uint64_t discard;
+
+    for (i = 0; i < opr_sz / 8; ++i) {
+        mulu64(&discard, &d[i], n[i], m[i]);
+    }
+    clear_tail(d, opr_sz, simd_maxsz(desc));
+}

From patchwork Thu Mar 26 23:08:10 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 184944
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH 03/31] target/arm: Implement SVE2 integer pairwise add and accumulate long
Date: Thu, 26 Mar 2020 16:08:10 -0700
Message-Id: <20200326230838.31112-4-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20200326230838.31112-1-richard.henderson@linaro.org>
References: <20200326230838.31112-1-richard.henderson@linaro.org>
X-Received-From: 2607:f8b0:4864:20::1043 X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: rajav@quicinc.com, qemu-arm@nongnu.org, apazos@quicinc.com Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 14 ++++++++++++ target/arm/sve.decode | 5 +++++ target/arm/sve_helper.c | 44 ++++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 39 +++++++++++++++++++++++++++++++++ 4 files changed, 102 insertions(+) -- 2.20.1 diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 11d627981d..854cd97fdf 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -158,6 +158,20 @@ DEF_HELPER_FLAGS_5(sve_umulh_zpzz_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(sve_umulh_zpzz_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sadalp_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sadalp_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sadalp_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_uadalp_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uadalp_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uadalp_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_sdiv_zpzz_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_sdiv_zpzz_d, TCG_CALL_NO_RWG, diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 58e0b808e9..6691145854 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1102,3 +1102,8 @@ MUL_zzz 00000100 .. 1 ..... 0110 00 ..... ..... @rd_rn_rm SMULH_zzz 00000100 .. 1 ..... 0110 10 ..... ..... @rd_rn_rm UMULH_zzz 00000100 .. 1 ..... 0110 11 ..... ..... 
@rd_rn_rm PMUL_zzz 00000100 00 1 ..... 0110 01 ..... ..... @rd_rn_rm_e0 + +### SVE2 Integer - Predicated + +SADALP_zpzz 01000100 .. 000 100 101 ... ..... ..... @rdm_pg_rn +UADALP_zpzz 01000100 .. 000 101 101 ... ..... ..... @rdm_pg_rn diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index d40b1994aa..7dc17421e9 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -517,6 +517,50 @@ DO_ZPZZ_D(sve_asr_zpzz_d, int64_t, DO_ASR) DO_ZPZZ_D(sve_lsr_zpzz_d, uint64_t, DO_LSR) DO_ZPZZ_D(sve_lsl_zpzz_d, uint64_t, DO_LSL) +static inline uint16_t do_sadalp_h(uint16_t n, uint16_t m) +{ + int8_t n1 = n, n2 = n >> 8; + return m + n1 + n2; +} + +static inline uint32_t do_sadalp_s(uint32_t n, uint32_t m) +{ + int16_t n1 = n, n2 = n >> 16; + return m + n1 + n2; +} + +static inline uint64_t do_sadalp_d(uint64_t n, uint64_t m) +{ + int32_t n1 = n, n2 = n >> 32; + return m + n1 + n2; +} + +DO_ZPZZ(sve2_sadalp_zpzz_h, int16_t, H1_2, do_sadalp_h) +DO_ZPZZ(sve2_sadalp_zpzz_s, int32_t, H1_4, do_sadalp_s) +DO_ZPZZ_D(sve2_sadalp_zpzz_d, uint64_t, do_sadalp_d) + +static inline uint16_t do_uadalp_h(uint16_t n, uint16_t m) +{ + uint8_t n1 = n, n2 = n >> 8; + return m + n1 + n2; +} + +static inline uint32_t do_uadalp_s(uint32_t n, uint32_t m) +{ + uint16_t n1 = n, n2 = n >> 16; + return m + n1 + n2; +} + +static inline uint64_t do_uadalp_d(uint64_t n, uint64_t m) +{ + uint32_t n1 = n, n2 = n >> 32; + return m + n1 + n2; +} + +DO_ZPZZ(sve2_uadalp_zpzz_h, int16_t, H1_2, do_uadalp_h) +DO_ZPZZ(sve2_uadalp_zpzz_s, int32_t, H1_4, do_uadalp_s) +DO_ZPZZ_D(sve2_uadalp_zpzz_d, uint64_t, do_uadalp_d) + #undef DO_ZPZZ #undef DO_ZPZZ_D diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index e962f45b32..bc8321f7cd 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -5880,3 +5880,42 @@ static bool trans_PMUL_zzz(DisasContext *s, arg_rrr_esz *a) { return do_sve2_zzz_ool(s, a, gen_helper_gvec_pmul_b); } + +/* + * SVE2 Integer - Predicated + 
*/ + +static bool do_sve2_zpzz_ool(DisasContext *s, arg_rprr_esz *a, + gen_helper_gvec_4 *fn) +{ + if (!dc_isar_feature(aa64_sve2, s)) { + return false; + } + return do_zpzz_ool(s, a, fn); +} + +static bool trans_SADALP_zpzz(DisasContext *s, arg_rprr_esz *a) +{ + static gen_helper_gvec_4 * const fns[3] = { + gen_helper_sve2_sadalp_zpzz_h, + gen_helper_sve2_sadalp_zpzz_s, + gen_helper_sve2_sadalp_zpzz_d, + }; + if (a->esz == 0) { + return false; + } + return do_sve2_zpzz_ool(s, a, fns[a->esz - 1]); +} + +static bool trans_UADALP_zpzz(DisasContext *s, arg_rprr_esz *a) +{ + static gen_helper_gvec_4 * const fns[3] = { + gen_helper_sve2_uadalp_zpzz_h, + gen_helper_sve2_uadalp_zpzz_s, + gen_helper_sve2_uadalp_zpzz_d, + }; + if (a->esz == 0) { + return false; + } + return do_sve2_zpzz_ool(s, a, fns[a->esz - 1]); +}
From patchwork Thu Mar 26 23:08:11 2020
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 184943
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH 04/31] target/arm: Remove fp_status from helper_{recpe, rsqrte}_u32
Date: Thu, 26 Mar 2020 16:08:11 -0700
Message-Id: <20200326230838.31112-5-richard.henderson@linaro.org>
In-Reply-To: <20200326230838.31112-1-richard.henderson@linaro.org>
References: <20200326230838.31112-1-richard.henderson@linaro.org>
Cc: rajav@quicinc.com, qemu-arm@nongnu.org, apazos@quicinc.com

These operations do not touch fp_status.
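To illustrate why no floating-point status is needed, here is a minimal standalone sketch of an unsigned reciprocal estimate written purely with integer arithmetic. It is not taken from this patch: the names `recip_estimate_sketch` and `urecpe_u32_sketch` are hypothetical, and the estimate step is an assumption based on the Arm pseudocode for `RecipEstimate` rather than a copy of QEMU's helper.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the 9-bit estimate step (assumed to follow
 * the Arm RecipEstimate pseudocode).  Input is in [256, 512).
 */
static int recip_estimate_sketch(int input)
{
    assert(input >= 256 && input < 512);
    int a = input * 2 + 1;
    int b = (1 << 19) / a;
    return (b + 1) >> 1;
}

/*
 * Hypothetical unsigned reciprocal estimate: integer in, integer out.
 * Nothing here reads rounding modes or raises FP exception flags.
 */
static uint32_t urecpe_u32_sketch(uint32_t a)
{
    if ((a & 0x80000000u) == 0) {
        /* Operand below 0.5 in the fixed-point view: saturate. */
        return 0xffffffffu;
    }
    int input = (int)((a >> 23) & 0x1ff);   /* bits [31:23] */
    return (uint32_t)recip_estimate_sketch(input) << 23;
}
```

Because the function depends only on its operand, a call site needs no status pointer, which is the shape the one-argument helper declarations in this patch take.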
Signed-off-by: Richard Henderson --- target/arm/helper.h | 4 ++-- target/arm/translate-a64.c | 5 ++--- target/arm/translate.c | 12 ++---------- target/arm/vfp_helper.c | 4 ++-- 4 files changed, 8 insertions(+), 17 deletions(-) -- 2.20.1 diff --git a/target/arm/helper.h b/target/arm/helper.h index 80bc129763..938fdbc362 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -213,8 +213,8 @@ DEF_HELPER_FLAGS_2(recpe_f64, TCG_CALL_NO_RWG, f64, f64, ptr) DEF_HELPER_FLAGS_2(rsqrte_f16, TCG_CALL_NO_RWG, f16, f16, ptr) DEF_HELPER_FLAGS_2(rsqrte_f32, TCG_CALL_NO_RWG, f32, f32, ptr) DEF_HELPER_FLAGS_2(rsqrte_f64, TCG_CALL_NO_RWG, f64, f64, ptr) -DEF_HELPER_2(recpe_u32, i32, i32, ptr) -DEF_HELPER_FLAGS_2(rsqrte_u32, TCG_CALL_NO_RWG, i32, i32, ptr) +DEF_HELPER_FLAGS_1(recpe_u32, TCG_CALL_NO_RWG, i32, i32) +DEF_HELPER_FLAGS_1(rsqrte_u32, TCG_CALL_NO_RWG, i32, i32) DEF_HELPER_FLAGS_4(neon_tbl, TCG_CALL_NO_RWG, i32, i32, i32, ptr, i32) DEF_HELPER_3(shl_cc, i32, env, i32, i32) diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c index db41e3d72a..2bcf643069 100644 --- a/target/arm/translate-a64.c +++ b/target/arm/translate-a64.c @@ -10220,7 +10220,7 @@ static void handle_2misc_reciprocal(DisasContext *s, int opcode, switch (opcode) { case 0x3c: /* URECPE */ - gen_helper_recpe_u32(tcg_res, tcg_op, fpst); + gen_helper_recpe_u32(tcg_res, tcg_op); break; case 0x3d: /* FRECPE */ gen_helper_recpe_f32(tcg_res, tcg_op, fpst); @@ -12802,7 +12802,6 @@ static void disas_simd_two_reg_misc(DisasContext *s, uint32_t insn) unallocated_encoding(s); return; } - need_fpstatus = true; break; case 0x1e: /* FRINT32Z */ case 0x1f: /* FRINT64Z */ @@ -12970,7 +12969,7 @@ static void disas_simd_two_reg_misc(DisasContext *s, uint32_t insn) gen_helper_rints_exact(tcg_res, tcg_op, tcg_fpstatus); break; case 0x7c: /* URSQRTE */ - gen_helper_rsqrte_u32(tcg_res, tcg_op, tcg_fpstatus); + gen_helper_rsqrte_u32(tcg_res, tcg_op); break; case 0x1e: /* FRINT32Z */ case 0x5e: /* FRINT32X */ 
diff --git a/target/arm/translate.c b/target/arm/translate.c index b38af6149a..cba84987db 100644 --- a/target/arm/translate.c +++ b/target/arm/translate.c @@ -6711,19 +6711,11 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn) break; } case NEON_2RM_VRECPE: - { - TCGv_ptr fpstatus = get_fpstatus_ptr(1); - gen_helper_recpe_u32(tmp, tmp, fpstatus); - tcg_temp_free_ptr(fpstatus); + gen_helper_recpe_u32(tmp, tmp); break; - } case NEON_2RM_VRSQRTE: - { - TCGv_ptr fpstatus = get_fpstatus_ptr(1); - gen_helper_rsqrte_u32(tmp, tmp, fpstatus); - tcg_temp_free_ptr(fpstatus); + gen_helper_rsqrte_u32(tmp, tmp); break; - } case NEON_2RM_VRECPE_F: { TCGv_ptr fpstatus = get_fpstatus_ptr(1); diff --git a/target/arm/vfp_helper.c b/target/arm/vfp_helper.c index 930d6e747f..a792661166 100644 --- a/target/arm/vfp_helper.c +++ b/target/arm/vfp_helper.c @@ -1023,7 +1023,7 @@ float64 HELPER(rsqrte_f64)(float64 input, void *fpstp) return make_float64(val); } -uint32_t HELPER(recpe_u32)(uint32_t a, void *fpstp) +uint32_t HELPER(recpe_u32)(uint32_t a) { /* float_status *s = fpstp; */ int input, estimate; @@ -1038,7 +1038,7 @@ uint32_t HELPER(recpe_u32)(uint32_t a, void *fpstp) return deposit32(0, (32 - 9), 9, estimate); } -uint32_t HELPER(rsqrte_u32)(uint32_t a, void *fpstp) +uint32_t HELPER(rsqrte_u32)(uint32_t a) { int estimate;
From patchwork Thu Mar 26 23:08:12 2020
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 184948
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH 05/31] target/arm: Implement SVE2 integer unary operations (predicated)
Date: Thu, 26 Mar 2020 16:08:12 -0700
Message-Id: <20200326230838.31112-6-richard.henderson@linaro.org>
In-Reply-To: <20200326230838.31112-1-richard.henderson@linaro.org>
References: <20200326230838.31112-1-richard.henderson@linaro.org>
X-Received-From: 2607:f8b0:4864:20::443 X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: rajav@quicinc.com, qemu-arm@nongnu.org, apazos@quicinc.com Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 13 +++++++++++ target/arm/sve.decode | 7 ++++++ target/arm/sve_helper.c | 25 ++++++++++++++++---- target/arm/translate-sve.c | 47 ++++++++++++++++++++++++++++++++++++++ 4 files changed, 88 insertions(+), 4 deletions(-) -- 2.20.1 diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 854cd97fdf..d3b7c3bd12 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -507,6 +507,19 @@ DEF_HELPER_FLAGS_4(sve_rbit_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_rbit_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_rbit_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqabs_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqabs_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqabs_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqabs_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_sqneg_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqneg_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqneg_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqneg_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_urecpe_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_ursqrte_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_splice, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_cmpeq_ppzz_b, TCG_CALL_NO_RWG, diff --git a/target/arm/sve.decode 
b/target/arm/sve.decode index 6691145854..95a9c65451 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1107,3 +1107,10 @@ PMUL_zzz 00000100 00 1 ..... 0110 01 ..... ..... @rd_rn_rm_e0 SADALP_zpzz 01000100 .. 000 100 101 ... ..... ..... @rdm_pg_rn UADALP_zpzz 01000100 .. 000 101 101 ... ..... ..... @rdm_pg_rn + +### SVE2 integer unary operations (predicated) + +URECPE 01000100 .. 000 000 101 ... ..... ..... @rd_pg_rn +URSQRTE 01000100 .. 000 001 101 ... ..... ..... @rd_pg_rn +SQABS 01000100 .. 001 000 101 ... ..... ..... @rd_pg_rn +SQNEG 01000100 .. 001 001 101 ... ..... ..... @rd_pg_rn diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 7dc17421e9..16606331fc 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -535,8 +535,8 @@ static inline uint64_t do_sadalp_d(uint64_t n, uint64_t m) return m + n1 + n2; } -DO_ZPZZ(sve2_sadalp_zpzz_h, int16_t, H1_2, do_sadalp_h) -DO_ZPZZ(sve2_sadalp_zpzz_s, int32_t, H1_4, do_sadalp_s) +DO_ZPZZ(sve2_sadalp_zpzz_h, uint16_t, H1_2, do_sadalp_h) +DO_ZPZZ(sve2_sadalp_zpzz_s, uint32_t, H1_4, do_sadalp_s) DO_ZPZZ_D(sve2_sadalp_zpzz_d, uint64_t, do_sadalp_d) static inline uint16_t do_uadalp_h(uint16_t n, uint16_t m) @@ -557,8 +557,8 @@ static inline uint64_t do_uadalp_d(uint64_t n, uint64_t m) return m + n1 + n2; } -DO_ZPZZ(sve2_uadalp_zpzz_h, int16_t, H1_2, do_uadalp_h) -DO_ZPZZ(sve2_uadalp_zpzz_s, int32_t, H1_4, do_uadalp_s) +DO_ZPZZ(sve2_uadalp_zpzz_h, uint16_t, H1_2, do_uadalp_h) +DO_ZPZZ(sve2_uadalp_zpzz_s, uint32_t, H1_4, do_uadalp_s) DO_ZPZZ_D(sve2_uadalp_zpzz_d, uint64_t, do_uadalp_d) #undef DO_ZPZZ @@ -728,6 +728,23 @@ DO_ZPZ(sve_rbit_h, uint16_t, H1_2, revbit16) DO_ZPZ(sve_rbit_s, uint32_t, H1_4, revbit32) DO_ZPZ_D(sve_rbit_d, uint64_t, revbit64) +#define DO_SQABS(N) (N == -N ? N - 1 : N < 0 ? 
-N : N) + +DO_ZPZ(sve2_sqabs_b, int8_t, H1, DO_SQABS) +DO_ZPZ(sve2_sqabs_h, int16_t, H1_2, DO_SQABS) +DO_ZPZ(sve2_sqabs_s, int32_t, H1_4, DO_SQABS) +DO_ZPZ_D(sve2_sqabs_d, int64_t, DO_SQABS) + +#define DO_SQNEG(N) (N == -N ? N - 1 : -N) + +DO_ZPZ(sve2_sqneg_b, uint8_t, H1, DO_SQNEG) +DO_ZPZ(sve2_sqneg_h, uint16_t, H1_2, DO_SQNEG) +DO_ZPZ(sve2_sqneg_s, uint32_t, H1_4, DO_SQNEG) +DO_ZPZ_D(sve2_sqneg_d, uint64_t, DO_SQNEG) + +DO_ZPZ(sve2_urecpe_s, uint32_t, H1_4, helper_recpe_u32) +DO_ZPZ(sve2_ursqrte_s, uint32_t, H1_4, helper_rsqrte_u32) + /* Three-operand expander, unpredicated, in which the third operand is "wide". */ #define DO_ZZW(NAME, TYPE, TYPEW, H, OP) \ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index bc8321f7cd..938ec08673 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -5919,3 +5919,50 @@ static bool trans_UADALP_zpzz(DisasContext *s, arg_rprr_esz *a) } return do_sve2_zpzz_ool(s, a, fns[a->esz - 1]); } + +/* + * SVE2 integer unary operations (predicated) + */ + +static bool do_sve2_zpz_ool(DisasContext *s, arg_rpr_esz *a, + gen_helper_gvec_3 *fn) +{ + if (!dc_isar_feature(aa64_sve2, s)) { + return false; + } + return do_zpz_ool(s, a, fn); +} + +static bool trans_URECPE(DisasContext *s, arg_rpr_esz *a) +{ + if (a->esz != 2) { + return false; + } + return do_sve2_zpz_ool(s, a, gen_helper_sve2_urecpe_s); +} + +static bool trans_URSQRTE(DisasContext *s, arg_rpr_esz *a) +{ + if (a->esz != 2) { + return false; + } + return do_sve2_zpz_ool(s, a, gen_helper_sve2_ursqrte_s); +} + +static bool trans_SQABS(DisasContext *s, arg_rpr_esz *a) +{ + static gen_helper_gvec_3 * const fns[4] = { + gen_helper_sve2_sqabs_b, gen_helper_sve2_sqabs_h, + gen_helper_sve2_sqabs_s, gen_helper_sve2_sqabs_d, + }; + return do_sve2_zpz_ool(s, a, fns[a->esz]); +} + +static bool trans_SQNEG(DisasContext *s, arg_rpr_esz *a) +{ + static gen_helper_gvec_3 * const fns[4] = { + gen_helper_sve2_sqneg_b, gen_helper_sve2_sqneg_h, + 
gen_helper_sve2_sqneg_s, gen_helper_sve2_sqneg_d, + }; + return do_sve2_zpz_ool(s, a, fns[a->esz]); +}
From patchwork Thu Mar 26 23:08:13 2020
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 184947
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH 06/31] target/arm: Split out saturating/rounding shifts from neon
Date: Thu, 26 Mar 2020 16:08:13 -0700
Message-Id: <20200326230838.31112-7-richard.henderson@linaro.org>
In-Reply-To: <20200326230838.31112-1-richard.henderson@linaro.org>
References: <20200326230838.31112-1-richard.henderson@linaro.org>
Cc: rajav@quicinc.com, qemu-arm@nongnu.org, apazos@quicinc.com

Split these operations out into a header that can be shared between
neon and sve. The "sat" pointer acts both as a boolean for control
of saturating behavior and controls the difference in behavior
between neon and sve -- QC bit or no QC bit.

Implement right-shift rounding as

    tmp = src >> (shift - 1);
    dst = (tmp >> 1) + (tmp & 1);

This is the same number of instructions as the current

    tmp = 1 << (shift - 1);
    dst = (src + tmp) >> shift;

without any possibility of intermediate overflow.
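The two rounding formulas in the message above can be compared in a small standalone sketch. The function names `round_shift_naive` and `round_shift_split` are hypothetical and not part of the patch. The sketch assumes unsigned 32-bit inputs and a shift count from 1 to 31, and it deliberately keeps the "add half" form in 32 bits so that the wrap is visible.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: "add half, then shift".  The addition wraps
 * when src is within half a step of UINT32_MAX.
 */
static inline uint32_t round_shift_naive(uint32_t src, int shift)
{
    assert(shift >= 1 && shift <= 31);
    uint32_t tmp = 1u << (shift - 1);
    return (src + tmp) >> shift;
}

/*
 * Hypothetical helper: the form from the commit message.  Shift by one
 * less than requested, then add back the last discarded bit.  No
 * intermediate value is larger than the input.
 */
static inline uint32_t round_shift_split(uint32_t src, int shift)
{
    assert(shift >= 1 && shift <= 31);
    uint32_t tmp = src >> (shift - 1);
    return (tmp >> 1) + (tmp & 1);
}
```

For small inputs the two agree, for example both map 5 with a shift of 1 to 3. For `src = 0xFFFFFFFF` and a shift of 4, the split form yields `0x10000000`, while the 32-bit naive form wraps during the addition and yields 0.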
Signed-off-by: Richard Henderson --- target/arm/vec_internal.h | 161 ++++++++++++ target/arm/neon_helper.c | 507 +++++++------------------------------- 2 files changed, 244 insertions(+), 424 deletions(-) create mode 100644 target/arm/vec_internal.h -- 2.20.1 diff --git a/target/arm/vec_internal.h b/target/arm/vec_internal.h new file mode 100644 index 0000000000..0d1f9c86c8 --- /dev/null +++ b/target/arm/vec_internal.h @@ -0,0 +1,161 @@ +/* + * ARM AdvSIMD / SVE Vector Helpers + * + * Copyright (c) 2020 Linaro + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library; if not, see . + */ + +#ifndef TARGET_ARM_VEC_INTERNALS_H +#define TARGET_ARM_VEC_INTERNALS_H + +static inline int32_t do_sqrshl_bhs(int32_t src, int8_t shift, int bits, + bool round, uint32_t *sat) +{ + if (shift <= -bits) { + /* Rounding the sign bit always produces 0. 
*/
+        if (round) {
+            return 0;
+        }
+        return src >> 31;
+    } else if (shift < 0) {
+        if (round) {
+            src >>= -shift - 1;
+            return (src >> 1) + (src & 1);
+        }
+        return src >> -shift;
+    } else if (shift < bits) {
+        int32_t val = src << shift;
+        if (bits == 32) {
+            if (!sat || val >> shift == src) {
+                return val;
+            }
+        } else {
+            int32_t extval = sextract32(val, 0, bits);
+            if (!sat || val == extval) {
+                return extval;
+            }
+        }
+    } else if (!sat || src == 0) {
+        return 0;
+    }
+
+    *sat = 1;
+    return (1u << (bits - 1)) - (src >= 0);
+}
+
+static inline uint32_t do_uqrshl_bhs(uint32_t src, int8_t shift, int bits,
+                                     bool round, uint32_t *sat)
+{
+    if (shift <= -(bits + round)) {
+        return 0;
+    } else if (shift < 0) {
+        if (round) {
+            src >>= -shift - 1;
+            return (src >> 1) + (src & 1);
+        }
+        return src >> -shift;
+    } else if (shift < bits) {
+        uint32_t val = src << shift;
+        if (bits == 32) {
+            if (!sat || val >> shift == src) {
+                return val;
+            }
+        } else {
+            uint32_t extval = extract32(val, 0, bits);
+            if (!sat || val == extval) {
+                return extval;
+            }
+        }
+    } else if (!sat || src == 0) {
+        return 0;
+    }
+
+    *sat = 1;
+    return MAKE_64BIT_MASK(0, bits);
+}
+
+static inline int32_t do_suqrshl_bhs(int32_t src, int8_t shift, int bits,
+                                     bool round, uint32_t *sat)
+{
+    if (src < 0) {
+        *sat = 1;
+        return 0;
+    }
+    return do_uqrshl_bhs(src, shift, bits, round, sat);
+}
+
+static inline int64_t do_sqrshl_d(int64_t src, int8_t shift,
+                                  bool round, uint32_t *sat)
+{
+    if (shift <= -64) {
+        /* Rounding the sign bit always produces 0. */
+        if (round) {
+            return 0;
+        }
+        return src >> 63;
+    } else if (shift < 0) {
+        if (round) {
+            src >>= -shift - 1;
+            return (src >> 1) + (src & 1);
+        }
+        return src >> -shift;
+    } else if (shift < 64) {
+        int64_t val = src << shift;
+        if (!sat || val >> shift == src) {
+            return val;
+        }
+    } else if (!sat || src == 0) {
+        return 0;
+    }
+
+    *sat = 1;
+    return src < 0 ? INT64_MIN : INT64_MAX;
+}
+
+static inline uint64_t do_uqrshl_d(uint64_t src, int8_t shift,
+                                   bool round, uint32_t *sat)
+{
+    if (shift <= -(64 + round)) {
+        return 0;
+    } else if (shift < 0) {
+        if (round) {
+            src >>= -shift - 1;
+            return (src >> 1) + (src & 1);
+        }
+        return src >> -shift;
+    } else if (shift < 64) {
+        uint64_t val = src << shift;
+        if (!sat || val >> shift == src) {
+            return val;
+        }
+    } else if (!sat || src == 0) {
+        return 0;
+    }
+
+    *sat = 1;
+    return UINT64_MAX;
+}
+
+static inline int64_t do_suqrshl_d(int64_t src, int8_t shift,
+                                   bool round, uint32_t *sat)
+{
+    if (src < 0) {
+        *sat = 1;
+        return 0;
+    }
+    return do_uqrshl_d(src, shift, round, sat);
+}
+
+#endif /* TARGET_ARM_VEC_INTERNALS_H */
diff --git a/target/arm/neon_helper.c b/target/arm/neon_helper.c
index c7a8438b42..e6481a5764 100644
--- a/target/arm/neon_helper.c
+++ b/target/arm/neon_helper.c
@@ -11,6 +11,7 @@
 #include "cpu.h"
 #include "exec/helper-proto.h"
 #include "fpu/softfloat.h"
+#include "vec_internal.h"
 
 #define SIGNBIT (uint32_t)0x80000000
 #define SIGNBIT64 ((uint64_t)1 << 63)
@@ -604,496 +605,154 @@ NEON_VOP(abd_s32, neon_s32, 1)
 NEON_VOP(abd_u32, neon_u32, 1)
 #undef NEON_FN
 
-#define NEON_FN(dest, src1, src2) do { \
-    int8_t tmp; \
-    tmp = (int8_t)src2; \
-    if (tmp >= (ssize_t)sizeof(src1) * 8 || \
-        tmp <= -(ssize_t)sizeof(src1) * 8) { \
-        dest = 0; \
-    } else if (tmp < 0) { \
-        dest = src1 >> -tmp; \
-    } else { \
-        dest = src1 << tmp; \
-    }} while (0)
+#define NEON_FN(dest, src1, src2) \
+    (dest = do_uqrshl_bhs(src1, src2, 16, false, NULL))
 NEON_VOP(shl_u16, neon_u16, 2)
 #undef NEON_FN
 
-#define NEON_FN(dest, src1, src2) do { \
-    int8_t tmp; \
-    tmp = (int8_t)src2; \
-    if (tmp >= (ssize_t)sizeof(src1) * 8) { \
-        dest = 0; \
-    } else if (tmp <= -(ssize_t)sizeof(src1) * 8) { \
-        dest = src1 >> (sizeof(src1) * 8 - 1); \
-    } else if (tmp < 0) { \
-        dest = src1 >> -tmp; \
-    } else { \
-        dest = src1 << tmp; \
-    }} while (0)
+#define NEON_FN(dest, src1, src2) \
+    (dest = do_sqrshl_bhs(src1, src2, 16, false, NULL))
 NEON_VOP(shl_s16, neon_s16, 2)
 #undef NEON_FN
 
-#define NEON_FN(dest, src1, src2) do { \
-    int8_t tmp; \
-    tmp = (int8_t)src2; \
-    if ((tmp >= (ssize_t)sizeof(src1) * 8) \
-        || (tmp <= -(ssize_t)sizeof(src1) * 8)) { \
-        dest = 0; \
-    } else if (tmp < 0) { \
-        dest = (src1 + (1 << (-1 - tmp))) >> -tmp; \
-    } else { \
-        dest = src1 << tmp; \
-    }} while (0)
+#define NEON_FN(dest, src1, src2) \
+    (dest = do_sqrshl_bhs(src1, src2, 8, true, NULL))
 NEON_VOP(rshl_s8, neon_s8, 4)
+#undef NEON_FN
+
+#define NEON_FN(dest, src1, src2) \
+    (dest = do_sqrshl_bhs(src1, src2, 16, true, NULL))
 NEON_VOP(rshl_s16, neon_s16, 2)
 #undef NEON_FN
 
-/* The addition of the rounding constant may overflow, so we use an
- * intermediate 64 bit accumulator. */
-uint32_t HELPER(neon_rshl_s32)(uint32_t valop, uint32_t shiftop)
+uint32_t HELPER(neon_rshl_s32)(uint32_t val, uint32_t shift)
 {
-    int32_t dest;
-    int32_t val = (int32_t)valop;
-    int8_t shift = (int8_t)shiftop;
-    if ((shift >= 32) || (shift <= -32)) {
-        dest = 0;
-    } else if (shift < 0) {
-        int64_t big_dest = ((int64_t)val + (1 << (-1 - shift)));
-        dest = big_dest >> -shift;
-    } else {
-        dest = val << shift;
-    }
-    return dest;
+    return do_sqrshl_bhs(val, shift, 32, true, NULL);
 }
 
-/* Handling addition overflow with 64 bit input values is more
- * tricky than with 32 bit values. */
-uint64_t HELPER(neon_rshl_s64)(uint64_t valop, uint64_t shiftop)
+uint64_t HELPER(neon_rshl_s64)(uint64_t val, uint64_t shift)
 {
-    int8_t shift = (int8_t)shiftop;
-    int64_t val = valop;
-    if ((shift >= 64) || (shift <= -64)) {
-        val = 0;
-    } else if (shift < 0) {
-        val >>= (-shift - 1);
-        if (val == INT64_MAX) {
-            /* In this case, it means that the rounding constant is 1,
-             * and the addition would overflow. Return the actual
-             * result directly. */
-            val = 0x4000000000000000LL;
-        } else {
-            val++;
-            val >>= 1;
-        }
-    } else {
-        val <<= shift;
-    }
-    return val;
+    return do_sqrshl_d(val, shift, true, NULL);
 }
 
-#define NEON_FN(dest, src1, src2) do { \
-    int8_t tmp; \
-    tmp = (int8_t)src2; \
-    if (tmp >= (ssize_t)sizeof(src1) * 8 || \
-        tmp < -(ssize_t)sizeof(src1) * 8) { \
-        dest = 0; \
-    } else if (tmp == -(ssize_t)sizeof(src1) * 8) { \
-        dest = src1 >> (-tmp - 1); \
-    } else if (tmp < 0) { \
-        dest = (src1 + (1 << (-1 - tmp))) >> -tmp; \
-    } else { \
-        dest = src1 << tmp; \
-    }} while (0)
+#define NEON_FN(dest, src1, src2) \
+    (dest = do_uqrshl_bhs(src1, src2, 8, true, NULL))
 NEON_VOP(rshl_u8, neon_u8, 4)
+#undef NEON_FN
+
+#define NEON_FN(dest, src1, src2) \
+    (dest = do_uqrshl_bhs(src1, src2, 16, true, NULL))
 NEON_VOP(rshl_u16, neon_u16, 2)
 #undef NEON_FN
 
-/* The addition of the rounding constant may overflow, so we use an
- * intermediate 64 bit accumulator. */
-uint32_t HELPER(neon_rshl_u32)(uint32_t val, uint32_t shiftop)
+uint32_t HELPER(neon_rshl_u32)(uint32_t val, uint32_t shift)
 {
-    uint32_t dest;
-    int8_t shift = (int8_t)shiftop;
-    if (shift >= 32 || shift < -32) {
-        dest = 0;
-    } else if (shift == -32) {
-        dest = val >> 31;
-    } else if (shift < 0) {
-        uint64_t big_dest = ((uint64_t)val + (1 << (-1 - shift)));
-        dest = big_dest >> -shift;
-    } else {
-        dest = val << shift;
-    }
-    return dest;
+    return do_uqrshl_bhs(val, shift, 32, true, NULL);
 }
 
-/* Handling addition overflow with 64 bit input values is more
- * tricky than with 32 bit values. */
-uint64_t HELPER(neon_rshl_u64)(uint64_t val, uint64_t shiftop)
+uint64_t HELPER(neon_rshl_u64)(uint64_t val, uint64_t shift)
 {
-    int8_t shift = (uint8_t)shiftop;
-    if (shift >= 64 || shift < -64) {
-        val = 0;
-    } else if (shift == -64) {
-        /* Rounding a 1-bit result just preserves that bit. */
-        val >>= 63;
-    } else if (shift < 0) {
-        val >>= (-shift - 1);
-        if (val == UINT64_MAX) {
-            /* In this case, it means that the rounding constant is 1,
-             * and the addition would overflow. Return the actual
-             * result directly. */
-            val = 0x8000000000000000ULL;
-        } else {
-            val++;
-            val >>= 1;
-        }
-    } else {
-        val <<= shift;
-    }
-    return val;
+    return do_uqrshl_d(val, shift, true, NULL);
 }
 
-#define NEON_FN(dest, src1, src2) do { \
-    int8_t tmp; \
-    tmp = (int8_t)src2; \
-    if (tmp >= (ssize_t)sizeof(src1) * 8) { \
-        if (src1) { \
-            SET_QC(); \
-            dest = ~0; \
-        } else { \
-            dest = 0; \
-        } \
-    } else if (tmp <= -(ssize_t)sizeof(src1) * 8) { \
-        dest = 0; \
-    } else if (tmp < 0) { \
-        dest = src1 >> -tmp; \
-    } else { \
-        dest = src1 << tmp; \
-        if ((dest >> tmp) != src1) { \
-            SET_QC(); \
-            dest = ~0; \
-        } \
-    }} while (0)
+#define NEON_FN(dest, src1, src2) \
+    (dest = do_uqrshl_bhs(src1, src2, 8, false, env->vfp.qc))
 NEON_VOP_ENV(qshl_u8, neon_u8, 4)
+#undef NEON_FN
+
+#define NEON_FN(dest, src1, src2) \
+    (dest = do_uqrshl_bhs(src1, src2, 16, false, env->vfp.qc))
 NEON_VOP_ENV(qshl_u16, neon_u16, 2)
-NEON_VOP_ENV(qshl_u32, neon_u32, 1)
 #undef NEON_FN
 
-uint64_t HELPER(neon_qshl_u64)(CPUARMState *env, uint64_t val, uint64_t shiftop)
+uint32_t HELPER(neon_qshl_u32)(CPUARMState *env, uint32_t val, uint32_t shift)
 {
-    int8_t shift = (int8_t)shiftop;
-    if (shift >= 64) {
-        if (val) {
-            val = ~(uint64_t)0;
-            SET_QC();
-        }
-    } else if (shift <= -64) {
-        val = 0;
-    } else if (shift < 0) {
-        val >>= -shift;
-    } else {
-        uint64_t tmp = val;
-        val <<= shift;
-        if ((val >> shift) != tmp) {
-            SET_QC();
-            val = ~(uint64_t)0;
-        }
-    }
-    return val;
+    return do_uqrshl_bhs(val, shift, 32, false, env->vfp.qc);
 }
 
-#define NEON_FN(dest, src1, src2) do { \
-    int8_t tmp; \
-    tmp = (int8_t)src2; \
-    if (tmp >= (ssize_t)sizeof(src1) * 8) { \
-        if (src1) { \
-            SET_QC(); \
-            dest = (uint32_t)(1 << (sizeof(src1) * 8 - 1)); \
-            if (src1 > 0) { \
-                dest--; \
-            } \
-        } else { \
-            dest = src1; \
-        } \
-    } else if (tmp <= -(ssize_t)sizeof(src1) * 8) { \
-        dest = src1 >> 31; \
-    } else if (tmp < 0) { \
-        dest = src1 >> -tmp; \
-    } else { \
-        dest = src1 << tmp; \
-        if ((dest >> tmp) != src1) { \
-            SET_QC(); \
-            dest = (uint32_t)(1 << (sizeof(src1) * 8 - 1)); \
-            if (src1 > 0) { \
-                dest--; \
-            } \
-        } \
-    }} while (0)
+uint64_t HELPER(neon_qshl_u64)(CPUARMState *env, uint64_t val, uint64_t shift)
+{
+    return do_uqrshl_d(val, shift, false, env->vfp.qc);
+}
+
+#define NEON_FN(dest, src1, src2) \
+    (dest = do_sqrshl_bhs(src1, src2, 8, false, env->vfp.qc))
 NEON_VOP_ENV(qshl_s8, neon_s8, 4)
+#undef NEON_FN
+
+#define NEON_FN(dest, src1, src2) \
+    (dest = do_sqrshl_bhs(src1, src2, 16, false, env->vfp.qc))
 NEON_VOP_ENV(qshl_s16, neon_s16, 2)
-NEON_VOP_ENV(qshl_s32, neon_s32, 1)
 #undef NEON_FN
 
-uint64_t HELPER(neon_qshl_s64)(CPUARMState *env, uint64_t valop, uint64_t shiftop)
+uint32_t HELPER(neon_qshl_s32)(CPUARMState *env, uint32_t val, uint32_t shift)
 {
-    int8_t shift = (uint8_t)shiftop;
-    int64_t val = valop;
-    if (shift >= 64) {
-        if (val) {
-            SET_QC();
-            val = (val >> 63) ^ ~SIGNBIT64;
-        }
-    } else if (shift <= -64) {
-        val >>= 63;
-    } else if (shift < 0) {
-        val >>= -shift;
-    } else {
-        int64_t tmp = val;
-        val <<= shift;
-        if ((val >> shift) != tmp) {
-            SET_QC();
-            val = (tmp >> 63) ^ ~SIGNBIT64;
-        }
-    }
-    return val;
+    return do_sqrshl_bhs(val, shift, 32, false, env->vfp.qc);
 }
 
-#define NEON_FN(dest, src1, src2) do { \
-    if (src1 & (1 << (sizeof(src1) * 8 - 1))) { \
-        SET_QC(); \
-        dest = 0; \
-    } else { \
-        int8_t tmp; \
-        tmp = (int8_t)src2; \
-        if (tmp >= (ssize_t)sizeof(src1) * 8) { \
-            if (src1) { \
-                SET_QC(); \
-                dest = ~0; \
-            } else { \
-                dest = 0; \
-            } \
-        } else if (tmp <= -(ssize_t)sizeof(src1) * 8) { \
-            dest = 0; \
-        } else if (tmp < 0) { \
-            dest = src1 >> -tmp; \
-        } else { \
-            dest = src1 << tmp; \
-            if ((dest >> tmp) != src1) { \
-                SET_QC(); \
-                dest = ~0; \
-            } \
-        } \
-    }} while (0)
-NEON_VOP_ENV(qshlu_s8, neon_u8, 4)
-NEON_VOP_ENV(qshlu_s16, neon_u16, 2)
+uint64_t HELPER(neon_qshl_s64)(CPUARMState *env, uint64_t val, uint64_t shift)
+{
+    return do_sqrshl_d(val, shift, false, env->vfp.qc);
+}
+
+#define NEON_FN(dest, src1, src2) \
+    (dest = do_suqrshl_bhs(src1, src2, 8, false, env->vfp.qc))
+NEON_VOP_ENV(qshlu_s8, neon_s8, 4)
 #undef NEON_FN
 
-uint32_t HELPER(neon_qshlu_s32)(CPUARMState *env, uint32_t valop, uint32_t shiftop)
+#define NEON_FN(dest, src1, src2) \
+    (dest = do_suqrshl_bhs(src1, src2, 16, false, env->vfp.qc))
+NEON_VOP_ENV(qshlu_s16, neon_s16, 2)
+#undef NEON_FN
+
+uint32_t HELPER(neon_qshlu_s32)(CPUARMState *env, uint32_t val, uint32_t shift)
 {
-    if ((int32_t)valop < 0) {
-        SET_QC();
-        return 0;
-    }
-    return helper_neon_qshl_u32(env, valop, shiftop);
+    return do_suqrshl_bhs(val, shift, 32, false, env->vfp.qc);
 }
 
-uint64_t HELPER(neon_qshlu_s64)(CPUARMState *env, uint64_t valop, uint64_t shiftop)
+uint64_t HELPER(neon_qshlu_s64)(CPUARMState *env, uint64_t val, uint64_t shift)
 {
-    if ((int64_t)valop < 0) {
-        SET_QC();
-        return 0;
-    }
-    return helper_neon_qshl_u64(env, valop, shiftop);
+    return do_suqrshl_d(val, shift, false, env->vfp.qc);
 }
 
-#define NEON_FN(dest, src1, src2) do { \
-    int8_t tmp; \
-    tmp = (int8_t)src2; \
-    if (tmp >= (ssize_t)sizeof(src1) * 8) { \
-        if (src1) { \
-            SET_QC(); \
-            dest = ~0; \
-        } else { \
-            dest = 0; \
-        } \
-    } else if (tmp < -(ssize_t)sizeof(src1) * 8) { \
-        dest = 0; \
-    } else if (tmp == -(ssize_t)sizeof(src1) * 8) { \
-        dest = src1 >> (sizeof(src1) * 8 - 1); \
-    } else if (tmp < 0) { \
-        dest = (src1 + (1 << (-1 - tmp))) >> -tmp; \
-    } else { \
-        dest = src1 << tmp; \
-        if ((dest >> tmp) != src1) { \
-            SET_QC(); \
-            dest = ~0; \
-        } \
-    }} while (0)
+#define NEON_FN(dest, src1, src2) \
+    (dest = do_uqrshl_bhs(src1, src2, 8, true, env->vfp.qc))
 NEON_VOP_ENV(qrshl_u8, neon_u8, 4)
+#undef NEON_FN
+
+#define NEON_FN(dest, src1, src2) \
+    (dest = do_uqrshl_bhs(src1, src2, 16, true, env->vfp.qc))
 NEON_VOP_ENV(qrshl_u16, neon_u16, 2)
 #undef NEON_FN
 
-/* The addition of the rounding constant may overflow, so we use an
- * intermediate 64 bit accumulator. */
-uint32_t HELPER(neon_qrshl_u32)(CPUARMState *env, uint32_t val, uint32_t shiftop)
+uint32_t HELPER(neon_qrshl_u32)(CPUARMState *env, uint32_t val, uint32_t shift)
 {
-    uint32_t dest;
-    int8_t shift = (int8_t)shiftop;
-    if (shift >= 32) {
-        if (val) {
-            SET_QC();
-            dest = ~0;
-        } else {
-            dest = 0;
-        }
-    } else if (shift < -32) {
-        dest = 0;
-    } else if (shift == -32) {
-        dest = val >> 31;
-    } else if (shift < 0) {
-        uint64_t big_dest = ((uint64_t)val + (1 << (-1 - shift)));
-        dest = big_dest >> -shift;
-    } else {
-        dest = val << shift;
-        if ((dest >> shift) != val) {
-            SET_QC();
-            dest = ~0;
-        }
-    }
-    return dest;
+    return do_uqrshl_bhs(val, shift, 32, true, env->vfp.qc);
 }
 
-/* Handling addition overflow with 64 bit input values is more
- * tricky than with 32 bit values. */
-uint64_t HELPER(neon_qrshl_u64)(CPUARMState *env, uint64_t val, uint64_t shiftop)
+uint64_t HELPER(neon_qrshl_u64)(CPUARMState *env, uint64_t val, uint64_t shift)
 {
-    int8_t shift = (int8_t)shiftop;
-    if (shift >= 64) {
-        if (val) {
-            SET_QC();
-            val = ~0;
-        }
-    } else if (shift < -64) {
-        val = 0;
-    } else if (shift == -64) {
-        val >>= 63;
-    } else if (shift < 0) {
-        val >>= (-shift - 1);
-        if (val == UINT64_MAX) {
-            /* In this case, it means that the rounding constant is 1,
-             * and the addition would overflow. Return the actual
-             * result directly. */
-            val = 0x8000000000000000ULL;
-        } else {
-            val++;
-            val >>= 1;
-        }
-    } else { \
-        uint64_t tmp = val;
-        val <<= shift;
-        if ((val >> shift) != tmp) {
-            SET_QC();
-            val = ~0;
-        }
-    }
-    return val;
+    return do_uqrshl_d(val, shift, true, env->vfp.qc);
 }
 
-#define NEON_FN(dest, src1, src2) do { \
-    int8_t tmp; \
-    tmp = (int8_t)src2; \
-    if (tmp >= (ssize_t)sizeof(src1) * 8) { \
-        if (src1) { \
-            SET_QC(); \
-            dest = (typeof(dest))(1 << (sizeof(src1) * 8 - 1)); \
-            if (src1 > 0) { \
-                dest--; \
-            } \
-        } else { \
-            dest = 0; \
-        } \
-    } else if (tmp <= -(ssize_t)sizeof(src1) * 8) { \
-        dest = 0; \
-    } else if (tmp < 0) { \
-        dest = (src1 + (1 << (-1 - tmp))) >> -tmp; \
-    } else { \
-        dest = src1 << tmp; \
-        if ((dest >> tmp) != src1) { \
-            SET_QC(); \
-            dest = (uint32_t)(1 << (sizeof(src1) * 8 - 1)); \
-            if (src1 > 0) { \
-                dest--; \
-            } \
-        } \
-    }} while (0)
+#define NEON_FN(dest, src1, src2) \
+    (dest = do_sqrshl_bhs(src1, src2, 8, true, env->vfp.qc))
 NEON_VOP_ENV(qrshl_s8, neon_s8, 4)
+#undef NEON_FN
+
+#define NEON_FN(dest, src1, src2) \
+    (dest = do_sqrshl_bhs(src1, src2, 16, true, env->vfp.qc))
 NEON_VOP_ENV(qrshl_s16, neon_s16, 2)
 #undef NEON_FN
 
-/* The addition of the rounding constant may overflow, so we use an
- * intermediate 64 bit accumulator. */
-uint32_t HELPER(neon_qrshl_s32)(CPUARMState *env, uint32_t valop, uint32_t shiftop)
+uint32_t HELPER(neon_qrshl_s32)(CPUARMState *env, uint32_t val, uint32_t shift)
 {
-    int32_t dest;
-    int32_t val = (int32_t)valop;
-    int8_t shift = (int8_t)shiftop;
-    if (shift >= 32) {
-        if (val) {
-            SET_QC();
-            dest = (val >> 31) ^ ~SIGNBIT;
-        } else {
-            dest = 0;
-        }
-    } else if (shift <= -32) {
-        dest = 0;
-    } else if (shift < 0) {
-        int64_t big_dest = ((int64_t)val + (1 << (-1 - shift)));
-        dest = big_dest >> -shift;
-    } else {
-        dest = val << shift;
-        if ((dest >> shift) != val) {
-            SET_QC();
-            dest = (val >> 31) ^ ~SIGNBIT;
-        }
-    }
-    return dest;
+    return do_sqrshl_bhs(val, shift, 32, true, env->vfp.qc);
 }
 
-/* Handling addition overflow with 64 bit input values is more
- * tricky than with 32 bit values. */
-uint64_t HELPER(neon_qrshl_s64)(CPUARMState *env, uint64_t valop, uint64_t shiftop)
+uint64_t HELPER(neon_qrshl_s64)(CPUARMState *env, uint64_t val, uint64_t shift)
 {
-    int8_t shift = (uint8_t)shiftop;
-    int64_t val = valop;
-
-    if (shift >= 64) {
-        if (val) {
-            SET_QC();
-            val = (val >> 63) ^ ~SIGNBIT64;
-        }
-    } else if (shift <= -64) {
-        val = 0;
-    } else if (shift < 0) {
-        val >>= (-shift - 1);
-        if (val == INT64_MAX) {
-            /* In this case, it means that the rounding constant is 1,
-             * and the addition would overflow. Return the actual
-             * result directly. */
-            val = 0x4000000000000000ULL;
-        } else {
-            val++;
-            val >>= 1;
-        }
-    } else {
-        int64_t tmp = val;
-        val <<= shift;
-        if ((val >> shift) != tmp) {
-            SET_QC();
-            val = (tmp >> 63) ^ ~SIGNBIT64;
-        }
-    }
-    return val;
+    return do_sqrshl_d(val, shift, true, env->vfp.qc);
 }
 
 uint32_t HELPER(neon_add_u8)(uint32_t a, uint32_t b)

From patchwork Thu Mar 26 23:08:14 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 184945
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH 07/31] target/arm: Implement SVE2 saturating/rounding bitwise shift left (predicated)
Date: Thu, 26 Mar 2020 16:08:14 -0700
Message-Id: <20200326230838.31112-8-richard.henderson@linaro.org>
In-Reply-To: <20200326230838.31112-1-richard.henderson@linaro.org>
References: <20200326230838.31112-1-richard.henderson@linaro.org>
Cc: rajav@quicinc.com, qemu-arm@nongnu.org, apazos@quicinc.com

Signed-off-by: Richard Henderson
---
 target/arm/helper-sve.h    | 54 ++++++++++++++++++++++++++
 target/arm/sve.decode      | 17 +++++++++
 target/arm/sve_helper.c    | 78 ++++++++++++++++++++++++++++++++++++++
 target/arm/translate-sve.c | 18 +++++++++
 4 files changed, 167 insertions(+)

--
2.20.1

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index d3b7c3bd12..0eecf33249 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -172,6 +172,60 @@ DEF_HELPER_FLAGS_5(sve2_uadalp_zpzz_s, TCG_CALL_NO_RWG,
 DEF_HELPER_FLAGS_5(sve2_uadalp_zpzz_d, TCG_CALL_NO_RWG,
                    void, ptr, ptr, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_5(sve2_srshl_zpzz_b, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve2_srshl_zpzz_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve2_srshl_zpzz_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve2_srshl_zpzz_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(sve2_urshl_zpzz_b, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve2_urshl_zpzz_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve2_urshl_zpzz_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve2_urshl_zpzz_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(sve2_sqshl_zpzz_b, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve2_sqshl_zpzz_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve2_sqshl_zpzz_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve2_sqshl_zpzz_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(sve2_uqshl_zpzz_b, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve2_uqshl_zpzz_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve2_uqshl_zpzz_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve2_uqshl_zpzz_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(sve2_sqrshl_zpzz_b, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve2_sqrshl_zpzz_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve2_sqrshl_zpzz_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve2_sqrshl_zpzz_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(sve2_uqrshl_zpzz_b, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve2_uqrshl_zpzz_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve2_uqrshl_zpzz_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve2_uqrshl_zpzz_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+
 DEF_HELPER_FLAGS_5(sve_sdiv_zpzz_s, TCG_CALL_NO_RWG,
                    void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(sve_sdiv_zpzz_d, TCG_CALL_NO_RWG,
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index 95a9c65451..f0b6692e43 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -1114,3 +1114,20 @@ URECPE          01000100 .. 000 000 101 ... ..... .....  @rd_pg_rn
 URSQRTE         01000100 .. 000 001 101 ... ..... .....  @rd_pg_rn
 SQABS           01000100 .. 001 000 101 ... ..... .....  @rd_pg_rn
 SQNEG           01000100 .. 001 001 101 ... ..... .....  @rd_pg_rn
+
+### SVE2 saturating/rounding bitwise shift left (predicated)
+
+SRSHL           01000100 .. 000 010 100 ... ..... .....  @rdn_pg_rm
+URSHL           01000100 .. 000 011 100 ... ..... .....  @rdn_pg_rm
+SRSHL           01000100 .. 000 110 100 ... ..... .....  @rdm_pg_rn # SRSHLR
+URSHL           01000100 .. 000 111 100 ... ..... .....  @rdm_pg_rn # URSHLR
+
+SQSHL           01000100 .. 001 000 100 ... ..... .....  @rdn_pg_rm
+UQSHL           01000100 .. 001 001 100 ... ..... .....  @rdn_pg_rm
+SQSHL           01000100 .. 001 100 100 ... ..... .....  @rdm_pg_rn # SQSHLR
+UQSHL           01000100 .. 001 101 100 ... ..... .....  @rdm_pg_rn # UQSHLR
+
+SQRSHL          01000100 .. 001 010 100 ... ..... .....  @rdn_pg_rm
+UQRSHL          01000100 .. 001 011 100 ... ..... .....  @rdn_pg_rm
+SQRSHL          01000100 .. 001 110 100 ... ..... .....  @rdm_pg_rn # SQRSHLR
+UQRSHL          01000100 .. 001 111 100 ... ..... .....  @rdm_pg_rn # UQRSHLR
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index 16606331fc..a7e9b8d341 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -26,6 +26,7 @@
 #include "tcg/tcg-gvec-desc.h"
 #include "fpu/softfloat.h"
 #include "tcg/tcg.h"
+#include "vec_internal.h"
 
 
 /* Note that vector data is stored in host-endian 64-bit chunks,
@@ -561,6 +562,83 @@ DO_ZPZZ(sve2_uadalp_zpzz_h, uint16_t, H1_2, do_uadalp_h)
 DO_ZPZZ(sve2_uadalp_zpzz_s, uint32_t, H1_4, do_uadalp_s)
 DO_ZPZZ_D(sve2_uadalp_zpzz_d, uint64_t, do_uadalp_d)
 
+#define do_srshl_b(n, m) do_sqrshl_bhs(n, m, 8, true, NULL)
+#define do_srshl_h(n, m) do_sqrshl_bhs(n, m, 16, true, NULL)
+#define do_srshl_s(n, m) do_sqrshl_bhs(n, m, 32, true, NULL)
+#define do_srshl_d(n, m) do_sqrshl_d(n, m, true, NULL)
+
+DO_ZPZZ(sve2_srshl_zpzz_b, int8_t, H1_2, do_srshl_b)
+DO_ZPZZ(sve2_srshl_zpzz_h, int16_t, H1_2, do_srshl_h)
+DO_ZPZZ(sve2_srshl_zpzz_s, int32_t, H1_4, do_srshl_s)
+DO_ZPZZ_D(sve2_srshl_zpzz_d, int64_t, do_srshl_d)
+
+#define do_urshl_b(n, m) do_uqrshl_bhs(n, m, 8, true, NULL)
+#define do_urshl_h(n, m) do_uqrshl_bhs(n, m, 16, true, NULL)
+#define do_urshl_s(n, m) do_uqrshl_bhs(n, m, 32, true, NULL)
+#define do_urshl_d(n, m) do_uqrshl_d(n, m, true, NULL)
+
+DO_ZPZZ(sve2_urshl_zpzz_b, uint8_t, H1_2, do_urshl_b)
+DO_ZPZZ(sve2_urshl_zpzz_h, uint16_t, H1_2, do_urshl_h)
+DO_ZPZZ(sve2_urshl_zpzz_s, uint32_t, H1_4, do_urshl_s)
+DO_ZPZZ_D(sve2_urshl_zpzz_d, uint64_t, do_urshl_d)
+
+/* Unlike the NEON and AdvSIMD versions, there is no QC bit to set. */
+#define do_sqshl_b(n, m) \
+   ({ uint32_t discard; do_sqrshl_bhs(n, m, 8, false, &discard); })
+#define do_sqshl_h(n, m) \
+   ({ uint32_t discard; do_sqrshl_bhs(n, m, 16, false, &discard); })
+#define do_sqshl_s(n, m) \
+   ({ uint32_t discard; do_sqrshl_bhs(n, m, 32, false, &discard); })
+#define do_sqshl_d(n, m) \
+   ({ uint32_t discard; do_sqrshl_d(n, m, false, &discard); })
+
+DO_ZPZZ(sve2_sqshl_zpzz_b, int8_t, H1_2, do_sqshl_b)
+DO_ZPZZ(sve2_sqshl_zpzz_h, int16_t, H1_2, do_sqshl_h)
+DO_ZPZZ(sve2_sqshl_zpzz_s, int32_t, H1_4, do_sqshl_s)
+DO_ZPZZ_D(sve2_sqshl_zpzz_d, int64_t, do_sqshl_d)
+
+#define do_uqshl_b(n, m) \
+   ({ uint32_t discard; do_uqrshl_bhs(n, m, 8, false, &discard); })
+#define do_uqshl_h(n, m) \
+   ({ uint32_t discard; do_uqrshl_bhs(n, m, 16, false, &discard); })
+#define do_uqshl_s(n, m) \
+   ({ uint32_t discard; do_uqrshl_bhs(n, m, 32, false, &discard); })
+#define do_uqshl_d(n, m) \
+   ({ uint32_t discard; do_uqrshl_d(n, m, false, &discard); })
+
+DO_ZPZZ(sve2_uqshl_zpzz_b, uint8_t, H1_2, do_uqshl_b)
+DO_ZPZZ(sve2_uqshl_zpzz_h, uint16_t, H1_2, do_uqshl_h)
+DO_ZPZZ(sve2_uqshl_zpzz_s, uint32_t, H1_4, do_uqshl_s)
+DO_ZPZZ_D(sve2_uqshl_zpzz_d, uint64_t, do_uqshl_d)
+
+#define do_sqrshl_b(n, m) \
+   ({ uint32_t discard; do_sqrshl_bhs(n, m, 8, true, &discard); })
+#define do_sqrshl_h(n, m) \
+   ({ uint32_t discard; do_sqrshl_bhs(n, m, 16, true, &discard); })
+#define do_sqrshl_s(n, m) \
+   ({ uint32_t discard; do_sqrshl_bhs(n, m, 32, true, &discard); })
+#define do_sqrshl_d(n, m) \
+   ({ uint32_t discard; do_sqrshl_d(n, m, true, &discard); })
+
+DO_ZPZZ(sve2_sqrshl_zpzz_b, int8_t, H1_2, do_sqrshl_b)
+DO_ZPZZ(sve2_sqrshl_zpzz_h, int16_t, H1_2, do_sqrshl_h)
+DO_ZPZZ(sve2_sqrshl_zpzz_s, int32_t, H1_4, do_sqrshl_s)
+DO_ZPZZ_D(sve2_sqrshl_zpzz_d, int64_t, do_sqrshl_d)
+
+#define do_uqrshl_b(n, m) \
+   ({ uint32_t discard; do_uqrshl_bhs(n, m, 8, true, &discard); })
+#define do_uqrshl_h(n, m) \
+   ({ uint32_t discard; do_uqrshl_bhs(n, m, 16, true, &discard); })
+#define do_uqrshl_s(n, m) \ + ({ uint32_t discard; do_uqrshl_bhs(n, m, 32, true, &discard); }) +#define do_uqrshl_d(n, m) \ + ({ uint32_t discard; do_uqrshl_d(n, m, true, &discard); }) + +DO_ZPZZ(sve2_uqrshl_zpzz_b, uint8_t, H1_2, do_uqrshl_b) +DO_ZPZZ(sve2_uqrshl_zpzz_h, uint16_t, H1_2, do_uqrshl_h) +DO_ZPZZ(sve2_uqrshl_zpzz_s, uint32_t, H1_4, do_uqrshl_s) +DO_ZPZZ_D(sve2_uqrshl_zpzz_d, uint64_t, do_uqrshl_d) + #undef DO_ZPZZ #undef DO_ZPZZ_D diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 938ec08673..45a72b1750 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -5966,3 +5966,21 @@ static bool trans_SQNEG(DisasContext *s, arg_rpr_esz *a) }; return do_sve2_zpz_ool(s, a, fns[a->esz]); } + +#define DO_SVE2_ZPZZ(NAME, name) \ +static bool trans_##NAME(DisasContext *s, arg_rprr_esz *a) \ +{ \ + static gen_helper_gvec_4 * const fns[4] = { \ + gen_helper_sve2_##name##_zpzz_b, gen_helper_sve2_##name##_zpzz_h, \ + gen_helper_sve2_##name##_zpzz_s, gen_helper_sve2_##name##_zpzz_d, \ + }; \ + return do_sve2_zpzz_ool(s, a, fns[a->esz]); \ +} + +DO_SVE2_ZPZZ(SQSHL, sqshl) +DO_SVE2_ZPZZ(SQRSHL, sqrshl) +DO_SVE2_ZPZZ(SRSHL, srshl) + +DO_SVE2_ZPZZ(UQSHL, uqshl) +DO_SVE2_ZPZZ(UQRSHL, uqrshl) +DO_SVE2_ZPZZ(URSHL, urshl) From patchwork Thu Mar 26 23:08:15 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 184946 Delivered-To: patch@linaro.org Received: by 2002:a92:de47:0:0:0:0:0 with SMTP id e7csp641840ilr; Thu, 26 Mar 2020 16:14:30 -0700 (PDT) X-Google-Smtp-Source: ADFU+vtAxCQl7edhoMHRLfL55DLwkeV22laXQkXLS8LEwWhusAN/ft2mN21be/YhoSjO/3yJ/6zW X-Received: by 2002:a37:884:: with SMTP id 126mr10882880qki.72.1585264469908; Thu, 26 Mar 2020 16:14:29 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1585264469; cv=none; d=google.com; s=arc-20160816; b=JSBS67lrRwaodJuC1SQ9lme1yOKoav0Y9fRl2S8S+s/YVz5vy/8LCnYWlU7hRr0vZx 
j4rfGhrlazGow+WgXiqapCI4kCi161c+9InQlBpERzBC4qD7skZRvjWUPPLc2BrNhUYG v72hnnH5uTTVWd8GzEI+ObKsBCeOqESRs8jn2zQ44/wxB3z5BkKcFuqusf7Kj5g9ElDS uwwC6wo8NWADVugTflvOH5e6Y3cFZJq0XA0FcLTLFYkaZyqXEY1W06hxxGNoJqog3Aor oEFR4AyLZBUBYTz9MkhNqoK+2njQ3twMM5cO9jrQIciFjkCGOtm35c/j6VhGAK5bD2YF Vjyg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:to:from :dkim-signature; bh=80IQPaqK9ZEvLA4BC4hOR/X+CTW0LTpH0JJGF9vMSPw=; b=m7+7bKReQNSKSNfEvlv/HBK5RCQw2mZW1vs2MTyftyH2UaL74L/RsX8q07+XEkSu4U 0GsfKDUgvN7pK0weCjJycz27nt9hCnwpyexWwaKFA2OYbuFemoS2/biyt1FX+3uPt48N XVh6+TbUH5b13qFaAdpzfR3ArIhSG3CeTnA1pAiOpBiwpVi4thHWCoBnXJ3iYhwdZWNL yZTgYBV69DE7+MKEhsaA5S2/3FHnsr+qONzp3MkieRcBTXFV3aaI26g0+z+yK7AtTUSr KgFx25yw4uQspFKYWXVFdJ7Gss9tqScVp6R7+ydxi0dYXvcgkx1AesImJkEgY6PAGXOO ZmBQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=dmdeoTnu; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. 
[209.51.188.17]) by mx.google.com with ESMTPS id r1si2370276qvv.214.2020.03.26.16.14.29 for (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Thu, 26 Mar 2020 16:14:29 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=dmdeoTnu; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:34602 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jHbhl-0002Fu-8B for patch@linaro.org; Thu, 26 Mar 2020 19:14:29 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:58052) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jHbcL-00026H-Hx for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:08:55 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1jHbcJ-0001L6-Vz for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:08:53 -0400 Received: from mail-pj1-x1036.google.com ([2607:f8b0:4864:20::1036]:54173) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1jHbcJ-0001KK-Q3 for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:08:51 -0400 Received: by mail-pj1-x1036.google.com with SMTP id l36so3088418pjb.3 for ; Thu, 26 Mar 2020 16:08:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=80IQPaqK9ZEvLA4BC4hOR/X+CTW0LTpH0JJGF9vMSPw=; b=dmdeoTnu06QZ1pe3dwlLNBHPKVQ4gwOYsa7GaCugqIBnU72azJBDCtcCD5CR6MnmOC 9CTr39b5EiCvsuYHsq1ibd/eVoj2ZRiStTvUlIrVIt0OW16rMcR5q5ka0qs5yukYThL2 
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH 08/31] target/arm: Implement SVE2 integer halving add/subtract (predicated)
Date: Thu, 26 Mar 2020 16:08:15 -0700
Message-Id: <20200326230838.31112-9-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20200326230838.31112-1-richard.henderson@linaro.org>
References: <20200326230838.31112-1-richard.henderson@linaro.org>
Cc: rajav@quicinc.com, qemu-arm@nongnu.org, apazos@quicinc.com

Signed-off-by: Richard Henderson
---
 target/arm/helper-sve.h    | 54 ++++++++++++++++++++++++++++++++++++++
 target/arm/sve.decode      | 11 ++++++++
 target/arm/sve_helper.c    | 39 +++++++++++++++++++++++++++
 target/arm/translate-sve.c |  8 ++++++
 4 files changed, 112 insertions(+)

--
2.20.1

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index 0eecf33249..149fff1fae 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -226,6 +226,60 @@ DEF_HELPER_FLAGS_5(sve2_uqrshl_zpzz_s, TCG_CALL_NO_RWG,
 DEF_HELPER_FLAGS_5(sve2_uqrshl_zpzz_d, TCG_CALL_NO_RWG,
                    void, ptr, ptr, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_5(sve2_shadd_zpzz_b, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve2_shadd_zpzz_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve2_shadd_zpzz_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve2_shadd_zpzz_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(sve2_uhadd_zpzz_b, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve2_uhadd_zpzz_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve2_uhadd_zpzz_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve2_uhadd_zpzz_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(sve2_srhadd_zpzz_b, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve2_srhadd_zpzz_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve2_srhadd_zpzz_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve2_srhadd_zpzz_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(sve2_urhadd_zpzz_b, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve2_urhadd_zpzz_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve2_urhadd_zpzz_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve2_urhadd_zpzz_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(sve2_shsub_zpzz_b, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve2_shsub_zpzz_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve2_shsub_zpzz_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve2_shsub_zpzz_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(sve2_uhsub_zpzz_b, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve2_uhsub_zpzz_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve2_uhsub_zpzz_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve2_uhsub_zpzz_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+
 DEF_HELPER_FLAGS_5(sve_sdiv_zpzz_s, TCG_CALL_NO_RWG,
                    void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(sve_sdiv_zpzz_d, TCG_CALL_NO_RWG,
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index f0b6692e43..54076bb607 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -1131,3 +1131,14 @@ SQRSHL          01000100 .. 001 010 100 ... ..... .....  @rdn_pg_rm
 UQRSHL          01000100 .. 001 011 100 ... ..... .....  @rdn_pg_rm
 SQRSHL          01000100 .. 001 110 100 ... ..... .....  @rdm_pg_rn # SQRSHLR
 UQRSHL          01000100 .. 001 111 100 ... ..... .....  @rdm_pg_rn # UQRSHLR
+
+### SVE2 integer halving add/subtract (predicated)
+
+SHADD           01000100 .. 010 000 100 ... ..... .....  @rdn_pg_rm
+UHADD           01000100 .. 010 001 100 ... ..... .....  @rdn_pg_rm
+SHSUB           01000100 .. 010 010 100 ... ..... .....  @rdn_pg_rm
+UHSUB           01000100 .. 010 011 100 ... ..... .....  @rdn_pg_rm
+SRHADD          01000100 .. 010 100 100 ... ..... .....  @rdn_pg_rm
+URHADD          01000100 .. 010 101 100 ... ..... .....  @rdn_pg_rm
+SHSUB           01000100 .. 010 110 100 ... ..... .....  @rdm_pg_rn # SHSUBR
+UHSUB           01000100 .. 010 111 100 ... ..... .....  @rdm_pg_rn # UHSUBR
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index a7e9b8d341..5d75aed7b7 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -639,6 +639,45 @@ DO_ZPZZ(sve2_uqrshl_zpzz_h, uint16_t, H1_2, do_uqrshl_h)
 DO_ZPZZ(sve2_uqrshl_zpzz_s, uint32_t, H1_4, do_uqrshl_s)
 DO_ZPZZ_D(sve2_uqrshl_zpzz_d, uint64_t, do_uqrshl_d)
 
+#define DO_HADD_BHS(n, m)  (((int64_t)n + m) >> 1)
+#define DO_HADD_D(n, m)    ((n >> 1) + (m >> 1) + (n & m & 1))
+
+DO_ZPZZ(sve2_shadd_zpzz_b, int8_t, H1_2, DO_HADD_BHS)
+DO_ZPZZ(sve2_shadd_zpzz_h, int16_t, H1_2, DO_HADD_BHS)
+DO_ZPZZ(sve2_shadd_zpzz_s, int32_t, H1_4, DO_HADD_BHS)
+DO_ZPZZ_D(sve2_shadd_zpzz_d, int64_t, DO_HADD_D)
+
+DO_ZPZZ(sve2_uhadd_zpzz_b, uint8_t, H1_2, DO_HADD_BHS)
+DO_ZPZZ(sve2_uhadd_zpzz_h, uint16_t, H1_2, DO_HADD_BHS)
+DO_ZPZZ(sve2_uhadd_zpzz_s, uint32_t, H1_4, DO_HADD_BHS)
+DO_ZPZZ_D(sve2_uhadd_zpzz_d, uint64_t, DO_HADD_D)
+
+#define DO_RHADD_BHS(n, m)  (((int64_t)n + m + 1) >> 1)
+#define DO_RHADD_D(n, m)    ((n >> 1) + (m >> 1) + ((n | m) & 1))
+
+DO_ZPZZ(sve2_srhadd_zpzz_b, int8_t, H1_2, DO_RHADD_BHS)
+DO_ZPZZ(sve2_srhadd_zpzz_h, int16_t, H1_2, DO_RHADD_BHS)
+DO_ZPZZ(sve2_srhadd_zpzz_s, int32_t, H1_4, DO_RHADD_BHS)
+DO_ZPZZ_D(sve2_srhadd_zpzz_d, int64_t, DO_RHADD_D)
+
+DO_ZPZZ(sve2_urhadd_zpzz_b, uint8_t, H1_2, DO_RHADD_BHS)
+DO_ZPZZ(sve2_urhadd_zpzz_h, uint16_t, H1_2, DO_RHADD_BHS)
+DO_ZPZZ(sve2_urhadd_zpzz_s, uint32_t, H1_4, DO_RHADD_BHS)
+DO_ZPZZ_D(sve2_urhadd_zpzz_d, uint64_t, DO_RHADD_D)
+
+#define DO_HSUB_BHS(n, m)  (((int64_t)n - m) >> 1)
+#define DO_HSUB_D(n, m)    ((n >> 1) - (m >> 1) - (~n & m & 1))
+
+DO_ZPZZ(sve2_shsub_zpzz_b, int8_t, H1_2, DO_HSUB_BHS)
+DO_ZPZZ(sve2_shsub_zpzz_h, int16_t, H1_2, DO_HSUB_BHS)
+DO_ZPZZ(sve2_shsub_zpzz_s, int32_t, H1_4, DO_HSUB_BHS)
+DO_ZPZZ_D(sve2_shsub_zpzz_d, int64_t, DO_HSUB_D)
+
+DO_ZPZZ(sve2_uhsub_zpzz_b, uint8_t, H1_2, DO_HSUB_BHS)
+DO_ZPZZ(sve2_uhsub_zpzz_h, uint16_t, H1_2, DO_HSUB_BHS)
+DO_ZPZZ(sve2_uhsub_zpzz_s, uint32_t, H1_4, DO_HSUB_BHS)
+DO_ZPZZ_D(sve2_uhsub_zpzz_d, uint64_t, DO_HSUB_D)
+
 #undef DO_ZPZZ
 #undef DO_ZPZZ_D
 
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index 45a72b1750..7d619d7ad4 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -5984,3 +5984,11 @@ DO_SVE2_ZPZZ(SRSHL, srshl)
 DO_SVE2_ZPZZ(UQSHL, uqshl)
 DO_SVE2_ZPZZ(UQRSHL, uqrshl)
 DO_SVE2_ZPZZ(URSHL, urshl)
+
+DO_SVE2_ZPZZ(SHADD, shadd)
+DO_SVE2_ZPZZ(SRHADD, srhadd)
+DO_SVE2_ZPZZ(SHSUB, shsub)
+
+DO_SVE2_ZPZZ(UHADD, uhadd)
+DO_SVE2_ZPZZ(URHADD, urhadd)
+DO_SVE2_ZPZZ(UHSUB, uhsub)

From patchwork Thu Mar 26 23:08:16 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 184950
Delivered-To: patch@linaro.org
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH 09/31] target/arm: Implement SVE2 integer pairwise arithmetic
Date: Thu, 26 Mar 2020 16:08:16 -0700
Message-Id: <20200326230838.31112-10-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20200326230838.31112-1-richard.henderson@linaro.org>
References: <20200326230838.31112-1-richard.henderson@linaro.org>
Cc: rajav@quicinc.com, qemu-arm@nongnu.org, apazos@quicinc.com

Signed-off-by: Richard Henderson
---
 target/arm/helper-sve.h    | 45 +++++++++++++++++++++++++
 target/arm/sve.decode      |  8 +++++
 target/arm/sve_helper.c    | 67 ++++++++++++++++++++++++++++++++++++++
 target/arm/translate-sve.c |  6 ++++
 4 files changed, 126 insertions(+)

--
2.20.1

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index 149fff1fae..028c3b85a8 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -326,6 +326,51 @@ DEF_HELPER_FLAGS_5(sve_sel_zpzz_s, TCG_CALL_NO_RWG,
 DEF_HELPER_FLAGS_5(sve_sel_zpzz_d, TCG_CALL_NO_RWG,
                    void, ptr, ptr, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_5(sve2_addp_zpzz_b, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve2_addp_zpzz_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve2_addp_zpzz_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve2_addp_zpzz_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(sve2_smaxp_zpzz_b, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve2_smaxp_zpzz_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve2_smaxp_zpzz_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve2_smaxp_zpzz_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(sve2_umaxp_zpzz_b, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve2_umaxp_zpzz_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve2_umaxp_zpzz_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve2_umaxp_zpzz_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(sve2_sminp_zpzz_b, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve2_sminp_zpzz_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve2_sminp_zpzz_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve2_sminp_zpzz_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(sve2_uminp_zpzz_b, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve2_uminp_zpzz_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve2_uminp_zpzz_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve2_uminp_zpzz_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+
 DEF_HELPER_FLAGS_5(sve_asr_zpzw_b, TCG_CALL_NO_RWG,
                    void, ptr, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_5(sve_asr_zpzw_h, TCG_CALL_NO_RWG,
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index 54076bb607..86a6bf7088 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -1142,3 +1142,11 @@ SRHADD          01000100 .. 010 100 100 ... ..... .....  @rdn_pg_rm
 URHADD          01000100 .. 010 101 100 ... ..... .....  @rdn_pg_rm
 SHSUB           01000100 .. 010 110 100 ... ..... .....  @rdm_pg_rn # SHSUBR
 UHSUB           01000100 .. 010 111 100 ... ..... .....  @rdm_pg_rn # UHSUBR
+
+### SVE2 integer pairwise arithmetic
+
+ADDP            01000100 .. 010 001 101 ... ..... .....  @rdn_pg_rm
+SMAXP           01000100 .. 010 100 101 ... ..... .....  @rdn_pg_rm
+UMAXP           01000100 .. 010 101 101 ... ..... .....  @rdn_pg_rm
+SMINP           01000100 .. 010 110 101 ... ..... .....  @rdn_pg_rm
+UMINP           01000100 .. 010 111 101 ... ..... .....  @rdn_pg_rm
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index 5d75aed7b7..d7c181ddb8 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -681,6 +681,73 @@ DO_ZPZZ_D(sve2_uhsub_zpzz_d, uint64_t, DO_HSUB_D)
 #undef DO_ZPZZ
 #undef DO_ZPZZ_D
 
+/*
+ * Three operand expander, operating on element pairs.
+ * If the slot I is even, the elements come from VN {I, I+1}.
+ * If the slot I is odd, the elements come from VM {I-1, I}.
+ */
+#define DO_ZPZZ_PAIR(NAME, TYPE, H, OP) \
+void HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, uint32_t desc) \
+{ \
+    intptr_t i, opr_sz = simd_oprsz(desc); \
+    for (i = 0; i < opr_sz; ) { \
+        uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \
+        do { \
+            if (pg & 1) { \
+                void *p = (i & 1 ? vm : vn); \
+                TYPE nn = *(TYPE *)(p + H(i & ~1)); \
+                TYPE mm = *(TYPE *)(p + H(i | 1)); \
+                *(TYPE *)(vd + H(i)) = OP(nn, mm); \
+            } \
+            i += sizeof(TYPE), pg >>= sizeof(TYPE); \
+        } while (i & 15); \
+    } \
+}
+
+/* Similarly, specialized for 64-bit operands. */
+#define DO_ZPZZ_PAIR_D(NAME, TYPE, OP) \
+void HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, uint32_t desc) \
+{ \
+    intptr_t i, opr_sz = simd_oprsz(desc) / 8; \
+    TYPE *d = vd, *n = vn, *m = vm; \
+    uint8_t *pg = vg; \
+    for (i = 0; i < opr_sz; i += 1) { \
+        if (pg[H1(i)] & 1) { \
+            TYPE *p = (i & 1 ? m : n) + (i & ~1); \
+            TYPE nn = p[0], mm = p[1]; \
+            d[i] = OP(nn, mm); \
+        } \
+    } \
+}
+
+DO_ZPZZ_PAIR(sve2_addp_zpzz_b, uint8_t, H1_2, DO_ADD)
+DO_ZPZZ_PAIR(sve2_addp_zpzz_h, uint16_t, H1_2, DO_ADD)
+DO_ZPZZ_PAIR(sve2_addp_zpzz_s, uint32_t, H1_4, DO_ADD)
+DO_ZPZZ_PAIR_D(sve2_addp_zpzz_d, uint64_t, DO_ADD)
+
+DO_ZPZZ_PAIR(sve2_umaxp_zpzz_b, uint8_t, H1_2, DO_MAX)
+DO_ZPZZ_PAIR(sve2_umaxp_zpzz_h, uint16_t, H1_2, DO_MAX)
+DO_ZPZZ_PAIR(sve2_umaxp_zpzz_s, uint32_t, H1_4, DO_MAX)
+DO_ZPZZ_PAIR_D(sve2_umaxp_zpzz_d, uint64_t, DO_MAX)
+
+DO_ZPZZ_PAIR(sve2_uminp_zpzz_b, uint8_t, H1_2, DO_MIN)
+DO_ZPZZ_PAIR(sve2_uminp_zpzz_h, uint16_t, H1_2, DO_MIN)
+DO_ZPZZ_PAIR(sve2_uminp_zpzz_s, uint32_t, H1_4, DO_MIN)
+DO_ZPZZ_PAIR_D(sve2_uminp_zpzz_d, uint64_t, DO_MIN)
+
+DO_ZPZZ_PAIR(sve2_smaxp_zpzz_b, int8_t, H1_2, DO_MAX)
+DO_ZPZZ_PAIR(sve2_smaxp_zpzz_h, int16_t, H1_2, DO_MAX)
+DO_ZPZZ_PAIR(sve2_smaxp_zpzz_s, int32_t, H1_4, DO_MAX)
+DO_ZPZZ_PAIR_D(sve2_smaxp_zpzz_d, int64_t, DO_MAX)
+
+DO_ZPZZ_PAIR(sve2_sminp_zpzz_b, int8_t, H1_2, DO_MIN)
+DO_ZPZZ_PAIR(sve2_sminp_zpzz_h, int16_t, H1_2, DO_MIN)
+DO_ZPZZ_PAIR(sve2_sminp_zpzz_s, int32_t, H1_4, DO_MIN)
+DO_ZPZZ_PAIR_D(sve2_sminp_zpzz_d, int64_t, DO_MIN)
+
+#undef DO_ZPZZ_PAIR
+#undef DO_ZPZZ_PAIR_D
+
 /* Three-operand expander, controlled by a predicate, in which the
  * third operand is "wide". That is, for D = N op M, the same 64-bit
  * value of M is used with all of the narrower values of N.
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index 7d619d7ad4..5f137c0e92 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -5992,3 +5992,9 @@ DO_SVE2_ZPZZ(SHSUB, shsub)
 DO_SVE2_ZPZZ(UHADD, uhadd)
 DO_SVE2_ZPZZ(URHADD, urhadd)
 DO_SVE2_ZPZZ(UHSUB, uhsub)
+
+DO_SVE2_ZPZZ(ADDP, addp)
+DO_SVE2_ZPZZ(SMAXP, smaxp)
+DO_SVE2_ZPZZ(UMAXP, umaxp)
+DO_SVE2_ZPZZ(SMINP, sminp)
+DO_SVE2_ZPZZ(UMINP, uminp)

From patchwork Thu Mar 26 23:08:17 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 184953
Delivered-To: patch@linaro.org
[174.21.138.234]) by smtp.gmail.com with ESMTPSA id i187sm2530037pfg.33.2020.03.26.16.08.51 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 26 Mar 2020 16:08:52 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH 10/31] target/arm: Implement SVE2 saturating add/subtract (predicated) Date: Thu, 26 Mar 2020 16:08:17 -0700 Message-Id: <20200326230838.31112-11-richard.henderson@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200326230838.31112-1-richard.henderson@linaro.org> References: <20200326230838.31112-1-richard.henderson@linaro.org> MIME-Version: 1.0 X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::1034 X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: rajav@quicinc.com, qemu-arm@nongnu.org, apazos@quicinc.com Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 54 +++++++++++ target/arm/sve.decode | 11 +++ target/arm/sve_helper.c | 182 +++++++++++++++++++++++++------------ target/arm/translate-sve.c | 7 ++ 4 files changed, 198 insertions(+), 56 deletions(-) -- 2.20.1 diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 028c3b85a8..368185944a 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -371,6 +371,60 @@ DEF_HELPER_FLAGS_5(sve2_uminp_zpzz_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(sve2_uminp_zpzz_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqadd_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqadd_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqadd_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqadd_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + 
+DEF_HELPER_FLAGS_5(sve2_uqadd_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uqadd_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uqadd_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uqadd_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_sqsub_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqsub_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqsub_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqsub_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_uqsub_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uqsub_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uqsub_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uqsub_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_suqadd_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_suqadd_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_suqadd_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_suqadd_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_usqadd_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_usqadd_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_usqadd_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_usqadd_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_asr_zpzw_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_asr_zpzw_h, TCG_CALL_NO_RWG, diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 
86a6bf7088..86aee38668 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1150,3 +1150,14 @@ SMAXP 01000100 .. 010 100 101 ... ..... ..... @rdn_pg_rm UMAXP 01000100 .. 010 101 101 ... ..... ..... @rdn_pg_rm SMINP 01000100 .. 010 110 101 ... ..... ..... @rdn_pg_rm UMINP 01000100 .. 010 111 101 ... ..... ..... @rdn_pg_rm + +### SVE2 saturating add/subtract (predicated) + +SQADD_zpzz 01000100 .. 011 000 100 ... ..... ..... @rdn_pg_rm +UQADD_zpzz 01000100 .. 011 001 100 ... ..... ..... @rdn_pg_rm +SQSUB_zpzz 01000100 .. 011 010 100 ... ..... ..... @rdn_pg_rm +UQSUB_zpzz 01000100 .. 011 011 100 ... ..... ..... @rdn_pg_rm +SUQADD 01000100 .. 011 100 100 ... ..... ..... @rdn_pg_rm +USQADD 01000100 .. 011 101 100 ... ..... ..... @rdn_pg_rm +SQSUB_zpzz 01000100 .. 011 110 100 ... ..... ..... @rdm_pg_rn # SQSUBR +UQSUB_zpzz 01000100 .. 011 111 100 ... ..... ..... @rdm_pg_rn # UQSUBR diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index d7c181ddb8..bee00eaa44 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -678,6 +678,123 @@ DO_ZPZZ(sve2_uhsub_zpzz_h, uint16_t, H1_2, DO_HSUB_BHS) DO_ZPZZ(sve2_uhsub_zpzz_s, uint32_t, H1_4, DO_HSUB_BHS) DO_ZPZZ_D(sve2_uhsub_zpzz_d, uint64_t, DO_HSUB_D) +static inline int32_t do_sat_bhs(int64_t val, int64_t min, int64_t max) +{ + return val >= max ? max : val <= min ? min : val; +} + +#define DO_SQADD_B(n, m) do_sat_bhs((int64_t)n + m, INT8_MIN, INT8_MAX) +#define DO_SQADD_H(n, m) do_sat_bhs((int64_t)n + m, INT16_MIN, INT16_MAX) +#define DO_SQADD_S(n, m) do_sat_bhs((int64_t)n + m, INT32_MIN, INT32_MAX) + +static inline int64_t do_sqadd_d(int64_t n, int64_t m) +{ + int64_t r = n + m; + if (((r ^ n) & ~(n ^ m)) < 0) { + /* Signed overflow. */ + return r < 0 ? 
INT64_MAX : INT64_MIN; + } + return r; +} + +DO_ZPZZ(sve2_sqadd_zpzz_b, int8_t, H1_2, DO_SQADD_B) +DO_ZPZZ(sve2_sqadd_zpzz_h, int16_t, H1_2, DO_SQADD_H) +DO_ZPZZ(sve2_sqadd_zpzz_s, int32_t, H1_4, DO_SQADD_S) +DO_ZPZZ_D(sve2_sqadd_zpzz_d, int64_t, do_sqadd_d) + +#define DO_UQADD_B(n, m) do_sat_bhs((int64_t)n + m, 0, UINT8_MAX) +#define DO_UQADD_H(n, m) do_sat_bhs((int64_t)n + m, 0, UINT16_MAX) +#define DO_UQADD_S(n, m) do_sat_bhs((int64_t)n + m, 0, UINT32_MAX) + +static inline uint64_t do_uqadd_d(uint64_t n, uint64_t m) +{ + uint64_t r = n + m; + return r < n ? UINT64_MAX : r; +} + +DO_ZPZZ(sve2_uqadd_zpzz_b, uint8_t, H1_2, DO_UQADD_B) +DO_ZPZZ(sve2_uqadd_zpzz_h, uint16_t, H1_2, DO_UQADD_H) +DO_ZPZZ(sve2_uqadd_zpzz_s, uint32_t, H1_4, DO_UQADD_S) +DO_ZPZZ_D(sve2_uqadd_zpzz_d, uint64_t, do_uqadd_d) + +#define DO_SQSUB_B(n, m) do_sat_bhs((int64_t)n - m, INT8_MIN, INT8_MAX) +#define DO_SQSUB_H(n, m) do_sat_bhs((int64_t)n - m, INT16_MIN, INT16_MAX) +#define DO_SQSUB_S(n, m) do_sat_bhs((int64_t)n - m, INT32_MIN, INT32_MAX) + +static inline int64_t do_sqsub_d(int64_t n, int64_t m) +{ + int64_t r = n - m; + if (((r ^ n) & (n ^ m)) < 0) { + /* Signed overflow. */ + return r < 0 ? INT64_MAX : INT64_MIN; + } + return r; +} + +DO_ZPZZ(sve2_sqsub_zpzz_b, int8_t, H1_2, DO_SQSUB_B) +DO_ZPZZ(sve2_sqsub_zpzz_h, int16_t, H1_2, DO_SQSUB_H) +DO_ZPZZ(sve2_sqsub_zpzz_s, int32_t, H1_4, DO_SQSUB_S) +DO_ZPZZ_D(sve2_sqsub_zpzz_d, int64_t, do_sqsub_d) + +#define DO_UQSUB_B(n, m) do_sat_bhs((int64_t)n - m, 0, UINT8_MAX) +#define DO_UQSUB_H(n, m) do_sat_bhs((int64_t)n - m, 0, UINT16_MAX) +#define DO_UQSUB_S(n, m) do_sat_bhs((int64_t)n - m, 0, UINT32_MAX) + +static inline uint64_t do_uqsub_d(uint64_t n, uint64_t m) +{ + return n > m ? 
n - m : 0; +} + +DO_ZPZZ(sve2_uqsub_zpzz_b, uint8_t, H1_2, DO_UQSUB_B) +DO_ZPZZ(sve2_uqsub_zpzz_h, uint16_t, H1_2, DO_UQSUB_H) +DO_ZPZZ(sve2_uqsub_zpzz_s, uint32_t, H1_4, DO_UQSUB_S) +DO_ZPZZ_D(sve2_uqsub_zpzz_d, uint64_t, do_uqsub_d) + +#define DO_SUQADD_B(n, m) \ + do_sat_bhs((int64_t)(int8_t)n + m, INT8_MIN, INT8_MAX) +#define DO_SUQADD_H(n, m) \ + do_sat_bhs((int64_t)(int16_t)n + m, INT16_MIN, INT16_MAX) +#define DO_SUQADD_S(n, m) \ + do_sat_bhs((int64_t)(int32_t)n + m, INT32_MIN, INT32_MAX) + +static inline int64_t do_suqadd_d(int64_t n, uint64_t m) +{ + uint64_t r = n + m; + /* With n < 0, r overflows only if m > abs(n) and r > INT64_MAX. */ + if (n >= 0 ? (r < m || r > INT64_MAX) + : (r > INT64_MAX && m > -(uint64_t)n)) { + return INT64_MAX; + } + return r; +} + +DO_ZPZZ(sve2_suqadd_zpzz_b, uint8_t, H1_2, DO_SUQADD_B) +DO_ZPZZ(sve2_suqadd_zpzz_h, uint16_t, H1_2, DO_SUQADD_H) +DO_ZPZZ(sve2_suqadd_zpzz_s, uint32_t, H1_4, DO_SUQADD_S) +DO_ZPZZ_D(sve2_suqadd_zpzz_d, uint64_t, do_suqadd_d) + +#define DO_USQADD_B(n, m) \ + do_sat_bhs((int64_t)n + (int8_t)m, 0, UINT8_MAX) +#define DO_USQADD_H(n, m) \ + do_sat_bhs((int64_t)n + (int16_t)m, 0, UINT16_MAX) +#define DO_USQADD_S(n, m) \ + do_sat_bhs((int64_t)n + (int32_t)m, 0, UINT32_MAX) + +static inline uint64_t do_usqadd_d(uint64_t n, int64_t m) +{ + uint64_t r = n + m; + + if (m < 0) { + return n < -m ? 0 : r; + } + return r < n ? 
UINT64_MAX : r; +} + +DO_ZPZZ(sve2_usqadd_zpzz_b, uint8_t, H1_2, DO_USQADD_B) +DO_ZPZZ(sve2_usqadd_zpzz_h, uint16_t, H1_2, DO_USQADD_H) +DO_ZPZZ(sve2_usqadd_zpzz_s, uint32_t, H1_4, DO_USQADD_S) +DO_ZPZZ_D(sve2_usqadd_zpzz_d, uint64_t, do_usqadd_d) + #undef DO_ZPZZ #undef DO_ZPZZ_D @@ -1640,13 +1757,7 @@ void HELPER(sve_sqaddi_b)(void *d, void *a, int32_t b, uint32_t desc) intptr_t i, oprsz = simd_oprsz(desc); for (i = 0; i < oprsz; i += sizeof(int8_t)) { - int r = *(int8_t *)(a + i) + b; - if (r > INT8_MAX) { - r = INT8_MAX; - } else if (r < INT8_MIN) { - r = INT8_MIN; - } - *(int8_t *)(d + i) = r; + *(int8_t *)(d + i) = DO_SQADD_B(b, *(int8_t *)(a + i)); } } @@ -1655,13 +1766,7 @@ void HELPER(sve_sqaddi_h)(void *d, void *a, int32_t b, uint32_t desc) intptr_t i, oprsz = simd_oprsz(desc); for (i = 0; i < oprsz; i += sizeof(int16_t)) { - int r = *(int16_t *)(a + i) + b; - if (r > INT16_MAX) { - r = INT16_MAX; - } else if (r < INT16_MIN) { - r = INT16_MIN; - } - *(int16_t *)(d + i) = r; + *(int16_t *)(d + i) = DO_SQADD_H(b, *(int16_t *)(a + i)); } } @@ -1670,13 +1775,7 @@ void HELPER(sve_sqaddi_s)(void *d, void *a, int64_t b, uint32_t desc) intptr_t i, oprsz = simd_oprsz(desc); for (i = 0; i < oprsz; i += sizeof(int32_t)) { - int64_t r = *(int32_t *)(a + i) + b; - if (r > INT32_MAX) { - r = INT32_MAX; - } else if (r < INT32_MIN) { - r = INT32_MIN; - } - *(int32_t *)(d + i) = r; + *(int32_t *)(d + i) = DO_SQADD_S(b, *(int32_t *)(a + i)); } } @@ -1685,13 +1784,7 @@ void HELPER(sve_sqaddi_d)(void *d, void *a, int64_t b, uint32_t desc) intptr_t i, oprsz = simd_oprsz(desc); for (i = 0; i < oprsz; i += sizeof(int64_t)) { - int64_t ai = *(int64_t *)(a + i); - int64_t r = ai + b; - if (((r ^ ai) & ~(ai ^ b)) < 0) { - /* Signed overflow. */ - r = (r < 0 ? 
INT64_MAX : INT64_MIN); - } - *(int64_t *)(d + i) = r; + *(int64_t *)(d + i) = do_sqadd_d(b, *(int64_t *)(a + i)); } } @@ -1704,13 +1797,7 @@ void HELPER(sve_uqaddi_b)(void *d, void *a, int32_t b, uint32_t desc) intptr_t i, oprsz = simd_oprsz(desc); for (i = 0; i < oprsz; i += sizeof(uint8_t)) { - int r = *(uint8_t *)(a + i) + b; - if (r > UINT8_MAX) { - r = UINT8_MAX; - } else if (r < 0) { - r = 0; - } - *(uint8_t *)(d + i) = r; + *(uint8_t *)(d + i) = DO_UQADD_B(b, *(uint8_t *)(a + i)); } } @@ -1719,13 +1806,7 @@ void HELPER(sve_uqaddi_h)(void *d, void *a, int32_t b, uint32_t desc) intptr_t i, oprsz = simd_oprsz(desc); for (i = 0; i < oprsz; i += sizeof(uint16_t)) { - int r = *(uint16_t *)(a + i) + b; - if (r > UINT16_MAX) { - r = UINT16_MAX; - } else if (r < 0) { - r = 0; - } - *(uint16_t *)(d + i) = r; + *(uint16_t *)(d + i) = DO_UQADD_H(b, *(uint16_t *)(a + i)); } } @@ -1734,13 +1815,7 @@ void HELPER(sve_uqaddi_s)(void *d, void *a, int64_t b, uint32_t desc) intptr_t i, oprsz = simd_oprsz(desc); for (i = 0; i < oprsz; i += sizeof(uint32_t)) { - int64_t r = *(uint32_t *)(a + i) + b; - if (r > UINT32_MAX) { - r = UINT32_MAX; - } else if (r < 0) { - r = 0; - } - *(uint32_t *)(d + i) = r; + *(uint32_t *)(d + i) = DO_UQADD_S(b, *(uint32_t *)(a + i)); } } @@ -1749,11 +1824,7 @@ void HELPER(sve_uqaddi_d)(void *d, void *a, uint64_t b, uint32_t desc) intptr_t i, oprsz = simd_oprsz(desc); for (i = 0; i < oprsz; i += sizeof(uint64_t)) { - uint64_t r = *(uint64_t *)(a + i) + b; - if (r < b) { - r = UINT64_MAX; - } - *(uint64_t *)(d + i) = r; + *(uint64_t *)(d + i) = do_uqadd_d(b, *(uint64_t *)(a + i)); } } @@ -1762,8 +1833,7 @@ void HELPER(sve_uqsubi_d)(void *d, void *a, uint64_t b, uint32_t desc) intptr_t i, oprsz = simd_oprsz(desc); for (i = 0; i < oprsz; i += sizeof(uint64_t)) { - uint64_t ai = *(uint64_t *)(a + i); - *(uint64_t *)(d + i) = (ai < b ? 
0 : ai - b); + *(uint64_t *)(d + i) = do_uqsub_d(*(uint64_t *)(a + i), b); } } diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 5f137c0e92..21dfb2455a 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -5998,3 +5998,10 @@ DO_SVE2_ZPZZ(SMAXP, smaxp) DO_SVE2_ZPZZ(UMAXP, umaxp) DO_SVE2_ZPZZ(SMINP, sminp) DO_SVE2_ZPZZ(UMINP, uminp) + +DO_SVE2_ZPZZ(SQADD_zpzz, sqadd) +DO_SVE2_ZPZZ(UQADD_zpzz, uqadd) +DO_SVE2_ZPZZ(SQSUB_zpzz, sqsub) +DO_SVE2_ZPZZ(UQSUB_zpzz, uqsub) +DO_SVE2_ZPZZ(SUQADD, suqadd) +DO_SVE2_ZPZZ(USQADD, usqadd)

From patchwork Thu Mar 26 23:08:18 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 184941
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH 11/31] target/arm: Implement SVE2 integer add/subtract long
Date: Thu, 26 Mar 2020 16:08:18 -0700
Message-Id: <20200326230838.31112-12-richard.henderson@linaro.org>
In-Reply-To: <20200326230838.31112-1-richard.henderson@linaro.org>
References: <20200326230838.31112-1-richard.henderson@linaro.org>
Cc: rajav@quicinc.com, qemu-arm@nongnu.org, apazos@quicinc.com

Signed-off-by: Richard Henderson
---
target/arm/helper-sve.h | 24 ++++++++++++++++++++
target/arm/sve.decode | 19 ++++++++++++++++
target/arm/sve_helper.c | 43 +++++++++++++++++++++++++++++++++++
target/arm/translate-sve.c | 46 ++++++++++++++++++++++++++++++++++++++
4 files changed, 132 insertions(+)
--
2.20.1

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 368185944a..475fce7f3a 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -1372,6 +1372,30 @@ DEF_HELPER_FLAGS_5(sve_ftmad_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_ftmad_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_ftmad_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_saddl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_saddl_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_saddl_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +
+DEF_HELPER_FLAGS_4(sve2_ssubl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_ssubl_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_ssubl_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_sabdl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sabdl_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sabdl_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_uaddl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_uaddl_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_uaddl_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_usubl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_usubl_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_usubl_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_uabdl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_uabdl_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_uabdl_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_4(sve_ldr, TCG_CALL_NO_WG, void, env, ptr, tl, int) DEF_HELPER_FLAGS_4(sve_str, TCG_CALL_NO_WG, void, env, ptr, tl, int) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 86aee38668..a239fd3479 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1161,3 +1161,22 @@ SUQADD 01000100 .. 011 100 100 ... ..... ..... @rdn_pg_rm USQADD 01000100 .. 011 101 100 ... ..... ..... @rdn_pg_rm SQSUB_zpzz 01000100 .. 011 110 100 ... ..... ..... @rdm_pg_rn # SQSUBR UQSUB_zpzz 01000100 .. 011 111 100 ... ..... ..... @rdm_pg_rn # UQSUBR + +#### SVE2 Widening Integer Arithmetic + +## SVE2 integer add/subtract long + +SADDLB 01000101 .. 0 ..... 00 0000 ..... ..... @rd_rn_rm +SADDLT 01000101 .. 0 ..... 00 0001 ..... ..... @rd_rn_rm +UADDLB 01000101 .. 0 ..... 00 0010 ..... ..... 
@rd_rn_rm +UADDLT 01000101 .. 0 ..... 00 0011 ..... ..... @rd_rn_rm + +SSUBLB 01000101 .. 0 ..... 00 0100 ..... ..... @rd_rn_rm +SSUBLT 01000101 .. 0 ..... 00 0101 ..... ..... @rd_rn_rm +USUBLB 01000101 .. 0 ..... 00 0110 ..... ..... @rd_rn_rm +USUBLT 01000101 .. 0 ..... 00 0111 ..... ..... @rd_rn_rm + +SABDLB 01000101 .. 0 ..... 00 1100 ..... ..... @rd_rn_rm +SABDLT 01000101 .. 0 ..... 00 1101 ..... ..... @rd_rn_rm +UABDLB 01000101 .. 0 ..... 00 1110 ..... ..... @rd_rn_rm +UABDLT 01000101 .. 0 ..... 00 1111 ..... ..... @rd_rn_rm diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index bee00eaa44..7d7a59f620 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1088,6 +1088,49 @@ DO_ZZW(sve_lsl_zzw_s, uint32_t, uint64_t, H1_4, DO_LSL) #undef DO_ZPZ #undef DO_ZPZ_D +/* + * Three-operand expander, unpredicated, in which the two inputs are + * selected from the top or bottom half of the wide column. + */ +#define DO_ZZZ_TB(NAME, TYPE, TYPEN, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ +{ \ + intptr_t i, opr_sz = simd_oprsz(desc); \ + int sel1 = (simd_data(desc) & 1) * sizeof(TYPEN) * 8; \ + int sel2 = (simd_data(desc) & 2) * (sizeof(TYPEN) * 4); \ + for (i = 0; i < opr_sz; i += sizeof(TYPE)) { \ + TYPE nn = (TYPEN)(*(TYPE *)(vn + i) >> sel1); \ + TYPE mm = (TYPEN)(*(TYPE *)(vm + i) >> sel2); \ + *(TYPE *)(vd + i) = OP(nn, mm); \ + } \ +} + +DO_ZZZ_TB(sve2_saddl_h, int16_t, int8_t, DO_ADD) +DO_ZZZ_TB(sve2_saddl_s, int32_t, int16_t, DO_ADD) +DO_ZZZ_TB(sve2_saddl_d, int64_t, int32_t, DO_ADD) + +DO_ZZZ_TB(sve2_ssubl_h, int16_t, int8_t, DO_SUB) +DO_ZZZ_TB(sve2_ssubl_s, int32_t, int16_t, DO_SUB) +DO_ZZZ_TB(sve2_ssubl_d, int64_t, int32_t, DO_SUB) + +DO_ZZZ_TB(sve2_sabdl_h, int16_t, int8_t, DO_ABD) +DO_ZZZ_TB(sve2_sabdl_s, int32_t, int16_t, DO_ABD) +DO_ZZZ_TB(sve2_sabdl_d, int64_t, int32_t, DO_ABD) + +DO_ZZZ_TB(sve2_uaddl_h, uint16_t, uint8_t, DO_ADD) +DO_ZZZ_TB(sve2_uaddl_s, uint32_t, uint16_t, DO_ADD)
+DO_ZZZ_TB(sve2_uaddl_d, uint64_t, uint32_t, DO_ADD) + +DO_ZZZ_TB(sve2_usubl_h, uint16_t, uint8_t, DO_SUB) +DO_ZZZ_TB(sve2_usubl_s, uint32_t, uint16_t, DO_SUB) +DO_ZZZ_TB(sve2_usubl_d, uint64_t, uint32_t, DO_SUB) + +DO_ZZZ_TB(sve2_uabdl_h, uint16_t, uint8_t, DO_ABD) +DO_ZZZ_TB(sve2_uabdl_s, uint32_t, uint16_t, DO_ABD) +DO_ZZZ_TB(sve2_uabdl_d, uint64_t, uint32_t, DO_ABD) + +#undef DO_ZZZ_TB + /* Two-operand reduction expander, controlled by a predicate. * The difference between TYPERED and TYPERET has to do with * sign-extension. E.g. for SMAX, TYPERED must be signed, diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 21dfb2455a..ee8a6fd912 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -6005,3 +6005,49 @@ DO_SVE2_ZPZZ(SQSUB_zpzz, sqsub) DO_SVE2_ZPZZ(UQSUB_zpzz, uqsub) DO_SVE2_ZPZZ(SUQADD, suqadd) DO_SVE2_ZPZZ(USQADD, usqadd) + +/* + * SVE2 Widening Integer Arithmetic + */ + +static bool do_sve2_zzw_ool(DisasContext *s, arg_rrr_esz *a, + gen_helper_gvec_3 *fn, int data) +{ + if (fn == NULL || !dc_isar_feature(aa64_sve2, s)) { + return false; + } + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + tcg_gen_gvec_3_ool(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + vsz, vsz, data, fn); + } + return true; +} + +#define DO_SVE2_ZZZ_TB(NAME, name, SEL1, SEL2) \ +static bool trans_##NAME(DisasContext *s, arg_rrr_esz *a) \ +{ \ + static gen_helper_gvec_3 * const fns[4] = { \ + NULL, gen_helper_sve2_##name##_h, \ + gen_helper_sve2_##name##_s, gen_helper_sve2_##name##_d, \ + }; \ + return do_sve2_zzw_ool(s, a, fns[a->esz], (SEL2 << 1) | SEL1); \ +} + +DO_SVE2_ZZZ_TB(SADDLB, saddl, false, false) +DO_SVE2_ZZZ_TB(SSUBLB, ssubl, false, false) +DO_SVE2_ZZZ_TB(SABDLB, sabdl, false, false) + +DO_SVE2_ZZZ_TB(UADDLB, uaddl, false, false) +DO_SVE2_ZZZ_TB(USUBLB, usubl, false, false) +DO_SVE2_ZZZ_TB(UABDLB, uabdl, false, false) + +DO_SVE2_ZZZ_TB(SADDLT, saddl, true, 
true) +DO_SVE2_ZZZ_TB(SSUBLT, ssubl, true, true) +DO_SVE2_ZZZ_TB(SABDLT, sabdl, true, true) + +DO_SVE2_ZZZ_TB(UADDLT, uaddl, true, true) +DO_SVE2_ZZZ_TB(USUBLT, usubl, true, true) +DO_SVE2_ZZZ_TB(UABDLT, uabdl, true, true)

From patchwork Thu Mar 26 23:08:19 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 184942
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH 12/31] target/arm: Implement SVE2 integer add/subtract interleaved long
Date: Thu, 26 Mar 2020 16:08:19 -0700
Message-Id: <20200326230838.31112-13-richard.henderson@linaro.org>
In-Reply-To: <20200326230838.31112-1-richard.henderson@linaro.org>
References: <20200326230838.31112-1-richard.henderson@linaro.org>
Cc: rajav@quicinc.com, qemu-arm@nongnu.org, apazos@quicinc.com

Signed-off-by: Richard Henderson
---
target/arm/sve.decode | 7 +++++++
target/arm/translate-sve.c | 3 +++
2 files changed, 10 insertions(+)
--
2.20.1

diff --git a/target/arm/sve.decode b/target/arm/sve.decode index a239fd3479..8d5f31bcc4 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -109,6 +109,7 @@ # Three operand, vector element size @rd_rn_rm ........ esz:2 . rm:5 ... ... rn:5 rd:5 &rrr_esz +@rd_rm_rn ........ esz:2 . rn:5 ... ... rm:5 rd:5 &rrr_esz @pd_pn_pm ........ esz:2 .. rm:4 ....... rn:4 . rd:4 &rrr_esz @rdn_rm ........ esz:2 ...... ...... rm:5 rd:5 \ &rrr_esz rn=%reg_movprfx @@ -1180,3 +1181,9 @@ SABDLB 01000101 .. 0 ..... 00 1100 ..... ..... @rd_rn_rm SABDLT 01000101 .. 0 ..... 00 1101 ..... ..... @rd_rn_rm UABDLB 01000101 .. 0 ..... 00 1110 ..... ..... @rd_rn_rm UABDLT 01000101 .. 0 ..... 00 1111 ..... ..... @rd_rn_rm + +## SVE2 integer add/subtract interleaved long + +SADDLBT 01000101 .. 0 ..... 1000 00 .....
..... @rd_rn_rm +SSUBLBT 01000101 .. 0 ..... 1000 10 ..... ..... @rd_rn_rm +SSUBLBT 01000101 .. 0 ..... 1000 11 ..... ..... @rd_rm_rn # SSUBLTB diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index ee8a6fd912..accb74537b 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -6051,3 +6051,6 @@ DO_SVE2_ZZZ_TB(SABDLT, sabdl, true, true) DO_SVE2_ZZZ_TB(UADDLT, uaddl, true, true) DO_SVE2_ZZZ_TB(USUBLT, usubl, true, true) DO_SVE2_ZZZ_TB(UABDLT, uabdl, true, true) + +DO_SVE2_ZZZ_TB(SADDLBT, saddl, false, true) +DO_SVE2_ZZZ_TB(SSUBLBT, ssubl, false, true) From patchwork Thu Mar 26 23:08:20 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 184951 Delivered-To: patch@linaro.org Received: by 2002:a92:de47:0:0:0:0:0 with SMTP id e7csp645311ilr; Thu, 26 Mar 2020 16:18:35 -0700 (PDT) X-Google-Smtp-Source: ADFU+vtCekpw5Za5ZjzsSPcW9U9nUbGSmgs/AdrxwUwG2PHFO/KskNbCabYmYas2NwGYAxbkMMH+ X-Received: by 2002:a37:a614:: with SMTP id p20mr11448192qke.114.1585264714955; Thu, 26 Mar 2020 16:18:34 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1585264714; cv=none; d=google.com; s=arc-20160816; b=kJdQOyyRHDla8SThGSHNYdba5l46sylxL8jDBdSiGKUSmEQ96mZOOVs6bjTC9cPDwq ohz5Q6GfirumF+CpjrVOkjzHBICOv6FpaSaRoBB51IELRBkuCGcElEP3QD7a4uDJRyvL +msiJ+r5nR8bydD6RAmzUqjEL9nuNw4LnwZgQtvxFuF/FMV2iubnVAemikgfOJA05zVn ogjcmRvTodjVcBqOlb7DBrkia/OS4g4R4Vgvm6I0QzAFTg9FRNZ7BDNhqiYMPrW1ap2r COHd9B1l7ZDKOVEdFg2UK585c5GfXUsF+FPLlXK3q4iah+TFS8XnEWsFroQqc4CoO4Y/ z83A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:to:from :dkim-signature; bh=mMKWgjpLavmbt56//ygtC6ZwR2NAPw0rAnQXI9rkvA8=; 
b=f2mhN5rF6IxVMyTyUc2dC8i2oQ24aTllXWRJmdaSgfMxW1jgobzx2ZVmgsfdI+O0zd 4pjkhmdBofo+9K+wf9iY5Wu0MoZj0MTHfeIyxURRSFVSBDfm+zuEBYNPAy/HlHz8HDgv yz8pAwu+mj96QlwkcUpLxF229ord0T4aSwrZpyzJln+tTLnCShSRQAri8/TdRIkQ9DG+ 8IinG+ZNSphVJ1bJ4/vdLM9mZ4a/TDUfLzBYwZHSl8pIVNcPj4rCi4nNun2YXpibAMbL HVukXU9AOqhtaDpFR6fICpRaAw31tGiHSCOqKAGe2kIcrALM74ls+N4t8Exy4rMg6+wd X+Xg== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=hhI8h5Pi; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [209.51.188.17]) by mx.google.com with ESMTPS id c124si2368432qke.33.2020.03.26.16.18.34 for (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Thu, 26 Mar 2020 16:18:34 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=hhI8h5Pi; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:34736 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jHbli-00015M-CW for patch@linaro.org; Thu, 26 Mar 2020 19:18:34 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:58430) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jHbcR-0002O1-UX for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:09:01 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1jHbcQ-0001VK-Fu for 
qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:08:59 -0400 Received: from mail-pl1-x643.google.com ([2607:f8b0:4864:20::643]:33612) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1jHbcQ-0001UG-9C for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:08:58 -0400 Received: by mail-pl1-x643.google.com with SMTP id g18so2752886plq.0 for ; Thu, 26 Mar 2020 16:08:58 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=mMKWgjpLavmbt56//ygtC6ZwR2NAPw0rAnQXI9rkvA8=; b=hhI8h5PirpzeUyHGeWw5TyAz80KK5mgR5DlmLdJ2ACyl0WdBE2lwA+HxEqrAlstXvQ 2LTJSsfkSCDt+nDl10GmcP6aJijm8dzs3keJexTrj9J8Ef5kMdW7UAJTD7R0Dw4BOj9G YV/B/qxQhh7bRnEdENfUNlRdzMaYHKJ+SE30yGk4dsILc5RQUt1XRsR7i1l5yoYmEEb9 fPhxknTMAKFCA0WN59UwLJCC0wcblfgSPM39Oc0DI7UnYgw9xKZYhqeoGiun44p+PzxO uYzl1afbw/Z3tvQjbT2Z8BpXl/n3GWZUViXp0BqN/gJKfFDyEJMHxSh/s9602LKOZaWc 49UQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=mMKWgjpLavmbt56//ygtC6ZwR2NAPw0rAnQXI9rkvA8=; b=rynQJnmg3ol28yMEWnrHz31bwu4GK5iZwL1vEBNw7RuaASCDZLQMlhWt1o1YpH3dxT TGgYvvm/ciKcxamJzJCLQCuLgmFCxeorYHfpYjGUmowP2Kn6AKQGmdEK/mD1+RFE/+Gy UWoFvKwtf2f30WWkyUS136yoRmJTvdg+M+tua/2colwa4gyHqPpsFehh6kCAXQhSDiD/ LVJtkQVWiv2u15HBLu9hMKL7cFRNQ0oq3SD6DCpAX4Gb4dajOaCHUEBHfDZK3V0X2DQM rOCMoI/ylsNS94L474gdoa5bnWKmDjXMOh9hLaWD5SYZ0n/bRORq1of3wigUSd35NFfC LuPA== X-Gm-Message-State: ANhLgQ0EFle4NjvudgJEyFl3O/3tQ3DUewSPX8KTn5az12b+kA1NF0q4 f8Szgu3UufpmAbsi7OyiaiYdkkInX+k= X-Received: by 2002:a17:902:a517:: with SMTP id s23mr10766321plq.125.1585264136950; Thu, 26 Mar 2020 16:08:56 -0700 (PDT) Received: from localhost.localdomain (174-21-138-234.tukw.qwest.net. 
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH 13/31] target/arm: Implement SVE2 integer add/subtract wide
Date: Thu, 26 Mar 2020 16:08:20 -0700
Message-Id: <20200326230838.31112-14-richard.henderson@linaro.org>
In-Reply-To: <20200326230838.31112-1-richard.henderson@linaro.org>
References: <20200326230838.31112-1-richard.henderson@linaro.org>
Cc: rajav@quicinc.com, qemu-arm@nongnu.org, apazos@quicinc.com

Signed-off-by: Richard Henderson
---
 target/arm/helper-sve.h    | 16 ++++++++++++++++
 target/arm/sve.decode      | 12 ++++++++++++
 target/arm/sve_helper.c    | 30 ++++++++++++++++++++++++++++++
 target/arm/translate-sve.c | 20 ++++++++++++++++++++
 4 files changed, 78 insertions(+)

--
2.20.1

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index 475fce7f3a..6a95c6085c 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -1396,6 +1396,22 @@ DEF_HELPER_FLAGS_4(sve2_uabdl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(sve2_uabdl_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(sve2_uabdl_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)

+DEF_HELPER_FLAGS_4(sve2_saddw_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve2_saddw_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve2_saddw_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(sve2_ssubw_h,
TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve2_ssubw_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve2_ssubw_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(sve2_uaddw_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve2_uaddw_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve2_uaddw_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(sve2_usubw_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve2_usubw_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve2_usubw_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
 DEF_HELPER_FLAGS_4(sve_ldr, TCG_CALL_NO_WG, void, env, ptr, tl, int)
 DEF_HELPER_FLAGS_4(sve_str, TCG_CALL_NO_WG, void, env, ptr, tl, int)

diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index 8d5f31bcc4..9994e1eb71 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -1187,3 +1187,15 @@ UABDLT 01000101 .. 0 ..... 00 1111 ..... ..... @rd_rn_rm
 SADDLBT 01000101 .. 0 ..... 1000 00 ..... ..... @rd_rn_rm
 SSUBLBT 01000101 .. 0 ..... 1000 10 ..... ..... @rd_rn_rm
 SSUBLBT 01000101 .. 0 ..... 1000 11 ..... ..... @rd_rm_rn # SSUBLTB
+
+## SVE2 integer add/subtract wide
+
+SADDWB 01000101 .. 0 ..... 010 000 ..... ..... @rd_rn_rm
+SADDWT 01000101 .. 0 ..... 010 001 ..... ..... @rd_rn_rm
+UADDWB 01000101 .. 0 ..... 010 010 ..... ..... @rd_rn_rm
+UADDWT 01000101 .. 0 ..... 010 011 ..... ..... @rd_rn_rm
+
+SSUBWB 01000101 .. 0 ..... 010 100 ..... ..... @rd_rn_rm
+SSUBWT 01000101 .. 0 ..... 010 101 ..... ..... @rd_rn_rm
+USUBWB 01000101 .. 0 ..... 010 110 ..... ..... @rd_rn_rm
+USUBWT 01000101 .. 0 ..... 010 111 ..... .....
@rd_rn_rm
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index 7d7a59f620..44503626e4 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -1131,6 +1131,36 @@ DO_ZZZ_TB(sve2_uabdl_d, uint64_t, uint32_t, DO_ABD)

 #undef DO_ZZZ_TB

+#define DO_ZZZ_WTB(NAME, TYPE, TYPEN, OP) \
+void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \
+{ \
+    intptr_t i, opr_sz = simd_oprsz(desc); \
+    int sel2 = (simd_data(desc) & 1) * sizeof(TYPE); \
+    for (i = 0; i < opr_sz; i += sizeof(TYPE)) { \
+        TYPE nn = *(TYPE *)(vn + i); \
+        TYPE mm = (TYPEN)(*(TYPE *)(vm + i) >> sel2); \
+        *(TYPE *)(vd + i) = OP(nn, mm); \
+    } \
+}
+
+DO_ZZZ_WTB(sve2_saddw_h, int16_t, int8_t, DO_ADD)
+DO_ZZZ_WTB(sve2_saddw_s, int32_t, int16_t, DO_ADD)
+DO_ZZZ_WTB(sve2_saddw_d, int64_t, int32_t, DO_ADD)
+
+DO_ZZZ_WTB(sve2_ssubw_h, int16_t, int8_t, DO_SUB)
+DO_ZZZ_WTB(sve2_ssubw_s, int32_t, int16_t, DO_SUB)
+DO_ZZZ_WTB(sve2_ssubw_d, int64_t, int32_t, DO_SUB)
+
+DO_ZZZ_WTB(sve2_uaddw_h, uint16_t, uint8_t, DO_ADD)
+DO_ZZZ_WTB(sve2_uaddw_s, uint32_t, uint16_t, DO_ADD)
+DO_ZZZ_WTB(sve2_uaddw_d, uint64_t, uint32_t, DO_ADD)
+
+DO_ZZZ_WTB(sve2_usubw_h, uint16_t, uint8_t, DO_SUB)
+DO_ZZZ_WTB(sve2_usubw_s, uint32_t, uint16_t, DO_SUB)
+DO_ZZZ_WTB(sve2_usubw_d, uint64_t, uint32_t, DO_SUB)
+
+#undef DO_ZZZ_WTB
+
 /* Two-operand reduction expander, controlled by a predicate.
  * The difference between TYPERED and TYPERET has to do with
  * sign-extension. E.g.
for SMAX, TYPERED must be signed,
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index accb74537b..fb214360bf 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -6054,3 +6054,23 @@ DO_SVE2_ZZZ_TB(UABDLT, uabdl, true, true)

 DO_SVE2_ZZZ_TB(SADDLBT, saddl, false, true)
 DO_SVE2_ZZZ_TB(SSUBLBT, ssubl, false, true)
+
+#define DO_SVE2_ZZZ_WTB(NAME, name, SEL2) \
+static bool trans_##NAME(DisasContext *s, arg_rrr_esz *a) \
+{ \
+    static gen_helper_gvec_3 * const fns[4] = { \
+        NULL, gen_helper_sve2_##name##_h, \
+        gen_helper_sve2_##name##_s, gen_helper_sve2_##name##_d, \
+    }; \
+    return do_sve2_zzw_ool(s, a, fns[a->esz], SEL2); \
+}
+
+DO_SVE2_ZZZ_WTB(SADDWB, saddw, false)
+DO_SVE2_ZZZ_WTB(SADDWT, saddw, true)
+DO_SVE2_ZZZ_WTB(SSUBWB, ssubw, false)
+DO_SVE2_ZZZ_WTB(SSUBWT, ssubw, true)
+
+DO_SVE2_ZZZ_WTB(UADDWB, uaddw, false)
+DO_SVE2_ZZZ_WTB(UADDWT, uaddw, true)
+DO_SVE2_ZZZ_WTB(USUBWB, usubw, false)
+DO_SVE2_ZZZ_WTB(USUBWT, usubw, true)

From patchwork Thu Mar 26 23:08:21 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 184955
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH 14/31] target/arm: Implement SVE2 integer multiply long
Date: Thu, 26 Mar 2020 16:08:21 -0700
Message-Id: <20200326230838.31112-15-richard.henderson@linaro.org>
In-Reply-To: <20200326230838.31112-1-richard.henderson@linaro.org>
References: <20200326230838.31112-1-richard.henderson@linaro.org>
Cc: rajav@quicinc.com, qemu-arm@nongnu.org, apazos@quicinc.com

Exclude PMULL from this category for the moment.

Signed-off-by: Richard Henderson
---
 target/arm/helper-sve.h    | 15 +++++++++++++++
 target/arm/sve.decode      |  9 +++++++++
 target/arm/sve_helper.c    | 31 +++++++++++++++++++++++++++++++
 target/arm/translate-sve.c |  9 +++++++++
 4 files changed, 64 insertions(+)

--
2.20.1

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index 6a95c6085c..c4784919d2 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -2355,4 +2355,19 @@ DEF_HELPER_FLAGS_6(sve_stdd_le_zd_mte, TCG_CALL_NO_WG,
 DEF_HELPER_FLAGS_6(sve_stdd_be_zd_mte, TCG_CALL_NO_WG,
                    void, env, ptr, ptr, ptr, tl, i32)

+DEF_HELPER_FLAGS_4(sve2_sqdmull_zzz_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve2_sqdmull_zzz_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve2_sqdmull_zzz_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(sve2_smull_zzz_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve2_smull_zzz_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve2_smull_zzz_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(sve2_umull_zzz_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve2_umull_zzz_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve2_umull_zzz_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
 DEF_HELPER_FLAGS_4(sve2_pmull_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index 9994e1eb71..2410dd85a1 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -1199,3 +1199,12 @@ SSUBWB 01000101 ..
0 ..... 010 100 ..... ..... @rd_rn_rm
 SSUBWT 01000101 .. 0 ..... 010 101 ..... ..... @rd_rn_rm
 USUBWB 01000101 .. 0 ..... 010 110 ..... ..... @rd_rn_rm
 USUBWT 01000101 .. 0 ..... 010 111 ..... ..... @rd_rn_rm
+
+## SVE2 integer multiply long
+
+SQDMULLB_zzz 01000101 .. 0 ..... 011 000 ..... ..... @rd_rn_rm
+SQDMULLT_zzz 01000101 .. 0 ..... 011 001 ..... ..... @rd_rn_rm
+SMULLB_zzz 01000101 .. 0 ..... 011 100 ..... ..... @rd_rn_rm
+SMULLT_zzz 01000101 .. 0 ..... 011 101 ..... ..... @rd_rn_rm
+UMULLB_zzz 01000101 .. 0 ..... 011 110 ..... ..... @rd_rn_rm
+UMULLT_zzz 01000101 .. 0 ..... 011 111 ..... ..... @rd_rn_rm
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index 44503626e4..130697f3d9 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -1129,6 +1129,37 @@ DO_ZZZ_TB(sve2_uabdl_h, uint16_t, uint8_t, DO_ABD)
 DO_ZZZ_TB(sve2_uabdl_s, uint32_t, uint16_t, DO_ABD)
 DO_ZZZ_TB(sve2_uabdl_d, uint64_t, uint32_t, DO_ABD)

+DO_ZZZ_TB(sve2_smull_zzz_h, int16_t, int8_t, DO_MUL)
+DO_ZZZ_TB(sve2_smull_zzz_s, int32_t, int16_t, DO_MUL)
+DO_ZZZ_TB(sve2_smull_zzz_d, int64_t, int32_t, DO_MUL)
+
+DO_ZZZ_TB(sve2_umull_zzz_h, uint16_t, uint8_t, DO_MUL)
+DO_ZZZ_TB(sve2_umull_zzz_s, uint32_t, uint16_t, DO_MUL)
+DO_ZZZ_TB(sve2_umull_zzz_d, uint64_t, uint32_t, DO_MUL)
+
+/* Note that the multiply cannot overflow, but the doubling can.
*/
+static inline int16_t do_sqdmull_h(int16_t n, int16_t m)
+{
+    int16_t val = n * m;
+    return DO_SQADD_H(val, val);
+}
+
+static inline int32_t do_sqdmull_s(int32_t n, int32_t m)
+{
+    int32_t val = n * m;
+    return DO_SQADD_S(val, val);
+}
+
+static inline int64_t do_sqdmull_d(int64_t n, int64_t m)
+{
+    int64_t val = n * m;
+    return do_sqadd_d(val, val);
+}
+
+DO_ZZZ_TB(sve2_sqdmull_zzz_h, int16_t, int8_t, do_sqdmull_h)
+DO_ZZZ_TB(sve2_sqdmull_zzz_s, int32_t, int16_t, do_sqdmull_s)
+DO_ZZZ_TB(sve2_sqdmull_zzz_d, int64_t, int32_t, do_sqdmull_d)
+
 #undef DO_ZZZ_TB

 #define DO_ZZZ_WTB(NAME, TYPE, TYPEN, OP) \
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index fb214360bf..c66ec9eb83 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -6055,6 +6055,15 @@ DO_SVE2_ZZZ_TB(UABDLT, uabdl, true, true)
 DO_SVE2_ZZZ_TB(SADDLBT, saddl, false, true)
 DO_SVE2_ZZZ_TB(SSUBLBT, ssubl, false, true)

+DO_SVE2_ZZZ_TB(SQDMULLB_zzz, sqdmull_zzz, false, false)
+DO_SVE2_ZZZ_TB(SQDMULLT_zzz, sqdmull_zzz, true, true)
+
+DO_SVE2_ZZZ_TB(SMULLB_zzz, smull_zzz, false, false)
+DO_SVE2_ZZZ_TB(SMULLT_zzz, smull_zzz, true, true)
+
+DO_SVE2_ZZZ_TB(UMULLB_zzz, umull_zzz, false, false)
+DO_SVE2_ZZZ_TB(UMULLT_zzz, umull_zzz, true, true)
+
 #define DO_SVE2_ZZZ_WTB(NAME, name, SEL2) \
 static bool trans_##NAME(DisasContext *s, arg_rrr_esz *a) \
 { \

From patchwork Thu Mar 26 23:08:22 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 184960
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH 15/31] target/arm: Implement PMULLB and PMULLT
Date: Thu, 26 Mar 2020 16:08:22 -0700
Message-Id: <20200326230838.31112-16-richard.henderson@linaro.org>
In-Reply-To: <20200326230838.31112-1-richard.henderson@linaro.org>
References: <20200326230838.31112-1-richard.henderson@linaro.org>
Cc: rajav@quicinc.com, qemu-arm@nongnu.org, apazos@quicinc.com

Signed-off-by: Richard Henderson
---
 target/arm/cpu.h           | 10 ++++++++++
 target/arm/helper-sve.h    |  1 +
 target/arm/sve.decode      |  2 ++
 target/arm/translate-sve.c | 22 ++++++++++++++++++++++
 target/arm/vec_helper.c    | 26 ++++++++++++++++++++++++++
 5 files changed, 61 insertions(+)

--
2.20.1

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 2314e3c18c..2e9d9f046d 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -3855,6 +3855,16 @@ static inline bool isar_feature_aa64_sve2(const ARMISARegisters *id)
     return FIELD_EX64(id->id_aa64zfr0, ID_AA64ZFR0, SVEVER) != 0;
 }

+static inline bool isar_feature_aa64_sve2_aes(const ARMISARegisters *id)
+{
+    return FIELD_EX64(id->id_aa64zfr0, ID_AA64ZFR0, AES) != 0;
+}
+
+static inline bool isar_feature_aa64_sve2_pmull128(const ARMISARegisters *id)
+{
+    return FIELD_EX64(id->id_aa64zfr0, ID_AA64ZFR0, AES) >= 2;
+}
+
 /*
  * Feature tests for "does this exist in either 32-bit or 64-bit?"
  */
diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index c4784919d2..943839e2dc 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -2371,3 +2371,4 @@ DEF_HELPER_FLAGS_4(sve2_umull_zzz_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(sve2_umull_zzz_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)

 DEF_HELPER_FLAGS_4(sve2_pmull_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve2_pmull_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index 2410dd85a1..04bf9e5ce8 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -1204,6 +1204,8 @@ USUBWT 01000101 .. 0 ..... 010 111 ..... .....
@rd_rn_rm

 SQDMULLB_zzz 01000101 .. 0 ..... 011 000 ..... ..... @rd_rn_rm
 SQDMULLT_zzz 01000101 .. 0 ..... 011 001 ..... ..... @rd_rn_rm
+PMULLB 01000101 .. 0 ..... 011 010 ..... ..... @rd_rn_rm
+PMULLT 01000101 .. 0 ..... 011 011 ..... ..... @rd_rn_rm
 SMULLB_zzz 01000101 .. 0 ..... 011 100 ..... ..... @rd_rn_rm
 SMULLT_zzz 01000101 .. 0 ..... 011 101 ..... ..... @rd_rn_rm
 UMULLB_zzz 01000101 .. 0 ..... 011 110 ..... ..... @rd_rn_rm
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index c66ec9eb83..67416a25ce 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -6064,6 +6064,28 @@ DO_SVE2_ZZZ_TB(SMULLT_zzz, smull_zzz, true, true)
 DO_SVE2_ZZZ_TB(UMULLB_zzz, umull_zzz, false, false)
 DO_SVE2_ZZZ_TB(UMULLT_zzz, umull_zzz, true, true)

+static bool do_trans_pmull(DisasContext *s, arg_rrr_esz *a, bool sel)
+{
+    static gen_helper_gvec_3 * const fns[4] = {
+        gen_helper_gvec_pmull_q, gen_helper_sve2_pmull_h,
+        NULL, gen_helper_sve2_pmull_d,
+    };
+    if (a->esz == 0 && !dc_isar_feature(aa64_sve2_pmull128, s)) {
+        return false;
+    }
+    return do_sve2_zzw_ool(s, a, fns[a->esz], sel);
+}
+
+static bool trans_PMULLB(DisasContext *s, arg_rrr_esz *a)
+{
+    return do_trans_pmull(s, a, false);
+}
+
+static bool trans_PMULLT(DisasContext *s, arg_rrr_esz *a)
+{
+    return do_trans_pmull(s, a, true);
+}
+
 #define DO_SVE2_ZZZ_WTB(NAME, name, SEL2) \
 static bool trans_##NAME(DisasContext *s, arg_rrr_esz *a) \
 { \
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
index 00dc38c9db..154d32518a 100644
--- a/target/arm/vec_helper.c
+++ b/target/arm/vec_helper.c
@@ -1256,6 +1256,32 @@ void HELPER(sve2_pmull_h)(void *vd, void *vn, void *vm, uint32_t desc)
         d[i] = pmull_h(nn, mm);
     }
 }
+
+static uint64_t pmull_d(uint64_t op1, uint64_t op2)
+{
+    uint64_t result = 0;
+    int i;
+
+    for (i = 0; i < 32; ++i) {
+        uint64_t mask = -((op1 >> i) & 1);
+        result ^= (op2 << i) & mask;
+    }
+    return result;
+}
+
+void HELPER(sve2_pmull_d)(void *vd, void *vn, void *vm,
uint32_t desc)
+{
+    int shift = simd_data(desc) * 32;
+    intptr_t i, opr_sz = simd_oprsz(desc);
+    uint64_t *d = vd, *n = vn, *m = vm;
+
+    for (i = 0; i < opr_sz / 8; ++i) {
+        uint64_t nn = (uint32_t)(n[i] >> shift);
+        uint64_t mm = (uint32_t)(m[i] >> shift);
+
+        d[i] = pmull_d(nn, mm);
+    }
+}
 #endif

 /*

From patchwork Thu Mar 26 23:08:23 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 184963
[174.21.138.234]) by smtp.gmail.com with ESMTPSA id i187sm2530037pfg.33.2020.03.26.16.08.59 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 26 Mar 2020 16:09:00 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH 16/31] target/arm: Tidy SVE tszimm shift formats Date: Thu, 26 Mar 2020 16:08:23 -0700 Message-Id: <20200326230838.31112-17-richard.henderson@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200326230838.31112-1-richard.henderson@linaro.org> References: <20200326230838.31112-1-richard.henderson@linaro.org> MIME-Version: 1.0 X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::442 X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: rajav@quicinc.com, qemu-arm@nongnu.org, apazos@quicinc.com Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Rather than require the user to fill in the immediate (shl or shr), create full formats that include the immediate. Signed-off-by: Richard Henderson --- target/arm/sve.decode | 35 ++++++++++++++++------------------- 1 file changed, 16 insertions(+), 19 deletions(-) -- 2.20.1 diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 04bf9e5ce8..440cff4597 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -151,13 +151,17 @@ @rd_rn_i6 ........ ... rn:5 ..... imm:s6 rd:5 &rri # Two register operand, one immediate operand, with predicate, -# element size encoded as TSZHL. User must fill in imm. -@rdn_pg_tszimm ........ .. ... ... ... pg:3 ..... rd:5 \ - &rpri_esz rn=%reg_movprfx esz=%tszimm_esz +# element size encoded as TSZHL. +@rdn_pg_tszimm_shl ........ .. ... ... ... pg:3 ..... rd:5 \ + &rpri_esz rn=%reg_movprfx esz=%tszimm_esz imm=%tszimm_shl +@rdn_pg_tszimm_shr ........ .. ... ... ... pg:3 ..... 
rd:5 \ + &rpri_esz rn=%reg_movprfx esz=%tszimm_esz imm=%tszimm_shr # Similarly without predicate. -@rd_rn_tszimm ........ .. ... ... ...... rn:5 rd:5 \ - &rri_esz esz=%tszimm16_esz +@rd_rn_tszimm_shl ........ .. ... ... ...... rn:5 rd:5 \ + &rri_esz esz=%tszimm16_esz imm=%tszimm16_shl +@rd_rn_tszimm_shr ........ .. ... ... ...... rn:5 rd:5 \ + &rri_esz esz=%tszimm16_esz imm=%tszimm16_shr # Two register operand, one immediate operand, with 4-bit predicate. # User must fill in imm. @@ -290,14 +294,10 @@ UMINV 00000100 .. 001 011 001 ... ..... ..... @rd_pg_rn ### SVE Shift by Immediate - Predicated Group # SVE bitwise shift by immediate (predicated) -ASR_zpzi 00000100 .. 000 000 100 ... .. ... ..... \ - @rdn_pg_tszimm imm=%tszimm_shr -LSR_zpzi 00000100 .. 000 001 100 ... .. ... ..... \ - @rdn_pg_tszimm imm=%tszimm_shr -LSL_zpzi 00000100 .. 000 011 100 ... .. ... ..... \ - @rdn_pg_tszimm imm=%tszimm_shl -ASRD 00000100 .. 000 100 100 ... .. ... ..... \ - @rdn_pg_tszimm imm=%tszimm_shr +ASR_zpzi 00000100 .. 000 000 100 ... .. ... ..... @rdn_pg_tszimm_shr +LSR_zpzi 00000100 .. 000 001 100 ... .. ... ..... @rdn_pg_tszimm_shr +LSL_zpzi 00000100 .. 000 011 100 ... .. ... ..... @rdn_pg_tszimm_shl +ASRD 00000100 .. 000 100 100 ... .. ... ..... @rdn_pg_tszimm_shr # SVE bitwise shift by vector (predicated) ASR_zpzz 00000100 .. 010 000 100 ... ..... ..... @rdn_pg_rm @@ -401,12 +401,9 @@ RDVL 00000100 101 11111 01010 imm:s6 rd:5 ### SVE Bitwise Shift - Unpredicated Group # SVE bitwise shift by immediate (unpredicated) -ASR_zzi 00000100 .. 1 ..... 1001 00 ..... ..... \ - @rd_rn_tszimm imm=%tszimm16_shr -LSR_zzi 00000100 .. 1 ..... 1001 01 ..... ..... \ - @rd_rn_tszimm imm=%tszimm16_shr -LSL_zzi 00000100 .. 1 ..... 1001 11 ..... ..... \ - @rd_rn_tszimm imm=%tszimm16_shl +ASR_zzi 00000100 .. 1 ..... 1001 00 ..... ..... @rd_rn_tszimm_shr +LSR_zzi 00000100 .. 1 ..... 1001 01 ..... ..... @rd_rn_tszimm_shr +LSL_zzi 00000100 .. 1 ..... 1001 11 ..... ..... 
@rd_rn_tszimm_shl # SVE bitwise shift by wide elements (unpredicated) # Note esz != 3 From patchwork Thu Mar 26 23:08:24 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 184949 Delivered-To: patch@linaro.org Received: by 2002:a92:de47:0:0:0:0:0 with SMTP id e7csp643643ilr; Thu, 26 Mar 2020 16:16:34 -0700 (PDT) X-Google-Smtp-Source: ADFU+vtT4/QQLM75jUxjPgnYpq7YEjPQ+J+eST/mD42IM/CK5SIBlpcVdp59MHtS6giuMfG4TQBt X-Received: by 2002:ad4:5610:: with SMTP id ca16mr11418450qvb.104.1585264594028; Thu, 26 Mar 2020 16:16:34 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1585264594; cv=none; d=google.com; s=arc-20160816; b=W2Vew//dczsLzZXljSKCtD/Fqej0EH4dzUqg5aJoHevGkkRct1qUTPPle6oQ8duqkJ bFZ1yMpKhlceZhlwF7GEf6IL5OiACL0B/mAYqG/IuuCx94nf8UJ6RS708moSm3QHEgRU umsYYLMrK97KaL2vFoiWWC2TX9iw0qyAxyZkmCJZMFFvm0QAuoXSc+MWpZq4Q3rh2RIi EFlYRXq1LqPwII3+ZegXpMuFumk01KQ1Idn13Y7/Qn/Xp0knHKfdUG2QIHyqJF2U4De1 cJfCZ0YVjRn/D7LUO+QTQFNN0GfF1L26NEi1U3XexvCINoA5SUMLnZNWZhg89kexeFqp WK9g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:to:from :dkim-signature; bh=6gFXa7U00jC32XzVhB/bZg7gt2oypaZr80DxmsTe9GI=; b=ecC0LXtesHVMyUQMckGRWyoQEsvXHe6lcN5RcPVGcnAMoxooRDFbEbjtGwkErklBNl uJBjv4qqWfdAlgMQ6U4OB3NfJk1aHDX3J0S73fpRakzpXCFu890LJ4PxaXGyYP8ScTsH fKCKOhg1N0Pszv9vGXJEo57qHJnIY/9g+yYStXEG1M03i9BtY+ZKZk9N/jwNHLJoUmJB 5UF9E3Cray3dxA7sCPJp+h2x3T+22/kG6ikau9GpcSP9Tig4SpgMzgVZVCBHsQp8t3NQ uJvVSE0fpZo0aJo3/iWUxBIm+Ryycl9YDEjWWl8NOfzjtPHkorCiXuh9mKADZFY4EHHC /SWA== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=UC7RaQjB; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 
as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [209.51.188.17]) by mx.google.com with ESMTPS id 137si2435424qkn.203.2020.03.26.16.16.33 for (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Thu, 26 Mar 2020 16:16:34 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=UC7RaQjB; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:34696 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jHbjl-0006Hq-G8 for patch@linaro.org; Thu, 26 Mar 2020 19:16:33 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:58747) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jHbcX-0002cq-8O for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:09:06 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1jHbcV-0001ee-Mr for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:09:05 -0400 Received: from mail-pf1-x443.google.com ([2607:f8b0:4864:20::443]:36574) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1jHbcV-0001dX-H3 for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:09:03 -0400 Received: by mail-pf1-x443.google.com with SMTP id i13so3553621pfe.3 for ; Thu, 26 Mar 2020 16:09:03 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references 
:mime-version:content-transfer-encoding; bh=6gFXa7U00jC32XzVhB/bZg7gt2oypaZr80DxmsTe9GI=; b=UC7RaQjBnyI9Jsd1QHYb4JunreupUSW8ZCngkDpMBSwvQOxyLaqftlmlP5XFuK+nze QyGHtHWGh//fU+Qaf5L7e0rY3w0oZ8q+XycCkbTXNsxv0UaDXibcCf2oIaZ5sn5aNUjF OO0jKglu9zcYUtZ/AwGqb2IUpR/+V90dtejgWmLQb+ZqRPQaRA+LmBICmqdXJQD9fGvw ku4AL7MNAh65IjqrF5lphjGQY/VHQLwHWtCGJ1+etq4UTRmG5jl6LmhneasoxctYHHSI e6CEzWtynYSVEOccw4jB9AbUUzZCSGXLBUfWhtcGQ0AbXZOISjPsqrR2M1S3SVyo+hN+ Qgyg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=6gFXa7U00jC32XzVhB/bZg7gt2oypaZr80DxmsTe9GI=; b=WwygLfA/3gaauGV3xEOSn1sWaTUdLjEtSaWMEhEMy3TIpbw5nXMaCalJKFqemL5CjO tpiS2i3vAI03cmK4etjasokA8cReCG4KmUhiAzJ+Z8qIF+yaWiYEpxN+FLfB6y2Ir+eh Be+ob7Zmpq+i9EWbb9r559wZ19H3r0rs8fVp+CJ6zb1wRmq6qgbAP/zjaMftLZc6fhCq 21oiqEiTAboOEjli72pW37xX/7ao+UDfMzDeS8LXDjcJTbezK/fEOPCreRzrEZ25GmfB NMsiont1iGeeIgk6qNxFiPdNBkCq8kDAGgJ1O2Nzc1hh/AtyW+g8JScs/N1cW5or2mJf Njww== X-Gm-Message-State: ANhLgQ3XEK3QB737IfH40NC2sg1q1qN80uI9jEZ4RVDXlz+TMuBYXuYf 0WfVlCcTB1FBTHTpF1OxJNLridgH8Hs= X-Received: by 2002:aa7:9790:: with SMTP id o16mr11614378pfp.322.1585264142088; Thu, 26 Mar 2020 16:09:02 -0700 (PDT) Received: from localhost.localdomain (174-21-138-234.tukw.qwest.net. 
[174.21.138.234]) by smtp.gmail.com with ESMTPSA id i187sm2530037pfg.33.2020.03.26.16.09.01 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 26 Mar 2020 16:09:01 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH 17/31] target/arm: Implement SVE2 bitwise shift left long Date: Thu, 26 Mar 2020 16:08:24 -0700 Message-Id: <20200326230838.31112-18-richard.henderson@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200326230838.31112-1-richard.henderson@linaro.org> References: <20200326230838.31112-1-richard.henderson@linaro.org> MIME-Version: 1.0 X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::443 X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: rajav@quicinc.com, qemu-arm@nongnu.org, apazos@quicinc.com Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 8 +++++++ target/arm/sve.decode | 8 +++++++ target/arm/sve_helper.c | 34 ++++++++++++++++++++++++++ target/arm/translate-sve.c | 49 ++++++++++++++++++++++++++++++++++++++ 4 files changed, 99 insertions(+) -- 2.20.1 diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 943839e2dc..9c0c41ba80 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2372,3 +2372,11 @@ DEF_HELPER_FLAGS_4(sve2_umull_zzz_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve2_pmull_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve2_pmull_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(sve2_sshll_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_sshll_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_sshll_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(sve2_ushll_h, TCG_CALL_NO_RWG, void, 
ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_ushll_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_ushll_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 440cff4597..36ef9de563 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1207,3 +1207,11 @@ SMULLB_zzz 01000101 .. 0 ..... 011 100 ..... ..... @rd_rn_rm SMULLT_zzz 01000101 .. 0 ..... 011 101 ..... ..... @rd_rn_rm UMULLB_zzz 01000101 .. 0 ..... 011 110 ..... ..... @rd_rn_rm UMULLT_zzz 01000101 .. 0 ..... 011 111 ..... ..... @rd_rn_rm + +## SVE2 bitwise shift left long + +# Note bit23 == 0 is handled by esz > 0 in do_sve2_shll_tb. +SSHLLB 01000101 .. 0 ..... 1010 00 ..... ..... @rd_rn_tszimm_shl +SSHLLT 01000101 .. 0 ..... 1010 01 ..... ..... @rd_rn_tszimm_shl +USHLLB 01000101 .. 0 ..... 1010 10 ..... ..... @rd_rn_tszimm_shl +USHLLT 01000101 .. 0 ..... 1010 11 ..... ..... @rd_rn_tszimm_shl diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 130697f3d9..e0a701c446 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -625,6 +625,8 @@ DO_ZPZZ(sve2_sqrshl_zpzz_h, int16_t, H1_2, do_sqrshl_h) DO_ZPZZ(sve2_sqrshl_zpzz_s, int32_t, H1_4, do_sqrshl_s) DO_ZPZZ_D(sve2_sqrshl_zpzz_d, int64_t, do_sqrshl_d) +#undef do_sqrshl_d + #define do_uqrshl_b(n, m) \ ({ uint32_t discard; do_uqrshl_bhs(n, m, 8, true, &discard); }) #define do_uqrshl_h(n, m) \ @@ -639,6 +641,8 @@ DO_ZPZZ(sve2_uqrshl_zpzz_h, uint16_t, H1_2, do_uqrshl_h) DO_ZPZZ(sve2_uqrshl_zpzz_s, uint32_t, H1_4, do_uqrshl_s) DO_ZPZZ_D(sve2_uqrshl_zpzz_d, uint64_t, do_uqrshl_d) +#undef do_uqrshl_d + #define DO_HADD_BHS(n, m) (((int64_t)n + m) >> 1) #define DO_HADD_D(n, m) ((n >> 1) + (m >> 1) + (n & m & 1)) @@ -1192,6 +1196,36 @@ DO_ZZZ_WTB(sve2_usubw_d, uint64_t, uint32_t, DO_SUB) #undef DO_ZZZ_WTB +#define DO_ZZI_SHLL(NAME, TYPE, TYPEN, OP) \ +void HELPER(NAME)(void *vd, void *vn, uint32_t desc) \ +{ \ + intptr_t i, opr_sz = simd_oprsz(desc); \ + int sel = 
(simd_data(desc) & 1) * sizeof(TYPE); \ + int shift = simd_data(desc) >> 1; \ + for (i = 0; i < opr_sz; i += sizeof(TYPE)) { \ + TYPE nn = (TYPEN)(*(TYPE *)(vn + i) >> sel); \ + *(TYPE *)(vd + i) = OP(nn, shift); \ + } \ +} + +#define DO_SSHLL_H(val, sh) do_sqrshl_bhs(val, sh, 16, false, NULL) +#define DO_SSHLL_S(val, sh) do_sqrshl_bhs(val, sh, 32, false, NULL) +#define DO_SSHLL_D(val, sh) do_sqrshl_d(val, sh, false, NULL) + +DO_ZZI_SHLL(sve2_sshll_h, int16_t, int8_t, DO_SSHLL_H) +DO_ZZI_SHLL(sve2_sshll_s, int32_t, int16_t, DO_SSHLL_S) +DO_ZZI_SHLL(sve2_sshll_d, int64_t, int32_t, DO_SSHLL_D) + +#define DO_USHLL_H(val, sh) do_uqrshl_bhs(val, sh, 16, false, NULL) +#define DO_USHLL_S(val, sh) do_uqrshl_bhs(val, sh, 32, false, NULL) +#define DO_USHLL_D(val, sh) do_uqrshl_d(val, sh, false, NULL) + +DO_ZZI_SHLL(sve2_ushll_h, uint16_t, uint8_t, DO_USHLL_H) +DO_ZZI_SHLL(sve2_ushll_s, uint32_t, uint16_t, DO_USHLL_S) +DO_ZZI_SHLL(sve2_ushll_d, uint64_t, uint32_t, DO_USHLL_D) + +#undef DO_ZZI_SHLL + /* Two-operand reduction expander, controlled by a predicate. * The difference between TYPERED and TYPERET has to do with * sign-extension. E.g. for SMAX, TYPERED must be signed, diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 67416a25ce..9873b83feb 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -6105,3 +6105,52 @@ DO_SVE2_ZZZ_WTB(UADDWB, uaddw, false) DO_SVE2_ZZZ_WTB(UADDWT, uaddw, true) DO_SVE2_ZZZ_WTB(USUBWB, usubw, false) DO_SVE2_ZZZ_WTB(USUBWT, usubw, true) + +static bool do_sve2_shll_tb(DisasContext *s, arg_rri_esz *a, + bool sel, bool uns) +{ + static gen_helper_gvec_2 * const fns[2][3] = { + { gen_helper_sve2_sshll_h, + gen_helper_sve2_sshll_s, + gen_helper_sve2_sshll_d }, + { gen_helper_sve2_ushll_h, + gen_helper_sve2_ushll_s, + gen_helper_sve2_ushll_d }, + }; + + if (a->esz <= 0 || !dc_isar_feature(aa64_sve2, s)) { + /* + * For < 0, invalid tsz encoding -- see tszimm_esz. 
+ * For = 0, not a widening operation; note this implies bit23 = 0. + */ + return false; + } + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + tcg_gen_gvec_2_ool(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vsz, vsz, (a->imm << 1) | sel, + fns[uns][a->esz - 1]); + } + return true; +} + +static bool trans_SSHLLB(DisasContext *s, arg_rri_esz *a) +{ + return do_sve2_shll_tb(s, a, false, false); +} + +static bool trans_SSHLLT(DisasContext *s, arg_rri_esz *a) +{ + return do_sve2_shll_tb(s, a, true, false); +} + +static bool trans_USHLLB(DisasContext *s, arg_rri_esz *a) +{ + return do_sve2_shll_tb(s, a, false, true); +} + +static bool trans_USHLLT(DisasContext *s, arg_rri_esz *a) +{ + return do_sve2_shll_tb(s, a, true, true); +} From patchwork Thu Mar 26 23:08:25 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 184966 Delivered-To: patch@linaro.org Received: by 2002:a92:de47:0:0:0:0:0 with SMTP id e7csp652011ilr; Thu, 26 Mar 2020 16:26:51 -0700 (PDT) X-Google-Smtp-Source: ADFU+vs8ZJOkGJunJebcUeuPppHpyK+e++6v0i7i6qNZHWFzE9GBnfXMu1Wl+B4HT5L+oUCFxkcZ X-Received: by 2002:a37:6616:: with SMTP id a22mr11371678qkc.391.1585265211476; Thu, 26 Mar 2020 16:26:51 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1585265211; cv=none; d=google.com; s=arc-20160816; b=CTVs0vFIUe+u0H4pxlFIiZDanJfz9NRE67JGpmkch0aHFESz7o1oQvviHe6qGuW7P1 xztITPHK1lxQFxTUxY28f5Y/Zi7n/DVygREDHxasElM0cElI3kLHjPfPkdHgwDeCTouE XPbVFQIllIZBxY+vFltocUzADl+bFf8RJdWbR0zT5jY70z69A6Md1qpd87OoT1ZaVlL1 5Gm1a0W+6RR4ZIlVXMk7xx08+zyhhqkJMYdJDSweoBhI5Avkuk73692v4ilAxTFzHe7r x3PDD6BG5YXemKlodhbtulV701fI3hKqvFkW/sNQXLwIghOWY9fsqei6WIlgLs0Gla40 zbfw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding 
:mime-version:references:in-reply-to:message-id:date:subject:to:from :dkim-signature; bh=wrW/Tw+fzY4lh5Sk9yEJSGX1ZVgOSoYBH6s+EdHCLNI=; b=tCJQmmi0a5JgMj0kZAenR3o9ML6kMchiUnfaQIVXX7yUEL9ADZ6MwUiJ+RllOhXGgP ur2u0BrZ7QhSK1X2dqQy0HRmAzWSxdfhmLr4WF3m4pj9VBeOBMpKIn82bf+x81xAAQrv xt0TI8XFPxrA7WiryjPfGp0wrvWbmMutxj6Fq6banJmTMw/JpcxnNaeL75Atg6CdOpEI 3xjwd3SiVuJqzDSTPTwJV4ksRH4aIdi2TLPFsHjeyeU7ksSZMYAc1d08ULu+I7DNpmTN GlWOUaBJpBLhgekEJ/wZ4Zv5s+NB0G6MXIm7yT1/hJznGCcsA7LIaSmQyFBfPEt5m5FD rmdQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=BrEbMv9x; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [209.51.188.17]) by mx.google.com with ESMTPS id 18si2476997qkk.210.2020.03.26.16.26.51 for (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Thu, 26 Mar 2020 16:26:51 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=BrEbMv9x; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:34978 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jHbti-0005C3-UT for patch@linaro.org; Thu, 26 Mar 2020 19:26:50 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:58819) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jHbcY-0002gA-HA for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:09:07 
-0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1jHbcW-0001gc-Uf for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:09:06 -0400 Received: from mail-pf1-x42b.google.com ([2607:f8b0:4864:20::42b]:33342) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1jHbcW-0001f9-OI for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:09:04 -0400 Received: by mail-pf1-x42b.google.com with SMTP id j1so3565885pfe.0 for ; Thu, 26 Mar 2020 16:09:04 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=wrW/Tw+fzY4lh5Sk9yEJSGX1ZVgOSoYBH6s+EdHCLNI=; b=BrEbMv9x6I4UYjjt578PfVTp5NTnY4KulJP7eMueszsSXLHs4JDW+q4ZAyAKMqjTvF puiyrdvzPlBAiwdmNIIW1JgZLQmwqpf1HQKpYe4XIxCo3h9yH+fI7WwaV+24LQbbvXKC Cf5caroStFOxthkkauY/DLdJ8IMtUoSB5P6dBHYvMcLzDBtEiNN2lmQLmLRgNHnZDPF4 pUrvZXk3D9HSnOqtg2j8GE6Tay3hCT9rwg++eGWXhJvY+mPa8diiJOPxO7ndFZY8E4HX FOT4oD8TMq52Ue57ktHpOuzDAYRTLU1prnFWQ80j78ucP0Sz1Anr6S18LF5vQSXNtL/2 Qgng== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=wrW/Tw+fzY4lh5Sk9yEJSGX1ZVgOSoYBH6s+EdHCLNI=; b=rSdPbr7VZ6IQJmG+Y5Vvr6cKzBxYxtx1gV7GG+H1/z1owLV7EJNt5mFXwW8oNN6p09 X0UWFdh308weFUqGW6pAVRo5DfvVEXI/AOKRJENwZ+cTaAS/km0WRGvKwRm8HvIiJXat jkinAPpRW8olzqOxMutjo6flmvBpmyA+ga0qGurI6vwy4gbALL1oKbw34oY4Bp/emHNo RQp+7XD8lftfmLiiNer+lvhBHxQxfmdVixojoIph8mVKDDk5KQKflh90yDcSXiD1krpV Fnyvh/AKG/wVCwzaD81wN52tyE1DMJukxOkB0p+85oFp0gwoVb1f55cNauEDvZnKeS0P w9sQ== X-Gm-Message-State: ANhLgQ0xh8miLRqarlJKmc3m3IdJZP7VengMUQE3tRkmjOd/1p90+8uc Jmq92qy4RxZnapSnB4Kw4UZ1jJAFTv4= X-Received: by 2002:a62:3305:: with SMTP id z5mr11968336pfz.297.1585264143340; Thu, 26 Mar 2020 16:09:03 -0700 (PDT) Received: from localhost.localdomain 
(174-21-138-234.tukw.qwest.net. [174.21.138.234]) by smtp.gmail.com with ESMTPSA id i187sm2530037pfg.33.2020.03.26.16.09.02 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 26 Mar 2020 16:09:02 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH 18/31] target/arm: Implement SVE2 bitwise exclusive-or interleaved Date: Thu, 26 Mar 2020 16:08:25 -0700 Message-Id: <20200326230838.31112-19-richard.henderson@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200326230838.31112-1-richard.henderson@linaro.org> References: <20200326230838.31112-1-richard.henderson@linaro.org> MIME-Version: 1.0 X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:4864:20::42b X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: rajav@quicinc.com, qemu-arm@nongnu.org, apazos@quicinc.com Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 5 +++++ target/arm/sve.decode | 5 +++++ target/arm/sve_helper.c | 20 ++++++++++++++++++++ target/arm/translate-sve.c | 19 +++++++++++++++++++ 4 files changed, 49 insertions(+) -- 2.20.1 diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 9c0c41ba80..9e894a2b55 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2380,3 +2380,8 @@ DEF_HELPER_FLAGS_3(sve2_sshll_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve2_ushll_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve2_ushll_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve2_ushll_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_eoril_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_eoril_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_eoril_s, TCG_CALL_NO_RWG, void, 
ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_eoril_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 36ef9de563..8af35e48a5 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1215,3 +1215,8 @@ SSHLLB 01000101 .. 0 ..... 1010 00 ..... ..... @rd_rn_tszimm_shl SSHLLT 01000101 .. 0 ..... 1010 01 ..... ..... @rd_rn_tszimm_shl USHLLB 01000101 .. 0 ..... 1010 10 ..... ..... @rd_rn_tszimm_shl USHLLT 01000101 .. 0 ..... 1010 11 ..... ..... @rd_rn_tszimm_shl + +## SVE2 bitwise exclusive-or interleaved + +EORBT 01000101 .. 0 ..... 10010 0 ..... ..... @rd_rn_rm +EORTB 01000101 .. 0 ..... 10010 1 ..... ..... @rd_rn_rm diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index e0a701c446..15ea1fd524 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1196,6 +1196,26 @@ DO_ZZZ_WTB(sve2_usubw_d, uint64_t, uint32_t, DO_SUB) #undef DO_ZZZ_WTB +#define DO_ZZZ_NTB(NAME, TYPE, H, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ +{ \ + intptr_t i, opr_sz = simd_oprsz(desc); \ + intptr_t sel1 = extract32(desc, SIMD_DATA_SHIFT, 1) * sizeof(TYPE); \ + intptr_t sel2 = extract32(desc, SIMD_DATA_SHIFT + 1, 1) * sizeof(TYPE); \ + for (i = 0; i < opr_sz; i += 2 * sizeof(TYPE)) { \ + TYPE nn = *(TYPE *)(vn + H(i + sel1)); \ + TYPE mm = *(TYPE *)(vm + H(i + sel2)); \ + *(TYPE *)(vd + H(i + sel1)) = OP(nn, mm); \ + } \ +} + +DO_ZZZ_NTB(sve2_eoril_b, uint8_t, H1, DO_EOR) +DO_ZZZ_NTB(sve2_eoril_h, uint16_t, H1_2, DO_EOR) +DO_ZZZ_NTB(sve2_eoril_s, uint32_t, H1_4, DO_EOR) +DO_ZZZ_NTB(sve2_eoril_d, uint64_t, , DO_EOR) + +#undef DO_ZZZ_NTB + #define DO_ZZI_SHLL(NAME, TYPE, TYPEN, OP) \ void HELPER(NAME)(void *vd, void *vn, uint32_t desc) \ { \ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 9873b83feb..3eaf9cbe51 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -6064,6 +6064,25 @@ DO_SVE2_ZZZ_TB(SMULLT_zzz, smull_zzz, true, 
true) DO_SVE2_ZZZ_TB(UMULLB_zzz, umull_zzz, false, false) DO_SVE2_ZZZ_TB(UMULLT_zzz, umull_zzz, true, true) +static bool do_eor_tb(DisasContext *s, arg_rrr_esz *a, bool sel1) +{ + static gen_helper_gvec_3 * const fns[4] = { + gen_helper_sve2_eoril_b, gen_helper_sve2_eoril_h, + gen_helper_sve2_eoril_s, gen_helper_sve2_eoril_d, + }; + return do_sve2_zzw_ool(s, a, fns[a->esz], (!sel1 << 1) | sel1); +} + +static bool trans_EORBT(DisasContext *s, arg_rrr_esz *a) +{ + return do_eor_tb(s, a, false); +} + +static bool trans_EORTB(DisasContext *s, arg_rrr_esz *a) +{ + return do_eor_tb(s, a, true); +} + static bool do_trans_pmull(DisasContext *s, arg_rrr_esz *a, bool sel) { static gen_helper_gvec_3 * const fns[4] = { From patchwork Thu Mar 26 23:08:26 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 184956 Delivered-To: patch@linaro.org Received: by 2002:a92:de47:0:0:0:0:0 with SMTP id e7csp647680ilr; Thu, 26 Mar 2020 16:21:27 -0700 (PDT) X-Google-Smtp-Source: ADFU+vuKhhi0MDR3GB9IOeMoRzwVxFReqojp9SqnAa6tTYQdkMQSIAZdbFQYNzQ+NslPo4dZhP3v X-Received: by 2002:aed:2ee1:: with SMTP id k88mr11632492qtd.268.1585264887383; Thu, 26 Mar 2020 16:21:27 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1585264887; cv=none; d=google.com; s=arc-20160816; b=gyJXEYDdzB3Rg11IKX3p9uAGRxrYU7ppm/YLx7aP+7jOAxOD9YZre8RhaBWAcn5w3H 2LCwSbUXeeEgDS+v18oGmwUULRrhPzmoKnBItFY1zxUGFuLjdgyyxfb7708bElWQwbrF xE2rEpAm5G7hL1Hknvds7GldEel4fmJ8Qe/pujh4s7qDno7UIBXswUilpsSqu2F08bBg WTN8fEObOqngUpRCdDfKOzffZEgCXgVYzEew9rIYGiBxPSJtsL26YP7nDVJYj9o5F0Qs g8ikOywHZ0ppeMHIvMr34k1rTQIgL4RzT41bxj4UWP/K8xd1z4Gsvu/80QncYqSDpQ8B TK5A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:to:from 
:dkim-signature; bh=/qWmMRJyiGABbx/G2niY2yMmtnq+8Op3LMQTaXLdldI=; b=QGTTvgB5NCUkxWP/h9LhtvvrcQsHUtJdOkm0/QwTCDlAviTHY6J6bvduxhCtTNWN5G tWCXy3PiQnsa0U0bzpZGuYwpch2ODv9vSm/Nl0OqUFApN1gA+bqY1weIDPDHmsbbViQQ JtLqS4J8Vm2w0jx2K8vNlMKSLuosht6q6ZkIfzPCZiSNEcjkSFWLhrpxRpyBMmcU0jpW wgPRa9iikJCEispoWmgJmLTQgaR/46fhXWab7Jn4L3rOAyaS4Bcwc7oYsEIJsDYLqq4H Bml8Kqiyhse8Mc5LGWIBAUGrvV8nbRh0EUh11PxR5wLs5QJXhC4xevtYqb28P6VbJX+u k7GA== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=IGICM+am; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [209.51.188.17]) by mx.google.com with ESMTPS id a42si2464462qtk.266.2020.03.26.16.21.27 for (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Thu, 26 Mar 2020 16:21:27 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=IGICM+am; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:34830 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jHboU-0005C1-T7 for patch@linaro.org; Thu, 26 Mar 2020 19:21:26 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:58899) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jHbcZ-0002jY-RV for qemu-devel@nongnu.org; Thu, 26 Mar 2020 19:09:09 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned 
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH 19/31] target/arm: Implement SVE2 bitwise permute
Date: Thu, 26 Mar 2020 16:08:26 -0700
Message-Id: <20200326230838.31112-20-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20200326230838.31112-1-richard.henderson@linaro.org>
References: <20200326230838.31112-1-richard.henderson@linaro.org>
Cc: rajav@quicinc.com, qemu-arm@nongnu.org, apazos@quicinc.com

Signed-off-by: Richard Henderson
---
 target/arm/cpu.h           |  5 +++
 target/arm/helper-sve.h    | 15 ++++++++
 target/arm/sve.decode      |  6 ++++
 target/arm/sve_helper.c    | 73 ++++++++++++++++++++++++++++++++++++++
 target/arm/translate-sve.c | 36 +++++++++++++++++++
 5 files changed, 135 insertions(+)

--
2.20.1

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 2e9d9f046d..b7c7946771 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -3865,6 +3865,11 @@ static inline bool isar_feature_aa64_sve2_pmull128(const ARMISARegisters *id)
     return FIELD_EX64(id->id_aa64zfr0, ID_AA64ZFR0, AES) >= 2;
 }
 
+static inline bool isar_feature_aa64_sve2_bitperm(const ARMISARegisters *id)
+{
+    return FIELD_EX64(id->id_aa64zfr0, ID_AA64ZFR0, BITPERM) != 0;
+}
+
 /*
  * Feature tests for "does this exist in either 32-bit or 64-bit?"
  */
diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index 9e894a2b55..466b01986f 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -2385,3 +2385,18 @@ DEF_HELPER_FLAGS_4(sve2_eoril_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(sve2_eoril_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(sve2_eoril_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(sve2_eoril_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(sve2_bext_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve2_bext_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve2_bext_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve2_bext_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(sve2_bdep_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve2_bdep_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve2_bdep_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve2_bdep_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(sve2_bgrp_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve2_bgrp_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve2_bgrp_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve2_bgrp_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index 8af35e48a5..ca60e9f2ce 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -1220,3 +1220,9 @@ USHLLT          01000101 .. 0 ..... 1010 11 ..... .....  @rd_rn_tszimm_shl
 
 EORBT           01000101 .. 0 ..... 10010 0 ..... .....  @rd_rn_rm
 EORTB           01000101 .. 0 ..... 10010 1 ..... .....  @rd_rn_rm
+
+## SVE2 bitwise permute
+
+BEXT            01000101 .. 0 ..... 1011 00 ..... .....  @rd_rn_rm
+BDEP            01000101 .. 0 ..... 1011 01 ..... .....  @rd_rn_rm
+BGRP            01000101 .. 0 ..... 1011 10 ..... .....  @rd_rn_rm
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index 15ea1fd524..b5afa34efe 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -1216,6 +1216,79 @@ DO_ZZZ_NTB(sve2_eoril_d, uint64_t,     , DO_EOR)
 
 #undef DO_ZZZ_NTB
 
+#define DO_BITPERM(NAME, TYPE, OP) \
+void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \
+{ \
+    intptr_t i, opr_sz = simd_oprsz(desc); \
+    for (i = 0; i < opr_sz; i += sizeof(TYPE)) { \
+        TYPE nn = *(TYPE *)(vn + i); \
+        TYPE mm = *(TYPE *)(vm + i); \
+        *(TYPE *)(vd + i) = OP(nn, mm, sizeof(TYPE) * 8); \
+    } \
+}
+
+static uint64_t bitextract(uint64_t data, uint64_t mask, int n)
+{
+    uint64_t res = 0;
+    int db, rb = 0;
+
+    for (db = 0; db < n; ++db) {
+        if ((mask >> db) & 1) {
+            res |= ((data >> db) & 1) << rb;
+            ++rb;
+        }
+    }
+    return res;
+}
+
+DO_BITPERM(sve2_bext_b, uint8_t, bitextract)
+DO_BITPERM(sve2_bext_h, uint16_t, bitextract)
+DO_BITPERM(sve2_bext_s, uint32_t, bitextract)
+DO_BITPERM(sve2_bext_d, uint64_t, bitextract)
+
+static uint64_t bitdeposit(uint64_t data, uint64_t mask, int n)
+{
+    uint64_t res = 0;
+    int rb, db = 0;
+
+    for (rb = 0; rb < n; ++rb) {
+        if ((mask >> rb) & 1) {
+            res |= ((data >> db) & 1) << rb;
+            ++db;
+        }
+    }
+    return res;
+}
+
+DO_BITPERM(sve2_bdep_b, uint8_t, bitdeposit)
+DO_BITPERM(sve2_bdep_h, uint16_t, bitdeposit)
+DO_BITPERM(sve2_bdep_s, uint32_t, bitdeposit)
+DO_BITPERM(sve2_bdep_d, uint64_t, bitdeposit)
+
+static uint64_t bitgroup(uint64_t data, uint64_t mask, int n)
+{
+    uint64_t resm = 0, resu = 0;
+    int db, rbm = 0, rbu = 0;
+
+    for (db = 0; db < n; ++db) {
+        uint64_t val = (data >> db) & 1;
+        if ((mask >> db) & 1) {
+            resm |= val << rbm++;
+        } else {
+            resu |= val << rbu++;
+        }
+    }
+
+    return resm | (resu << rbm);
+}
+
+DO_BITPERM(sve2_bgrp_b, uint8_t, bitgroup)
+DO_BITPERM(sve2_bgrp_h, uint16_t, bitgroup)
+DO_BITPERM(sve2_bgrp_s, uint32_t, bitgroup)
+DO_BITPERM(sve2_bgrp_d, uint64_t, bitgroup)
+
+#undef DO_BITPERM
+
 #define DO_ZZI_SHLL(NAME, TYPE, TYPEN, OP) \
 void HELPER(NAME)(void *vd, void *vn, uint32_t desc) \
 { \
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index 3eaf9cbe51..375b9dc983 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -6173,3 +6173,39 @@ static bool trans_USHLLT(DisasContext *s, arg_rri_esz *a)
 {
     return do_sve2_shll_tb(s, a, true, true);
 }
+
+static bool trans_BEXT(DisasContext *s, arg_rrr_esz *a)
+{
+    static gen_helper_gvec_3 * const fns[4] = {
+        gen_helper_sve2_bext_b, gen_helper_sve2_bext_h,
+        gen_helper_sve2_bext_s, gen_helper_sve2_bext_d,
+    };
+    if (!dc_isar_feature(aa64_sve2_bitperm, s)) {
+        return false;
+    }
+    return do_sve2_zzw_ool(s, a, fns[a->esz], 0);
+}
+
+static bool trans_BDEP(DisasContext *s, arg_rrr_esz *a)
+{
+    static gen_helper_gvec_3 * const fns[4] = {
+        gen_helper_sve2_bdep_b, gen_helper_sve2_bdep_h,
+        gen_helper_sve2_bdep_s, gen_helper_sve2_bdep_d,
+    };
+    if (!dc_isar_feature(aa64_sve2_bitperm, s)) {
+        return false;
+    }
+    return do_sve2_zzw_ool(s, a, fns[a->esz], 0);
+}
+
+static bool trans_BGRP(DisasContext *s, arg_rrr_esz *a)
+{
+    static gen_helper_gvec_3 * const fns[4] = {
+        gen_helper_sve2_bgrp_b, gen_helper_sve2_bgrp_h,
+        gen_helper_sve2_bgrp_s, gen_helper_sve2_bgrp_d,
+    };
+    if (!dc_isar_feature(aa64_sve2_bitperm, s)) {
+        return false;
+    }
+    return do_sve2_zzw_ool(s, a, fns[a->esz], 0);
+}

From patchwork Thu Mar 26 23:08:27 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 184957
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH 20/31] target/arm: Implement SVE2 complex integer add
Date: Thu, 26 Mar 2020 16:08:27 -0700
Message-Id: <20200326230838.31112-21-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20200326230838.31112-1-richard.henderson@linaro.org>
References: <20200326230838.31112-1-richard.henderson@linaro.org>
Cc: rajav@quicinc.com, qemu-arm@nongnu.org, apazos@quicinc.com

Signed-off-by: Richard Henderson
---
 target/arm/helper-sve.h    | 10 +++++++++
 target/arm/sve.decode      |  9 ++++++++
 target/arm/sve_helper.c    | 42 ++++++++++++++++++++++++++++++++++++++
 target/arm/translate-sve.c | 31 ++++++++++++++++++++++++++++
 4 files changed, 92 insertions(+)

--
2.20.1

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index 466b01986f..0e4b4c48da 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -2400,3 +2400,13 @@ DEF_HELPER_FLAGS_4(sve2_bgrp_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(sve2_bgrp_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(sve2_bgrp_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(sve2_bgrp_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(sve2_cadd_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve2_cadd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve2_cadd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve2_cadd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(sve2_sqcadd_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve2_sqcadd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve2_sqcadd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(sve2_sqcadd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index ca60e9f2ce..5fb4b5f977 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -1226,3 +1226,12 @@ EORTB           01000101 .. 0 ..... 10010 1 ..... .....  @rd_rn_rm
 BEXT            01000101 .. 0 ..... 1011 00 ..... .....  @rd_rn_rm
 BDEP            01000101 .. 0 ..... 1011 01 ..... .....  @rd_rn_rm
 BGRP            01000101 .. 0 ..... 1011 10 ..... .....  @rd_rn_rm
+
+#### SVE2 Accumulate
+
+## SVE2 complex integer add
+
+CADD_rot90      01000101 .. 00000 0 11011 0 ..... .....  @rdn_rm
+CADD_rot270     01000101 .. 00000 0 11011 1 ..... .....  @rdn_rm
+SQCADD_rot90    01000101 .. 00000 1 11011 0 ..... .....  @rdn_rm
+SQCADD_rot270   01000101 .. 00000 1 11011 1 ..... .....  @rdn_rm
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index b5afa34efe..a3653007ac 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -1289,6 +1289,48 @@ DO_BITPERM(sve2_bgrp_d, uint64_t, bitgroup)
 
 #undef DO_BITPERM
 
+#define DO_CADD(NAME, TYPE, H, ADD_OP, SUB_OP) \
+void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \
+{ \
+    intptr_t i, opr_sz = simd_oprsz(desc); \
+    int sub_r = simd_data(desc); \
+    if (sub_r) { \
+        for (i = 0; i < opr_sz; i += 2 * sizeof(TYPE)) { \
+            TYPE acc_r = *(TYPE *)(vn + H(i)); \
+            TYPE acc_i = *(TYPE *)(vn + H(i + sizeof(TYPE))); \
+            TYPE el2_r = *(TYPE *)(vm + H(i)); \
+            TYPE el2_i = *(TYPE *)(vm + H(i + sizeof(TYPE))); \
+            acc_r = SUB_OP(acc_r, el2_i); \
+            acc_i = ADD_OP(acc_i, el2_r); \
+            *(TYPE *)(vd + H(i)) = acc_r; \
+            *(TYPE *)(vd + H(i + sizeof(TYPE))) = acc_i; \
+        } \
+    } else { \
+        for (i = 0; i < opr_sz; i += 2 * sizeof(TYPE)) { \
+            TYPE acc_r = *(TYPE *)(vn + H(i)); \
+            TYPE acc_i = *(TYPE *)(vn + H(i + sizeof(TYPE))); \
+            TYPE el2_r = *(TYPE *)(vm + H(i)); \
+            TYPE el2_i = *(TYPE *)(vm + H(i + sizeof(TYPE))); \
+            acc_r = ADD_OP(acc_r, el2_i); \
+            acc_i = SUB_OP(acc_i, el2_r); \
+            *(TYPE *)(vd + H(i)) = acc_r; \
+            *(TYPE *)(vd + H(i + sizeof(TYPE))) = acc_i; \
+        } \
+    } \
+}
+
+DO_CADD(sve2_cadd_b, int8_t, H1, DO_ADD, DO_SUB)
+DO_CADD(sve2_cadd_h, int16_t, H1_2, DO_ADD, DO_SUB)
+DO_CADD(sve2_cadd_s, int32_t, H1_4, DO_ADD, DO_SUB)
+DO_CADD(sve2_cadd_d, int64_t, , DO_ADD, DO_SUB)
+
+DO_CADD(sve2_sqcadd_b, int8_t, H1, DO_SQADD_B, DO_SQSUB_B)
+DO_CADD(sve2_sqcadd_h, int16_t, H1_2, DO_SQADD_H, DO_SQSUB_H)
+DO_CADD(sve2_sqcadd_s, int32_t, H1_4, DO_SQADD_S, DO_SQSUB_S)
+DO_CADD(sve2_sqcadd_d, int64_t, , do_sqadd_d, do_sqsub_d)
+
+#undef DO_CADD
+
 #define DO_ZZI_SHLL(NAME, TYPE, TYPEN, OP) \
 void HELPER(NAME)(void *vd, void *vn, uint32_t desc) \
 { \
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index 375b9dc983..3b0aa86e79 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -6209,3 +6209,34 @@ static bool trans_BGRP(DisasContext *s, arg_rrr_esz *a)
     }
     return do_sve2_zzw_ool(s, a, fns[a->esz], 0);
 }
+
+static bool do_cadd(DisasContext *s, arg_rrr_esz *a, bool sq, bool rot)
+{
+    static gen_helper_gvec_3 * const fns[2][4] = {
+        { gen_helper_sve2_cadd_b, gen_helper_sve2_cadd_h,
+          gen_helper_sve2_cadd_s, gen_helper_sve2_cadd_d },
+        { gen_helper_sve2_sqcadd_b, gen_helper_sve2_sqcadd_h,
+          gen_helper_sve2_sqcadd_s, gen_helper_sve2_sqcadd_d },
+    };
+    return do_sve2_zzw_ool(s, a, fns[sq][a->esz], rot);
+}
+
+static bool trans_CADD_rot90(DisasContext *s, arg_rrr_esz *a)
+{
+    return do_cadd(s, a, false, false);
+}
+
+static bool trans_CADD_rot270(DisasContext *s, arg_rrr_esz *a)
+{
+    return do_cadd(s, a, false, true);
+}
+
+static bool trans_SQCADD_rot90(DisasContext *s, arg_rrr_esz *a)
+{
+    return do_cadd(s, a, true, false);
+}
+
+static bool trans_SQCADD_rot270(DisasContext *s, arg_rrr_esz *a)
+{
+    return do_cadd(s, a, true, true);
+}

From patchwork Thu Mar 26 23:08:28 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 184964
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH 21/31] target/arm: Implement SVE2 integer absolute difference and accumulate long
Date: Thu, 26 Mar 2020 16:08:28 -0700
Message-Id: <20200326230838.31112-22-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20200326230838.31112-1-richard.henderson@linaro.org>
References: <20200326230838.31112-1-richard.henderson@linaro.org>
Cc: rajav@quicinc.com, qemu-arm@nongnu.org, apazos@quicinc.com

Signed-off-by: Richard Henderson
---
 target/arm/helper-sve.h    | 14 ++++++++++
 target/arm/sve.decode      | 12 +++++++++
 target/arm/sve_helper.c    | 24 +++++++++++++++++
 target/arm/translate-sve.c | 54 ++++++++++++++++++++++++++++++++++++++
 4 files changed, 104 insertions(+)

--
2.20.1

diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h
index 0e4b4c48da..b48a88135f 100644
--- a/target/arm/helper-sve.h
+++ b/target/arm/helper-sve.h
@@ -2410,3 +2410,17 @@ DEF_HELPER_FLAGS_4(sve2_sqcadd_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(sve2_sqcadd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(sve2_sqcadd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(sve2_sqcadd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(sve2_sabal_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve2_sabal_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve2_sabal_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(sve2_uabal_h, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve2_uabal_s, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(sve2_uabal_d, TCG_CALL_NO_RWG,
+                   void, ptr, ptr, ptr, ptr, i32)
diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index 5fb4b5f977..f66a6c242f 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -70,6 +70,7 @@
 &rpr_s          rd pg rn s
 &rprr_s         rd pg rn rm s
 &rprr_esz       rd pg rn rm esz
+&rrrr_esz       rd ra rn rm esz
 &rprrr_esz      rd pg rn rm ra esz
 &rpri_esz       rd pg rn imm esz
 &ptrue          rd esz pat s
@@ -120,6 +121,10 @@
 @rdn_i8s        ........ esz:2 ...... ... imm:s8 rd:5 \
                 &rri_esz rn=%reg_movprfx
 
+# Four operand, vector element size
+@rda_rn_rm      ........ esz:2 . rm:5 ... ... rn:5 rd:5 \
+                &rrrr_esz ra=%reg_movprfx
+
 # Three operand with "memory" size, aka immediate left shift
 @rd_rn_msz_rm   ........ ... rm:5 .... imm:2 rn:5 rd:5 &rrri
 
@@ -1235,3 +1240,10 @@ CADD_rot90      01000101 .. 00000 0 11011 0 ..... .....  @rdn_rm
 CADD_rot270     01000101 .. 00000 0 11011 1 ..... .....  @rdn_rm
 SQCADD_rot90    01000101 .. 00000 1 11011 0 ..... .....  @rdn_rm
 SQCADD_rot270   01000101 .. 00000 1 11011 1 ..... .....  @rdn_rm
+
+## SVE2 integer absolute difference and accumulate long
+
+SABALB          01000101 .. 0 ..... 1100 00 ..... .....  @rda_rn_rm
+SABALT          01000101 .. 0 ..... 1100 01 ..... .....  @rda_rn_rm
+UABALB          01000101 .. 0 ..... 1100 10 ..... .....  @rda_rn_rm
+UABALT          01000101 .. 0 ..... 1100 11 ..... .....  @rda_rn_rm
diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c
index a3653007ac..a0995d95c7 100644
--- a/target/arm/sve_helper.c
+++ b/target/arm/sve_helper.c
@@ -1216,6 +1216,30 @@ DO_ZZZ_NTB(sve2_eoril_d, uint64_t,     , DO_EOR)
 
 #undef DO_ZZZ_NTB
 
+#define DO_ABAL(NAME, TYPE, TYPEN) \
+void HELPER(NAME)(void *vd, void *va, void *vn, void *vm, uint32_t desc) \
+{ \
+    intptr_t i, opr_sz = simd_oprsz(desc); \
+    int sel1 = (simd_data(desc) & 1) * sizeof(TYPE); \
+    int sel2 = (simd_data(desc) & 2) * (sizeof(TYPE) / 2); \
+    for (i = 0; i < opr_sz; i += sizeof(TYPE)) { \
+        TYPE nn = (TYPEN)(*(TYPE *)(vn + i) >> sel1); \
+        TYPE mm = (TYPEN)(*(TYPE *)(vm + i) >> sel2); \
+        TYPE aa = *(TYPE *)(va + i); \
+        *(TYPE *)(vd + i) = DO_ABD(nn, mm) + aa; \
+    } \
+}
+
+DO_ABAL(sve2_sabal_h, int16_t, int8_t)
+DO_ABAL(sve2_sabal_s, int32_t, int16_t)
+DO_ABAL(sve2_sabal_d, int64_t, int32_t)
+
+DO_ABAL(sve2_uabal_h, uint16_t, uint8_t)
+DO_ABAL(sve2_uabal_s, uint32_t, uint16_t)
+DO_ABAL(sve2_uabal_d, uint64_t, uint32_t)
+
+#undef DO_ABAL
+
 #define DO_BITPERM(NAME, TYPE, OP) \
 void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \
 { \
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index 3b0aa86e79..c6161d2ce2 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -6240,3 +6240,57 @@ static bool trans_SQCADD_rot270(DisasContext *s, arg_rrr_esz *a)
 {
     return do_cadd(s, a, true, true);
 }
+
+static bool do_sve2_zzzz_ool(DisasContext *s, arg_rrrr_esz *a,
+                             gen_helper_gvec_4 *fn, int data)
+{
+    if (fn == NULL || !dc_isar_feature(aa64_sve2, s)) {
+        return false;
+    }
+    if (sve_access_check(s)) {
+        unsigned vsz = vec_full_reg_size(s);
+        tcg_gen_gvec_4_ool(vec_full_reg_offset(s, a->rd),
+                           vec_full_reg_offset(s, a->ra),
+                           vec_full_reg_offset(s, a->rn),
+                           vec_full_reg_offset(s, a->rm),
+                           vsz, vsz, data, fn);
+    }
+    return true;
+}
+
+static bool do_abal(DisasContext *s, arg_rrrr_esz *a, bool uns, bool sel)
+{
+    static gen_helper_gvec_4 * const fns[2][3] = {
+        { gen_helper_sve2_sabal_h,
+          gen_helper_sve2_sabal_s,
+          gen_helper_sve2_sabal_d },
+        { gen_helper_sve2_uabal_h,
+          gen_helper_sve2_uabal_s,
+          gen_helper_sve2_uabal_d },
+    };
+
+    if (a->esz == 0) {
+        return false;
+    }
+    return do_sve2_zzzz_ool(s, a, fns[uns][a->esz - 1], sel);
+}
+
+static bool trans_SABALB(DisasContext *s, arg_rrrr_esz *a)
+{
+    return do_abal(s, a, false, false);
+}
+
+static bool trans_SABALT(DisasContext *s, arg_rrrr_esz *a)
+{
+    return do_abal(s, a, false, true);
+}
+
+static bool trans_UABALB(DisasContext *s, arg_rrrr_esz *a)
+{
+    return do_abal(s, a, true, false);
+}
+
+static bool trans_UABALT(DisasContext *s, arg_rrrr_esz *a)
+{
+    return do_abal(s, a, true, true);
+}

From patchwork Thu Mar 26 23:08:29 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 184952
From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH 22/31] target/arm: Implement SVE2 integer add/subtract long with carry Date: Thu, 26 Mar 2020 16:08:29 -0700 Message-Id: <20200326230838.31112-23-richard.henderson@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200326230838.31112-1-richard.henderson@linaro.org> References: <20200326230838.31112-1-richard.henderson@linaro.org> MIME-Version: 1.0
Cc: rajav@quicinc.com, qemu-arm@nongnu.org, apazos@quicinc.com Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 3 +++ target/arm/sve.decode | 6 ++++++ target/arm/sve_helper.c | 33 +++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 23 +++++++++++++++++++++++ 4 files changed, 65 insertions(+) -- 2.20.1 diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index b48a88135f..cfc1357613 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2424,3 +2424,6 @@ DEF_HELPER_FLAGS_5(sve2_uabal_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve2_uabal_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_adcl_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_adcl_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index f66a6c242f..5d46e3ab45 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1247,3 +1247,9 @@ SABALB 01000101 .. 0 ..... 1100 00 ..... ..... @rda_rn_rm SABALT 01000101 .. 0 ..... 1100 01 ..... ..... @rda_rn_rm UABALB 01000101 .. 0 ..... 1100 10 ..... ..... @rda_rn_rm UABALT 01000101 .. 0 ..... 1100 11 ..... ..... @rda_rn_rm + +## SVE2 integer add/subtract long with carry + +# ADC and SBC decoded via size in helper dispatch. +ADCLB 01000101 .. 0 ..... 11010 0 ..... ..... @rda_rn_rm +ADCLT 01000101 .. 0 ..... 11010 1 ..... .....
@rda_rn_rm diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index a0995d95c7..aa330f75c3 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1240,6 +1240,39 @@ DO_ABAL(sve2_uabal_d, uint64_t, uint32_t) #undef DO_ABAL +void HELPER(sve2_adcl_s)(void *vd, void *va, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + int sel = extract32(desc, SIMD_DATA_SHIFT, 1) * 32; + uint32_t inv = -extract32(desc, SIMD_DATA_SHIFT + 1, 1); + uint64_t *d = vd, *a = va, *n = vn, *m = vm; + + for (i = 0; i < opr_sz / 8; ++i) { + uint32_t e1 = (uint32_t)a[i]; + uint32_t e2 = (uint32_t)(n[i] >> sel) ^ inv; + uint64_t c = extract64(m[i], 32, 1); + /* Compute and store the entire 33-bit result at once. */ + d[i] = c + e1 + e2; + } +} + +void HELPER(sve2_adcl_d)(void *vd, void *va, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + int sel = extract32(desc, SIMD_DATA_SHIFT, 1); + uint64_t inv = -(uint64_t)extract32(desc, SIMD_DATA_SHIFT + 1, 1); + uint64_t *d = vd, *a = va, *n = vn, *m = vm; + + for (i = 0; i < opr_sz / 8; i += 2) { + Int128 e1 = int128_make64(a[i]); + Int128 e2 = int128_make64(n[i + sel] ^ inv); + Int128 c = int128_make64(m[i + 1] & 1); + Int128 r = int128_add(int128_add(e1, e2), c); + d[i + 0] = int128_getlo(r); + d[i + 1] = int128_gethi(r); + } +} + #define DO_BITPERM(NAME, TYPE, OP) \ void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ { \ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index c6161d2ce2..a80765a978 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -6294,3 +6294,26 @@ static bool trans_UABALT(DisasContext *s, arg_rrrr_esz *a) { return do_abal(s, a, true, true); } + +static bool do_adcl(DisasContext *s, arg_rrrr_esz *a, bool sel) +{ + static gen_helper_gvec_4 * const fns[2] = { + gen_helper_sve2_adcl_s, + gen_helper_sve2_adcl_d, + }; + /* + * Note that in this case the ESZ field encodes both size and
sign. + * Split out 'subtract' into bit 1 of the data field for the helper. + */ + return do_sve2_zzzz_ool(s, a, fns[a->esz & 1], (a->esz & 2) | sel); +} + +static bool trans_ADCLB(DisasContext *s, arg_rrrr_esz *a) +{ + return do_adcl(s, a, false); +} + +static bool trans_ADCLT(DisasContext *s, arg_rrrr_esz *a) +{ + return do_adcl(s, a, true); +}
From patchwork Thu Mar 26 23:08:30 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 184959
From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH 23/31] target/arm: Create arm_gen_gvec_[us]sra Date: Thu, 26 Mar 2020 16:08:30 -0700 Message-Id: <20200326230838.31112-24-richard.henderson@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200326230838.31112-1-richard.henderson@linaro.org> References: <20200326230838.31112-1-richard.henderson@linaro.org> Cc: rajav@quicinc.com, qemu-arm@nongnu.org, apazos@quicinc.com

The functions eliminate duplication of the special cases for this operation. They match up with the GVecGen2iFn typedef. Add out-of-line helpers. We got away with only having inline expanders because the neon vector size is only 16 bytes, and we know that the inline expansion will always succeed. When we reuse this for SVE, tcg-gvec-op may decide to use an out-of-line helper due to longer vector lengths.
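For reference, the special cases that these functions fold in can be modelled one lane at a time in plain C. The sketch below is only an illustration and is not code from this series: ref_ssra8 and ref_usra8 are hypothetical names, only 8-bit lanes are covered, and it assumes the compiler implements >> on negative values as an arithmetic shift and wraps on narrowing conversions (as GCC and Clang do).

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical scalar model of one 8-bit SSRA lane (not QEMU code).
 * A shift equal to the element size yields all sign bits, which is
 * the same value as shifting by esize - 1, so the count is clamped.
 */
static int8_t ref_ssra8(int8_t d, int8_t n, int shift)
{
    assert(shift >= 1 && shift <= 8);   /* immediates lie in [1..esize] */
    if (shift > 7) {
        shift = 7;
    }
    /* Assumes arithmetic right shift of negative values. */
    return (int8_t)(d + (n >> shift));
}

/*
 * Hypothetical scalar model of one 8-bit USRA lane (not QEMU code).
 * A shift equal to the element size adds zero, so the destination
 * is returned unchanged.
 */
static uint8_t ref_usra8(uint8_t d, uint8_t n, int shift)
{
    assert(shift >= 1 && shift <= 8);
    if (shift == 8) {
        return d;
    }
    return (uint8_t)(d + (n >> shift));
}
```

A model like this could serve as an oracle when comparing the inline expansion against the out-of-line helper for the same inputs; the real helpers additionally clear the tail of the vector register.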
Signed-off-by: Richard Henderson --- target/arm/helper.h | 10 +++ target/arm/translate.h | 7 +- target/arm/translate-a64.c | 15 +--- target/arm/translate.c | 161 ++++++++++++++++++++++--------------- target/arm/vec_helper.c | 25 ++++++ 5 files changed, 139 insertions(+), 79 deletions(-) -- 2.20.1 diff --git a/target/arm/helper.h b/target/arm/helper.h index 938fdbc362..dc6a43dbd8 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -708,6 +708,16 @@ DEF_HELPER_FLAGS_4(gvec_pmull_q, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(neon_pmull_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_ssra_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_ssra_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_ssra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_ssra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(gvec_usra_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_usra_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_usra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_usra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + #ifdef TARGET_AARCH64 #include "helper-a64.h" #include "helper-sve.h" diff --git a/target/arm/translate.h b/target/arm/translate.h index 5552ee5a94..1c5cdf13e3 100644 --- a/target/arm/translate.h +++ b/target/arm/translate.h @@ -291,8 +291,6 @@ extern const GVecGen3 mls_op[4]; extern const GVecGen3 cmtst_op[4]; extern const GVecGen3 sshl_op[4]; extern const GVecGen3 ushl_op[4]; -extern const GVecGen2i ssra_op[4]; -extern const GVecGen2i usra_op[4]; extern const GVecGen2i sri_op[4]; extern const GVecGen2i sli_op[4]; extern const GVecGen4 uqadd_op[4]; @@ -305,6 +303,11 @@ void gen_sshl_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b); void gen_ushl_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b); void gen_sshl_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b); +void arm_gen_gvec_ssra(unsigned vece, uint32_t rd_ofs, uint32_t 
rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz); +void arm_gen_gvec_usra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz); + /* * Forward to the isar_feature_* tests given a DisasContext pointer. */ diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c index 2bcf643069..d50207fcfb 100644 --- a/target/arm/translate-a64.c +++ b/target/arm/translate-a64.c @@ -10682,19 +10682,8 @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u, switch (opcode) { case 0x02: /* SSRA / USRA (accumulate) */ - if (is_u) { - /* Shift count same as element size produces zero to add. */ - if (shift == 8 << size) { - goto done; - } - gen_gvec_op2i(s, is_q, rd, rn, shift, &usra_op[size]); - } else { - /* Shift count same as element size produces all sign to add. */ - if (shift == 8 << size) { - shift -= 1; - } - gen_gvec_op2i(s, is_q, rd, rn, shift, &ssra_op[size]); - } + gen_gvec_fn2i(s, is_q, rd, rn, shift, + is_u ? arm_gen_gvec_usra : arm_gen_gvec_ssra, size); return; case 0x08: /* SRI */ /* Shift count same as element size is valid but does nothing. 
*/ diff --git a/target/arm/translate.c b/target/arm/translate.c index cba84987db..f5768014d1 100644 --- a/target/arm/translate.c +++ b/target/arm/translate.c @@ -3947,33 +3947,51 @@ static void gen_ssra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh) tcg_gen_add_vec(vece, d, d, a); } -static const TCGOpcode vecop_list_ssra[] = { - INDEX_op_sari_vec, INDEX_op_add_vec, 0 -}; +void arm_gen_gvec_ssra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_sari_vec, INDEX_op_add_vec, 0 + }; + static const GVecGen2i ops[4] = { + { .fni8 = gen_ssra8_i64, + .fniv = gen_ssra_vec, + .fno = gen_helper_gvec_ssra_b, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_8 }, + { .fni8 = gen_ssra16_i64, + .fniv = gen_ssra_vec, + .fno = gen_helper_gvec_ssra_h, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_16 }, + { .fni4 = gen_ssra32_i32, + .fniv = gen_ssra_vec, + .fno = gen_helper_gvec_ssra_s, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_32 }, + { .fni8 = gen_ssra64_i64, + .fniv = gen_ssra_vec, + .fno = gen_helper_gvec_ssra_d, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_64 }, + }; -const GVecGen2i ssra_op[4] = { - { .fni8 = gen_ssra8_i64, - .fniv = gen_ssra_vec, - .load_dest = true, - .opt_opc = vecop_list_ssra, - .vece = MO_8 }, - { .fni8 = gen_ssra16_i64, - .fniv = gen_ssra_vec, - .load_dest = true, - .opt_opc = vecop_list_ssra, - .vece = MO_16 }, - { .fni4 = gen_ssra32_i32, - .fniv = gen_ssra_vec, - .load_dest = true, - .opt_opc = vecop_list_ssra, - .vece = MO_32 }, - { .fni8 = gen_ssra64_i64, - .fniv = gen_ssra_vec, - .prefer_i64 = TCG_TARGET_REG_BITS == 64, - .opt_opc = vecop_list_ssra, - .load_dest = true, - .vece = MO_64 }, -}; + /* tszimm encoding produces immediates in the range [1..esize].
*/ + tcg_debug_assert(shift > 0); + tcg_debug_assert(shift <= (8 << vece)); + + /* + * Shifts larger than the element size are architecturally valid. + * Signed results in all sign bits. + */ + shift = MIN(shift, (8 << vece) - 1); + tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]); +} static void gen_usra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift) { @@ -4005,33 +4023,55 @@ static void gen_usra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh) tcg_gen_add_vec(vece, d, d, a); } -static const TCGOpcode vecop_list_usra[] = { - INDEX_op_shri_vec, INDEX_op_add_vec, 0 -}; +void arm_gen_gvec_usra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_shri_vec, INDEX_op_add_vec, 0 + }; + static const GVecGen2i ops[4] = { + { .fni8 = gen_usra8_i64, + .fniv = gen_usra_vec, + .fno = gen_helper_gvec_usra_b, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_8, }, + { .fni8 = gen_usra16_i64, + .fniv = gen_usra_vec, + .fno = gen_helper_gvec_usra_h, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_16, }, + { .fni4 = gen_usra32_i32, + .fniv = gen_usra_vec, + .fno = gen_helper_gvec_usra_s, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_32, }, + { .fni8 = gen_usra64_i64, + .fniv = gen_usra_vec, + .fno = gen_helper_gvec_usra_d, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + .load_dest = true, + .opt_opc = vecop_list, + .vece = MO_64, }, + }; -const GVecGen2i usra_op[4] = { - { .fni8 = gen_usra8_i64, - .fniv = gen_usra_vec, - .load_dest = true, - .opt_opc = vecop_list_usra, - .vece = MO_8, }, - { .fni8 = gen_usra16_i64, - .fniv = gen_usra_vec, - .load_dest = true, - .opt_opc = vecop_list_usra, - .vece = MO_16, }, - { .fni4 = gen_usra32_i32, - .fniv = gen_usra_vec, - .load_dest = true, - .opt_opc = vecop_list_usra, - .vece = MO_32, }, - { .fni8 = gen_usra64_i64, - .fniv = gen_usra_vec, - .prefer_i64 = TCG_TARGET_REG_BITS == 64, - 
.load_dest = true, - .opt_opc = vecop_list_usra, - .vece = MO_64, }, -}; + /* tszimm encoding produces immediates in the range [1..esize]. */ + tcg_debug_assert(shift > 0); + tcg_debug_assert(shift <= (8 << vece)); + + /* + * Shifts larger than the element size are architecturally valid. + * Unsigned results in all zeros as input to accumulate: nop. + */ + if (shift < (8 << vece)) { + tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]); + } else { + /* Nop, but we do need to clear the tail. */ + tcg_gen_gvec_mov(vece, rd_ofs, rd_ofs, opr_sz, max_sz); + } +} static void gen_shr8_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift) { @@ -5396,19 +5436,12 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn) case 1: /* VSRA */ /* Right shift comes here negative. */ shift = -shift; - /* Shifts larger than the element size are architecturally - * valid. Unsigned results in all zeros; signed results - * in all sign bits. - */ - if (!u) { - tcg_gen_gvec_2i(rd_ofs, rm_ofs, vec_size, vec_size, - MIN(shift, (8 << size) - 1), - &ssra_op[size]); - } else if (shift >= 8 << size) { - /* rd += 0 */ + if (u) { + arm_gen_gvec_usra(size, rd_ofs, rm_ofs, shift, + vec_size, vec_size); } else { - tcg_gen_gvec_2i(rd_ofs, rm_ofs, vec_size, vec_size, - shift, &usra_op[size]); + arm_gen_gvec_ssra(size, rd_ofs, rm_ofs, shift, + vec_size, vec_size); } return 0; diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index 154d32518a..aaaccc0a2d 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -899,6 +899,31 @@ void HELPER(gvec_sqsub_d)(void *vd, void *vq, void *vn, clear_tail(d, oprsz, simd_maxsz(desc)); } + +#define DO_SRA(NAME, TYPE) \ +void HELPER(NAME)(void *vd, void *vn, uint32_t desc) \ +{ \ + intptr_t i, oprsz = simd_oprsz(desc); \ + int shift = simd_data(desc); \ + TYPE *d = vd, *n = vn; \ + for (i = 0; i < oprsz / sizeof(TYPE); i++) { \ + d[i] += n[i] >> shift; \ + } \ + clear_tail(d, oprsz, simd_maxsz(desc)); \ +} + 
+DO_SRA(gvec_ssra_b, int8_t) +DO_SRA(gvec_ssra_h, int16_t) +DO_SRA(gvec_ssra_s, int32_t) +DO_SRA(gvec_ssra_d, int64_t) + +DO_SRA(gvec_usra_b, uint8_t) +DO_SRA(gvec_usra_h, uint16_t) +DO_SRA(gvec_usra_s, uint32_t) +DO_SRA(gvec_usra_d, uint64_t) + +#undef DO_SRA + /* * Convert float16 to float32, raising no exceptions and * preserving exceptional values, including SNaN.
From patchwork Thu Mar 26 23:08:31 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 184954
From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH 24/31] target/arm: Create arm_gen_gvec_{u,s}{rshr,rsra} Date: Thu, 26 Mar 2020 16:08:31 -0700 Message-Id: <20200326230838.31112-25-richard.henderson@linaro.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20200326230838.31112-1-richard.henderson@linaro.org> References: <20200326230838.31112-1-richard.henderson@linaro.org> Cc: rajav@quicinc.com, qemu-arm@nongnu.org, apazos@quicinc.com

Create vectorized versions of handle_shri_with_rndacc for shift+round and shift+round+accumulate. Add out-of-line helpers in preparation for longer vector lengths from SVE.
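As a reading aid, the rounding shift can be written out per lane in plain C. This is a sketch under stated assumptions rather than code from this series: the ref_* and two_step_* names are hypothetical, only 8-bit lanes are shown, and >> on negative values is assumed to be an arithmetic shift (true for GCC and Clang). The first two functions add the rounding constant in a wider type; the third mirrors the two-step form used by the inline expanders in the diff, which shifts by one less than requested, keeps the low bit as the rounding bit, then finishes the shift and adds that bit.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical reference: signed rounding shift right of one 8-bit lane. */
static int8_t ref_srshr8(int8_t n, int shift)
{
    assert(shift >= 1 && shift <= 8);   /* immediates lie in [1..esize] */
    /* Widen first so adding the rounding constant cannot overflow. */
    int32_t wide = (int32_t)n + (1 << (shift - 1));
    return (int8_t)(wide >> shift);     /* assumes arithmetic shift */
}

/* Hypothetical reference: unsigned rounding shift right of one 8-bit lane. */
static uint8_t ref_urshr8(uint8_t n, int shift)
{
    assert(shift >= 1 && shift <= 8);
    uint32_t wide = (uint32_t)n + (1u << (shift - 1));
    return (uint8_t)(wide >> shift);
}

/*
 * Two-step form: shift by (shift - 1), take the low bit as the
 * rounding bit, shift once more and add the rounding bit.
 * Restricted here to shift < esize; the shift == esize case is
 * handled separately before expansion in the diff.
 */
static int8_t two_step_srshr8(int8_t n, int shift)
{
    assert(shift >= 1 && shift <= 7);
    int32_t t = n >> (shift - 1);
    return (int8_t)((t >> 1) + (t & 1));
}
```

With these definitions a signed shift by the full element size returns zero for every input, which matches the comment in arm_gen_gvec_srshr, and the two-step form agrees with the widened form for shifts below the element size.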
Signed-off-by: Richard Henderson --- target/arm/helper.h | 20 ++ target/arm/translate-a64.h | 9 + target/arm/translate-a64.c | 458 ++++++++++++++++++++++++++++++++++++- target/arm/vec_helper.c | 50 ++++ 4 files changed, 534 insertions(+), 3 deletions(-) -- 2.20.1 diff --git a/target/arm/helper.h b/target/arm/helper.h index dc6a43dbd8..1ffd840f1d 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -718,6 +718,26 @@ DEF_HELPER_FLAGS_3(gvec_usra_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(gvec_usra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(gvec_usra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_srshr_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_srshr_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_srshr_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_srshr_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(gvec_urshr_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_urshr_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_urshr_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_urshr_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(gvec_srsra_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_srsra_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_srsra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_srsra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(gvec_ursra_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_ursra_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_ursra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(gvec_ursra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + #ifdef TARGET_AARCH64 #include "helper-a64.h" #include "helper-sve.h" diff --git a/target/arm/translate-a64.h b/target/arm/translate-a64.h index 65c0280498..7846e91e51 100644 --- 
a/target/arm/translate-a64.h +++ b/target/arm/translate-a64.h @@ -129,4 +129,13 @@ typedef void GVecGen3Fn(unsigned, uint32_t, uint32_t, typedef void GVecGen4Fn(unsigned, uint32_t, uint32_t, uint32_t, uint32_t, uint32_t, uint32_t); +void arm_gen_gvec_srshr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz); +void arm_gen_gvec_urshr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz); +void arm_gen_gvec_srsra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz); +void arm_gen_gvec_ursra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs, + int64_t shift, uint32_t opr_sz, uint32_t max_sz); + #endif /* TARGET_ARM_TRANSLATE_A64_H */ diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c index d50207fcfb..37ee85f867 100644 --- a/target/arm/translate-a64.c +++ b/target/arm/translate-a64.c @@ -8561,6 +8561,453 @@ static void handle_shri_with_rndacc(TCGv_i64 tcg_res, TCGv_i64 tcg_src, } } +static void gen_srshr8_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift) +{ + TCGv_i64 t = tcg_temp_new_i64(); + TCGv_i64 ones = tcg_const_i64(dup_const(MO_8, 1)); + + /* Shift one less than the requested amount. */ + if (shift > 1) { + tcg_gen_vec_sar8i_i64(a, a, shift - 1); + } + + /* The low bit is the rounding bit. Mask it off. */ + tcg_gen_and_i64(t, a, ones); + + /* Finish the shift. */ + tcg_gen_vec_sar8i_i64(d, a, 1); + + /* Round. 
*/
+    tcg_gen_vec_add8_i64(d, d, t);
+
+    tcg_temp_free_i64(t);
+    tcg_temp_free_i64(ones);
+}
+
+static void gen_srshr16_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
+{
+    TCGv_i64 t = tcg_temp_new_i64();
+    TCGv_i64 ones = tcg_const_i64(dup_const(MO_16, 1));
+
+    if (shift > 1) {
+        tcg_gen_vec_sar16i_i64(a, a, shift - 1);
+    }
+    tcg_gen_and_i64(t, a, ones);
+    tcg_gen_vec_sar16i_i64(d, a, 1);
+    tcg_gen_vec_add16_i64(d, d, t);
+
+    tcg_temp_free_i64(t);
+    tcg_temp_free_i64(ones);
+}
+
+static void gen_srshr32_i32(TCGv_i32 d, TCGv_i32 a, int32_t shift)
+{
+    TCGv_i32 t = tcg_temp_new_i32();
+
+    tcg_gen_sari_i32(a, a, shift - 1);
+    tcg_gen_andi_i32(t, a, 1);
+    tcg_gen_sari_i32(d, a, 1);
+    tcg_gen_add_i32(d, d, t);
+
+    tcg_temp_free_i32(t);
+}
+
+static void gen_srshr64_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
+{
+    TCGv_i64 t = tcg_temp_new_i64();
+
+    tcg_gen_sari_i64(a, a, shift - 1);
+    tcg_gen_andi_i64(t, a, 1);
+    tcg_gen_sari_i64(d, a, 1);
+    tcg_gen_add_i64(d, d, t);
+
+    tcg_temp_free_i64(t);
+}
+
+static void gen_srshr_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t shift)
+{
+    TCGv_vec t = tcg_temp_new_vec_matching(d);
+    TCGv_vec ones = tcg_temp_new_vec_matching(d);
+
+    tcg_gen_sari_vec(vece, a, a, shift - 1);
+    tcg_gen_dupi_vec(vece, ones, 1);
+    tcg_gen_and_vec(vece, t, a, ones);
+    tcg_gen_sari_vec(vece, d, a, 1);
+    tcg_gen_add_vec(vece, d, d, t);
+
+    tcg_temp_free_vec(t);
+    tcg_temp_free_vec(ones);
+}
+
+void arm_gen_gvec_srshr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
+                        int64_t shift, uint32_t opr_sz, uint32_t max_sz)
+{
+    static const TCGOpcode vecop_list[] = {
+        INDEX_op_sari_vec, INDEX_op_add_vec, 0
+    };
+    static const GVecGen2i ops[4] = {
+        { .fni8 = gen_srshr8_i64,
+          .fniv = gen_srshr_vec,
+          .fno = gen_helper_gvec_srshr_b,
+          .opt_opc = vecop_list,
+          .vece = MO_8 },
+        { .fni8 = gen_srshr16_i64,
+          .fniv = gen_srshr_vec,
+          .fno = gen_helper_gvec_srshr_h,
+          .opt_opc = vecop_list,
+          .vece = MO_16 },
+        { .fni4 = gen_srshr32_i32,
+          .fniv = gen_srshr_vec,
+          .fno = gen_helper_gvec_srshr_s,
+          .opt_opc = vecop_list,
+          .vece = MO_32 },
+        { .fni8 = gen_srshr64_i64,
+          .fniv = gen_srshr_vec,
+          .fno = gen_helper_gvec_srshr_d,
+          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
+          .opt_opc = vecop_list,
+          .vece = MO_64 },
+    };
+
+    /* tszimm encoding produces immediates in the range [1..esize] */
+    tcg_debug_assert(shift > 0);
+    tcg_debug_assert(shift <= (8 << vece));
+
+    if (shift == (8 << vece)) {
+        /*
+         * Shifts larger than the element size are architecturally valid.
+         * Signed results in all sign bits. With rounding, this produces
+         *   (-1 + 1) >> 1 == 0, or (0 + 1) >> 1 == 0.
+         * I.e. always zero.
+         */
+        tcg_gen_gvec_dup8i(rd_ofs, opr_sz, max_sz, 0);
+    } else {
+        tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
+    }
+}
+
+static void gen_srsra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
+{
+    TCGv_i64 t = tcg_temp_new_i64();
+    gen_srshr8_i64(t, a, shift);
+    tcg_gen_vec_add8_i64(d, d, t);
+    tcg_temp_free_i64(t);
+}
+
+static void gen_srsra16_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
+{
+    TCGv_i64 t = tcg_temp_new_i64();
+    gen_srshr16_i64(t, a, shift);
+    tcg_gen_vec_add16_i64(d, d, t);
+    tcg_temp_free_i64(t);
+}
+
+static void gen_srsra32_i32(TCGv_i32 d, TCGv_i32 a, int32_t shift)
+{
+    TCGv_i32 t = tcg_temp_new_i32();
+    gen_srshr32_i32(t, a, shift);
+    tcg_gen_add_i32(d, d, t);
+    tcg_temp_free_i32(t);
+}
+
+static void gen_srsra64_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
+{
+    TCGv_i64 t = tcg_temp_new_i64();
+    gen_srshr64_i64(t, a, shift);
+    tcg_gen_add_i64(d, d, t);
+    tcg_temp_free_i64(t);
+}
+
+static void gen_srsra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t shift)
+{
+    TCGv_vec t = tcg_temp_new_vec_matching(d);
+    gen_srshr_vec(vece, t, a, shift);
+    tcg_gen_add_vec(vece, d, d, t);
+    tcg_temp_free_vec(t);
+}
+
+void arm_gen_gvec_srsra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
+                        int64_t shift, uint32_t opr_sz, uint32_t max_sz)
+{
+    static const TCGOpcode vecop_list[] = {
+        INDEX_op_sari_vec, INDEX_op_add_vec, 0
+    };
+    static const GVecGen2i ops[4] = {
+        { .fni8 = gen_srsra8_i64,
+          .fniv = gen_srsra_vec,
+          .fno = gen_helper_gvec_srsra_b,
+          .opt_opc = vecop_list,
+          .load_dest = true,
+          .vece = MO_8 },
+        { .fni8 = gen_srsra16_i64,
+          .fniv = gen_srsra_vec,
+          .fno = gen_helper_gvec_srsra_h,
+          .opt_opc = vecop_list,
+          .load_dest = true,
+          .vece = MO_16 },
+        { .fni4 = gen_srsra32_i32,
+          .fniv = gen_srsra_vec,
+          .fno = gen_helper_gvec_srsra_s,
+          .opt_opc = vecop_list,
+          .load_dest = true,
+          .vece = MO_32 },
+        { .fni8 = gen_srsra64_i64,
+          .fniv = gen_srsra_vec,
+          .fno = gen_helper_gvec_srsra_d,
+          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
+          .opt_opc = vecop_list,
+          .load_dest = true,
+          .vece = MO_64 },
+    };
+
+    /* tszimm encoding produces immediates in the range [1..esize] */
+    tcg_debug_assert(shift > 0);
+    tcg_debug_assert(shift <= (8 << vece));
+
+    /*
+     * Shifts larger than the element size are architecturally valid.
+     * Signed results in all sign bits. With rounding, this produces
+     *   (-1 + 1) >> 1 == 0, or (0 + 1) >> 1 == 0.
+     * I.e. always zero. With accumulation, this leaves D unchanged.
+     */
+    if (shift != (8 << vece)) {
+        tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
+    } else {
+        /* Nop, but we do need to clear the tail. */
+        tcg_gen_gvec_mov(vece, rd_ofs, rd_ofs, opr_sz, max_sz);
+    }
+}
+
+static void gen_urshr8_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
+{
+    TCGv_i64 t = tcg_temp_new_i64();
+    TCGv_i64 ones = tcg_const_i64(dup_const(MO_8, 1));
+
+    /* Shift one less than the requested amount. */
+    if (shift > 1) {
+        tcg_gen_vec_shr8i_i64(a, a, shift - 1);
+    }
+
+    /* The low bit is the rounding bit. Mask it off. */
+    tcg_gen_and_i64(t, a, ones);
+
+    /* Finish the shift. */
+    tcg_gen_vec_shr8i_i64(d, a, 1);
+
+    /* Round. */
+    tcg_gen_vec_add8_i64(d, d, t);
+
+    tcg_temp_free_i64(t);
+    tcg_temp_free_i64(ones);
+}
+
+static void gen_urshr16_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
+{
+    TCGv_i64 t = tcg_temp_new_i64();
+    TCGv_i64 ones = tcg_const_i64(dup_const(MO_16, 1));
+
+    if (shift > 1) {
+        tcg_gen_vec_shr16i_i64(a, a, shift - 1);
+    }
+    tcg_gen_and_i64(t, a, ones);
+    tcg_gen_vec_shr16i_i64(d, a, 1);
+    tcg_gen_vec_add16_i64(d, d, t);
+
+    tcg_temp_free_i64(t);
+    tcg_temp_free_i64(ones);
+}
+
+static void gen_urshr32_i32(TCGv_i32 d, TCGv_i32 a, int32_t shift)
+{
+    TCGv_i32 t = tcg_temp_new_i32();
+
+    tcg_gen_shri_i32(a, a, shift - 1);
+    tcg_gen_andi_i32(t, a, 1);
+    tcg_gen_shri_i32(d, a, 1);
+    tcg_gen_add_i32(d, d, t);
+
+    tcg_temp_free_i32(t);
+}
+
+static void gen_urshr64_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
+{
+    TCGv_i64 t = tcg_temp_new_i64();
+
+    tcg_gen_shri_i64(a, a, shift - 1);
+    tcg_gen_andi_i64(t, a, 1);
+    tcg_gen_shri_i64(d, a, 1);
+    tcg_gen_add_i64(d, d, t);
+
+    tcg_temp_free_i64(t);
+}
+
+static void gen_urshr_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t shift)
+{
+    TCGv_vec t = tcg_temp_new_vec_matching(d);
+    TCGv_vec ones = tcg_temp_new_vec_matching(d);
+
+    tcg_gen_shri_vec(vece, a, a, shift - 1);
+    tcg_gen_dupi_vec(vece, ones, 1);
+    tcg_gen_and_vec(vece, t, a, ones);
+    tcg_gen_shri_vec(vece, d, a, 1);
+    tcg_gen_add_vec(vece, d, d, t);
+
+    tcg_temp_free_vec(t);
+    tcg_temp_free_vec(ones);
+}
+
+void arm_gen_gvec_urshr(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
+                        int64_t shift, uint32_t opr_sz, uint32_t max_sz)
+{
+    static const TCGOpcode vecop_list[] = {
+        INDEX_op_shri_vec, INDEX_op_add_vec, 0
+    };
+    static const GVecGen2i ops[4] = {
+        { .fni8 = gen_urshr8_i64,
+          .fniv = gen_urshr_vec,
+          .fno = gen_helper_gvec_urshr_b,
+          .opt_opc = vecop_list,
+          .vece = MO_8 },
+        { .fni8 = gen_urshr16_i64,
+          .fniv = gen_urshr_vec,
+          .fno = gen_helper_gvec_urshr_h,
+          .opt_opc = vecop_list,
+          .vece = MO_16 },
+        { .fni4 = gen_urshr32_i32,
+          .fniv = gen_urshr_vec,
+          .fno = gen_helper_gvec_urshr_s,
+          .opt_opc = vecop_list,
+          .vece = MO_32 },
+        { .fni8 = gen_urshr64_i64,
+          .fniv = gen_urshr_vec,
+          .fno = gen_helper_gvec_urshr_d,
+          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
+          .opt_opc = vecop_list,
+          .vece = MO_64 },
+    };
+
+    /* tszimm encoding produces immediates in the range [1..esize] */
+    tcg_debug_assert(shift > 0);
+    tcg_debug_assert(shift <= (8 << vece));
+
+    if (shift == (8 << vece)) {
+        /*
+         * Shifts larger than the element size are architecturally valid.
+         * Unsigned results in zero. With rounding, this produces a
+         * copy of the most significant bit.
+         */
+        tcg_gen_gvec_shri(vece, rd_ofs, rm_ofs, shift - 1, opr_sz, max_sz);
+    } else {
+        tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
+    }
+}
+
+static void gen_ursra8_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
+{
+    TCGv_i64 t = tcg_temp_new_i64();
+
+    if (shift == 8) {
+        tcg_gen_vec_shr8i_i64(t, a, 7);
+    } else {
+        gen_urshr8_i64(t, a, shift);
+    }
+    tcg_gen_vec_add8_i64(d, d, t);
+    tcg_temp_free_i64(t);
+}
+
+static void gen_ursra16_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
+{
+    TCGv_i64 t = tcg_temp_new_i64();
+
+    if (shift == 16) {
+        tcg_gen_vec_shr16i_i64(t, a, 15);
+    } else {
+        gen_urshr16_i64(t, a, shift);
+    }
+    tcg_gen_vec_add16_i64(d, d, t);
+    tcg_temp_free_i64(t);
+}
+
+static void gen_ursra32_i32(TCGv_i32 d, TCGv_i32 a, int32_t shift)
+{
+    TCGv_i32 t = tcg_temp_new_i32();
+
+    if (shift == 32) {
+        tcg_gen_shri_i32(t, a, 31);
+    } else {
+        gen_urshr32_i32(t, a, shift);
+    }
+    tcg_gen_add_i32(d, d, t);
+    tcg_temp_free_i32(t);
+}
+
+static void gen_ursra64_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
+{
+    TCGv_i64 t = tcg_temp_new_i64();
+
+    if (shift == 64) {
+        tcg_gen_shri_i64(t, a, 63);
+    } else {
+        gen_urshr64_i64(t, a, shift);
+    }
+    tcg_gen_add_i64(d, d, t);
+    tcg_temp_free_i64(t);
+}
+
+static void gen_ursra_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t shift)
+{
+    TCGv_vec t = tcg_temp_new_vec_matching(d);
+
+    if (shift == (8 << vece)) {
+        tcg_gen_shri_vec(vece, t, a, (8 << vece) - 1);
+    } else {
+        gen_urshr_vec(vece, t, a, shift);
+    }
+    tcg_gen_add_vec(vece, d, d, t);
+    tcg_temp_free_vec(t);
+}
+
+void arm_gen_gvec_ursra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
+                        int64_t shift, uint32_t opr_sz, uint32_t max_sz)
+{
+    static const TCGOpcode vecop_list[] = {
+        INDEX_op_shri_vec, INDEX_op_add_vec, 0
+    };
+    static const GVecGen2i ops[4] = {
+        { .fni8 = gen_ursra8_i64,
+          .fniv = gen_ursra_vec,
+          .fno = gen_helper_gvec_ursra_b,
+          .opt_opc = vecop_list,
+          .load_dest = true,
+          .vece = MO_8 },
+        { .fni8 = gen_ursra16_i64,
+          .fniv = gen_ursra_vec,
+          .fno = gen_helper_gvec_ursra_h,
+          .opt_opc = vecop_list,
+          .load_dest = true,
+          .vece = MO_16 },
+        { .fni4 = gen_ursra32_i32,
+          .fniv = gen_ursra_vec,
+          .fno = gen_helper_gvec_ursra_s,
+          .opt_opc = vecop_list,
+          .load_dest = true,
+          .vece = MO_32 },
+        { .fni8 = gen_ursra64_i64,
+          .fniv = gen_ursra_vec,
+          .fno = gen_helper_gvec_ursra_d,
+          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
+          .opt_opc = vecop_list,
+          .load_dest = true,
+          .vece = MO_64 },
+    };
+
+    /* tszimm encoding produces immediates in the range [1..esize] */
+    tcg_debug_assert(shift > 0);
+    tcg_debug_assert(shift <= (8 << vece));
+
+    tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
+}
+
 /* SSHR[RA]/USHR[RA] - Scalar shift right (optional rounding/accumulate) */
 static void handle_scalar_simd_shri(DisasContext *s,
                                     bool is_u, int immh, int immb,
@@ -10712,10 +11159,15 @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u,
         return;
 
     case 0x04: /* SRSHR / URSHR (rounding) */
-        break;
+        gen_gvec_fn2i(s, is_q, rd, rn, shift,
+                      is_u ? arm_gen_gvec_urshr : arm_gen_gvec_srshr, size);
+        return;
+
     case 0x06: /* SRSRA / URSRA (accum + rounding) */
-        accumulate = true;
-        break;
+        gen_gvec_fn2i(s, is_q, rd, rn, shift,
+                      is_u ? arm_gen_gvec_ursra : arm_gen_gvec_srsra, size);
+        return;
+
     default:
         g_assert_not_reached();
     }
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
index aaaccc0a2d..c6a39c188e 100644
--- a/target/arm/vec_helper.c
+++ b/target/arm/vec_helper.c
@@ -924,6 +924,56 @@ DO_SRA(gvec_usra_d, uint64_t)
 
 #undef DO_SRA
 
+#define DO_RSHR(NAME, TYPE)                             \
+void HELPER(NAME)(void *vd, void *vn, uint32_t desc)    \
+{                                                       \
+    intptr_t i, oprsz = simd_oprsz(desc);               \
+    int shift = simd_data(desc);                        \
+    TYPE *d = vd, *n = vn;                              \
+    for (i = 0; i < oprsz / sizeof(TYPE); i++) {        \
+        TYPE tmp = n[i] >> (shift - 1);                 \
+        d[i] = (tmp >> 1) + (tmp & 1);                  \
+    }                                                   \
+    clear_tail(d, oprsz, simd_maxsz(desc));             \
+}
+
+DO_RSHR(gvec_srshr_b, int8_t)
+DO_RSHR(gvec_srshr_h, int16_t)
+DO_RSHR(gvec_srshr_s, int32_t)
+DO_RSHR(gvec_srshr_d, int64_t)
+
+DO_RSHR(gvec_urshr_b, uint8_t)
+DO_RSHR(gvec_urshr_h, uint16_t)
+DO_RSHR(gvec_urshr_s, uint32_t)
+DO_RSHR(gvec_urshr_d, uint64_t)
+
+#undef DO_RSHR
+
+#define DO_RSRA(NAME, TYPE)                             \
+void HELPER(NAME)(void *vd, void *vn, uint32_t desc)    \
+{                                                       \
+    intptr_t i, oprsz = simd_oprsz(desc);               \
+    int shift = simd_data(desc);                        \
+    TYPE *d = vd, *n = vn;                              \
+    for (i = 0; i < oprsz / sizeof(TYPE); i++) {        \
+        TYPE tmp = n[i] >> (shift - 1);                 \
+        d[i] += (tmp >> 1) + (tmp & 1);                 \
+    }                                                   \
+    clear_tail(d, oprsz, simd_maxsz(desc));             \
+}
+
+DO_RSRA(gvec_srsra_b, int8_t)
+DO_RSRA(gvec_srsra_h, int16_t)
+DO_RSRA(gvec_srsra_s, int32_t)
+DO_RSRA(gvec_srsra_d, int64_t)
+
+DO_RSRA(gvec_ursra_b, uint8_t)
+DO_RSRA(gvec_ursra_h, uint16_t)
+DO_RSRA(gvec_ursra_s, uint32_t)
+DO_RSRA(gvec_ursra_d, uint64_t)
+
+#undef DO_RSRA
+
 /*
  * Convert float16 to float32, raising no exceptions and
  * preserving exceptional values, including SNaN.
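
For readers following the DO_RSHR and DO_RSRA helpers above, here is a small standalone sketch (not part of the patch) of why the two-step form needs no wider intermediate. It compares tmp = x >> (shift - 1); (tmp >> 1) + (tmp & 1) against a reference that widens, adds the rounding constant 1 << (shift - 1), and then shifts, for 8-bit lanes and shift in [1..8]. All function names here are hypothetical and chosen for illustration only. The sketch assumes that >> on a negative signed value is an arithmetic shift, which is implementation-defined in C.

```c
#include <assert.h>
#include <stdint.h>

/* Reference form: widen, add the rounding constant, then shift. */
static int8_t srshr8_ref(int8_t x, int shift)
{
    int32_t wide = x;
    return (int8_t)((wide + (1 << (shift - 1))) >> shift);
}

static uint8_t urshr8_ref(uint8_t x, int shift)
{
    uint32_t wide = x;
    return (uint8_t)((wide + (1u << (shift - 1))) >> shift);
}

/* Two-step form, mirroring the loop body of DO_RSHR. */
static int8_t srshr8_two_step(int8_t x, int shift)
{
    int8_t tmp = x >> (shift - 1);      /* low bit is now the rounding bit */
    return (int8_t)((tmp >> 1) + (tmp & 1));
}

static uint8_t urshr8_two_step(uint8_t x, int shift)
{
    uint8_t tmp = x >> (shift - 1);
    return (uint8_t)((tmp >> 1) + (tmp & 1));
}

/* Returns 1 if both forms agree for every 8-bit input and shift in [1..8]. */
static int rshr8_forms_agree(void)
{
    for (int shift = 1; shift <= 8; shift++) {
        for (int v = -128; v <= 127; v++) {
            if (srshr8_ref((int8_t)v, shift) !=
                srshr8_two_step((int8_t)v, shift)) {
                return 0;
            }
        }
        for (int v = 0; v <= 255; v++) {
            if (urshr8_ref((uint8_t)v, shift) !=
                urshr8_two_step((uint8_t)v, shift)) {
                return 0;
            }
        }
    }
    return 1;
}
```

With shift equal to the element size (8 here), the signed form yields 0 for every input and the unsigned form yields the most significant bit, which matches the special cases handled in arm_gen_gvec_srshr and arm_gen_gvec_urshr.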
From patchwork Thu Mar 26 23:08:32 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 184965
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH 25/31] target/arm: Implement SVE2 bitwise shift right and accumulate
Date: Thu, 26 Mar 2020 16:08:32 -0700
Message-Id: <20200326230838.31112-26-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20200326230838.31112-1-richard.henderson@linaro.org>
References: <20200326230838.31112-1-richard.henderson@linaro.org>
Cc: rajav@quicinc.com, qemu-arm@nongnu.org, apazos@quicinc.com

Signed-off-by: Richard Henderson
---
 target/arm/sve.decode      |  8 ++++++++
 target/arm/translate-sve.c | 34 ++++++++++++++++++++++++++++++++++
 2 files changed, 42 insertions(+)

--
2.20.1

diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index 5d46e3ab45..756f939df1 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -1253,3 +1253,11 @@ UABALT 01000101 .. 0 ..... 1100 11 ..... ..... @rda_rn_rm
 # ADC and SBC decoded via size in helper dispatch.
 ADCLB           01000101 .. 0 ..... 11010 0 ..... ..... @rda_rn_rm
 ADCLT           01000101 .. 0 ..... 11010 1 ..... ..... @rda_rn_rm
+
+## SVE2 bitwise shift right and accumulate
+
+# TODO: Use @rda and %reg_movprfx here.
+SSRA            01000101 .. 0 ..... 1110 00 ..... ..... @rd_rn_tszimm_shr
+USRA            01000101 .. 0 ..... 1110 01 ..... ..... @rd_rn_tszimm_shr
+SRSRA           01000101 .. 0 ..... 1110 10 ..... ..... @rd_rn_tszimm_shr
+URSRA           01000101 .. 0 ..... 1110 11 ..... ..... @rd_rn_tszimm_shr
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index a80765a978..1d1f55dfdd 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -6317,3 +6317,37 @@ static bool trans_ADCLT(DisasContext *s, arg_rrrr_esz *a)
 {
     return do_adcl(s, a, true);
 }
+
+static bool do_sve2_fn2i(DisasContext *s, arg_rri_esz *a, GVecGen2iFn *fn)
+{
+    if (!dc_isar_feature(aa64_sve2, s)) {
+        return false;
+    }
+    if (sve_access_check(s)) {
+        unsigned vsz = vec_full_reg_size(s);
+        unsigned rd_ofs = vec_full_reg_offset(s, a->rd);
+        unsigned rn_ofs = vec_full_reg_offset(s, a->rn);
+        fn(a->esz, rd_ofs, rn_ofs, a->imm, vsz, vsz);
+    }
+    return true;
+}
+
+static bool trans_SSRA(DisasContext *s, arg_rri_esz *a)
+{
+    return do_sve2_fn2i(s, a, arm_gen_gvec_ssra);
+}
+
+static bool trans_USRA(DisasContext *s, arg_rri_esz *a)
+{
+    return do_sve2_fn2i(s, a, arm_gen_gvec_usra);
+}
+
+static bool trans_SRSRA(DisasContext *s, arg_rri_esz *a)
+{
+    return do_sve2_fn2i(s, a, arm_gen_gvec_srsra);
+}
+
+static bool trans_URSRA(DisasContext *s, arg_rri_esz *a)
+{
+    return do_sve2_fn2i(s, a, arm_gen_gvec_ursra);
+}

From patchwork Thu Mar 26 23:08:33 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 184967
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH 26/31] target/arm: Create arm_gen_gvec_{sri,sli}
Date: Thu, 26 Mar 2020 16:08:33 -0700
Message-Id: <20200326230838.31112-27-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20200326230838.31112-1-richard.henderson@linaro.org>
References: <20200326230838.31112-1-richard.henderson@linaro.org>
Cc: rajav@quicinc.com, qemu-arm@nongnu.org, apazos@quicinc.com

The functions eliminate duplication of the special cases for
this operation. They match up with the GVecGen2iFn typedef.

Add out-of-line helpers. We got away with only having inline
expanders because the neon vector size is only 16 bytes, and
we know that the inline expansion will always succeed.
When we reuse this for SVE, tcg-gvec-op may decide to use an
out-of-line helper due to longer vector lengths.

Signed-off-by: Richard Henderson
---
 target/arm/helper.h        |  10 ++
 target/arm/translate.h     |   7 +-
 target/arm/translate-a64.c |  20 +---
 target/arm/translate.c     | 186 +++++++++++++++++++++----------------
 target/arm/vec_helper.c    |  38 ++++++++
 5 files changed, 160 insertions(+), 101 deletions(-)

--
2.20.1

diff --git a/target/arm/helper.h b/target/arm/helper.h
index 1ffd840f1d..5ef7bb158f 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -738,6 +738,16 @@ DEF_HELPER_FLAGS_3(gvec_ursra_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 DEF_HELPER_FLAGS_3(gvec_ursra_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 DEF_HELPER_FLAGS_3(gvec_ursra_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 
+DEF_HELPER_FLAGS_3(gvec_sri_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(gvec_sri_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(gvec_sri_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(gvec_sri_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_3(gvec_sli_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(gvec_sli_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(gvec_sli_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(gvec_sli_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+
 #ifdef TARGET_AARCH64
 #include "helper-a64.h"
 #include "helper-sve.h"
diff --git a/target/arm/translate.h b/target/arm/translate.h
index 1c5cdf13e3..843ecc1472 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -291,8 +291,6 @@ extern const GVecGen3 mls_op[4];
 extern const GVecGen3 cmtst_op[4];
 extern const GVecGen3 sshl_op[4];
 extern const GVecGen3 ushl_op[4];
-extern const GVecGen2i sri_op[4];
-extern const GVecGen2i sli_op[4];
 extern const GVecGen4 uqadd_op[4];
 extern const GVecGen4 sqadd_op[4];
 extern const GVecGen4 uqsub_op[4];
@@ -308,6 +306,11 @@ void arm_gen_gvec_ssra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
 void arm_gen_gvec_usra(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
                        int64_t shift, uint32_t opr_sz, uint32_t max_sz);
 
+void arm_gen_gvec_sri(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
+                      int64_t shift, uint32_t opr_sz, uint32_t max_sz);
+void arm_gen_gvec_sli(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
+                      int64_t shift, uint32_t opr_sz, uint32_t max_sz);
+
 /*
  * Forward to the isar_feature_* tests given a DisasContext pointer.
  */
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 37ee85f867..f7d492cce4 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -680,16 +680,6 @@ static void gen_gvec_fn4(DisasContext *s, bool is_q, int rd, int rn, int rm,
             is_q ? 16 : 8, vec_full_reg_size(s));
 }
 
-/* Expand a 2-operand + immediate AdvSIMD vector operation using
- * an op descriptor.
- */
-static void gen_gvec_op2i(DisasContext *s, bool is_q, int rd,
-                          int rn, int64_t imm, const GVecGen2i *gvec_op)
-{
-    tcg_gen_gvec_2i(vec_full_reg_offset(s, rd), vec_full_reg_offset(s, rn),
-                    is_q ? 16 : 8, vec_full_reg_size(s), imm, gvec_op);
-}
-
 /* Expand a 3-operand AdvSIMD vector operation using an op descriptor. */
 static void gen_gvec_op3(DisasContext *s, bool is_q, int rd,
                          int rn, int rm, const GVecGen3 *gvec_op)
@@ -11132,12 +11122,9 @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u,
         gen_gvec_fn2i(s, is_q, rd, rn, shift,
                       is_u ? arm_gen_gvec_usra : arm_gen_gvec_ssra, size);
         return;
+
     case 0x08: /* SRI */
-        /* Shift count same as element size is valid but does nothing. */
-        if (shift == 8 << size) {
-            goto done;
-        }
-        gen_gvec_op2i(s, is_q, rd, rn, shift, &sri_op[size]);
+        gen_gvec_fn2i(s, is_q, rd, rn, shift, arm_gen_gvec_sri, size);
         return;
 
     case 0x00: /* SSHR / USHR */
@@ -11188,7 +11175,6 @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u,
     }
     tcg_temp_free_i64(tcg_round);
 
- done:
     clear_vec_high(s, is_q, rd);
 }
 
@@ -11213,7 +11199,7 @@ static void handle_vec_simd_shli(DisasContext *s, bool is_q, bool insert,
     }
 
     if (insert) {
-        gen_gvec_op2i(s, is_q, rd, rn, shift, &sli_op[size]);
+        gen_gvec_fn2i(s, is_q, rd, rn, shift, arm_gen_gvec_sli, size);
     } else {
         gen_gvec_fn2i(s, is_q, rd, rn, shift, tcg_gen_gvec_shli, size);
     }
diff --git a/target/arm/translate.c b/target/arm/translate.c
index f5768014d1..bb6db53598 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -4111,47 +4111,62 @@ static void gen_shr64_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
 
 static void gen_shr_ins_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
 {
-    if (sh == 0) {
-        tcg_gen_mov_vec(d, a);
-    } else {
-        TCGv_vec t = tcg_temp_new_vec_matching(d);
-        TCGv_vec m = tcg_temp_new_vec_matching(d);
+    TCGv_vec t = tcg_temp_new_vec_matching(d);
+    TCGv_vec m = tcg_temp_new_vec_matching(d);
 
-        tcg_gen_dupi_vec(vece, m, MAKE_64BIT_MASK((8 << vece) - sh, sh));
-        tcg_gen_shri_vec(vece, t, a, sh);
-        tcg_gen_and_vec(vece, d, d, m);
-        tcg_gen_or_vec(vece, d, d, t);
+    tcg_gen_dupi_vec(vece, m, MAKE_64BIT_MASK((8 << vece) - sh, sh));
+    tcg_gen_shri_vec(vece, t, a, sh);
+    tcg_gen_and_vec(vece, d, d, m);
+    tcg_gen_or_vec(vece, d, d, t);
 
-        tcg_temp_free_vec(t);
-        tcg_temp_free_vec(m);
-    }
+    tcg_temp_free_vec(t);
+    tcg_temp_free_vec(m);
 }
 
-static const TCGOpcode vecop_list_sri[] = { INDEX_op_shri_vec, 0 };
+void arm_gen_gvec_sri(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
+                      int64_t shift, uint32_t opr_sz, uint32_t max_sz)
+{
+    static const TCGOpcode vecop_list[] = { INDEX_op_shri_vec, 0 };
+    const GVecGen2i ops[4] = {
+        { .fni8 = gen_shr8_ins_i64,
+          .fniv = gen_shr_ins_vec,
+          .fno = gen_helper_gvec_sri_b,
+          .load_dest = true,
+          .opt_opc = vecop_list,
+          .vece = MO_8 },
+        { .fni8 = gen_shr16_ins_i64,
+          .fniv = gen_shr_ins_vec,
+          .fno = gen_helper_gvec_sri_h,
+          .load_dest = true,
+          .opt_opc = vecop_list,
+          .vece = MO_16 },
+        { .fni4 = gen_shr32_ins_i32,
+          .fniv = gen_shr_ins_vec,
+          .fno = gen_helper_gvec_sri_s,
+          .load_dest = true,
+          .opt_opc = vecop_list,
+          .vece = MO_32 },
+        { .fni8 = gen_shr64_ins_i64,
+          .fniv = gen_shr_ins_vec,
+          .fno = gen_helper_gvec_sri_d,
+          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
+          .load_dest = true,
+          .opt_opc = vecop_list,
+          .vece = MO_64 },
+    };
 
-const GVecGen2i sri_op[4] = {
-    { .fni8 = gen_shr8_ins_i64,
-      .fniv = gen_shr_ins_vec,
-      .load_dest = true,
-      .opt_opc = vecop_list_sri,
-      .vece = MO_8 },
-    { .fni8 = gen_shr16_ins_i64,
-      .fniv = gen_shr_ins_vec,
-      .load_dest = true,
-      .opt_opc = vecop_list_sri,
-      .vece = MO_16 },
-    { .fni4 = gen_shr32_ins_i32,
-      .fniv = gen_shr_ins_vec,
-      .load_dest = true,
-      .opt_opc = vecop_list_sri,
-      .vece = MO_32 },
-    { .fni8 = gen_shr64_ins_i64,
-      .fniv = gen_shr_ins_vec,
-      .prefer_i64 = TCG_TARGET_REG_BITS == 64,
-      .load_dest = true,
-      .opt_opc = vecop_list_sri,
-      .vece = MO_64 },
-};
+    /* tszimm encoding produces immediates in the range [1..esize]. */
+    tcg_debug_assert(shift > 0);
+    tcg_debug_assert(shift <= (8 << vece));
+
+    /* Shift of esize leaves destination unchanged. */
+    if (shift < (8 << vece)) {
+        tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
+    } else {
+        /* Nop, but we do need to clear the tail. */
+        tcg_gen_gvec_mov(vece, rd_ofs, rd_ofs, opr_sz, max_sz);
+    }
+}
 
 static void gen_shl8_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
 {
@@ -4189,47 +4204,60 @@ static void gen_shl64_ins_i64(TCGv_i64 d, TCGv_i64 a, int64_t shift)
 
 static void gen_shl_ins_vec(unsigned vece, TCGv_vec d, TCGv_vec a, int64_t sh)
 {
-    if (sh == 0) {
-        tcg_gen_mov_vec(d, a);
-    } else {
-        TCGv_vec t = tcg_temp_new_vec_matching(d);
-        TCGv_vec m = tcg_temp_new_vec_matching(d);
+    TCGv_vec t = tcg_temp_new_vec_matching(d);
+    TCGv_vec m = tcg_temp_new_vec_matching(d);
 
-        tcg_gen_dupi_vec(vece, m, MAKE_64BIT_MASK(0, sh));
-        tcg_gen_shli_vec(vece, t, a, sh);
-        tcg_gen_and_vec(vece, d, d, m);
-        tcg_gen_or_vec(vece, d, d, t);
+    tcg_gen_shli_vec(vece, t, a, sh);
+    tcg_gen_dupi_vec(vece, m, MAKE_64BIT_MASK(0, sh));
+    tcg_gen_and_vec(vece, d, d, m);
+    tcg_gen_or_vec(vece, d, d, t);
 
-        tcg_temp_free_vec(t);
-        tcg_temp_free_vec(m);
-    }
+    tcg_temp_free_vec(t);
+    tcg_temp_free_vec(m);
 }
 
-static const TCGOpcode vecop_list_sli[] = { INDEX_op_shli_vec, 0 };
+void arm_gen_gvec_sli(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
+                      int64_t shift, uint32_t opr_sz, uint32_t max_sz)
+{
+    static const TCGOpcode vecop_list[] = { INDEX_op_shli_vec, 0 };
+    const GVecGen2i ops[4] = {
+        { .fni8 = gen_shl8_ins_i64,
+          .fniv = gen_shl_ins_vec,
+          .fno = gen_helper_gvec_sli_b,
+          .load_dest = true,
+          .opt_opc = vecop_list,
+          .vece = MO_8 },
+        { .fni8 = gen_shl16_ins_i64,
+          .fniv = gen_shl_ins_vec,
+          .fno = gen_helper_gvec_sli_h,
+          .load_dest = true,
+          .opt_opc = vecop_list,
+          .vece = MO_16 },
+        { .fni4 = gen_shl32_ins_i32,
+          .fniv = gen_shl_ins_vec,
+          .fno = gen_helper_gvec_sli_s,
+          .load_dest = true,
+          .opt_opc = vecop_list,
+          .vece = MO_32 },
+        { .fni8 = gen_shl64_ins_i64,
+          .fniv = gen_shl_ins_vec,
+          .fno = gen_helper_gvec_sli_d,
+          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
+          .load_dest = true,
+          .opt_opc = vecop_list,
+          .vece = MO_64 },
+    };
 
-const GVecGen2i sli_op[4] = {
-    { .fni8 = gen_shl8_ins_i64,
-      .fniv = gen_shl_ins_vec,
-      .load_dest = true,
-      .opt_opc = vecop_list_sli,
-      .vece = MO_8 },
-    { .fni8 = gen_shl16_ins_i64,
-      .fniv = gen_shl_ins_vec,
-      .load_dest = true,
-      .opt_opc = vecop_list_sli,
-      .vece = MO_16 },
-    { .fni4 = gen_shl32_ins_i32,
-      .fniv = gen_shl_ins_vec,
-      .load_dest = true,
-      .opt_opc = vecop_list_sli,
-      .vece = MO_32 },
-    { .fni8 = gen_shl64_ins_i64,
-      .fniv = gen_shl_ins_vec,
-      .prefer_i64 = TCG_TARGET_REG_BITS == 64,
-      .load_dest = true,
-      .opt_opc = vecop_list_sli,
-      .vece = MO_64 },
-};
+    /* tszimm encoding produces immediates in the range [0..esize-1]. */
+    tcg_debug_assert(shift >= 0);
+    tcg_debug_assert(shift < (8 << vece));
+
+    if (shift == 0) {
+        tcg_gen_gvec_mov(vece, rd_ofs, rm_ofs, opr_sz, max_sz);
+    } else {
+        tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
+    }
+}
 
 static void gen_mla8_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
 {
@@ -5451,20 +5479,14 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn)
                     }
                     /* Right shift comes here negative. */
                     shift = -shift;
-                    /* Shift out of range leaves destination unchanged. */
-                    if (shift < 8 << size) {
-                        tcg_gen_gvec_2i(rd_ofs, rm_ofs, vec_size, vec_size,
-                                        shift, &sri_op[size]);
-                    }
+                    arm_gen_gvec_sri(size, rd_ofs, rm_ofs, shift,
+                                     vec_size, vec_size);
                     return 0;
 
                 case 5: /* VSHL, VSLI */
                     if (u) { /* VSLI */
-                        /* Shift out of range leaves destination unchanged. */
-                        if (shift < 8 << size) {
-                            tcg_gen_gvec_2i(rd_ofs, rm_ofs, vec_size,
-                                            vec_size, shift, &sli_op[size]);
-                        }
+                        arm_gen_gvec_sli(size, rd_ofs, rm_ofs, shift,
+                                         vec_size, vec_size);
                     } else { /* VSHL */
                         /* Shifts larger than the element size are
                          * architecturally valid and results in zero.
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index c6a39c188e..27035a8a42 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -974,6 +974,44 @@ DO_RSRA(gvec_ursra_d, uint64_t) #undef DO_RSRA +#define DO_SRI(NAME, TYPE) \ +void HELPER(NAME)(void *vd, void *vn, uint32_t desc) \ +{ \ + intptr_t i, oprsz = simd_oprsz(desc); \ + int shift = simd_data(desc); \ + TYPE *d = vd, *n = vn; \ + for (i = 0; i < oprsz / sizeof(TYPE); i++) { \ + d[i] = deposit64(d[i], 0, sizeof(TYPE) * 8 - shift, n[i] >> shift); \ + } \ + clear_tail(d, oprsz, simd_maxsz(desc)); \ +} + +DO_SRI(gvec_sri_b, uint8_t) +DO_SRI(gvec_sri_h, uint16_t) +DO_SRI(gvec_sri_s, uint32_t) +DO_SRI(gvec_sri_d, uint64_t) + +#undef DO_SRI + +#define DO_SLI(NAME, TYPE) \ +void HELPER(NAME)(void *vd, void *vn, uint32_t desc) \ +{ \ + intptr_t i, oprsz = simd_oprsz(desc); \ + int shift = simd_data(desc); \ + TYPE *d = vd, *n = vn; \ + for (i = 0; i < oprsz / sizeof(TYPE); i++) { \ + d[i] = deposit64(d[i], shift, sizeof(TYPE) * 8 - shift, n[i]); \ + } \ + clear_tail(d, oprsz, simd_maxsz(desc)); \ +} + +DO_SLI(gvec_sli_b, uint8_t) +DO_SLI(gvec_sli_h, uint16_t) +DO_SLI(gvec_sli_s, uint32_t) +DO_SLI(gvec_sli_d, uint64_t) + +#undef DO_SLI + /* * Convert float16 to float32, raising no exceptions and * preserving exceptional values, including SNaN. 
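
For readers following the DO_SRI and DO_SLI helpers above, here is a
minimal standalone sketch of the per-element semantics, restricted to
8-bit elements. It is an illustration under stated assumptions, not
QEMU code: deposit_bits, sri8 and sli8 are hypothetical names, and
deposit_bits only approximates what deposit64 is used for in the
helpers. The shift == esize case for SRI is treated as "destination
unchanged", which is what arm_gen_gvec_sri arranges before any helper
is reached.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a deposit operation: replace `length` bits
 * of `value`, starting at bit `start`, with the low bits of `field`.
 */
static uint64_t deposit_bits(uint64_t value, int start, int length,
                             uint64_t field)
{
    assert(start >= 0 && length > 0 && length <= 64 - start);
    uint64_t mask = (~0ULL >> (64 - length)) << start;
    return (value & ~mask) | ((field << start) & mask);
}

/* SRI on one 8-bit element; the immediate is in the range [1..8]. */
static uint8_t sri8(uint8_t d, uint8_t n, int shift)
{
    assert(shift >= 1 && shift <= 8);
    if (shift == 8) {
        /* Shift of esize leaves the destination unchanged. */
        return d;
    }
    /* Keep the top `shift` bits of d, insert n >> shift below them. */
    return (uint8_t)deposit_bits(d, 0, 8 - shift, (uint64_t)n >> shift);
}

/* SLI on one 8-bit element; the immediate is in the range [0..7]. */
static uint8_t sli8(uint8_t d, uint8_t n, int shift)
{
    assert(shift >= 0 && shift <= 7);
    /* Keep the low `shift` bits of d, insert n << shift above them. */
    return (uint8_t)deposit_bits(d, shift, 8 - shift, n);
}
```

With these definitions, sri8(0xAB, 0xCD, 4) keeps the high nibble of the
destination and yields 0xAC, while sli8(0xAB, 0xCD, 4) keeps the low
nibble and yields 0xDB. A shift of 0 for SLI returns the source
unchanged, matching the tcg_gen_gvec_mov path in arm_gen_gvec_sli.
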

From patchwork Thu Mar 26 23:08:34 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 184962
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH 27/31] target/arm: Tidy handle_vec_simd_shri
Date: Thu, 26 Mar 2020 16:08:34 -0700
Message-Id: <20200326230838.31112-28-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20200326230838.31112-1-richard.henderson@linaro.org>
References: <20200326230838.31112-1-richard.henderson@linaro.org>
Cc: rajav@quicinc.com, qemu-arm@nongnu.org, apazos@quicinc.com

Now that we've converted all cases to gvec, there is quite a bit of
dead code at the end of the function. Remove it.

Sink the call to gen_gvec_fn2i to the end, loading a function
pointer within the switch statement.

Signed-off-by: Richard Henderson
---
 target/arm/translate-a64.c | 56 ++++++++++----------------------------
 1 file changed, 14 insertions(+), 42 deletions(-)

-- 
2.20.1

diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index f7d492cce4..fc156a217a 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -11096,16 +11096,7 @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u,
     int size = 32 - clz32(immh) - 1;
     int immhb = immh << 3 | immb;
     int shift = 2 * (8 << size) - immhb;
-    bool accumulate = false;
-    int dsize = is_q ? 128 : 64;
-    int esize = 8 << size;
-    int elements = dsize/esize;
-    MemOp memop = size | (is_u ? 0 : MO_SIGN);
-    TCGv_i64 tcg_rn = new_tmp_a64(s);
-    TCGv_i64 tcg_rd = new_tmp_a64(s);
-    TCGv_i64 tcg_round;
-    uint64_t round_const;
-    int i;
+    GVecGen2iFn *gvec_fn;

     if (extract32(immh, 3, 1) && !is_q) {
         unallocated_encoding(s);
@@ -11119,13 +11110,12 @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u,

     switch (opcode) {
     case 0x02: /* SSRA / USRA (accumulate) */
-        gen_gvec_fn2i(s, is_q, rd, rn, shift,
-                      is_u ? arm_gen_gvec_usra : arm_gen_gvec_ssra, size);
-        return;
+        gvec_fn = is_u ? arm_gen_gvec_usra : arm_gen_gvec_ssra;
+        break;

     case 0x08: /* SRI */
-        gen_gvec_fn2i(s, is_q, rd, rn, shift, arm_gen_gvec_sri, size);
-        return;
+        gvec_fn = arm_gen_gvec_sri;
+        break;

     case 0x00: /* SSHR / USHR */
         if (is_u) {
@@ -11133,49 +11123,31 @@ static void handle_vec_simd_shri(DisasContext *s, bool is_q, bool is_u,
                 /* Shift count the same size as element size produces zero. */
                 tcg_gen_gvec_dup8i(vec_full_reg_offset(s, rd),
                                    is_q ? 16 : 8, vec_full_reg_size(s), 0);
-            } else {
-                gen_gvec_fn2i(s, is_q, rd, rn, shift, tcg_gen_gvec_shri, size);
+                return;
             }
+            gvec_fn = tcg_gen_gvec_shri;
         } else {
             /* Shift count the same size as element size produces all sign. */
             if (shift == 8 << size) {
                 shift -= 1;
             }
-            gen_gvec_fn2i(s, is_q, rd, rn, shift, tcg_gen_gvec_sari, size);
+            gvec_fn = tcg_gen_gvec_sari;
         }
-        return;
+        break;

     case 0x04: /* SRSHR / URSHR (rounding) */
-        gen_gvec_fn2i(s, is_q, rd, rn, shift,
-                      is_u ? arm_gen_gvec_urshr : arm_gen_gvec_srshr, size);
-        return;
+        gvec_fn = is_u ? arm_gen_gvec_urshr : arm_gen_gvec_srshr;
+        break;

     case 0x06: /* SRSRA / URSRA (accum + rounding) */
-        gen_gvec_fn2i(s, is_q, rd, rn, shift,
-                      is_u ? arm_gen_gvec_ursra : arm_gen_gvec_srsra, size);
-        return;
+        gvec_fn = is_u ? arm_gen_gvec_ursra : arm_gen_gvec_srsra;
+        break;

     default:
         g_assert_not_reached();
     }

-    round_const = 1ULL << (shift - 1);
-    tcg_round = tcg_const_i64(round_const);
-
-    for (i = 0; i < elements; i++) {
-        read_vec_element(s, tcg_rn, rn, i, memop);
-        if (accumulate) {
-            read_vec_element(s, tcg_rd, rd, i, memop);
-        }
-
-        handle_shri_with_rndacc(tcg_rd, tcg_rn, tcg_round,
-                                accumulate, is_u, size, shift);
-
-        write_vec_element(s, tcg_rd, rd, i, size);
-    }
-    tcg_temp_free_i64(tcg_round);
-
-    clear_vec_high(s, is_q, rd);
+    gen_gvec_fn2i(s, is_q, rd, rn, shift, gvec_fn, size);
 }

 /* SHL/SLI - Vector shift left */

From patchwork Thu Mar 26 23:08:35 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 184958
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH 28/31] target/arm: Implement SVE2 bitwise shift and insert
Date: Thu, 26 Mar 2020 16:08:35 -0700
Message-Id: <20200326230838.31112-29-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20200326230838.31112-1-richard.henderson@linaro.org>
References: <20200326230838.31112-1-richard.henderson@linaro.org>
Cc: rajav@quicinc.com, qemu-arm@nongnu.org, apazos@quicinc.com

Signed-off-by: Richard Henderson
---
 target/arm/sve.decode      |  5 +++++
 target/arm/translate-sve.c | 10 ++++++++++
 2 files changed, 15 insertions(+)

-- 
2.20.1

diff --git a/target/arm/sve.decode b/target/arm/sve.decode
index 756f939df1..9bf66e8ad4 100644
--- a/target/arm/sve.decode
+++ b/target/arm/sve.decode
@@ -1261,3 +1261,8 @@ SSRA            01000101 .. 0 ..... 1110 00 ..... .....  @rd_rn_tszimm_shr
 USRA            01000101 .. 0 ..... 1110 01 ..... .....  @rd_rn_tszimm_shr
 SRSRA           01000101 .. 0 ..... 1110 10 ..... .....  @rd_rn_tszimm_shr
 URSRA           01000101 .. 0 ..... 1110 11 ..... .....  @rd_rn_tszimm_shr
+
+## SVE2 bitwise shift and insert
+
+SRI             01000101 .. 0 ..... 11110 0 ..... .....  @rd_rn_tszimm_shr
+SLI             01000101 .. 0 ..... 11110 1 ..... .....  @rd_rn_tszimm_shl
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index 1d1f55dfdd..7556cecfb3 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -6351,3 +6351,13 @@ static bool trans_URSRA(DisasContext *s, arg_rri_esz *a)
 {
     return do_sve2_fn2i(s, a, arm_gen_gvec_ursra);
 }
+
+static bool trans_SRI(DisasContext *s, arg_rri_esz *a)
+{
+    return do_sve2_fn2i(s, a, arm_gen_gvec_sri);
+}
+
+static bool trans_SLI(DisasContext *s, arg_rri_esz *a)
+{
+    return do_sve2_fn2i(s, a, arm_gen_gvec_sli);
+}

From patchwork Thu Mar 26 23:08:36 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 184968
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH 29/31] target/arm: Vectorize SABD/UABD
Date: Thu, 26 Mar 2020 16:08:36 -0700
Message-Id: <20200326230838.31112-30-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20200326230838.31112-1-richard.henderson@linaro.org>
References: <20200326230838.31112-1-richard.henderson@linaro.org>
Cc: rajav@quicinc.com, qemu-arm@nongnu.org, apazos@quicinc.com

Include 64-bit element size in preparation for SVE.

Signed-off-by: Richard Henderson
---
 target/arm/helper.h        |  10 +++
 target/arm/translate.h     |   5 ++
 target/arm/translate-a64.c |   8 ++-
 target/arm/translate.c     | 133 ++++++++++++++++++++++++++++++++++++-
 target/arm/vec_helper.c    |  88 ++++++++++++++++++++++++
 5 files changed, 240 insertions(+), 4 deletions(-)

-- 
2.20.1

diff --git a/target/arm/helper.h b/target/arm/helper.h
index 5ef7bb158f..97ccbd70c6 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -748,6 +748,16 @@ DEF_HELPER_FLAGS_3(gvec_sli_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 DEF_HELPER_FLAGS_3(gvec_sli_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 DEF_HELPER_FLAGS_3(gvec_sli_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32)

+DEF_HELPER_FLAGS_4(gvec_sabd_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_sabd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_sabd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_sabd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_4(gvec_uabd_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_uabd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_uabd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(gvec_uabd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+
 #ifdef TARGET_AARCH64
 #include "helper-a64.h"
 #include "helper-sve.h"
diff --git a/target/arm/translate.h b/target/arm/translate.h
index 843ecc1472..c453aa1c47 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -311,6 +311,11 @@ void arm_gen_gvec_sri(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
 void arm_gen_gvec_sli(unsigned vece, uint32_t rd_ofs, uint32_t rm_ofs,
                       int64_t shift, uint32_t opr_sz, uint32_t max_sz);

+void arm_gen_gvec_sabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                       uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
+void arm_gen_gvec_uabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                       uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz);
+
 /*
  * Forward to the isar_feature_* tests given a DisasContext pointer.
  */
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index fc156a217a..1791c26a39 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -12159,6 +12159,13 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
             gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_smin, size);
         }
         return;
+    case 0xe: /* SABD, UABD */
+        if (u) {
+            gen_gvec_fn3(s, is_q, rd, rn, rm, arm_gen_gvec_uabd, size);
+        } else {
+            gen_gvec_fn3(s, is_q, rd, rn, rm, arm_gen_gvec_sabd, size);
+        }
+        return;
     case 0x10: /* ADD, SUB */
         if (u) {
             gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_sub, size);
@@ -12291,7 +12298,6 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn)
                 genenvfn = fns[size][u];
                 break;
             }
-            case 0xe: /* SABD, UABD */
             case 0xf: /* SABA, UABA */
             {
                 static NeonGenTwoOpFn * const fns[3][2] = {
diff --git a/target/arm/translate.c b/target/arm/translate.c
index bb6db53598..a29868976a 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -4849,6 +4849,126 @@ const GVecGen4 sqsub_op[4] = {
       .vece = MO_64 },
 };

+static void gen_sabd_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
+{
+    TCGv_i32 t = tcg_temp_new_i32();
+
+    tcg_gen_sub_i32(t, a, b);
+    tcg_gen_sub_i32(d, b, a);
+    tcg_gen_movcond_i32(TCG_COND_LT, d, a, b, d, t);
+    tcg_temp_free_i32(t);
+}
+
+static void gen_sabd_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
+{
+    TCGv_i64 t = tcg_temp_new_i64();
+
+    tcg_gen_sub_i64(t, a, b);
+    tcg_gen_sub_i64(d, b, a);
+    tcg_gen_movcond_i64(TCG_COND_LT, d, a, b, d, t);
+    tcg_temp_free_i64(t);
+}
+
+static void gen_sabd_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
+{
+    TCGv_vec t = tcg_temp_new_vec_matching(d);
+
+    tcg_gen_smin_vec(vece, t, a, b);
+    tcg_gen_smax_vec(vece, d, a, b);
+    tcg_gen_sub_vec(vece, d, d, t);
+    tcg_temp_free_vec(t);
+}
+
+void arm_gen_gvec_sabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                       uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
+{
+    static const TCGOpcode vecop_list[] = {
+        INDEX_op_sub_vec, INDEX_op_smin_vec, INDEX_op_smax_vec, 0
+    };
+    static const GVecGen3 ops[4] = {
+        { .fniv = gen_sabd_vec,
+          .fno = gen_helper_gvec_sabd_b,
+          .opt_opc = vecop_list,
+          .vece = MO_8 },
+        { .fniv = gen_sabd_vec,
+          .fno = gen_helper_gvec_sabd_h,
+          .opt_opc = vecop_list,
+          .vece = MO_16 },
+        { .fni4 = gen_sabd_i32,
+          .fniv = gen_sabd_vec,
+          .fno = gen_helper_gvec_sabd_s,
+          .opt_opc = vecop_list,
+          .vece = MO_32 },
+        { .fni8 = gen_sabd_i64,
+          .fniv = gen_sabd_vec,
+          .fno = gen_helper_gvec_sabd_d,
+          .prefer_i64 = TCG_TARGET_REG_BITS == 64,
+          .opt_opc = vecop_list,
+          .vece = MO_64 },
+    };
+    tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]);
+}
+
+static void gen_uabd_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b)
+{
+    TCGv_i32 t = tcg_temp_new_i32();
+
+    tcg_gen_sub_i32(t, a, b);
+    tcg_gen_sub_i32(d, b, a);
+    tcg_gen_movcond_i32(TCG_COND_LTU, d, a, b, d, t);
+    tcg_temp_free_i32(t);
+}
+
+static void gen_uabd_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b)
+{
+    TCGv_i64 t = tcg_temp_new_i64();
+
+    tcg_gen_sub_i64(t, a, b);
+    tcg_gen_sub_i64(d, b, a);
+    tcg_gen_movcond_i64(TCG_COND_LTU, d, a, b, d, t);
+    tcg_temp_free_i64(t);
+}
+
+static void gen_uabd_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
+{
+    TCGv_vec t = tcg_temp_new_vec_matching(d);
+
+    tcg_gen_umin_vec(vece, t, a, b);
+    tcg_gen_umax_vec(vece, d, a, b);
+    tcg_gen_sub_vec(vece, d, d, t);
+    tcg_temp_free_vec(t);
+}
+
+void arm_gen_gvec_uabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
+                       uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz)
+{
+    static const TCGOpcode vecop_list[] = {
+        INDEX_op_sub_vec, INDEX_op_umin_vec, INDEX_op_umax_vec, 0
+    };
+    static const GVecGen3 ops[4] = {
+        { .fniv = gen_uabd_vec,
+          .fno = gen_helper_gvec_uabd_b,
+          .opt_opc = vecop_list,
+          .vece = MO_8 },
+        { .fniv = gen_uabd_vec,
+          .fno = gen_helper_gvec_uabd_h,
+          .opt_opc = vecop_list,
+          .vece = MO_16 },
+        { .fni4 = gen_uabd_i32,
+          .fniv = gen_uabd_vec,
+          .fno =
gen_helper_gvec_uabd_s, + .opt_opc = vecop_list, + .vece = MO_32 }, + { .fni8 = gen_uabd_i64, + .fniv = gen_uabd_vec, + .fno = gen_helper_gvec_uabd_d, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + .opt_opc = vecop_list, + .vece = MO_64 }, + }; + tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); +} + /* Translate a NEON data processing instruction. Return nonzero if the instruction is invalid. We process data in a mixture of 32-bit and 64-bit chunks. @@ -5107,6 +5227,16 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn) tcg_gen_gvec_3(rd_ofs, rm_ofs, rn_ofs, vec_size, vec_size, u ? &ushl_op[size] : &sshl_op[size]); return 0; + + case NEON_3R_VABD: + if (u) { + arm_gen_gvec_uabd(size, rd_ofs, rn_ofs, rm_ofs, + vec_size, vec_size); + } else { + arm_gen_gvec_sabd(size, rd_ofs, rn_ofs, rm_ofs, + vec_size, vec_size); + } + return 0; } if (size == 3) { @@ -5237,9 +5367,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn) case NEON_3R_VQRSHL: GEN_NEON_INTEGER_OP_ENV(qrshl); break; - case NEON_3R_VABD: - GEN_NEON_INTEGER_OP(abd); - break; case NEON_3R_VABA: GEN_NEON_INTEGER_OP(abd); tcg_temp_free_i32(tmp2); diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index 27035a8a42..e0694c16f4 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -1492,3 +1492,91 @@ void HELPER(gvec_umulh_d)(void *vd, void *vn, void *vm, uint32_t desc) } clear_tail(d, opr_sz, simd_maxsz(desc)); } + +void HELPER(gvec_sabd_b)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + int8_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz; ++i) { + d[i] = n[i] < m[i] ? m[i] - n[i] : n[i] - m[i]; + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(gvec_sabd_h)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + int16_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz / 2; ++i) { + d[i] = n[i] < m[i] ? 
m[i] - n[i] : n[i] - m[i]; + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(gvec_sabd_s)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + int32_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz / 4; ++i) { + d[i] = n[i] < m[i] ? m[i] - n[i] : n[i] - m[i]; + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(gvec_sabd_d)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + int64_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz / 8; ++i) { + d[i] = n[i] < m[i] ? m[i] - n[i] : n[i] - m[i]; + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(gvec_uabd_b)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + uint8_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz; ++i) { + d[i] = n[i] < m[i] ? m[i] - n[i] : n[i] - m[i]; + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(gvec_uabd_h)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + uint16_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz / 2; ++i) { + d[i] = n[i] < m[i] ? m[i] - n[i] : n[i] - m[i]; + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(gvec_uabd_s)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + uint32_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz / 4; ++i) { + d[i] = n[i] < m[i] ? m[i] - n[i] : n[i] - m[i]; + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(gvec_uabd_d)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + uint64_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz / 8; ++i) { + d[i] = n[i] < m[i] ? 
m[i] - n[i] : n[i] - m[i]; + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +}
From patchwork Thu Mar 26 23:08:37 2020
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 184969
From: Richard Henderson
To: qemu-devel@nongnu.org
Cc: rajav@quicinc.com, qemu-arm@nongnu.org, apazos@quicinc.com
Subject: [PATCH 30/31] target/arm: Vectorize SABA/UABA
Date: Thu, 26 Mar 2020 16:08:37 -0700
Message-Id: <20200326230838.31112-31-richard.henderson@linaro.org>
In-Reply-To: <20200326230838.31112-1-richard.henderson@linaro.org>
References: <20200326230838.31112-1-richard.henderson@linaro.org>

Include 64-bit element size in preparation for SVE.

Signed-off-by: Richard Henderson
--- target/arm/helper.h | 17 +++-- target/arm/translate.h | 5 ++ target/arm/neon_helper.c | 10 --- target/arm/translate-a64.c | 17 ++--- target/arm/translate.c | 134 +++++++++++++++++++++++++++++++++++-- target/arm/vec_helper.c | 88 ++++++++++++++++++++++++ 6 files changed, 238 insertions(+), 33 deletions(-) -- 2.20.1 diff --git a/target/arm/helper.h b/target/arm/helper.h index 97ccbd70c6..5cf6a5b4a0 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -299,13 +299,6 @@ DEF_HELPER_2(neon_pmax_s8, i32, i32, i32) DEF_HELPER_2(neon_pmax_u16, i32, i32, i32) DEF_HELPER_2(neon_pmax_s16, i32, i32, i32) -DEF_HELPER_2(neon_abd_u8, i32, i32, i32) -DEF_HELPER_2(neon_abd_s8, i32, i32, i32) -DEF_HELPER_2(neon_abd_u16, i32, i32, i32) -DEF_HELPER_2(neon_abd_s16, i32, i32, i32) -DEF_HELPER_2(neon_abd_u32, i32, i32, i32) -DEF_HELPER_2(neon_abd_s32, i32, i32, i32) - DEF_HELPER_2(neon_shl_u16, i32, i32, i32) DEF_HELPER_2(neon_shl_s16, i32, i32, i32) DEF_HELPER_2(neon_rshl_u8, i32, i32, i32) @@ -758,6 +751,16 @@ DEF_HELPER_FLAGS_4(gvec_uabd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(gvec_uabd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(gvec_uabd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_saba_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_saba_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_saba_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_saba_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +
+DEF_HELPER_FLAGS_4(gvec_uaba_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_uaba_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_uaba_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_uaba_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + #ifdef TARGET_AARCH64 #include "helper-a64.h" #include "helper-sve.h" diff --git a/target/arm/translate.h b/target/arm/translate.h index c453aa1c47..0df7ce51b2 100644 --- a/target/arm/translate.h +++ b/target/arm/translate.h @@ -316,6 +316,11 @@ void arm_gen_gvec_sabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, void arm_gen_gvec_uabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); +void arm_gen_gvec_saba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); +void arm_gen_gvec_uaba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); + /* * Forward to the isar_feature_* tests given a DisasContext pointer. */ diff --git a/target/arm/neon_helper.c b/target/arm/neon_helper.c index e6481a5764..4c1cf1e031 100644 --- a/target/arm/neon_helper.c +++ b/target/arm/neon_helper.c @@ -595,16 +595,6 @@ NEON_POP(pmax_s16, neon_s16, 2) NEON_POP(pmax_u16, neon_u16, 2) #undef NEON_FN -#define NEON_FN(dest, src1, src2) \ - dest = (src1 > src2) ? 
(src1 - src2) : (src2 - src1) -NEON_VOP(abd_s8, neon_s8, 4) -NEON_VOP(abd_u8, neon_u8, 4) -NEON_VOP(abd_s16, neon_s16, 2) -NEON_VOP(abd_u16, neon_u16, 2) -NEON_VOP(abd_s32, neon_s32, 1) -NEON_VOP(abd_u32, neon_u32, 1) -#undef NEON_FN - #define NEON_FN(dest, src1, src2) \ (dest = do_uqrshl_bhs(src1, src2, 16, false, NULL)) NEON_VOP(shl_u16, neon_u16, 2) diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c index 1791c26a39..d830a58c3f 100644 --- a/target/arm/translate-a64.c +++ b/target/arm/translate-a64.c @@ -12166,6 +12166,13 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn) gen_gvec_fn3(s, is_q, rd, rn, rm, arm_gen_gvec_sabd, size); } return; + case 0xf: /* SABA, UABA */ + if (u) { + gen_gvec_fn3(s, is_q, rd, rn, rm, arm_gen_gvec_uaba, size); + } else { + gen_gvec_fn3(s, is_q, rd, rn, rm, arm_gen_gvec_saba, size); + } + return; case 0x10: /* ADD, SUB */ if (u) { gen_gvec_fn3(s, is_q, rd, rn, rm, tcg_gen_gvec_sub, size); @@ -12298,16 +12305,6 @@ static void disas_simd_3same_int(DisasContext *s, uint32_t insn) genenvfn = fns[size][u]; break; } - case 0xf: /* SABA, UABA */ - { - static NeonGenTwoOpFn * const fns[3][2] = { - { gen_helper_neon_abd_s8, gen_helper_neon_abd_u8 }, - { gen_helper_neon_abd_s16, gen_helper_neon_abd_u16 }, - { gen_helper_neon_abd_s32, gen_helper_neon_abd_u32 }, - }; - genfn = fns[size][u]; - break; - } case 0x16: /* SQDMULH, SQRDMULH */ { static NeonGenTwoOpEnvFn * const fns[2][2] = { diff --git a/target/arm/translate.c b/target/arm/translate.c index a29868976a..4491ab0eb0 100644 --- a/target/arm/translate.c +++ b/target/arm/translate.c @@ -4969,6 +4969,124 @@ void arm_gen_gvec_uabd(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); } +static void gen_saba_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b) +{ + TCGv_i32 t = tcg_temp_new_i32(); + gen_sabd_i32(t, a, b); + tcg_gen_add_i32(d, d, t); + tcg_temp_free_i32(t); +} + +static void 
gen_saba_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b) +{ + TCGv_i64 t = tcg_temp_new_i64(); + gen_sabd_i64(t, a, b); + tcg_gen_add_i64(d, d, t); + tcg_temp_free_i64(t); +} + +static void gen_saba_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b) +{ + TCGv_vec t = tcg_temp_new_vec_matching(d); + gen_sabd_vec(vece, t, a, b); + tcg_gen_add_vec(vece, d, d, t); + tcg_temp_free_vec(t); +} + +void arm_gen_gvec_saba(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_sub_vec, INDEX_op_add_vec, + INDEX_op_smin_vec, INDEX_op_smax_vec, 0 + }; + static const GVecGen3 ops[4] = { + { .fniv = gen_saba_vec, + .fno = gen_helper_gvec_saba_b, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_8 }, + { .fniv = gen_saba_vec, + .fno = gen_helper_gvec_saba_h, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_16 }, + { .fni4 = gen_saba_i32, + .fniv = gen_saba_vec, + .fno = gen_helper_gvec_saba_s, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_32 }, + { .fni8 = gen_saba_i64, + .fniv = gen_saba_vec, + .fno = gen_helper_gvec_saba_d, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_64 }, + }; + tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); +} + +static void gen_uaba_i32(TCGv_i32 d, TCGv_i32 a, TCGv_i32 b) +{ + TCGv_i32 t = tcg_temp_new_i32(); + gen_uabd_i32(t, a, b); + tcg_gen_add_i32(d, d, t); + tcg_temp_free_i32(t); +} + +static void gen_uaba_i64(TCGv_i64 d, TCGv_i64 a, TCGv_i64 b) +{ + TCGv_i64 t = tcg_temp_new_i64(); + gen_uabd_i64(t, a, b); + tcg_gen_add_i64(d, d, t); + tcg_temp_free_i64(t); +} + +static void gen_uaba_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b) +{ + TCGv_vec t = tcg_temp_new_vec_matching(d); + gen_uabd_vec(vece, t, a, b); + tcg_gen_add_vec(vece, d, d, t); + tcg_temp_free_vec(t); +} + +void arm_gen_gvec_uaba(unsigned vece, uint32_t rd_ofs, uint32_t 
rn_ofs, + uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop_list[] = { + INDEX_op_sub_vec, INDEX_op_add_vec, + INDEX_op_umin_vec, INDEX_op_umax_vec, 0 + }; + static const GVecGen3 ops[4] = { + { .fniv = gen_uaba_vec, + .fno = gen_helper_gvec_uaba_b, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_8 }, + { .fniv = gen_uaba_vec, + .fno = gen_helper_gvec_uaba_h, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_16 }, + { .fni4 = gen_uaba_i32, + .fniv = gen_uaba_vec, + .fno = gen_helper_gvec_uaba_s, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_32 }, + { .fni8 = gen_uaba_i64, + .fniv = gen_uaba_vec, + .fno = gen_helper_gvec_uaba_d, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + .opt_opc = vecop_list, + .load_dest = true, + .vece = MO_64 }, + }; + tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, &ops[vece]); +} + /* Translate a NEON data processing instruction. Return nonzero if the instruction is invalid. We process data in a mixture of 32-bit and 64-bit chunks. 
@@ -5237,6 +5355,16 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn) vec_size, vec_size); } return 0; + + case NEON_3R_VABA: + if (u) { + arm_gen_gvec_uaba(size, rd_ofs, rn_ofs, rm_ofs, + vec_size, vec_size); + } else { + arm_gen_gvec_saba(size, rd_ofs, rn_ofs, rm_ofs, + vec_size, vec_size); + } + return 0; } if (size == 3) { @@ -5367,12 +5495,6 @@ static int disas_neon_data_insn(DisasContext *s, uint32_t insn) case NEON_3R_VQRSHL: GEN_NEON_INTEGER_OP_ENV(qrshl); break; - case NEON_3R_VABA: - GEN_NEON_INTEGER_OP(abd); - tcg_temp_free_i32(tmp2); - tmp2 = neon_load_reg(rd, pass); - gen_neon_add(size, tmp, tmp2); - break; case NEON_3R_VPMAX: GEN_NEON_INTEGER_OP(pmax); break; diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index e0694c16f4..cbd0382c71 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -1580,3 +1580,91 @@ void HELPER(gvec_uabd_d)(void *vd, void *vn, void *vm, uint32_t desc) } clear_tail(d, opr_sz, simd_maxsz(desc)); } + +void HELPER(gvec_saba_b)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + int8_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz; ++i) { + d[i] += n[i] < m[i] ? m[i] - n[i] : n[i] - m[i]; + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(gvec_saba_h)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + int16_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz / 2; ++i) { + d[i] += n[i] < m[i] ? m[i] - n[i] : n[i] - m[i]; + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(gvec_saba_s)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + int32_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz / 4; ++i) { + d[i] += n[i] < m[i] ? 
m[i] - n[i] : n[i] - m[i]; + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(gvec_saba_d)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + int64_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz / 8; ++i) { + d[i] += n[i] < m[i] ? m[i] - n[i] : n[i] - m[i]; + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(gvec_uaba_b)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + uint8_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz; ++i) { + d[i] += n[i] < m[i] ? m[i] - n[i] : n[i] - m[i]; + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(gvec_uaba_h)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + uint16_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz / 2; ++i) { + d[i] += n[i] < m[i] ? m[i] - n[i] : n[i] - m[i]; + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(gvec_uaba_s)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + uint32_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz / 4; ++i) { + d[i] += n[i] < m[i] ? m[i] - n[i] : n[i] - m[i]; + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(gvec_uaba_d)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + uint64_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz / 8; ++i) { + d[i] += n[i] < m[i] ? 
m[i] - n[i] : n[i] - m[i]; + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +}
From patchwork Thu Mar 26 23:08:38 2020
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 184961
From: Richard Henderson
To: qemu-devel@nongnu.org
Cc: rajav@quicinc.com, qemu-arm@nongnu.org, apazos@quicinc.com
Subject: [PATCH 31/31] target/arm: Implement SVE2 integer absolute difference and accumulate
Date: Thu, 26 Mar 2020 16:08:38 -0700
Message-Id: <20200326230838.31112-32-richard.henderson@linaro.org>
In-Reply-To: <20200326230838.31112-1-richard.henderson@linaro.org>
References: <20200326230838.31112-1-richard.henderson@linaro.org>

Signed-off-by: Richard Henderson
--- target/arm/sve.decode | 6 ++++++ target/arm/translate-sve.c | 25 +++++++++++++++++++++++++ 2 files changed, 31 insertions(+) -- 2.20.1 diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 9bf66e8ad4..6d565912e3 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1266,3 +1266,9 @@ URSRA 01000101 .. 0 ..... 1110 11 ..... ..... @rd_rn_tszimm_shr SRI 01000101 .. 0 ..... 11110 0 ..... ..... @rd_rn_tszimm_shr SLI 01000101 .. 0 ..... 11110 1 ..... ..... @rd_rn_tszimm_shl + +## SVE2 integer absolute difference and accumulate + +# TODO: Use @rda and %reg_movprfx here. +SABA 01000101 .. 0 ..... 11111 0 ..... ..... @rd_rn_rm +UABA 01000101 .. 0 ..... 11111 1 ..... ..... 
@rd_rn_rm diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 7556cecfb3..42ef031b77 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -6361,3 +6361,28 @@ static bool trans_SLI(DisasContext *s, arg_rri_esz *a) { return do_sve2_fn2i(s, a, arm_gen_gvec_sli); } + +static bool do_sve2_fn3(DisasContext *s, arg_rrr_esz *a, GVecGen3Fn *fn) +{ + if (!dc_isar_feature(aa64_sve2, s)) { + return false; + } + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + unsigned rd_ofs = vec_full_reg_offset(s, a->rd); + unsigned rn_ofs = vec_full_reg_offset(s, a->rn); + unsigned rm_ofs = vec_full_reg_offset(s, a->rm); + fn(a->esz, rd_ofs, rn_ofs, rm_ofs, vsz, vsz); + } + return true; +} + +static bool trans_SABA(DisasContext *s, arg_rrr_esz *a) +{ + return do_sve2_fn3(s, a, arm_gen_gvec_saba); +} + +static bool trans_UABA(DisasContext *s, arg_rrr_esz *a) +{ + return do_sve2_fn3(s, a, arm_gen_gvec_uaba); +}
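
Note for readers of the series: the patches above compute the absolute difference in two ways, a compare-and-select form (the out-of-line gvec_*abd_* helpers and the movcond-based integer expanders) and a max-minus-min form (the vector expanders), and the accumulate variants then add the result into the destination. The following is only a minimal scalar sketch of that equivalence, fixed to 8-bit elements. Every function name in it is hypothetical and is not part of QEMU. As an assumption of this sketch, the signed result is returned as its unsigned bit pattern so that the narrowing stays well defined in portable C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative names only; none of these are QEMU functions. */

/* Unsigned absolute difference, compare-and-select form. */
static uint8_t uabd8_select(uint8_t n, uint8_t m)
{
    return (uint8_t)(n < m ? m - n : n - m);
}

/* Unsigned absolute difference, max-minus-min form. */
static uint8_t uabd8_minmax(uint8_t n, uint8_t m)
{
    uint8_t lo = n < m ? n : m;
    uint8_t hi = n < m ? m : n;
    return (uint8_t)(hi - lo);
}

/* Signed absolute difference, select form; result is the 8-bit pattern. */
static uint8_t sabd8_select(int8_t n, int8_t m)
{
    return (uint8_t)(n < m ? m - n : n - m);
}

/* Signed absolute difference, max-minus-min form. */
static uint8_t sabd8_minmax(int8_t n, int8_t m)
{
    int8_t lo = n < m ? n : m;
    int8_t hi = n < m ? m : n;
    return (uint8_t)(hi - lo);
}

/* Returns 1 if both forms agree for all 65536 operand pairs, else 0. */
static int abd8_forms_agree(void)
{
    for (int a = -128; a <= 127; a++) {
        for (int b = -128; b <= 127; b++) {
            uint8_t ua = (uint8_t)a, ub = (uint8_t)b;
            if (uabd8_select(ua, ub) != uabd8_minmax(ua, ub)) {
                return 0;
            }
            if (sabd8_select((int8_t)a, (int8_t)b) !=
                sabd8_minmax((int8_t)a, (int8_t)b)) {
                return 0;
            }
        }
    }
    return 1;
}

/* Accumulate form: d[i] += |n[i] - m[i]|, wrapping modulo 256. */
static void saba8(uint8_t *d, const int8_t *n, const int8_t *m, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        d[i] = (uint8_t)(d[i] + sabd8_select(n[i], m[i]));
    }
}

/* Small self-check that can be called from any test harness. */
static void abd8_selfcheck(void)
{
    assert(abd8_forms_agree());
    assert(sabd8_select(-128, 127) == 255);
}
```

Calling abd8_selfcheck() walks every operand pair, which is cheap at this element width. The wider element sizes in the patches follow the same pattern but are not covered by this sketch.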