From patchwork Fri Sep 18 18:36:31 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 305032 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.5 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id BA4C0C43463 for ; Fri, 18 Sep 2020 18:43:04 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 20C8F21D7F for ; Fri, 18 Sep 2020 18:43:04 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="ebBXjqUh" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 20C8F21D7F Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:42280 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJLLb-0004Qt-2m for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 14:43:03 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:47718) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLGi-0007JB-5d for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:00 -0400 Received: from mail-pj1-x1043.google.com ([2607:f8b0:4864:20::1043]:33125) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLGf-00072b-Gt for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:37:59 -0400 Received: by mail-pj1-x1043.google.com with SMTP id md22so4603364pjb.0 for ; Fri, 18 Sep 2020 11:37:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=fBhW7XDEMls1H/wfvRiEP/0AEEm3Vshw5FyVOPuU9nc=; b=ebBXjqUh9BnzS7oUs2hKkTxk0uigH5tQvciOH4Tqrn+KUKkHqRR19oPw3RJALIbE6I dvyd+DMBRqK39QoCo2Td0T3bysIt3qRBXOQlK48IXyiUo2jvWIsC7ulC+LXL+pBTIfON drY5ccboJaMTmK5/LzALak7ogqFCIGhnUNkt94tX39VDZecYdNWiEJxr8cblSU/yR1te nva6Bf2lYFskurgUTGNvTqJa5u9OhjtVubVwptks0cfQo+Dm0FHIipGCVz1MTsvnCtfg oqVT0g5keUO0S6SoC4EkCJbn9TpTScTHh3O7ZQ4erGXjrRtRj97HlpAn5qDWkxp3CGso d4WA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=fBhW7XDEMls1H/wfvRiEP/0AEEm3Vshw5FyVOPuU9nc=; b=hcj0VIn6UhD0c5mwuSn+ZVPiQReFZ+kT4pXved0I7FFFw+BbEzjZV+oiBUATBeELE6 VIsKxPHjJSxzLueg2kvGYHhzuDqW41gIIr1rCJTmjwvlXdTAalMecxmNdHDz0RuKWmxq hDZvHEbPG6FccyMxzynQNiPptbEy3fHDkeP8JYQB0J6LF9vNroktsPRSKpynkroO/m82 biTPNuIsrSu3Pl0fQ8yt/E1ixVlvYqddlAWTc58Jb+wdJ5+mNZIs0JChsAeJ9xC3JvVu 1DT57bBl8d0ZCpTXOsu2mpB8v4RBla0uf2Z9x9XaFO5kAiue01cHXONUP4Mwo1csbDUv RUWA== X-Gm-Message-State: AOAM533T2S5tG6ZydR2wnKhq49oAIDfUjUwB04B/C9ZfG0Mma3UKCB11 1UCKOp7O2WF8AJYWHWn61Pd2rTSWYlGnXQ== X-Google-Smtp-Source: ABdhPJwivMRS6LU1K56Wq9cBT4YMxwYgQSsMgf0tD/nlukG7cNBXezI2W7c9h37Mh+8AMzwCVnRKpg== X-Received: by 2002:a17:902:d711:b029:d1:c6b5:ae5f with SMTP id w17-20020a170902d711b02900d1c6b5ae5fmr25678639ply.38.1600454275153; Fri, 18 Sep 2020 11:37:55 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.37.54 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:37:54 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 01/81] target/arm: Fix sve_uzp_p vs odd vector lengths Date: Fri, 18 Sep 2020 11:36:31 -0700 Message-Id: <20200918183751.2787647-2-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::1043; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x1043.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Laurent Desnogues , peter.maydell@linaro.org, alex.bennee@linaro.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Missed out on compressing the second half of a predicate with length vl % 512 > 256. Adjust all of the x + (y << s) to x | (y << s) as a general style fix. Drop the extract64 because the input uint64_t are known to be already zero-extended from the current size of the predicate. Reported-by: Laurent Desnogues Signed-off-by: Richard Henderson --- target/arm/sve_helper.c | 30 +++++++++++++++++++++--------- 1 file changed, 21 insertions(+), 9 deletions(-) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 4758d46f34..fcb46f150f 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1938,7 +1938,7 @@ void HELPER(sve_uzp_p)(void *vd, void *vn, void *vm, uint32_t pred_desc) if (oprsz <= 8) { l = compress_bits(n[0] >> odd, esz); h = compress_bits(m[0] >> odd, esz); - d[0] = extract64(l + (h << (4 * oprsz)), 0, 8 * oprsz); + d[0] = l | (h << (4 * oprsz)); } else { ARMPredicateReg tmp_m; intptr_t oprsz_16 = oprsz / 16; @@ -1952,23 +1952,35 @@ void HELPER(sve_uzp_p)(void *vd, void *vn, void *vm, uint32_t pred_desc) h = n[2 * i + 1]; l = compress_bits(l >> odd, esz); h = compress_bits(h >> odd, esz); - d[i] = l + (h << 32); + d[i] = l | (h << 32); } - /* For VL which is not a power of 2, the results from M do not - align nicely with the uint64_t for D. Put the aligned results - from M into TMP_M and then copy it into place afterward. */ + /* + * For VL which is not a multiple of 512, the results from M do not + * align nicely with the uint64_t for D. Put the aligned results + * from M into TMP_M and then copy it into place afterward. + */ if (oprsz & 15) { - d[i] = compress_bits(n[2 * i] >> odd, esz); + int final_shift = (oprsz & 15) * 2; + + l = n[2 * i + 0]; + h = n[2 * i + 1]; + l = compress_bits(l >> odd, esz); + h = compress_bits(h >> odd, esz); + d[i] = l | (h << final_shift); for (i = 0; i < oprsz_16; i++) { l = m[2 * i + 0]; h = m[2 * i + 1]; l = compress_bits(l >> odd, esz); h = compress_bits(h >> odd, esz); - tmp_m.p[i] = l + (h << 32); + tmp_m.p[i] = l | (h << 32); } - tmp_m.p[i] = compress_bits(m[2 * i] >> odd, esz); + l = m[2 * i + 0]; + h = m[2 * i + 1]; + l = compress_bits(l >> odd, esz); + h = compress_bits(h >> odd, esz); + tmp_m.p[i] = l | (h << final_shift); swap_memmove(vd + oprsz / 2, &tmp_m, oprsz / 2); } else { @@ -1977,7 +1989,7 @@ void HELPER(sve_uzp_p)(void *vd, void *vn, void *vm, uint32_t pred_desc) h = m[2 * i + 1]; l = compress_bits(l >> odd, esz); h = compress_bits(h >> odd, esz); - d[oprsz_16 + i] = l + (h << 32); + d[oprsz_16 + i] = l | (h << 32); } } } From patchwork Fri Sep 18 18:36:32 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 273361 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.5 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8B66DC43465 for ; Fri, 18 Sep 2020 18:39:59 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id CC61E21D7F for ; Fri, 18 Sep 2020 18:39:58 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="X6yp7BoK" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org CC61E21D7F Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:33930 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJLIb-0000k8-PT for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 14:39:57 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:47740) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLGj-0007MR-ND for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:01 -0400 Received: from mail-pl1-x644.google.com ([2607:f8b0:4864:20::644]:36617) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLGi-00072k-0b for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:01 -0400 Received: by mail-pl1-x644.google.com with SMTP id k13so3442041plk.3 for ; Fri, 18 Sep 2020 11:37:59 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=rRP60tIXgXyDjfRHhOpTpwQ4z3K7JPkhoQFnyY4gDHg=; b=X6yp7BoK7Iom3GOnwch1Rsj7fljWGvc7tch+klKdqPu+sghEKLxq9lpGz5UZ/QGgYs GTfJM2ep0u1ug5zvIOckbot0H+KaYy3slynYwkFLSXSqwmcm4zNVRUc2fIjGtWQYPfbv ublJxte5uRQTPQXyKaCfrvemQaqigbjjRWGhgtR0B7nhZOKJ+IT5pBnFGMS3/h5I7Ydw OysE6L3FcFZEk8/HONnq6dslgGGEuC0/WEhWrzvsO07aMhQ4JWEfZ5GSGIsXpEMtr6zh JotPkHhbrBm66stBwchsaJg+4xk0NwVAjkF3jkrKVIgn7iBr1Ke0aSEnOdtncGz0vyfH spVw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=rRP60tIXgXyDjfRHhOpTpwQ4z3K7JPkhoQFnyY4gDHg=; b=N3ISMWd8b8RbFtvKy+5qJpHnk4E+e86mRf6Cge7FD0wdBDUHXoVsVWsatew/WVZBJW qun5nFdMofTNrR0ZzIXDFKk6iBJI7do0HpOzsLHNL2WTy28razLiW9y9GFFXerKlT0Bw BQnZb1z18lD9RuCrCgGVNIA1fINTJxtEf0A4LRpayCQ8FGDQxeK8o3kAyWo3rZjc1fgb MGikALuKpJY1R1xB2Uq5bRG8bWx1fKb6B+JIsSEHBj2QZjEg2DYX1koYvWz3ZlXCgyEy FvQK/QZU0IQk9CHBYH8wS4fnvLNzCGCFgG/tth6WVWEpIieEKI1ZBxrvCCtnlRvBc+Qz SuQQ== X-Gm-Message-State: AOAM533KRBYD6ADZE3QEdE7zbv1HTbHdLQuRu8siyPNBUaS8Ol6X1EVr VXbEHK85xA4SQ4uJdxl7ZoOGYFh8Nrd9Jw== X-Google-Smtp-Source: ABdhPJxR1oYdOEKRTmfHQkUrJTsYp4jSTQ58ckutG0XGsOoz/Nld3I3RLOoxvJIz/axJRNkMk7sGOA== X-Received: by 2002:a17:90b:f83:: with SMTP id ft3mr14069235pjb.234.1600454276540; Fri, 18 Sep 2020 11:37:56 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.37.55 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:37:55 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 02/81] target/arm: Fix sve_zip_p vs odd vector lengths Date: Fri, 18 Sep 2020 11:36:32 -0700 Message-Id: <20200918183751.2787647-3-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::644; envelope-from=richard.henderson@linaro.org; helo=mail-pl1-x644.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Laurent Desnogues , peter.maydell@linaro.org, alex.bennee@linaro.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Wrote too much with low-half zip (zip1) with vl % 512 != 0. Adjust all of the x + (y << s) to x | (y << s) as a style fix. We only ever have exact overlap between D, M, and N. Therefore we only need a single temporary, and we do not need to check for partial overlap. Reported-by: Laurent Desnogues Signed-off-by: Richard Henderson --- target/arm/sve_helper.c | 25 ++++++++++++++----------- 1 file changed, 14 insertions(+), 11 deletions(-) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index fcb46f150f..b8651ae173 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1870,6 +1870,7 @@ void HELPER(sve_zip_p)(void *vd, void *vn, void *vm, uint32_t pred_desc) intptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; int esz = extract32(pred_desc, SIMD_DATA_SHIFT, 2); intptr_t high = extract32(pred_desc, SIMD_DATA_SHIFT + 2, 1); + int esize = 1 << esz; uint64_t *d = vd; intptr_t i; @@ -1882,33 +1883,35 @@ void HELPER(sve_zip_p)(void *vd, void *vn, void *vm, uint32_t pred_desc) mm = extract64(mm, high * half, half); nn = expand_bits(nn, esz); mm = expand_bits(mm, esz); - d[0] = nn + (mm << (1 << esz)); + d[0] = nn | (mm << esize); } else { - ARMPredicateReg tmp_n, tmp_m; + ARMPredicateReg tmp; /* We produce output faster than we consume input. Therefore we must be mindful of possible overlap. */ - if ((vn - vd) < (uintptr_t)oprsz) { - vn = memcpy(&tmp_n, vn, oprsz); - } - if ((vm - vd) < (uintptr_t)oprsz) { - vm = memcpy(&tmp_m, vm, oprsz); + if (vd == vn) { + vn = memcpy(&tmp, vn, oprsz); + if (vd == vm) { + vm = vn; + } + } else if (vd == vm) { + vm = memcpy(&tmp, vm, oprsz); } if (high) { high = oprsz >> 1; } - if ((high & 3) == 0) { + if ((oprsz & 7) == 0) { uint32_t *n = vn, *m = vm; high >>= 2; - for (i = 0; i < DIV_ROUND_UP(oprsz, 8); i++) { + for (i = 0; i < oprsz / 8; i++) { uint64_t nn = n[H4(high + i)]; uint64_t mm = m[H4(high + i)]; nn = expand_bits(nn, esz); mm = expand_bits(mm, esz); - d[i] = nn + (mm << (1 << esz)); + d[i] = nn | (mm << esize); } } else { uint8_t *n = vn, *m = vm; @@ -1920,7 +1923,7 @@ void HELPER(sve_zip_p)(void *vd, void *vn, void *vm, uint32_t pred_desc) nn = expand_bits(nn, esz); mm = expand_bits(mm, esz); - d16[H2(i)] = nn + (mm << (1 << esz)); + d16[H2(i)] = nn | (mm << esize); } } } From patchwork Fri Sep 18 18:36:33 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 273360 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.5 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8F772C43464 for ; Fri, 18 Sep 2020 18:43:06 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id E83D321534 for ; Fri, 18 Sep 2020 18:43:05 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="XLcsD7Wg" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org E83D321534 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:42440 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJLLc-0004Ul-Nm for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 14:43:04 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:47736) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLGj-0007LL-7n for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:01 -0400 Received: from mail-pf1-x444.google.com ([2607:f8b0:4864:20::444]:39426) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLGh-00072h-Ju for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:00 -0400 Received: by mail-pf1-x444.google.com with SMTP id n14so3987321pff.6 for ; Fri, 18 Sep 2020 11:37:59 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=VmRUb+25uU2BYgui1TfG+LmGUzhUne41PY164owiyR4=; b=XLcsD7WgzUlsqrCiVEvXnH66HGB35YC5t/YGz1osG45DA4qNpkArIMRwfQJ0gkK4XJ gO9KBXPOsjZzuLGGkdTBRng1oLzhlfumnfeIm9HIOc+CTsWmE0vk0zleYXRBIMbJZyYB fXbPOG1S89tadiUgECStbuO9c8Qtd+1usUsJFH54+soBdpPNCoCvNy2RNCIOlkqTFl1b cIbCCsIY1LYYY/dUW0M3ElDW/moOaQHRumr057BBWd3enFXD4hiYvV5s+sWYueRNiGOx YSqSQ15MZuz9oVs80pl+QvImGottaFoqQKUdcYYRuKrpovtmSICzkzzj/1O/x0mwRpyi zTEQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=VmRUb+25uU2BYgui1TfG+LmGUzhUne41PY164owiyR4=; b=abpOldztRVVzuNvqYhSzOlyVTztrIo6eYEZA/BSKhTfl12JGI+JT6A19KfNeIjZeD6 y8rHVVAv7ATzmjXw9m9nyxjRqsTGCis8WWh+6r+X/LPZB7rXhyRVJP1wj6abWyJGSzhR 9fxO6VWuxuAtKRlT/95aHfo3gj77VDtMxipKS5D+sYC/9PKzuOxfpXkkvTCI0/tXv+0X LBVUEJqIigNq9itTfAK1nZz6hRpGKNBUBFFv7R/dtKDuw3UftbIBeUqb5lNBnaFpEimA ApPLk0t7h/hPL0fVwsv6VC09zfdbBXxLt2y+Dbc8f+AOCpv1FWxc2us2YEkvQgS8VL39 jRHg== X-Gm-Message-State: AOAM533c/CRRplv6rI8tiFj/IgK/ruALXFnii5h7xg19NkqLh+7CrLeT 4p7PZPKSp5IUlCDnWaP/SVc+WTvSX6q7XQ== X-Google-Smtp-Source: ABdhPJxSgL/GUqEgJtUeKXgNql7hOOb8Q21mBa3p7fQX4ZCMKBzcm9O/lg0AOerTI4PeJEB4nnbTug== X-Received: by 2002:aa7:869a:0:b029:142:2501:34d1 with SMTP id d26-20020aa7869a0000b0290142250134d1mr17692667pfo.42.1600454277893; Fri, 18 Sep 2020 11:37:57 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.37.56 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:37:57 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 03/81] target/arm: Fix sve_punpk_p vs odd vector lengths Date: Fri, 18 Sep 2020 11:36:33 -0700 Message-Id: <20200918183751.2787647-4-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::444; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x444.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Laurent Desnogues , peter.maydell@linaro.org, alex.bennee@linaro.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Wrote too much with punpk1 with vl % 512 != 0. Reviewed-by: Peter Maydell Reported-by: Laurent Desnogues Signed-off-by: Richard Henderson --- target/arm/sve_helper.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index b8651ae173..c983cd4356 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -2104,11 +2104,11 @@ void HELPER(sve_punpk_p)(void *vd, void *vn, uint32_t pred_desc) high = oprsz >> 1; } - if ((high & 3) == 0) { + if ((oprsz & 7) == 0) { uint32_t *n = vn; high >>= 2; - for (i = 0; i < DIV_ROUND_UP(oprsz, 8); i++) { + for (i = 0; i < oprsz / 8; i++) { uint64_t nn = n[H4(high + i)]; d[i] = expand_bits(nn, 0); } From patchwork Fri Sep 18 18:36:34 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 273358 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 73C6BC43463 for ; Fri, 18 Sep 2020 18:46:47 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id CFB3521534 for ; Fri, 18 Sep 2020 18:46:46 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="H8Sgmka4" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org CFB3521534 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:50722 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJLPA-0007vo-UD for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 14:46:45 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:47748) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLGk-0007P1-SW for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:02 -0400 Received: from mail-pj1-x1043.google.com ([2607:f8b0:4864:20::1043]:36152) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLGj-00072v-3B for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:02 -0400 Received: by mail-pj1-x1043.google.com with SMTP id b17so3463689pji.1 for ; Fri, 18 Sep 2020 11:38:00 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=isaI5MdmjrbtBjf8VoptM4IIRi7lAgi5cvu7Yitun8I=; b=H8Sgmka4nfdqcYEWoKnbL5ekuNyccZKjAdONvGtToaQBV3qM313Sz5ukeLFUqHC87c g5iaNnFC4yB95skyKmYxkJRCDmrXoakHByFlMAUtKvITbWL72ltyzFOno4jpxQ8RTdKY UBEb+0vYpaW9421Yub/5Vp7qv7YCRVgrPHCjxRBF8evGtBhp8HJMOTr6/m9ND1j4q8p+ va9j8mq/NDX+tJ/P4FSxoYFOA9hX0a/QwqqtAmTwQgDEgpcEjWfWzxjC2nr/JtX64UDr 0HxO6RVNgz5svSSJcUeOvVMj2PClXw3aJlwS587fFUJRCJbKMAvXEmIFAZnckjRqo0kJ bb7g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=isaI5MdmjrbtBjf8VoptM4IIRi7lAgi5cvu7Yitun8I=; b=orK7JpBMLZT70wmOalB9J5NDmmy4LR6Yl2YFG4JMfyuyAYd/Jlznf4L9LMHrkaP/dm MAI7gRgP7S9MpzpeUEfXrI1TWBDV5KFvDk02nAN6otNIrQYDY45N28wx/jpJF7p/HyGv +U6hZszUvk8zYchu8VFFlYPuqPaWfEbC8LB9zuAWWxon7uHLq/+B3HdLMK5NsKS2hZj7 BvLqhtqjm2JelMIY3sZz4q7ZBz1YZwJPkK1BjEvPFzM0Pe36fsyVfn7GsHbdigbg5sR+ FLMpELJB/SvE/ERDCpBOpRmCU/xDPTE2XPhNrcMLO7Ya1gGPl7MMiC9FLpRc9gPTn9ff DFHg== X-Gm-Message-State: AOAM5335B2B62KZQ4erI0njjkKDpEbbXwWvp/dj3xY8vIWYx8Fi8EoxO yS3L3ZGFu3XYt8lEIsVkUHY8zBvCjOorwA== X-Google-Smtp-Source: ABdhPJxKhAvvigAIqlJRPyfU4frgfG0YUXRJXM5lVR9LFc/XXYeof+4uBDMYOT6eWQQgM5uNOPJRjA== X-Received: by 2002:a17:902:c151:b029:d1:9bd3:6855 with SMTP id 17-20020a170902c151b02900d19bd36855mr34064171plj.38.1600454279433; Fri, 18 Sep 2020 11:37:59 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.37.58 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:37:58 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 04/81] target/arm: Add ID_AA64ZFR0 fields and isar_feature_aa64_sve2 Date: Fri, 18 Sep 2020 11:36:34 -0700 Message-Id: <20200918183751.2787647-5-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::1043; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x1043.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Will be used for SVE2 isa subset enablement. Reviewed-by: Alex Bennée Signed-off-by: Richard Henderson --- v2: Do not read zfr0 from kvm unless sve is available. --- target/arm/cpu.h | 16 ++++++++++++++++ target/arm/helper.c | 3 +-- target/arm/kvm64.c | 11 +++++++++++ 3 files changed, 28 insertions(+), 2 deletions(-) diff --git a/target/arm/cpu.h b/target/arm/cpu.h index 6036f61d60..6f80a85fd0 100644 --- a/target/arm/cpu.h +++ b/target/arm/cpu.h @@ -920,6 +920,7 @@ struct ARMCPU { uint64_t id_aa64mmfr2; uint64_t id_aa64dfr0; uint64_t id_aa64dfr1; + uint64_t id_aa64zfr0; } isar; uint64_t midr; uint32_t revidr; @@ -1880,6 +1881,16 @@ FIELD(ID_AA64DFR0, PMSVER, 32, 4) FIELD(ID_AA64DFR0, DOUBLELOCK, 36, 4) FIELD(ID_AA64DFR0, TRACEFILT, 40, 4) +FIELD(ID_AA64ZFR0, SVEVER, 0, 4) +FIELD(ID_AA64ZFR0, AES, 4, 4) +FIELD(ID_AA64ZFR0, BITPERM, 16, 4) +FIELD(ID_AA64ZFR0, BFLOAT16, 20, 4) +FIELD(ID_AA64ZFR0, SHA3, 32, 4) +FIELD(ID_AA64ZFR0, SM4, 40, 4) +FIELD(ID_AA64ZFR0, I8MM, 44, 4) +FIELD(ID_AA64ZFR0, F32MM, 52, 4) +FIELD(ID_AA64ZFR0, F64MM, 56, 4) + FIELD(ID_DFR0, COPDBG, 0, 4) FIELD(ID_DFR0, COPSDBG, 4, 4) FIELD(ID_DFR0, MMAPDBG, 8, 4) @@ -3886,6 +3897,11 @@ static inline bool isar_feature_aa64_tts2uxn(const ARMISARegisters *id) return FIELD_EX64(id->id_aa64mmfr1, ID_AA64MMFR1, XNX) != 0; } +static inline bool isar_feature_aa64_sve2(const ARMISARegisters *id) +{ + return FIELD_EX64(id->id_aa64zfr0, ID_AA64ZFR0, SVEVER) != 0; +} + /* * Feature tests for "does this exist in either 32-bit or 64-bit?" */ diff --git a/target/arm/helper.c b/target/arm/helper.c index 88bd9dd35d..c0b84f9b99 100644 --- a/target/arm/helper.c +++ b/target/arm/helper.c @@ -7414,8 +7414,7 @@ void register_cp_regs_for_features(ARMCPU *cpu) .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 4, .opc2 = 4, .access = PL1_R, .type = ARM_CP_CONST, .accessfn = access_aa64_tid3, - /* At present, only SVEver == 0 is defined anyway. */ - .resetvalue = 0 }, + .resetvalue = cpu->isar.id_aa64zfr0 }, { .name = "ID_AA64PFR5_EL1_RESERVED", .state = ARM_CP_STATE_AA64, .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 4, .opc2 = 5, .access = PL1_R, .type = ARM_CP_CONST, diff --git a/target/arm/kvm64.c b/target/arm/kvm64.c index 987b35e33f..904d97bdd6 100644 --- a/target/arm/kvm64.c +++ b/target/arm/kvm64.c @@ -548,6 +548,17 @@ bool kvm_arm_get_host_cpu_features(ARMHostCPUFeatures *ahcf) err |= read_sys_reg64(fdarray[2], &ahcf->isar.id_aa64mmfr2, ARM64_SYS_REG(3, 0, 0, 7, 2)); + /* + * Before v5.1, KVM did not support SVE and did not expose + * ID_AA64ZFR0_EL1 even as RAZ. Afterward, KVM still does + * not expose the register to "user" requests like this + * unless the host supports SVE. + */ + if (isar_feature_aa64_sve(&ahcf->isar)) { + err |= read_sys_reg64(fdarray[2], &ahcf->isar.id_aa64zfr0, + ARM64_SYS_REG(3, 0, 0, 4, 4)); + } + /* * Note that if AArch32 support is not present in the host, * the AArch32 sysregs are present to be read, but will From patchwork Fri Sep 18 18:36:35 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 305024 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6AA27C43463 for ; Fri, 18 Sep 2020 19:00:05 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id EB3D821534 for ; Fri, 18 Sep 2020 19:00:04 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="BunxfReD" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org EB3D821534 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:55896 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJLc3-00068N-WF for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 15:00:04 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:47792) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLGp-0007bO-Fp for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:07 -0400 Received: from mail-pj1-x1034.google.com ([2607:f8b0:4864:20::1034]:52914) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLGl-00073E-UM for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:07 -0400 Received: by mail-pj1-x1034.google.com with SMTP id bw23so784815pjb.2 for ; Fri, 18 Sep 2020 11:38:03 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=MZqEaovDbnBOwqIEr/pFkJYxrDx0MySV1/qTfCDSnq0=; b=BunxfReDcopXxAY2qtnK3SFfFlvWyYCDtQqXQOyv8k8O/eODLG5567dh0qfrOvEuiA vJxvn4gaqhDAu8p2h08ibSzqAkFUII0OZkHDy/CoJtLOwSgixjaSvT6nQyHRWdvvG/+p NqHD1rcLAEemI08B6MuZclKwOIHW0D3D5YVh1FRWyvVupLqV00k5HXut7sdAqfym5uf0 DQ/D2yYcKnTcBr2WRCB3gkeI+F58YMfNVsrGJANGIR7JUbAyW+fyejk8iyoOoDFrsD03 jIBRQjJeC7DZDAzqew5U0uV+mNO2koJ69kC++AE+BXSmzI/bWppECGq6I/L6JGUiWgMu i1LA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=MZqEaovDbnBOwqIEr/pFkJYxrDx0MySV1/qTfCDSnq0=; b=di4OB+pQdL5nSCccWeC1Sj3VcRUptDrlb52Y5W5roNvXPydJCJmT4SwBz7Dv8/Kv8i iraz4KrWKMJGZZ57sXdtSKNm4i8l3E1zm96S89MtJXDmyQJ2fMNRZUJplfeq0RDXUm3m U+fHupInSLzXt70JoG6RqE5HBDiA4HCdwC5FNNLGeZUQw0/lxqPN8+bkpNr+Fok3uLZo c1Pfft+4Y/wqWcc8mrxZ5B2T/+Dq2FbVid2Ow0R1QyNoxu6fssOYbYVahzSDjoJDQOtz q0FWYb+tV2ZrMPpNwWdMXWb/iojisARiE93Gz70dy+VkEpo8WfSo+lnZ3epminy8hgpo xzZA== X-Gm-Message-State: AOAM531iBge4imZhVhZvy+MrlH6VzMgtKPgouHII8GEul3U6D9I0BDiF ypRf8513Sws1kMwh+OXYpsOHckg+1B2Ecg== X-Google-Smtp-Source: ABdhPJzCVMt96rhgIqnXHhm/vnhk8WmVW1DHo+fFYOI2M7CYLs8Z60o7HNAwCWB99eDwW2eS4QEwWg== X-Received: by 2002:a17:902:a50b:b029:d1:e5e7:bdd2 with SMTP id s11-20020a170902a50bb02900d1e5e7bdd2mr16860906plq.50.1600454280679; Fri, 18 Sep 2020 11:38:00 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.37.59 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:37:59 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 05/81] target/arm: Implement SVE2 Integer Multiply - Unpredicated Date: Fri, 18 Sep 2020 11:36:35 -0700 Message-Id: <20200918183751.2787647-6-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::1034; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x1034.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" For MUL, we can rely on generic support. For SMULH and UMULH, create some trivial helpers. For PMUL, back in a21bb78e5817, we organized helper_gvec_pmul_b in preparation for this use. Signed-off-by: Richard Henderson --- target/arm/helper.h | 10 ++++ target/arm/sve.decode | 10 ++++ target/arm/translate-sve.c | 50 ++++++++++++++++++++ target/arm/vec_helper.c | 96 ++++++++++++++++++++++++++++++++++++++ 4 files changed, 166 insertions(+) diff --git a/target/arm/helper.h b/target/arm/helper.h index 8defd7c801..94658169b2 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -815,6 +815,16 @@ DEF_HELPER_FLAGS_3(gvec_cgt0_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(gvec_cge0_b, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(gvec_cge0_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_smulh_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_smulh_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_smulh_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_smulh_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(gvec_umulh_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_umulh_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_umulh_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_umulh_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_4(gvec_sshl_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(gvec_sshl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(gvec_ushl_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 6425396ac1..2c71d9e446 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1090,3 +1090,13 @@ ST1_zprz 1110010 .. 00 ..... 100 ... ..... ..... \ @rprr_scatter_store xs=0 esz=3 scale=0 ST1_zprz 1110010 .. 00 ..... 110 ... ..... ..... \ @rprr_scatter_store xs=1 esz=3 scale=0 + +#### SVE2 Support + +### SVE2 Integer Multiply - Unpredicated + +# SVE2 integer multiply vectors (unpredicated) +MUL_zzz 00000100 .. 1 ..... 0110 00 ..... ..... @rd_rn_rm +SMULH_zzz 00000100 .. 1 ..... 0110 10 ..... ..... @rd_rn_rm +UMULH_zzz 00000100 .. 1 ..... 0110 11 ..... ..... @rd_rn_rm +PMUL_zzz 00000100 00 1 ..... 0110 01 ..... ..... @rd_rn_rm_e0 diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 9095586fc9..04c5a2c8bd 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -5805,3 +5805,53 @@ static bool trans_MOVPRFX_z(DisasContext *s, arg_rpr_esz *a) { return do_movz_zpz(s, a->rd, a->rn, a->pg, a->esz, false); } + +/* + * SVE2 Integer Multiply - Unpredicated + */ + +static bool trans_MUL_zzz(DisasContext *s, arg_rrr_esz *a) +{ + if (!dc_isar_feature(aa64_sve2, s)) { + return false; + } + if (sve_access_check(s)) { + gen_gvec_fn_zzz(s, tcg_gen_gvec_mul, a->esz, a->rd, a->rn, a->rm); + } + return true; +} + +static bool do_sve2_zzz_ool(DisasContext *s, arg_rrr_esz *a, + gen_helper_gvec_3 *fn) +{ + if (fn == NULL || !dc_isar_feature(aa64_sve2, s)) { + return false; + } + if (sve_access_check(s)) { + gen_gvec_ool_zzz(s, fn, a->rd, a->rn, a->rm, 0); + } + return true; +} + +static bool trans_SMULH_zzz(DisasContext *s, arg_rrr_esz *a) +{ + static gen_helper_gvec_3 * const fns[4] = { + gen_helper_gvec_smulh_b, gen_helper_gvec_smulh_h, + gen_helper_gvec_smulh_s, gen_helper_gvec_smulh_d, + }; + return do_sve2_zzz_ool(s, a, fns[a->esz]); +} + +static bool trans_UMULH_zzz(DisasContext *s, arg_rrr_esz *a) +{ + static gen_helper_gvec_3 * const fns[4] = { + gen_helper_gvec_umulh_b, gen_helper_gvec_umulh_h, + gen_helper_gvec_umulh_s, gen_helper_gvec_umulh_d, + }; + return do_sve2_zzz_ool(s, a, fns[a->esz]); +} + +static bool trans_PMUL_zzz(DisasContext *s, arg_rrr_esz *a) +{ + return do_sve2_zzz_ool(s, a, gen_helper_gvec_pmul_b); +} diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index a973454e4f..84e1ae3933 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -1937,3 +1937,99 @@ DO_VRINT_RMODE(gvec_vrint_rm_h, helper_rinth, uint16_t) DO_VRINT_RMODE(gvec_vrint_rm_s, helper_rints, uint32_t) #undef DO_VRINT_RMODE + +/* + * NxN -> N highpart multiply + * + * TODO: expose this as a generic vector operation. + */ + +void HELPER(gvec_smulh_b)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + int8_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz; ++i) { + d[i] = ((int32_t)n[i] * m[i]) >> 8; + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(gvec_smulh_h)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + int16_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz / 2; ++i) { + d[i] = ((int32_t)n[i] * m[i]) >> 16; + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(gvec_smulh_s)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + int32_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz / 4; ++i) { + d[i] = ((int64_t)n[i] * m[i]) >> 32; + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(gvec_smulh_d)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + uint64_t *d = vd, *n = vn, *m = vm; + uint64_t discard; + + for (i = 0; i < opr_sz / 8; ++i) { + muls64(&discard, &d[i], n[i], m[i]); + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(gvec_umulh_b)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + uint8_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz; ++i) { + d[i] = ((uint32_t)n[i] * m[i]) >> 8; + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(gvec_umulh_h)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + uint16_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz / 2; ++i) { + d[i] = ((uint32_t)n[i] * m[i]) >> 16; + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(gvec_umulh_s)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + uint32_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz / 4; ++i) { + d[i] = ((uint64_t)n[i] * m[i]) >> 32; + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(gvec_umulh_d)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + uint64_t *d = vd, *n = vn, *m = vm; + uint64_t discard; + + for (i = 0; i < opr_sz / 8; ++i) { + mulu64(&discard, &d[i], n[i], m[i]); + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} From patchwork Fri Sep 18 18:36:36 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 305026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id CB53BC43463 for ; Fri, 18 Sep 2020 18:57:53 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 6757221534 for ; Fri, 18 Sep 2020 18:57:53 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="bNCEAK1T" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 6757221534 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:47754 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJLZw-00025K-BE for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 14:57:52 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:47778) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLGn-0007WB-IB for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:05 -0400 Received: from mail-pj1-x1031.google.com ([2607:f8b0:4864:20::1031]:36288) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLGl-00073B-LT for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:05 -0400 Received: by mail-pj1-x1031.google.com with SMTP id b17so3463731pji.1 for ; Fri, 18 Sep 2020 11:38:03 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=vx8ZZ5+u/yR5WaW2fHP7h6lNyV8dijeNGCAPaQJ2js0=; b=bNCEAK1T71C5Ps/H0eB8lTYwc9J3Uu7fCW2gKcVs38T+8YkOApzzdiYDIVfgMxAl6/ DIyYZ4pvOMJYWchy6gOfgSp3CBsai/G/LHsJ37MzcKl5XqUQY6IpAcyfUPpw8W6fjUG+ 1I/W/0JL/sYfW3zZ3UtGsi8fDlV191fSnrSAZ3lB1rkprvH2iu+2tOEimqd8BdOl4sTV 0exRc3fpPSpQdnJgnNRPzUsaSV+DypcOr0xOJ+v8ww/nCG3Tnb6EfPofp5Koq2hP3EFC ZvmNHniLRtnvzQYneggZz1pKcga8C7Yf8oDDhBr4j6YrM0eJ1yI3j9w3ojT9eOodYTG6 0+2Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=vx8ZZ5+u/yR5WaW2fHP7h6lNyV8dijeNGCAPaQJ2js0=; b=IX7SMGNO2QcMSYaUfafcqtFckRSpJhF84gR86+HsLDcVBcrHGiBMJLCBfxF3XfPeZw +/RtCj55R5of8lD0q5WoRMwCBdAIGRMlBrTEF2+jA5u1Drca9cpVWfpF6LBDHXM8XTO7 tM3DL+Z7SHlwKgsNX23jHFvxN4xDiGi8ehDorflH1wbo3vdnASg4i3hjFy+pJuvlyvCr tJgLmPtBZzahxtum+6Tmz3gycIwhAc9NqOuV6y9WE/8B0rfzQfc5c3OltfSeeLjsxJux RFd9eBLHBjcXQoJ/yanx9kYw6pXR0iNVb4l+P0rFN2mgCU0mIGZ6Ak6IDyYEZj+D0zLH TZig== X-Gm-Message-State: AOAM5336cInxE0xdb4I0MKu9SnCfNsrXsKip3vfwH7jN588byx99F5Hi 5jlQx2JD2Pm3yGwwR+G4mlBpSrc3hSh1QQ== X-Google-Smtp-Source: ABdhPJz0nkAWTtopeYcUMPe5R7IxcKsRPXoNhqw7DHm8sShICsnG37dMyKfmGri95vl4VuZpluTG9Q== X-Received: by 2002:a17:90a:1548:: with SMTP id y8mr14151686pja.113.1600454281981; Fri, 18 Sep 2020 11:38:01 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.38.00 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:38:01 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 06/81] target/arm: Implement SVE2 integer pairwise add and accumulate long Date: Fri, 18 Sep 2020 11:36:36 -0700 Message-Id: <20200918183751.2787647-7-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::1031; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x1031.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 14 ++++++++++++ target/arm/sve.decode | 5 +++++ target/arm/sve_helper.c | 44 ++++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 39 +++++++++++++++++++++++++++++++++ 4 files changed, 102 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 4411c47120..e185405cdc 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -158,6 +158,20 @@ DEF_HELPER_FLAGS_5(sve_umulh_zpzz_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(sve_umulh_zpzz_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sadalp_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sadalp_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sadalp_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_uadalp_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uadalp_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uadalp_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_sdiv_zpzz_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_sdiv_zpzz_d, TCG_CALL_NO_RWG, diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 2c71d9e446..4f54a30538 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1100,3 +1100,8 @@ MUL_zzz 00000100 .. 1 ..... 0110 00 ..... ..... @rd_rn_rm SMULH_zzz 00000100 .. 1 ..... 0110 10 ..... ..... @rd_rn_rm UMULH_zzz 00000100 .. 1 ..... 0110 11 ..... ..... @rd_rn_rm PMUL_zzz 00000100 00 1 ..... 0110 01 ..... ..... @rd_rn_rm_e0 + +### SVE2 Integer - Predicated + +SADALP_zpzz 01000100 .. 000 100 101 ... ..... ..... @rdm_pg_rn +UADALP_zpzz 01000100 .. 000 101 101 ... ..... ..... @rdm_pg_rn diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index c983cd4356..4705722b71 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -517,6 +517,50 @@ DO_ZPZZ_D(sve_asr_zpzz_d, int64_t, DO_ASR) DO_ZPZZ_D(sve_lsr_zpzz_d, uint64_t, DO_LSR) DO_ZPZZ_D(sve_lsl_zpzz_d, uint64_t, DO_LSL) +static inline uint16_t do_sadalp_h(uint16_t n, uint16_t m) +{ + int8_t n1 = n, n2 = n >> 8; + return m + n1 + n2; +} + +static inline uint32_t do_sadalp_s(uint32_t n, uint32_t m) +{ + int16_t n1 = n, n2 = n >> 16; + return m + n1 + n2; +} + +static inline uint64_t do_sadalp_d(uint64_t n, uint64_t m) +{ + int32_t n1 = n, n2 = n >> 32; + return m + n1 + n2; +} + +DO_ZPZZ(sve2_sadalp_zpzz_h, int16_t, H1_2, do_sadalp_h) +DO_ZPZZ(sve2_sadalp_zpzz_s, int32_t, H1_4, do_sadalp_s) +DO_ZPZZ_D(sve2_sadalp_zpzz_d, uint64_t, do_sadalp_d) + +static inline uint16_t do_uadalp_h(uint16_t n, uint16_t m) +{ + uint8_t n1 = n, n2 = n >> 8; + return m + n1 + n2; +} + +static inline uint32_t do_uadalp_s(uint32_t n, uint32_t m) +{ + uint16_t n1 = n, n2 = n >> 16; + return m + n1 + n2; +} + +static inline uint64_t do_uadalp_d(uint64_t n, uint64_t m) +{ + uint32_t n1 = n, n2 = n >> 32; + return m + n1 + n2; +} + +DO_ZPZZ(sve2_uadalp_zpzz_h, int16_t, H1_2, do_uadalp_h) +DO_ZPZZ(sve2_uadalp_zpzz_s, int32_t, H1_4, do_uadalp_s) +DO_ZPZZ_D(sve2_uadalp_zpzz_d, uint64_t, do_uadalp_d) + #undef DO_ZPZZ #undef DO_ZPZZ_D diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 04c5a2c8bd..56e9e60a89 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -5855,3 +5855,42 @@ static bool trans_PMUL_zzz(DisasContext *s, arg_rrr_esz *a) { return do_sve2_zzz_ool(s, a, gen_helper_gvec_pmul_b); } + +/* + * SVE2 Integer - Predicated + */ + +static bool do_sve2_zpzz_ool(DisasContext *s, arg_rprr_esz *a, + gen_helper_gvec_4 *fn) +{ + if (!dc_isar_feature(aa64_sve2, s)) { + return false; + } + return do_zpzz_ool(s, a, fn); +} + +static bool trans_SADALP_zpzz(DisasContext *s, arg_rprr_esz *a) +{ + static gen_helper_gvec_4 * const fns[3] = { + gen_helper_sve2_sadalp_zpzz_h, + gen_helper_sve2_sadalp_zpzz_s, + gen_helper_sve2_sadalp_zpzz_d, + }; + if (a->esz == 0) { + return false; + } + return do_sve2_zpzz_ool(s, a, fns[a->esz - 1]); +} + +static bool trans_UADALP_zpzz(DisasContext *s, arg_rprr_esz *a) +{ + static gen_helper_gvec_4 * const fns[3] = { + gen_helper_sve2_uadalp_zpzz_h, + gen_helper_sve2_uadalp_zpzz_s, + gen_helper_sve2_uadalp_zpzz_d, + }; + if (a->esz == 0) { + return false; + } + return do_sve2_zpzz_ool(s, a, fns[a->esz - 1]); +} From patchwork Fri Sep 18 18:36:37 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 305029 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 83B35C43465 for ; Fri, 18 Sep 2020 18:51:00 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 0088F206A1 for ; Fri, 18 Sep 2020 18:50:59 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="BELACQZy" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 0088F206A1 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:59042 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJLTH-0003B4-04 for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 14:50:59 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:47794) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLGp-0007c5-P8 for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:07 -0400 Received: from mail-pf1-x443.google.com ([2607:f8b0:4864:20::443]:42620) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLGn-00073K-7c for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:07 -0400 Received: by mail-pf1-x443.google.com with SMTP id d6so3982691pfn.9 for ; Fri, 18 Sep 2020 11:38:04 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=GdQyDJqy1h1bl9wSC4ATOEyl4Dl7iyZGyWZyz4DEAkI=; b=BELACQZyipZZhzG6kLOvFClbXfikZo8RhB+PU3XlJID8vBtQIX7V/dDpZqu25tvP0f yrM0rANqxRgn8NRGTgRBSgM1hpPNcefzzr/QoGP8pI7Ka3352GO5//N8wtgmwumJsGTZ Rz7NSk18SAfBw+Fpz4x5RKtQm2SdqpcDnUgaC4AaMONY7SrmIgFL4BoBnEqkn5HfFj7t rB6FgvFlVhjWfvGuGPvZ7Gz86v4j8XJ3KL18u3dDeAU9ZVxMp5dQo3sd/JyfCYWZvYk3 cS0HH6Tf4PgX1YDwRX9sXoVexrzR4fmq9rVE1il7LoCiniqVLrYTQ+2A+LGI8rM+eu/n beXA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=GdQyDJqy1h1bl9wSC4ATOEyl4Dl7iyZGyWZyz4DEAkI=; b=gEy+4ajp17Pm6a4vjdLGM+UVup4kdzdfFQRiZPB8YkbHnZYhyoV5h77pOsrx2MfHCu 13wweO+vGTNHlQhVEr1bKhIwOviGLE/o077kcxbIwHOSIuLwOJw3EghV6I520ZiTZMP2 EQXKjdD6mrcdlTcaLzQb3SvGzgHVf926VRb7ZZqhbw7H+SpyVcV6PwysKaRk3c1vtLF7 yEX7/7qk5x2+A6kyWuLYDp5JW+AjVmapIV0vgQIh6iKWYYOhLImW/9KrWuqPSjyI4lRd PDLezitEL+ZlIjzAwyR3crzx0MhV2kDT1ggj8NKA7q8gNUVaRJ5Dn54YyWnbEXXC6vEH 0MKw== X-Gm-Message-State: AOAM532pWkNpPrUYGy1Uuuk51/8cFSSBTSUx/VI4b+TRsAyiYoc/Mpvo OIKJ8gN5IpMCK+0TcE12PR9kS08zPZF56w== X-Google-Smtp-Source: ABdhPJw0WgtUe7ePRxxgV04+OlrrJ3n1JJMWK+1wfEGk/IxgICEU7i0WH7PmtKT6hGUgCFeaTKifng== X-Received: by 2002:a63:6dc7:: with SMTP id i190mr27067147pgc.27.1600454283372; Fri, 18 Sep 2020 11:38:03 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.38.02 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:38:02 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 07/81] target/arm: Implement SVE2 integer unary operations (predicated) Date: Fri, 18 Sep 2020 11:36:37 -0700 Message-Id: <20200918183751.2787647-8-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::443; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x443.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- v2: Fix sqabs, sqneg (laurent desnogues) --- target/arm/helper-sve.h | 13 +++++++++++ target/arm/sve.decode | 7 ++++++ target/arm/sve_helper.c | 29 +++++++++++++++++++---- target/arm/translate-sve.c | 47 ++++++++++++++++++++++++++++++++++++++ 4 files changed, 92 insertions(+), 4 deletions(-) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index e185405cdc..938c50f2c9 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -502,6 +502,19 @@ DEF_HELPER_FLAGS_4(sve_rbit_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_rbit_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_rbit_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqabs_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqabs_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqabs_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqabs_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_sqneg_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqneg_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqneg_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqneg_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_urecpe_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_ursqrte_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_splice, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_cmpeq_ppzz_b, TCG_CALL_NO_RWG, diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 4f54a30538..2ab463c722 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1105,3 +1105,10 @@ PMUL_zzz 00000100 00 1 ..... 0110 01 ..... ..... @rd_rn_rm_e0 SADALP_zpzz 01000100 .. 000 100 101 ... ..... ..... @rdm_pg_rn UADALP_zpzz 01000100 .. 000 101 101 ... ..... ..... @rdm_pg_rn + +### SVE2 integer unary operations (predicated) + +URECPE 01000100 .. 000 000 101 ... ..... ..... @rd_pg_rn +URSQRTE 01000100 .. 000 001 101 ... ..... ..... @rd_pg_rn +SQABS 01000100 .. 001 000 101 ... ..... ..... @rd_pg_rn +SQNEG 01000100 .. 001 001 101 ... ..... ..... @rd_pg_rn diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 4705722b71..b1c89f77d4 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -535,8 +535,8 @@ static inline uint64_t do_sadalp_d(uint64_t n, uint64_t m) return m + n1 + n2; } -DO_ZPZZ(sve2_sadalp_zpzz_h, int16_t, H1_2, do_sadalp_h) -DO_ZPZZ(sve2_sadalp_zpzz_s, int32_t, H1_4, do_sadalp_s) +DO_ZPZZ(sve2_sadalp_zpzz_h, uint16_t, H1_2, do_sadalp_h) +DO_ZPZZ(sve2_sadalp_zpzz_s, uint32_t, H1_4, do_sadalp_s) DO_ZPZZ_D(sve2_sadalp_zpzz_d, uint64_t, do_sadalp_d) static inline uint16_t do_uadalp_h(uint16_t n, uint16_t m) @@ -557,8 +557,8 @@ static inline uint64_t do_uadalp_d(uint64_t n, uint64_t m) return m + n1 + n2; } -DO_ZPZZ(sve2_uadalp_zpzz_h, int16_t, H1_2, do_uadalp_h) -DO_ZPZZ(sve2_uadalp_zpzz_s, int32_t, H1_4, do_uadalp_s) +DO_ZPZZ(sve2_uadalp_zpzz_h, uint16_t, H1_2, do_uadalp_h) +DO_ZPZZ(sve2_uadalp_zpzz_s, uint32_t, H1_4, do_uadalp_s) DO_ZPZZ_D(sve2_uadalp_zpzz_d, uint64_t, do_uadalp_d) #undef DO_ZPZZ @@ -728,6 +728,27 @@ DO_ZPZ(sve_rbit_h, uint16_t, H1_2, revbit16) DO_ZPZ(sve_rbit_s, uint32_t, H1_4, revbit32) DO_ZPZ_D(sve_rbit_d, uint64_t, revbit64) +#define DO_SQABS(X) \ + ({ __typeof(X) x_ = (X), min_ = 1ull << (sizeof(X) * 8 - 1); \ + x_ >= 0 ? x_ : x_ == min_ ? -min_ - 1 : -x_; }) + +DO_ZPZ(sve2_sqabs_b, int8_t, H1, DO_SQABS) +DO_ZPZ(sve2_sqabs_h, int16_t, H1_2, DO_SQABS) +DO_ZPZ(sve2_sqabs_s, int32_t, H1_4, DO_SQABS) +DO_ZPZ_D(sve2_sqabs_d, int64_t, DO_SQABS) + +#define DO_SQNEG(X) \ + ({ __typeof(X) x_ = (X), min_ = 1ull << (sizeof(X) * 8 - 1); \ + x_ == min_ ? -min_ - 1 : -x_; }) + +DO_ZPZ(sve2_sqneg_b, uint8_t, H1, DO_SQNEG) +DO_ZPZ(sve2_sqneg_h, uint16_t, H1_2, DO_SQNEG) +DO_ZPZ(sve2_sqneg_s, uint32_t, H1_4, DO_SQNEG) +DO_ZPZ_D(sve2_sqneg_d, uint64_t, DO_SQNEG) + +DO_ZPZ(sve2_urecpe_s, uint32_t, H1_4, helper_recpe_u32) +DO_ZPZ(sve2_ursqrte_s, uint32_t, H1_4, helper_rsqrte_u32) + /* Three-operand expander, unpredicated, in which the third operand is "wide". */ #define DO_ZZW(NAME, TYPE, TYPEW, H, OP) \ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 56e9e60a89..1dacb3fbc4 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -5894,3 +5894,50 @@ static bool trans_UADALP_zpzz(DisasContext *s, arg_rprr_esz *a) } return do_sve2_zpzz_ool(s, a, fns[a->esz - 1]); } + +/* + * SVE2 integer unary operations (predicated) + */ + +static bool do_sve2_zpz_ool(DisasContext *s, arg_rpr_esz *a, + gen_helper_gvec_3 *fn) +{ + if (!dc_isar_feature(aa64_sve2, s)) { + return false; + } + return do_zpz_ool(s, a, fn); +} + +static bool trans_URECPE(DisasContext *s, arg_rpr_esz *a) +{ + if (a->esz != 2) { + return false; + } + return do_sve2_zpz_ool(s, a, gen_helper_sve2_urecpe_s); +} + +static bool trans_URSQRTE(DisasContext *s, arg_rpr_esz *a) +{ + if (a->esz != 2) { + return false; + } + return do_sve2_zpz_ool(s, a, gen_helper_sve2_ursqrte_s); +} + +static bool trans_SQABS(DisasContext *s, arg_rpr_esz *a) +{ + static gen_helper_gvec_3 * const fns[4] = { + gen_helper_sve2_sqabs_b, gen_helper_sve2_sqabs_h, + gen_helper_sve2_sqabs_s, gen_helper_sve2_sqabs_d, + }; + return do_sve2_zpz_ool(s, a, fns[a->esz]); +} + +static bool trans_SQNEG(DisasContext *s, arg_rpr_esz *a) +{ + static gen_helper_gvec_3 * const fns[4] = { + gen_helper_sve2_sqneg_b, gen_helper_sve2_sqneg_h, + gen_helper_sve2_sqneg_s, gen_helper_sve2_sqneg_d, + }; + return do_sve2_zpz_ool(s, a, fns[a->esz]); +} From patchwork Fri Sep 18 18:36:38 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 273359 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2BDECC43463 for ; Fri, 18 Sep 2020 18:43:37 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 64D3A21534 for ; Fri, 18 Sep 2020 18:43:36 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="a4TuihLV" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 64D3A21534 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:44180 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJLM7-0005Ap-62 for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 14:43:35 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:47840) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLGs-0007hz-6q for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:10 -0400 Received: from mail-pj1-x1042.google.com ([2607:f8b0:4864:20::1042]:51666) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLGp-00073T-04 for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:09 -0400 Received: by mail-pj1-x1042.google.com with SMTP id a9so3630103pjg.1 for ; Fri, 18 Sep 2020 11:38:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=NZak3l82SIOkXxMGPmecx2ON3q41zNmoRYnuO/ggS+0=; b=a4TuihLV10RMYXtgZk5JFMEEyQKu7NZJTThza7bl3uCzD2YMefJ5/niPJxanp5VC9w 5DEJz1+ITjocOom64zu7qZyyejNMdI/2TDdeu5EFMp0v6HNRegukJ7RJsXV6Ld4OqtD4 CiK2lU/MdmN5uTlK777OXjvzLgg2n6CL2Vvrb82FFFxdR6VwqzlOB3guNlPA71sjvlQq vych/8EWODKVGyBtHXKRu26ARxwY6fyix3OK9CChHiC8x1MW1Ccj8197B6NVU70TC9D9 E5a7QdhMHFLNUpG7rkbZa7vMiC+APid77dA+NTOFxDpl3V34hU9pMO+ERYU6ynwBDH6K FMuw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=NZak3l82SIOkXxMGPmecx2ON3q41zNmoRYnuO/ggS+0=; b=TGlLwWOPZkcB0lw9QzPdAeB+1bkkhvtbE6mMRtWNwrvnQcAEspUzz+bs7bTI0sGVYY Vrz7Ddtqif8oOf3LPNqfPX1hZ6XMgbKUVxqcJiri2d7tI33pRbZc/lGVwvQrMjNTWGVf Fgqokb452CxV0CJYIZvFgo2IWknbkVBGYN4cA9YUw5kJX7vU/UMxOCDiO7p3Ht+vS9Am 5G2jFN/KSwhuUl9CIn6HThhUeuorplJAhfLByTn7nyieJ+wASIuJRHKMI/AUn/xEjNDA 8SwvSOtmUZ9GJpc7HIvrdNcgKhp9vIf4F6yYeSHWQDt++CNqFrSIJZnH1+gZRIpdSh1v 6XgA== X-Gm-Message-State: AOAM533pmhLRKSUSo4CmDjIHk0dEXCc81x9sXjh1nO3NeHcm+uTgMIPD nfgtDijvn2MfKfULzrzOiEH+MA6CzfRLqA== X-Google-Smtp-Source: ABdhPJwJlq4n0IP+d6LlRrIe2JVScyuROIadheu/W2PgpdvKEuZHewxoDdQ7apW2DNsJS04ZXS3JNA== X-Received: by 2002:a17:90a:f415:: with SMTP id ch21mr14891273pjb.18.1600454284630; Fri, 18 Sep 2020 11:38:04 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.38.03 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:38:03 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 08/81] target/arm: Split out saturating/rounding shifts from neon Date: Fri, 18 Sep 2020 11:36:38 -0700 Message-Id: <20200918183751.2787647-9-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::1042; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x1042.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Split these operations out into a header that can be shared between neon and sve. The "sat" pointer acts both as a boolean for control of saturating behavior and controls the difference in behavior between neon and sve -- QC bit or no QC bit. Widen the shift operand in the new helpers, as the SVE2 insns treat the whole input element as significant. For the neon uses, truncate the shift to int8_t while passing the parameter. Implement right-shift rounding as tmp = src >> (shift - 1); dst = (tmp >> 1) + (tmp & 1); This is the same number of instructions as the current tmp = 1 << (shift - 1); dst = (src + tmp) >> shift; without any possibility of intermediate overflow. Signed-off-by: Richard Henderson --- v2: Widen the shift operand (laurent desnouges) --- target/arm/vec_internal.h | 138 +++++++++++ target/arm/neon_helper.c | 507 +++++++------------------------------- 2 files changed, 221 insertions(+), 424 deletions(-) diff --git a/target/arm/vec_internal.h b/target/arm/vec_internal.h index 00a8277765..372fe76523 100644 --- a/target/arm/vec_internal.h +++ b/target/arm/vec_internal.h @@ -30,4 +30,142 @@ static inline void clear_tail(void *vd, uintptr_t opr_sz, uintptr_t max_sz) } } +static inline int32_t do_sqrshl_bhs(int32_t src, int32_t shift, int bits, + bool round, uint32_t *sat) +{ + if (shift <= -bits) { + /* Rounding the sign bit always produces 0. */ + if (round) { + return 0; + } + return src >> 31; + } else if (shift < 0) { + if (round) { + src >>= -shift - 1; + return (src >> 1) + (src & 1); + } + return src >> -shift; + } else if (shift < bits) { + int32_t val = src << shift; + if (bits == 32) { + if (!sat || val >> shift == src) { + return val; + } + } else { + int32_t extval = sextract32(val, 0, bits); + if (!sat || val == extval) { + return extval; + } + } + } else if (!sat || src == 0) { + return 0; + } + + *sat = 1; + return (1u << (bits - 1)) - (src >= 0); +} + +static inline uint32_t do_uqrshl_bhs(uint32_t src, int32_t shift, int bits, + bool round, uint32_t *sat) +{ + if (shift <= -(bits + round)) { + return 0; + } else if (shift < 0) { + if (round) { + src >>= -shift - 1; + return (src >> 1) + (src & 1); + } + return src >> -shift; + } else if (shift < bits) { + uint32_t val = src << shift; + if (bits == 32) { + if (!sat || val >> shift == src) { + return val; + } + } else { + uint32_t extval = extract32(val, 0, bits); + if (!sat || val == extval) { + return extval; + } + } + } else if (!sat || src == 0) { + return 0; + } + + *sat = 1; + return MAKE_64BIT_MASK(0, bits); +} + +static inline int32_t do_suqrshl_bhs(int32_t src, int32_t shift, int bits, + bool round, uint32_t *sat) +{ + if (src < 0) { + *sat = 1; + return 0; + } + return do_uqrshl_bhs(src, shift, bits, round, sat); +} + +static inline int64_t do_sqrshl_d(int64_t src, int64_t shift, + bool round, uint32_t *sat) +{ + if (shift <= -64) { + /* Rounding the sign bit always produces 0. */ + if (round) { + return 0; + } + return src >> 63; + } else if (shift < 0) { + if (round) { + src >>= -shift - 1; + return (src >> 1) + (src & 1); + } + return src >> -shift; + } else if (shift < 64) { + int64_t val = src << shift; + if (!sat || val >> shift == src) { + return val; + } + } else if (!sat || src == 0) { + return 0; + } + + *sat = 1; + return src < 0 ? INT64_MIN : INT64_MAX; +} + +static inline uint64_t do_uqrshl_d(uint64_t src, int64_t shift, + bool round, uint32_t *sat) +{ + if (shift <= -(64 + round)) { + return 0; + } else if (shift < 0) { + if (round) { + src >>= -shift - 1; + return (src >> 1) + (src & 1); + } + return src >> -shift; + } else if (shift < 64) { + uint64_t val = src << shift; + if (!sat || val >> shift == src) { + return val; + } + } else if (!sat || src == 0) { + return 0; + } + + *sat = 1; + return UINT64_MAX; +} + +static inline int64_t do_suqrshl_d(int64_t src, int64_t shift, + bool round, uint32_t *sat) +{ + if (src < 0) { + *sat = 1; + return 0; + } + return do_uqrshl_d(src, shift, round, sat); +} + #endif /* TARGET_ARM_VEC_INTERNALS_H */ diff --git a/target/arm/neon_helper.c b/target/arm/neon_helper.c index b637265691..338b9189d5 100644 --- a/target/arm/neon_helper.c +++ b/target/arm/neon_helper.c @@ -11,6 +11,7 @@ #include "cpu.h" #include "exec/helper-proto.h" #include "fpu/softfloat.h" +#include "vec_internal.h" #define SIGNBIT (uint32_t)0x80000000 #define SIGNBIT64 ((uint64_t)1 << 63) @@ -576,496 +577,154 @@ NEON_POP(pmax_s16, neon_s16, 2) NEON_POP(pmax_u16, neon_u16, 2) #undef NEON_FN -#define NEON_FN(dest, src1, src2) do { \ - int8_t tmp; \ - tmp = (int8_t)src2; \ - if (tmp >= (ssize_t)sizeof(src1) * 8 || \ - tmp <= -(ssize_t)sizeof(src1) * 8) { \ - dest = 0; \ - } else if (tmp < 0) { \ - dest = src1 >> -tmp; \ - } else { \ - dest = src1 << tmp; \ - }} while (0) +#define NEON_FN(dest, src1, src2) \ + (dest = do_uqrshl_bhs(src1, (int8_t)src2, 16, false, NULL)) NEON_VOP(shl_u16, neon_u16, 2) #undef NEON_FN -#define NEON_FN(dest, src1, src2) do { \ - int8_t tmp; \ - tmp = (int8_t)src2; \ - if (tmp >= (ssize_t)sizeof(src1) * 8) { \ - dest = 0; \ - } else if (tmp <= -(ssize_t)sizeof(src1) * 8) { \ - dest = src1 >> (sizeof(src1) * 8 - 1); \ - } else if (tmp < 0) { \ - dest = src1 >> -tmp; \ - } else { \ - dest = src1 << tmp; \ - }} while (0) +#define NEON_FN(dest, src1, src2) \ + (dest = do_sqrshl_bhs(src1, (int8_t)src2, 16, false, NULL)) NEON_VOP(shl_s16, neon_s16, 2) #undef NEON_FN -#define NEON_FN(dest, src1, src2) do { \ - int8_t tmp; \ - tmp = (int8_t)src2; \ - if ((tmp >= (ssize_t)sizeof(src1) * 8) \ - || (tmp <= -(ssize_t)sizeof(src1) * 8)) { \ - dest = 0; \ - } else if (tmp < 0) { \ - dest = (src1 + (1 << (-1 - tmp))) >> -tmp; \ - } else { \ - dest = src1 << tmp; \ - }} while (0) +#define NEON_FN(dest, src1, src2) \ + (dest = do_sqrshl_bhs(src1, (int8_t)src2, 8, true, NULL)) NEON_VOP(rshl_s8, neon_s8, 4) +#undef NEON_FN + +#define NEON_FN(dest, src1, src2) \ + (dest = do_sqrshl_bhs(src1, (int8_t)src2, 16, true, NULL)) NEON_VOP(rshl_s16, neon_s16, 2) #undef NEON_FN -/* The addition of the rounding constant may overflow, so we use an - * intermediate 64 bit accumulator. */ -uint32_t HELPER(neon_rshl_s32)(uint32_t valop, uint32_t shiftop) +uint32_t HELPER(neon_rshl_s32)(uint32_t val, uint32_t shift) { - int32_t dest; - int32_t val = (int32_t)valop; - int8_t shift = (int8_t)shiftop; - if ((shift >= 32) || (shift <= -32)) { - dest = 0; - } else if (shift < 0) { - int64_t big_dest = ((int64_t)val + (1 << (-1 - shift))); - dest = big_dest >> -shift; - } else { - dest = val << shift; - } - return dest; + return do_sqrshl_bhs(val, (int8_t)shift, 32, true, NULL); } -/* Handling addition overflow with 64 bit input values is more - * tricky than with 32 bit values. */ -uint64_t HELPER(neon_rshl_s64)(uint64_t valop, uint64_t shiftop) +uint64_t HELPER(neon_rshl_s64)(uint64_t val, uint64_t shift) { - int8_t shift = (int8_t)shiftop; - int64_t val = valop; - if ((shift >= 64) || (shift <= -64)) { - val = 0; - } else if (shift < 0) { - val >>= (-shift - 1); - if (val == INT64_MAX) { - /* In this case, it means that the rounding constant is 1, - * and the addition would overflow. Return the actual - * result directly. */ - val = 0x4000000000000000LL; - } else { - val++; - val >>= 1; - } - } else { - val <<= shift; - } - return val; + return do_sqrshl_d(val, (int8_t)shift, true, NULL); } -#define NEON_FN(dest, src1, src2) do { \ - int8_t tmp; \ - tmp = (int8_t)src2; \ - if (tmp >= (ssize_t)sizeof(src1) * 8 || \ - tmp < -(ssize_t)sizeof(src1) * 8) { \ - dest = 0; \ - } else if (tmp == -(ssize_t)sizeof(src1) * 8) { \ - dest = src1 >> (-tmp - 1); \ - } else if (tmp < 0) { \ - dest = (src1 + (1 << (-1 - tmp))) >> -tmp; \ - } else { \ - dest = src1 << tmp; \ - }} while (0) +#define NEON_FN(dest, src1, src2) \ + (dest = do_uqrshl_bhs(src1, (int8_t)src2, 8, true, NULL)) NEON_VOP(rshl_u8, neon_u8, 4) +#undef NEON_FN + +#define NEON_FN(dest, src1, src2) \ + (dest = do_uqrshl_bhs(src1, (int8_t)src2, 16, true, NULL)) NEON_VOP(rshl_u16, neon_u16, 2) #undef NEON_FN -/* The addition of the rounding constant may overflow, so we use an - * intermediate 64 bit accumulator. */ -uint32_t HELPER(neon_rshl_u32)(uint32_t val, uint32_t shiftop) +uint32_t HELPER(neon_rshl_u32)(uint32_t val, uint32_t shift) { - uint32_t dest; - int8_t shift = (int8_t)shiftop; - if (shift >= 32 || shift < -32) { - dest = 0; - } else if (shift == -32) { - dest = val >> 31; - } else if (shift < 0) { - uint64_t big_dest = ((uint64_t)val + (1 << (-1 - shift))); - dest = big_dest >> -shift; - } else { - dest = val << shift; - } - return dest; + return do_uqrshl_bhs(val, (int8_t)shift, 32, true, NULL); } -/* Handling addition overflow with 64 bit input values is more - * tricky than with 32 bit values. */ -uint64_t HELPER(neon_rshl_u64)(uint64_t val, uint64_t shiftop) +uint64_t HELPER(neon_rshl_u64)(uint64_t val, uint64_t shift) { - int8_t shift = (uint8_t)shiftop; - if (shift >= 64 || shift < -64) { - val = 0; - } else if (shift == -64) { - /* Rounding a 1-bit result just preserves that bit. */ - val >>= 63; - } else if (shift < 0) { - val >>= (-shift - 1); - if (val == UINT64_MAX) { - /* In this case, it means that the rounding constant is 1, - * and the addition would overflow. Return the actual - * result directly. */ - val = 0x8000000000000000ULL; - } else { - val++; - val >>= 1; - } - } else { - val <<= shift; - } - return val; + return do_uqrshl_d(val, (int8_t)shift, true, NULL); } -#define NEON_FN(dest, src1, src2) do { \ - int8_t tmp; \ - tmp = (int8_t)src2; \ - if (tmp >= (ssize_t)sizeof(src1) * 8) { \ - if (src1) { \ - SET_QC(); \ - dest = ~0; \ - } else { \ - dest = 0; \ - } \ - } else if (tmp <= -(ssize_t)sizeof(src1) * 8) { \ - dest = 0; \ - } else if (tmp < 0) { \ - dest = src1 >> -tmp; \ - } else { \ - dest = src1 << tmp; \ - if ((dest >> tmp) != src1) { \ - SET_QC(); \ - dest = ~0; \ - } \ - }} while (0) +#define NEON_FN(dest, src1, src2) \ + (dest = do_uqrshl_bhs(src1, (int8_t)src2, 8, false, env->vfp.qc)) NEON_VOP_ENV(qshl_u8, neon_u8, 4) +#undef NEON_FN + +#define NEON_FN(dest, src1, src2) \ + (dest = do_uqrshl_bhs(src1, (int8_t)src2, 16, false, env->vfp.qc)) NEON_VOP_ENV(qshl_u16, neon_u16, 2) -NEON_VOP_ENV(qshl_u32, neon_u32, 1) #undef NEON_FN -uint64_t HELPER(neon_qshl_u64)(CPUARMState *env, uint64_t val, uint64_t shiftop) +uint32_t HELPER(neon_qshl_u32)(CPUARMState *env, uint32_t val, uint32_t shift) { - int8_t shift = (int8_t)shiftop; - if (shift >= 64) { - if (val) { - val = ~(uint64_t)0; - SET_QC(); - } - } else if (shift <= -64) { - val = 0; - } else if (shift < 0) { - val >>= -shift; - } else { - uint64_t tmp = val; - val <<= shift; - if ((val >> shift) != tmp) { - SET_QC(); - val = ~(uint64_t)0; - } - } - return val; + return do_uqrshl_bhs(val, (int8_t)shift, 32, false, env->vfp.qc); } -#define NEON_FN(dest, src1, src2) do { \ - int8_t tmp; \ - tmp = (int8_t)src2; \ - if (tmp >= (ssize_t)sizeof(src1) * 8) { \ - if (src1) { \ - SET_QC(); \ - dest = (uint32_t)(1 << (sizeof(src1) * 8 - 1)); \ - if (src1 > 0) { \ - dest--; \ - } \ - } else { \ - dest = src1; \ - } \ - } else if (tmp <= -(ssize_t)sizeof(src1) * 8) { \ - dest = src1 >> 31; \ - } else if (tmp < 0) { \ - dest = src1 >> -tmp; \ - } else { \ - dest = src1 << tmp; \ - if ((dest >> tmp) != src1) { \ - SET_QC(); \ - dest = (uint32_t)(1 << (sizeof(src1) * 8 - 1)); \ - if (src1 > 0) { \ - dest--; \ - } \ - } \ - }} while (0) +uint64_t HELPER(neon_qshl_u64)(CPUARMState *env, uint64_t val, uint64_t shift) +{ + return do_uqrshl_d(val, (int8_t)shift, false, env->vfp.qc); +} + +#define NEON_FN(dest, src1, src2) \ + (dest = do_sqrshl_bhs(src1, (int8_t)src2, 8, false, env->vfp.qc)) NEON_VOP_ENV(qshl_s8, neon_s8, 4) +#undef NEON_FN + +#define NEON_FN(dest, src1, src2) \ + (dest = do_sqrshl_bhs(src1, (int8_t)src2, 16, false, env->vfp.qc)) NEON_VOP_ENV(qshl_s16, neon_s16, 2) -NEON_VOP_ENV(qshl_s32, neon_s32, 1) #undef NEON_FN -uint64_t HELPER(neon_qshl_s64)(CPUARMState *env, uint64_t valop, uint64_t shiftop) +uint32_t HELPER(neon_qshl_s32)(CPUARMState *env, uint32_t val, uint32_t shift) { - int8_t shift = (uint8_t)shiftop; - int64_t val = valop; - if (shift >= 64) { - if (val) { - SET_QC(); - val = (val >> 63) ^ ~SIGNBIT64; - } - } else if (shift <= -64) { - val >>= 63; - } else if (shift < 0) { - val >>= -shift; - } else { - int64_t tmp = val; - val <<= shift; - if ((val >> shift) != tmp) { - SET_QC(); - val = (tmp >> 63) ^ ~SIGNBIT64; - } - } - return val; + return do_sqrshl_bhs(val, (int8_t)shift, 32, false, env->vfp.qc); } -#define NEON_FN(dest, src1, src2) do { \ - if (src1 & (1 << (sizeof(src1) * 8 - 1))) { \ - SET_QC(); \ - dest = 0; \ - } else { \ - int8_t tmp; \ - tmp = (int8_t)src2; \ - if (tmp >= (ssize_t)sizeof(src1) * 8) { \ - if (src1) { \ - SET_QC(); \ - dest = ~0; \ - } else { \ - dest = 0; \ - } \ - } else if (tmp <= -(ssize_t)sizeof(src1) * 8) { \ - dest = 0; \ - } else if (tmp < 0) { \ - dest = src1 >> -tmp; \ - } else { \ - dest = src1 << tmp; \ - if ((dest >> tmp) != src1) { \ - SET_QC(); \ - dest = ~0; \ - } \ - } \ - }} while (0) -NEON_VOP_ENV(qshlu_s8, neon_u8, 4) -NEON_VOP_ENV(qshlu_s16, neon_u16, 2) +uint64_t HELPER(neon_qshl_s64)(CPUARMState *env, uint64_t val, uint64_t shift) +{ + return do_sqrshl_d(val, (int8_t)shift, false, env->vfp.qc); +} + +#define NEON_FN(dest, src1, src2) \ + (dest = do_suqrshl_bhs(src1, (int8_t)src2, 8, false, env->vfp.qc)) +NEON_VOP_ENV(qshlu_s8, neon_s8, 4) #undef NEON_FN -uint32_t HELPER(neon_qshlu_s32)(CPUARMState *env, uint32_t valop, uint32_t shiftop) +#define NEON_FN(dest, src1, src2) \ + (dest = do_suqrshl_bhs(src1, (int8_t)src2, 16, false, env->vfp.qc)) +NEON_VOP_ENV(qshlu_s16, neon_s16, 2) +#undef NEON_FN + +uint32_t HELPER(neon_qshlu_s32)(CPUARMState *env, uint32_t val, uint32_t shift) { - if ((int32_t)valop < 0) { - SET_QC(); - return 0; - } - return helper_neon_qshl_u32(env, valop, shiftop); + return do_suqrshl_bhs(val, (int8_t)shift, 32, false, env->vfp.qc); } -uint64_t HELPER(neon_qshlu_s64)(CPUARMState *env, uint64_t valop, uint64_t shiftop) +uint64_t HELPER(neon_qshlu_s64)(CPUARMState *env, uint64_t val, uint64_t shift) { - if ((int64_t)valop < 0) { - SET_QC(); - return 0; - } - return helper_neon_qshl_u64(env, valop, shiftop); + return do_suqrshl_d(val, (int8_t)shift, false, env->vfp.qc); } -#define NEON_FN(dest, src1, src2) do { \ - int8_t tmp; \ - tmp = (int8_t)src2; \ - if (tmp >= (ssize_t)sizeof(src1) * 8) { \ - if (src1) { \ - SET_QC(); \ - dest = ~0; \ - } else { \ - dest = 0; \ - } \ - } else if (tmp < -(ssize_t)sizeof(src1) * 8) { \ - dest = 0; \ - } else if (tmp == -(ssize_t)sizeof(src1) * 8) { \ - dest = src1 >> (sizeof(src1) * 8 - 1); \ - } else if (tmp < 0) { \ - dest = (src1 + (1 << (-1 - tmp))) >> -tmp; \ - } else { \ - dest = src1 << tmp; \ - if ((dest >> tmp) != src1) { \ - SET_QC(); \ - dest = ~0; \ - } \ - }} while (0) +#define NEON_FN(dest, src1, src2) \ + (dest = do_uqrshl_bhs(src1, (int8_t)src2, 8, true, env->vfp.qc)) NEON_VOP_ENV(qrshl_u8, neon_u8, 4) +#undef NEON_FN + +#define NEON_FN(dest, src1, src2) \ + (dest = do_uqrshl_bhs(src1, (int8_t)src2, 16, true, env->vfp.qc)) NEON_VOP_ENV(qrshl_u16, neon_u16, 2) #undef NEON_FN -/* The addition of the rounding constant may overflow, so we use an - * intermediate 64 bit accumulator. */ -uint32_t HELPER(neon_qrshl_u32)(CPUARMState *env, uint32_t val, uint32_t shiftop) +uint32_t HELPER(neon_qrshl_u32)(CPUARMState *env, uint32_t val, uint32_t shift) { - uint32_t dest; - int8_t shift = (int8_t)shiftop; - if (shift >= 32) { - if (val) { - SET_QC(); - dest = ~0; - } else { - dest = 0; - } - } else if (shift < -32) { - dest = 0; - } else if (shift == -32) { - dest = val >> 31; - } else if (shift < 0) { - uint64_t big_dest = ((uint64_t)val + (1 << (-1 - shift))); - dest = big_dest >> -shift; - } else { - dest = val << shift; - if ((dest >> shift) != val) { - SET_QC(); - dest = ~0; - } - } - return dest; + return do_uqrshl_bhs(val, (int8_t)shift, 32, true, env->vfp.qc); } -/* Handling addition overflow with 64 bit input values is more - * tricky than with 32 bit values. */ -uint64_t HELPER(neon_qrshl_u64)(CPUARMState *env, uint64_t val, uint64_t shiftop) +uint64_t HELPER(neon_qrshl_u64)(CPUARMState *env, uint64_t val, uint64_t shift) { - int8_t shift = (int8_t)shiftop; - if (shift >= 64) { - if (val) { - SET_QC(); - val = ~0; - } - } else if (shift < -64) { - val = 0; - } else if (shift == -64) { - val >>= 63; - } else if (shift < 0) { - val >>= (-shift - 1); - if (val == UINT64_MAX) { - /* In this case, it means that the rounding constant is 1, - * and the addition would overflow. Return the actual - * result directly. */ - val = 0x8000000000000000ULL; - } else { - val++; - val >>= 1; - } - } else { \ - uint64_t tmp = val; - val <<= shift; - if ((val >> shift) != tmp) { - SET_QC(); - val = ~0; - } - } - return val; + return do_uqrshl_d(val, (int8_t)shift, true, env->vfp.qc); } -#define NEON_FN(dest, src1, src2) do { \ - int8_t tmp; \ - tmp = (int8_t)src2; \ - if (tmp >= (ssize_t)sizeof(src1) * 8) { \ - if (src1) { \ - SET_QC(); \ - dest = (typeof(dest))(1 << (sizeof(src1) * 8 - 1)); \ - if (src1 > 0) { \ - dest--; \ - } \ - } else { \ - dest = 0; \ - } \ - } else if (tmp <= -(ssize_t)sizeof(src1) * 8) { \ - dest = 0; \ - } else if (tmp < 0) { \ - dest = (src1 + (1 << (-1 - tmp))) >> -tmp; \ - } else { \ - dest = src1 << tmp; \ - if ((dest >> tmp) != src1) { \ - SET_QC(); \ - dest = (uint32_t)(1 << (sizeof(src1) * 8 - 1)); \ - if (src1 > 0) { \ - dest--; \ - } \ - } \ - }} while (0) +#define NEON_FN(dest, src1, src2) \ + (dest = do_sqrshl_bhs(src1, (int8_t)src2, 8, true, env->vfp.qc)) NEON_VOP_ENV(qrshl_s8, neon_s8, 4) +#undef NEON_FN + +#define NEON_FN(dest, src1, src2) \ + (dest = do_sqrshl_bhs(src1, (int8_t)src2, 16, true, env->vfp.qc)) NEON_VOP_ENV(qrshl_s16, neon_s16, 2) #undef NEON_FN -/* The addition of the rounding constant may overflow, so we use an - * intermediate 64 bit accumulator. */ -uint32_t HELPER(neon_qrshl_s32)(CPUARMState *env, uint32_t valop, uint32_t shiftop) +uint32_t HELPER(neon_qrshl_s32)(CPUARMState *env, uint32_t val, uint32_t shift) { - int32_t dest; - int32_t val = (int32_t)valop; - int8_t shift = (int8_t)shiftop; - if (shift >= 32) { - if (val) { - SET_QC(); - dest = (val >> 31) ^ ~SIGNBIT; - } else { - dest = 0; - } - } else if (shift <= -32) { - dest = 0; - } else if (shift < 0) { - int64_t big_dest = ((int64_t)val + (1 << (-1 - shift))); - dest = big_dest >> -shift; - } else { - dest = val << shift; - if ((dest >> shift) != val) { - SET_QC(); - dest = (val >> 31) ^ ~SIGNBIT; - } - } - return dest; + return do_sqrshl_bhs(val, (int8_t)shift, 32, true, env->vfp.qc); } -/* Handling addition overflow with 64 bit input values is more - * tricky than with 32 bit values. */ -uint64_t HELPER(neon_qrshl_s64)(CPUARMState *env, uint64_t valop, uint64_t shiftop) +uint64_t HELPER(neon_qrshl_s64)(CPUARMState *env, uint64_t val, uint64_t shift) { - int8_t shift = (uint8_t)shiftop; - int64_t val = valop; - - if (shift >= 64) { - if (val) { - SET_QC(); - val = (val >> 63) ^ ~SIGNBIT64; - } - } else if (shift <= -64) { - val = 0; - } else if (shift < 0) { - val >>= (-shift - 1); - if (val == INT64_MAX) { - /* In this case, it means that the rounding constant is 1, - * and the addition would overflow. Return the actual - * result directly. */ - val = 0x4000000000000000ULL; - } else { - val++; - val >>= 1; - } - } else { - int64_t tmp = val; - val <<= shift; - if ((val >> shift) != tmp) { - SET_QC(); - val = (tmp >> 63) ^ ~SIGNBIT64; - } - } - return val; + return do_sqrshl_d(val, (int8_t)shift, true, env->vfp.qc); } uint32_t HELPER(neon_add_u8)(uint32_t a, uint32_t b) From patchwork Fri Sep 18 18:36:39 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 305023 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6F9EFC43463 for ; Fri, 18 Sep 2020 19:04:11 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id ADD2221D7F for ; Fri, 18 Sep 2020 19:04:10 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="vHg0ouE4" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org ADD2221D7F Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:35906 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJLg1-0001DD-QJ for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 15:04:09 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:47868) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLGt-0007lb-Ki for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:11 -0400 Received: from mail-pj1-x1044.google.com ([2607:f8b0:4864:20::1044]:38624) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLGq-00073a-0J for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:11 -0400 Received: by mail-pj1-x1044.google.com with SMTP id u3so3458668pjr.3 for ; Fri, 18 Sep 2020 11:38:07 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=9Qg5d/yrEd3aWZXQd/oZNgkhFNliYUk831GSUJKGHps=; b=vHg0ouE4gKbtjbIQVRpCq8GJZ7FCPau58sVZf4appTIdoaNB5xjudWSUl8LCW7CPOa PhXUhO30aVJrBZOj855gPsghCLbtZ2bxdqmebouVtFOUyAqxl7iHi3OV/YReXlSKv+HL +YiBY8UJkRWjQl6HJIxCBbO/CDdDhsK7ooLko7Jke4h5pEo+S0iYSmV5H2oBNzTi2nPB 0zo/HgFKxnhtSRiilmEJvdv3xZewII8kuTtekwnYRJYw0t9bMX3fE1amBydABSIph4og +UrR21BLKymQn70/UZfDE39z/tMsnsptZ8QnQgMK+OCGR0ccNpPWvjkVmfw6sl7bNvrK wJLA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=9Qg5d/yrEd3aWZXQd/oZNgkhFNliYUk831GSUJKGHps=; b=ioHauAooMsnBr12u+x++97wVAj7u2sMkEss6MlJMRQHKvShPZC5MKMBNAlQvXXyDlm zkMTj9MERqmFKzNZQQr7i+rQxwbAO3IaRM+yvwk78dtXtKOio7yw3N5O6G6hfwY3nISQ Nd/Qs5Y96xn2+B8rNyj0XvbmGANWn49sOPQBoipmG0JZv6hqbsRYnPobCYvnF8pPphLc 1jMQw8vEGs8BS4xorwabMls+Upo6DOvi1PDcBcaXQ+AOWXLWdiKFEhtNtMGwmsDS9NcI iS1rC9Zn7tDFn6bvHTxs9ZrEhEa4ktwJ3q9vO+BJ8r16pHx1GsrLNmhREMGvO4VyTwzZ z/iQ== X-Gm-Message-State: AOAM533l49BZTpUpSxVnGZomFSVlTVGDff4Oa85Jd3F30B+AW9naBpok hbIHvsg5cjOv0NSBteGBm7KcpOw5YJKejw== X-Google-Smtp-Source: ABdhPJya2C6irxMJcYevyZ4vEnTXKer6MhZo7ClOXb1Yu0BSfTUbfs11zX/a9MTbD2trywoq0a3dEA== X-Received: by 2002:a17:902:aa85:b029:d0:cbe1:e70d with SMTP id d5-20020a170902aa85b02900d0cbe1e70dmr34657616plr.27.1600454285901; Fri, 18 Sep 2020 11:38:05 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.38.04 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:38:05 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 09/81] target/arm: Implement SVE2 saturating/rounding bitwise shift left (predicated) Date: Fri, 18 Sep 2020 11:36:39 -0700 Message-Id: <20200918183751.2787647-10-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::1044; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x1044.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- v2: Shift values are always signed (laurent desnogues). --- target/arm/helper-sve.h | 54 ++++++++++++++++++++++++++ target/arm/sve.decode | 17 +++++++++ target/arm/sve_helper.c | 78 ++++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 18 +++++++++ 4 files changed, 167 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 938c50f2c9..3c87b1ce29 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -172,6 +172,60 @@ DEF_HELPER_FLAGS_5(sve2_uadalp_zpzz_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(sve2_uadalp_zpzz_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_srshl_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_srshl_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_srshl_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_srshl_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_urshl_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_urshl_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_urshl_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_urshl_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_sqshl_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqshl_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqshl_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqshl_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_uqshl_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uqshl_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uqshl_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uqshl_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_sqrshl_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqrshl_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqrshl_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqrshl_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_uqrshl_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uqrshl_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uqrshl_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uqrshl_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_sdiv_zpzz_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_sdiv_zpzz_d, TCG_CALL_NO_RWG, diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 2ab463c722..7c940e45c6 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1112,3 +1112,20 @@ URECPE 01000100 .. 000 000 101 ... ..... ..... @rd_pg_rn URSQRTE 01000100 .. 000 001 101 ... ..... ..... @rd_pg_rn SQABS 01000100 .. 001 000 101 ... ..... ..... @rd_pg_rn SQNEG 01000100 .. 001 001 101 ... ..... ..... @rd_pg_rn + +### SVE2 saturating/rounding bitwise shift left (predicated) + +SRSHL 01000100 .. 000 010 100 ... ..... ..... @rdn_pg_rm +URSHL 01000100 .. 000 011 100 ... ..... ..... @rdn_pg_rm +SRSHL 01000100 .. 000 110 100 ... ..... ..... @rdm_pg_rn # SRSHLR +URSHL 01000100 .. 000 111 100 ... ..... ..... @rdm_pg_rn # URSHLR + +SQSHL 01000100 .. 001 000 100 ... ..... ..... @rdn_pg_rm +UQSHL 01000100 .. 001 001 100 ... ..... ..... @rdn_pg_rm +SQSHL 01000100 .. 001 100 100 ... ..... ..... @rdm_pg_rn # SQSHLR +UQSHL 01000100 .. 001 101 100 ... ..... ..... @rdm_pg_rn # UQSHLR + +SQRSHL 01000100 .. 001 010 100 ... ..... ..... @rdn_pg_rm +UQRSHL 01000100 .. 001 011 100 ... ..... ..... @rdn_pg_rm +SQRSHL 01000100 .. 001 110 100 ... ..... ..... @rdm_pg_rn # SQRSHLR +UQRSHL 01000100 .. 001 111 100 ... ..... ..... @rdm_pg_rn # UQRSHLR diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index b1c89f77d4..40e1b5df30 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -26,6 +26,7 @@ #include "tcg/tcg-gvec-desc.h" #include "fpu/softfloat.h" #include "tcg/tcg.h" +#include "vec_internal.h" /* Note that vector data is stored in host-endian 64-bit chunks, @@ -561,6 +562,83 @@ DO_ZPZZ(sve2_uadalp_zpzz_h, uint16_t, H1_2, do_uadalp_h) DO_ZPZZ(sve2_uadalp_zpzz_s, uint32_t, H1_4, do_uadalp_s) DO_ZPZZ_D(sve2_uadalp_zpzz_d, uint64_t, do_uadalp_d) +#define do_srshl_b(n, m) do_sqrshl_bhs(n, m, 8, true, NULL) +#define do_srshl_h(n, m) do_sqrshl_bhs(n, m, 16, true, NULL) +#define do_srshl_s(n, m) do_sqrshl_bhs(n, m, 32, true, NULL) +#define do_srshl_d(n, m) do_sqrshl_d(n, m, true, NULL) + +DO_ZPZZ(sve2_srshl_zpzz_b, int8_t, H1_2, do_srshl_b) +DO_ZPZZ(sve2_srshl_zpzz_h, int16_t, H1_2, do_srshl_h) +DO_ZPZZ(sve2_srshl_zpzz_s, int32_t, H1_4, do_srshl_s) +DO_ZPZZ_D(sve2_srshl_zpzz_d, int64_t, do_srshl_d) + +#define do_urshl_b(n, m) do_uqrshl_bhs(n, (int8_t)m, 8, true, NULL) +#define do_urshl_h(n, m) do_uqrshl_bhs(n, (int16_t)m, 16, true, NULL) +#define do_urshl_s(n, m) do_uqrshl_bhs(n, m, 32, true, NULL) +#define do_urshl_d(n, m) do_uqrshl_d(n, m, true, NULL) + +DO_ZPZZ(sve2_urshl_zpzz_b, uint8_t, H1_2, do_urshl_b) +DO_ZPZZ(sve2_urshl_zpzz_h, uint16_t, H1_2, do_urshl_h) +DO_ZPZZ(sve2_urshl_zpzz_s, uint32_t, H1_4, do_urshl_s) +DO_ZPZZ_D(sve2_urshl_zpzz_d, uint64_t, do_urshl_d) + +/* Unlike the NEON and AdvSIMD versions, there is no QC bit to set. */ +#define do_sqshl_b(n, m) \ + ({ uint32_t discard; do_sqrshl_bhs(n, m, 8, false, &discard); }) +#define do_sqshl_h(n, m) \ + ({ uint32_t discard; do_sqrshl_bhs(n, m, 16, false, &discard); }) +#define do_sqshl_s(n, m) \ + ({ uint32_t discard; do_sqrshl_bhs(n, m, 32, false, &discard); }) +#define do_sqshl_d(n, m) \ + ({ uint32_t discard; do_sqrshl_d(n, m, false, &discard); }) + +DO_ZPZZ(sve2_sqshl_zpzz_b, int8_t, H1_2, do_sqshl_b) +DO_ZPZZ(sve2_sqshl_zpzz_h, int16_t, H1_2, do_sqshl_h) +DO_ZPZZ(sve2_sqshl_zpzz_s, int32_t, H1_4, do_sqshl_s) +DO_ZPZZ_D(sve2_sqshl_zpzz_d, int64_t, do_sqshl_d) + +#define do_uqshl_b(n, m) \ + ({ uint32_t discard; do_uqrshl_bhs(n, (int8_t)m, 8, false, &discard); }) +#define do_uqshl_h(n, m) \ + ({ uint32_t discard; do_uqrshl_bhs(n, (int16_t)m, 16, false, &discard); }) +#define do_uqshl_s(n, m) \ + ({ uint32_t discard; do_uqrshl_bhs(n, m, 32, false, &discard); }) +#define do_uqshl_d(n, m) \ + ({ uint32_t discard; do_uqrshl_d(n, m, false, &discard); }) + +DO_ZPZZ(sve2_uqshl_zpzz_b, uint8_t, H1_2, do_uqshl_b) +DO_ZPZZ(sve2_uqshl_zpzz_h, uint16_t, H1_2, do_uqshl_h) +DO_ZPZZ(sve2_uqshl_zpzz_s, uint32_t, H1_4, do_uqshl_s) +DO_ZPZZ_D(sve2_uqshl_zpzz_d, uint64_t, do_uqshl_d) + +#define do_sqrshl_b(n, m) \ + ({ uint32_t discard; do_sqrshl_bhs(n, m, 8, true, &discard); }) +#define do_sqrshl_h(n, m) \ + ({ uint32_t discard; do_sqrshl_bhs(n, m, 16, true, &discard); }) +#define do_sqrshl_s(n, m) \ + ({ uint32_t discard; do_sqrshl_bhs(n, m, 32, true, &discard); }) +#define do_sqrshl_d(n, m) \ + ({ uint32_t discard; do_sqrshl_d(n, m, true, &discard); }) + +DO_ZPZZ(sve2_sqrshl_zpzz_b, int8_t, H1_2, do_sqrshl_b) +DO_ZPZZ(sve2_sqrshl_zpzz_h, int16_t, H1_2, do_sqrshl_h) +DO_ZPZZ(sve2_sqrshl_zpzz_s, int32_t, H1_4, do_sqrshl_s) +DO_ZPZZ_D(sve2_sqrshl_zpzz_d, int64_t, do_sqrshl_d) + +#define do_uqrshl_b(n, m) \ + ({ uint32_t discard; do_uqrshl_bhs(n, (int8_t)m, 8, true, &discard); }) +#define do_uqrshl_h(n, m) \ + ({ uint32_t discard; do_uqrshl_bhs(n, (int16_t)m, 16, true, &discard); }) +#define do_uqrshl_s(n, m) \ + ({ uint32_t discard; do_uqrshl_bhs(n, m, 32, true, &discard); }) +#define do_uqrshl_d(n, m) \ + ({ uint32_t discard; do_uqrshl_d(n, m, true, &discard); }) + +DO_ZPZZ(sve2_uqrshl_zpzz_b, uint8_t, H1_2, do_uqrshl_b) +DO_ZPZZ(sve2_uqrshl_zpzz_h, uint16_t, H1_2, do_uqrshl_h) +DO_ZPZZ(sve2_uqrshl_zpzz_s, uint32_t, H1_4, do_uqrshl_s) +DO_ZPZZ_D(sve2_uqrshl_zpzz_d, uint64_t, do_uqrshl_d) + #undef DO_ZPZZ #undef DO_ZPZZ_D diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 1dacb3fbc4..1aa8db673f 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -5941,3 +5941,21 @@ static bool trans_SQNEG(DisasContext *s, arg_rpr_esz *a) }; return do_sve2_zpz_ool(s, a, fns[a->esz]); } + +#define DO_SVE2_ZPZZ(NAME, name) \ +static bool trans_##NAME(DisasContext *s, arg_rprr_esz *a) \ +{ \ + static gen_helper_gvec_4 * const fns[4] = { \ + gen_helper_sve2_##name##_zpzz_b, gen_helper_sve2_##name##_zpzz_h, \ + gen_helper_sve2_##name##_zpzz_s, gen_helper_sve2_##name##_zpzz_d, \ + }; \ + return do_sve2_zpzz_ool(s, a, fns[a->esz]); \ +} + +DO_SVE2_ZPZZ(SQSHL, sqshl) +DO_SVE2_ZPZZ(SQRSHL, sqrshl) +DO_SVE2_ZPZZ(SRSHL, srshl) + +DO_SVE2_ZPZZ(UQSHL, uqshl) +DO_SVE2_ZPZZ(UQRSHL, uqrshl) +DO_SVE2_ZPZZ(URSHL, urshl) From patchwork Fri Sep 18 18:36:40 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 305021 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id CCA9CC43463 for ; Fri, 18 Sep 2020 19:07:56 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 60B78207D3 for ; Fri, 18 Sep 2020 19:07:56 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="KWZgDjgJ" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 60B78207D3 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:44268 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJLjf-0004qb-GL for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 15:07:55 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:47872) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLGt-0007m8-RA for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:11 -0400 Received: from mail-pf1-x443.google.com ([2607:f8b0:4864:20::443]:33949) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLGr-00073q-6p for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:11 -0400 Received: by mail-pf1-x443.google.com with SMTP id k13so3483717pfg.1 for ; Fri, 18 Sep 2020 11:38:08 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=ss3sueQ2k0y/yj4aZsmXC40Cu6wbqeb6iK2BxXztoiI=; b=KWZgDjgJ1JLbQPGoN+HtdYVQ1Oprrk+nkz8Xd1WYiYUhdNY8yjnXIqlIOdLkc86yax gfJ3k2RPCzuqJ9mrDjNag1l0aa6kkc3QPlGkoi+aoavqW60jly+Vwj006WortgJZ3EUB hFNekXgtDajpaUkypLztjZ+pant3DjYFCwWBONd+833psDXWt3ne77rn11LO60ycQucm zMlzFXMgSVmhznQ2jF4u7OpJDiaF26f2Y6TDeBcDbv667ba4xDzwLn6Fr0s3aVtr4uvv 5PT8VEIi9duFRLu/zLULjj7yglRWFdR1s7jfwVlQvljRr54HsuHukNQI308EzkBVoZtL dr8g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=ss3sueQ2k0y/yj4aZsmXC40Cu6wbqeb6iK2BxXztoiI=; b=kUGI93+MRXSlohn/SHPudX+GpDwidhfvAGaolPhQn6j1cNS9uUoYbOAZjBs0cQ40zz sm3EAUBhHRw8mhV/+JCPQ9iCew1YH2dTLIULM+Zllw1rgFeaa+AJbtaWKNRjtyrOf1Aj 9l5ZBRHbLTTEt3xKMT8/tC7Ymh3CBRAbCNsoJzMVdZ2W23F7S1wSC/SZTMGs5LMfAv9c 9EeIUgoTi5tFYwzifnPel+xpld/XozDlJOKlDupdKOJMBY4RCHnxv0UjTabdkcniAv0K JEcRXAfHEiKAWJ9qnkU0xDpWpXFzw4Ha87ZhvDygfr8nCIvVHW3AnjCwEnjP02I6Vj3f PdTw== X-Gm-Message-State: AOAM5311vAsMvm9sOpQ+bAFb3MUikO9xF5MQdyX8jUlZIWKicuAz2bHE Zgz26IhH6Hu/UNPQ57dckUM3VaBXwQkvZg== X-Google-Smtp-Source: ABdhPJxrMi9xn4ApzqCe8zCODJA5x7BHqWvH15vSOKbhtDSBaU/ejtBDM/79b+JVhtPLvK0yg2toZg== X-Received: by 2002:a63:f441:: with SMTP id p1mr18388194pgk.453.1600454287482; Fri, 18 Sep 2020 11:38:07 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.38.06 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:38:06 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 10/81] target/arm: Implement SVE2 integer halving add/subtract (predicated) Date: Fri, 18 Sep 2020 11:36:40 -0700 Message-Id: <20200918183751.2787647-11-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::443; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x443.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 54 ++++++++++++++++++++++++++++++++++++++ target/arm/sve.decode | 11 ++++++++ target/arm/sve_helper.c | 39 +++++++++++++++++++++++++++ target/arm/translate-sve.c | 8 ++++++ 4 files changed, 112 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 3c87b1ce29..d2acd46cf9 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -226,6 +226,60 @@ DEF_HELPER_FLAGS_5(sve2_uqrshl_zpzz_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(sve2_uqrshl_zpzz_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_shadd_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_shadd_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_shadd_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_shadd_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_uhadd_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uhadd_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uhadd_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uhadd_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_srhadd_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_srhadd_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_srhadd_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_srhadd_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_urhadd_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_urhadd_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_urhadd_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_urhadd_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_shsub_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_shsub_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_shsub_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_shsub_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_uhsub_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uhsub_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uhsub_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uhsub_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_sdiv_zpzz_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_sdiv_zpzz_d, TCG_CALL_NO_RWG, diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 7c940e45c6..25508a6d69 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1129,3 +1129,14 @@ SQRSHL 01000100 .. 001 010 100 ... ..... ..... @rdn_pg_rm UQRSHL 01000100 .. 001 011 100 ... ..... ..... @rdn_pg_rm SQRSHL 01000100 .. 001 110 100 ... ..... ..... @rdm_pg_rn # SQRSHLR UQRSHL 01000100 .. 001 111 100 ... ..... ..... @rdm_pg_rn # UQRSHLR + +### SVE2 integer halving add/subtract (predicated) + +SHADD 01000100 .. 010 000 100 ... ..... ..... @rdn_pg_rm +UHADD 01000100 .. 010 001 100 ... ..... ..... @rdn_pg_rm +SHSUB 01000100 .. 010 010 100 ... ..... ..... @rdn_pg_rm +UHSUB 01000100 .. 010 011 100 ... ..... ..... @rdn_pg_rm +SRHADD 01000100 .. 010 100 100 ... ..... ..... @rdn_pg_rm +URHADD 01000100 .. 010 101 100 ... ..... ..... @rdn_pg_rm +SHSUB 01000100 .. 010 110 100 ... ..... ..... @rdm_pg_rn # SHSUBR +UHSUB 01000100 .. 010 111 100 ... ..... ..... @rdm_pg_rn # UHSUBR diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 40e1b5df30..bbd2666e3a 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -639,6 +639,45 @@ DO_ZPZZ(sve2_uqrshl_zpzz_h, uint16_t, H1_2, do_uqrshl_h) DO_ZPZZ(sve2_uqrshl_zpzz_s, uint32_t, H1_4, do_uqrshl_s) DO_ZPZZ_D(sve2_uqrshl_zpzz_d, uint64_t, do_uqrshl_d) +#define DO_HADD_BHS(n, m) (((int64_t)n + m) >> 1) +#define DO_HADD_D(n, m) ((n >> 1) + (m >> 1) + (n & m & 1)) + +DO_ZPZZ(sve2_shadd_zpzz_b, int8_t, H1_2, DO_HADD_BHS) +DO_ZPZZ(sve2_shadd_zpzz_h, int16_t, H1_2, DO_HADD_BHS) +DO_ZPZZ(sve2_shadd_zpzz_s, int32_t, H1_4, DO_HADD_BHS) +DO_ZPZZ_D(sve2_shadd_zpzz_d, int64_t, DO_HADD_D) + +DO_ZPZZ(sve2_uhadd_zpzz_b, uint8_t, H1_2, DO_HADD_BHS) +DO_ZPZZ(sve2_uhadd_zpzz_h, uint16_t, H1_2, DO_HADD_BHS) +DO_ZPZZ(sve2_uhadd_zpzz_s, uint32_t, H1_4, DO_HADD_BHS) +DO_ZPZZ_D(sve2_uhadd_zpzz_d, uint64_t, DO_HADD_D) + +#define DO_RHADD_BHS(n, m) (((int64_t)n + m + 1) >> 1) +#define DO_RHADD_D(n, m) ((n >> 1) + (m >> 1) + ((n | m) & 1)) + +DO_ZPZZ(sve2_srhadd_zpzz_b, int8_t, H1_2, DO_RHADD_BHS) +DO_ZPZZ(sve2_srhadd_zpzz_h, int16_t, H1_2, DO_RHADD_BHS) +DO_ZPZZ(sve2_srhadd_zpzz_s, int32_t, H1_4, DO_RHADD_BHS) +DO_ZPZZ_D(sve2_srhadd_zpzz_d, int64_t, DO_RHADD_D) + +DO_ZPZZ(sve2_urhadd_zpzz_b, uint8_t, H1_2, DO_RHADD_BHS) +DO_ZPZZ(sve2_urhadd_zpzz_h, uint16_t, H1_2, DO_RHADD_BHS) +DO_ZPZZ(sve2_urhadd_zpzz_s, uint32_t, H1_4, DO_RHADD_BHS) +DO_ZPZZ_D(sve2_urhadd_zpzz_d, uint64_t, DO_RHADD_D) + +#define DO_HSUB_BHS(n, m) (((int64_t)n - m) >> 1) +#define DO_HSUB_D(n, m) ((n >> 1) - (m >> 1) - (~n & m & 1)) + +DO_ZPZZ(sve2_shsub_zpzz_b, int8_t, H1_2, DO_HSUB_BHS) +DO_ZPZZ(sve2_shsub_zpzz_h, int16_t, H1_2, DO_HSUB_BHS) +DO_ZPZZ(sve2_shsub_zpzz_s, int32_t, H1_4, DO_HSUB_BHS) +DO_ZPZZ_D(sve2_shsub_zpzz_d, int64_t, DO_HSUB_D) + +DO_ZPZZ(sve2_uhsub_zpzz_b, uint8_t, H1_2, DO_HSUB_BHS) +DO_ZPZZ(sve2_uhsub_zpzz_h, uint16_t, H1_2, DO_HSUB_BHS) +DO_ZPZZ(sve2_uhsub_zpzz_s, uint32_t, H1_4, DO_HSUB_BHS) +DO_ZPZZ_D(sve2_uhsub_zpzz_d, uint64_t, DO_HSUB_D) + #undef DO_ZPZZ #undef DO_ZPZZ_D diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 1aa8db673f..9703400849 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -5959,3 +5959,11 @@ DO_SVE2_ZPZZ(SRSHL, srshl) DO_SVE2_ZPZZ(UQSHL, uqshl) DO_SVE2_ZPZZ(UQRSHL, uqrshl) DO_SVE2_ZPZZ(URSHL, urshl) + +DO_SVE2_ZPZZ(SHADD, shadd) +DO_SVE2_ZPZZ(SRHADD, srhadd) +DO_SVE2_ZPZZ(SHSUB, shsub) + +DO_SVE2_ZPZZ(UHADD, uhadd) +DO_SVE2_ZPZZ(URHADD, urhadd) +DO_SVE2_ZPZZ(UHSUB, uhsub) From patchwork Fri Sep 18 18:36:41 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 305028 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C2484C43464 for ; Fri, 18 Sep 2020 18:54:48 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 1DE8B21D7F for ; Fri, 18 Sep 2020 18:54:48 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="liijiFtr" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 1DE8B21D7F Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:39044 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJLWx-0006fn-2C for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 14:54:47 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:47876) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLGu-0007oL-HB for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:12 -0400 Received: from mail-pf1-x444.google.com ([2607:f8b0:4864:20::444]:33950) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLGs-00073y-Fb for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:12 -0400 Received: by mail-pf1-x444.google.com with SMTP id k13so3483745pfg.1 for ; Fri, 18 Sep 2020 11:38:10 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=LklvD6j6Ed/YjsP16FYp8Gym00ppx22uNHICYMgqD+k=; b=liijiFtrYsY0j00DbdzkTEhIcO7K+xI3uGYSri8ssOwmicnxPWBgwgyzRw9/g4rJR0 C55oEQs8RnJCeeDvMHj/QwzEpd2cyhaKbH3226eZmLQtwYJQrBy4OSr9p2SKNZD1JyHA iAL3gsMlJkgCJ0qTebXWcfM/wbjXMD94UeEiJYw7HeO83V4RHgjQTH/SXfErrajpACUJ VBXr2ZQ0iAuMICyX5k6dWHcf51lxRMP830wipe1GYPLlF2G8ZS3R/Li749pnCb4XtJ9O c0eEVUv6FDHtG3LHdEOnR1wCwE8VG+mNmAwRtM6c/usvJeT3EEqkWO1RbXwUhudZr7uW J0AQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=LklvD6j6Ed/YjsP16FYp8Gym00ppx22uNHICYMgqD+k=; b=eU6dLhRix1lmfmg3qVk7biNVFFRCyFuuLDoGKRJBdI+LfmryDFkNuuNORjjr1zmu6C DkXSRT666CALlvH149IODtRBos6h3BVAVVfe6llAwt9OateySdoaV9QgA8YMyFpABzFI fbPAMlitcKnfi1uCeLb29F0IsHRaIqgUMnmKktnVSWvmYDrPJHNtE/UKIRkF+FiOInzE 4QsWW6hpcJB2hi6KniooXrRS71ydJaXqOkoi3YYIqAXv2igrhsk9B5yXjoz5GMD6qHaK gc2yZZ6MairGPKbIUrX8SjiD7CfB3tGjxQNVga4tXyhvkh7w7MK40AX70wJc8hWelUfF kDHg== X-Gm-Message-State: AOAM530ysxvYNOwgYj5ukj4E4MCdGiD+Dfqi15/AJx1nchpMo06EwRAs Axi4s8Ti/XnnnXsTV8dxJVDoddrk3G7z0A== X-Google-Smtp-Source: ABdhPJyakMPUxnIAgo8kJgFiNsvcX5MxcwKvsMs13lljhm9yOzKryB2fym5fwerJSaB7YhbqmScsLg== X-Received: by 2002:a63:5119:: with SMTP id f25mr27759150pgb.351.1600454288736; Fri, 18 Sep 2020 11:38:08 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.38.07 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:38:08 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 11/81] target/arm: Implement SVE2 integer pairwise arithmetic Date: Fri, 18 Sep 2020 11:36:41 -0700 Message-Id: <20200918183751.2787647-12-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::444; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x444.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- v2: Load all inputs before writing any output (laurent desnogues) --- target/arm/helper-sve.h | 45 ++++++++++++++++++++++ target/arm/sve.decode | 8 ++++ target/arm/sve_helper.c | 76 ++++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 6 +++ 4 files changed, 135 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index d2acd46cf9..63d35ddd4f 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -326,6 +326,51 @@ DEF_HELPER_FLAGS_5(sve_sel_zpzz_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(sve_sel_zpzz_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_addp_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_addp_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_addp_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_addp_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_smaxp_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_smaxp_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_smaxp_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_smaxp_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_umaxp_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_umaxp_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_umaxp_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_umaxp_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_sminp_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sminp_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sminp_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sminp_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_uminp_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uminp_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uminp_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uminp_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_asr_zpzw_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_asr_zpzw_h, TCG_CALL_NO_RWG, diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 25508a6d69..859137df34 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1140,3 +1140,11 @@ SRHADD 01000100 .. 010 100 100 ... ..... ..... @rdn_pg_rm URHADD 01000100 .. 010 101 100 ... ..... ..... @rdn_pg_rm SHSUB 01000100 .. 010 110 100 ... ..... ..... @rdm_pg_rn # SHSUBR UHSUB 01000100 .. 010 111 100 ... ..... ..... @rdm_pg_rn # UHSUBR + +### SVE2 integer pairwise arithmetic + +ADDP 01000100 .. 010 001 101 ... ..... ..... @rdn_pg_rm +SMAXP 01000100 .. 010 100 101 ... ..... ..... @rdn_pg_rm +UMAXP 01000100 .. 010 101 101 ... ..... ..... @rdn_pg_rm +SMINP 01000100 .. 010 110 101 ... ..... ..... @rdn_pg_rm +UMINP 01000100 .. 010 111 101 ... ..... ..... @rdn_pg_rm diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index bbd2666e3a..f861ba0504 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -681,6 +681,82 @@ DO_ZPZZ_D(sve2_uhsub_zpzz_d, uint64_t, DO_HSUB_D) #undef DO_ZPZZ #undef DO_ZPZZ_D +/* + * Three operand expander, operating on element pairs. + * If the slot I is even, the elements from from VN {I, I+1}. + * If the slot I is odd, the elements from from VM {I-1, I}. + * Load all of the input elements in each pair before overwriting output. + */ +#define DO_ZPZZ_PAIR(NAME, TYPE, H, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, uint32_t desc) \ +{ \ + intptr_t i, opr_sz = simd_oprsz(desc); \ + for (i = 0; i < opr_sz; ) { \ + uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \ + do { \ + TYPE n0 = *(TYPE *)(vn + H(i)); \ + TYPE m0 = *(TYPE *)(vm + H(i)); \ + TYPE n1 = *(TYPE *)(vn + H(i + sizeof(TYPE))); \ + TYPE m1 = *(TYPE *)(vm + H(i + sizeof(TYPE))); \ + if (pg & 1) { \ + *(TYPE *)(vd + H(i)) = OP(n0, n1); \ + } \ + i += sizeof(TYPE), pg >>= sizeof(TYPE); \ + if (pg & 1) { \ + *(TYPE *)(vd + H(i)) = OP(m0, m1); \ + } \ + i += sizeof(TYPE), pg >>= sizeof(TYPE); \ + } while (i & 15); \ + } \ +} + +/* Similarly, specialized for 64-bit operands. */ +#define DO_ZPZZ_PAIR_D(NAME, TYPE, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, uint32_t desc) \ +{ \ + intptr_t i, opr_sz = simd_oprsz(desc) / 8; \ + TYPE *d = vd, *n = vn, *m = vm; \ + uint8_t *pg = vg; \ + for (i = 0; i < opr_sz; i += 2) { \ + TYPE n0 = n[i], n1 = n[i + 1]; \ + TYPE m0 = m[i], m1 = m[i + 1]; \ + if (pg[H1(i)] & 1) { \ + d[i] = OP(n0, n1); \ + } \ + if (pg[H1(i + 1)] & 1) { \ + d[i + 1] = OP(m0, m1); \ + } \ + } \ +} + +DO_ZPZZ_PAIR(sve2_addp_zpzz_b, uint8_t, H1_2, DO_ADD) +DO_ZPZZ_PAIR(sve2_addp_zpzz_h, uint16_t, H1_2, DO_ADD) +DO_ZPZZ_PAIR(sve2_addp_zpzz_s, uint32_t, H1_4, DO_ADD) +DO_ZPZZ_PAIR_D(sve2_addp_zpzz_d, uint64_t, DO_ADD) + +DO_ZPZZ_PAIR(sve2_umaxp_zpzz_b, uint8_t, H1_2, DO_MAX) +DO_ZPZZ_PAIR(sve2_umaxp_zpzz_h, uint16_t, H1_2, DO_MAX) +DO_ZPZZ_PAIR(sve2_umaxp_zpzz_s, uint32_t, H1_4, DO_MAX) +DO_ZPZZ_PAIR_D(sve2_umaxp_zpzz_d, uint64_t, DO_MAX) + +DO_ZPZZ_PAIR(sve2_uminp_zpzz_b, uint8_t, H1_2, DO_MIN) +DO_ZPZZ_PAIR(sve2_uminp_zpzz_h, uint16_t, H1_2, DO_MIN) +DO_ZPZZ_PAIR(sve2_uminp_zpzz_s, uint32_t, H1_4, DO_MIN) +DO_ZPZZ_PAIR_D(sve2_uminp_zpzz_d, uint64_t, DO_MIN) + +DO_ZPZZ_PAIR(sve2_smaxp_zpzz_b, int8_t, H1_2, DO_MAX) +DO_ZPZZ_PAIR(sve2_smaxp_zpzz_h, int16_t, H1_2, DO_MAX) +DO_ZPZZ_PAIR(sve2_smaxp_zpzz_s, int32_t, H1_4, DO_MAX) +DO_ZPZZ_PAIR_D(sve2_smaxp_zpzz_d, int64_t, DO_MAX) + +DO_ZPZZ_PAIR(sve2_sminp_zpzz_b, int8_t, H1_2, DO_MIN) +DO_ZPZZ_PAIR(sve2_sminp_zpzz_h, int16_t, H1_2, DO_MIN) +DO_ZPZZ_PAIR(sve2_sminp_zpzz_s, int32_t, H1_4, DO_MIN) +DO_ZPZZ_PAIR_D(sve2_sminp_zpzz_d, int64_t, DO_MIN) + +#undef DO_ZPZZ_PAIR +#undef DO_ZPZZ_PAIR_D + /* Three-operand expander, controlled by a predicate, in which the * third operand is "wide". That is, for D = N op M, the same 64-bit * value of M is used with all of the narrower values of N. diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 9703400849..3d3b4d61d4 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -5967,3 +5967,9 @@ DO_SVE2_ZPZZ(SHSUB, shsub) DO_SVE2_ZPZZ(UHADD, uhadd) DO_SVE2_ZPZZ(URHADD, urhadd) DO_SVE2_ZPZZ(UHSUB, uhsub) + +DO_SVE2_ZPZZ(ADDP, addp) +DO_SVE2_ZPZZ(SMAXP, smaxp) +DO_SVE2_ZPZZ(UMAXP, umaxp) +DO_SVE2_ZPZZ(SMINP, sminp) +DO_SVE2_ZPZZ(UMINP, uminp) From patchwork Fri Sep 18 18:36:42 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 273351 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9EE21C43463 for ; Fri, 18 Sep 2020 19:00:00 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 0BDDB21D7F for ; Fri, 18 Sep 2020 19:00:00 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="y12O+QWo" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 0BDDB21D7F Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:55524 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJLbz-0005yx-4m for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 14:59:59 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:47932) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLGy-0007xr-5t for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:16 -0400 Received: from mail-pj1-x102a.google.com ([2607:f8b0:4864:20::102a]:36282) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLGv-00074U-Pt for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:15 -0400 Received: by mail-pj1-x102a.google.com with SMTP id b17so3463941pji.1 for ; Fri, 18 Sep 2020 11:38:13 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=imFygz1hm3dmcfo1hJl0NYua8qbjU25CyXGh4+UY5Pk=; b=y12O+QWo/vvIo5CjJ/TLlBhWNcR6jVqnFh3IkNO9/SBSNF9Q7xkO/6ToBwLm49yNVg exARgsLpSfZoeuzMUf40sK4hEsDoEPyj+ZOPjBOTE0I1O59V5qKQsJB6a5Dym4qIOarw 5hoJGLVGONZz3VSdRflMVYXOSMsp3kZ2cLjriupXyofMaXHhbsPOzuVj0qhzERpkukfE JYHeXSuR2hSW+9Qjj8HbDLgE317X+tgJlN2ao8luBqD9NdUfqQ2uFB4QMHXqcqr8tU8h SH9k+2JqanJjzv+X10hJXC8Ad0hX8aVi53hLu1uNeV6cmnJ43P3OHMgVOA5AU8HdQner Q2ww== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=imFygz1hm3dmcfo1hJl0NYua8qbjU25CyXGh4+UY5Pk=; b=tdiEMhFXP+Zb1w4ksvJgibYWiHeHlk2gEnQcbqijqAmrm5Nnn9m18J36yurxw9KKJj mK169BaB0JZyQLx7cEZ3HW9xIJwiSVFyZfMwPJWfJaO9GakdjGLo47PIkjL1iERs9Nbm BSzrbb/HUpULPwZbOMsTD92apA7RgGpx/70SFP2GwpUqX5YAenmixZBftbXyXdEsjx2m UJahNbEUpcqum7yog4zhb6EQRz0cdD/e8BdFMkK+7F6MxDr5a2nxkxVRLzwAqHbGnsIV ViN9BgmpU+BXq6HekpLwFmZadPMDUmqfSIScGELqnfpHrD/CQJ913dmka0LLAZh/pEgs ZY7Q== X-Gm-Message-State: AOAM530k+uU8JCzalVFwA2jWKkmUV7TIhC1lnOzPfvVgIdP+D2aphD9l YZEhy8CZKjBEz6DMyFK0sJPC0EXTQ3CpMw== X-Google-Smtp-Source: ABdhPJxOUfiqbrN4iz56hN0lZk0rwaKigjT5Rr+LsI7BUFibBSG+3RNZ7I22XAVEMX/F6G4UKAMnGQ== X-Received: by 2002:a17:90a:9403:: with SMTP id r3mr14674747pjo.52.1600454290313; Fri, 18 Sep 2020 11:38:10 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.38.09 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:38:09 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 12/81] target/arm: Implement SVE2 saturating add/subtract (predicated) Date: Fri, 18 Sep 2020 11:36:42 -0700 Message-Id: <20200918183751.2787647-13-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::102a; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x102a.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 54 +++++++++++ target/arm/sve.decode | 11 +++ target/arm/sve_helper.c | 194 ++++++++++++++++++++++++++----------- target/arm/translate-sve.c | 7 ++ 4 files changed, 210 insertions(+), 56 deletions(-) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 63d35ddd4f..ae06619d7d 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -371,6 +371,60 @@ DEF_HELPER_FLAGS_5(sve2_uminp_zpzz_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(sve2_uminp_zpzz_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqadd_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqadd_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqadd_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqadd_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_uqadd_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uqadd_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uqadd_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uqadd_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_sqsub_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqsub_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqsub_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqsub_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_uqsub_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uqsub_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uqsub_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uqsub_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_suqadd_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_suqadd_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_suqadd_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_suqadd_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_usqadd_zpzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_usqadd_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_usqadd_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_usqadd_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve_asr_zpzw_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_asr_zpzw_h, TCG_CALL_NO_RWG, diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 859137df34..d018aef3dc 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1148,3 +1148,14 @@ SMAXP 01000100 .. 010 100 101 ... ..... ..... @rdn_pg_rm UMAXP 01000100 .. 010 101 101 ... ..... ..... @rdn_pg_rm SMINP 01000100 .. 010 110 101 ... ..... ..... @rdn_pg_rm UMINP 01000100 .. 010 111 101 ... ..... ..... @rdn_pg_rm + +### SVE2 saturating add/subtract (predicated) + +SQADD_zpzz 01000100 .. 011 000 100 ... ..... ..... @rdn_pg_rm +UQADD_zpzz 01000100 .. 011 001 100 ... ..... ..... @rdn_pg_rm +SQSUB_zpzz 01000100 .. 011 010 100 ... ..... ..... @rdn_pg_rm +UQSUB_zpzz 01000100 .. 011 011 100 ... ..... ..... @rdn_pg_rm +SUQADD 01000100 .. 011 100 100 ... ..... ..... @rdn_pg_rm +USQADD 01000100 .. 011 101 100 ... ..... ..... @rdn_pg_rm +SQSUB_zpzz 01000100 .. 011 110 100 ... ..... ..... @rdm_pg_rn # SQSUBR +UQSUB_zpzz 01000100 .. 011 111 100 ... ..... ..... @rdm_pg_rn # UQSUBR diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index f861ba0504..d3c7e0e03c 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -678,6 +678,135 @@ DO_ZPZZ(sve2_uhsub_zpzz_h, uint16_t, H1_2, DO_HSUB_BHS) DO_ZPZZ(sve2_uhsub_zpzz_s, uint32_t, H1_4, DO_HSUB_BHS) DO_ZPZZ_D(sve2_uhsub_zpzz_d, uint64_t, DO_HSUB_D) +static inline int32_t do_sat_bhs(int64_t val, int64_t min, int64_t max) +{ + return val >= max ? max : val <= min ? min : val; +} + +#define DO_SQADD_B(n, m) do_sat_bhs((int64_t)n + m, INT8_MIN, INT8_MAX) +#define DO_SQADD_H(n, m) do_sat_bhs((int64_t)n + m, INT16_MIN, INT16_MAX) +#define DO_SQADD_S(n, m) do_sat_bhs((int64_t)n + m, INT32_MIN, INT32_MAX) + +static inline int64_t do_sqadd_d(int64_t n, int64_t m) +{ + int64_t r = n + m; + if (((r ^ n) & ~(n ^ m)) < 0) { + /* Signed overflow. */ + return r < 0 ? INT64_MAX : INT64_MIN; + } + return r; +} + +DO_ZPZZ(sve2_sqadd_zpzz_b, int8_t, H1_2, DO_SQADD_B) +DO_ZPZZ(sve2_sqadd_zpzz_h, int16_t, H1_2, DO_SQADD_H) +DO_ZPZZ(sve2_sqadd_zpzz_s, int32_t, H1_4, DO_SQADD_S) +DO_ZPZZ_D(sve2_sqadd_zpzz_d, int64_t, do_sqadd_d) + +#define DO_UQADD_B(n, m) do_sat_bhs((int64_t)n + m, 0, UINT8_MAX) +#define DO_UQADD_H(n, m) do_sat_bhs((int64_t)n + m, 0, UINT16_MAX) +#define DO_UQADD_S(n, m) do_sat_bhs((int64_t)n + m, 0, UINT32_MAX) + +static inline uint64_t do_uqadd_d(uint64_t n, uint64_t m) +{ + uint64_t r = n + m; + return r < n ? UINT64_MAX : r; +} + +DO_ZPZZ(sve2_uqadd_zpzz_b, uint8_t, H1_2, DO_UQADD_B) +DO_ZPZZ(sve2_uqadd_zpzz_h, uint16_t, H1_2, DO_UQADD_H) +DO_ZPZZ(sve2_uqadd_zpzz_s, uint32_t, H1_4, DO_UQADD_S) +DO_ZPZZ_D(sve2_uqadd_zpzz_d, uint64_t, do_uqadd_d) + +#define DO_SQSUB_B(n, m) do_sat_bhs((int64_t)n - m, INT8_MIN, INT8_MAX) +#define DO_SQSUB_H(n, m) do_sat_bhs((int64_t)n - m, INT16_MIN, INT16_MAX) +#define DO_SQSUB_S(n, m) do_sat_bhs((int64_t)n - m, INT32_MIN, INT32_MAX) + +static inline int64_t do_sqsub_d(int64_t n, int64_t m) +{ + int64_t r = n - m; + if (((r ^ n) & (n ^ m)) < 0) { + /* Signed overflow. */ + return r < 0 ? INT64_MAX : INT64_MIN; + } + return r; +} + +DO_ZPZZ(sve2_sqsub_zpzz_b, int8_t, H1_2, DO_SQSUB_B) +DO_ZPZZ(sve2_sqsub_zpzz_h, int16_t, H1_2, DO_SQSUB_H) +DO_ZPZZ(sve2_sqsub_zpzz_s, int32_t, H1_4, DO_SQSUB_S) +DO_ZPZZ_D(sve2_sqsub_zpzz_d, int64_t, do_sqsub_d) + +#define DO_UQSUB_B(n, m) do_sat_bhs((int64_t)n - m, 0, UINT8_MAX) +#define DO_UQSUB_H(n, m) do_sat_bhs((int64_t)n - m, 0, UINT16_MAX) +#define DO_UQSUB_S(n, m) do_sat_bhs((int64_t)n - m, 0, UINT32_MAX) + +static inline uint64_t do_uqsub_d(uint64_t n, uint64_t m) +{ + return n > m ? n - m : 0; +} + +DO_ZPZZ(sve2_uqsub_zpzz_b, uint8_t, H1_2, DO_UQSUB_B) +DO_ZPZZ(sve2_uqsub_zpzz_h, uint16_t, H1_2, DO_UQSUB_H) +DO_ZPZZ(sve2_uqsub_zpzz_s, uint32_t, H1_4, DO_UQSUB_S) +DO_ZPZZ_D(sve2_uqsub_zpzz_d, uint64_t, do_uqsub_d) + +#define DO_SUQADD_B(n, m) \ + do_sat_bhs((int64_t)(int8_t)n + m, INT8_MIN, INT8_MAX) +#define DO_SUQADD_H(n, m) \ + do_sat_bhs((int64_t)(int16_t)n + m, INT16_MIN, INT16_MAX) +#define DO_SUQADD_S(n, m) \ + do_sat_bhs((int64_t)(int32_t)n + m, INT32_MIN, INT32_MAX) + +static inline int64_t do_suqadd_d(int64_t n, uint64_t m) +{ + uint64_t r = n + m; + + if (n < 0) { + /* Note that m - abs(n) cannot underflow. */ + if (r > INT64_MAX) { + /* Result is either very large positive or negative. */ + if (m > -n) { + /* m > abs(n), so r is a very large positive. */ + return INT64_MAX; + } + /* Result is negative. */ + } + } else { + /* Both inputs are positive: check for overflow. */ + if (r < m || r > INT64_MAX) { + return INT64_MAX; + } + } + return r; +} + +DO_ZPZZ(sve2_suqadd_zpzz_b, uint8_t, H1_2, DO_SUQADD_B) +DO_ZPZZ(sve2_suqadd_zpzz_h, uint16_t, H1_2, DO_SUQADD_H) +DO_ZPZZ(sve2_suqadd_zpzz_s, uint32_t, H1_4, DO_SUQADD_S) +DO_ZPZZ_D(sve2_suqadd_zpzz_d, uint64_t, do_suqadd_d) + +#define DO_USQADD_B(n, m) \ + do_sat_bhs((int64_t)n + (int8_t)m, 0, UINT8_MAX) +#define DO_USQADD_H(n, m) \ + do_sat_bhs((int64_t)n + (int16_t)m, 0, UINT16_MAX) +#define DO_USQADD_S(n, m) \ + do_sat_bhs((int64_t)n + (int32_t)m, 0, UINT32_MAX) + +static inline uint64_t do_usqadd_d(uint64_t n, int64_t m) +{ + uint64_t r = n + m; + + if (m < 0) { + return n < -m ? 0 : r; + } + return r < n ? UINT64_MAX : r; +} + +DO_ZPZZ(sve2_usqadd_zpzz_b, uint8_t, H1_2, DO_USQADD_B) +DO_ZPZZ(sve2_usqadd_zpzz_h, uint16_t, H1_2, DO_USQADD_H) +DO_ZPZZ(sve2_usqadd_zpzz_s, uint32_t, H1_4, DO_USQADD_S) +DO_ZPZZ_D(sve2_usqadd_zpzz_d, uint64_t, do_usqadd_d) + #undef DO_ZPZZ #undef DO_ZPZZ_D @@ -1613,13 +1742,7 @@ void HELPER(sve_sqaddi_b)(void *d, void *a, int32_t b, uint32_t desc) intptr_t i, oprsz = simd_oprsz(desc); for (i = 0; i < oprsz; i += sizeof(int8_t)) { - int r = *(int8_t *)(a + i) + b; - if (r > INT8_MAX) { - r = INT8_MAX; - } else if (r < INT8_MIN) { - r = INT8_MIN; - } - *(int8_t *)(d + i) = r; + *(int8_t *)(d + i) = DO_SQADD_B(b, *(int8_t *)(a + i)); } } @@ -1628,13 +1751,7 @@ void HELPER(sve_sqaddi_h)(void *d, void *a, int32_t b, uint32_t desc) intptr_t i, oprsz = simd_oprsz(desc); for (i = 0; i < oprsz; i += sizeof(int16_t)) { - int r = *(int16_t *)(a + i) + b; - if (r > INT16_MAX) { - r = INT16_MAX; - } else if (r < INT16_MIN) { - r = INT16_MIN; - } - *(int16_t *)(d + i) = r; + *(int16_t *)(d + i) = DO_SQADD_H(b, *(int16_t *)(a + i)); } } @@ -1643,13 +1760,7 @@ void HELPER(sve_sqaddi_s)(void *d, void *a, int64_t b, uint32_t desc) intptr_t i, oprsz = simd_oprsz(desc); for (i = 0; i < oprsz; i += sizeof(int32_t)) { - int64_t r = *(int32_t *)(a + i) + b; - if (r > INT32_MAX) { - r = INT32_MAX; - } else if (r < INT32_MIN) { - r = INT32_MIN; - } - *(int32_t *)(d + i) = r; + *(int32_t *)(d + i) = DO_SQADD_S(b, *(int32_t *)(a + i)); } } @@ -1658,13 +1769,7 @@ void HELPER(sve_sqaddi_d)(void *d, void *a, int64_t b, uint32_t desc) intptr_t i, oprsz = simd_oprsz(desc); for (i = 0; i < oprsz; i += sizeof(int64_t)) { - int64_t ai = *(int64_t *)(a + i); - int64_t r = ai + b; - if (((r ^ ai) & ~(ai ^ b)) < 0) { - /* Signed overflow. */ - r = (r < 0 ? INT64_MAX : INT64_MIN); - } - *(int64_t *)(d + i) = r; + *(int64_t *)(d + i) = do_sqadd_d(b, *(int64_t *)(a + i)); } } @@ -1677,13 +1782,7 @@ void HELPER(sve_uqaddi_b)(void *d, void *a, int32_t b, uint32_t desc) intptr_t i, oprsz = simd_oprsz(desc); for (i = 0; i < oprsz; i += sizeof(uint8_t)) { - int r = *(uint8_t *)(a + i) + b; - if (r > UINT8_MAX) { - r = UINT8_MAX; - } else if (r < 0) { - r = 0; - } - *(uint8_t *)(d + i) = r; + *(uint8_t *)(d + i) = DO_UQADD_B(b, *(uint8_t *)(a + i)); } } @@ -1692,13 +1791,7 @@ void HELPER(sve_uqaddi_h)(void *d, void *a, int32_t b, uint32_t desc) intptr_t i, oprsz = simd_oprsz(desc); for (i = 0; i < oprsz; i += sizeof(uint16_t)) { - int r = *(uint16_t *)(a + i) + b; - if (r > UINT16_MAX) { - r = UINT16_MAX; - } else if (r < 0) { - r = 0; - } - *(uint16_t *)(d + i) = r; + *(uint16_t *)(d + i) = DO_UQADD_H(b, *(uint16_t *)(a + i)); } } @@ -1707,13 +1800,7 @@ void HELPER(sve_uqaddi_s)(void *d, void *a, int64_t b, uint32_t desc) intptr_t i, oprsz = simd_oprsz(desc); for (i = 0; i < oprsz; i += sizeof(uint32_t)) { - int64_t r = *(uint32_t *)(a + i) + b; - if (r > UINT32_MAX) { - r = UINT32_MAX; - } else if (r < 0) { - r = 0; - } - *(uint32_t *)(d + i) = r; + *(uint32_t *)(d + i) = DO_UQADD_S(b, *(uint32_t *)(a + i)); } } @@ -1722,11 +1809,7 @@ void HELPER(sve_uqaddi_d)(void *d, void *a, uint64_t b, uint32_t desc) intptr_t i, oprsz = simd_oprsz(desc); for (i = 0; i < oprsz; i += sizeof(uint64_t)) { - uint64_t r = *(uint64_t *)(a + i) + b; - if (r < b) { - r = UINT64_MAX; - } - *(uint64_t *)(d + i) = r; + *(uint64_t *)(d + i) = do_uqadd_d(b, *(uint64_t *)(a + i)); } } @@ -1735,8 +1818,7 @@ void HELPER(sve_uqsubi_d)(void *d, void *a, uint64_t b, uint32_t desc) intptr_t i, oprsz = simd_oprsz(desc); for (i = 0; i < oprsz; i += sizeof(uint64_t)) { - uint64_t ai = *(uint64_t *)(a + i); - *(uint64_t *)(d + i) = (ai < b ? 0 : ai - b); + *(uint64_t *)(d + i) = do_uqsub_d(*(uint64_t *)(a + i), b); } } diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 3d3b4d61d4..f3e979bfb7 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -5973,3 +5973,10 @@ DO_SVE2_ZPZZ(SMAXP, smaxp) DO_SVE2_ZPZZ(UMAXP, umaxp) DO_SVE2_ZPZZ(SMINP, sminp) DO_SVE2_ZPZZ(UMINP, uminp) + +DO_SVE2_ZPZZ(SQADD_zpzz, sqadd) +DO_SVE2_ZPZZ(UQADD_zpzz, uqadd) +DO_SVE2_ZPZZ(SQSUB_zpzz, sqsub) +DO_SVE2_ZPZZ(UQSUB_zpzz, uqsub) +DO_SVE2_ZPZZ(SUQADD, suqadd) +DO_SVE2_ZPZZ(USQADD, usqadd) From patchwork Fri Sep 18 18:36:43 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 305027 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 97DE7C43463 for ; Fri, 18 Sep 2020 18:57:33 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 1B8FA21534 for ; Fri, 18 Sep 2020 18:57:33 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="H2VgAbzd" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 1B8FA21534 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:47240 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJLZb-0001qD-MK for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 14:57:32 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:47924) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLGx-0007vy-DG for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:15 -0400 Received: from mail-pl1-x642.google.com ([2607:f8b0:4864:20::642]:37395) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLGv-00074P-9O for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:15 -0400 Received: by mail-pl1-x642.google.com with SMTP id u9so3435010plk.4 for ; Fri, 18 Sep 2020 11:38:12 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=mY+15TD6kdMcYrFiZPjyUpAWKa7gPFK7CgZYkOafutw=; b=H2VgAbzdnQVsUM925aCWYQZoY064D/c921JbfRXlqwiBe7eljVsGjjGPi2CV2cEWt7 /nRb3s9wzWQ6iOY2/ZylWs0UXbJk9aSD9ByiKfEQNsD9BOekPsX2qPp5nSO1F+3ZRoYF dXmJdRNNQt7JkAStTx00FDVVUAjWi7ZTR/I8qBjUZyUSA02nQvKsBuF1HCk7JWKnt7U6 Oqg3TYsky9/8kNvK/PHfw72VjzGX4nh+iEYh0B5XIlHDRGNPvjmlwNn8C2qo/h2ACzUd CCBDTn070h58M+KY5zphxBDxqWpWsvsBP5OwGnQz2MU/tRSf2vxuUedRwlamBUEH06s/ +VNA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=mY+15TD6kdMcYrFiZPjyUpAWKa7gPFK7CgZYkOafutw=; b=ab/R1VpQrMqN5dv3Twehn/mha+DKBSHHRQ6mYaqFOkUyNmF3t6ySmjAfDRF5+pGRAa h98kOV2fTO7tCOcFabexNEwbDeyJacqazfXjG9jUXhtiQ7nmapAD34irYd/aTs7/e+sJ F/HlMOn608gyA17yIAicHF8fuheufVrBY5UPBhNM6T8DYzmSOGKya14g3YpJwaLxmZ3P ClutpslkgC9G40Qgzm+RWtSD8KYiZxitP8ENcxxstMkgaUA7HTu3PB0BnMxm1SmqJZ/B z9lo50EZg+iqqrJkxCzJOuK3yUvICtcdynMiOCy6g/Gn+QYtoGONSNG0t9DLMUEOVchl Zivw== X-Gm-Message-State: AOAM532Qo2BG3BKr2ZiiiuMyqWi3eZSQYrhmDoZwDLigf80XCMH35U2U H0M5SyiPbOF26iXY+cr3mZBiyrN/pjN0ng== X-Google-Smtp-Source: ABdhPJwUiXFEfp26qwJFHMekDTuYmgfD8T1jfzMdQG3+U2uk81AYx/SMpiT6jmPaX4hPYBEQwjukzQ== X-Received: by 2002:a17:90a:d18b:: with SMTP id fu11mr13773335pjb.203.1600454291534; Fri, 18 Sep 2020 11:38:11 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.38.10 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:38:10 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 13/81] target/arm: Implement SVE2 integer add/subtract long Date: Fri, 18 Sep 2020 11:36:43 -0700 Message-Id: <20200918183751.2787647-14-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::642; envelope-from=richard.henderson@linaro.org; helo=mail-pl1-x642.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- v2: Fix select offsets (laurent desnogues). --- target/arm/helper-sve.h | 24 ++++++++++++++++++++ target/arm/sve.decode | 19 ++++++++++++++++ target/arm/sve_helper.c | 43 +++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 46 ++++++++++++++++++++++++++++++++++++++ 4 files changed, 132 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index ae06619d7d..676ddfbb2d 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -1367,6 +1367,30 @@ DEF_HELPER_FLAGS_5(sve_ftmad_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_ftmad_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve_ftmad_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_saddl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_saddl_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_saddl_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_ssubl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_ssubl_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_ssubl_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_sabdl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sabdl_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sabdl_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_uaddl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_uaddl_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_uaddl_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_usubl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_usubl_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_usubl_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_uabdl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_uabdl_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_uabdl_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_4(sve_ld1bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) DEF_HELPER_FLAGS_4(sve_ld2bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) DEF_HELPER_FLAGS_4(sve_ld3bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index d018aef3dc..70712039e4 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1159,3 +1159,22 @@ SUQADD 01000100 .. 011 100 100 ... ..... ..... @rdn_pg_rm USQADD 01000100 .. 011 101 100 ... ..... ..... @rdn_pg_rm SQSUB_zpzz 01000100 .. 011 110 100 ... ..... ..... @rdm_pg_rn # SQSUBR UQSUB_zpzz 01000100 .. 011 111 100 ... ..... ..... @rdm_pg_rn # UQSUBR + +#### SVE2 Widening Integer Arithmetic + +## SVE2 integer add/subtract long + +SADDLB 01000101 .. 0 ..... 00 0000 ..... ..... @rd_rn_rm +SADDLT 01000101 .. 0 ..... 00 0001 ..... ..... @rd_rn_rm +UADDLB 01000101 .. 0 ..... 00 0010 ..... ..... @rd_rn_rm +UADDLT 01000101 .. 0 ..... 00 0011 ..... ..... @rd_rn_rm + +SSUBLB 01000101 .. 0 ..... 00 0100 ..... ..... @rd_rn_rm +SSUBLT 01000101 .. 0 ..... 00 0101 ..... ..... @rd_rn_rm +USUBLB 01000101 .. 0 ..... 00 0110 ..... ..... @rd_rn_rm +USUBLT 01000101 .. 0 ..... 00 0111 ..... ..... @rd_rn_rm + +SABDLB 01000101 .. 0 ..... 00 1100 ..... ..... @rd_rn_rm +SABDLT 01000101 .. 0 ..... 00 1101 ..... ..... @rd_rn_rm +UABDLB 01000101 .. 0 ..... 00 1110 ..... ..... @rd_rn_rm +UABDLT 01000101 .. 0 ..... 00 1111 ..... ..... @rd_rn_rm diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index d3c7e0e03c..ca39b0f6f1 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1113,6 +1113,49 @@ DO_ZZW(sve_lsl_zzw_s, uint32_t, uint64_t, H1_4, DO_LSL) #undef DO_ZPZ #undef DO_ZPZ_D +/* + * Three-operand expander, unpredicated, in which the two inputs are + * selected from the top or bottom half of the wide column. + */ +#define DO_ZZZ_TB(NAME, TYPEW, TYPEN, HW, HN, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ +{ \ + intptr_t i, opr_sz = simd_oprsz(desc); \ + int sel1 = extract32(desc, SIMD_DATA_SHIFT, 1) * sizeof(TYPEN); \ + int sel2 = extract32(desc, SIMD_DATA_SHIFT + 1, 1) * sizeof(TYPEN); \ + for (i = 0; i < opr_sz; i += sizeof(TYPEW)) { \ + TYPEW nn = *(TYPEN *)(vn + HN(i + sel1)); \ + TYPEW mm = *(TYPEN *)(vm + HN(i + sel2)); \ + *(TYPEW *)(vd + HW(i)) = OP(nn, mm); \ + } \ +} + +DO_ZZZ_TB(sve2_saddl_h, int16_t, int8_t, H1_2, H1, DO_ADD) +DO_ZZZ_TB(sve2_saddl_s, int32_t, int16_t, H1_4, H1_2, DO_ADD) +DO_ZZZ_TB(sve2_saddl_d, int64_t, int32_t, , H1_4, DO_ADD) + +DO_ZZZ_TB(sve2_ssubl_h, int16_t, int8_t, H1_2, H1, DO_SUB) +DO_ZZZ_TB(sve2_ssubl_s, int32_t, int16_t, H1_4, H1_2, DO_SUB) +DO_ZZZ_TB(sve2_ssubl_d, int64_t, int32_t, , H1_4, DO_SUB) + +DO_ZZZ_TB(sve2_sabdl_h, int16_t, int8_t, H1_2, H1, DO_ABD) +DO_ZZZ_TB(sve2_sabdl_s, int32_t, int16_t, H1_4, H1_2, DO_ABD) +DO_ZZZ_TB(sve2_sabdl_d, int64_t, int32_t, , H1_4, DO_ABD) + +DO_ZZZ_TB(sve2_uaddl_h, uint16_t, uint8_t, H1_2, H1, DO_ADD) +DO_ZZZ_TB(sve2_uaddl_s, uint32_t, uint16_t, H1_4, H1_2, DO_ADD) +DO_ZZZ_TB(sve2_uaddl_d, uint64_t, uint32_t, , H1_4, DO_ADD) + +DO_ZZZ_TB(sve2_usubl_h, uint16_t, uint8_t, H1_2, H1, DO_SUB) +DO_ZZZ_TB(sve2_usubl_s, uint32_t, uint16_t, H1_4, H1_2, DO_SUB) +DO_ZZZ_TB(sve2_usubl_d, uint64_t, uint32_t, , H1_4, DO_SUB) + +DO_ZZZ_TB(sve2_uabdl_h, uint16_t, uint8_t, H1_2, H1, DO_ABD) +DO_ZZZ_TB(sve2_uabdl_s, uint32_t, uint16_t, H1_4, H1_2, DO_ABD) +DO_ZZZ_TB(sve2_uabdl_d, uint64_t, uint32_t, , H1_4, DO_ABD) + +#undef DO_ZZZ_TB + /* Two-operand reduction expander, controlled by a predicate. * The difference between TYPERED and TYPERET has to do with * sign-extension. E.g. for SMAX, TYPERED must be signed, diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index f3e979bfb7..cbbce6a3e2 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -5980,3 +5980,49 @@ DO_SVE2_ZPZZ(SQSUB_zpzz, sqsub) DO_SVE2_ZPZZ(UQSUB_zpzz, uqsub) DO_SVE2_ZPZZ(SUQADD, suqadd) DO_SVE2_ZPZZ(USQADD, usqadd) + +/* + * SVE2 Widening Integer Arithmetic + */ + +static bool do_sve2_zzw_ool(DisasContext *s, arg_rrr_esz *a, + gen_helper_gvec_3 *fn, int data) +{ + if (fn == NULL || !dc_isar_feature(aa64_sve2, s)) { + return false; + } + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + tcg_gen_gvec_3_ool(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + vsz, vsz, data, fn); + } + return true; +} + +#define DO_SVE2_ZZZ_TB(NAME, name, SEL1, SEL2) \ +static bool trans_##NAME(DisasContext *s, arg_rrr_esz *a) \ +{ \ + static gen_helper_gvec_3 * const fns[4] = { \ + NULL, gen_helper_sve2_##name##_h, \ + gen_helper_sve2_##name##_s, gen_helper_sve2_##name##_d, \ + }; \ + return do_sve2_zzw_ool(s, a, fns[a->esz], (SEL2 << 1) | SEL1); \ +} + +DO_SVE2_ZZZ_TB(SADDLB, saddl, false, false) +DO_SVE2_ZZZ_TB(SSUBLB, ssubl, false, false) +DO_SVE2_ZZZ_TB(SABDLB, sabdl, false, false) + +DO_SVE2_ZZZ_TB(UADDLB, uaddl, false, false) +DO_SVE2_ZZZ_TB(USUBLB, usubl, false, false) +DO_SVE2_ZZZ_TB(UABDLB, uabdl, false, false) + +DO_SVE2_ZZZ_TB(SADDLT, saddl, true, true) +DO_SVE2_ZZZ_TB(SSUBLT, ssubl, true, true) +DO_SVE2_ZZZ_TB(SABDLT, sabdl, true, true) + +DO_SVE2_ZZZ_TB(UADDLT, uaddl, true, true) +DO_SVE2_ZZZ_TB(USUBLT, usubl, true, true) +DO_SVE2_ZZZ_TB(UABDLT, uabdl, true, true) From patchwork Fri Sep 18 18:36:44 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 305030 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D6329C43463 for ; Fri, 18 Sep 2020 18:47:26 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 4A65521534 for ; Fri, 18 Sep 2020 18:47:26 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="uCGc2FeZ" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 4A65521534 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:52768 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJLPp-0000OM-6m for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 14:47:25 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:47936) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLGy-0007yG-B4 for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:16 -0400 Received: from mail-pj1-x1035.google.com ([2607:f8b0:4864:20::1035]:53570) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLGw-00074a-Df for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:15 -0400 Received: by mail-pj1-x1035.google.com with SMTP id t7so3624118pjd.3 for ; Fri, 18 Sep 2020 11:38:13 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=7Vcs32PMGSC/nwjonAN6b6oQdS3irDyVBrqXX5IRByw=; b=uCGc2FeZaQ1SsHfWLbRfhZ0h/ow/SCe7YxO6NUHYCZ5ohIw9h21TWRqVdjV+0Bom1I 9+y5FSoOKSCAE4fVacllVYWCO+0rZswwwmYYGbFHn/TfC/8W5bShtI0hlrx+BXv0R8a7 hO3Uxo87IYRrc3km160cwCYaTy8xOt3bESa9N3McjNrhYFjuhFDdIAfvTPvAMUw5VVCf rZKLjYNSmOJaVVat2UymlvkyIlk6Iaj/OtjBD0TUNg56L0okQLNFuMRN0gjRgMDQxxWk KJMkvMZ09kCSKVleiZuJISv2Ct+Xx0/rYvarE6sSx9irt91NyAydW2ynMcY9SqcGrD4D xyZA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=7Vcs32PMGSC/nwjonAN6b6oQdS3irDyVBrqXX5IRByw=; b=LHR9MyA/D1RWzYPJj6FDvAKa3B7rO1fxW1LluYsIxYmwSJo4PqWLk16nJx2OkPZBfP nP9MCgESW8615fGJmF70VSTQT2ZH4/8WpwmGsONlS8j9Ip8o5aljX4Mn5tT1OAhLmNR4 WLEwNZqV6ef7f6unlW+M7jniACfLj8OWQ3IerjwLLD7F7gZwTh+K/firHHrhhMF4fLYG q8+6d4omA0YsFjBePPcDcKfUEuih/1+roh1CvbTmBn379pcqtkJyF262wwiiIPH6e4pB ldtEX40nWEV3aUuWDtQOT+b7N+onxQKHpECQEoDur+nJYII62pHBrojC2Ooyg6JFZfzM 83LA== X-Gm-Message-State: AOAM530700Zv+Mj7Fumr6fy7luuYHHCTZgntdDJdVsTS+slI91+d+hm+ rmDNRQSMgoPBevKRQrDblhxvN0ZCKkDgHQ== X-Google-Smtp-Source: ABdhPJyc64u7kLXXujiZpnEP2C86L7HzBkfbwFJlnyTGjx0+/yLTyoGVXzJXMSu613DBsOm/LA7S8Q== X-Received: by 2002:a17:90a:ea09:: with SMTP id w9mr13597807pjy.203.1600454292698; Fri, 18 Sep 2020 11:38:12 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.38.11 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:38:11 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 14/81] target/arm: Implement SVE2 integer add/subtract interleaved long Date: Fri, 18 Sep 2020 11:36:44 -0700 Message-Id: <20200918183751.2787647-15-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::1035; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x1035.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/sve.decode | 6 ++++++ target/arm/translate-sve.c | 4 ++++ 2 files changed, 10 insertions(+) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 70712039e4..a28172c017 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1178,3 +1178,9 @@ SABDLB 01000101 .. 0 ..... 00 1100 ..... ..... @rd_rn_rm SABDLT 01000101 .. 0 ..... 00 1101 ..... ..... @rd_rn_rm UABDLB 01000101 .. 0 ..... 00 1110 ..... ..... @rd_rn_rm UABDLT 01000101 .. 0 ..... 00 1111 ..... ..... @rd_rn_rm + +## SVE2 integer add/subtract interleaved long + +SADDLBT 01000101 .. 0 ..... 1000 00 ..... ..... @rd_rn_rm +SSUBLBT 01000101 .. 0 ..... 1000 10 ..... ..... @rd_rn_rm +SSUBLTB 01000101 .. 0 ..... 1000 11 ..... ..... @rd_rn_rm diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index cbbce6a3e2..02eba872f9 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -6026,3 +6026,7 @@ DO_SVE2_ZZZ_TB(SABDLT, sabdl, true, true) DO_SVE2_ZZZ_TB(UADDLT, uaddl, true, true) DO_SVE2_ZZZ_TB(USUBLT, usubl, true, true) DO_SVE2_ZZZ_TB(UABDLT, uabdl, true, true) + +DO_SVE2_ZZZ_TB(SADDLBT, saddl, false, true) +DO_SVE2_ZZZ_TB(SSUBLBT, ssubl, false, true) +DO_SVE2_ZZZ_TB(SSUBLTB, ssubl, true, false) From patchwork Fri Sep 18 18:36:45 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 305031 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6ED64C43463 for ; Fri, 18 Sep 2020 18:46:31 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id B012621534 for ; Fri, 18 Sep 2020 18:46:30 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="Ko3y9Fgo" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org B012621534 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:50646 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJLOv-0007tb-Mj for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 14:46:29 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:47948) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLGz-00081J-Fn for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:17 -0400 Received: from mail-pj1-x1044.google.com ([2607:f8b0:4864:20::1044]:53492) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLGx-00074k-Il for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:17 -0400 Received: by mail-pj1-x1044.google.com with SMTP id t7so3624140pjd.3 for ; Fri, 18 Sep 2020 11:38:15 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=o4kREfopSj9D5g5uleOIWdafqm7OqEWz1vFizJAa+RE=; b=Ko3y9FgoJa1N4yIDQaYR1wRaSybnb9sjMCC7Qw+OxR/mVbb5wskMDdHu9P6qKP2mgX EBBqA0iHdWQVX6Z4ht1UZoGbBSOQxyK7qWILsq6THIWyaYvPe4sLiDkNQk+KMAR7rhn8 SJyizt2EaLI+/ZSyq+DT1H+kYH91jiL62pqQ3rNbGRNME9Cyhm8yn4dWzRczDWIBFFGW stUb8VdUblmYy4pSOLbwBaVXJ8D/dkxqPi7PXcHIE4tci/fyJk+7BwoOMTHcLWGCKMsu yCaqtii9p2OVrmDKxEBWLhc7iPanQcRrxWuc4y0sD6cMmmZvLDd/DtA8GmDozMpTCWac 25Dw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=o4kREfopSj9D5g5uleOIWdafqm7OqEWz1vFizJAa+RE=; b=F9gYBOWiQ3m9vn1/PHgYjXOBdHn67FhcIbDDsweBNYv3l0m4BDAfe5Wv6rw89JIVu+ Cipjwtxeg9aIrlDQtIhuVrGvXnz4e+ujJJKJX3dXHrz5HMN4+Sr/q6v/u4EFZyzrIZe7 5lJ2jfg5P2sokG2MHMRuCKFmI3ZejyYZgfPtXtnupBNYdBs4YDoXor0fMq/QEQjAZWrc HWqH7kFe388Co+dKNtq1RlEffjyULWPQycNhSd3TWTe59mZiSU8flkgFPumKVNzihBhw FrmH6xCs5PVtL/ygFgl+HKpw/61NVD0y/5ukS6q5niwr7TsC8ai1oZGcouPwrPQ2eJX5 2OLA== X-Gm-Message-State: AOAM532ACaAH+vDXvb4SzeE+DjSBZnpp1kwNG9eJjVvW0i1nmG5aI3Uv DXdyrLLkDFnR3EWxtErESZdy+4cp+ZSb9A== X-Google-Smtp-Source: ABdhPJzPuLPYd67XHDL650FRdTqHON30vBJIOAjkhFWc7A0ozHisnmaQscfR7rHeLmB5mQIO1BztnA== X-Received: by 2002:a17:90a:2ecb:: with SMTP id h11mr13619138pjs.195.1600454293867; Fri, 18 Sep 2020 11:38:13 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.38.12 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:38:13 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 15/81] target/arm: Implement SVE2 integer add/subtract wide Date: Fri, 18 Sep 2020 11:36:45 -0700 Message-Id: <20200918183751.2787647-16-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::1044; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x1044.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- v2: Fix select offsets (laurent desnogues). --- target/arm/helper-sve.h | 16 ++++++++++++++++ target/arm/sve.decode | 12 ++++++++++++ target/arm/sve_helper.c | 30 ++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 20 ++++++++++++++++++++ 4 files changed, 78 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 676ddfbb2d..a3cbfe0ea5 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -1391,6 +1391,22 @@ DEF_HELPER_FLAGS_4(sve2_uabdl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve2_uabdl_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve2_uabdl_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_saddw_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_saddw_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_saddw_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_ssubw_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_ssubw_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_ssubw_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_uaddw_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_uaddw_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_uaddw_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_usubw_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_usubw_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_usubw_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_4(sve_ld1bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) DEF_HELPER_FLAGS_4(sve_ld2bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) DEF_HELPER_FLAGS_4(sve_ld3bb_r, TCG_CALL_NO_WG, void, env, ptr, tl, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index a28172c017..b9a34726d6 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1184,3 +1184,15 @@ UABDLT 01000101 .. 0 ..... 00 1111 ..... ..... @rd_rn_rm SADDLBT 01000101 .. 0 ..... 1000 00 ..... ..... @rd_rn_rm SSUBLBT 01000101 .. 0 ..... 1000 10 ..... ..... @rd_rn_rm SSUBLTB 01000101 .. 0 ..... 1000 11 ..... ..... @rd_rn_rm + +## SVE2 integer add/subtract wide + +SADDWB 01000101 .. 0 ..... 010 000 ..... ..... @rd_rn_rm +SADDWT 01000101 .. 0 ..... 010 001 ..... ..... @rd_rn_rm +UADDWB 01000101 .. 0 ..... 010 010 ..... ..... @rd_rn_rm +UADDWT 01000101 .. 0 ..... 010 011 ..... ..... @rd_rn_rm + +SSUBWB 01000101 .. 0 ..... 010 100 ..... ..... @rd_rn_rm +SSUBWT 01000101 .. 0 ..... 010 101 ..... ..... @rd_rn_rm +USUBWB 01000101 .. 0 ..... 010 110 ..... ..... @rd_rn_rm +USUBWT 01000101 .. 0 ..... 010 111 ..... ..... @rd_rn_rm diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index ca39b0f6f1..75b87b1070 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1156,6 +1156,36 @@ DO_ZZZ_TB(sve2_uabdl_d, uint64_t, uint32_t, , H1_4, DO_ABD) #undef DO_ZZZ_TB +#define DO_ZZZ_WTB(NAME, TYPEW, TYPEN, HW, HN, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ +{ \ + intptr_t i, opr_sz = simd_oprsz(desc); \ + int sel2 = extract32(desc, SIMD_DATA_SHIFT, 1) * sizeof(TYPEN); \ + for (i = 0; i < opr_sz; i += sizeof(TYPEW)) { \ + TYPEW nn = *(TYPEW *)(vn + HW(i)); \ + TYPEW mm = *(TYPEN *)(vm + HN(i + sel2)); \ + *(TYPEW *)(vd + HW(i)) = OP(nn, mm); \ + } \ +} + +DO_ZZZ_WTB(sve2_saddw_h, int16_t, int8_t, H1_2, H1, DO_ADD) +DO_ZZZ_WTB(sve2_saddw_s, int32_t, int16_t, H1_4, H1_2, DO_ADD) +DO_ZZZ_WTB(sve2_saddw_d, int64_t, int32_t, , H1_4, DO_ADD) + +DO_ZZZ_WTB(sve2_ssubw_h, int16_t, int8_t, H1_2, H1, DO_SUB) +DO_ZZZ_WTB(sve2_ssubw_s, int32_t, int16_t, H1_4, H1_2, DO_SUB) +DO_ZZZ_WTB(sve2_ssubw_d, int64_t, int32_t, , H1_4, DO_SUB) + +DO_ZZZ_WTB(sve2_uaddw_h, uint16_t, uint8_t, H1_2, H1, DO_ADD) +DO_ZZZ_WTB(sve2_uaddw_s, uint32_t, uint16_t, H1_4, H1_2, DO_ADD) +DO_ZZZ_WTB(sve2_uaddw_d, uint64_t, uint32_t, , H1_4, DO_ADD) + +DO_ZZZ_WTB(sve2_usubw_h, uint16_t, uint8_t, H1_2, H1, DO_SUB) +DO_ZZZ_WTB(sve2_usubw_s, uint32_t, uint16_t, H1_4, H1_2, DO_SUB) +DO_ZZZ_WTB(sve2_usubw_d, uint64_t, uint32_t, , H1_4, DO_SUB) + +#undef DO_ZZZ_WTB + /* Two-operand reduction expander, controlled by a predicate. * The difference between TYPERED and TYPERET has to do with * sign-extension. E.g. for SMAX, TYPERED must be signed, diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 02eba872f9..a3eba11bab 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -6030,3 +6030,23 @@ DO_SVE2_ZZZ_TB(UABDLT, uabdl, true, true) DO_SVE2_ZZZ_TB(SADDLBT, saddl, false, true) DO_SVE2_ZZZ_TB(SSUBLBT, ssubl, false, true) DO_SVE2_ZZZ_TB(SSUBLTB, ssubl, true, false) + +#define DO_SVE2_ZZZ_WTB(NAME, name, SEL2) \ +static bool trans_##NAME(DisasContext *s, arg_rrr_esz *a) \ +{ \ + static gen_helper_gvec_3 * const fns[4] = { \ + NULL, gen_helper_sve2_##name##_h, \ + gen_helper_sve2_##name##_s, gen_helper_sve2_##name##_d, \ + }; \ + return do_sve2_zzw_ool(s, a, fns[a->esz], SEL2); \ +} + +DO_SVE2_ZZZ_WTB(SADDWB, saddw, false) +DO_SVE2_ZZZ_WTB(SADDWT, saddw, true) +DO_SVE2_ZZZ_WTB(SSUBWB, ssubw, false) +DO_SVE2_ZZZ_WTB(SSUBWT, ssubw, true) + +DO_SVE2_ZZZ_WTB(UADDWB, uaddw, false) +DO_SVE2_ZZZ_WTB(UADDWT, uaddw, true) +DO_SVE2_ZZZ_WTB(USUBWB, usubw, false) +DO_SVE2_ZZZ_WTB(USUBWT, usubw, true) From patchwork Fri Sep 18 18:36:46 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 305018 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id EC1E0C43464 for ; Fri, 18 Sep 2020 19:12:20 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 6834C21D7F for ; Fri, 18 Sep 2020 19:12:20 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="rvTjGViS" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 6834C21D7F Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:55606 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJLnv-0001Cc-FY for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 15:12:19 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:47964) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLH1-00086L-0q for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:19 -0400 Received: from mail-pg1-x541.google.com ([2607:f8b0:4864:20::541]:39379) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLGz-000751-3V for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:18 -0400 Received: by mail-pg1-x541.google.com with SMTP id d13so3976786pgl.6 for ; Fri, 18 Sep 2020 11:38:16 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=tM4z3uIRJm8YZecfVjb/gfFJtKuQrqT40n8NaxdOdqs=; b=rvTjGViSXYyANp7lbdo2PXWoA1k3WBQwlBu1Gj+fBLB8xGMsLA1bjL81UL0m7/QFUY TNrV+xRCu8LLCc6q0W2rmWzF5pfC6pxBmATO59HdmbnWmAiBQFr+dHkU2s7HBEcMO/KH 8KAfS4fmnwSfhbYl+FYsXkuTvJBzW1SiQWDIDY/5dYeYknnKq2lT60vNDePAQ9Vmsy23 Q7YxsQ9zZM0OLeS9c4yAsVC3uA5NREi4EjXcmdTqgdEXVnb80+RNgU/xJqAPeztne33z 4OVeflmdGj3mUCAXAJ2rhapZb8t0QKbZnuRUkkeqyDbvOyieky6OfgZRq76QjjH3I0sg kbRQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=tM4z3uIRJm8YZecfVjb/gfFJtKuQrqT40n8NaxdOdqs=; b=MTssOd9P3IIdyvKBvhyWslAASzrHCBLpPg6Go2gxtq8ooMzo8FT8QUL3PFiSv56Wjt S98R2roi1xZ+w7OUr/Pal2pDq5vOd/GHXRmYRxTDORGU3FQDsUKGrFGMOthFpnJaw4gO iy917sRv8Yt5uzwcL/3KjEgGvWRj9azJ377ol12LMsV0t8s27TTnh9ef41h2/7W8Ptok 5MBJ/qat98Xr51qcKQpMRc0JfO+ciFgz5HEGCBIsFr3/wMOe5HMuLby4Qm9Tjpx4Dd2A rcq7cZVOS+g7mV+gOAFi/i6ZHzHrZiKrs6FC7QmxF8lNN5Iz6+DiW5neimpWYhJVEE2L C1OA== X-Gm-Message-State: AOAM533SH7QpTNV+VLhQol0BwTDgNmrfRFs8SKDX9vX29Wy9EOdoqqdi zCqQqEmTcZz6OnBLBNvOy0fhPUeEjykj4g== X-Google-Smtp-Source: ABdhPJx7hls4/aQkRXI34ESf1V2fBHr1R3eAnrocFnG4tozdsLvCpvA0VxkD0q2aTvtKZBVXXdOmXA== X-Received: by 2002:aa7:81d5:0:b029:142:2501:39fa with SMTP id c21-20020aa781d50000b0290142250139famr17555085pfn.73.1600454295244; Fri, 18 Sep 2020 11:38:15 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.38.14 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:38:14 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 16/81] target/arm: Implement SVE2 integer multiply long Date: Fri, 18 Sep 2020 11:36:46 -0700 Message-Id: <20200918183751.2787647-17-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::541; envelope-from=richard.henderson@linaro.org; helo=mail-pg1-x541.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Exclude PMULL from this category for the moment. Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 15 +++++++++++++++ target/arm/sve.decode | 9 +++++++++ target/arm/sve_helper.c | 31 +++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 9 +++++++++ 4 files changed, 64 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index a3cbfe0ea5..fff7316292 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2347,4 +2347,19 @@ DEF_HELPER_FLAGS_6(sve_stdd_le_zd_mte, TCG_CALL_NO_WG, DEF_HELPER_FLAGS_6(sve_stdd_be_zd_mte, TCG_CALL_NO_WG, void, env, ptr, ptr, ptr, tl, i32) +DEF_HELPER_FLAGS_4(sve2_sqdmull_zzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqdmull_zzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqdmull_zzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_smull_zzz_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_smull_zzz_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_smull_zzz_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_umull_zzz_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_umull_zzz_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_umull_zzz_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_4(sve2_pmull_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index b9a34726d6..f39ca29320 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1196,3 +1196,12 @@ SSUBWB 01000101 .. 0 ..... 010 100 ..... ..... @rd_rn_rm SSUBWT 01000101 .. 0 ..... 010 101 ..... ..... @rd_rn_rm USUBWB 01000101 .. 0 ..... 010 110 ..... ..... @rd_rn_rm USUBWT 01000101 .. 0 ..... 010 111 ..... ..... @rd_rn_rm + +## SVE2 integer multiply long + +SQDMULLB_zzz 01000101 .. 0 ..... 011 000 ..... ..... @rd_rn_rm +SQDMULLT_zzz 01000101 .. 0 ..... 011 001 ..... ..... @rd_rn_rm +SMULLB_zzz 01000101 .. 0 ..... 011 100 ..... ..... @rd_rn_rm +SMULLT_zzz 01000101 .. 0 ..... 011 101 ..... ..... @rd_rn_rm +UMULLB_zzz 01000101 .. 0 ..... 011 110 ..... ..... @rd_rn_rm +UMULLT_zzz 01000101 .. 0 ..... 011 111 ..... ..... @rd_rn_rm diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 75b87b1070..fbaf1f2911 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1154,6 +1154,37 @@ DO_ZZZ_TB(sve2_uabdl_h, uint16_t, uint8_t, H1_2, H1, DO_ABD) DO_ZZZ_TB(sve2_uabdl_s, uint32_t, uint16_t, H1_4, H1_2, DO_ABD) DO_ZZZ_TB(sve2_uabdl_d, uint64_t, uint32_t, , H1_4, DO_ABD) +DO_ZZZ_TB(sve2_smull_zzz_h, int16_t, int8_t, H1_2, H1, DO_MUL) +DO_ZZZ_TB(sve2_smull_zzz_s, int32_t, int16_t, H1_4, H1_2, DO_MUL) +DO_ZZZ_TB(sve2_smull_zzz_d, int64_t, int32_t, , H1_4, DO_MUL) + +DO_ZZZ_TB(sve2_umull_zzz_h, uint16_t, uint8_t, H1_2, H1, DO_MUL) +DO_ZZZ_TB(sve2_umull_zzz_s, uint32_t, uint16_t, H1_4, H1_2, DO_MUL) +DO_ZZZ_TB(sve2_umull_zzz_d, uint64_t, uint32_t, , H1_4, DO_MUL) + +/* Note that the multiply cannot overflow, but the doubling can. */ +static inline int16_t do_sqdmull_h(int16_t n, int16_t m) +{ + int16_t val = n * m; + return DO_SQADD_H(val, val); +} + +static inline int32_t do_sqdmull_s(int32_t n, int32_t m) +{ + int32_t val = n * m; + return DO_SQADD_S(val, val); +} + +static inline int64_t do_sqdmull_d(int64_t n, int64_t m) +{ + int64_t val = n * m; + return do_sqadd_d(val, val); +} + +DO_ZZZ_TB(sve2_sqdmull_zzz_h, int16_t, int8_t, H1_2, H1, do_sqdmull_h) +DO_ZZZ_TB(sve2_sqdmull_zzz_s, int32_t, int16_t, H1_4, H1_2, do_sqdmull_s) +DO_ZZZ_TB(sve2_sqdmull_zzz_d, int64_t, int32_t, , H1_4, do_sqdmull_d) + #undef DO_ZZZ_TB #define DO_ZZZ_WTB(NAME, TYPEW, TYPEN, HW, HN, OP) \ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index a3eba11bab..71ea0cd0c9 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -6031,6 +6031,15 @@ DO_SVE2_ZZZ_TB(SADDLBT, saddl, false, true) DO_SVE2_ZZZ_TB(SSUBLBT, ssubl, false, true) DO_SVE2_ZZZ_TB(SSUBLTB, ssubl, true, false) +DO_SVE2_ZZZ_TB(SQDMULLB_zzz, sqdmull_zzz, false, false) +DO_SVE2_ZZZ_TB(SQDMULLT_zzz, sqdmull_zzz, true, true) + +DO_SVE2_ZZZ_TB(SMULLB_zzz, smull_zzz, false, false) +DO_SVE2_ZZZ_TB(SMULLT_zzz, smull_zzz, true, true) + +DO_SVE2_ZZZ_TB(UMULLB_zzz, umull_zzz, false, false) +DO_SVE2_ZZZ_TB(UMULLT_zzz, umull_zzz, true, true) + #define DO_SVE2_ZZZ_WTB(NAME, name, SEL2) \ static bool trans_##NAME(DisasContext *s, arg_rrr_esz *a) \ { \ From patchwork Fri Sep 18 18:36:47 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 273356 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.5 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B9910C43464 for ; Fri, 18 Sep 2020 18:51:42 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 237A9206A1 for ; Fri, 18 Sep 2020 18:51:42 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="W8HBu8bM" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 237A9206A1 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:32832 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJLTx-000403-2A for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 14:51:41 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:47976) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLH2-00089A-2W for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:20 -0400 Received: from mail-pj1-x1042.google.com ([2607:f8b0:4864:20::1042]:35205) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLH0-00075h-9z for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:19 -0400 Received: by mail-pj1-x1042.google.com with SMTP id jw11so3470986pjb.0 for ; Fri, 18 Sep 2020 11:38:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=80TlpzmA1GZsK3dg+P8GxR+XGATZYs7NilsFyZcBo9o=; b=W8HBu8bMXGrg4RI5V6G7Y93OuUjG8KAOuEeoBKKyPQFvQ/aFh9hxz06KCkSqkovW6q NkBLEXs1XdaJe4V0fiNGse+t2yycJxhGlIWej4sqQZb6lR2bcc5QvQnarhJrti+eQRfE G5av3unxfe+r2b0L/HYLBkLi1FpKUaYnlKtUg1PAEHL2eMdJFf5z0o6kewBcw7H9KL6o /16BcUQe58ziT0TC33ZM9sEeXNvzEBlowBYwRAsRThN9YHoROSdlE+ZVNqu6uHxIFUgS zvQbR6KuzBQLT07G9VvYmFDVdLgXcvelAQYRMfewX7knvm6zrG1Ep1/a6+SReouhRkDF 5hYQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=80TlpzmA1GZsK3dg+P8GxR+XGATZYs7NilsFyZcBo9o=; b=UGDXDAVa7o5YEHysFSuJ17uh+ojdDJwkd/vISnqv/lySmz/pYhp7jGzW03ekYc69OB LWV0/iNqkolTkmK+zUKLxwSyB/gE+aHUhWMRxPfnAJK0INEb3EYAebJLVfnEY7V3onDR 2w7N+OBf1rh5ArpsJ7slE+QVyiCtcl0ME2wrN53pv5jpgYdYL+xF0zTphq4D3WmYFCua ysrHaNFzuR8vs6Dfo7RV+wXRp0KB+ttFNGt1XvrjElhS8fAMGItAB4XyjuaKL7YEojss oq3XYbqYXgOW0TiQ1JWTDjmVCSpyFGhwbhAPb+JfgrQz5xSzvoqViHsDBLLpxv2Byj18 +itQ== X-Gm-Message-State: AOAM5325zphPujgnAUo9vX0VEibQt/f2Cne+LLZjR2NoVouP0RbrPhjZ TN123JEe9605/Pkt6SMpWzKJIvFXkaio6Q== X-Google-Smtp-Source: ABdhPJz1r9jAebZVTwspiOtAS1w6dC683rFtB3YROWi3plZdJaOb86iSNjsDh4XFTLlHd217S4rgFA== X-Received: by 2002:a17:902:d909:b029:d0:cbe1:e716 with SMTP id c9-20020a170902d909b02900d0cbe1e716mr34245061plz.36.1600454296544; Fri, 18 Sep 2020 11:38:16 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.38.15 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:38:15 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 17/81] target/arm: Implement PMULLB and PMULLT Date: Fri, 18 Sep 2020 11:36:47 -0700 Message-Id: <20200918183751.2787647-18-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::1042; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x1042.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/cpu.h | 10 ++++++++++ target/arm/helper-sve.h | 1 + target/arm/sve.decode | 2 ++ target/arm/translate-sve.c | 22 ++++++++++++++++++++++ target/arm/vec_helper.c | 24 ++++++++++++++++++++++++ 5 files changed, 59 insertions(+) diff --git a/target/arm/cpu.h b/target/arm/cpu.h index 6f80a85fd0..59415184db 100644 --- a/target/arm/cpu.h +++ b/target/arm/cpu.h @@ -3902,6 +3902,16 @@ static inline bool isar_feature_aa64_sve2(const ARMISARegisters *id) return FIELD_EX64(id->id_aa64zfr0, ID_AA64ZFR0, SVEVER) != 0; } +static inline bool isar_feature_aa64_sve2_aes(const ARMISARegisters *id) +{ + return FIELD_EX64(id->id_aa64zfr0, ID_AA64ZFR0, AES) != 0; +} + +static inline bool isar_feature_aa64_sve2_pmull128(const ARMISARegisters *id) +{ + return FIELD_EX64(id->id_aa64zfr0, ID_AA64ZFR0, AES) >= 2; +} + /* * Feature tests for "does this exist in either 32-bit or 64-bit?" */ diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index fff7316292..fd1ebe180f 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2363,3 +2363,4 @@ DEF_HELPER_FLAGS_4(sve2_umull_zzz_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve2_umull_zzz_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve2_pmull_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_pmull_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index f39ca29320..417b11fdd5 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1201,6 +1201,8 @@ USUBWT 01000101 .. 0 ..... 010 111 ..... ..... @rd_rn_rm SQDMULLB_zzz 01000101 .. 0 ..... 011 000 ..... ..... @rd_rn_rm SQDMULLT_zzz 01000101 .. 0 ..... 011 001 ..... ..... @rd_rn_rm +PMULLB 01000101 .. 0 ..... 011 010 ..... ..... @rd_rn_rm +PMULLT 01000101 .. 0 ..... 011 011 ..... ..... @rd_rn_rm SMULLB_zzz 01000101 .. 0 ..... 011 100 ..... ..... @rd_rn_rm SMULLT_zzz 01000101 .. 0 ..... 011 101 ..... ..... @rd_rn_rm UMULLB_zzz 01000101 .. 0 ..... 011 110 ..... ..... @rd_rn_rm diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 71ea0cd0c9..748730e1a8 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -6040,6 +6040,28 @@ DO_SVE2_ZZZ_TB(SMULLT_zzz, smull_zzz, true, true) DO_SVE2_ZZZ_TB(UMULLB_zzz, umull_zzz, false, false) DO_SVE2_ZZZ_TB(UMULLT_zzz, umull_zzz, true, true) +static bool do_trans_pmull(DisasContext *s, arg_rrr_esz *a, bool sel) +{ + static gen_helper_gvec_3 * const fns[4] = { + gen_helper_gvec_pmull_q, gen_helper_sve2_pmull_h, + NULL, gen_helper_sve2_pmull_d, + }; + if (a->esz == 0 && !dc_isar_feature(aa64_sve2_pmull128, s)) { + return false; + } + return do_sve2_zzw_ool(s, a, fns[a->esz], sel); +} + +static bool trans_PMULLB(DisasContext *s, arg_rrr_esz *a) +{ + return do_trans_pmull(s, a, false); +} + +static bool trans_PMULLT(DisasContext *s, arg_rrr_esz *a) +{ + return do_trans_pmull(s, a, true); +} + #define DO_SVE2_ZZZ_WTB(NAME, name, SEL2) \ static bool trans_##NAME(DisasContext *s, arg_rrr_esz *a) \ { \ diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index 84e1ae3933..dcaf3926e1 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -1750,6 +1750,30 @@ void HELPER(sve2_pmull_h)(void *vd, void *vn, void *vm, uint32_t desc) d[i] = pmull_h(nn, mm); } } + +static uint64_t pmull_d(uint64_t op1, uint64_t op2) +{ + uint64_t result = 0; + int i; + + for (i = 0; i < 32; ++i) { + uint64_t mask = -((op1 >> i) & 1); + result ^= (op2 << i) & mask; + } + return result; +} + +void HELPER(sve2_pmull_d)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t sel = H4(simd_data(desc)); + intptr_t i, opr_sz = simd_oprsz(desc); + uint32_t *n = vn, *m = vm; + uint64_t *d = vd; + + for (i = 0; i < opr_sz / 8; ++i) { + d[i] = pmull_d(n[2 * i + sel], m[2 * i + sel]); + } +} #endif #define DO_CMP0(NAME, TYPE, OP) \ From patchwork Fri Sep 18 18:36:48 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 273354 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 06481C43463 for ; Fri, 18 Sep 2020 18:55:11 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 43575221EC for ; Fri, 18 Sep 2020 18:55:10 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="czQKhR3r" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 43575221EC Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:40896 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJLXJ-0007UL-Dk for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 14:55:09 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:47996) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLH3-0008D1-Qf for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:21 -0400 Received: from mail-pg1-x542.google.com ([2607:f8b0:4864:20::542]:37414) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLH1-00076N-KU for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:21 -0400 Received: by mail-pg1-x542.google.com with SMTP id l71so3987279pge.4 for ; Fri, 18 Sep 2020 11:38:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=LFZDXA8IiL4Ps6z7MJFJVz7cCQ9oEUrlP9izr7ZEZ3Q=; b=czQKhR3rtFShsJAXDlVUw8sSEpPQx+UmiB6El9nhUr/C8ZGvP63gKvAotYcQBhgTYW CcWJYBqHFSa7G9hBQTUhiu7tuDJIdoyFt7xby+LHlm99Ng/Z35Kb2cds/H4uWJW0DmrC lAoj33zwwWp9WG/R6mmQjen9qEMlh+DNb/yv9Q31l96eXhS99y4fx3pS3Dwc2EYGImiG KJtem8Y2NoXg2g94GbyA8oT5fD3TDblI9Kgk63TnfY4Nu8mBXSVcqZethKeETvkOisUR GHlcrVBOpY1lO10RIv9sNk4aY4M0B7t3hy9A58KhlIB7tdBD6HxHTx1FhwrxOAWoHDHs VIuA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=LFZDXA8IiL4Ps6z7MJFJVz7cCQ9oEUrlP9izr7ZEZ3Q=; b=sj5stSKsriz689rod7R9l8ApldQ0OLC2V0nfPohEOowz2NIr0ORy/a2AxDpssOdz+l k1mSJdGMVh/Bzzil2eBdPCcqLizXmIjfVW2/s/bBfBrjCACuXGn0X2LhyRQyCzS9jFnJ IiV8804naWHNINvk6zzgWzbjTI081IFnNszs0FEWSwbgNkujaRyVQ9Xo0StrwKP1+X/g 6Sluuap/LrsjgsS/NYGVHKVKC6sXATEZ5gRcSc1sujRy9KbpaFH593oAo/loR2RXNmWn 6BpvCZGcON9m1pARSaI54pDhQqj0FQBc7UjKln/OilLZp24c7OKujoeSoRsrcU0eZvU8 sJCQ== X-Gm-Message-State: AOAM531q6WBRypqQbC7FBwTxOLjInkruo1efgg9LM59mjhjrhuHGjCMl Z+e3jh/Eh+XWo46KCZ4w9bIwWJ+pO0Ph3w== X-Google-Smtp-Source: ABdhPJxvwR0awpVsIaH1e+ngXPwEyKSb61GqUeiQOHihBPfDjYk73kUyWoZVy949cavuXJ5OyGJ4yQ== X-Received: by 2002:a62:86c3:0:b029:14e:e050:289 with SMTP id x186-20020a6286c30000b029014ee0500289mr55205pfd.50.1600454297833; Fri, 18 Sep 2020 11:38:17 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.38.16 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:38:17 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 18/81] target/arm: Implement SVE2 bitwise shift left long Date: Fri, 18 Sep 2020 11:36:48 -0700 Message-Id: <20200918183751.2787647-19-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::542; envelope-from=richard.henderson@linaro.org; helo=mail-pg1-x542.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 8 ++ target/arm/sve.decode | 8 ++ target/arm/sve_helper.c | 26 ++++++ target/arm/translate-sve.c | 159 +++++++++++++++++++++++++++++++++++++ 4 files changed, 201 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index fd1ebe180f..fd7abbf3ff 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2364,3 +2364,11 @@ DEF_HELPER_FLAGS_4(sve2_umull_zzz_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve2_pmull_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve2_pmull_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(sve2_sshll_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_sshll_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_sshll_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(sve2_ushll_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_ushll_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_ushll_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 417b11fdd5..851b336c7b 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1207,3 +1207,11 @@ SMULLB_zzz 01000101 .. 0 ..... 011 100 ..... ..... @rd_rn_rm SMULLT_zzz 01000101 .. 0 ..... 011 101 ..... ..... @rd_rn_rm UMULLB_zzz 01000101 .. 0 ..... 011 110 ..... ..... @rd_rn_rm UMULLT_zzz 01000101 .. 0 ..... 011 111 ..... ..... @rd_rn_rm + +## SVE2 bitwise shift left long + +# Note bit23 == 0 is handled by esz > 0 in do_sve2_shll_tb. +SSHLLB 01000101 .. 0 ..... 1010 00 ..... ..... @rd_rn_tszimm_shl +SSHLLT 01000101 .. 0 ..... 1010 01 ..... ..... @rd_rn_tszimm_shl +USHLLB 01000101 .. 0 ..... 1010 10 ..... ..... @rd_rn_tszimm_shl +USHLLT 01000101 .. 0 ..... 1010 11 ..... ..... @rd_rn_tszimm_shl diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index fbaf1f2911..5c0ed5208a 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -625,6 +625,8 @@ DO_ZPZZ(sve2_sqrshl_zpzz_h, int16_t, H1_2, do_sqrshl_h) DO_ZPZZ(sve2_sqrshl_zpzz_s, int32_t, H1_4, do_sqrshl_s) DO_ZPZZ_D(sve2_sqrshl_zpzz_d, int64_t, do_sqrshl_d) +#undef do_sqrshl_d + #define do_uqrshl_b(n, m) \ ({ uint32_t discard; do_uqrshl_bhs(n, (int8_t)m, 8, true, &discard); }) #define do_uqrshl_h(n, m) \ @@ -639,6 +641,8 @@ DO_ZPZZ(sve2_uqrshl_zpzz_h, uint16_t, H1_2, do_uqrshl_h) DO_ZPZZ(sve2_uqrshl_zpzz_s, uint32_t, H1_4, do_uqrshl_s) DO_ZPZZ_D(sve2_uqrshl_zpzz_d, uint64_t, do_uqrshl_d) +#undef do_uqrshl_d + #define DO_HADD_BHS(n, m) (((int64_t)n + m) >> 1) #define DO_HADD_D(n, m) ((n >> 1) + (m >> 1) + (n & m & 1)) @@ -1217,6 +1221,28 @@ DO_ZZZ_WTB(sve2_usubw_d, uint64_t, uint32_t, , H1_4, DO_SUB) #undef DO_ZZZ_WTB +#define DO_ZZI_SHLL(NAME, TYPEW, TYPEN, HW, HN) \ +void HELPER(NAME)(void *vd, void *vn, uint32_t desc) \ +{ \ + intptr_t i, opr_sz = simd_oprsz(desc); \ + intptr_t sel = (simd_data(desc) & 1) * sizeof(TYPEN); \ + int shift = simd_data(desc) >> 1; \ + for (i = 0; i < opr_sz; i += sizeof(TYPEW)) { \ + TYPEW nn = *(TYPEN *)(vn + HN(i + sel)); \ + *(TYPEW *)(vd + HW(i)) = nn << shift; \ + } \ +} + +DO_ZZI_SHLL(sve2_sshll_h, int16_t, int8_t, H1_2, H1) +DO_ZZI_SHLL(sve2_sshll_s, int32_t, int16_t, H1_4, H1_2) +DO_ZZI_SHLL(sve2_sshll_d, int64_t, int32_t, , H1_4) + +DO_ZZI_SHLL(sve2_ushll_h, uint16_t, uint8_t, H1_2, H1) +DO_ZZI_SHLL(sve2_ushll_s, uint32_t, uint16_t, H1_4, H1_2) +DO_ZZI_SHLL(sve2_ushll_d, uint64_t, uint32_t, , H1_4) + +#undef DO_ZZI_SHLL + /* Two-operand reduction expander, controlled by a predicate. * The difference between TYPERED and TYPERET has to do with * sign-extension. E.g. for SMAX, TYPERED must be signed, diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 748730e1a8..4c3c197b5d 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -6081,3 +6081,162 @@ DO_SVE2_ZZZ_WTB(UADDWB, uaddw, false) DO_SVE2_ZZZ_WTB(UADDWT, uaddw, true) DO_SVE2_ZZZ_WTB(USUBWB, usubw, false) DO_SVE2_ZZZ_WTB(USUBWT, usubw, true) + +static void gen_sshll_vec(unsigned vece, TCGv_vec d, TCGv_vec n, int64_t imm) +{ + int top = imm & 1; + int shl = imm >> 1; + int halfbits = 4 << vece; + + if (top) { + if (shl == halfbits) { + TCGv_vec t = tcg_temp_new_vec_matching(d); + tcg_gen_dupi_vec(vece, t, MAKE_64BIT_MASK(halfbits, halfbits)); + tcg_gen_and_vec(vece, d, n, t); + tcg_temp_free_vec(t); + } else { + tcg_gen_sari_vec(vece, d, n, halfbits); + tcg_gen_shli_vec(vece, d, d, shl); + } + } else { + tcg_gen_shli_vec(vece, d, n, halfbits); + tcg_gen_sari_vec(vece, d, d, halfbits - shl); + } +} + +static void gen_ushll_i64(unsigned vece, TCGv_i64 d, TCGv_i64 n, int imm) +{ + int halfbits = 4 << vece; + int top = imm & 1; + int shl = (imm >> 1); + int shift; + uint64_t mask; + + mask = MAKE_64BIT_MASK(0, halfbits); + mask <<= shl; + mask = dup_const(vece, mask); + + shift = shl - top * halfbits; + if (shift < 0) { + tcg_gen_shri_i64(d, n, -shift); + } else { + tcg_gen_shri_i64(d, n, shift); + } + tcg_gen_andi_i64(d, d, mask); +} + +static void gen_ushll16_i64(TCGv_i64 d, TCGv_i64 n, int64_t imm) +{ + gen_ushll_i64(MO_16, d, n, imm); +} + +static void gen_ushll32_i64(TCGv_i64 d, TCGv_i64 n, int64_t imm) +{ + gen_ushll_i64(MO_32, d, n, imm); +} + +static void gen_ushll64_i64(TCGv_i64 d, TCGv_i64 n, int64_t imm) +{ + gen_ushll_i64(MO_64, d, n, imm); +} + +static void gen_ushll_vec(unsigned vece, TCGv_vec d, TCGv_vec n, int64_t imm) +{ + int halfbits = 4 << vece; + int top = imm & 1; + int shl = imm >> 1; + + if (top) { + if (shl == halfbits) { + TCGv_vec t = tcg_temp_new_vec_matching(d); + tcg_gen_dupi_vec(vece, t, MAKE_64BIT_MASK(halfbits, halfbits)); + tcg_gen_and_vec(vece, d, n, t); + tcg_temp_free_vec(t); + } else { + tcg_gen_shri_vec(vece, d, n, halfbits); + tcg_gen_shli_vec(vece, d, d, shl); + } + } else { + if (shl == 0) { + TCGv_vec t = tcg_temp_new_vec_matching(d); + tcg_gen_dupi_vec(vece, t, MAKE_64BIT_MASK(0, halfbits)); + tcg_gen_and_vec(vece, d, n, t); + tcg_temp_free_vec(t); + } else { + tcg_gen_shli_vec(vece, d, n, halfbits); + tcg_gen_shri_vec(vece, d, d, halfbits - shl); + } + } +} + +static bool do_sve2_shll_tb(DisasContext *s, arg_rri_esz *a, + bool sel, bool uns) +{ + static const TCGOpcode sshll_list[] = { + INDEX_op_shli_vec, INDEX_op_sari_vec, 0 + }; + static const TCGOpcode ushll_list[] = { + INDEX_op_shli_vec, INDEX_op_shri_vec, 0 + }; + static const GVecGen2i ops[2][3] = { + { { .fniv = gen_sshll_vec, + .opt_opc = sshll_list, + .fno = gen_helper_sve2_sshll_h, + .vece = MO_16 }, + { .fniv = gen_sshll_vec, + .opt_opc = sshll_list, + .fno = gen_helper_sve2_sshll_s, + .vece = MO_32 }, + { .fniv = gen_sshll_vec, + .opt_opc = sshll_list, + .fno = gen_helper_sve2_sshll_d, + .vece = MO_64 } }, + { { .fni8 = gen_ushll16_i64, + .fniv = gen_ushll_vec, + .opt_opc = ushll_list, + .fno = gen_helper_sve2_ushll_h, + .vece = MO_16 }, + { .fni8 = gen_ushll32_i64, + .fniv = gen_ushll_vec, + .opt_opc = ushll_list, + .fno = gen_helper_sve2_ushll_s, + .vece = MO_32 }, + { .fni8 = gen_ushll64_i64, + .fniv = gen_ushll_vec, + .opt_opc = ushll_list, + .fno = gen_helper_sve2_ushll_d, + .vece = MO_64 } }, + }; + + if (a->esz < 0 || a->esz > 2 || !dc_isar_feature(aa64_sve2, s)) { + return false; + } + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + tcg_gen_gvec_2i(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vsz, vsz, (a->imm << 1) | sel, + &ops[uns][a->esz]); + } + return true; +} + +static bool trans_SSHLLB(DisasContext *s, arg_rri_esz *a) +{ + return do_sve2_shll_tb(s, a, false, false); +} + +static bool trans_SSHLLT(DisasContext *s, arg_rri_esz *a) +{ + return do_sve2_shll_tb(s, a, true, false); +} + +static bool trans_USHLLB(DisasContext *s, arg_rri_esz *a) +{ + return do_sve2_shll_tb(s, a, false, true); +} + +static bool trans_USHLLT(DisasContext *s, arg_rri_esz *a) +{ + return do_sve2_shll_tb(s, a, true, true); +} From patchwork Fri Sep 18 18:36:49 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 273357 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B97D1C43463 for ; Fri, 18 Sep 2020 18:50:59 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 1F56C206A1 for ; Fri, 18 Sep 2020 18:50:59 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="gFMfHoHH" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 1F56C206A1 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:58960 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJLTG-00038b-5Z for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 14:50:58 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48004) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLH4-0008Er-L2 for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:22 -0400 Received: from mail-pg1-x532.google.com ([2607:f8b0:4864:20::532]:42195) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLH2-00076X-Rs for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:22 -0400 Received: by mail-pg1-x532.google.com with SMTP id k14so3971979pgi.9 for ; Fri, 18 Sep 2020 11:38:20 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=QqsaL0EBukwhamIocsMzA86GugIvb/1iB81VRSn7764=; b=gFMfHoHH3v2cgX+a6PK2U5Uv71um76ju7GOQeCRr6pwAPvb/O+rsliM36F/rOnh0Fy 2396E0gTX8QNauiwvuCLXwSCVUIHitFUU0+6Ns1fvpVpjf0COJZ3VIVGKlwEFOCLrcnP bgBRlBjwJdJF/DcTMUG7y08Zdn+VsHiB7k6s7pOi1mRpdS80fxXjEYDpzJ++ca10G7gk FrdB2n+YxO9FD9VPxBt0mmUKNo5bvShip99mAoV1FOvPMfjwTei29DAx6N6bU0LvZ4Fq Duku1l8Fz5LwzHR8pRhJ+UVrm2ItD8n2/1XGQ+EC+yY3qVfYW8BXGMmR3KhDfYONG8qs syFA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=QqsaL0EBukwhamIocsMzA86GugIvb/1iB81VRSn7764=; b=Xuxn6cnocByw+TAjMGazrlZ6wubKGtk7UuL8XWTtm/3JwSW+xLqU6YbWT+1GcvDuJ7 RDNZSoVUVeFNDBmkEGCSd2zZhR9gxGHf+6NeIX6hJAWBiQK9zGfkb5gEY93zblrzRRNt IDAlsbnCQaxd5vkuTrU0FAljiGxFD+ATUqzqrPs1+zT68HF7WRYq4vj2wLyFolUntgNo oNIYeGNqfdOOPJoMrTvwHvcOa3nHFr1YldxM294OkXXUjB7Vva5pa1DBrtZNPp0HaJ+I /zTKQKlUBcA5UnPSkzMjPVXyh/+VoaNP+X5vIesTkSOVWhkPKVOKbmWvCIp4MMnmytC2 nWuw== X-Gm-Message-State: AOAM532PwK2M46k7VOfqw5sP5jIzG8b4YpBPMVbAkiJrykkEf2fGWT/z 30V9GQO6TMADYSj5iAaeaF/CAhb26dPmtA== X-Google-Smtp-Source: ABdhPJyvK8NZNyN1lajYTz4ZNuFx2vfBa4bLR+iKgoGp727Cu4mYf07KB2D2QKOCuNBMk0FOepx9jQ== X-Received: by 2002:aa7:911a:0:b029:13e:d13d:a13d with SMTP id 26-20020aa7911a0000b029013ed13da13dmr32964455pfh.37.1600454299094; Fri, 18 Sep 2020 11:38:19 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.38.18 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:38:18 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 19/81] target/arm: Implement SVE2 bitwise exclusive-or interleaved Date: Fri, 18 Sep 2020 11:36:49 -0700 Message-Id: <20200918183751.2787647-20-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::532; envelope-from=richard.henderson@linaro.org; helo=mail-pg1-x532.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 5 +++++ target/arm/sve.decode | 5 +++++ target/arm/sve_helper.c | 20 ++++++++++++++++++++ target/arm/translate-sve.c | 19 +++++++++++++++++++ 4 files changed, 49 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index fd7abbf3ff..cdd3cab607 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2372,3 +2372,8 @@ DEF_HELPER_FLAGS_3(sve2_sshll_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve2_ushll_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve2_ushll_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve2_ushll_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_eoril_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_eoril_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_eoril_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_eoril_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 851b336c7b..79d915cf5b 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1215,3 +1215,8 @@ SSHLLB 01000101 .. 0 ..... 1010 00 ..... ..... @rd_rn_tszimm_shl SSHLLT 01000101 .. 0 ..... 1010 01 ..... ..... @rd_rn_tszimm_shl USHLLB 01000101 .. 0 ..... 1010 10 ..... ..... @rd_rn_tszimm_shl USHLLT 01000101 .. 0 ..... 1010 11 ..... ..... @rd_rn_tszimm_shl + +## SVE2 bitwise exclusive-or interleaved + +EORBT 01000101 .. 0 ..... 10010 0 ..... ..... @rd_rn_rm +EORTB 01000101 .. 0 ..... 10010 1 ..... ..... @rd_rn_rm diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 5c0ed5208a..9133d0a58d 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1221,6 +1221,26 @@ DO_ZZZ_WTB(sve2_usubw_d, uint64_t, uint32_t, , H1_4, DO_SUB) #undef DO_ZZZ_WTB +#define DO_ZZZ_NTB(NAME, TYPE, H, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ +{ \ + intptr_t i, opr_sz = simd_oprsz(desc); \ + intptr_t sel1 = extract32(desc, SIMD_DATA_SHIFT, 1) * sizeof(TYPE); \ + intptr_t sel2 = extract32(desc, SIMD_DATA_SHIFT + 1, 1) * sizeof(TYPE); \ + for (i = 0; i < opr_sz; i += 2 * sizeof(TYPE)) { \ + TYPE nn = *(TYPE *)(vn + H(i + sel1)); \ + TYPE mm = *(TYPE *)(vm + H(i + sel2)); \ + *(TYPE *)(vd + H(i + sel1)) = OP(nn, mm); \ + } \ +} + +DO_ZZZ_NTB(sve2_eoril_b, uint8_t, H1, DO_EOR) +DO_ZZZ_NTB(sve2_eoril_h, uint16_t, H1_2, DO_EOR) +DO_ZZZ_NTB(sve2_eoril_s, uint32_t, H1_4, DO_EOR) +DO_ZZZ_NTB(sve2_eoril_d, uint64_t, , DO_EOR) + +#undef DO_ZZZ_NTB + #define DO_ZZI_SHLL(NAME, TYPEW, TYPEN, HW, HN) \ void HELPER(NAME)(void *vd, void *vn, uint32_t desc) \ { \ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 4c3c197b5d..0a194a0e8c 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -6040,6 +6040,25 @@ DO_SVE2_ZZZ_TB(SMULLT_zzz, smull_zzz, true, true) DO_SVE2_ZZZ_TB(UMULLB_zzz, umull_zzz, false, false) DO_SVE2_ZZZ_TB(UMULLT_zzz, umull_zzz, true, true) +static bool do_eor_tb(DisasContext *s, arg_rrr_esz *a, bool sel1) +{ + static gen_helper_gvec_3 * const fns[4] = { + gen_helper_sve2_eoril_b, gen_helper_sve2_eoril_h, + gen_helper_sve2_eoril_s, gen_helper_sve2_eoril_d, + }; + return do_sve2_zzw_ool(s, a, fns[a->esz], (!sel1 << 1) | sel1); +} + +static bool trans_EORBT(DisasContext *s, arg_rrr_esz *a) +{ + return do_eor_tb(s, a, false); +} + +static bool trans_EORTB(DisasContext *s, arg_rrr_esz *a) +{ + return do_eor_tb(s, a, true); +} + static bool do_trans_pmull(DisasContext *s, arg_rrr_esz *a, bool sel) { static gen_helper_gvec_3 * const fns[4] = { From patchwork Fri Sep 18 18:36:50 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 273342 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 68348C43463 for ; Fri, 18 Sep 2020 19:14:47 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id D4E3C21707 for ; Fri, 18 Sep 2020 19:14:46 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="hmHehQn8" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org D4E3C21707 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:35642 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJLqH-0004aB-Ql for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 15:14:45 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48020) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLH6-0008I4-2N for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:24 -0400 Received: from mail-pj1-x102e.google.com ([2607:f8b0:4864:20::102e]:38748) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLH4-00076d-3K for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:23 -0400 Received: by mail-pj1-x102e.google.com with SMTP id u3so3458959pjr.3 for ; Fri, 18 Sep 2020 11:38:21 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=ZbLRc+IuD/nbM0BZc/f1DlwGe0rxW4uXRdQXUe4P2FY=; b=hmHehQn8U0+FG94/i9fACPsgggM4cJ3wHQTD0cAdvEM4axP8r3b0A521UtteVOvvuP FJZ2Q16C1zZxWb86fxbyC9136zBU/YvGSbJ6h9IF5EH20s8p8jl7wAcu2c29wEhNqBsh 8hYMJ82d6NJCXaOJmItS7iKfXTFBJRV1yxk9HT/yT3GDekAQ7/lAJxefbtQX3JXdro4u sJBys+wH/69pVY7SCToHohh+kwF69PJHMhhb09XDoHvkxa+90BAJ9QdiAqwCTvxQ2MqT omCdIKEs1Fuq4yvZCVMUi+LX7v2Pf4T0tT4Wd/7VKhiny1cVPudvtpiRy9WQSzGH8oF2 H5XA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=ZbLRc+IuD/nbM0BZc/f1DlwGe0rxW4uXRdQXUe4P2FY=; b=rhyY+H7jdr+cOZWz5fsUcZ2SyJjAIMqZqpPxu0kptAywG1xdUN3Mb0StVPzJaHkovE zHs/lq5TrBSOj3Y0MZC9UJimLepuvYmYSl44uG1yTuhujDv8YT/Ik3X9xcscQP8TvIB5 jeu+HmM0xSZj+PY386TGMQN63xT4eWtvMV4GspQ0MdpkbxVQuj2AbUgp7mYqxxyoLXlU coQhH+OFgJvOxqZ57NEQY4+55B255mY5gtfuitVGiNYRAGQ8wAWbBOKUvERPNHr5c9Xr /16EZSEx1+CzsCsRGkXf9Htt8TFEIY6W7ySQ4iqAN/KQbIPaQiFVyCkIjp2R73UM9aRO tcSA== X-Gm-Message-State: AOAM532kvRtizer4YtP9csDI72iWFiTuOvU5ZiO2Zg/iBHL8VqQWWnZ/ ZXjG+DMm4OWtdH8LAOGMbBOj5wijy7W5JQ== X-Google-Smtp-Source: ABdhPJzOlWRuUA/Bc1BM5RSMgminb/vRknoJ+mDa/Quqh+7uQBtcb2t+jqIjYEyOnPwX3sABVdqgnw== X-Received: by 2002:a17:902:6b05:b029:d2:8b5:14 with SMTP id o5-20020a1709026b05b02900d208b50014mr3790154plk.80.1600454300344; Fri, 18 Sep 2020 11:38:20 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.38.19 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:38:19 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 20/81] target/arm: Implement SVE2 bitwise permute Date: Fri, 18 Sep 2020 11:36:50 -0700 Message-Id: <20200918183751.2787647-21-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::102e; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x102e.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/cpu.h | 5 +++ target/arm/helper-sve.h | 15 ++++++++ target/arm/sve.decode | 6 ++++ target/arm/sve_helper.c | 73 ++++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 36 +++++++++++++++++++ 5 files changed, 135 insertions(+) diff --git a/target/arm/cpu.h b/target/arm/cpu.h index 59415184db..5ed2f2c65b 100644 --- a/target/arm/cpu.h +++ b/target/arm/cpu.h @@ -3912,6 +3912,11 @@ static inline bool isar_feature_aa64_sve2_pmull128(const ARMISARegisters *id) return FIELD_EX64(id->id_aa64zfr0, ID_AA64ZFR0, AES) >= 2; } +static inline bool isar_feature_aa64_sve2_bitperm(const ARMISARegisters *id) +{ + return FIELD_EX64(id->id_aa64zfr0, ID_AA64ZFR0, BITPERM) != 0; +} + /* * Feature tests for "does this exist in either 32-bit or 64-bit?" */ diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index cdd3cab607..492079578d 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2377,3 +2377,18 @@ DEF_HELPER_FLAGS_4(sve2_eoril_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve2_eoril_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve2_eoril_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve2_eoril_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_bext_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_bext_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_bext_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_bext_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_bdep_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_bdep_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_bdep_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_bdep_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_bgrp_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_bgrp_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_bgrp_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_bgrp_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 79d915cf5b..b316610bbb 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1220,3 +1220,9 @@ USHLLT 01000101 .. 0 ..... 1010 11 ..... ..... @rd_rn_tszimm_shl EORBT 01000101 .. 0 ..... 10010 0 ..... ..... @rd_rn_rm EORTB 01000101 .. 0 ..... 10010 1 ..... ..... @rd_rn_rm + +## SVE2 bitwise permute + +BEXT 01000101 .. 0 ..... 1011 00 ..... ..... @rd_rn_rm +BDEP 01000101 .. 0 ..... 1011 01 ..... ..... @rd_rn_rm +BGRP 01000101 .. 0 ..... 1011 10 ..... ..... @rd_rn_rm diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 9133d0a58d..fffd3ca4a5 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1241,6 +1241,79 @@ DO_ZZZ_NTB(sve2_eoril_d, uint64_t, , DO_EOR) #undef DO_ZZZ_NTB +#define DO_BITPERM(NAME, TYPE, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ +{ \ + intptr_t i, opr_sz = simd_oprsz(desc); \ + for (i = 0; i < opr_sz; i += sizeof(TYPE)) { \ + TYPE nn = *(TYPE *)(vn + i); \ + TYPE mm = *(TYPE *)(vm + i); \ + *(TYPE *)(vd + i) = OP(nn, mm, sizeof(TYPE) * 8); \ + } \ +} + +static uint64_t bitextract(uint64_t data, uint64_t mask, int n) +{ + uint64_t res = 0; + int db, rb = 0; + + for (db = 0; db < n; ++db) { + if ((mask >> db) & 1) { + res |= ((data >> db) & 1) << rb; + ++rb; + } + } + return res; +} + +DO_BITPERM(sve2_bext_b, uint8_t, bitextract) +DO_BITPERM(sve2_bext_h, uint16_t, bitextract) +DO_BITPERM(sve2_bext_s, uint32_t, bitextract) +DO_BITPERM(sve2_bext_d, uint64_t, bitextract) + +static uint64_t bitdeposit(uint64_t data, uint64_t mask, int n) +{ + uint64_t res = 0; + int rb, db = 0; + + for (rb = 0; rb < n; ++rb) { + if ((mask >> rb) & 1) { + res |= ((data >> db) & 1) << rb; + ++db; + } + } + return res; +} + +DO_BITPERM(sve2_bdep_b, uint8_t, bitdeposit) +DO_BITPERM(sve2_bdep_h, uint16_t, bitdeposit) +DO_BITPERM(sve2_bdep_s, uint32_t, bitdeposit) +DO_BITPERM(sve2_bdep_d, uint64_t, bitdeposit) + +static uint64_t bitgroup(uint64_t data, uint64_t mask, int n) +{ + uint64_t resm = 0, resu = 0; + int db, rbm = 0, rbu = 0; + + for (db = 0; db < n; ++db) { + uint64_t val = (data >> db) & 1; + if ((mask >> db) & 1) { + resm |= val << rbm++; + } else { + resu |= val << rbu++; + } + } + + return resm | (resu << rbm); +} + +DO_BITPERM(sve2_bgrp_b, uint8_t, bitgroup) +DO_BITPERM(sve2_bgrp_h, uint16_t, bitgroup) +DO_BITPERM(sve2_bgrp_s, uint32_t, bitgroup) +DO_BITPERM(sve2_bgrp_d, uint64_t, bitgroup) + +#undef DO_BITPERM + #define DO_ZZI_SHLL(NAME, TYPEW, TYPEN, HW, HN) \ void HELPER(NAME)(void *vd, void *vn, uint32_t desc) \ { \ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 0a194a0e8c..c2be092370 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -6259,3 +6259,39 @@ static bool trans_USHLLT(DisasContext *s, arg_rri_esz *a) { return do_sve2_shll_tb(s, a, true, true); } + +static bool trans_BEXT(DisasContext *s, arg_rrr_esz *a) +{ + static gen_helper_gvec_3 * const fns[4] = { + gen_helper_sve2_bext_b, gen_helper_sve2_bext_h, + gen_helper_sve2_bext_s, gen_helper_sve2_bext_d, + }; + if (!dc_isar_feature(aa64_sve2_bitperm, s)) { + return false; + } + return do_sve2_zzw_ool(s, a, fns[a->esz], 0); +} + +static bool trans_BDEP(DisasContext *s, arg_rrr_esz *a) +{ + static gen_helper_gvec_3 * const fns[4] = { + gen_helper_sve2_bdep_b, gen_helper_sve2_bdep_h, + gen_helper_sve2_bdep_s, gen_helper_sve2_bdep_d, + }; + if (!dc_isar_feature(aa64_sve2_bitperm, s)) { + return false; + } + return do_sve2_zzw_ool(s, a, fns[a->esz], 0); +} + +static bool trans_BGRP(DisasContext *s, arg_rrr_esz *a) +{ + static gen_helper_gvec_3 * const fns[4] = { + gen_helper_sve2_bgrp_b, gen_helper_sve2_bgrp_h, + gen_helper_sve2_bgrp_s, gen_helper_sve2_bgrp_d, + }; + if (!dc_isar_feature(aa64_sve2_bitperm, s)) { + return false; + } + return do_sve2_zzw_ool(s, a, fns[a->esz], 0); +} From patchwork Fri Sep 18 18:36:51 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 305014 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 10ABEC43464 for ; Fri, 18 Sep 2020 19:18:46 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id A7F8D21D7F for ; Fri, 18 Sep 2020 19:18:45 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="vlGZqNyr" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org A7F8D21D7F Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:43956 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJLu8-0008AE-Ga for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 15:18:44 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48032) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLH7-0008Ky-8H for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:25 -0400 Received: from mail-pg1-x52d.google.com ([2607:f8b0:4864:20::52d]:43724) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLH5-00076m-BI for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:24 -0400 Received: by mail-pg1-x52d.google.com with SMTP id t14so3968748pgl.10 for ; Fri, 18 Sep 2020 11:38:22 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=vijPLZsfKPtAfMgKZB4Xg2CdBhksRxTz9Vqh+PqvAG0=; b=vlGZqNyrwdz0H655/vIp+V1sW3IGTLEQVh3SsTjrWVyv0qhWKCPGPiOz1tw956xSzi PqoZU6QDqluRRdXh73lCsqnDXPbSiR1W63XoyLpXU9iTSnvOfVw3ltK9W31VZ62g/+qE 8uMbnpkrJVh40g6VMJzsLStutCu9jg4XhZJh84SiD5N0KUBl7o26K85nTrkRkpfeO//X k+qaTnDxa8YVJzRMNM2vmAmSaZmq/ylm1JEby/aZiveQ2NeTaX7qQdYmgyPXJ2bbyACo ebjOqN+5VJK4/RA1COXCyycM53beEIYTOkjz2MpNoGWbQa4GokUN6HFakQ6Z6fizymdq dwzg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=vijPLZsfKPtAfMgKZB4Xg2CdBhksRxTz9Vqh+PqvAG0=; b=CcBav+8q8SpHfikSHX00JGgfGPVwwzQO0WyNenKNIWz6OERgS9HoUo1QMVyqX/hS6l TH4/fyWFGd+A7+Mhf8oy2koyigEvfN64VEtJtlOeO42nqJwkDIdAnMKsvC0DbyOXbQYV HMgw57HQhhRt4jII/h+nOz3D76rrVjMjeN83e9SKrx1cRx35XpNM4+lm3jBmOi3foz6C wKw1i+Z/tdHyY96INcQ7CfP1ax+u8RyIPzY7ELzYuyGYHBvYvghjKcaZwujDjbNYEyet rGZkOyUH90pKAnKE9YPqXUOye0d/pRSHI/wCLFYIidatXd4mSUEC94BueR0KhjC4YuqL eBzQ== X-Gm-Message-State: AOAM533mrAX5yFUKtRJANaWg06DCDVSYqexittwRjRQsb6SQmEMBzpzv 9wfwrLJg3lj+ZRmOto3bMWTXxhfM8jbpeQ== X-Google-Smtp-Source: ABdhPJwMyV8mXhzwP9XWTMxQ/pbaOArIn1iNdW2aMyf+HGHdd75OMe/SBAhBwJM9rH9jndG4fJc30A== X-Received: by 2002:a63:e813:: with SMTP id s19mr28294697pgh.33.1600454301568; Fri, 18 Sep 2020 11:38:21 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.38.20 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:38:20 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 21/81] target/arm: Implement SVE2 complex integer add Date: Fri, 18 Sep 2020 11:36:51 -0700 Message-Id: <20200918183751.2787647-22-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::52d; envelope-from=richard.henderson@linaro.org; helo=mail-pg1-x52d.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- v2: Fix subtraction ordering (laurent desnogues). --- target/arm/helper-sve.h | 10 +++++++++ target/arm/sve.decode | 9 ++++++++ target/arm/sve_helper.c | 42 ++++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 31 ++++++++++++++++++++++++++++ 4 files changed, 92 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 492079578d..b399fb2576 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2392,3 +2392,13 @@ DEF_HELPER_FLAGS_4(sve2_bgrp_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve2_bgrp_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve2_bgrp_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve2_bgrp_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_cadd_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_cadd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_cadd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_cadd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_sqcadd_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqcadd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqcadd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqcadd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index b316610bbb..655cb5c12f 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1226,3 +1226,12 @@ EORTB 01000101 .. 0 ..... 10010 1 ..... ..... @rd_rn_rm BEXT 01000101 .. 0 ..... 1011 00 ..... ..... @rd_rn_rm BDEP 01000101 .. 0 ..... 1011 01 ..... ..... @rd_rn_rm BGRP 01000101 .. 0 ..... 1011 10 ..... ..... @rd_rn_rm + +#### SVE2 Accumulate + +## SVE2 complex integer add + +CADD_rot90 01000101 .. 00000 0 11011 0 ..... ..... @rdn_rm +CADD_rot270 01000101 .. 00000 0 11011 1 ..... ..... @rdn_rm +SQCADD_rot90 01000101 .. 00000 1 11011 0 ..... ..... @rdn_rm +SQCADD_rot270 01000101 .. 00000 1 11011 1 ..... ..... @rdn_rm diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index fffd3ca4a5..b8541168bf 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1314,6 +1314,48 @@ DO_BITPERM(sve2_bgrp_d, uint64_t, bitgroup) #undef DO_BITPERM +#define DO_CADD(NAME, TYPE, H, ADD_OP, SUB_OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ +{ \ + intptr_t i, opr_sz = simd_oprsz(desc); \ + int sub_r = simd_data(desc); \ + if (sub_r) { \ + for (i = 0; i < opr_sz; i += 2 * sizeof(TYPE)) { \ + TYPE acc_r = *(TYPE *)(vn + H(i)); \ + TYPE acc_i = *(TYPE *)(vn + H(i + sizeof(TYPE))); \ + TYPE el2_r = *(TYPE *)(vm + H(i)); \ + TYPE el2_i = *(TYPE *)(vm + H(i + sizeof(TYPE))); \ + acc_r = ADD_OP(acc_r, el2_i); \ + acc_i = SUB_OP(acc_i, el2_r); \ + *(TYPE *)(vd + H(i)) = acc_r; \ + *(TYPE *)(vd + H(i + sizeof(TYPE))) = acc_i; \ + } \ + } else { \ + for (i = 0; i < opr_sz; i += 2 * sizeof(TYPE)) { \ + TYPE acc_r = *(TYPE *)(vn + H(i)); \ + TYPE acc_i = *(TYPE *)(vn + H(i + sizeof(TYPE))); \ + TYPE el2_r = *(TYPE *)(vm + H(i)); \ + TYPE el2_i = *(TYPE *)(vm + H(i + sizeof(TYPE))); \ + acc_r = SUB_OP(acc_r, el2_i); \ + acc_i = ADD_OP(acc_i, el2_r); \ + *(TYPE *)(vd + H(i)) = acc_r; \ + *(TYPE *)(vd + H(i + sizeof(TYPE))) = acc_i; \ + } \ + } \ +} + +DO_CADD(sve2_cadd_b, int8_t, H1, DO_ADD, DO_SUB) +DO_CADD(sve2_cadd_h, int16_t, H1_2, DO_ADD, DO_SUB) +DO_CADD(sve2_cadd_s, int32_t, H1_4, DO_ADD, DO_SUB) +DO_CADD(sve2_cadd_d, int64_t, , DO_ADD, DO_SUB) + +DO_CADD(sve2_sqcadd_b, int8_t, H1, DO_SQADD_B, DO_SQSUB_B) +DO_CADD(sve2_sqcadd_h, int16_t, H1_2, DO_SQADD_H, DO_SQSUB_H) +DO_CADD(sve2_sqcadd_s, int32_t, H1_4, DO_SQADD_S, DO_SQSUB_S) +DO_CADD(sve2_sqcadd_d, int64_t, , do_sqadd_d, do_sqsub_d) + +#undef DO_CADD + #define DO_ZZI_SHLL(NAME, TYPEW, TYPEN, HW, HN) \ void HELPER(NAME)(void *vd, void *vn, uint32_t desc) \ { \ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index c2be092370..95a81eb101 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -6295,3 +6295,34 @@ static bool trans_BGRP(DisasContext *s, arg_rrr_esz *a) } return do_sve2_zzw_ool(s, a, fns[a->esz], 0); } + +static bool do_cadd(DisasContext *s, arg_rrr_esz *a, bool sq, bool rot) +{ + static gen_helper_gvec_3 * const fns[2][4] = { + { gen_helper_sve2_cadd_b, gen_helper_sve2_cadd_h, + gen_helper_sve2_cadd_s, gen_helper_sve2_cadd_d }, + { gen_helper_sve2_sqcadd_b, gen_helper_sve2_sqcadd_h, + gen_helper_sve2_sqcadd_s, gen_helper_sve2_sqcadd_d }, + }; + return do_sve2_zzw_ool(s, a, fns[sq][a->esz], rot); +} + +static bool trans_CADD_rot90(DisasContext *s, arg_rrr_esz *a) +{ + return do_cadd(s, a, false, false); +} + +static bool trans_CADD_rot270(DisasContext *s, arg_rrr_esz *a) +{ + return do_cadd(s, a, false, true); +} + +static bool trans_SQCADD_rot90(DisasContext *s, arg_rrr_esz *a) +{ + return do_cadd(s, a, true, false); +} + +static bool trans_SQCADD_rot270(DisasContext *s, arg_rrr_esz *a) +{ + return do_cadd(s, a, true, true); +} From patchwork Fri Sep 18 18:36:52 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 273352 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id AD169C43463 for ; Fri, 18 Sep 2020 18:58:06 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 5399721534 for ; Fri, 18 Sep 2020 18:58:06 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="M+U70K9G" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 5399721534 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:49092 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJLa9-0002qI-50 for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 14:58:05 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48054) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLH9-0008OZ-0K for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:27 -0400 Received: from mail-pg1-x544.google.com ([2607:f8b0:4864:20::544]:45297) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLH6-00076w-SS for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:26 -0400 Received: by mail-pg1-x544.google.com with SMTP id 67so3961252pgd.12 for ; Fri, 18 Sep 2020 11:38:24 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=XoAuPYstkxIPHMjsuTl0OhwsFdxzci4ZF6atqOZrAS0=; b=M+U70K9GGYgXGutulYLqbnyxz6r419vFY8pkw6ycvFsRQPr0siGdAgDKLbhlIdojnL q6HXRTL1WBDsihUNwycinJ7pcsF3S9tp0zGK4vzenvzN5xiRU/Dz8ekC5Y5MpyEd/xt/ FyfVwzoZL5VJZWXS8xcBfu3Qwc+pGS17pdesGD0Xu4adbO4Kjsz9d1TEOKXuV39MxWiz U6NRkCu14rgHkePhN2XltxHToeuswrTc4AIO9zXSDylgD+mrDBHZYDIZpC4FxU6p+Res ukSgqgVAhAsWJWrQXEiE15L45YxnJ8DuJRn9YrD6vgan135uywkXxPaNGlYVp6vsJB1w DWSw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=XoAuPYstkxIPHMjsuTl0OhwsFdxzci4ZF6atqOZrAS0=; b=XsUUo/Gx7tN5P2CdH6T0V4q74GwsUp8AI/ivIIRAf90EdVpb3hDvwrIv5P88Mv2eGT MGA0fFy2F7yS0JYvGhCBcjTLtwRvnD81HB1XW/s8QixKn+p3rtRSRyBsaNupJ07qK3Xi K0wijRFuorwdWeR/9ieLhgA7n5RdABF+Q1M5iTpLE66nFrV16yO5IlEuLgypVQqEL//U KFOIkRzS/92p5LMUIO0mAfgkMMD//N8xzahiUvg5xhYxIrq7oD7bF5n8U+8jbrPXfcnz I/LQY0yunzMPQy5qKbgYJ6RbwfjTqbZPVLfCV/Yn1FBRG/ZUpejm3HMQui352gwcSOLB NxTA== X-Gm-Message-State: AOAM531XCNmuT9tGPdllRl9ppjAKAxeQ73xai9tUyJkA55svuJGv+wCK iPFCt8pHgrY7RzbUMjIvdqfzGSAikaGgbg== X-Google-Smtp-Source: ABdhPJy28XsVQeLigDRc0OPqpJswlWcUqc+NMHqenSJFEY+0Va5xoTLFxy3masR+KtSJVnWUrsLSXA== X-Received: by 2002:a63:c5b:: with SMTP id 27mr13066794pgm.80.1600454302881; Fri, 18 Sep 2020 11:38:22 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.38.21 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:38:22 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 22/81] target/arm: Implement SVE2 integer absolute difference and accumulate long Date: Fri, 18 Sep 2020 11:36:52 -0700 Message-Id: <20200918183751.2787647-23-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::544; envelope-from=richard.henderson@linaro.org; helo=mail-pg1-x544.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- v2: Fix select offsetting and argument order (laurent desnogues). --- target/arm/helper-sve.h | 14 ++++++++++ target/arm/sve.decode | 12 +++++++++ target/arm/sve_helper.c | 23 ++++++++++++++++ target/arm/translate-sve.c | 55 ++++++++++++++++++++++++++++++++++++++ 4 files changed, 104 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index b399fb2576..d5dfd4edea 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2402,3 +2402,17 @@ DEF_HELPER_FLAGS_4(sve2_sqcadd_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve2_sqcadd_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve2_sqcadd_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve2_sqcadd_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_sabal_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sabal_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sabal_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_uabal_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uabal_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_uabal_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 655cb5c12f..6cf09847a0 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -70,6 +70,7 @@ &rpr_s rd pg rn s &rprr_s rd pg rn rm s &rprr_esz rd pg rn rm esz +&rrrr_esz rd ra rn rm esz &rprrr_esz rd pg rn rm ra esz &rpri_esz rd pg rn imm esz &ptrue rd esz pat s @@ -119,6 +120,10 @@ @rdn_i8s ........ esz:2 ...... ... imm:s8 rd:5 \ &rri_esz rn=%reg_movprfx +# Four operand, vector element size +@rda_rn_rm ........ esz:2 . rm:5 ... ... rn:5 rd:5 \ + &rrrr_esz ra=%reg_movprfx + # Three operand with "memory" size, aka immediate left shift @rd_rn_msz_rm ........ ... rm:5 .... imm:2 rn:5 rd:5 &rrri @@ -1235,3 +1240,10 @@ CADD_rot90 01000101 .. 00000 0 11011 0 ..... ..... @rdn_rm CADD_rot270 01000101 .. 00000 0 11011 1 ..... ..... @rdn_rm SQCADD_rot90 01000101 .. 00000 1 11011 0 ..... ..... @rdn_rm SQCADD_rot270 01000101 .. 00000 1 11011 1 ..... ..... @rdn_rm + +## SVE2 integer absolute difference and accumulate long + +SABALB 01000101 .. 0 ..... 1100 00 ..... ..... @rda_rn_rm +SABALT 01000101 .. 0 ..... 1100 01 ..... ..... @rda_rn_rm +UABALB 01000101 .. 0 ..... 1100 10 ..... ..... @rda_rn_rm +UABALT 01000101 .. 0 ..... 1100 11 ..... ..... @rda_rn_rm diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index b8541168bf..cc8450c44e 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1241,6 +1241,29 @@ DO_ZZZ_NTB(sve2_eoril_d, uint64_t, , DO_EOR) #undef DO_ZZZ_NTB +#define DO_ZZZW_ACC(NAME, TYPEW, TYPEN, HW, HN, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, void *va, uint32_t desc) \ +{ \ + intptr_t i, opr_sz = simd_oprsz(desc); \ + intptr_t sel1 = simd_data(desc) * sizeof(TYPEN); \ + for (i = 0; i < opr_sz; i += sizeof(TYPEW)) { \ + TYPEW nn = *(TYPEN *)(vn + HN(i + sel1)); \ + TYPEW mm = *(TYPEN *)(vm + HN(i + sel1)); \ + TYPEW aa = *(TYPEW *)(va + HW(i)); \ + *(TYPEW *)(vd + HW(i)) = OP(nn, mm) + aa; \ + } \ +} + +DO_ZZZW_ACC(sve2_sabal_h, int16_t, int8_t, H1_2, H1, DO_ABD) +DO_ZZZW_ACC(sve2_sabal_s, int32_t, int16_t, H1_4, H1_2, DO_ABD) +DO_ZZZW_ACC(sve2_sabal_d, int64_t, int32_t, , H1_4, DO_ABD) + +DO_ZZZW_ACC(sve2_uabal_h, uint16_t, uint8_t, H1_2, H1, DO_ABD) +DO_ZZZW_ACC(sve2_uabal_s, uint32_t, uint16_t, H1_4, H1_2, DO_ABD) +DO_ZZZW_ACC(sve2_uabal_d, uint64_t, uint32_t, , H1_4, DO_ABD) + +#undef DO_ZZZW_ACC + #define DO_BITPERM(NAME, TYPE, OP) \ void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ { \ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 95a81eb101..7b3720e8ef 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -163,6 +163,18 @@ static void gen_gvec_ool_zzz(DisasContext *s, gen_helper_gvec_3 *fn, vsz, vsz, data, fn); } +/* Invoke an out-of-line helper on 4 Zregs. */ +static void gen_gvec_ool_zzzz(DisasContext *s, gen_helper_gvec_4 *fn, + int rd, int rn, int rm, int ra, int data) +{ + unsigned vsz = vec_full_reg_size(s); + tcg_gen_gvec_4_ool(vec_full_reg_offset(s, rd), + vec_full_reg_offset(s, rn), + vec_full_reg_offset(s, rm), + vec_full_reg_offset(s, ra), + vsz, vsz, data, fn); +} + /* Invoke an out-of-line helper on 2 Zregs and a predicate. */ static void gen_gvec_ool_zzp(DisasContext *s, gen_helper_gvec_3 *fn, int rd, int rn, int pg, int data) @@ -6326,3 +6338,46 @@ static bool trans_SQCADD_rot270(DisasContext *s, arg_rrr_esz *a) { return do_cadd(s, a, true, true); } + +static bool do_sve2_zzzz_ool(DisasContext *s, arg_rrrr_esz *a, + gen_helper_gvec_4 *fn, int data) +{ + if (fn == NULL || !dc_isar_feature(aa64_sve2, s)) { + return false; + } + if (sve_access_check(s)) { + gen_gvec_ool_zzzz(s, fn, a->rd, a->rn, a->rm, a->ra, data); + } + return true; +} + +static bool do_abal(DisasContext *s, arg_rrrr_esz *a, bool uns, bool sel) +{ + static gen_helper_gvec_4 * const fns[2][4] = { + { NULL, gen_helper_sve2_sabal_h, + gen_helper_sve2_sabal_s, gen_helper_sve2_sabal_d }, + { NULL, gen_helper_sve2_uabal_h, + gen_helper_sve2_uabal_s, gen_helper_sve2_uabal_d }, + }; + return do_sve2_zzzz_ool(s, a, fns[uns][a->esz], sel); +} + +static bool trans_SABALB(DisasContext *s, arg_rrrr_esz *a) +{ + return do_abal(s, a, false, false); +} + +static bool trans_SABALT(DisasContext *s, arg_rrrr_esz *a) +{ + return do_abal(s, a, false, true); +} + +static bool trans_UABALB(DisasContext *s, arg_rrrr_esz *a) +{ + return do_abal(s, a, true, false); +} + +static bool trans_UABALT(DisasContext *s, arg_rrrr_esz *a) +{ + return do_abal(s, a, true, true); +} From patchwork Fri Sep 18 18:36:53 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 273355 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9A503C43463 for ; Fri, 18 Sep 2020 18:54:48 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id DA12721534 for ; Fri, 18 Sep 2020 18:54:47 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="IEqwhKlx" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org DA12721534 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:39014 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJLWw-0006ev-NA for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 14:54:46 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48064) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLH9-0008QS-PW for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:27 -0400 Received: from mail-pg1-x542.google.com ([2607:f8b0:4864:20::542]:36449) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLH7-000775-SN for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:27 -0400 Received: by mail-pg1-x542.google.com with SMTP id f2so3990166pgd.3 for ; Fri, 18 Sep 2020 11:38:25 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=u72TPmFnd2IwCRN+S19KOwe4vK5mpV7WpqoTxQELU5Q=; b=IEqwhKlxq5x1lliXm21ia4fiO/88kj8TinrRqEV5Jn14hLRqFjTNkWrAJVPJ5Q3M73 v4f02WhQBet201hyWpDFUgAz8nVJhHUqFhWxYrACyItmjEM+V4/iwo+gJliAjaGsuKa8 smJTkA8Lb4aXjj+fGQGPvSxr8BhWtH+5kXZEDQch/ZyrpctRbovCYeK0fDo0urFNQtE/ hU2rG3OhWG4LCcEhY7u+rVKJb71gL0QFtCKf8xaFLWRlW9hjR77DZK3mzf38KaKhmufn PmkGgsyDoapa0ghrm+9waoFdoN/dEl7vOInMV9E+rMUAw5clG5JOI4SZbeEEL2ONrNPm 5A6g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=u72TPmFnd2IwCRN+S19KOwe4vK5mpV7WpqoTxQELU5Q=; b=IJefoC/ObRkCMc4yep9moj3iOTGobsgfYx1iauJqLZAbuPTBGF4pXzKNUvKToHdYsw 6BXhSywjT6loJ/Juzvj0gT5gv8ATshACwE42O6PxFtLtqFxTmOgfe8iaY1sZS/SGgFvd Ss7wIMc2Ts7YvhfcRpY5WGTslwAhraBF1NY6DC31MRr/o4SGFSJXNqRWwmq8uWV8bGhA DujMp4/NHu8lb8t7afl+NjleaotOUSDGGwLsQSLADilWxc3N+V5tbfneK3NI4w81K3zF agSH1D+vb41PVbPkaBGeMBhTqg+O1a/anIXGUWK5V/GWZwYPfsPVLLOTQ3d1Mo41wBBm tk+w== X-Gm-Message-State: AOAM532KqE63smhvLwSc8/FNV4ubc0uVvz8BxJ/Mv/4tItT+YP1/Mg+W sT4EKf5HYQob3hv9+SDbD6/3aFOZXyYZBA== X-Google-Smtp-Source: ABdhPJzd9gA0JuyRkIiULeMCcR0YpV+/t3hXuhSAKHtO/SIIL8vuA5V0b+5fAn8etoJcJcAylMgC5w== X-Received: by 2002:a63:a05:: with SMTP id 5mr23759119pgk.140.1600454304180; Fri, 18 Sep 2020 11:38:24 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.38.23 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:38:23 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 23/81] target/arm: Implement SVE2 integer add/subtract long with carry Date: Fri, 18 Sep 2020 11:36:53 -0700 Message-Id: <20200918183751.2787647-24-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::542; envelope-from=richard.henderson@linaro.org; helo=mail-pg1-x542.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- v2: Fix sel indexing and argument order (laurent desnogues). --- target/arm/helper-sve.h | 3 +++ target/arm/sve.decode | 6 ++++++ target/arm/sve_helper.c | 34 ++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 23 +++++++++++++++++++++++ 4 files changed, 66 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index d5dfd4edea..a806973221 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2416,3 +2416,6 @@ DEF_HELPER_FLAGS_5(sve2_uabal_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve2_uabal_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_adcl_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_adcl_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 6cf09847a0..f4f0c2ade6 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1247,3 +1247,9 @@ SABALB 01000101 .. 0 ..... 1100 00 ..... ..... @rda_rn_rm SABALT 01000101 .. 0 ..... 1100 01 ..... ..... @rda_rn_rm UABALB 01000101 .. 0 ..... 1100 10 ..... ..... @rda_rn_rm UABALT 01000101 .. 0 ..... 1100 11 ..... ..... @rda_rn_rm + +## SVE2 integer add/subtract long with carry + +# ADC and SBC decoded via size in helper dispatch. +ADCLB 01000101 .. 0 ..... 11010 0 ..... ..... @rda_rn_rm +ADCLT 01000101 .. 0 ..... 11010 1 ..... ..... @rda_rn_rm diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index cc8450c44e..a7ba9e71d5 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1264,6 +1264,40 @@ DO_ZZZW_ACC(sve2_uabal_d, uint64_t, uint32_t, , H1_4, DO_ABD) #undef DO_ZZZW_ACC +void HELPER(sve2_adcl_s)(void *vd, void *vn, void *vm, void *va, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + int sel = H4(extract32(desc, SIMD_DATA_SHIFT, 1)); + uint32_t inv = -extract32(desc, SIMD_DATA_SHIFT + 1, 1); + uint32_t *a = va, *n = vn; + uint64_t *d = vd, *m = vm; + + for (i = 0; i < opr_sz / 8; ++i) { + uint32_t e1 = a[2 * i + H4(0)]; + uint32_t e2 = n[2 * i + sel] ^ inv; + uint64_t c = extract64(m[i], 32, 1); + /* Compute and store the entire 33-bit result at once. */ + d[i] = c + e1 + e2; + } +} + +void HELPER(sve2_adcl_d)(void *vd, void *vn, void *vm, void *va, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + int sel = extract32(desc, SIMD_DATA_SHIFT, 1); + uint64_t inv = -(uint64_t)extract32(desc, SIMD_DATA_SHIFT + 1, 1); + uint64_t *d = vd, *a = va, *n = vn, *m = vm; + + for (i = 0; i < opr_sz / 8; i += 2) { + Int128 e1 = int128_make64(a[i]); + Int128 e2 = int128_make64(n[i + sel] ^ inv); + Int128 c = int128_make64(m[i + 1] & 1); + Int128 r = int128_add(int128_add(e1, e2), c); + d[i + 0] = int128_getlo(r); + d[i + 1] = int128_gethi(r); + } +} + #define DO_BITPERM(NAME, TYPE, OP) \ void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ { \ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 7b3720e8ef..4e23b6cedd 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -6381,3 +6381,26 @@ static bool trans_UABALT(DisasContext *s, arg_rrrr_esz *a) { return do_abal(s, a, true, true); } + +static bool do_adcl(DisasContext *s, arg_rrrr_esz *a, bool sel) +{ + static gen_helper_gvec_4 * const fns[2] = { + gen_helper_sve2_adcl_s, + gen_helper_sve2_adcl_d, + }; + /* + * Note that in this case the ESZ field encodes both size and sign. + * Split out 'subtract' into bit 1 of the data field for the helper. + */ + return do_sve2_zzzz_ool(s, a, fns[a->esz & 1], (a->esz & 2) | sel); +} + +static bool trans_ADCLB(DisasContext *s, arg_rrrr_esz *a) +{ + return do_adcl(s, a, false); +} + +static bool trans_ADCLT(DisasContext *s, arg_rrrr_esz *a) +{ + return do_adcl(s, a, true); +} From patchwork Fri Sep 18 18:36:54 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 273350 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 576AAC43464 for ; Fri, 18 Sep 2020 19:01:22 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id CB87021707 for ; Fri, 18 Sep 2020 19:01:21 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="WwSfg2Ly" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org CB87021707 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:57218 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJLdI-0006gN-R4 for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 15:01:20 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48088) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLHA-0008TI-W1 for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:29 -0400 Received: from mail-pg1-x542.google.com ([2607:f8b0:4864:20::542]:44087) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLH9-00077E-8A for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:28 -0400 Received: by mail-pg1-x542.google.com with SMTP id 7so3962343pgm.11 for ; Fri, 18 Sep 2020 11:38:26 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=wCS+cVDKywQG7aSiFHHNebXlRWHY40+AHdDnJu3EQvg=; b=WwSfg2LyyJTsegKz+1o6Yqfwbq56Gsox3djP3zLOGYWSX+ASFU0EtMO1l28o8MNi5o tTrCXkCzEuAqMvVrgKC9De6gjtFaWDI5rOMvXM1tVEADdY+EcNZEaxkrvWPLqKJWDA4Y l14KCvPgBIxgipMBZ9dSaVRaFTGSlS9Q4DJQ/tfzgaJEXo6/tItadHlBkRGcmAxUfNir cB1PGEjOv2STUpYKFaLyO/3cJ0C7KsSibZDsWFlUdjNmyq+snVSXYNK9IbhucwX14sjB Nfl9uo7g30ztvWaPllzh0cd5/PZ+UZ91hazdMYHkUu+REXI1fchdH82Vy2uQSfT4KQVx FzKg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=wCS+cVDKywQG7aSiFHHNebXlRWHY40+AHdDnJu3EQvg=; b=Dq/XLmEHQG/P3RWeCOujRDS6CVMPZmfD469/YDxA0p0zhQk4OpS3lCV1AwUX19gkem xicy2PS9cMALCOxEX2e+qlNyrzoR3U7k/8W4zEz03X/Ep+Mz6ap4Q8T7hQNYaTaqGSpF DQW1IAB4kiU6kwzT65Ap6XWO+RDIBZUtiCMUVE2S7VN3ICX0yjN9KFPUhEB0+LHGqqfU tF60VGs+vdVct5qAR4z4rLNGlA04hxHTttiniU+JJUYLlLl/bjZ+nLiVqFlw32XNm4O6 sW+hnOIWzwMBShvt3XnRkjurkA2BlZY78Ycsvb0j3wRMv+IvWjdpOpBiYu0r3Y5gTRbL 3Whw== X-Gm-Message-State: AOAM53207eXMrlloWWwA7GJuEvz9q1ALPH3XG++Y95z9d/lPEXIFMoAw IGzwl6iT3XyLyG8jrwZusBmC5R/7bafR+g== X-Google-Smtp-Source: ABdhPJyvNUkb+i4BfPmNxxwqYzALqi2aO8OlVy2BrLa/eimvsqN2roDNM47XVfffn6b7x/YfRUfrMA== X-Received: by 2002:a62:2985:0:b029:142:2501:34d6 with SMTP id p127-20020a6229850000b0290142250134d6mr16931074pfp.47.1600454305403; Fri, 18 Sep 2020 11:38:25 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.38.24 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:38:24 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 24/81] target/arm: Implement SVE2 bitwise shift right and accumulate Date: Fri, 18 Sep 2020 11:36:54 -0700 Message-Id: <20200918183751.2787647-25-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::542; envelope-from=richard.henderson@linaro.org; helo=mail-pg1-x542.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/sve.decode | 8 ++++++++ target/arm/translate-sve.c | 34 ++++++++++++++++++++++++++++++++++ 2 files changed, 42 insertions(+) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index f4f0c2ade6..7783e9f0d3 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1253,3 +1253,11 @@ UABALT 01000101 .. 0 ..... 1100 11 ..... ..... @rda_rn_rm # ADC and SBC decoded via size in helper dispatch. ADCLB 01000101 .. 0 ..... 11010 0 ..... ..... @rda_rn_rm ADCLT 01000101 .. 0 ..... 11010 1 ..... ..... @rda_rn_rm + +## SVE2 bitwise shift right and accumulate + +# TODO: Use @rda and %reg_movprfx here. +SSRA 01000101 .. 0 ..... 1110 00 ..... ..... @rd_rn_tszimm_shr +USRA 01000101 .. 0 ..... 1110 01 ..... ..... @rd_rn_tszimm_shr +SRSRA 01000101 .. 0 ..... 1110 10 ..... ..... @rd_rn_tszimm_shr +URSRA 01000101 .. 0 ..... 1110 11 ..... ..... @rd_rn_tszimm_shr diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 4e23b6cedd..312a47f4b9 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -6404,3 +6404,37 @@ static bool trans_ADCLT(DisasContext *s, arg_rrrr_esz *a) { return do_adcl(s, a, true); } + +static bool do_sve2_fn2i(DisasContext *s, arg_rri_esz *a, GVecGen2iFn *fn) +{ + if (a->esz < 0 || !dc_isar_feature(aa64_sve2, s)) { + return false; + } + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + unsigned rd_ofs = vec_full_reg_offset(s, a->rd); + unsigned rn_ofs = vec_full_reg_offset(s, a->rn); + fn(a->esz, rd_ofs, rn_ofs, a->imm, vsz, vsz); + } + return true; +} + +static bool trans_SSRA(DisasContext *s, arg_rri_esz *a) +{ + return do_sve2_fn2i(s, a, gen_gvec_ssra); +} + +static bool trans_USRA(DisasContext *s, arg_rri_esz *a) +{ + return do_sve2_fn2i(s, a, gen_gvec_usra); +} + +static bool trans_SRSRA(DisasContext *s, arg_rri_esz *a) +{ + return do_sve2_fn2i(s, a, gen_gvec_srsra); +} + +static bool trans_URSRA(DisasContext *s, arg_rri_esz *a) +{ + return do_sve2_fn2i(s, a, gen_gvec_ursra); +} From patchwork Fri Sep 18 18:36:55 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 273339 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 86E45C43463 for ; Fri, 18 Sep 2020 19:21:44 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 1057E21D7F for ; Fri, 18 Sep 2020 19:21:44 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="TprxvJD3" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 1057E21D7F Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:52200 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJLx1-0003CU-0e for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 15:21:43 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48114) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLHC-0008W3-Ae for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:30 -0400 Received: from mail-pg1-x543.google.com ([2607:f8b0:4864:20::543]:39382) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLHA-00077N-Cs for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:29 -0400 Received: by mail-pg1-x543.google.com with SMTP id d13so3977049pgl.6 for ; Fri, 18 Sep 2020 11:38:27 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=X/NVWweCrcSG9yvM/pSUZVGJlFvHwlP+Fb2RbDObtgY=; b=TprxvJD3gwIhK8PfALgUCowt1L6Tw4uSGW0DTUdFnB7HrVSLD69JSSrR3NRDoxP8Ss 0Mny4GKovjiqwYUUNB19WJOP713zjAnrmdlYZTRdzkH/YR5L/Nl2VaUxk0cZ39WBlSKj TkiS8tC/QTnbdWhtekNtJmnHZItHlDDbG3PfU4o07N9d/F1anScwNi2p3yy4CAfx2Dee ZJI7CfvBUKFyBupwSCIzfdoFJQ2Dh9WUg01dYCJiQBa0WHmPIHjG3LCHwGTbzpwqP5eZ fpxKZdq+fKzRJXZVy/T6UD2oieWIWSq64eI8O1j6MgGYRQLp37FdOyPRygogeBNmQTNd SKNQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=X/NVWweCrcSG9yvM/pSUZVGJlFvHwlP+Fb2RbDObtgY=; b=cSz0VzFMo3lzoLbnjZQF0Jow48nlo7S0hdL1bDemKTlRoS5u+VkidKz1zEYqIGYdS8 zwGDbrFzcgKamGh8HcuDAsN3VDRSCTHQ5YAlwTmXh49rp2NT+kPoS7lsdgQQje+NDZQz r57tQqMlU+k7tY3cObGb6XUYAZjjctQYgKewtqogfZ29i7neajJg4tO+bJtaZomMaR7o /Ct9A3MUJoCQh4gSyd862iEPkx4yFmdKBWctWN5z14OU6YbDOo2PbA++cOJN9OGm/+FP QCK9yycgtFp3j7TvUIyqtjnoVJIrcSCH22mz9mVXEgXoj6eYRw2OB3wyci4/NugETG1J CtXQ== X-Gm-Message-State: AOAM531LaN1uy1Qow5YBUm7mQPLIjwctkq/KlI6vWA7LUYFo1/ihhUuG nf5Mq6B+z6RI8DCGs7zsxsGvRwgmDcrHjQ== X-Google-Smtp-Source: ABdhPJw6P5O0rHWYgcJoQRQ5GiHirvO3q2h6zlGlbOBb6Zvu9xRyuptoMmH/WM11NsvVhIp4na1aMg== X-Received: by 2002:a62:5a04:0:b029:142:2501:397f with SMTP id o4-20020a625a040000b02901422501397fmr17132534pfb.68.1600454306696; Fri, 18 Sep 2020 11:38:26 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.38.25 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:38:25 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 25/81] target/arm: Implement SVE2 bitwise shift and insert Date: Fri, 18 Sep 2020 11:36:55 -0700 Message-Id: <20200918183751.2787647-26-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::543; envelope-from=richard.henderson@linaro.org; helo=mail-pg1-x543.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/sve.decode | 5 +++++ target/arm/translate-sve.c | 10 ++++++++++ 2 files changed, 15 insertions(+) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 7783e9f0d3..90a9d6552a 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1261,3 +1261,8 @@ SSRA 01000101 .. 0 ..... 1110 00 ..... ..... @rd_rn_tszimm_shr USRA 01000101 .. 0 ..... 1110 01 ..... ..... @rd_rn_tszimm_shr SRSRA 01000101 .. 0 ..... 1110 10 ..... ..... @rd_rn_tszimm_shr URSRA 01000101 .. 0 ..... 1110 11 ..... ..... @rd_rn_tszimm_shr + +## SVE2 bitwise shift and insert + +SRI 01000101 .. 0 ..... 11110 0 ..... ..... @rd_rn_tszimm_shr +SLI 01000101 .. 0 ..... 11110 1 ..... ..... @rd_rn_tszimm_shl diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 312a47f4b9..33e575ce72 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -6438,3 +6438,13 @@ static bool trans_URSRA(DisasContext *s, arg_rri_esz *a) { return do_sve2_fn2i(s, a, gen_gvec_ursra); } + +static bool trans_SRI(DisasContext *s, arg_rri_esz *a) +{ + return do_sve2_fn2i(s, a, gen_gvec_sri); +} + +static bool trans_SLI(DisasContext *s, arg_rri_esz *a) +{ + return do_sve2_fn2i(s, a, gen_gvec_sli); +} From patchwork Fri Sep 18 18:36:56 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 305011 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id DF8D8C4346E for ; Fri, 18 Sep 2020 19:23:45 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 755D3221EC for ; Fri, 18 Sep 2020 19:23:45 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="euXPpleq" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 755D3221EC Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:60412 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJLyy-0006Z3-Fy for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 15:23:44 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48126) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLHD-00006t-Aa for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:31 -0400 Received: from mail-pf1-x442.google.com ([2607:f8b0:4864:20::442]:46700) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLHB-00077j-MK for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:31 -0400 Received: by mail-pf1-x442.google.com with SMTP id b124so3977875pfg.13 for ; Fri, 18 Sep 2020 11:38:29 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=FnLvIpl6tkY/ABtv3jqDiaV7bxS8ugwGLaJksYt9Yyk=; b=euXPpleqtCkeomfnGxAyEtMI+D77EYq12Sh9xonMCtfGYa3pEU3pBNuBVk9Hn3m7to w+EY7K3KkmBR9bDmFi8a3vk33rN5VvGjMJsZx6WULm6mhliQw+htt1VSTIz9rbNatsKc 1fu+xKSaXZsbhMx1nxopWgvH87JHsnDekVJgiupVFuqVepxA7F2jTyP7Dq2XI7EQd7wU xXKlW6tV5czNKe4n4pRGace/+9YbbxwLFJH+2jQ2ZnpfOUZtn7EUDqIaIu4eadfUp08D vKjy/TJlwLlnnljoiuq83aYKksZz2jWnSf7AQK7eZ4C3S3d26WwZYAfLWauOMckwzkmd jOhw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=FnLvIpl6tkY/ABtv3jqDiaV7bxS8ugwGLaJksYt9Yyk=; b=FPZyBiIjKhOzA7y1FSOTOqGDWeUBpkG0z0t6r/VcQPLYMOiQgQHBBtiEe47sKBx6hj DvuYW7hj92PA02t5B6c1zISxEkzgVoFYbnimi2P5BF4sL8n2od+VaHcw5h7y2+7sUYZB QBPpLItqafRM7L60antsOnzcIxXW7gAXPZVL1Ug1AHDRAj11ib17heDmax9UDhHfp9hf HsmrAacluVKpyBjFRrT4npdazjA4JFu+pdnDUvYG1hgVc7uUyL86JyfeyLIXSvCFd+t4 ZUydEHRLorf4g3Qb7weqiQdFSgvrTx+U9ZwlayFiiTU7afvSkIRPwJSdsN2imGnvj//U aCmQ== X-Gm-Message-State: AOAM533LSmDsCQ8/odiZ/s3cwtUIR5IWqbN7me0FdV5rScGkPMA/606p Z/jbKuIPpu/mJEzeEVTFvpOXpAeKP1bQ9w== X-Google-Smtp-Source: ABdhPJxNRtKPQnlVMARrsKDz04Flt+HAhQYWDbu70R48HtijhUc8JV7aS3RjcpFyz5fBuQ0NvrwM2g== X-Received: by 2002:a62:1dc1:0:b029:13e:d13d:a051 with SMTP id d184-20020a621dc10000b029013ed13da051mr32287944pfd.23.1600454307941; Fri, 18 Sep 2020 11:38:27 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.38.26 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:38:27 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 26/81] target/arm: Implement SVE2 integer absolute difference and accumulate Date: Fri, 18 Sep 2020 11:36:56 -0700 Message-Id: <20200918183751.2787647-27-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::442; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x442.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/sve.decode | 6 ++++++ target/arm/translate-sve.c | 21 +++++++++++++++++++++ 2 files changed, 27 insertions(+) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 90a9d6552a..b5450b1d4d 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1266,3 +1266,9 @@ URSRA 01000101 .. 0 ..... 1110 11 ..... ..... @rd_rn_tszimm_shr SRI 01000101 .. 0 ..... 11110 0 ..... ..... @rd_rn_tszimm_shr SLI 01000101 .. 0 ..... 11110 1 ..... ..... @rd_rn_tszimm_shl + +## SVE2 integer absolute difference and accumulate + +# TODO: Use @rda and %reg_movprfx here. +SABA 01000101 .. 0 ..... 11111 0 ..... ..... @rd_rn_rm +UABA 01000101 .. 0 ..... 11111 1 ..... ..... @rd_rn_rm diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 33e575ce72..e21f078af7 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -6448,3 +6448,24 @@ static bool trans_SLI(DisasContext *s, arg_rri_esz *a) { return do_sve2_fn2i(s, a, gen_gvec_sli); } + +static bool do_sve2_fn_zzz(DisasContext *s, arg_rrr_esz *a, GVecGen3Fn *fn) +{ + if (!dc_isar_feature(aa64_sve2, s)) { + return false; + } + if (sve_access_check(s)) { + gen_gvec_fn_zzz(s, fn, a->esz, a->rd, a->rn, a->rm); + } + return true; +} + +static bool trans_SABA(DisasContext *s, arg_rrr_esz *a) +{ + return do_sve2_fn_zzz(s, a, gen_gvec_saba); +} + +static bool trans_UABA(DisasContext *s, arg_rrr_esz *a) +{ + return do_sve2_fn_zzz(s, a, gen_gvec_uaba); +} From patchwork Fri Sep 18 18:36:57 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 273349 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5CB7DC43463 for ; Fri, 18 Sep 2020 19:04:40 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id DEFF221707 for ; Fri, 18 Sep 2020 19:04:39 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="KBp+1/Zi" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org DEFF221707 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:37220 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJLgV-0001mu-07 for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 15:04:39 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48148) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLHF-0000CI-Fg for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:33 -0400 Received: from mail-pf1-x442.google.com ([2607:f8b0:4864:20::442]:46049) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLHD-00077v-1X for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:33 -0400 Received: by mail-pf1-x442.google.com with SMTP id k15so3975618pfc.12 for ; Fri, 18 Sep 2020 11:38:30 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=LZBWCy8WDtg5umhTrHp/xYXTK+j0h9Jqr90U+hWaBwU=; b=KBp+1/Zi6Ngo/Y0nUPT/cxCNbKTGkZ+2u9SAzHu+9EJai5QJwCMMhKR7kLSSSrcvPH 0DKBU0pCyO9nO19YX4pPHxwWyAE+kRMaDNYUo7ZMSRS3K8Wg4u+YRK4DD0ALmofau/0O oiQgoGYJ3Yqpc0yFS9oMeCJrICkFNTVxAt2BXry3jbY8x9LsFLbWy6jZusoYXIzNeK73 k2JF9PpS9fZ+NptS17zvdTJRTZmBCFnfKIN2edtId5Ty/x4bYyfImUoyKhdEIDxTQ8no ifw1P9YNyn00CQerJOOv+svr+9vydP7MI33H4twq/0mHlHLDYQ68dRN7COeArATD4tWc IKrA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=LZBWCy8WDtg5umhTrHp/xYXTK+j0h9Jqr90U+hWaBwU=; b=f3YCmSb1BXvz5W9ZtVEe7Vll4WuQrDEqn9LE0cU4LLvYvpzTHytIM6F819RKL1QbtL cITv387EAXWV0EbAgDyqvMcMR1bSqRpK4hhTVJO0jbPeNnDvvbG5lSAPu0u7Jw/qcz57 kzmC8w5zvvDk+d7njWPj9MEPpduFjbp4aEbHGhHRSWB+eMioFlcOBT0aJGdq9Zqi8Jlq PfWDtqq6zVlqFwAhSKh11IeQ2LBUnntCVNll0RJcEKPkvuEtz08UXlk05mpxRoZMOSP4 KDAIeO+lE88z0ak9sMwmuDZCswpGEa4pc45x3zKuBsuyibBco4A+k+dR1A4+07evzBEA yZ8g== X-Gm-Message-State: AOAM530UPtWL5LanFAFIDIko6JzxpnloTcSqHU+m+irrwBLvcSaF9fvE 8ax3VXQCWmMrr3YKP3sMwOrgopIOrRFIqA== X-Google-Smtp-Source: ABdhPJxRQSuPvy6UCgT7cTglCWAwKiBAMk1tmEm7anZBc3aqeSAzlRI+FwgiAYwdQ2ROCfwY6ov1jg== X-Received: by 2002:aa7:9548:0:b029:13e:d13d:a08d with SMTP id w8-20020aa795480000b029013ed13da08dmr32863381pfq.36.1600454309035; Fri, 18 Sep 2020 11:38:29 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.38.28 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:38:28 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 27/81] target/arm: Implement SVE2 saturating extract narrow Date: Fri, 18 Sep 2020 11:36:57 -0700 Message-Id: <20200918183751.2787647-28-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::442; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x442.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 24 ++++ target/arm/sve.decode | 12 ++ target/arm/sve_helper.c | 56 +++++++++ target/arm/translate-sve.c | 248 ++++++++++++++++++++++++++++++++++++- 4 files changed, 335 insertions(+), 5 deletions(-) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index a806973221..8abc06f358 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2419,3 +2419,27 @@ DEF_HELPER_FLAGS_5(sve2_uabal_d, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(sve2_adcl_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve2_adcl_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(sve2_sqxtnb_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_sqxtnb_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_sqxtnb_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(sve2_uqxtnb_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_uqxtnb_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_uqxtnb_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(sve2_sqxtunb_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_sqxtunb_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_sqxtunb_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(sve2_sqxtnt_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_sqxtnt_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_sqxtnt_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(sve2_uqxtnt_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_uqxtnt_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_uqxtnt_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(sve2_sqxtunt_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_sqxtunt_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_sqxtunt_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index b5450b1d4d..733f9a3db4 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1272,3 +1272,15 @@ SLI 01000101 .. 0 ..... 11110 1 ..... ..... @rd_rn_tszimm_shl # TODO: Use @rda and %reg_movprfx here. SABA 01000101 .. 0 ..... 11111 0 ..... ..... @rd_rn_rm UABA 01000101 .. 0 ..... 11111 1 ..... ..... @rd_rn_rm + +#### SVE2 Narrowing + +## SVE2 saturating extract narrow + +# Bits 23, 18-16 are zero, limited in the translator via esz < 3 & imm == 0. +SQXTNB 01000101 .. 1 ..... 010 000 ..... ..... @rd_rn_tszimm_shl +SQXTNT 01000101 .. 1 ..... 010 001 ..... ..... @rd_rn_tszimm_shl +UQXTNB 01000101 .. 1 ..... 010 010 ..... ..... @rd_rn_tszimm_shl +UQXTNT 01000101 .. 1 ..... 010 011 ..... ..... @rd_rn_tszimm_shl +SQXTUNB 01000101 .. 1 ..... 010 100 ..... ..... @rd_rn_tszimm_shl +SQXTUNT 01000101 .. 1 ..... 010 101 ..... ..... @rd_rn_tszimm_shl diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index a7ba9e71d5..07c3e8f02a 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1264,6 +1264,62 @@ DO_ZZZW_ACC(sve2_uabal_d, uint64_t, uint32_t, , H1_4, DO_ABD) #undef DO_ZZZW_ACC +#define DO_XTNB(NAME, TYPE, OP) \ +void HELPER(NAME)(void *vd, void *vn, uint32_t desc) \ +{ \ + intptr_t i, opr_sz = simd_oprsz(desc); \ + for (i = 0; i < opr_sz; i += sizeof(TYPE)) { \ + TYPE nn = *(TYPE *)(vn + i); \ + nn = OP(nn) & MAKE_64BIT_MASK(0, sizeof(TYPE) * 4); \ + *(TYPE *)(vd + i) = nn; \ + } \ +} + +#define DO_XTNT(NAME, TYPE, TYPEN, H, OP) \ +void HELPER(NAME)(void *vd, void *vn, uint32_t desc) \ +{ \ + intptr_t i, opr_sz = simd_oprsz(desc), odd = H(sizeof(TYPEN)); \ + for (i = 0; i < opr_sz; i += sizeof(TYPE)) { \ + TYPE nn = *(TYPE *)(vn + i); \ + *(TYPEN *)(vd + i + odd) = OP(nn); \ + } \ +} + +#define DO_SQXTN_H(n) do_sat_bhs(n, INT8_MIN, INT8_MAX) +#define DO_SQXTN_S(n) do_sat_bhs(n, INT16_MIN, INT16_MAX) +#define DO_SQXTN_D(n) do_sat_bhs(n, INT32_MIN, INT32_MAX) + +DO_XTNB(sve2_sqxtnb_h, int16_t, DO_SQXTN_H) +DO_XTNB(sve2_sqxtnb_s, int32_t, DO_SQXTN_S) +DO_XTNB(sve2_sqxtnb_d, int64_t, DO_SQXTN_D) + +DO_XTNT(sve2_sqxtnt_h, int16_t, int8_t, H1, DO_SQXTN_H) +DO_XTNT(sve2_sqxtnt_s, int32_t, int16_t, H1_2, DO_SQXTN_S) +DO_XTNT(sve2_sqxtnt_d, int64_t, int32_t, H1_4, DO_SQXTN_D) + +#define DO_UQXTN_H(n) do_sat_bhs(n, 0, UINT8_MAX) +#define DO_UQXTN_S(n) do_sat_bhs(n, 0, UINT16_MAX) +#define DO_UQXTN_D(n) do_sat_bhs(n, 0, UINT32_MAX) + +DO_XTNB(sve2_uqxtnb_h, uint16_t, DO_UQXTN_H) +DO_XTNB(sve2_uqxtnb_s, uint32_t, DO_UQXTN_S) +DO_XTNB(sve2_uqxtnb_d, uint64_t, DO_UQXTN_D) + +DO_XTNT(sve2_uqxtnt_h, uint16_t, uint8_t, H1, DO_UQXTN_H) +DO_XTNT(sve2_uqxtnt_s, uint32_t, uint16_t, H1_2, DO_UQXTN_S) +DO_XTNT(sve2_uqxtnt_d, uint64_t, uint32_t, H1_4, DO_UQXTN_D) + +DO_XTNB(sve2_sqxtunb_h, int16_t, DO_UQXTN_H) +DO_XTNB(sve2_sqxtunb_s, int32_t, DO_UQXTN_S) +DO_XTNB(sve2_sqxtunb_d, int64_t, DO_UQXTN_D) + +DO_XTNT(sve2_sqxtunt_h, int16_t, int8_t, H1, DO_UQXTN_H) +DO_XTNT(sve2_sqxtunt_s, int32_t, int16_t, H1_2, DO_UQXTN_S) +DO_XTNT(sve2_sqxtunt_d, int64_t, int32_t, H1_4, DO_UQXTN_D) + +#undef DO_XTNB +#undef DO_XTNT + void HELPER(sve2_adcl_s)(void *vd, void *vn, void *vm, void *va, uint32_t desc) { intptr_t i, opr_sz = simd_oprsz(desc); diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index e21f078af7..6b29d2580b 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -6203,22 +6203,22 @@ static void gen_ushll_vec(unsigned vece, TCGv_vec d, TCGv_vec n, int64_t imm) static bool do_sve2_shll_tb(DisasContext *s, arg_rri_esz *a, bool sel, bool uns) { - static const TCGOpcode sshll_list[] = { + static const TCGOpcode sshll_list[] = { INDEX_op_shli_vec, INDEX_op_sari_vec, 0 }; - static const TCGOpcode ushll_list[] = { + static const TCGOpcode ushll_list[] = { INDEX_op_shli_vec, INDEX_op_shri_vec, 0 }; static const GVecGen2i ops[2][3] = { - { { .fniv = gen_sshll_vec, + { { .fniv = gen_sshll_vec, .opt_opc = sshll_list, .fno = gen_helper_sve2_sshll_h, .vece = MO_16 }, - { .fniv = gen_sshll_vec, + { .fniv = gen_sshll_vec, .opt_opc = sshll_list, .fno = gen_helper_sve2_sshll_s, .vece = MO_32 }, - { .fniv = gen_sshll_vec, + { .fniv = gen_sshll_vec, .opt_opc = sshll_list, .fno = gen_helper_sve2_sshll_d, .vece = MO_64 } }, @@ -6469,3 +6469,241 @@ static bool trans_UABA(DisasContext *s, arg_rrr_esz *a) { return do_sve2_fn_zzz(s, a, gen_gvec_uaba); } + +static bool do_sve2_narrow_extract(DisasContext *s, arg_rri_esz *a, + const GVecGen2 ops[3]) +{ + if (a->esz < 0 || a->esz > MO_32 || a->imm != 0 || + !dc_isar_feature(aa64_sve2, s)) { + return false; + } + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + tcg_gen_gvec_2(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vsz, vsz, &ops[a->esz]); + } + return true; +} + +static const TCGOpcode sqxtn_list[] = { + INDEX_op_shli_vec, INDEX_op_smin_vec, INDEX_op_smax_vec, 0 +}; + +static void gen_sqxtnb_vec(unsigned vece, TCGv_vec d, TCGv_vec n) +{ + TCGv_vec t = tcg_temp_new_vec_matching(d); + int halfbits = 4 << vece; + int64_t mask = (1ull << halfbits) - 1; + int64_t min = -1ull << (halfbits - 1); + int64_t max = -min - 1; + + tcg_gen_dupi_vec(vece, t, min); + tcg_gen_smax_vec(vece, d, n, t); + tcg_gen_dupi_vec(vece, t, max); + tcg_gen_smin_vec(vece, d, d, t); + tcg_gen_dupi_vec(vece, t, mask); + tcg_gen_and_vec(vece, d, d, t); + tcg_temp_free_vec(t); +} + +static bool trans_SQXTNB(DisasContext *s, arg_rri_esz *a) +{ + static const GVecGen2 ops[3] = { + { .fniv = gen_sqxtnb_vec, + .opt_opc = sqxtn_list, + .fno = gen_helper_sve2_sqxtnb_h, + .vece = MO_16 }, + { .fniv = gen_sqxtnb_vec, + .opt_opc = sqxtn_list, + .fno = gen_helper_sve2_sqxtnb_s, + .vece = MO_32 }, + { .fniv = gen_sqxtnb_vec, + .opt_opc = sqxtn_list, + .fno = gen_helper_sve2_sqxtnb_d, + .vece = MO_64 }, + }; + return do_sve2_narrow_extract(s, a, ops); +} + +static void gen_sqxtnt_vec(unsigned vece, TCGv_vec d, TCGv_vec n) +{ + TCGv_vec t = tcg_temp_new_vec_matching(d); + int halfbits = 4 << vece; + int64_t mask = (1ull << halfbits) - 1; + int64_t min = -1ull << (halfbits - 1); + int64_t max = -min - 1; + + tcg_gen_dupi_vec(vece, t, min); + tcg_gen_smax_vec(vece, n, n, t); + tcg_gen_dupi_vec(vece, t, max); + tcg_gen_smin_vec(vece, n, n, t); + tcg_gen_shli_vec(vece, n, n, halfbits); + tcg_gen_dupi_vec(vece, t, mask); + tcg_gen_bitsel_vec(vece, d, t, d, n); + tcg_temp_free_vec(t); +} + +static bool trans_SQXTNT(DisasContext *s, arg_rri_esz *a) +{ + static const GVecGen2 ops[3] = { + { .fniv = gen_sqxtnt_vec, + .opt_opc = sqxtn_list, + .load_dest = true, + .fno = gen_helper_sve2_sqxtnt_h, + .vece = MO_16 }, + { .fniv = gen_sqxtnt_vec, + .opt_opc = sqxtn_list, + .load_dest = true, + .fno = gen_helper_sve2_sqxtnt_s, + .vece = MO_32 }, + { .fniv = gen_sqxtnt_vec, + .opt_opc = sqxtn_list, + .load_dest = true, + .fno = gen_helper_sve2_sqxtnt_d, + .vece = MO_64 }, + }; + return do_sve2_narrow_extract(s, a, ops); +} + +static const TCGOpcode uqxtn_list[] = { + INDEX_op_shli_vec, INDEX_op_umin_vec, 0 +}; + +static void gen_uqxtnb_vec(unsigned vece, TCGv_vec d, TCGv_vec n) +{ + TCGv_vec t = tcg_temp_new_vec_matching(d); + int halfbits = 4 << vece; + int64_t max = (1ull << halfbits) - 1; + + tcg_gen_dupi_vec(vece, t, max); + tcg_gen_umin_vec(vece, d, n, t); + tcg_temp_free_vec(t); +} + +static bool trans_UQXTNB(DisasContext *s, arg_rri_esz *a) +{ + static const GVecGen2 ops[3] = { + { .fniv = gen_uqxtnb_vec, + .opt_opc = uqxtn_list, + .fno = gen_helper_sve2_uqxtnb_h, + .vece = MO_16 }, + { .fniv = gen_uqxtnb_vec, + .opt_opc = uqxtn_list, + .fno = gen_helper_sve2_uqxtnb_s, + .vece = MO_32 }, + { .fniv = gen_uqxtnb_vec, + .opt_opc = uqxtn_list, + .fno = gen_helper_sve2_uqxtnb_d, + .vece = MO_64 }, + }; + return do_sve2_narrow_extract(s, a, ops); +} + +static void gen_uqxtnt_vec(unsigned vece, TCGv_vec d, TCGv_vec n) +{ + TCGv_vec t = tcg_temp_new_vec_matching(d); + int halfbits = 4 << vece; + int64_t max = (1ull << halfbits) - 1; + + tcg_gen_dupi_vec(vece, t, max); + tcg_gen_umin_vec(vece, n, n, t); + tcg_gen_shli_vec(vece, n, n, halfbits); + tcg_gen_bitsel_vec(vece, d, t, d, n); + tcg_temp_free_vec(t); +} + +static bool trans_UQXTNT(DisasContext *s, arg_rri_esz *a) +{ + static const GVecGen2 ops[3] = { + { .fniv = gen_uqxtnt_vec, + .opt_opc = uqxtn_list, + .load_dest = true, + .fno = gen_helper_sve2_uqxtnt_h, + .vece = MO_16 }, + { .fniv = gen_uqxtnt_vec, + .opt_opc = uqxtn_list, + .load_dest = true, + .fno = gen_helper_sve2_uqxtnt_s, + .vece = MO_32 }, + { .fniv = gen_uqxtnt_vec, + .opt_opc = uqxtn_list, + .load_dest = true, + .fno = gen_helper_sve2_uqxtnt_d, + .vece = MO_64 }, + }; + return do_sve2_narrow_extract(s, a, ops); +} + +static const TCGOpcode sqxtun_list[] = { + INDEX_op_shli_vec, INDEX_op_umin_vec, INDEX_op_smax_vec, 0 +}; + +static void gen_sqxtunb_vec(unsigned vece, TCGv_vec d, TCGv_vec n) +{ + TCGv_vec t = tcg_temp_new_vec_matching(d); + int halfbits = 4 << vece; + int64_t max = (1ull << halfbits) - 1; + + tcg_gen_dupi_vec(vece, t, 0); + tcg_gen_smax_vec(vece, d, n, t); + tcg_gen_dupi_vec(vece, t, max); + tcg_gen_umin_vec(vece, d, d, t); + tcg_temp_free_vec(t); +} + +static bool trans_SQXTUNB(DisasContext *s, arg_rri_esz *a) +{ + static const GVecGen2 ops[3] = { + { .fniv = gen_sqxtunb_vec, + .opt_opc = sqxtun_list, + .fno = gen_helper_sve2_sqxtunb_h, + .vece = MO_16 }, + { .fniv = gen_sqxtunb_vec, + .opt_opc = sqxtun_list, + .fno = gen_helper_sve2_sqxtunb_s, + .vece = MO_32 }, + { .fniv = gen_sqxtunb_vec, + .opt_opc = sqxtun_list, + .fno = gen_helper_sve2_sqxtunb_d, + .vece = MO_64 }, + }; + return do_sve2_narrow_extract(s, a, ops); +} + +static void gen_sqxtunt_vec(unsigned vece, TCGv_vec d, TCGv_vec n) +{ + TCGv_vec t = tcg_temp_new_vec_matching(d); + int halfbits = 4 << vece; + int64_t max = (1ull << halfbits) - 1; + + tcg_gen_dupi_vec(vece, t, 0); + tcg_gen_smax_vec(vece, n, n, t); + tcg_gen_dupi_vec(vece, t, max); + tcg_gen_umin_vec(vece, n, n, t); + tcg_gen_shli_vec(vece, n, n, halfbits); + tcg_gen_bitsel_vec(vece, d, t, d, n); + tcg_temp_free_vec(t); +} + +static bool trans_SQXTUNT(DisasContext *s, arg_rri_esz *a) +{ + static const GVecGen2 ops[3] = { + { .fniv = gen_sqxtunt_vec, + .opt_opc = sqxtun_list, + .load_dest = true, + .fno = gen_helper_sve2_sqxtunt_h, + .vece = MO_16 }, + { .fniv = gen_sqxtunt_vec, + .opt_opc = sqxtun_list, + .load_dest = true, + .fno = gen_helper_sve2_sqxtunt_s, + .vece = MO_32 }, + { .fniv = gen_sqxtunt_vec, + .opt_opc = sqxtun_list, + .load_dest = true, + .fno = gen_helper_sve2_sqxtunt_d, + .vece = MO_64 }, + }; + return do_sve2_narrow_extract(s, a, ops); +} From patchwork Fri Sep 18 18:36:58 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 273347 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 76773C43463 for ; Fri, 18 Sep 2020 19:08:10 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 13E6821707 for ; Fri, 18 Sep 2020 19:08:10 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="Nh341JK8" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 13E6821707 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:45606 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJLjt-0005Nf-1j for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 15:08:09 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48150) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLHG-0000D7-74 for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:34 -0400 Received: from mail-pl1-x643.google.com ([2607:f8b0:4864:20::643]:40078) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLHE-000785-75 for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:33 -0400 Received: by mail-pl1-x643.google.com with SMTP id bd2so3427159plb.7 for ; Fri, 18 Sep 2020 11:38:31 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=Rt9orcFA9u3sMdrguxOW16QVC5Bid4e2eA0tJ15Q92Q=; b=Nh341JK8uFadN2zAwxJht85W2RIz946VVhvzLTpwQs8zRSWljK01rQxOIHXrYKfKaI ip5tXBGJ37/5KY9rJZsIeD94YUPWWdK41ND/syOFYCpkpIueXItRtK6NGzcBN5AUHqGY fr+1YW1uucZupsmEtT7Jiw8FkquP06adVLL7IQmtqpecLGK9VRRI8ZvgXVZItvg3+ziA aYE4NmBHYxLPfatpGjZvJEREOw9d7S9EHqWdmW9EY3GUOXRMVzfxnyktOudC1XuLBQAp /r9in1ZuNhPa6r56+BKgoeuEisGHA7sjG03AX2rbJH0M6Yl64TWPQEQsaHWaxiQd9vmm 14xw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=Rt9orcFA9u3sMdrguxOW16QVC5Bid4e2eA0tJ15Q92Q=; b=b1IVxYZJxNVut9lUdtZ29103gkYFoCsGlso19OUkzcD/Khpb6ZgaURfyICjawdDtLi 1CZqlHcm9C32QoUU5+XVqNIVnxS6ElSKH3oIV0Z+2QMeHwoU/oSSaK3cX/4RaulAPvna e8LcMHsQVJkMKdqJKnUPhpxmhnrLiRxka00a/cH++RTlsfEAOlSPHM/oEJuyH3L0v70i 537K/JTt0oqYatMjEqmaxpjhiDyTt7NZbfO6I5yADhB1HLwHXrEYUgOet3Ohwrsvi4DE ZkztTSoxM/LolvIKiQGXYLMqrsf6dQO+pyS0Wk2uT7GV0i1qfaG8uqLY57Y9wz0ZLcKP 3sww== X-Gm-Message-State: AOAM532PdbmL6j8WP1lAeClZDzfEfGM1yV2qajQzufIKBHTGC7rCgmm+ J+14oJPFro9ilElfDhnMQ/1KlJ888jF7VA== X-Google-Smtp-Source: ABdhPJw6gBLxQRx3tn3+90KshFEkKcaOS+2EIPurGpoZWGM6Q19biSaNgHkCHC75D4wOjlzneNGcnw== X-Received: by 2002:a17:90a:67cb:: with SMTP id g11mr14181144pjm.56.1600454310363; Fri, 18 Sep 2020 11:38:30 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.38.29 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:38:29 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 28/81] target/arm: Implement SVE2 floating-point pairwise Date: Fri, 18 Sep 2020 11:36:58 -0700 Message-Id: <20200918183751.2787647-29-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::643; envelope-from=richard.henderson@linaro.org; helo=mail-pl1-x643.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org, Stephen Long Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" From: Stephen Long Signed-off-by: Stephen Long Reviewed-by: Richard Henderson Signed-off-by: Richard Henderson --- v2: Load all inputs before writing any output (laurent desnogues) --- target/arm/helper-sve.h | 35 +++++++++++++++++++++++++++++ target/arm/sve.decode | 8 +++++++ target/arm/sve_helper.c | 46 ++++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 25 +++++++++++++++++++++ 4 files changed, 114 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 8abc06f358..12cded4210 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2443,3 +2443,38 @@ DEF_HELPER_FLAGS_3(sve2_uqxtnt_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve2_sqxtunt_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve2_sqxtunt_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve2_sqxtunt_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve2_faddp_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve2_faddp_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve2_faddp_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve2_fmaxnmp_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve2_fmaxnmp_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve2_fmaxnmp_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve2_fminnmp_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve2_fminnmp_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve2_fminnmp_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve2_fmaxp_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve2_fmaxp_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve2_fmaxp_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(sve2_fminp_zpzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve2_fminp_zpzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve2_fminp_zpzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 733f9a3db4..54657d996a 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1284,3 +1284,11 @@ UQXTNB 01000101 .. 1 ..... 010 010 ..... ..... @rd_rn_tszimm_shl UQXTNT 01000101 .. 1 ..... 010 011 ..... ..... @rd_rn_tszimm_shl SQXTUNB 01000101 .. 1 ..... 010 100 ..... ..... @rd_rn_tszimm_shl SQXTUNT 01000101 .. 1 ..... 010 101 ..... ..... @rd_rn_tszimm_shl + +## SVE2 floating-point pairwise operations + +FADDP 01100100 .. 010 00 0 100 ... ..... ..... @rdn_pg_rm +FMAXNMP 01100100 .. 010 10 0 100 ... ..... ..... @rdn_pg_rm +FMINNMP 01100100 .. 010 10 1 100 ... ..... ..... @rdn_pg_rm +FMAXP 01100100 .. 010 11 0 100 ... ..... ..... @rdn_pg_rm +FMINP 01100100 .. 010 11 1 100 ... ..... ..... @rdn_pg_rm diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 07c3e8f02a..dd8e122765 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -890,6 +890,52 @@ DO_ZPZZ_PAIR_D(sve2_sminp_zpzz_d, int64_t, DO_MIN) #undef DO_ZPZZ_PAIR #undef DO_ZPZZ_PAIR_D +#define DO_ZPZZ_PAIR_FP(NAME, TYPE, H, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, \ + void *status, uint32_t desc) \ +{ \ + intptr_t i, opr_sz = simd_oprsz(desc); \ + for (i = 0; i < opr_sz; ) { \ + uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \ + do { \ + TYPE n0 = *(TYPE *)(vn + H(i)); \ + TYPE m0 = *(TYPE *)(vm + H(i)); \ + TYPE n1 = *(TYPE *)(vn + H(i + sizeof(TYPE))); \ + TYPE m1 = *(TYPE *)(vm + H(i + sizeof(TYPE))); \ + if (pg & 1) { \ + *(TYPE *)(vd + H(i)) = OP(n0, n1, status); \ + } \ + i += sizeof(TYPE), pg >>= sizeof(TYPE); \ + if (pg & 1) { \ + *(TYPE *)(vd + H(i)) = OP(m0, m1, status); \ + } \ + i += sizeof(TYPE), pg >>= sizeof(TYPE); \ + } while (i & 15); \ + } \ +} + +DO_ZPZZ_PAIR_FP(sve2_faddp_zpzz_h, float16, H1_2, float16_add) +DO_ZPZZ_PAIR_FP(sve2_faddp_zpzz_s, float32, H1_4, float32_add) +DO_ZPZZ_PAIR_FP(sve2_faddp_zpzz_d, float64, , float64_add) + +DO_ZPZZ_PAIR_FP(sve2_fmaxnmp_zpzz_h, float16, H1_2, float16_maxnum) +DO_ZPZZ_PAIR_FP(sve2_fmaxnmp_zpzz_s, float32, H1_4, float32_maxnum) +DO_ZPZZ_PAIR_FP(sve2_fmaxnmp_zpzz_d, float64, , float64_maxnum) + +DO_ZPZZ_PAIR_FP(sve2_fminnmp_zpzz_h, float16, H1_2, float16_minnum) +DO_ZPZZ_PAIR_FP(sve2_fminnmp_zpzz_s, float32, H1_4, float32_minnum) +DO_ZPZZ_PAIR_FP(sve2_fminnmp_zpzz_d, float64, , float64_minnum) + +DO_ZPZZ_PAIR_FP(sve2_fmaxp_zpzz_h, float16, H1_2, float16_max) +DO_ZPZZ_PAIR_FP(sve2_fmaxp_zpzz_s, float32, H1_4, float32_max) +DO_ZPZZ_PAIR_FP(sve2_fmaxp_zpzz_d, float64, , float64_max) + +DO_ZPZZ_PAIR_FP(sve2_fminp_zpzz_h, float16, H1_2, float16_min) +DO_ZPZZ_PAIR_FP(sve2_fminp_zpzz_s, float32, H1_4, float32_min) +DO_ZPZZ_PAIR_FP(sve2_fminp_zpzz_d, float64, , float64_min) + +#undef DO_ZPZZ_PAIR_FP + /* Three-operand expander, controlled by a predicate, in which the * third operand is "wide". That is, for D = N op M, the same 64-bit * value of M is used with all of the narrower values of N. diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 6b29d2580b..c7f9c7b561 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -6707,3 +6707,28 @@ static bool trans_SQXTUNT(DisasContext *s, arg_rri_esz *a) }; return do_sve2_narrow_extract(s, a, ops); } + +static bool do_sve2_zpzz_fp(DisasContext *s, arg_rprr_esz *a, + gen_helper_gvec_4_ptr *fn) +{ + if (!dc_isar_feature(aa64_sve2, s)) { + return false; + } + return do_zpzz_fp(s, a, fn); +} + +#define DO_SVE2_ZPZZ_FP(NAME, name) \ +static bool trans_##NAME(DisasContext *s, arg_rprr_esz *a) \ +{ \ + static gen_helper_gvec_4_ptr * const fns[4] = { \ + NULL, gen_helper_sve2_##name##_zpzz_h, \ + gen_helper_sve2_##name##_zpzz_s, gen_helper_sve2_##name##_zpzz_d \ + }; \ + return do_sve2_zpzz_fp(s, a, fns[a->esz]); \ +} + +DO_SVE2_ZPZZ_FP(FADDP, faddp) +DO_SVE2_ZPZZ_FP(FMAXNMP, fmaxnmp) +DO_SVE2_ZPZZ_FP(FMINNMP, fminnmp) +DO_SVE2_ZPZZ_FP(FMAXP, fmaxp) +DO_SVE2_ZPZZ_FP(FMINP, fminp) From patchwork Fri Sep 18 18:36:59 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 273348 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A32EBC43463 for ; Fri, 18 Sep 2020 19:05:30 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 36B1E20684 for ; Fri, 18 Sep 2020 19:05:30 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="zFV0jLcT" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 36B1E20684 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:38744 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJLhJ-0002Q6-Be for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 15:05:29 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48184) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLHI-0000IU-QI for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:36 -0400 Received: from mail-pj1-x1043.google.com ([2607:f8b0:4864:20::1043]:54398) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLHG-00078q-JL for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:36 -0400 Received: by mail-pj1-x1043.google.com with SMTP id mm21so3620179pjb.4 for ; Fri, 18 Sep 2020 11:38:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=1q+ADORq6W2uNcIYuW7L5MpwGg95XHzefx6CftIdm8g=; b=zFV0jLcTMah2meV3AyQE41+fs5yiVxb6FZxgowChl/BiUCzw36EkH2aSEl8e3d/zsC fgER4XWk9W28uWisYlYxrtr/hpGyfMRbLTOSRZXEipcxwmOh4GfNIPP2CXCTpCHuJ2qY FbNZ6d+1ucY/UjehlQb0oqblpFozwx5P57xIyPSBVuy7ct2M/W5/wOS1EfViUsGQLj6c e5hO/ZXja7KXjqk8T4Oj1617DLPGvXxgNM4OMHenTDxtqKTlkZxoBTLJNzRDy2ZdGs/m mJSYkz+gmZf3pclB4i8AQ5sSwKAA4/yqFreF2fcJcQx4fE3207Siz7W0PD9ovftlXtsv xmow== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=1q+ADORq6W2uNcIYuW7L5MpwGg95XHzefx6CftIdm8g=; b=BMD2EZ51Ye4ELbycqmFRa+MAXPYxRDNUKvnrZgE5Lk4gpbxp9h3gCGReR2HoL5Wnlg vnf1+gKzBO7MdhEAUtEfxuwA1UVBmGypAk5HCUl9Gtw5tJbTzD4SqOlXPAZ971fIob1Q Fn+Vz4e+zFfyz7V8AycI5kJeKq3lK4e1Ufn4n18BJA00f3ongy5ZC+77KE8X7GDXk9qs AhgHoQ9JhOxhUXUS9+LgZKenPydWYfxP7u1jCa24lYaWUi1nq4rNmFVGTA7E+UhZHu6Q jMPd8sMy3xbsG2gxFWhZGUqv3T7w/nLtA3Kp/wsDhusDPUbn/ZzKWHxuC0Ntjj3AAddB gbhw== X-Gm-Message-State: AOAM531BPh4jpnRMQoqSgW8WxnjfsT2ipVr3x8Pg3jAWUdsFjHtMfqBu AS40HyFq1c+e2jWhsQz4K2tgivQtNkm4Ew== X-Google-Smtp-Source: ABdhPJzcPprtfGrT8XTyCgOQpbYVMvvdzNZM1tvtNh4M8sTKtKCjRLbWgIbWSpnBKt+/BYHcjTRXpg== X-Received: by 2002:a17:90a:c288:: with SMTP id f8mr14172896pjt.123.1600454311662; Fri, 18 Sep 2020 11:38:31 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.38.30 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:38:30 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 29/81] target/arm: Implement SVE2 SHRN, RSHRN Date: Fri, 18 Sep 2020 11:36:59 -0700 Message-Id: <20200918183751.2787647-30-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::1043; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x1043.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- v2: Fix typo in gen_shrnb_vec (laurent desnogues) v3: Replace DO_RSHR with an inline function --- target/arm/helper-sve.h | 16 ++++ target/arm/sve.decode | 8 ++ target/arm/sve_helper.c | 54 ++++++++++++- target/arm/translate-sve.c | 160 +++++++++++++++++++++++++++++++++++++ 4 files changed, 236 insertions(+), 2 deletions(-) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 12cded4210..f4bf3f9a40 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2444,6 +2444,22 @@ DEF_HELPER_FLAGS_3(sve2_sqxtunt_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve2_sqxtunt_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve2_sqxtunt_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_shrnb_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_shrnb_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_shrnb_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(sve2_shrnt_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_shrnt_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_shrnt_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(sve2_rshrnb_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_rshrnb_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_rshrnb_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(sve2_rshrnt_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_rshrnt_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_rshrnt_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + DEF_HELPER_FLAGS_6(sve2_faddp_zpzz_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_6(sve2_faddp_zpzz_s, TCG_CALL_NO_RWG, diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 54657d996a..7cc4b6cc43 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1285,6 +1285,14 @@ UQXTNT 01000101 .. 1 ..... 010 011 ..... ..... @rd_rn_tszimm_shl SQXTUNB 01000101 .. 1 ..... 010 100 ..... ..... @rd_rn_tszimm_shl SQXTUNT 01000101 .. 1 ..... 010 101 ..... ..... @rd_rn_tszimm_shl +## SVE2 bitwise shift right narrow + +# Bit 23 == 0 is handled by esz > 0 in the translator. +SHRNB 01000101 .. 1 ..... 00 0100 ..... ..... @rd_rn_tszimm_shr +SHRNT 01000101 .. 1 ..... 00 0101 ..... ..... @rd_rn_tszimm_shr +RSHRNB 01000101 .. 1 ..... 00 0110 ..... ..... @rd_rn_tszimm_shr +RSHRNT 01000101 .. 1 ..... 00 0111 ..... ..... @rd_rn_tszimm_shr + ## SVE2 floating-point pairwise operations FADDP 01100100 .. 010 00 0 100 ... ..... ..... @rdn_pg_rm diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index dd8e122765..27907ed1a5 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1862,6 +1862,17 @@ void HELPER(NAME)(void *vd, void *vn, void *vg, uint32_t desc) \ when N is negative, add 2**M-1. */ #define DO_ASRD(N, M) ((N + (N < 0 ? ((__typeof(N))1 << M) - 1 : 0)) >> M) +static inline uint64_t do_urshr(uint64_t x, unsigned sh) +{ + if (likely(sh < 64)) { + return (x >> sh) + ((x >> (sh - 1)) & 1); + } else if (sh == 64) { + return x >> 63; + } else { + return 0; + } +} + DO_ZPZI(sve_asr_zpzi_b, int8_t, H1, DO_SHR) DO_ZPZI(sve_asr_zpzi_h, int16_t, H1_2, DO_SHR) DO_ZPZI(sve_asr_zpzi_s, int32_t, H1_4, DO_SHR) @@ -1882,12 +1893,51 @@ DO_ZPZI(sve_asrd_h, int16_t, H1_2, DO_ASRD) DO_ZPZI(sve_asrd_s, int32_t, H1_4, DO_ASRD) DO_ZPZI_D(sve_asrd_d, int64_t, DO_ASRD) -#undef DO_SHR -#undef DO_SHL #undef DO_ASRD #undef DO_ZPZI #undef DO_ZPZI_D +#define DO_SHRNB(NAME, TYPEW, TYPEN, OP) \ +void HELPER(NAME)(void *vd, void *vn, uint32_t desc) \ +{ \ + intptr_t i, opr_sz = simd_oprsz(desc); \ + int shift = simd_data(desc); \ + for (i = 0; i < opr_sz; i += sizeof(TYPEW)) { \ + TYPEW nn = *(TYPEW *)(vn + i); \ + *(TYPEW *)(vd + i) = (TYPEN)OP(nn, shift); \ + } \ +} + +#define DO_SHRNT(NAME, TYPEW, TYPEN, HW, HN, OP) \ +void HELPER(NAME)(void *vd, void *vn, uint32_t desc) \ +{ \ + intptr_t i, opr_sz = simd_oprsz(desc); \ + int shift = simd_data(desc); \ + for (i = 0; i < opr_sz; i += sizeof(TYPEW)) { \ + TYPEW nn = *(TYPEW *)(vn + HW(i)); \ + *(TYPEN *)(vd + HN(i + sizeof(TYPEN))) = OP(nn, shift); \ + } \ +} + +DO_SHRNB(sve2_shrnb_h, uint16_t, uint8_t, DO_SHR) +DO_SHRNB(sve2_shrnb_s, uint32_t, uint16_t, DO_SHR) +DO_SHRNB(sve2_shrnb_d, uint64_t, uint32_t, DO_SHR) + +DO_SHRNT(sve2_shrnt_h, uint16_t, uint8_t, H1_2, H1, DO_SHR) +DO_SHRNT(sve2_shrnt_s, uint32_t, uint16_t, H1_4, H1_2, DO_SHR) +DO_SHRNT(sve2_shrnt_d, uint64_t, uint32_t, , H1_4, DO_SHR) + +DO_SHRNB(sve2_rshrnb_h, uint16_t, uint8_t, do_urshr) +DO_SHRNB(sve2_rshrnb_s, uint32_t, uint16_t, do_urshr) +DO_SHRNB(sve2_rshrnb_d, uint64_t, uint32_t, do_urshr) + +DO_SHRNT(sve2_rshrnt_h, uint16_t, uint8_t, H1_2, H1, do_urshr) +DO_SHRNT(sve2_rshrnt_s, uint32_t, uint16_t, H1_4, H1_2, do_urshr) +DO_SHRNT(sve2_rshrnt_d, uint64_t, uint32_t, , H1_4, do_urshr) + +#undef DO_SHRNB +#undef DO_SHRNT + /* Fully general four-operand expander, controlled by a predicate. */ #define DO_ZPZZZ(NAME, TYPE, H, OP) \ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index c7f9c7b561..ad31066296 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -6708,6 +6708,166 @@ static bool trans_SQXTUNT(DisasContext *s, arg_rri_esz *a) return do_sve2_narrow_extract(s, a, ops); } +static bool do_sve2_shr_narrow(DisasContext *s, arg_rri_esz *a, + const GVecGen2i ops[3]) +{ + if (a->esz < 0 || a->esz > MO_32 || !dc_isar_feature(aa64_sve2, s)) { + return false; + } + assert(a->imm > 0 && a->imm <= (8 << a->esz)); + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + tcg_gen_gvec_2i(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vsz, vsz, a->imm, &ops[a->esz]); + } + return true; +} + +static void gen_shrnb_i64(unsigned vece, TCGv_i64 d, TCGv_i64 n, int shr) +{ + int halfbits = 4 << vece; + uint64_t mask = dup_const(vece, MAKE_64BIT_MASK(0, halfbits)); + + tcg_gen_shri_i64(d, n, shr); + tcg_gen_andi_i64(d, d, mask); +} + +static void gen_shrnb16_i64(TCGv_i64 d, TCGv_i64 n, int64_t shr) +{ + gen_shrnb_i64(MO_16, d, n, shr); +} + +static void gen_shrnb32_i64(TCGv_i64 d, TCGv_i64 n, int64_t shr) +{ + gen_shrnb_i64(MO_32, d, n, shr); +} + +static void gen_shrnb64_i64(TCGv_i64 d, TCGv_i64 n, int64_t shr) +{ + gen_shrnb_i64(MO_64, d, n, shr); +} + +static void gen_shrnb_vec(unsigned vece, TCGv_vec d, TCGv_vec n, int64_t shr) +{ + TCGv_vec t = tcg_temp_new_vec_matching(d); + int halfbits = 4 << vece; + uint64_t mask = MAKE_64BIT_MASK(0, halfbits); + + tcg_gen_shri_vec(vece, n, n, shr); + tcg_gen_dupi_vec(vece, t, mask); + tcg_gen_and_vec(vece, d, n, t); + tcg_temp_free_vec(t); +} + +static bool trans_SHRNB(DisasContext *s, arg_rri_esz *a) +{ + static const TCGOpcode vec_list[] = { INDEX_op_shri_vec, 0 }; + static const GVecGen2i ops[3] = { + { .fni8 = gen_shrnb16_i64, + .fniv = gen_shrnb_vec, + .opt_opc = vec_list, + .fno = gen_helper_sve2_shrnb_h, + .vece = MO_16 }, + { .fni8 = gen_shrnb32_i64, + .fniv = gen_shrnb_vec, + .opt_opc = vec_list, + .fno = gen_helper_sve2_shrnb_s, + .vece = MO_32 }, + { .fni8 = gen_shrnb64_i64, + .fniv = gen_shrnb_vec, + .opt_opc = vec_list, + .fno = gen_helper_sve2_shrnb_d, + .vece = MO_64 }, + }; + return do_sve2_shr_narrow(s, a, ops); +} + +static void gen_shrnt_i64(unsigned vece, TCGv_i64 d, TCGv_i64 n, int shr) +{ + int halfbits = 4 << vece; + uint64_t mask = dup_const(vece, MAKE_64BIT_MASK(0, halfbits)); + + tcg_gen_shli_i64(n, n, halfbits - shr); + tcg_gen_andi_i64(n, n, ~mask); + tcg_gen_andi_i64(d, d, mask); + tcg_gen_or_i64(d, d, n); +} + +static void gen_shrnt16_i64(TCGv_i64 d, TCGv_i64 n, int64_t shr) +{ + gen_shrnt_i64(MO_16, d, n, shr); +} + +static void gen_shrnt32_i64(TCGv_i64 d, TCGv_i64 n, int64_t shr) +{ + gen_shrnt_i64(MO_32, d, n, shr); +} + +static void gen_shrnt64_i64(TCGv_i64 d, TCGv_i64 n, int64_t shr) +{ + tcg_gen_shri_i64(n, n, shr); + tcg_gen_deposit_i64(d, d, n, 32, 32); +} + +static void gen_shrnt_vec(unsigned vece, TCGv_vec d, TCGv_vec n, int64_t shr) +{ + TCGv_vec t = tcg_temp_new_vec_matching(d); + int halfbits = 4 << vece; + uint64_t mask = MAKE_64BIT_MASK(0, halfbits); + + tcg_gen_shli_vec(vece, n, n, halfbits - shr); + tcg_gen_dupi_vec(vece, t, mask); + tcg_gen_bitsel_vec(vece, d, t, d, n); + tcg_temp_free_vec(t); +} + +static bool trans_SHRNT(DisasContext *s, arg_rri_esz *a) +{ + static const TCGOpcode vec_list[] = { INDEX_op_shli_vec, 0 }; + static const GVecGen2i ops[3] = { + { .fni8 = gen_shrnt16_i64, + .fniv = gen_shrnt_vec, + .opt_opc = vec_list, + .load_dest = true, + .fno = gen_helper_sve2_shrnt_h, + .vece = MO_16 }, + { .fni8 = gen_shrnt32_i64, + .fniv = gen_shrnt_vec, + .opt_opc = vec_list, + .load_dest = true, + .fno = gen_helper_sve2_shrnt_s, + .vece = MO_32 }, + { .fni8 = gen_shrnt64_i64, + .fniv = gen_shrnt_vec, + .opt_opc = vec_list, + .load_dest = true, + .fno = gen_helper_sve2_shrnt_d, + .vece = MO_64 }, + }; + return do_sve2_shr_narrow(s, a, ops); +} + +static bool trans_RSHRNB(DisasContext *s, arg_rri_esz *a) +{ + static const GVecGen2i ops[3] = { + { .fno = gen_helper_sve2_rshrnb_h }, + { .fno = gen_helper_sve2_rshrnb_s }, + { .fno = gen_helper_sve2_rshrnb_d }, + }; + return do_sve2_shr_narrow(s, a, ops); +} + +static bool trans_RSHRNT(DisasContext *s, arg_rri_esz *a) +{ + static const GVecGen2i ops[3] = { + { .fno = gen_helper_sve2_rshrnt_h }, + { .fno = gen_helper_sve2_rshrnt_s }, + { .fno = gen_helper_sve2_rshrnt_d }, + }; + return do_sve2_shr_narrow(s, a, ops); +} + static bool do_sve2_zpzz_fp(DisasContext *s, arg_rprr_esz *a, gen_helper_gvec_4_ptr *fn) { From patchwork Fri Sep 18 18:37:00 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 273335 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 35D4CC43464 for ; Fri, 18 Sep 2020 19:26:57 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id C069522208 for ; Fri, 18 Sep 2020 19:26:56 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="rZuYZr4P" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org C069522208 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:40440 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJM23-0001a5-VI for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 15:26:55 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48200) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLHK-0000LM-0e for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:38 -0400 Received: from mail-pg1-x542.google.com ([2607:f8b0:4864:20::542]:36451) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLHG-000793-On for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:37 -0400 Received: by mail-pg1-x542.google.com with SMTP id f2so3990380pgd.3 for ; Fri, 18 Sep 2020 11:38:34 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=sw3fn0l1rQhTFIrGVphb8kkMpbmZh0rgVNt+StOMsvU=; b=rZuYZr4PtyjCIVZnT5OT69edQLdxT/ZuDmKUvVuLOGM+IjeJatHK1Y2FuBhOryXG2O a6n5fbqarRCtusN5AXthxZpXG89+abgHYaf3kKKWVPJRZGcG/WmeMiep8r2XSPJmW1XF XpZqETfgmNFPS1v/k5U5oWAWPOE8lurINxTvwb96kIl8AHxXximmhn2bCCEtXNeDvX+m AB49Ht4+x5tHjSqAzSMElS1PbQPr5Ruw0B2vSFxx0y7iT0r+iAiUblmbJDKIWm9cngoi idU4fk3tX/Vn/+SeaUyjWitqZ7abtxDp2+GkGGtPU/XvpNiMV3FLxHoPno6qXtR4q9Si yK9g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=sw3fn0l1rQhTFIrGVphb8kkMpbmZh0rgVNt+StOMsvU=; b=Nui3Ot9i3/t1VNmc7yYggOqtOJsD8zAa0EROF6+YcqaivP8kwVnAoPVRH/8Rq2F/+U Zd68NKUvztVw3DXTFo3NQL0WDID5Icf5P2IoobguNiFaKhJiNkvHckKv+SV414sJg/fB CkgJ5so8YWyQk+7H0mv2ohkAnPFDQUYVaAS8UkTcF1BfW4wELvErxpyPM2Xk5w1LOmvX cKzcXUZGi4IrLrgMMAw0wr07D9oO4MI+jAkXGevgybn/ILSrt534srGuWFbhMHn2Y8zQ qCF0jhujDWhpfShZPk3cMESyBDco+EBq/Wi6tIkxSOVD4I1zThzPgaqZDVkocNTPgoho 1hMg== X-Gm-Message-State: AOAM532tkfSnP9AIMNAm2nr58T+U3uCWdLTrx6wYswOhdkEpozkBD4HB XrMvgFBpFN3mnVAMlktpSAZo6z6M0XIY8Q== X-Google-Smtp-Source: ABdhPJyrBEKXnUhxc2JJOKfaP5y1AX21Gb+BkZdVybR/ovZP0jBTR3vCSe9dfcR4eS4jnzfAXxjnSA== X-Received: by 2002:aa7:99c2:0:b029:142:440b:fa28 with SMTP id v2-20020aa799c20000b0290142440bfa28mr14540611pfi.30.1600454313028; Fri, 18 Sep 2020 11:38:33 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.38.32 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:38:32 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 30/81] target/arm: Implement SVE2 SQSHRUN, SQRSHRUN Date: Fri, 18 Sep 2020 11:37:00 -0700 Message-Id: <20200918183751.2787647-31-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::542; envelope-from=richard.henderson@linaro.org; helo=mail-pg1-x542.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 16 +++++++ target/arm/sve.decode | 4 ++ target/arm/sve_helper.c | 35 ++++++++++++++ target/arm/translate-sve.c | 98 ++++++++++++++++++++++++++++++++++++++ 4 files changed, 153 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index f4bf3f9a40..8ca99b5c3c 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2460,6 +2460,22 @@ DEF_HELPER_FLAGS_3(sve2_rshrnt_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve2_rshrnt_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve2_rshrnt_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_sqshrunb_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_sqshrunb_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_sqshrunb_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(sve2_sqshrunt_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_sqshrunt_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_sqshrunt_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(sve2_sqrshrunb_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_sqrshrunb_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_sqrshrunb_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(sve2_sqrshrunt_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_sqrshrunt_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_sqrshrunt_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + DEF_HELPER_FLAGS_6(sve2_faddp_zpzz_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_6(sve2_faddp_zpzz_s, TCG_CALL_NO_RWG, diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 7cc4b6cc43..cade628cfd 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1288,6 +1288,10 @@ SQXTUNT 01000101 .. 1 ..... 010 101 ..... ..... @rd_rn_tszimm_shl ## SVE2 bitwise shift right narrow # Bit 23 == 0 is handled by esz > 0 in the translator. +SQSHRUNB 01000101 .. 1 ..... 00 0000 ..... ..... @rd_rn_tszimm_shr +SQSHRUNT 01000101 .. 1 ..... 00 0001 ..... ..... @rd_rn_tszimm_shr +SQRSHRUNB 01000101 .. 1 ..... 00 0010 ..... ..... @rd_rn_tszimm_shr +SQRSHRUNT 01000101 .. 1 ..... 00 0011 ..... ..... @rd_rn_tszimm_shr SHRNB 01000101 .. 1 ..... 00 0100 ..... ..... @rd_rn_tszimm_shr SHRNT 01000101 .. 1 ..... 00 0101 ..... ..... @rd_rn_tszimm_shr RSHRNB 01000101 .. 1 ..... 00 0110 ..... ..... @rd_rn_tszimm_shr diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 27907ed1a5..df324f0498 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1873,6 +1873,16 @@ static inline uint64_t do_urshr(uint64_t x, unsigned sh) } } +static inline int64_t do_srshr(int64_t x, unsigned sh) +{ + if (likely(sh < 64)) { + return (x >> sh) + ((x >> (sh - 1)) & 1); + } else { + /* Rounding the sign bit always produces 0. */ + return 0; + } +} + DO_ZPZI(sve_asr_zpzi_b, int8_t, H1, DO_SHR) DO_ZPZI(sve_asr_zpzi_h, int16_t, H1_2, DO_SHR) DO_ZPZI(sve_asr_zpzi_s, int32_t, H1_4, DO_SHR) @@ -1935,6 +1945,31 @@ DO_SHRNT(sve2_rshrnt_h, uint16_t, uint8_t, H1_2, H1, do_urshr) DO_SHRNT(sve2_rshrnt_s, uint32_t, uint16_t, H1_4, H1_2, do_urshr) DO_SHRNT(sve2_rshrnt_d, uint64_t, uint32_t, , H1_4, do_urshr) +#define DO_SQSHRUN_H(x, sh) do_sat_bhs((int64_t)(x) >> sh, 0, UINT8_MAX) +#define DO_SQSHRUN_S(x, sh) do_sat_bhs((int64_t)(x) >> sh, 0, UINT16_MAX) +#define DO_SQSHRUN_D(x, sh) \ + do_sat_bhs((int64_t)(x) >> (sh < 64 ? sh : 63), 0, UINT32_MAX) + +DO_SHRNB(sve2_sqshrunb_h, int16_t, uint8_t, DO_SQSHRUN_H) +DO_SHRNB(sve2_sqshrunb_s, int32_t, uint16_t, DO_SQSHRUN_S) +DO_SHRNB(sve2_sqshrunb_d, int64_t, uint32_t, DO_SQSHRUN_D) + +DO_SHRNT(sve2_sqshrunt_h, int16_t, uint8_t, H1_2, H1, DO_SQSHRUN_H) +DO_SHRNT(sve2_sqshrunt_s, int32_t, uint16_t, H1_4, H1_2, DO_SQSHRUN_S) +DO_SHRNT(sve2_sqshrunt_d, int64_t, uint32_t, , H1_4, DO_SQSHRUN_D) + +#define DO_SQRSHRUN_H(x, sh) do_sat_bhs(do_srshr(x, sh), 0, UINT8_MAX) +#define DO_SQRSHRUN_S(x, sh) do_sat_bhs(do_srshr(x, sh), 0, UINT16_MAX) +#define DO_SQRSHRUN_D(x, sh) do_sat_bhs(do_srshr(x, sh), 0, UINT32_MAX) + +DO_SHRNB(sve2_sqrshrunb_h, int16_t, uint8_t, DO_SQRSHRUN_H) +DO_SHRNB(sve2_sqrshrunb_s, int32_t, uint16_t, DO_SQRSHRUN_S) +DO_SHRNB(sve2_sqrshrunb_d, int64_t, uint32_t, DO_SQRSHRUN_D) + +DO_SHRNT(sve2_sqrshrunt_h, int16_t, uint8_t, H1_2, H1, DO_SQRSHRUN_H) +DO_SHRNT(sve2_sqrshrunt_s, int32_t, uint16_t, H1_4, H1_2, DO_SQRSHRUN_S) +DO_SHRNT(sve2_sqrshrunt_d, int64_t, uint32_t, , H1_4, DO_SQRSHRUN_D) + #undef DO_SHRNB #undef DO_SHRNT diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index ad31066296..3751544e95 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -6868,6 +6868,104 @@ static bool trans_RSHRNT(DisasContext *s, arg_rri_esz *a) return do_sve2_shr_narrow(s, a, ops); } +static void gen_sqshrunb_vec(unsigned vece, TCGv_vec d, + TCGv_vec n, int64_t shr) +{ + TCGv_vec t = tcg_temp_new_vec_matching(d); + int halfbits = 4 << vece; + + tcg_gen_sari_vec(vece, n, n, shr); + tcg_gen_dupi_vec(vece, t, 0); + tcg_gen_smax_vec(vece, n, n, t); + tcg_gen_dupi_vec(vece, t, MAKE_64BIT_MASK(0, halfbits)); + tcg_gen_umin_vec(vece, d, n, t); + tcg_temp_free_vec(t); +} + +static bool trans_SQSHRUNB(DisasContext *s, arg_rri_esz *a) +{ + static const TCGOpcode vec_list[] = { + INDEX_op_sari_vec, INDEX_op_smax_vec, INDEX_op_umin_vec, 0 + }; + static const GVecGen2i ops[3] = { + { .fniv = gen_sqshrunb_vec, + .opt_opc = vec_list, + .fno = gen_helper_sve2_sqshrunb_h, + .vece = MO_16 }, + { .fniv = gen_sqshrunb_vec, + .opt_opc = vec_list, + .fno = gen_helper_sve2_sqshrunb_s, + .vece = MO_32 }, + { .fniv = gen_sqshrunb_vec, + .opt_opc = vec_list, + .fno = gen_helper_sve2_sqshrunb_d, + .vece = MO_64 }, + }; + return do_sve2_shr_narrow(s, a, ops); +} + +static void gen_sqshrunt_vec(unsigned vece, TCGv_vec d, + TCGv_vec n, int64_t shr) +{ + TCGv_vec t = tcg_temp_new_vec_matching(d); + int halfbits = 4 << vece; + + tcg_gen_sari_vec(vece, n, n, shr); + tcg_gen_dupi_vec(vece, t, 0); + tcg_gen_smax_vec(vece, n, n, t); + tcg_gen_dupi_vec(vece, t, MAKE_64BIT_MASK(0, halfbits)); + tcg_gen_umin_vec(vece, n, n, t); + tcg_gen_shli_vec(vece, n, n, halfbits); + tcg_gen_bitsel_vec(vece, d, t, d, n); + tcg_temp_free_vec(t); +} + +static bool trans_SQSHRUNT(DisasContext *s, arg_rri_esz *a) +{ + static const TCGOpcode vec_list[] = { + INDEX_op_shli_vec, INDEX_op_sari_vec, + INDEX_op_smax_vec, INDEX_op_umin_vec, 0 + }; + static const GVecGen2i ops[3] = { + { .fniv = gen_sqshrunt_vec, + .opt_opc = vec_list, + .load_dest = true, + .fno = gen_helper_sve2_sqshrunt_h, + .vece = MO_16 }, + { .fniv = gen_sqshrunt_vec, + .opt_opc = vec_list, + .load_dest = true, + .fno = gen_helper_sve2_sqshrunt_s, + .vece = MO_32 }, + { .fniv = gen_sqshrunt_vec, + .opt_opc = vec_list, + .load_dest = true, + .fno = gen_helper_sve2_sqshrunt_d, + .vece = MO_64 }, + }; + return do_sve2_shr_narrow(s, a, ops); +} + +static bool trans_SQRSHRUNB(DisasContext *s, arg_rri_esz *a) +{ + static const GVecGen2i ops[3] = { + { .fno = gen_helper_sve2_sqrshrunb_h }, + { .fno = gen_helper_sve2_sqrshrunb_s }, + { .fno = gen_helper_sve2_sqrshrunb_d }, + }; + return do_sve2_shr_narrow(s, a, ops); +} + +static bool trans_SQRSHRUNT(DisasContext *s, arg_rri_esz *a) +{ + static const GVecGen2i ops[3] = { + { .fno = gen_helper_sve2_sqrshrunt_h }, + { .fno = gen_helper_sve2_sqrshrunt_s }, + { .fno = gen_helper_sve2_sqrshrunt_d }, + }; + return do_sve2_shr_narrow(s, a, ops); +} + static bool do_sve2_zpzz_fp(DisasContext *s, arg_rprr_esz *a, gen_helper_gvec_4_ptr *fn) { From patchwork Fri Sep 18 18:37:01 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 305019 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 74A79C43464 for ; Fri, 18 Sep 2020 19:11:39 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 15A1621D7F for ; Fri, 18 Sep 2020 19:11:39 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="atwJNoh8" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 15A1621D7F Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:53882 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJLnG-0000QO-3z for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 15:11:38 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48222) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLHL-0000Ns-5T for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:39 -0400 Received: from mail-pj1-x1035.google.com ([2607:f8b0:4864:20::1035]:36293) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLHI-00079P-3C for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:38 -0400 Received: by mail-pj1-x1035.google.com with SMTP id b17so3464376pji.1 for ; Fri, 18 Sep 2020 11:38:35 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=3MFvp2HcBnlPPA+rMyZkl48h/AKlAPFXRSq6uHnCmAc=; b=atwJNoh8TIlMYM6sPgBGqCYDnjsCOHu2fBMx/aq/XBZHUZ7N4er5TxHw7NE7Fm/c9P d8eabK2aY9lTif7D1hhrhUyvWEXp5/temY9Lg4m8PyY495Z9ji/MS2FR9Uxr+87srthl Syi3GxBR9R/W8xZh+j5iPeFTmw+CKudf8myqcwEYi0TUBRuC7cxfhTRS15tcmQCpLcyd LS6qhc9j4dgkJp/wQld0xZ2RstyvrQLw7l21cXMU5ICWU9KDjj0iLQdZCaZLgyof7Qn3 AA9ljbZMFMTUB88cNlKqyVQznlAqKQECDx5PMeyaZpmfEc3qj9AyezL5tvRxcXjdted3 VRIw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=3MFvp2HcBnlPPA+rMyZkl48h/AKlAPFXRSq6uHnCmAc=; b=aboWiL0xvNQVyCM+IKmKP2PbeM2oiUOiw+fGeLkDb7mLzOIFQU25y2qE7aiNjMWOz0 IJF9F+IeblCSHUESOK3/0fhkaAzcGSiIpB+HosZSBsL2Apf27Job/q+h3zzgySvd1cls ec07zqvwxjq8naIaRwDTvy23321AgfNDQyw8L0JsiJr9FuUdrd1qWKfNmuYw/1pxJf08 dSYrdO+jNck40ix0XD5NyF9sC2d7DZqv2t3p06APz33yQbUbhi/CxcAJyOZifnWP/Wkb EZZRZwNMVKfCyFauW3dAyXfK7pll/d9lR7mCbfifIV7XfjIDBIAPmpGpwbU1iya2gopx jU5g== X-Gm-Message-State: AOAM5324co+aZAMjsnU+uMJ5kwIE60jXxMitl6SRDaifj00IuoH3Hnms TjyRjc2Lx4wKvZKR1+EGUOqKI2HU12sj6A== X-Google-Smtp-Source: ABdhPJz0mEJTpbX3NEuuxmQWJu617HvliIY1p/2avDzn2lx7bb8Ku2xcMS2DVDmvmg2JQE/8Jdy+4A== X-Received: by 2002:a17:90b:1256:: with SMTP id gx22mr14719979pjb.47.1600454314368; Fri, 18 Sep 2020 11:38:34 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.38.33 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:38:33 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 31/81] target/arm: Implement SVE2 UQSHRN, UQRSHRN Date: Fri, 18 Sep 2020 11:37:01 -0700 Message-Id: <20200918183751.2787647-32-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::1035; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x1035.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 16 +++++++ target/arm/sve.decode | 4 ++ target/arm/sve_helper.c | 24 ++++++++++ target/arm/translate-sve.c | 93 ++++++++++++++++++++++++++++++++++++++ 4 files changed, 137 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 8ca99b5c3c..3a13064980 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2476,6 +2476,22 @@ DEF_HELPER_FLAGS_3(sve2_sqrshrunt_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve2_sqrshrunt_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve2_sqrshrunt_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_uqshrnb_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_uqshrnb_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_uqshrnb_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(sve2_uqshrnt_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_uqshrnt_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_uqshrnt_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(sve2_uqrshrnb_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_uqrshrnb_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_uqrshrnb_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(sve2_uqrshrnt_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_uqrshrnt_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_uqrshrnt_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + DEF_HELPER_FLAGS_6(sve2_faddp_zpzz_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_6(sve2_faddp_zpzz_s, TCG_CALL_NO_RWG, diff --git a/target/arm/sve.decode b/target/arm/sve.decode index cade628cfd..69915398e7 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1296,6 +1296,10 @@ SHRNB 01000101 .. 1 ..... 00 0100 ..... ..... @rd_rn_tszimm_shr SHRNT 01000101 .. 1 ..... 00 0101 ..... ..... @rd_rn_tszimm_shr RSHRNB 01000101 .. 1 ..... 00 0110 ..... ..... @rd_rn_tszimm_shr RSHRNT 01000101 .. 1 ..... 00 0111 ..... ..... @rd_rn_tszimm_shr +UQSHRNB 01000101 .. 1 ..... 00 1100 ..... ..... @rd_rn_tszimm_shr +UQSHRNT 01000101 .. 1 ..... 00 1101 ..... ..... @rd_rn_tszimm_shr +UQRSHRNB 01000101 .. 1 ..... 00 1110 ..... ..... @rd_rn_tszimm_shr +UQRSHRNT 01000101 .. 1 ..... 00 1111 ..... ..... @rd_rn_tszimm_shr ## SVE2 floating-point pairwise operations diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index df324f0498..c478a1f68d 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1970,6 +1970,30 @@ DO_SHRNT(sve2_sqrshrunt_h, int16_t, uint8_t, H1_2, H1, DO_SQRSHRUN_H) DO_SHRNT(sve2_sqrshrunt_s, int32_t, uint16_t, H1_4, H1_2, DO_SQRSHRUN_S) DO_SHRNT(sve2_sqrshrunt_d, int64_t, uint32_t, , H1_4, DO_SQRSHRUN_D) +#define DO_UQSHRN_H(x, sh) MIN(x >> sh, UINT8_MAX) +#define DO_UQSHRN_S(x, sh) MIN(x >> sh, UINT16_MAX) +#define DO_UQSHRN_D(x, sh) MIN(x >> sh, UINT32_MAX) + +DO_SHRNB(sve2_uqshrnb_h, uint16_t, uint8_t, DO_UQSHRN_H) +DO_SHRNB(sve2_uqshrnb_s, uint32_t, uint16_t, DO_UQSHRN_S) +DO_SHRNB(sve2_uqshrnb_d, uint64_t, uint32_t, DO_UQSHRN_D) + +DO_SHRNT(sve2_uqshrnt_h, uint16_t, uint8_t, H1_2, H1, DO_UQSHRN_H) +DO_SHRNT(sve2_uqshrnt_s, uint32_t, uint16_t, H1_4, H1_2, DO_UQSHRN_S) +DO_SHRNT(sve2_uqshrnt_d, uint64_t, uint32_t, , H1_4, DO_UQSHRN_D) + +#define DO_UQRSHRN_H(x, sh) MIN(do_urshr(x, sh), UINT8_MAX) +#define DO_UQRSHRN_S(x, sh) MIN(do_urshr(x, sh), UINT16_MAX) +#define DO_UQRSHRN_D(x, sh) MIN(do_urshr(x, sh), UINT32_MAX) + +DO_SHRNB(sve2_uqrshrnb_h, uint16_t, uint8_t, DO_UQRSHRN_H) +DO_SHRNB(sve2_uqrshrnb_s, uint32_t, uint16_t, DO_UQRSHRN_S) +DO_SHRNB(sve2_uqrshrnb_d, uint64_t, uint32_t, DO_UQRSHRN_D) + +DO_SHRNT(sve2_uqrshrnt_h, uint16_t, uint8_t, H1_2, H1, DO_UQRSHRN_H) +DO_SHRNT(sve2_uqrshrnt_s, uint32_t, uint16_t, H1_4, H1_2, DO_UQRSHRN_S) +DO_SHRNT(sve2_uqrshrnt_d, uint64_t, uint32_t, , H1_4, DO_UQRSHRN_D) + #undef DO_SHRNB #undef DO_SHRNT diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 3751544e95..fe7c916344 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -6966,6 +6966,99 @@ static bool trans_SQRSHRUNT(DisasContext *s, arg_rri_esz *a) return do_sve2_shr_narrow(s, a, ops); } +static void gen_uqshrnb_vec(unsigned vece, TCGv_vec d, + TCGv_vec n, int64_t shr) +{ + TCGv_vec t = tcg_temp_new_vec_matching(d); + int halfbits = 4 << vece; + + tcg_gen_shri_vec(vece, n, n, shr); + tcg_gen_dupi_vec(vece, t, MAKE_64BIT_MASK(0, halfbits)); + tcg_gen_umin_vec(vece, d, n, t); + tcg_temp_free_vec(t); +} + +static bool trans_UQSHRNB(DisasContext *s, arg_rri_esz *a) +{ + static const TCGOpcode vec_list[] = { + INDEX_op_shri_vec, INDEX_op_umin_vec, 0 + }; + static const GVecGen2i ops[3] = { + { .fniv = gen_uqshrnb_vec, + .opt_opc = vec_list, + .fno = gen_helper_sve2_uqshrnb_h, + .vece = MO_16 }, + { .fniv = gen_uqshrnb_vec, + .opt_opc = vec_list, + .fno = gen_helper_sve2_uqshrnb_s, + .vece = MO_32 }, + { .fniv = gen_uqshrnb_vec, + .opt_opc = vec_list, + .fno = gen_helper_sve2_uqshrnb_d, + .vece = MO_64 }, + }; + return do_sve2_shr_narrow(s, a, ops); +} + +static void gen_uqshrnt_vec(unsigned vece, TCGv_vec d, + TCGv_vec n, int64_t shr) +{ + TCGv_vec t = tcg_temp_new_vec_matching(d); + int halfbits = 4 << vece; + + tcg_gen_shri_vec(vece, n, n, shr); + tcg_gen_dupi_vec(vece, t, MAKE_64BIT_MASK(0, halfbits)); + tcg_gen_umin_vec(vece, n, n, t); + tcg_gen_shli_vec(vece, n, n, halfbits); + tcg_gen_bitsel_vec(vece, d, t, d, n); + tcg_temp_free_vec(t); +} + +static bool trans_UQSHRNT(DisasContext *s, arg_rri_esz *a) +{ + static const TCGOpcode vec_list[] = { + INDEX_op_shli_vec, INDEX_op_shri_vec, INDEX_op_umin_vec, 0 + }; + static const GVecGen2i ops[3] = { + { .fniv = gen_uqshrnt_vec, + .opt_opc = vec_list, + .load_dest = true, + .fno = gen_helper_sve2_uqshrnt_h, + .vece = MO_16 }, + { .fniv = gen_uqshrnt_vec, + .opt_opc = vec_list, + .load_dest = true, + .fno = gen_helper_sve2_uqshrnt_s, + .vece = MO_32 }, + { .fniv = gen_uqshrnt_vec, + .opt_opc = vec_list, + .load_dest = true, + .fno = gen_helper_sve2_uqshrnt_d, + .vece = MO_64 }, + }; + return do_sve2_shr_narrow(s, a, ops); +} + +static bool trans_UQRSHRNB(DisasContext *s, arg_rri_esz *a) +{ + static const GVecGen2i ops[3] = { + { .fno = gen_helper_sve2_uqrshrnb_h }, + { .fno = gen_helper_sve2_uqrshrnb_s }, + { .fno = gen_helper_sve2_uqrshrnb_d }, + }; + return do_sve2_shr_narrow(s, a, ops); +} + +static bool trans_UQRSHRNT(DisasContext *s, arg_rri_esz *a) +{ + static const GVecGen2i ops[3] = { + { .fno = gen_helper_sve2_uqrshrnt_h }, + { .fno = gen_helper_sve2_uqrshrnt_s }, + { .fno = gen_helper_sve2_uqrshrnt_d }, + }; + return do_sve2_shr_narrow(s, a, ops); +} + static bool do_sve2_zpzz_fp(DisasContext *s, arg_rprr_esz *a, gen_helper_gvec_4_ptr *fn) { From patchwork Fri Sep 18 18:37:02 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 305017 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B4CE3C43463 for ; Fri, 18 Sep 2020 19:14:06 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 530D421707 for ; Fri, 18 Sep 2020 19:14:06 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="t6i+923l" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 530D421707 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:34036 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJLpd-0003wZ-9l for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 15:14:05 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48258) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLHN-0000To-MN for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:41 -0400 Received: from mail-pl1-x641.google.com ([2607:f8b0:4864:20::641]:43706) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLHJ-00079b-G2 for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:41 -0400 Received: by mail-pl1-x641.google.com with SMTP id e4so3422866pln.10 for ; Fri, 18 Sep 2020 11:38:37 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=bUT9hsOBcfnImmcdy9VYnBVolzyQ2I837LQqeYQhtKk=; b=t6i+923ltEm1WI8FFIzmg0XCS96+IAOP4wbSNbgTS+QJavgKH4xNOMx0anxJ+GFaN2 JCQucxQL+3rEJ7o4dY6bh985vahdUZBFFUvnriceqlxBT7ownf9KqEIUs0u5l9sCY1qh mB2P8qdAgU65V9r2lKQs1NvyHLFRju5a/4vKNyC+65yoHHRpxh0a3ImIPBPYtgGteJn6 ULEBdaFVzaj5OWugLQIK/YcpASfYzD32X4hknyBclxiGinVVhjWDbcWxoTt0OBBpkQMd Ev96Cjo+VcqGXPAmmSgfU5rj1Bi8I0jfSG37s/EoO5FJc4ly6OyCk+/n6dOKiuaWtf3D 0bMA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=bUT9hsOBcfnImmcdy9VYnBVolzyQ2I837LQqeYQhtKk=; b=nSwf7LUe0O5bUISlaWYuMpvm8/8A+rhpbMo+3C7ErRukvoIs4oLZrtzOYL6KSKlJJX /yJ2Yqz8lmT3z9mMN07oLUCZpt0Sd3qrGa1bJnt7XgNSI5QhpQHsGQ9F4SB9P3d7+sXL CnxQkjN6sbUTRVqGhIb8RRVgPC71Q4Q5ytNBbqHMkMkgwRNgWvkSXNJy3BOmS+eh8jcX q55jGBFNJFVSu/0IhjWi2vkGCvu8MLGl2w6Lpzm712oZ/q4uqhcg8jHuIYLamPGiw14E uiUubfyXf+QkOkxLrdVy4wQaYc42IgN5nGDenKSrEFi9yuxhAxm7DUUQ5a2Gb5T1Bpfn vWEw== X-Gm-Message-State: AOAM5333xR+oeYSowLrWeDXCZwy5z48D7ILWEWjVUhBE04HR5Q4t+tUE /TdK+IUhMPGcJiL7RTujkKOiB4P8lryCUQ== X-Google-Smtp-Source: ABdhPJwjvIjn/kVM7Ei/rHI1N2QOadWBBqRlOpkMeWO/O/Irr21MHjvYs4MsnzGW9XXU/OhHThzUYw== X-Received: by 2002:a17:902:bb89:b029:d0:cb2d:f273 with SMTP id m9-20020a170902bb89b02900d0cb2df273mr34064340pls.12.1600454315659; Fri, 18 Sep 2020 11:38:35 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.38.34 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:38:35 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 32/81] target/arm: Implement SVE2 SQSHRN, SQRSHRN Date: Fri, 18 Sep 2020 11:37:02 -0700 Message-Id: <20200918183751.2787647-33-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::641; envelope-from=richard.henderson@linaro.org; helo=mail-pl1-x641.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" This completes the section "SVE2 bitwise shift right narrow". Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 16 ++++++ target/arm/sve.decode | 4 ++ target/arm/sve_helper.c | 24 +++++++++ target/arm/translate-sve.c | 105 +++++++++++++++++++++++++++++++++++++ 4 files changed, 149 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 3a13064980..04b3ae3219 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2476,6 +2476,22 @@ DEF_HELPER_FLAGS_3(sve2_sqrshrunt_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve2_sqrshrunt_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve2_sqrshrunt_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_sqshrnb_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_sqshrnb_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_sqshrnb_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(sve2_sqshrnt_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_sqshrnt_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_sqshrnt_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(sve2_sqrshrnb_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_sqrshrnb_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_sqrshrnb_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + +DEF_HELPER_FLAGS_3(sve2_sqrshrnt_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_sqrshrnt_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_3(sve2_sqrshrnt_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) + DEF_HELPER_FLAGS_3(sve2_uqshrnb_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve2_uqshrnb_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve2_uqshrnb_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 69915398e7..5f76a95139 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1296,6 +1296,10 @@ SHRNB 01000101 .. 1 ..... 00 0100 ..... ..... @rd_rn_tszimm_shr SHRNT 01000101 .. 1 ..... 00 0101 ..... ..... @rd_rn_tszimm_shr RSHRNB 01000101 .. 1 ..... 00 0110 ..... ..... @rd_rn_tszimm_shr RSHRNT 01000101 .. 1 ..... 00 0111 ..... ..... @rd_rn_tszimm_shr +SQSHRNB 01000101 .. 1 ..... 00 1000 ..... ..... @rd_rn_tszimm_shr +SQSHRNT 01000101 .. 1 ..... 00 1001 ..... ..... @rd_rn_tszimm_shr +SQRSHRNB 01000101 .. 1 ..... 00 1010 ..... ..... @rd_rn_tszimm_shr +SQRSHRNT 01000101 .. 1 ..... 00 1011 ..... ..... @rd_rn_tszimm_shr UQSHRNB 01000101 .. 1 ..... 00 1100 ..... ..... @rd_rn_tszimm_shr UQSHRNT 01000101 .. 1 ..... 00 1101 ..... ..... @rd_rn_tszimm_shr UQRSHRNB 01000101 .. 1 ..... 00 1110 ..... ..... @rd_rn_tszimm_shr diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index c478a1f68d..2a3dc76d87 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1970,6 +1970,30 @@ DO_SHRNT(sve2_sqrshrunt_h, int16_t, uint8_t, H1_2, H1, DO_SQRSHRUN_H) DO_SHRNT(sve2_sqrshrunt_s, int32_t, uint16_t, H1_4, H1_2, DO_SQRSHRUN_S) DO_SHRNT(sve2_sqrshrunt_d, int64_t, uint32_t, , H1_4, DO_SQRSHRUN_D) +#define DO_SQSHRN_H(x, sh) do_sat_bhs(x >> sh, INT8_MIN, INT8_MAX) +#define DO_SQSHRN_S(x, sh) do_sat_bhs(x >> sh, INT16_MIN, INT16_MAX) +#define DO_SQSHRN_D(x, sh) do_sat_bhs(x >> sh, INT32_MIN, INT32_MAX) + +DO_SHRNB(sve2_sqshrnb_h, int16_t, uint8_t, DO_SQSHRN_H) +DO_SHRNB(sve2_sqshrnb_s, int32_t, uint16_t, DO_SQSHRN_S) +DO_SHRNB(sve2_sqshrnb_d, int64_t, uint32_t, DO_SQSHRN_D) + +DO_SHRNT(sve2_sqshrnt_h, int16_t, uint8_t, H1_2, H1, DO_SQSHRN_H) +DO_SHRNT(sve2_sqshrnt_s, int32_t, uint16_t, H1_4, H1_2, DO_SQSHRN_S) +DO_SHRNT(sve2_sqshrnt_d, int64_t, uint32_t, , H1_4, DO_SQSHRN_D) + +#define DO_SQRSHRN_H(x, sh) do_sat_bhs(do_srshr(x, sh), INT8_MIN, INT8_MAX) +#define DO_SQRSHRN_S(x, sh) do_sat_bhs(do_srshr(x, sh), INT16_MIN, INT16_MAX) +#define DO_SQRSHRN_D(x, sh) do_sat_bhs(do_srshr(x, sh), INT32_MIN, INT32_MAX) + +DO_SHRNB(sve2_sqrshrnb_h, int16_t, uint8_t, DO_SQRSHRN_H) +DO_SHRNB(sve2_sqrshrnb_s, int32_t, uint16_t, DO_SQRSHRN_S) +DO_SHRNB(sve2_sqrshrnb_d, int64_t, uint32_t, DO_SQRSHRN_D) + +DO_SHRNT(sve2_sqrshrnt_h, int16_t, uint8_t, H1_2, H1, DO_SQRSHRN_H) +DO_SHRNT(sve2_sqrshrnt_s, int32_t, uint16_t, H1_4, H1_2, DO_SQRSHRN_S) +DO_SHRNT(sve2_sqrshrnt_d, int64_t, uint32_t, , H1_4, DO_SQRSHRN_D) + #define DO_UQSHRN_H(x, sh) MIN(x >> sh, UINT8_MAX) #define DO_UQSHRN_S(x, sh) MIN(x >> sh, UINT16_MAX) #define DO_UQSHRN_D(x, sh) MIN(x >> sh, UINT32_MAX) diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index fe7c916344..5091c0bc67 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -6966,6 +6966,111 @@ static bool trans_SQRSHRUNT(DisasContext *s, arg_rri_esz *a) return do_sve2_shr_narrow(s, a, ops); } +static void gen_sqshrnb_vec(unsigned vece, TCGv_vec d, + TCGv_vec n, int64_t shr) +{ + TCGv_vec t = tcg_temp_new_vec_matching(d); + int halfbits = 4 << vece; + int64_t max = MAKE_64BIT_MASK(0, halfbits - 1); + int64_t min = -max - 1; + + tcg_gen_sari_vec(vece, n, n, shr); + tcg_gen_dupi_vec(vece, t, min); + tcg_gen_smax_vec(vece, n, n, t); + tcg_gen_dupi_vec(vece, t, max); + tcg_gen_smin_vec(vece, n, n, t); + tcg_gen_dupi_vec(vece, t, MAKE_64BIT_MASK(0, halfbits)); + tcg_gen_and_vec(vece, d, n, t); + tcg_temp_free_vec(t); +} + +static bool trans_SQSHRNB(DisasContext *s, arg_rri_esz *a) +{ + static const TCGOpcode vec_list[] = { + INDEX_op_sari_vec, INDEX_op_smax_vec, INDEX_op_smin_vec, 0 + }; + static const GVecGen2i ops[3] = { + { .fniv = gen_sqshrnb_vec, + .opt_opc = vec_list, + .fno = gen_helper_sve2_sqshrnb_h, + .vece = MO_16 }, + { .fniv = gen_sqshrnb_vec, + .opt_opc = vec_list, + .fno = gen_helper_sve2_sqshrnb_s, + .vece = MO_32 }, + { .fniv = gen_sqshrnb_vec, + .opt_opc = vec_list, + .fno = gen_helper_sve2_sqshrnb_d, + .vece = MO_64 }, + }; + return do_sve2_shr_narrow(s, a, ops); +} + +static void gen_sqshrnt_vec(unsigned vece, TCGv_vec d, + TCGv_vec n, int64_t shr) +{ + TCGv_vec t = tcg_temp_new_vec_matching(d); + int halfbits = 4 << vece; + int64_t max = MAKE_64BIT_MASK(0, halfbits - 1); + int64_t min = -max - 1; + + tcg_gen_sari_vec(vece, n, n, shr); + tcg_gen_dupi_vec(vece, t, min); + tcg_gen_smax_vec(vece, n, n, t); + tcg_gen_dupi_vec(vece, t, max); + tcg_gen_smin_vec(vece, n, n, t); + tcg_gen_shli_vec(vece, n, n, halfbits); + tcg_gen_dupi_vec(vece, t, MAKE_64BIT_MASK(0, halfbits)); + tcg_gen_bitsel_vec(vece, d, t, d, n); + tcg_temp_free_vec(t); +} + +static bool trans_SQSHRNT(DisasContext *s, arg_rri_esz *a) +{ + static const TCGOpcode vec_list[] = { + INDEX_op_shli_vec, INDEX_op_sari_vec, + INDEX_op_smax_vec, INDEX_op_smin_vec, 0 + }; + static const GVecGen2i ops[3] = { + { .fniv = gen_sqshrnt_vec, + .opt_opc = vec_list, + .load_dest = true, + .fno = gen_helper_sve2_sqshrnt_h, + .vece = MO_16 }, + { .fniv = gen_sqshrnt_vec, + .opt_opc = vec_list, + .load_dest = true, + .fno = gen_helper_sve2_sqshrnt_s, + .vece = MO_32 }, + { .fniv = gen_sqshrnt_vec, + .opt_opc = vec_list, + .load_dest = true, + .fno = gen_helper_sve2_sqshrnt_d, + .vece = MO_64 }, + }; + return do_sve2_shr_narrow(s, a, ops); +} + +static bool trans_SQRSHRNB(DisasContext *s, arg_rri_esz *a) +{ + static const GVecGen2i ops[3] = { + { .fno = gen_helper_sve2_sqrshrnb_h }, + { .fno = gen_helper_sve2_sqrshrnb_s }, + { .fno = gen_helper_sve2_sqrshrnb_d }, + }; + return do_sve2_shr_narrow(s, a, ops); +} + +static bool trans_SQRSHRNT(DisasContext *s, arg_rri_esz *a) +{ + static const GVecGen2i ops[3] = { + { .fno = gen_helper_sve2_sqrshrnt_h }, + { .fno = gen_helper_sve2_sqrshrnt_s }, + { .fno = gen_helper_sve2_sqrshrnt_d }, + }; + return do_sve2_shr_narrow(s, a, ops); +} + static void gen_uqshrnb_vec(unsigned vece, TCGv_vec d, TCGv_vec n, int64_t shr) { From patchwork Fri Sep 18 18:37:03 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 273346 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 91889C43463 for ; Fri, 18 Sep 2020 19:08:42 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 011CF21D7F for ; Fri, 18 Sep 2020 19:08:41 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="zM1ruSuI" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 011CF21D7F Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:47194 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJLkP-00061o-4c for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 15:08:41 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48250) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLHM-0000SD-Uy for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:40 -0400 Received: from mail-pl1-x641.google.com ([2607:f8b0:4864:20::641]:36620) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLHK-00079p-Qe for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:40 -0400 Received: by mail-pl1-x641.google.com with SMTP id k13so3442835plk.3 for ; Fri, 18 Sep 2020 11:38:38 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=16ObUsciA5sTLGBaYdN2/secK6wqo2MfYlYAt1+Qus8=; b=zM1ruSuIdrogJrevkujSv8/ZDVdBc3R1QH8euSWlZkwYgP+BpRujHrW7Tsy9x7hCrq 9yqbTEex2JVNY/wjPDsN42NvVwJybXTXGu5BDJPPFrrboOKzXHsrd4SbseNnbmfpbOj5 OWiz2vClFoBZzq3k+8WF5QZDZOtpmfmJVAASjeZQFLNvzlHZ5gommAhicCNrkhCTVPcG OZaz6iEW3xz3587KsBP0iDNcsrpVsBQVVjGcgH4qJD+O2Fpi11klN9jLN56WAuosgT/a m1MsMtkRRJg6xD4Av+foKYxR/84QyPvAXxlkglNvJPxSqpFRoG/1bndk+y/zz4sTCrQ0 2ULw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=16ObUsciA5sTLGBaYdN2/secK6wqo2MfYlYAt1+Qus8=; b=WHixUKKPVWVoWyDt+trIPpZcfBN49Ulx+xzQ0gECoME5jEd0/xWw7h1T3L5jE3o7j+ NENTb4CWbfa0osSVYR0b65GpzMwKlrd0mefn8D+zlvtkxkO4xoGg866oVvuFw3kiIPnB CDlTEF52IJWUL+C8+nFl8Jguz4UuOLzeUnw2vmHOVXEZ0Vo67EmqqKn02YWFN31z2rCY mXsVcdvsobdeLZjCNrT/N7eSV1H53Q36QMZrXuFywUEcgV21ck0BF4baY2W1s9WIhTDw TNtMNhAnMdb7SS7qYF1vmVlIYgUrpV2qJFvpevmiBV2GcGgWMCjnBDI4jJAWtmpVbAvD 7AiA== X-Gm-Message-State: AOAM53094E4Tn9HW+MKDs+5Gs/FLarf3X80VsRM3RYNDddpsGWvkB4GS LLjmcZYMqzXPhnyg7dfZx9j0/DJpcLld+g== X-Google-Smtp-Source: ABdhPJwO2aE/m51QbsDYVg9CnfcN479jucGK15XXN2Ef1ONeAu0s5oCQFiy1bULzfM890rVK/7QLgg== X-Received: by 2002:a17:902:8549:b029:d1:e603:af73 with SMTP id d9-20020a1709028549b02900d1e603af73mr15856813plo.60.1600454316848; Fri, 18 Sep 2020 11:38:36 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.38.35 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:38:36 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 33/81] target/arm: Implement SVE2 WHILEGT, WHILEGE, WHILEHI, WHILEHS Date: Fri, 18 Sep 2020 11:37:03 -0700 Message-Id: <20200918183751.2787647-34-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::641; envelope-from=richard.henderson@linaro.org; helo=mail-pl1-x641.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Rename the existing sve_while (less-than) helper to sve_whilel to make room for a new sve_whileg helper for greater-than. Signed-off-by: Richard Henderson --- v2: Use a new helper function to implement this. --- target/arm/helper-sve.h | 3 +- target/arm/sve.decode | 2 +- target/arm/sve_helper.c | 38 +++++++++++++++++++++++++- target/arm/translate-sve.c | 56 ++++++++++++++++++++++++++++---------- 4 files changed, 82 insertions(+), 17 deletions(-) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 04b3ae3219..6b4ae8f98b 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -913,7 +913,8 @@ DEF_HELPER_FLAGS_4(sve_brkns, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve_cntp, TCG_CALL_NO_RWG, i64, ptr, ptr, i32) -DEF_HELPER_FLAGS_3(sve_while, TCG_CALL_NO_RWG, i32, ptr, i32, i32) +DEF_HELPER_FLAGS_3(sve_whilel, TCG_CALL_NO_RWG, i32, ptr, i32, i32) +DEF_HELPER_FLAGS_3(sve_whileg, TCG_CALL_NO_RWG, i32, ptr, i32, i32) DEF_HELPER_FLAGS_4(sve_subri_b, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) DEF_HELPER_FLAGS_4(sve_subri_h, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 5f76a95139..b7038f9f57 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -700,7 +700,7 @@ SINCDECP_z 00100101 .. 1010 d:1 u:1 10000 00 .... ..... @incdec2_pred CTERM 00100101 1 sf:1 1 rm:5 001000 rn:5 ne:1 0000 # SVE integer compare scalar count and limit -WHILE 00100101 esz:2 1 rm:5 000 sf:1 u:1 1 rn:5 eq:1 rd:4 +WHILE 00100101 esz:2 1 rm:5 000 sf:1 u:1 lt:1 rn:5 eq:1 rd:4 ### SVE Integer Wide Immediate - Unpredicated Group diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 2a3dc76d87..0bb9d73cbf 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -3746,7 +3746,7 @@ uint64_t HELPER(sve_cntp)(void *vn, void *vg, uint32_t pred_desc) return sum; } -uint32_t HELPER(sve_while)(void *vd, uint32_t count, uint32_t pred_desc) +uint32_t HELPER(sve_whilel)(void *vd, uint32_t count, uint32_t pred_desc) { uintptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; intptr_t esz = extract32(pred_desc, SIMD_DATA_SHIFT, 2); @@ -3772,6 +3772,42 @@ uint32_t HELPER(sve_while)(void *vd, uint32_t count, uint32_t pred_desc) return predtest_ones(d, oprsz, esz_mask); } +uint32_t HELPER(sve_whileg)(void *vd, uint32_t count, uint32_t pred_desc) +{ + uintptr_t oprsz = extract32(pred_desc, 0, SIMD_OPRSZ_BITS) + 2; + intptr_t esz = extract32(pred_desc, SIMD_DATA_SHIFT, 2); + uint64_t esz_mask = pred_esz_masks[esz]; + ARMPredicateReg *d = vd; + intptr_t i, invcount, oprbits; + uint64_t bits; + + if (count == 0) { + return do_zero(d, oprsz); + } + + oprbits = oprsz * 8; + tcg_debug_assert(count <= oprbits); + + bits = esz_mask; + if (oprbits & 63) { + bits &= MAKE_64BIT_MASK(0, oprbits & 63); + } + + invcount = oprbits - count; + for (i = (oprsz - 1) / 8; i > invcount / 64; --i) { + d->p[i] = bits; + bits = esz_mask; + } + + d->p[i] = bits & MAKE_64BIT_MASK(invcount & 63, 64); + + while (--i >= 0) { + d->p[i] = 0; + } + + return predtest_ones(d, oprsz, esz_mask); +} + /* Recursive reduction on a function; * C.f. the ARM ARM function ReducePredicated. * diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 5091c0bc67..f1bc4c63e6 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3121,7 +3121,14 @@ static bool trans_WHILE(DisasContext *s, arg_WHILE *a) TCGv_ptr ptr; unsigned desc, vsz = vec_full_reg_size(s); TCGCond cond; + uint64_t maxval; + /* Note that GE/HS has a->eq == 0 and GT/HI has a->eq == 1. */ + bool eq = a->eq == a->lt; + /* The greater-than conditions are all SVE2. */ + if (!a->lt && !dc_isar_feature(aa64_sve2, s)) { + return false; + } if (!sve_access_check(s)) { return true; } @@ -3144,22 +3151,42 @@ static bool trans_WHILE(DisasContext *s, arg_WHILE *a) */ t0 = tcg_temp_new_i64(); t1 = tcg_temp_new_i64(); - tcg_gen_sub_i64(t0, op1, op0); + + if (a->lt) { + tcg_gen_sub_i64(t0, op1, op0); + if (a->u) { + maxval = a->sf ? UINT64_MAX : UINT32_MAX; + cond = eq ? TCG_COND_LEU : TCG_COND_LTU; + } else { + maxval = a->sf ? INT64_MAX : INT32_MAX; + cond = eq ? TCG_COND_LE : TCG_COND_LT; + } + } else { + tcg_gen_sub_i64(t0, op0, op1); + if (a->u) { + maxval = 0; + cond = eq ? TCG_COND_GEU : TCG_COND_GTU; + } else { + maxval = a->sf ? INT64_MIN : INT32_MIN; + cond = eq ? TCG_COND_GE : TCG_COND_GT; + } + } tmax = tcg_const_i64(vsz >> a->esz); - if (a->eq) { + if (eq) { /* Equality means one more iteration. */ tcg_gen_addi_i64(t0, t0, 1); - /* If op1 is max (un)signed integer (and the only time the addition - * above could overflow), then we produce an all-true predicate by - * setting the count to the vector length. This is because the - * pseudocode is described as an increment + compare loop, and the - * max integer would always compare true. + /* + * For the less-than while, if op1 is maxval (and the only time + * the addition above could overflow), then we produce an all-true + * predicate by setting the count to the vector length. This is + * because the pseudocode is described as an increment + compare + * loop, and the maximum integer would always compare true. + * Similarly, the greater-than while has the same issue with the + * minimum integer due to the decrement + compare loop. */ - tcg_gen_movi_i64(t1, (a->sf - ? (a->u ? UINT64_MAX : INT64_MAX) - : (a->u ? UINT32_MAX : INT32_MAX))); + tcg_gen_movi_i64(t1, maxval); tcg_gen_movcond_i64(TCG_COND_EQ, t0, op1, t1, tmax, t0); } @@ -3168,9 +3195,6 @@ static bool trans_WHILE(DisasContext *s, arg_WHILE *a) tcg_temp_free_i64(tmax); /* Set the count to zero if the condition is false. */ - cond = (a->u - ? (a->eq ? TCG_COND_LEU : TCG_COND_LTU) - : (a->eq ? TCG_COND_LE : TCG_COND_LT)); tcg_gen_movi_i64(t1, 0); tcg_gen_movcond_i64(cond, t0, op0, op1, t0, t1); tcg_temp_free_i64(t1); @@ -3190,7 +3214,11 @@ static bool trans_WHILE(DisasContext *s, arg_WHILE *a) ptr = tcg_temp_new_ptr(); tcg_gen_addi_ptr(ptr, cpu_env, pred_full_reg_offset(s, a->rd)); - gen_helper_sve_while(t2, ptr, t2, t3); + if (a->lt) { + gen_helper_sve_whilel(t2, ptr, t2, t3); + } else { + gen_helper_sve_whileg(t2, ptr, t2, t3); + } do_pred_flags(t2); tcg_temp_free_ptr(ptr); From patchwork Fri Sep 18 18:37:04 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 273333 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C6C8BC43465 for ; Fri, 18 Sep 2020 19:29:22 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 4582A22208 for ; Fri, 18 Sep 2020 19:29:22 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="u1SC/TkI" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 4582A22208 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:48552 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJM4P-0004zQ-D9 for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 15:29:21 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48260) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLHN-0000UF-SJ for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:41 -0400 Received: from mail-pj1-x1044.google.com ([2607:f8b0:4864:20::1044]:53496) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLHL-0007A2-Sg for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:41 -0400 Received: by mail-pj1-x1044.google.com with SMTP id t7so3624660pjd.3 for ; Fri, 18 Sep 2020 11:38:39 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=4IryoG8DYenziLinJYml5j3JERwzXkbcIwlI0q1lzG0=; b=u1SC/TkInsER5S2a0B3bOSqStzTXqdEH3wYnzWC+83OErlKJ2CcM7SiDXWDPLZb0Gn Ox/3WxYFCR51scwfjY5nahjVNuREkCfeQ3ByWx8yoTh9js5PCWbAVJAFEueD02T3AzTY sZ2K0osnBV7PCd/zrSsZnkh8n3ZwEPfLxSLoVyvqorDmz7YhfsduTVG0s0xVN363ikTP cmJzeoyVQ9AIf0RPK6HIxOWZj6ZBgJrlPmAPJ8KOuNlJz3nhXXjoLGUmVeVYiizTVNjW pdu+UK2l0DyVJ3HZQ8eW8K10JDdn7/DpOJLl1/QOjqQl8iqgmAlCurgmkK6HMeXft/lt tRHg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=4IryoG8DYenziLinJYml5j3JERwzXkbcIwlI0q1lzG0=; b=LEB6zeLMsBb6IAT05JU7cc/sMYTM+TuK5Yh1l61XfcuhynhmJgue1WZxEP7Im7p55z N0zz+0DsJ7l3eU0xtU+KHRuOuxM0Zv+wB4ny+kCS3Lr1VIoQdEKuTa0Ti3glYpz3IkN1 AdmpsF7vNpd8DCGEe7XDyaDBAWlv88OgUIU6XIUWS5AVZrg3/Drck1Hk0M/CQLvEbAx1 7zZsNDch6jsMlAGeEJp5g2e4DZGNsuhyBo5UY5DKcUIlka3jJvSa1HvqHrRPB2jfNAc1 bC0QFmJSzqajF2H3ytsdUi/g5r1mLGg4cbPMoLBD+A7yFwZRHXzLRVnn0qNQLpG9iPl/ locw== X-Gm-Message-State: AOAM53322E8buhEYTx6cSHNJ6hTaO443bcYmvJh02EyyaAvOHW/g7kGr 74vuLlVBsM7WkPCtN8EeCtrZkrmyU9S81Q== X-Google-Smtp-Source: ABdhPJyEsxKXXyuM15gkXwxQmbwKFTLQW2zv1cKmf58livm2H6nbOtg4dT7b8iX5l569OVK32P25ig== X-Received: by 2002:a17:902:a701:b029:d0:89f3:28d5 with SMTP id w1-20020a170902a701b02900d089f328d5mr34293648plq.17.1600454318252; Fri, 18 Sep 2020 11:38:38 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.38.36 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:38:37 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 34/81] target/arm: Implement SVE2 WHILERW, WHILEWR Date: Fri, 18 Sep 2020 11:37:04 -0700 Message-Id: <20200918183751.2787647-35-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::1044; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x1044.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- v2: Fix decodetree typo --- target/arm/sve.decode | 3 ++ target/arm/translate-sve.c | 62 ++++++++++++++++++++++++++++++++++++++ 2 files changed, 65 insertions(+) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index b7038f9f57..19d503e2f4 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -702,6 +702,9 @@ CTERM 00100101 1 sf:1 1 rm:5 001000 rn:5 ne:1 0000 # SVE integer compare scalar count and limit WHILE 00100101 esz:2 1 rm:5 000 sf:1 u:1 lt:1 rn:5 eq:1 rd:4 +# SVE2 pointer conflict compare +WHILE_ptr 00100101 esz:2 1 rm:5 001 100 rn:5 rw:1 rd:4 + ### SVE Integer Wide Immediate - Unpredicated Group # SVE broadcast floating-point immediate (unpredicated) diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index f1bc4c63e6..d3241ce167 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3227,6 +3227,68 @@ static bool trans_WHILE(DisasContext *s, arg_WHILE *a) return true; } +static bool trans_WHILE_ptr(DisasContext *s, arg_WHILE_ptr *a) +{ + TCGv_i64 op0, op1, diff, t1, tmax; + TCGv_i32 t2, t3; + TCGv_ptr ptr; + unsigned desc, vsz = vec_full_reg_size(s); + + if (!dc_isar_feature(aa64_sve2, s)) { + return false; + } + if (!sve_access_check(s)) { + return true; + } + + op0 = read_cpu_reg(s, a->rn, 1); + op1 = read_cpu_reg(s, a->rm, 1); + + tmax = tcg_const_i64(vsz); + diff = tcg_temp_new_i64(); + + if (a->rw) { + /* WHILERW */ + /* diff = abs(op1 - op0), noting that op0/1 are unsigned. */ + t1 = tcg_temp_new_i64(); + tcg_gen_sub_i64(diff, op0, op1); + tcg_gen_sub_i64(t1, op1, op0); + tcg_gen_movcond_i64(TCG_COND_LTU, diff, op0, op1, diff, t1); + tcg_temp_free_i64(t1); + /* If op1 == op0, diff == 0, and the condition is always true. */ + tcg_gen_movcond_i64(TCG_COND_EQ, diff, op0, op1, tmax, diff); + } else { + /* WHILEWR */ + tcg_gen_sub_i64(diff, op1, op0); + /* If op0 >= op1, diff <= 0, the condition is always true. */ + tcg_gen_movcond_i64(TCG_COND_GEU, diff, op0, op1, tmax, diff); + } + + /* Bound to the maximum. */ + tcg_gen_umin_i64(diff, diff, tmax); + tcg_temp_free_i64(tmax); + + /* Since we're bounded, pass as a 32-bit type. */ + t2 = tcg_temp_new_i32(); + tcg_gen_extrl_i64_i32(t2, diff); + tcg_temp_free_i64(diff); + + desc = (vsz / 8) - 2; + desc = deposit32(desc, SIMD_DATA_SHIFT, 2, a->esz); + t3 = tcg_const_i32(desc); + + ptr = tcg_temp_new_ptr(); + tcg_gen_addi_ptr(ptr, cpu_env, pred_full_reg_offset(s, a->rd)); + + gen_helper_sve_whilel(t2, ptr, t2, t3); + do_pred_flags(t2); + + tcg_temp_free_ptr(ptr); + tcg_temp_free_i32(t2); + tcg_temp_free_i32(t3); + return true; +} + /* *** SVE Integer Wide Immediate - Unpredicated Group */ From patchwork Fri Sep 18 18:37:05 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 273331 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id CCF64C43463 for ; Fri, 18 Sep 2020 19:33:13 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 4BFB921734 for ; Fri, 18 Sep 2020 19:33:13 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="a8UxX1Uk" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 4BFB921734 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:56736 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJM88-0000FS-5Z for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 15:33:12 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48292) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLHP-0000YI-Ld for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:43 -0400 Received: from mail-pj1-x1042.google.com ([2607:f8b0:4864:20::1042]:38627) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLHN-0007A8-B1 for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:43 -0400 Received: by mail-pj1-x1042.google.com with SMTP id u3so3459387pjr.3 for ; Fri, 18 Sep 2020 11:38:40 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=BqaKaRArf9lPcdISKYFFrvwjSengw95aGke2XPpXa3o=; b=a8UxX1UkpxDPeqWK0VSWXQDaTY7STnv0aYbiCEUPJO/549RVHUw3kd2qVtQ0MkQvuC 4haCKfz+uc36/51l0RJykiF6F6hCh4AhiRVu2Kt+Yn6yZ4PuYeKqquU5wAR+rX73Q8Q0 rFHRDXjuAZbUg06nNBm4jfZ0Ipu2hgKRnFYSNtEr8lNFDKJYRA6m3BXkx+HcihNZFC0w sWz/uZaH4AtjybGTqgPry71k22jqLOIpgNyTHa3VyM3Tz+Kj878n0AirY4MtUjz8rvSN PpZQBt7I808Jww1eEy4znSTrAPsSG1USUvmDwVaOQ3Y/bD+QWZDnU86rM+JU68NwGyxx kP/w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=BqaKaRArf9lPcdISKYFFrvwjSengw95aGke2XPpXa3o=; b=dh8Ei2OsFurCtnWeCSTNYXkpHw4aq7HocS0HaP99F2uJ6tMG238vldoHnsBZMZ5rc5 bJoG1jX4cZIc1NQJVbkQsYrhCA3tmCM7qGp4y7ILSlEyjVhqrsXg/2IXleUVCBg4S3Xa NTtXop+zNen7xZmMtNd9ZvP9xi32YchyfhA2JCiwN0h7RAQj7ntEs3aPiQpbo4qTCSwj 5opWCdC3TipiQrRlToyNV0KDTCF+KVQDHd6LwPf600XO5MU/RQZz+P8rcEkmYWdsxW5T R3W9JBsIwmRsj1nbiHayrW48vsd8uD3d8g1kzZYDNdyj7nrmQzbUNA2XaIIvMxafzlFe df5Q== X-Gm-Message-State: AOAM532RstqE3Jw+9INvChOehCZ8hIGq6d10rVHUGms4Yd2G9yNxxuMl +9OUjRLzEaSi+5GRtDYnVdRAGji2MkEQ6A== X-Google-Smtp-Source: ABdhPJyNaKjzSCRSpVUFyqnWmalhTdJjFqPX+4PeqiEVjWoiyUWuh0tRV8ay5MPlhBMSYY2CD1qIRg== X-Received: by 2002:a17:902:760f:b029:d1:f8be:b0be with SMTP id k15-20020a170902760fb02900d1f8beb0bemr9291203pll.9.1600454319482; Fri, 18 Sep 2020 11:38:39 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.38.38 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:38:38 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 35/81] target/arm: Implement SVE2 bitwise ternary operations Date: Fri, 18 Sep 2020 11:37:05 -0700 Message-Id: <20200918183751.2787647-36-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::1042; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x1042.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 6 ++ target/arm/sve.decode | 12 +++ target/arm/sve_helper.c | 50 +++++++++ target/arm/translate-sve.c | 213 +++++++++++++++++++++++++++++++++++++ 4 files changed, 281 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 6b4ae8f98b..923911ca21 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2543,3 +2543,9 @@ DEF_HELPER_FLAGS_6(sve2_fminp_zpzz_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_6(sve2_fminp_zpzz_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_eor3, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_bcax, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_bsl1n, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_bsl2n, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_nbsl, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 19d503e2f4..a50afd40c2 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -124,6 +124,10 @@ @rda_rn_rm ........ esz:2 . rm:5 ... ... rn:5 rd:5 \ &rrrr_esz ra=%reg_movprfx +# Four operand with unused vector element size +@rdn_ra_rm_e0 ........ ... rm:5 ... ... ra:5 rd:5 \ + &rrrr_esz esz=0 rn=%reg_movprfx + # Three operand with "memory" size, aka immediate left shift @rd_rn_msz_rm ........ ... rm:5 .... imm:2 rn:5 rd:5 &rrri @@ -379,6 +383,14 @@ ORR_zzz 00000100 01 1 ..... 001 100 ..... ..... @rd_rn_rm_e0 EOR_zzz 00000100 10 1 ..... 001 100 ..... ..... @rd_rn_rm_e0 BIC_zzz 00000100 11 1 ..... 001 100 ..... ..... @rd_rn_rm_e0 +# SVE2 bitwise ternary operations +EOR3 00000100 00 1 ..... 001 110 ..... ..... @rdn_ra_rm_e0 +BSL 00000100 00 1 ..... 001 111 ..... ..... @rdn_ra_rm_e0 +BCAX 00000100 01 1 ..... 001 110 ..... ..... @rdn_ra_rm_e0 +BSL1N 00000100 01 1 ..... 001 111 ..... ..... @rdn_ra_rm_e0 +BSL2N 00000100 10 1 ..... 001 111 ..... ..... @rdn_ra_rm_e0 +NBSL 00000100 11 1 ..... 001 111 ..... ..... @rdn_ra_rm_e0 + ### SVE Index Generation Group # SVE index generation (immediate start, immediate increment) diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 0bb9d73cbf..aa7ff94812 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -6825,3 +6825,53 @@ DO_ST1_ZPZ_D(dd_be, zd, MO_64) #undef DO_ST1_ZPZ_S #undef DO_ST1_ZPZ_D + +void HELPER(sve2_eor3)(void *vd, void *vn, void *vm, void *vk, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc) / 8; + uint64_t *d = vd, *n = vn, *m = vm, *k = vk; + + for (i = 0; i < opr_sz; ++i) { + d[i] = n[i] ^ m[i] ^ k[i]; + } +} + +void HELPER(sve2_bcax)(void *vd, void *vn, void *vm, void *vk, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc) / 8; + uint64_t *d = vd, *n = vn, *m = vm, *k = vk; + + for (i = 0; i < opr_sz; ++i) { + d[i] = n[i] ^ (m[i] & ~k[i]); + } +} + +void HELPER(sve2_bsl1n)(void *vd, void *vn, void *vm, void *vk, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc) / 8; + uint64_t *d = vd, *n = vn, *m = vm, *k = vk; + + for (i = 0; i < opr_sz; ++i) { + d[i] = (~n[i] & k[i]) | (m[i] & ~k[i]); + } +} + +void HELPER(sve2_bsl2n)(void *vd, void *vn, void *vm, void *vk, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc) / 8; + uint64_t *d = vd, *n = vn, *m = vm, *k = vk; + + for (i = 0; i < opr_sz; ++i) { + d[i] = (n[i] & k[i]) | (~m[i] & ~k[i]); + } +} + +void HELPER(sve2_nbsl)(void *vd, void *vn, void *vm, void *vk, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc) / 8; + uint64_t *d = vd, *n = vn, *m = vm, *k = vk; + + for (i = 0; i < opr_sz; ++i) { + d[i] = ~((n[i] & k[i]) | (m[i] & ~k[i])); + } +} diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index d3241ce167..540e926c18 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -217,6 +217,17 @@ static void gen_gvec_fn_zzz(DisasContext *s, GVecGen3Fn *gvec_fn, vec_full_reg_offset(s, rm), vsz, vsz); } +/* Invoke a vector expander on four Zregs. */ +static void gen_gvec_fn_zzzz(DisasContext *s, GVecGen4Fn *gvec_fn, + int esz, int rd, int rn, int rm, int ra) +{ + unsigned vsz = vec_full_reg_size(s); + gvec_fn(esz, vec_full_reg_offset(s, rd), + vec_full_reg_offset(s, rn), + vec_full_reg_offset(s, rm), + vec_full_reg_offset(s, ra), vsz, vsz); +} + /* Invoke a vector move on two Zregs. */ static bool do_mov_z(DisasContext *s, int rd, int rn) { @@ -329,6 +340,208 @@ static bool trans_BIC_zzz(DisasContext *s, arg_rrr_esz *a) return do_zzz_fn(s, a, tcg_gen_gvec_andc); } +static bool do_sve2_zzzz_fn(DisasContext *s, arg_rrrr_esz *a, GVecGen4Fn *fn) +{ + if (!dc_isar_feature(aa64_sve2, s)) { + return false; + } + if (sve_access_check(s)) { + gen_gvec_fn_zzzz(s, fn, a->esz, a->rd, a->rn, a->rm, a->ra); + } + return true; +} + +static void gen_eor3_i64(TCGv_i64 d, TCGv_i64 n, TCGv_i64 m, TCGv_i64 k) +{ + tcg_gen_xor_i64(d, n, m); + tcg_gen_xor_i64(d, d, k); +} + +static void gen_eor3_vec(unsigned vece, TCGv_vec d, TCGv_vec n, + TCGv_vec m, TCGv_vec k) +{ + tcg_gen_xor_vec(vece, d, n, m); + tcg_gen_xor_vec(vece, d, d, k); +} + +static void gen_eor3(unsigned vece, uint32_t d, uint32_t n, uint32_t m, + uint32_t a, uint32_t oprsz, uint32_t maxsz) +{ + static const GVecGen4 op = { + .fni8 = gen_eor3_i64, + .fniv = gen_eor3_vec, + .fno = gen_helper_sve2_eor3, + .vece = MO_64, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + }; + tcg_gen_gvec_4(d, n, m, a, oprsz, maxsz, &op); +} + +static bool trans_EOR3(DisasContext *s, arg_rrrr_esz *a) +{ + return do_sve2_zzzz_fn(s, a, gen_eor3); +} + +static void gen_bcax_i64(TCGv_i64 d, TCGv_i64 n, TCGv_i64 m, TCGv_i64 k) +{ + tcg_gen_andc_i64(d, m, k); + tcg_gen_xor_i64(d, d, n); +} + +static void gen_bcax_vec(unsigned vece, TCGv_vec d, TCGv_vec n, + TCGv_vec m, TCGv_vec k) +{ + tcg_gen_andc_vec(vece, d, m, k); + tcg_gen_xor_vec(vece, d, d, n); +} + +static void gen_bcax(unsigned vece, uint32_t d, uint32_t n, uint32_t m, + uint32_t a, uint32_t oprsz, uint32_t maxsz) +{ + static const GVecGen4 op = { + .fni8 = gen_bcax_i64, + .fniv = gen_bcax_vec, + .fno = gen_helper_sve2_bcax, + .vece = MO_64, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + }; + tcg_gen_gvec_4(d, n, m, a, oprsz, maxsz, &op); +} + +static bool trans_BCAX(DisasContext *s, arg_rrrr_esz *a) +{ + return do_sve2_zzzz_fn(s, a, gen_bcax); +} + +static void gen_bsl(unsigned vece, uint32_t d, uint32_t n, uint32_t m, + uint32_t a, uint32_t oprsz, uint32_t maxsz) +{ + /* BSL differs from the generic bitsel in argument ordering. */ + tcg_gen_gvec_bitsel(vece, d, a, n, m, oprsz, maxsz); +} + +static bool trans_BSL(DisasContext *s, arg_rrrr_esz *a) +{ + return do_sve2_zzzz_fn(s, a, gen_bsl); +} + +static void gen_bsl1n_i64(TCGv_i64 d, TCGv_i64 n, TCGv_i64 m, TCGv_i64 k) +{ + tcg_gen_andc_i64(n, k, n); + tcg_gen_andc_i64(m, m, k); + tcg_gen_or_i64(d, n, m); +} + +static void gen_bsl1n_vec(unsigned vece, TCGv_vec d, TCGv_vec n, + TCGv_vec m, TCGv_vec k) +{ + if (TCG_TARGET_HAS_bitsel_vec) { + tcg_gen_not_vec(vece, n, n); + tcg_gen_bitsel_vec(vece, d, k, n, m); + } else { + tcg_gen_andc_vec(vece, n, k, n); + tcg_gen_andc_vec(vece, m, m, k); + tcg_gen_or_vec(vece, d, n, m); + } +} + +static void gen_bsl1n(unsigned vece, uint32_t d, uint32_t n, uint32_t m, + uint32_t a, uint32_t oprsz, uint32_t maxsz) +{ + static const GVecGen4 op = { + .fni8 = gen_bsl1n_i64, + .fniv = gen_bsl1n_vec, + .fno = gen_helper_sve2_bsl1n, + .vece = MO_64, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + }; + tcg_gen_gvec_4(d, n, m, a, oprsz, maxsz, &op); +} + +static bool trans_BSL1N(DisasContext *s, arg_rrrr_esz *a) +{ + return do_sve2_zzzz_fn(s, a, gen_bsl1n); +} + +static void gen_bsl2n_i64(TCGv_i64 d, TCGv_i64 n, TCGv_i64 m, TCGv_i64 k) +{ + /* + * Z[dn] = (n & k) | (~m & ~k) + * = | ~(m | k) + */ + tcg_gen_and_i64(n, n, k); + if (TCG_TARGET_HAS_orc_i64) { + tcg_gen_or_i64(m, m, k); + tcg_gen_orc_i64(d, n, m); + } else { + tcg_gen_nor_i64(m, m, k); + tcg_gen_or_i64(d, n, m); + } +} + +static void gen_bsl2n_vec(unsigned vece, TCGv_vec d, TCGv_vec n, + TCGv_vec m, TCGv_vec k) +{ + if (TCG_TARGET_HAS_bitsel_vec) { + tcg_gen_not_vec(vece, m, m); + tcg_gen_bitsel_vec(vece, d, k, n, m); + } else { + tcg_gen_and_vec(vece, n, n, k); + tcg_gen_or_vec(vece, m, m, k); + tcg_gen_orc_vec(vece, d, n, m); + } +} + +static void gen_bsl2n(unsigned vece, uint32_t d, uint32_t n, uint32_t m, + uint32_t a, uint32_t oprsz, uint32_t maxsz) +{ + static const GVecGen4 op = { + .fni8 = gen_bsl2n_i64, + .fniv = gen_bsl2n_vec, + .fno = gen_helper_sve2_bsl2n, + .vece = MO_64, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + }; + tcg_gen_gvec_4(d, n, m, a, oprsz, maxsz, &op); +} + +static bool trans_BSL2N(DisasContext *s, arg_rrrr_esz *a) +{ + return do_sve2_zzzz_fn(s, a, gen_bsl2n); +} + +static void gen_nbsl_i64(TCGv_i64 d, TCGv_i64 n, TCGv_i64 m, TCGv_i64 k) +{ + tcg_gen_and_i64(n, n, k); + tcg_gen_andc_i64(m, m, k); + tcg_gen_nor_i64(d, n, m); +} + +static void gen_nbsl_vec(unsigned vece, TCGv_vec d, TCGv_vec n, + TCGv_vec m, TCGv_vec k) +{ + tcg_gen_bitsel_vec(vece, d, k, n, m); + tcg_gen_not_vec(vece, d, d); +} + +static void gen_nbsl(unsigned vece, uint32_t d, uint32_t n, uint32_t m, + uint32_t a, uint32_t oprsz, uint32_t maxsz) +{ + static const GVecGen4 op = { + .fni8 = gen_nbsl_i64, + .fniv = gen_nbsl_vec, + .fno = gen_helper_sve2_nbsl, + .vece = MO_64, + .prefer_i64 = TCG_TARGET_REG_BITS == 64, + }; + tcg_gen_gvec_4(d, n, m, a, oprsz, maxsz, &op); +} + +static bool trans_NBSL(DisasContext *s, arg_rrrr_esz *a) +{ + return do_sve2_zzzz_fn(s, a, gen_nbsl); +} + /* *** SVE Integer Arithmetic - Unpredicated Group */ From patchwork Fri Sep 18 18:37:06 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 273340 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6C48BC43463 for ; Fri, 18 Sep 2020 19:19:04 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 17EEE21D7F for ; Fri, 18 Sep 2020 19:19:02 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="H1kjFfGu" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 17EEE21D7F Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:45432 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJLuP-0000L9-59 for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 15:19:01 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48322) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLHU-0000a5-Fg for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:48 -0400 Received: from mail-pl1-x633.google.com ([2607:f8b0:4864:20::633]:40200) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLHO-0007AP-Ne for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:48 -0400 Received: by mail-pl1-x633.google.com with SMTP id bd2so3427354plb.7 for ; Fri, 18 Sep 2020 11:38:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=Ub2ynCDHLnvijoSLHJW2aWQL1n7R5p9vDIqMXp4ZHSk=; b=H1kjFfGuGUJLJrC45gAqwwG/PcE0N6UjVVW2N2+/vvCQxm0zAyoJ/YlV+albivg06L iP3W+NuQgagIcy6ZbztBaM4822HM6deWhiNN8PkEuiugj34z/Pc2PxfBBtJHUVwXlVOq 0CS4W5E3aENok62PDVcEpJpRupoelM0nPlA/IX6Y4han93v4xCGvplhK+r+zVtYBUiNr I/ahWDx/bUEq8f5TpfovpdzPIC1aWPrK1zsezUWgoneWaCdwmbfB7/iQOhiDIXi0XN/O 1dmB0EZK3mT3Ooo2SXKFJPt0XZhgUIPzz6iqKXlhkBWHCXsbnZqHcuTd5Yb6rkOjcdRV Y9Zg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=Ub2ynCDHLnvijoSLHJW2aWQL1n7R5p9vDIqMXp4ZHSk=; b=eKy8wsVn+Ny6XOrufChCnp6GtRCxvZpiyBjKC9m2L5NoP88KCdHxq9oq6Hi/OeqQjN XsCW1+ntSVc4rYwzbHvTnjHWD+FCfXOYSJuewTo/MqFE1XeWO0c3rYlrGiit6l2kfXBX kx3nNb6M/Xnhx+5EKZteCR7D6ivj76CiRSGrvhWvYbhe9XiciFNOZeoowG/gkT3n5k4d 8qk+N9stCK3DDwepp7eNMTyQkV1GHoAhan7Hg1+EgMhJ8uBryUUsXMMu0dMqOUVIDMho 1duaXL3wnj3h2c6695ccZWBrCD330HTGjVN04XVL1sUy6GOUKK5WW4mm8Elw2FKpKv59 450w== X-Gm-Message-State: AOAM531NCBbN/5snx2yB6BOwsOePb4/DHyBWG7Cz9RTCdOLhCd7yiARI AecGCjQjA6XjPsWFPF1IPc3n1jhTPwSPtg== X-Google-Smtp-Source: ABdhPJxo/bnRKZ8bRUsIZBPMSMacNidfGSf7kAM9DSd2X+Fx8taWyXqRpAT9ucVcENSGG87q05Nbcw== X-Received: by 2002:a17:90b:20d1:: with SMTP id ju17mr13299274pjb.134.1600454320765; Fri, 18 Sep 2020 11:38:40 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.38.39 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:38:39 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 36/81] target/arm: Implement SVE2 MATCH, NMATCH Date: Fri, 18 Sep 2020 11:37:06 -0700 Message-Id: <20200918183751.2787647-37-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::633; envelope-from=richard.henderson@linaro.org; helo=mail-pl1-x633.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org, Stephen Long Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" From: Stephen Long Reviewed-by: Richard Henderson Signed-off-by: Stephen Long Message-Id: <20200415145915.2859-1-steplong@quicinc.com> [rth: Expanded comment for do_match2] Signed-off-by: Richard Henderson --- v2: Apply esz_mask to input pg to fix output flags. --- target/arm/helper-sve.h | 10 ++++++ target/arm/sve.decode | 5 +++ target/arm/sve_helper.c | 64 ++++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 22 +++++++++++++ 4 files changed, 101 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 923911ca21..7f53287f0d 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2509,6 +2509,16 @@ DEF_HELPER_FLAGS_3(sve2_uqrshrnt_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve2_uqrshrnt_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve2_uqrshrnt_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_match_ppzz_b, TCG_CALL_NO_RWG, + i32, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_match_ppzz_h, TCG_CALL_NO_RWG, + i32, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_nmatch_ppzz_b, TCG_CALL_NO_RWG, + i32, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_nmatch_ppzz_h, TCG_CALL_NO_RWG, + i32, ptr, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_6(sve2_faddp_zpzz_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_6(sve2_faddp_zpzz_s, TCG_CALL_NO_RWG, diff --git a/target/arm/sve.decode b/target/arm/sve.decode index a50afd40c2..2207693d28 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1320,6 +1320,11 @@ UQSHRNT 01000101 .. 1 ..... 00 1101 ..... ..... @rd_rn_tszimm_shr UQRSHRNB 01000101 .. 1 ..... 00 1110 ..... ..... @rd_rn_tszimm_shr UQRSHRNT 01000101 .. 1 ..... 00 1111 ..... ..... @rd_rn_tszimm_shr +### SVE2 Character Match + +MATCH 01000101 .. 1 ..... 100 ... ..... 0 .... @pd_pg_rn_rm +NMATCH 01000101 .. 1 ..... 100 ... ..... 1 .... @pd_pg_rn_rm + ## SVE2 floating-point pairwise operations FADDP 01100100 .. 010 00 0 100 ... ..... ..... @rdn_pg_rm diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index aa7ff94812..139dea423f 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -6875,3 +6875,67 @@ void HELPER(sve2_nbsl)(void *vd, void *vn, void *vm, void *vk, uint32_t desc) d[i] = ~((n[i] & k[i]) | (m[i] & ~k[i])); } } + +/* + * Returns true if m0 or m1 contains the low uint8_t/uint16_t in n. + * See hasless(v,1) from + * https://graphics.stanford.edu/~seander/bithacks.html#ZeroInWord + */ +static inline bool do_match2(uint64_t n, uint64_t m0, uint64_t m1, int esz) +{ + int bits = 8 << esz; + uint64_t ones = dup_const(esz, 1); + uint64_t signs = ones << (bits - 1); + uint64_t cmp0, cmp1; + + cmp1 = dup_const(esz, n); + cmp0 = cmp1 ^ m0; + cmp1 = cmp1 ^ m1; + cmp0 = (cmp0 - ones) & ~cmp0; + cmp1 = (cmp1 - ones) & ~cmp1; + return (cmp0 | cmp1) & signs; +} + +static inline uint32_t do_match(void *vd, void *vn, void *vm, void *vg, + uint32_t desc, int esz, bool nmatch) +{ + uint16_t esz_mask = pred_esz_masks[esz]; + intptr_t opr_sz = simd_oprsz(desc); + uint32_t flags = PREDTEST_INIT; + intptr_t i, j, k; + + for (i = 0; i < opr_sz; i += 16) { + uint64_t m0 = *(uint64_t *)(vm + i); + uint64_t m1 = *(uint64_t *)(vm + i + 8); + uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)) & esz_mask; + uint16_t out = 0; + + for (j = 0; j < 16; j += 8) { + uint64_t n = *(uint64_t *)(vn + i + j); + + for (k = 0; k < 8; k += 1 << esz) { + if (pg & (1 << (j + k))) { + bool o = do_match2(n >> (k * 8), m0, m1, esz); + out |= (o ^ nmatch) << (j + k); + } + } + } + *(uint16_t *)(vd + H1_2(i >> 3)) = out; + flags = iter_predtest_fwd(out, pg, flags); + } + return flags; +} + +#define DO_PPZZ_MATCH(NAME, ESZ, INV) \ +uint32_t HELPER(NAME)(void *vd, void *vn, void *vm, void *vg, uint32_t desc) \ +{ \ + return do_match(vd, vn, vm, vg, desc, ESZ, INV); \ +} + +DO_PPZZ_MATCH(sve2_match_ppzz_b, MO_8, false) +DO_PPZZ_MATCH(sve2_match_ppzz_h, MO_16, false) + +DO_PPZZ_MATCH(sve2_nmatch_ppzz_b, MO_8, true) +DO_PPZZ_MATCH(sve2_nmatch_ppzz_h, MO_16, true) + +#undef DO_PPZZ_MATCH diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 540e926c18..ffb7f5c8a0 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -7467,6 +7467,28 @@ static bool trans_UQRSHRNT(DisasContext *s, arg_rri_esz *a) return do_sve2_shr_narrow(s, a, ops); } +static bool do_sve2_ppzz_flags(DisasContext *s, arg_rprr_esz *a, + gen_helper_gvec_flags_4 *fn) +{ + if (!dc_isar_feature(aa64_sve2, s)) { + return false; + } + return do_ppzz_flags(s, a, fn); +} + +#define DO_SVE2_PPZZ_MATCH(NAME, name) \ +static bool trans_##NAME(DisasContext *s, arg_rprr_esz *a) \ +{ \ + static gen_helper_gvec_flags_4 * const fns[4] = { \ + gen_helper_sve2_##name##_ppzz_b, gen_helper_sve2_##name##_ppzz_h, \ + NULL, NULL \ + }; \ + return do_sve2_ppzz_flags(s, a, fns[a->esz]); \ +} + +DO_SVE2_PPZZ_MATCH(MATCH, match) +DO_SVE2_PPZZ_MATCH(NMATCH, nmatch) + static bool do_sve2_zpzz_fp(DisasContext *s, arg_rprr_esz *a, gen_helper_gvec_4_ptr *fn) { From patchwork Fri Sep 18 18:37:07 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 273353 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5753EC43464 for ; Fri, 18 Sep 2020 18:57:38 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id E069921534 for ; Fri, 18 Sep 2020 18:57:37 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="NDKEuoPt" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org E069921534 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:47306 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJLZf-0001ro-6Y for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 14:57:36 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48324) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLHU-0000aX-Lr for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:48 -0400 Received: from mail-pj1-x102e.google.com ([2607:f8b0:4864:20::102e]:51683) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLHP-0007AY-Pv for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:48 -0400 Received: by mail-pj1-x102e.google.com with SMTP id a9so3630927pjg.1 for ; Fri, 18 Sep 2020 11:38:43 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=XljpyKkgZ1Ylf/wYnD22Mr/r5+Ws8scArlq6WmrXRKo=; b=NDKEuoPtsJH26YrgKtlWYu2aBgl/oWghlym3CTiYtEzRE6sAMvciLX5BgStmP6+7Fr 75015R+/SgoLjkicC525vOOD6nL0DkF8OgQ9Hq/g0fxXVIFIK7Wp5UqkH024avO7sOLw LI8gkwKgmnK8VaXCZXLS7UHpnoGMtBkcqP1n/trl7c0Q2RycTuAshdxRjcVQByTRjJ2n XZa77erH+QOuWG0+ikIeKi+XQX+io1J4K6vZpD6/fy8jPcO5SgHWbqu8cf3FSqJ2wlAw ezJJRGmtjswtiuD+wa6f1yaE5c292HNpCohQKX5vmn0z2culcht7onsabDArBNAO1Rrs mJGQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=XljpyKkgZ1Ylf/wYnD22Mr/r5+Ws8scArlq6WmrXRKo=; b=RseUZkGTEX8funxN42fLlVEyUvrSLsDrlkL6AfqbF2f2MKbbMQKhrqcXTSGHETl6Ql K8/bRWLcjqv0cEL0LDZEDf7TkM3FQFDRlvs2+h0sm6pQkJbBhDRQ1zrjlaWt2RvWWWZK lWMa3zkCR21QVLliN23siURfnpgQosYa8wCaPAUOYkwv7noB0piPRlMz236Exmd9HBL9 HovamVim0PRLkWtS7yv0PWG7fSug1LPdzMHtPvPYB9aUm21wlWhZrYIuHrmC/P6J1Fip UZnbQbAdlMMD0NttSQddni0WYSIcxHYloGhTVH6oRsGXRKoPQC92zNMyUlTyDFGpxCzh o3yg== X-Gm-Message-State: AOAM5326L1qTTt8+7qPpd40lLk3h/93vWbaEU5HYxmk9NFJDDEqW63PV AmVnZD0HHsYFkH2zN20G7jFJQPsgaGQuQA== X-Google-Smtp-Source: ABdhPJx5Dkm7flcHa0AtHElGPiFpziXmMTZdaARHw79A5fePMxg+nzrjky8YeP+c46n6PSrGE2pEGQ== X-Received: by 2002:a17:90a:7bcf:: with SMTP id d15mr14145279pjl.230.1600454322106; Fri, 18 Sep 2020 11:38:42 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.38.41 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:38:41 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 37/81] target/arm: Implement SVE2 saturating multiply-add long Date: Fri, 18 Sep 2020 11:37:07 -0700 Message-Id: <20200918183751.2787647-38-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::102e; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x102e.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 14 ++++++++++ target/arm/sve.decode | 14 ++++++++++ target/arm/sve_helper.c | 30 +++++++++++++++++++++ target/arm/translate-sve.c | 54 ++++++++++++++++++++++++++++++++++++++ 4 files changed, 112 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 7f53287f0d..ff5f744b9a 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2559,3 +2559,17 @@ DEF_HELPER_FLAGS_5(sve2_bcax, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve2_bsl1n, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve2_bsl2n, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve2_nbsl, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_sqdmlal_zzzw_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqdmlal_zzzw_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqdmlal_zzzw_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_sqdmlsl_zzzw_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqdmlsl_zzzw_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqdmlsl_zzzw_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 2207693d28..d0d24978bb 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1332,3 +1332,17 @@ FMAXNMP 01100100 .. 010 10 0 100 ... ..... ..... @rdn_pg_rm FMINNMP 01100100 .. 010 10 1 100 ... ..... ..... @rdn_pg_rm FMAXP 01100100 .. 010 11 0 100 ... ..... ..... @rdn_pg_rm FMINP 01100100 .. 010 11 1 100 ... ..... ..... @rdn_pg_rm + +#### SVE Integer Multiply-Add (unpredicated) + +## SVE2 saturating multiply-add long + +SQDMLALB_zzzw 01000100 .. 0 ..... 0110 00 ..... ..... @rda_rn_rm +SQDMLALT_zzzw 01000100 .. 0 ..... 0110 01 ..... ..... @rda_rn_rm +SQDMLSLB_zzzw 01000100 .. 0 ..... 0110 10 ..... ..... @rda_rn_rm +SQDMLSLT_zzzw 01000100 .. 0 ..... 0110 11 ..... ..... @rda_rn_rm + +## SVE2 saturating multiply-add interleaved long + +SQDMLALBT 01000100 .. 0 ..... 00001 0 ..... ..... @rda_rn_rm +SQDMLSLBT 01000100 .. 0 ..... 00001 1 ..... ..... @rda_rn_rm diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 139dea423f..9822675892 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1400,6 +1400,36 @@ void HELPER(sve2_adcl_d)(void *vd, void *vn, void *vm, void *va, uint32_t desc) } } +#define DO_SQDMLAL(NAME, TYPEW, TYPEN, HW, HN, DMUL_OP, SUM_OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, void *va, uint32_t desc) \ +{ \ + intptr_t i, opr_sz = simd_oprsz(desc); \ + int sel1 = extract32(desc, SIMD_DATA_SHIFT, 1) * sizeof(TYPEN); \ + int sel2 = extract32(desc, SIMD_DATA_SHIFT + 1, 1) * sizeof(TYPEN); \ + for (i = 0; i < opr_sz; i += sizeof(TYPEW)) { \ + TYPEW nn = *(TYPEN *)(vn + HN(i + sel1)); \ + TYPEW mm = *(TYPEN *)(vm + HN(i + sel2)); \ + TYPEW aa = *(TYPEW *)(va + HW(i)); \ + *(TYPEW *)(vd + HW(i)) = SUM_OP(aa, DMUL_OP(nn, mm)); \ + } \ +} + +DO_SQDMLAL(sve2_sqdmlal_zzzw_h, int16_t, int8_t, H1_2, H1, + do_sqdmull_h, DO_SQADD_H) +DO_SQDMLAL(sve2_sqdmlal_zzzw_s, int32_t, int16_t, H1_4, H1_2, + do_sqdmull_s, DO_SQADD_S) +DO_SQDMLAL(sve2_sqdmlal_zzzw_d, int64_t, int32_t, , H1_4, + do_sqdmull_d, do_sqadd_d) + +DO_SQDMLAL(sve2_sqdmlsl_zzzw_h, int16_t, int8_t, H1_2, H1, + do_sqdmull_h, DO_SQSUB_H) +DO_SQDMLAL(sve2_sqdmlsl_zzzw_s, int32_t, int16_t, H1_4, H1_2, + do_sqdmull_s, DO_SQSUB_S) +DO_SQDMLAL(sve2_sqdmlsl_zzzw_d, int64_t, int32_t, , H1_4, + do_sqdmull_d, do_sqsub_d) + +#undef DO_SQDMLAL + #define DO_BITPERM(NAME, TYPE, OP) \ void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ { \ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index ffb7f5c8a0..5da4793d49 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -7513,3 +7513,57 @@ DO_SVE2_ZPZZ_FP(FMAXNMP, fmaxnmp) DO_SVE2_ZPZZ_FP(FMINNMP, fminnmp) DO_SVE2_ZPZZ_FP(FMAXP, fmaxp) DO_SVE2_ZPZZ_FP(FMINP, fminp) + +/* + * SVE Integer Multiply-Add (unpredicated) + */ + +static bool do_sqdmlal_zzzw(DisasContext *s, arg_rrrr_esz *a, + bool sel1, bool sel2) +{ + static gen_helper_gvec_4 * const fns[] = { + NULL, gen_helper_sve2_sqdmlal_zzzw_h, + gen_helper_sve2_sqdmlal_zzzw_s, gen_helper_sve2_sqdmlal_zzzw_d, + }; + return do_sve2_zzzz_ool(s, a, fns[a->esz], (sel2 << 1) | sel1); +} + +static bool do_sqdmlsl_zzzw(DisasContext *s, arg_rrrr_esz *a, + bool sel1, bool sel2) +{ + static gen_helper_gvec_4 * const fns[] = { + NULL, gen_helper_sve2_sqdmlsl_zzzw_h, + gen_helper_sve2_sqdmlsl_zzzw_s, gen_helper_sve2_sqdmlsl_zzzw_d, + }; + return do_sve2_zzzz_ool(s, a, fns[a->esz], (sel2 << 1) | sel1); +} + +static bool trans_SQDMLALB_zzzw(DisasContext *s, arg_rrrr_esz *a) +{ + return do_sqdmlal_zzzw(s, a, false, false); +} + +static bool trans_SQDMLALT_zzzw(DisasContext *s, arg_rrrr_esz *a) +{ + return do_sqdmlal_zzzw(s, a, true, true); +} + +static bool trans_SQDMLALBT(DisasContext *s, arg_rrrr_esz *a) +{ + return do_sqdmlal_zzzw(s, a, false, true); +} + +static bool trans_SQDMLSLB_zzzw(DisasContext *s, arg_rrrr_esz *a) +{ + return do_sqdmlsl_zzzw(s, a, false, false); +} + +static bool trans_SQDMLSLT_zzzw(DisasContext *s, arg_rrrr_esz *a) +{ + return do_sqdmlsl_zzzw(s, a, true, true); +} + +static bool trans_SQDMLSLBT(DisasContext *s, arg_rrrr_esz *a) +{ + return do_sqdmlsl_zzzw(s, a, false, true); +} From patchwork Fri Sep 18 18:37:08 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 305010 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2ABF1C43465 for ; Fri, 18 Sep 2020 19:24:09 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 914EC22208 for ; Fri, 18 Sep 2020 19:24:08 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="lClPEEAx" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 914EC22208 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:33730 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJLzL-0007BL-Md for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 15:24:07 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48370) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLHW-0000fi-Uk for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:50 -0400 Received: from mail-pl1-x636.google.com ([2607:f8b0:4864:20::636]:39742) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLHT-0007Ao-Rp for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:50 -0400 Received: by mail-pl1-x636.google.com with SMTP id x18so3432403pll.6 for ; Fri, 18 Sep 2020 11:38:44 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=8aLDon3GmukeA3iJksKFpczvbGJR/eapy4X1HQ+inSU=; b=lClPEEAxiTnDHMOjZh1Q6I9wsw9pyPIw6y2tkxAa/vDLlV4jZOyjt5kQFFmaVM5SS+ EPbMaM2uFtP5xXYcs1UwKoJo1MWIpPx/B2KOFIE4IFkhyvwTVX7lY9Rj4VSzkZkt2KII H6ckR8eW5tspFrdE84bYB8ldCm4fWfDm0FpJkdIBxooxXOH5krUCqKP2Icx/h0TLlWci r2zWZ/Px2eFpOYGoY9fRmDEq+aqeGlHkfOHTKXcH2XJ3/wr+R5dKcw0VxhWRl6NL2l3x E0ubG8qol+wLZpGJhuNAbMg+bwlcRY/1UTxnSCz5lu5VhZIDo0zcRcVg1qnYz77JA2Qo dZJQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=8aLDon3GmukeA3iJksKFpczvbGJR/eapy4X1HQ+inSU=; b=N4X4C+7K11PJLKUZD8QBmHZc67JAUfJw3CjUclfVAu1V7QIcxOBY1AHLQyAROXy1s2 eOOzKhkkqp1TjANy7XSxG5oXkz65/cLc/s2oHo+zJwNN77nE/J2g2LfvOIDMJ+t+y8/i PVi+R5t+8bL6iHz2DYq/5QXszg0u1Q7F4I2zAqQNE5i20QZ8ggBkbZzC/ECrbupsp0ro 11yXOQx+5Dg1e4Hd9LkLz7phNsQbdgPXVmhXxfsW8wpg6hq+GRnOOzicljeGpNX/crWU lwJTgT0wVXuPmwSLnhFkoNxUvKsadUqqlPmHHWz62owBxTB3lS4EC1xYybz/FBZqJG8R cUtQ== X-Gm-Message-State: AOAM5328pER6Jmj10Dv214pBdoElxviuIREBEdtmAGQx15vbFiVxGKsz 2RIWJvCClcCRZci2v2b3FamPdPRyvvwN3g== X-Google-Smtp-Source: ABdhPJxDkOjB59Re1/rEGqJPoL8DbdhBtEWl3jwvDyG1qYum4c2U6LaifA92n1CbtkGfQVNAIQwa5w== X-Received: by 2002:a17:90a:c501:: with SMTP id k1mr12828189pjt.170.1600454323241; Fri, 18 Sep 2020 11:38:43 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.38.42 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:38:42 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 38/81] target/arm: Implement SVE2 saturating multiply-add high Date: Fri, 18 Sep 2020 11:37:08 -0700 Message-Id: <20200918183751.2787647-39-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::636; envelope-from=richard.henderson@linaro.org; helo=mail-pl1-x636.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" SVE2 has two additional sizes of the operation and unlike NEON, there is no saturation flag. Create new entry points for SVE2 that do not set QC. Signed-off-by: Richard Henderson --- target/arm/helper.h | 17 ++++ target/arm/sve.decode | 5 ++ target/arm/translate-sve.c | 18 +++++ target/arm/vec_helper.c | 161 +++++++++++++++++++++++++++++++++++-- 4 files changed, 195 insertions(+), 6 deletions(-) diff --git a/target/arm/helper.h b/target/arm/helper.h index 94658169b2..c8eb165989 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -578,6 +578,23 @@ DEF_HELPER_FLAGS_5(gvec_qrdmlah_s32, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(gvec_qrdmlsh_s32, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqrdmlah_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqrdmlsh_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqrdmlah_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqrdmlsh_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqrdmlah_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqrdmlsh_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqrdmlah_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqrdmlsh_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_4(gvec_sdot_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(gvec_udot_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(gvec_sdot_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index d0d24978bb..177b3cc803 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1346,3 +1346,8 @@ SQDMLSLT_zzzw 01000100 .. 0 ..... 0110 11 ..... ..... @rda_rn_rm SQDMLALBT 01000100 .. 0 ..... 00001 0 ..... ..... @rda_rn_rm SQDMLSLBT 01000100 .. 0 ..... 00001 1 ..... ..... @rda_rn_rm + +## SVE2 saturating multiply-add high + +SQRDMLAH_zzzz 01000100 .. 0 ..... 01110 0 ..... ..... @rda_rn_rm +SQRDMLSH_zzzz 01000100 .. 0 ..... 01110 1 ..... ..... @rda_rn_rm diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 5da4793d49..1f75489868 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -7567,3 +7567,21 @@ static bool trans_SQDMLSLBT(DisasContext *s, arg_rrrr_esz *a) { return do_sqdmlsl_zzzw(s, a, false, true); } + +static bool trans_SQRDMLAH_zzzz(DisasContext *s, arg_rrrr_esz *a) +{ + static gen_helper_gvec_4 * const fns[] = { + gen_helper_sve2_sqrdmlah_b, gen_helper_sve2_sqrdmlah_h, + gen_helper_sve2_sqrdmlah_s, gen_helper_sve2_sqrdmlah_d, + }; + return do_sve2_zzzz_ool(s, a, fns[a->esz], 0); +} + +static bool trans_SQRDMLSH_zzzz(DisasContext *s, arg_rrrr_esz *a) +{ + static gen_helper_gvec_4 * const fns[] = { + gen_helper_sve2_sqrdmlsh_b, gen_helper_sve2_sqrdmlsh_h, + gen_helper_sve2_sqrdmlsh_s, gen_helper_sve2_sqrdmlsh_d, + }; + return do_sve2_zzzz_ool(s, a, fns[a->esz], 0); +} diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index dcaf3926e1..b40d9688c5 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -22,6 +22,7 @@ #include "exec/helper-proto.h" #include "tcg/tcg-gvec-desc.h" #include "fpu/softfloat.h" +#include "qemu/int128.h" #include "vec_internal.h" /* Note that vector data is stored in host-endian 64-bit chunks, @@ -36,15 +37,55 @@ #define H4(x) (x) #endif +/* Signed saturating rounding doubling multiply-accumulate high half, 8-bit */ +static int8_t do_sqrdmlah_b(int8_t src1, int8_t src2, int8_t src3, + bool neg, bool round) +{ + /* + * Simplify: + * = ((a3 << 8) + ((e1 * e2) << 1) + (round << 7)) >> 8 + * = ((a3 << 7) + (e1 * e2) + (round << 6)) >> 7 + */ + int32_t ret = (int32_t)src1 * src2; + if (neg) { + ret = -ret; + } + ret += ((int32_t)src3 << 7) + (round << 6); + ret >>= 7; + + if (ret != (int8_t)ret) { + ret = (ret < 0 ? INT8_MIN : INT8_MAX); + } + return ret; +} + +void HELPER(sve2_sqrdmlah_b)(void *vd, void *vn, void *vm, + void *va, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + int8_t *d = vd, *n = vn, *m = vm, *a = va; + + for (i = 0; i < opr_sz; ++i) { + d[i] = do_sqrdmlah_b(n[i], m[i], a[i], false, true); + } +} + +void HELPER(sve2_sqrdmlsh_b)(void *vd, void *vn, void *vm, + void *va, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + int8_t *d = vd, *n = vn, *m = vm, *a = va; + + for (i = 0; i < opr_sz; ++i) { + d[i] = do_sqrdmlah_b(n[i], m[i], a[i], true, true); + } +} + /* Signed saturating rounding doubling multiply-accumulate high half, 16-bit */ static int16_t do_sqrdmlah_h(int16_t src1, int16_t src2, int16_t src3, bool neg, bool round, uint32_t *sat) { - /* - * Simplify: - * = ((a3 << 16) + ((e1 * e2) << 1) + (1 << 15)) >> 16 - * = ((a3 << 15) + (e1 * e2) + (1 << 14)) >> 15 - */ + /* Simplify similarly to do_sqrdmlah_b above. */ int32_t ret = (int32_t)src1 * src2; if (neg) { ret = -ret; @@ -133,11 +174,35 @@ void HELPER(neon_sqrdmulh_h)(void *vd, void *vn, void *vm, clear_tail(d, opr_sz, simd_maxsz(desc)); } +void HELPER(sve2_sqrdmlah_h)(void *vd, void *vn, void *vm, + void *va, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + int16_t *d = vd, *n = vn, *m = vm, *a = va; + uint32_t discard; + + for (i = 0; i < opr_sz / 2; ++i) { + d[i] = do_sqrdmlah_h(n[i], m[i], a[i], false, true, &discard); + } +} + +void HELPER(sve2_sqrdmlsh_h)(void *vd, void *vn, void *vm, + void *va, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + int16_t *d = vd, *n = vn, *m = vm, *a = va; + uint32_t discard; + + for (i = 0; i < opr_sz / 2; ++i) { + d[i] = do_sqrdmlah_h(n[i], m[i], a[i], true, true, &discard); + } +} + /* Signed saturating rounding doubling multiply-accumulate high half, 32-bit */ static int32_t do_sqrdmlah_s(int32_t src1, int32_t src2, int32_t src3, bool neg, bool round, uint32_t *sat) { - /* Simplify similarly to int_qrdmlah_s16 above. */ + /* Simplify similarly to do_sqrdmlah_b above. */ int64_t ret = (int64_t)src1 * src2; if (neg) { ret = -ret; @@ -220,6 +285,90 @@ void HELPER(neon_sqrdmulh_s)(void *vd, void *vn, void *vm, clear_tail(d, opr_sz, simd_maxsz(desc)); } +void HELPER(sve2_sqrdmlah_s)(void *vd, void *vn, void *vm, + void *va, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + int32_t *d = vd, *n = vn, *m = vm, *a = va; + uint32_t discard; + + for (i = 0; i < opr_sz / 4; ++i) { + d[i] = do_sqrdmlah_s(n[i], m[i], a[i], false, true, &discard); + } +} + +void HELPER(sve2_sqrdmlsh_s)(void *vd, void *vn, void *vm, + void *va, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + int32_t *d = vd, *n = vn, *m = vm, *a = va; + uint32_t discard; + + for (i = 0; i < opr_sz / 4; ++i) { + d[i] = do_sqrdmlah_s(n[i], m[i], a[i], true, true, &discard); + } +} + +/* Signed saturating rounding doubling multiply-accumulate high half, 64-bit */ +static int64_t do_sat128_d(Int128 r) +{ + int64_t ls = int128_getlo(r); + int64_t hs = int128_gethi(r); + + if (unlikely(hs != (ls >> 63))) { + return hs < 0 ? INT64_MIN : INT64_MAX; + } + return ls; +} + +static int64_t do_sqrdmlah_d(int64_t n, int64_t m, int64_t a, + bool neg, bool round) +{ + uint64_t l, h; + Int128 r, t; + + /* As in do_sqrdmlah_b, but with 128-bit arithmetic. */ + muls64(&l, &h, m, n); + r = int128_make128(l, h); + if (neg) { + r = int128_neg(r); + } + if (a) { + t = int128_exts64(a); + t = int128_lshift(t, 63); + r = int128_add(r, t); + } + if (round) { + t = int128_exts64(1ll << 62); + r = int128_add(r, t); + } + r = int128_rshift(r, 63); + + return do_sat128_d(r); +} + +void HELPER(sve2_sqrdmlah_d)(void *vd, void *vn, void *vm, + void *va, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + int64_t *d = vd, *n = vn, *m = vm, *a = va; + + for (i = 0; i < opr_sz / 8; ++i) { + d[i] = do_sqrdmlah_d(n[i], m[i], a[i], false, true); + } +} + +void HELPER(sve2_sqrdmlsh_d)(void *vd, void *vn, void *vm, + void *va, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + int64_t *d = vd, *n = vn, *m = vm, *a = va; + + for (i = 0; i < opr_sz / 8; ++i) { + d[i] = do_sqrdmlah_d(n[i], m[i], a[i], true, true); + } +} + /* Integer 8 and 16-bit dot-product. * * Note that for the loops herein, host endianness does not matter From patchwork Fri Sep 18 18:37:09 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 273344 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id CA5D1C43463 for ; Fri, 18 Sep 2020 19:12:20 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 5C0DF21707 for ; Fri, 18 Sep 2020 19:12:20 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="ujCo+kbI" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 5C0DF21707 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:55490 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJLnv-00019l-C7 for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 15:12:19 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48356) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLHV-0000d7-TX for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:49 -0400 Received: from mail-pf1-x431.google.com ([2607:f8b0:4864:20::431]:46072) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLHT-0007At-Ry for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:49 -0400 Received: by mail-pf1-x431.google.com with SMTP id k15so3976014pfc.12 for ; Fri, 18 Sep 2020 11:38:45 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=4peHVuz9622WQuT+i8s+DfMlyUVZYgJimB9LhhpBvXQ=; b=ujCo+kbICynZonlzYO4SF+937gpplZxNmAjGTYwb/5vUnKpVFiRzGu5cTlZTSwvPst O8adzV4iQ6xmiK7tH4e3UDV6VvVhCxabDS/ROESuZElc0hs2Xs31+ghku03nEgfQ2R2R hY82zSBqjf1h0TkKVJDIo3EN6bvUQ2Vz9AsthIWd5qmf9dM+ti5ChZgtZAVUqxNj3KTH osm0jelR62El29PnJk5X+s4y8aOPSYwJVABwSBd4KjCYF3JF/5AhMAunU0YVD6EnnMfT 8f6uyG8r8rhpNhWhuziA3ZbJg/klmhhCxkM169GrCwz8iLdu0g9mhlyV9cUquAxISLFT tVRA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=4peHVuz9622WQuT+i8s+DfMlyUVZYgJimB9LhhpBvXQ=; b=n4tilmJtY0WDYhOlgwoF4cljHpOktp7JeaZ8Dp0iMOsoYWtRTO1CPn5gN99UknqNg+ wWnaIBRpRbgpGv0NzpT2dENqH+y8PiRUIgrgkVbUePigUXuGIVZ3ehzRJcsUIcguF6xh 4jkYd1P3Y+HBe3SgWMVx6IdIdhizjvUMA+xuh9X3ufJz1jbMljJJBHKrmUP0RS1GyUZR ZbFKmAI5sKtF7Du8wlUEZz3beZPHcvnc8IUwwQ6R7+FZ4Y33oHHcYjVEmAlv3NK2QZnX FoPDGVukYR9J7WF16+5Aw7qQoFRZUrvldSwMZQEOr0/FBgTm7K1n0QrEgvAiEXovRe3q 23ww== X-Gm-Message-State: AOAM533iFMH3FSMWR5R7qxCme5WMgU4UVAOGzapY/jVSFIOwuabft/j3 0kWvtC9jI3CR2eWOQKUJgbjqsQTtn5DvTA== X-Google-Smtp-Source: ABdhPJwe7gL0x0fEwb0nYPABWWb3WcZtm43L/CD5mHIC2fCtBYwVKM90LDmRldxWEiZuNQJgTFB/fQ== X-Received: by 2002:a63:a08:: with SMTP id 8mr25408542pgk.300.1600454324638; Fri, 18 Sep 2020 11:38:44 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.38.43 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:38:43 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 39/81] target/arm: Implement SVE2 integer multiply-add long Date: Fri, 18 Sep 2020 11:37:09 -0700 Message-Id: <20200918183751.2787647-40-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::431; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x431.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 28 ++++++++++++++ target/arm/sve.decode | 11 ++++++ target/arm/sve_helper.c | 18 +++++++++ target/arm/translate-sve.c | 76 ++++++++++++++++++++++++++++++++++++++ 4 files changed, 133 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index ff5f744b9a..3054215e9d 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2573,3 +2573,31 @@ DEF_HELPER_FLAGS_5(sve2_sqdmlsl_zzzw_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve2_sqdmlsl_zzzw_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_smlal_zzzw_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_smlal_zzzw_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_smlal_zzzw_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_umlal_zzzw_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_umlal_zzzw_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_umlal_zzzw_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_smlsl_zzzw_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_smlsl_zzzw_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_smlsl_zzzw_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_umlsl_zzzw_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_umlsl_zzzw_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_umlsl_zzzw_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 177b3cc803..19c5013ddd 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1351,3 +1351,14 @@ SQDMLSLBT 01000100 .. 0 ..... 00001 1 ..... ..... @rda_rn_rm SQRDMLAH_zzzz 01000100 .. 0 ..... 01110 0 ..... ..... @rda_rn_rm SQRDMLSH_zzzz 01000100 .. 0 ..... 01110 1 ..... ..... @rda_rn_rm + +## SVE2 integer multiply-add long + +SMLALB_zzzw 01000100 .. 0 ..... 010 000 ..... ..... @rda_rn_rm +SMLALT_zzzw 01000100 .. 0 ..... 010 001 ..... ..... @rda_rn_rm +UMLALB_zzzw 01000100 .. 0 ..... 010 010 ..... ..... @rda_rn_rm +UMLALT_zzzw 01000100 .. 0 ..... 010 011 ..... ..... @rda_rn_rm +SMLSLB_zzzw 01000100 .. 0 ..... 010 100 ..... ..... @rda_rn_rm +SMLSLT_zzzw 01000100 .. 0 ..... 010 101 ..... ..... @rda_rn_rm +UMLSLB_zzzw 01000100 .. 0 ..... 010 110 ..... ..... @rda_rn_rm +UMLSLT_zzzw 01000100 .. 0 ..... 010 111 ..... ..... @rda_rn_rm diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 9822675892..76e8db5dd3 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1308,6 +1308,24 @@ DO_ZZZW_ACC(sve2_uabal_h, uint16_t, uint8_t, H1_2, H1, DO_ABD) DO_ZZZW_ACC(sve2_uabal_s, uint32_t, uint16_t, H1_4, H1_2, DO_ABD) DO_ZZZW_ACC(sve2_uabal_d, uint64_t, uint32_t, , H1_4, DO_ABD) +DO_ZZZW_ACC(sve2_smlal_zzzw_h, int16_t, int8_t, H1_2, H1, DO_MUL) +DO_ZZZW_ACC(sve2_smlal_zzzw_s, int32_t, int16_t, H1_4, H1_2, DO_MUL) +DO_ZZZW_ACC(sve2_smlal_zzzw_d, int64_t, int32_t, , H1_4, DO_MUL) + +DO_ZZZW_ACC(sve2_umlal_zzzw_h, uint16_t, uint8_t, H1_2, H1, DO_MUL) +DO_ZZZW_ACC(sve2_umlal_zzzw_s, uint32_t, uint16_t, H1_4, H1_2, DO_MUL) +DO_ZZZW_ACC(sve2_umlal_zzzw_d, uint64_t, uint32_t, , H1_4, DO_MUL) + +#define DO_NMUL(N, M) -(N * M) + +DO_ZZZW_ACC(sve2_smlsl_zzzw_h, int16_t, int8_t, H1_2, H1, DO_NMUL) +DO_ZZZW_ACC(sve2_smlsl_zzzw_s, int32_t, int16_t, H1_4, H1_2, DO_NMUL) +DO_ZZZW_ACC(sve2_smlsl_zzzw_d, int64_t, int32_t, , H1_4, DO_NMUL) + +DO_ZZZW_ACC(sve2_umlsl_zzzw_h, uint16_t, uint8_t, H1_2, H1, DO_NMUL) +DO_ZZZW_ACC(sve2_umlsl_zzzw_s, uint32_t, uint16_t, H1_4, H1_2, DO_NMUL) +DO_ZZZW_ACC(sve2_umlsl_zzzw_d, uint64_t, uint32_t, , H1_4, DO_NMUL) + #undef DO_ZZZW_ACC #define DO_XTNB(NAME, TYPE, OP) \ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 1f75489868..9fdf3eb9ef 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -7585,3 +7585,79 @@ static bool trans_SQRDMLSH_zzzz(DisasContext *s, arg_rrrr_esz *a) }; return do_sve2_zzzz_ool(s, a, fns[a->esz], 0); } + +static bool do_smlal_zzzw(DisasContext *s, arg_rrrr_esz *a, bool sel) +{ + static gen_helper_gvec_4 * const fns[] = { + NULL, gen_helper_sve2_smlal_zzzw_h, + gen_helper_sve2_smlal_zzzw_s, gen_helper_sve2_smlal_zzzw_d, + }; + return do_sve2_zzzz_ool(s, a, fns[a->esz], sel); +} + +static bool trans_SMLALB_zzzw(DisasContext *s, arg_rrrr_esz *a) +{ + return do_smlal_zzzw(s, a, false); +} + +static bool trans_SMLALT_zzzw(DisasContext *s, arg_rrrr_esz *a) +{ + return do_smlal_zzzw(s, a, true); +} + +static bool do_umlal_zzzw(DisasContext *s, arg_rrrr_esz *a, bool sel) +{ + static gen_helper_gvec_4 * const fns[] = { + NULL, gen_helper_sve2_umlal_zzzw_h, + gen_helper_sve2_umlal_zzzw_s, gen_helper_sve2_umlal_zzzw_d, + }; + return do_sve2_zzzz_ool(s, a, fns[a->esz], sel); +} + +static bool trans_UMLALB_zzzw(DisasContext *s, arg_rrrr_esz *a) +{ + return do_umlal_zzzw(s, a, false); +} + +static bool trans_UMLALT_zzzw(DisasContext *s, arg_rrrr_esz *a) +{ + return do_umlal_zzzw(s, a, true); +} + +static bool do_smlsl_zzzw(DisasContext *s, arg_rrrr_esz *a, bool sel) +{ + static gen_helper_gvec_4 * const fns[] = { + NULL, gen_helper_sve2_smlsl_zzzw_h, + gen_helper_sve2_smlsl_zzzw_s, gen_helper_sve2_smlsl_zzzw_d, + }; + return do_sve2_zzzz_ool(s, a, fns[a->esz], sel); +} + +static bool trans_SMLSLB_zzzw(DisasContext *s, arg_rrrr_esz *a) +{ + return do_smlsl_zzzw(s, a, false); +} + +static bool trans_SMLSLT_zzzw(DisasContext *s, arg_rrrr_esz *a) +{ + return do_smlsl_zzzw(s, a, true); +} + +static bool do_umlsl_zzzw(DisasContext *s, arg_rrrr_esz *a, bool sel) +{ + static gen_helper_gvec_4 * const fns[] = { + NULL, gen_helper_sve2_umlsl_zzzw_h, + gen_helper_sve2_umlsl_zzzw_s, gen_helper_sve2_umlsl_zzzw_d, + }; + return do_sve2_zzzz_ool(s, a, fns[a->esz], sel); +} + +static bool trans_UMLSLB_zzzw(DisasContext *s, arg_rrrr_esz *a) +{ + return do_umlsl_zzzw(s, a, false); +} + +static bool trans_UMLSLT_zzzw(DisasContext *s, arg_rrrr_esz *a) +{ + return do_umlsl_zzzw(s, a, true); +} From patchwork Fri Sep 18 18:37:10 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 305012 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 23840C43463 for ; Fri, 18 Sep 2020 19:22:00 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id BBE7621D7F for ; Fri, 18 Sep 2020 19:21:59 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="jxQJ6DNW" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org BBE7621D7F Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:53588 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJLxG-0003nx-Qg for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 15:21:58 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48362) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLHW-0000du-7d for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:50 -0400 Received: from mail-pl1-x635.google.com ([2607:f8b0:4864:20::635]:46808) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLHU-0007Ay-3A for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:49 -0400 Received: by mail-pl1-x635.google.com with SMTP id f1so3411883plo.13 for ; Fri, 18 Sep 2020 11:38:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=F5G0SRC5khoKx1rLfVfs5va5dW2xBloL5lgBwx7Drcc=; b=jxQJ6DNWsarN+SapeflJJIHFbei8ydHblT2IUVhO0QG8CzXEdHHkyWIqoqCj/U5L85 cAWifIfHR9biJ/XTuL16twhax3sgcXZ5KDFc/uuIN3/xKyf0jr+9NYHKDblda8B9raZ2 JwyaOebBCjHdX5Wj8K/F85EFrpl3rGG0aitB7PHMX5fqIBW4Oaw7E7MAvjkSLFyVlHcP hYvG9UTLfAz+aZ5BhmVVy7jmOhDeWxcQ2ehu9BWNJlsqYMvjHOoJonKh8akWC6vJvF6n 1TAJQIvWvRyflHGJOlF7VBdPw3Wa/t/hiChaieeLFfBmLKNtKr4YQlqk4LGkArdg2tS9 zmig== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=F5G0SRC5khoKx1rLfVfs5va5dW2xBloL5lgBwx7Drcc=; b=Fultn7FUBJ9QaRzbtZ/151jCvKKX6d+bIaCFnOW1IV1L7ezb7bP6fXOJkVwOyOP3yF h1vN7DZy8EiD5+t2/HWILfVN4JUEzAgRUxNZd1dNtmlEZSFF9vCpdLueH1X/viH5uVP2 DNAtRtxKGkp0FAw5ZaaH0j8J5vAZX0QuGGccPB+gUybzZekUpdt0LMXlNrVuklXCyTf0 KHYBe14ughzZmI0WZyr50RwfijQjIdVelczc8cT4Y82K9Dy7y468hzhMgzmP3sP28VJ3 Gbid1jgxpf8pYVvrDWCIwPZZSvgBUv03IY0KO3mKGOEbVdlcLXWt5Ievfdxc+7q/bmdG SzCw== X-Gm-Message-State: AOAM533eEaaItyfM2p0cxcfuLvYCEmZbPyQCdFWabAelN20wxf+k9P1J kPP85mym5zY5riqr5ZRFPd3LVTJ718cIjw== X-Google-Smtp-Source: ABdhPJwODMpMW8LukyJSszInO8yF+YCQ/PHComzGsncAF7xaVBGxx8cVqJ7sDVHnIICPNoGE8XUK7w== X-Received: by 2002:a17:90a:d315:: with SMTP id p21mr14802846pju.88.1600454326167; Fri, 18 Sep 2020 11:38:46 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.38.45 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:38:45 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 40/81] target/arm: Implement SVE2 complex integer multiply-add Date: Fri, 18 Sep 2020 11:37:10 -0700 Message-Id: <20200918183751.2787647-41-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::635; envelope-from=richard.henderson@linaro.org; helo=mail-pl1-x635.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- v2: Fix do_sqrdmlah_d (laurent desnogues) --- target/arm/helper-sve.h | 18 ++++++++++++++++ target/arm/vec_internal.h | 5 +++++ target/arm/sve.decode | 5 +++++ target/arm/sve_helper.c | 42 ++++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 32 +++++++++++++++++++++++++++++ target/arm/vec_helper.c | 15 +++++++------- 6 files changed, 109 insertions(+), 8 deletions(-) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 3054215e9d..4cd43d3ecc 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2601,3 +2601,21 @@ DEF_HELPER_FLAGS_5(sve2_umlsl_zzzw_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve2_umlsl_zzzw_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_cmla_zzzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_cmla_zzzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_cmla_zzzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_cmla_zzzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_sqrdcmlah_zzzz_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqrdcmlah_zzzz_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqrdcmlah_zzzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqrdcmlah_zzzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/vec_internal.h b/target/arm/vec_internal.h index 372fe76523..38ce31b4ca 100644 --- a/target/arm/vec_internal.h +++ b/target/arm/vec_internal.h @@ -168,4 +168,9 @@ static inline int64_t do_suqrshl_d(int64_t src, int64_t shift, return do_uqrshl_d(src, shift, round, sat); } +int8_t do_sqrdmlah_b(int8_t, int8_t, int8_t, bool, bool); +int16_t do_sqrdmlah_h(int16_t, int16_t, int16_t, bool, bool, uint32_t *); +int32_t do_sqrdmlah_s(int32_t, int32_t, int32_t, bool, bool, uint32_t *); +int64_t do_sqrdmlah_d(int64_t, int64_t, int64_t, bool, bool); + #endif /* TARGET_ARM_VEC_INTERNALS_H */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 19c5013ddd..a03d6107da 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1362,3 +1362,8 @@ SMLSLB_zzzw 01000100 .. 0 ..... 010 100 ..... ..... @rda_rn_rm SMLSLT_zzzw 01000100 .. 0 ..... 010 101 ..... ..... @rda_rn_rm UMLSLB_zzzw 01000100 .. 0 ..... 010 110 ..... ..... @rda_rn_rm UMLSLT_zzzw 01000100 .. 0 ..... 010 111 ..... ..... @rda_rn_rm + +## SVE2 complex integer multiply-add + +CMLA_zzzz 01000100 esz:2 0 rm:5 0010 rot:2 rn:5 rd:5 ra=%reg_movprfx +SQRDCMLAH_zzzz 01000100 esz:2 0 rm:5 0011 rot:2 rn:5 rd:5 ra=%reg_movprfx diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 76e8db5dd3..10e3bd8415 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1448,6 +1448,48 @@ DO_SQDMLAL(sve2_sqdmlsl_zzzw_d, int64_t, int32_t, , H1_4, #undef DO_SQDMLAL +#define DO_CMLA(NAME, TYPE, H, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, void *va, uint32_t desc) \ +{ \ + intptr_t i, opr_sz = simd_oprsz(desc) / sizeof(TYPE); \ + int rot = simd_data(desc); \ + int sel_a = rot & 1, sel_b = sel_a ^ 1; \ + bool sub_r = rot == 1 || rot == 2; \ + bool sub_i = rot >= 2; \ + TYPE *d = vd, *n = vn, *m = vm, *a = va; \ + for (i = 0; i < opr_sz; i += 2) { \ + TYPE elt1_a = n[H(i + sel_a)]; \ + TYPE elt2_a = m[H(i + sel_a)]; \ + TYPE elt2_b = m[H(i + sel_b)]; \ + d[H(i)] = OP(elt1_a, elt2_a, a[H(i)], sub_r); \ + d[H(i + 1)] = OP(elt1_a, elt2_b, a[H(i + 1)], sub_i); \ + } \ +} + +#define do_cmla(N, M, A, S) (A + (N * M) * (S ? -1 : 1)) + +DO_CMLA(sve2_cmla_zzzz_b, uint8_t, H1, do_cmla) +DO_CMLA(sve2_cmla_zzzz_h, uint16_t, H2, do_cmla) +DO_CMLA(sve2_cmla_zzzz_s, uint32_t, H4, do_cmla) +DO_CMLA(sve2_cmla_zzzz_d, uint64_t, , do_cmla) + +#define DO_SQRDMLAH_B(N, M, A, S) \ + do_sqrdmlah_b(N, M, A, S, true) +#define DO_SQRDMLAH_H(N, M, A, S) \ + ({ uint32_t discard; do_sqrdmlah_h(N, M, A, S, true, &discard); }) +#define DO_SQRDMLAH_S(N, M, A, S) \ + ({ uint32_t discard; do_sqrdmlah_s(N, M, A, S, true, &discard); }) +#define DO_SQRDMLAH_D(N, M, A, S) \ + do_sqrdmlah_d(N, M, A, S, true) + +DO_CMLA(sve2_sqrdcmlah_zzzz_b, int8_t, H1, DO_SQRDMLAH_B) +DO_CMLA(sve2_sqrdcmlah_zzzz_h, int16_t, H2, DO_SQRDMLAH_H) +DO_CMLA(sve2_sqrdcmlah_zzzz_s, int32_t, H4, DO_SQRDMLAH_S) +DO_CMLA(sve2_sqrdcmlah_zzzz_d, int64_t, , DO_SQRDMLAH_D) + +#undef do_cmla +#undef DO_CMLA + #define DO_BITPERM(NAME, TYPE, OP) \ void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ { \ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 9fdf3eb9ef..5f4d879193 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -7661,3 +7661,35 @@ static bool trans_UMLSLT_zzzw(DisasContext *s, arg_rrrr_esz *a) { return do_umlsl_zzzw(s, a, true); } + +static bool trans_CMLA_zzzz(DisasContext *s, arg_CMLA_zzzz *a) +{ + static gen_helper_gvec_4 * const fns[] = { + gen_helper_sve2_cmla_zzzz_b, gen_helper_sve2_cmla_zzzz_h, + gen_helper_sve2_cmla_zzzz_s, gen_helper_sve2_cmla_zzzz_d, + }; + + if (!dc_isar_feature(aa64_sve2, s)) { + return false; + } + if (sve_access_check(s)) { + gen_gvec_ool_zzzz(s, fns[a->esz], a->rd, a->rn, a->rm, a->ra, a->rot); + } + return true; +} + +static bool trans_SQRDCMLAH_zzzz(DisasContext *s, arg_SQRDCMLAH_zzzz *a) +{ + static gen_helper_gvec_4 * const fns[] = { + gen_helper_sve2_sqrdcmlah_zzzz_b, gen_helper_sve2_sqrdcmlah_zzzz_h, + gen_helper_sve2_sqrdcmlah_zzzz_s, gen_helper_sve2_sqrdcmlah_zzzz_d, + }; + + if (!dc_isar_feature(aa64_sve2, s)) { + return false; + } + if (sve_access_check(s)) { + gen_gvec_ool_zzzz(s, fns[a->esz], a->rd, a->rn, a->rm, a->ra, a->rot); + } + return true; +} diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index b40d9688c5..ed33e1ee36 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -38,8 +38,8 @@ #endif /* Signed saturating rounding doubling multiply-accumulate high half, 8-bit */ -static int8_t do_sqrdmlah_b(int8_t src1, int8_t src2, int8_t src3, - bool neg, bool round) +int8_t do_sqrdmlah_b(int8_t src1, int8_t src2, int8_t src3, + bool neg, bool round) { /* * Simplify: @@ -82,8 +82,8 @@ void HELPER(sve2_sqrdmlsh_b)(void *vd, void *vn, void *vm, } /* Signed saturating rounding doubling multiply-accumulate high half, 16-bit */ -static int16_t do_sqrdmlah_h(int16_t src1, int16_t src2, int16_t src3, - bool neg, bool round, uint32_t *sat) +int16_t do_sqrdmlah_h(int16_t src1, int16_t src2, int16_t src3, + bool neg, bool round, uint32_t *sat) { /* Simplify similarly to do_sqrdmlah_b above. */ int32_t ret = (int32_t)src1 * src2; @@ -199,8 +199,8 @@ void HELPER(sve2_sqrdmlsh_h)(void *vd, void *vn, void *vm, } /* Signed saturating rounding doubling multiply-accumulate high half, 32-bit */ -static int32_t do_sqrdmlah_s(int32_t src1, int32_t src2, int32_t src3, - bool neg, bool round, uint32_t *sat) +int32_t do_sqrdmlah_s(int32_t src1, int32_t src2, int32_t src3, + bool neg, bool round, uint32_t *sat) { /* Simplify similarly to do_sqrdmlah_b above. */ int64_t ret = (int64_t)src1 * src2; @@ -321,8 +321,7 @@ static int64_t do_sat128_d(Int128 r) return ls; } -static int64_t do_sqrdmlah_d(int64_t n, int64_t m, int64_t a, - bool neg, bool round) +int64_t do_sqrdmlah_d(int64_t n, int64_t m, int64_t a, bool neg, bool round) { uint64_t l, h; Int128 r, t; From patchwork Fri Sep 18 18:37:11 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 305003 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8EBE0C43463 for ; Fri, 18 Sep 2020 19:36:44 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 1B97E21741 for ; Fri, 18 Sep 2020 19:36:44 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="JrYxgSgU" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 1B97E21741 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:36802 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJMBX-00048X-6i for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 15:36:43 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48376) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLHX-0000g8-4u for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:51 -0400 Received: from mail-pj1-x102c.google.com ([2607:f8b0:4864:20::102c]:39561) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLHV-0007B8-7Y for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:50 -0400 Received: by mail-pj1-x102c.google.com with SMTP id v14so3456654pjd.4 for ; Fri, 18 Sep 2020 11:38:48 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=DdpfsJo4khtYdr7vVBoU+AUZhm6/E1BDB1ibnFAszUA=; b=JrYxgSgU6Q6lvfUHt6jtkPrIfrZ1GgNOEX5kTUsK6vVLiXuHVv2fWFsOYQXIwSHGZE nWvU/OCWCkQ51prlDUhWZctwewb12Gz6IWpVtsLcem9mhdgZNnXniHv4jbwUruDlayQh TjlHDWYP2cbdykJyhnszjP7lE1DH2F3EvXxQHoeAIWVlUvAbpInL6g9+rIP9x7I78QXY 70mSRZlujzNywlEE4wvAoNPScu2MyFv4yOTaDWZH5jdh9QWoB7woZFLQPL6DGRQPOVfm vQHoz+AfifjanY/g44WmnOxV4rRGiE/1sswakLjDbzBEJLAbTUOy5MM80o413DyRZ+6K JdoA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=DdpfsJo4khtYdr7vVBoU+AUZhm6/E1BDB1ibnFAszUA=; b=KjyhXrKwhY+VELWmCLLW79wd5skrcGWh8WWzaFi/gXho/u2kvKWysBlsCHtFAUmFF+ VH3Te2BNBY4CTPnH0KmfjwyFF6SYD9Z9zStrGwFc0tcKmoY175So5aIIepY+YbDbdwAU 0U6dvdRYWrETDR+Ts8jUBuL7NVrYtCpb2Iw3B4jOgjJkEMxOiYDcvpZjPc4A8g82kqWF 4O0i5irtNeK7P3Y0X5GEnxOf8J4xGkfn+tEMbHFQIfzRrquClI+DlqJS9zeAU2un+Xur R1SmzIvx+Oo1r8s5TzrtdfD9PpXlsy7JRsw28NpcT2iMA/MShmoVQsp5emVzu3/HWzF9 2WnQ== X-Gm-Message-State: AOAM530HMfrMen4zPoWBKRDp0DpsB1gpn9/ppw/2m5Ktzo/ZO5J0jkxw lMFON0Y+sFmEdaWaXKOvld7ckDrLX/644A== X-Google-Smtp-Source: ABdhPJx5hFJK23WVZMKMOG7SSPR9+blIka3gIkPkfOaheBHCX8tGs9dPsIZcDW7QVlCFmDyZrF++RQ== X-Received: by 2002:a17:902:fe84:b029:d1:e598:3ff6 with SMTP id x4-20020a170902fe84b02900d1e5983ff6mr17096201plm.48.1600454327405; Fri, 18 Sep 2020 11:38:47 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.38.46 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:38:46 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 41/81] target/arm: Implement SVE2 ADDHNB, ADDHNT Date: Fri, 18 Sep 2020 11:37:11 -0700 Message-Id: <20200918183751.2787647-42-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::102c; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x102c.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org, Stephen Long Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" From: Stephen Long Signed-off-by: Stephen Long Message-Id: <20200417162231.10374-2-steplong@quicinc.com> Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 8 ++++++++ target/arm/sve.decode | 5 +++++ target/arm/sve_helper.c | 36 ++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 13 +++++++++++++ 4 files changed, 62 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 4cd43d3ecc..5cf5473487 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2509,6 +2509,14 @@ DEF_HELPER_FLAGS_3(sve2_uqrshrnt_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve2_uqrshrnt_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve2_uqrshrnt_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_addhnb_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_addhnb_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_addhnb_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_addhnt_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_addhnt_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_addhnt_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve2_match_ppzz_b, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve2_match_ppzz_h, TCG_CALL_NO_RWG, diff --git a/target/arm/sve.decode b/target/arm/sve.decode index a03d6107da..af9e87e88d 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1320,6 +1320,11 @@ UQSHRNT 01000101 .. 1 ..... 00 1101 ..... ..... @rd_rn_tszimm_shr UQRSHRNB 01000101 .. 1 ..... 00 1110 ..... ..... @rd_rn_tszimm_shr UQRSHRNT 01000101 .. 1 ..... 00 1111 ..... ..... @rd_rn_tszimm_shr +## SVE2 integer add/subtract narrow high part + +ADDHNB 01000101 .. 1 ..... 011 000 ..... ..... @rd_rn_rm +ADDHNT 01000101 .. 1 ..... 011 001 ..... ..... @rd_rn_rm + ### SVE2 Character Match MATCH 01000101 .. 1 ..... 100 ... ..... 0 .... @pd_pg_rn_rm diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 10e3bd8415..6b1c0ad266 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -2111,6 +2111,42 @@ DO_SHRNT(sve2_uqrshrnt_d, uint64_t, uint32_t, , H1_4, DO_UQRSHRN_D) #undef DO_SHRNB #undef DO_SHRNT +#define DO_BINOPNB(NAME, TYPEW, TYPEN, SHIFT, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ +{ \ + intptr_t i, opr_sz = simd_oprsz(desc); \ + for (i = 0; i < opr_sz; i += sizeof(TYPEW)) { \ + TYPEW nn = *(TYPEW *)(vn + i); \ + TYPEW mm = *(TYPEW *)(vm + i); \ + *(TYPEW *)(vd + i) = (TYPEN)OP(nn, mm, SHIFT); \ + } \ +} + +#define DO_BINOPNT(NAME, TYPEW, TYPEN, SHIFT, HW, HN, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ +{ \ + intptr_t i, opr_sz = simd_oprsz(desc); \ + for (i = 0; i < opr_sz; i += sizeof(TYPEW)) { \ + TYPEW nn = *(TYPEW *)(vn + HW(i)); \ + TYPEW mm = *(TYPEW *)(vm + HW(i)); \ + *(TYPEN *)(vd + HN(i + sizeof(TYPEN))) = OP(nn, mm, SHIFT); \ + } \ +} + +#define DO_ADDHN(N, M, SH) ((N + M) >> SH) + +DO_BINOPNB(sve2_addhnb_h, uint16_t, uint8_t, 8, DO_ADDHN) +DO_BINOPNB(sve2_addhnb_s, uint32_t, uint16_t, 16, DO_ADDHN) +DO_BINOPNB(sve2_addhnb_d, uint64_t, uint32_t, 32, DO_ADDHN) + +DO_BINOPNT(sve2_addhnt_h, uint16_t, uint8_t, 8, H1_2, H1, DO_ADDHN) +DO_BINOPNT(sve2_addhnt_s, uint32_t, uint16_t, 16, H1_4, H1_2, DO_ADDHN) +DO_BINOPNT(sve2_addhnt_d, uint64_t, uint32_t, 32, , H1_4, DO_ADDHN) + +#undef DO_ADDHN + +#undef DO_BINOPNB + /* Fully general four-operand expander, controlled by a predicate. */ #define DO_ZPZZZ(NAME, TYPE, H, OP) \ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 5f4d879193..aa21aa49d0 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -7467,6 +7467,19 @@ static bool trans_UQRSHRNT(DisasContext *s, arg_rri_esz *a) return do_sve2_shr_narrow(s, a, ops); } +#define DO_SVE2_ZZZ_NARROW(NAME, name) \ +static bool trans_##NAME(DisasContext *s, arg_rrr_esz *a) \ +{ \ + static gen_helper_gvec_3 * const fns[4] = { \ + NULL, gen_helper_sve2_##name##_h, \ + gen_helper_sve2_##name##_s, gen_helper_sve2_##name##_d, \ + }; \ + return do_sve2_zzz_ool(s, a, fns[a->esz]); \ +} + +DO_SVE2_ZZZ_NARROW(ADDHNB, addhnb) +DO_SVE2_ZZZ_NARROW(ADDHNT, addhnt) + static bool do_sve2_ppzz_flags(DisasContext *s, arg_rprr_esz *a, gen_helper_gvec_flags_4 *fn) { From patchwork Fri Sep 18 18:37:12 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 305008 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 04988C43463 for ; Fri, 18 Sep 2020 19:27:38 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id A355A208DB for ; Fri, 18 Sep 2020 19:27:37 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="CSLmc+gs" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org A355A208DB Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:42026 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJM2i-0002Ek-Ov for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 15:27:36 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48402) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLHZ-0000hA-DP for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:54 -0400 Received: from mail-pj1-x1029.google.com ([2607:f8b0:4864:20::1029]:32900) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLHW-0007BM-Bg for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:53 -0400 Received: by mail-pj1-x1029.google.com with SMTP id md22so4604198pjb.0 for ; Fri, 18 Sep 2020 11:38:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=qqx1ZeFX7xpaj1v3sBf+Oi7x2Z3kHrjTa9zZlA0tu/g=; b=CSLmc+gsHtI7sWZfR0/tNOhfoiDqRrAX7z0M/HuJeMgF0vkSKDKhOKNbTAgr2c/VZH DjKdNf6gd+PeJPn8CiZg9U8cGfaeQcQnp0b4Hi/xu9puC6KKAeLxauRJV4+UHNJb6AUn XgHA0fw91EimwzHVc0I6y8QppRHeVwOwTQWRlxQXCic2vRv0HA80Ck/40YvKiAlob9CU Eqldy1+dvtRI+f5Hy8gXPAzXpAx0K0F6OeNk2jou/CdKR5xARJ0ntor8Exavrym10kfQ qN5Qw9O3tKvDO+S//f0Bk9miL2ktYFquxm75IBNmxm32EEtiOL0BObHjCbQ/WATua08q ZdAw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=qqx1ZeFX7xpaj1v3sBf+Oi7x2Z3kHrjTa9zZlA0tu/g=; b=nG+2uRUeevXxA9eLlqmUvkxMMI3rzHRUrwVuFypYrNo97W3GhkAI9SjT9f3rxRVNMi PwAvqTiZQYIkFTySIkw5XKlFj4Jmu8SDMP3sNyQseKBn7wrv48IAYf2fz9CS/K2nR96i /YOmu0Gmv2DxnRg1+XfrCIBym5ft1xEbksIDIGkCL2A1yg0tu1tfI/4U4o0QV5hnEtA0 cCdEYlH7mmK1hV4QQk3f4fYQSmVLJ5qgSHJJmymLYQqdzDV7VwNuJ2AakA+4zMc7ayWK G6YBQBb+MK2m6TiSuDAkOjvMT4+JdMI6aSY5RbJh5p9wJ381tB6YJxDE3LelL0Ym2IsG CGLA== X-Gm-Message-State: AOAM533pEcXgob2BbpctJE2n242PlxcfackEjpkf+VQIWQ0GBD/S6Xdo BgO65s0/xGBhVj8Qnq8P5dCtnVQFeudLBA== X-Google-Smtp-Source: ABdhPJw+UpcJFSS+r6uAySs7evF+hJLxZpZC8C5ZbVh0obnflkUqo/XY95lNKBVD1mHSXlHqg1Jq9g== X-Received: by 2002:a17:902:b715:b029:d1:e598:4000 with SMTP id d21-20020a170902b715b02900d1e5984000mr16840392pls.58.1600454328643; Fri, 18 Sep 2020 11:38:48 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.38.47 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:38:47 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 42/81] target/arm: Implement SVE2 RADDHNB, RADDHNT Date: Fri, 18 Sep 2020 11:37:12 -0700 Message-Id: <20200918183751.2787647-43-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::1029; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x1029.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org, Stephen Long Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" From: Stephen Long Signed-off-by: Stephen Long Message-Id: <20200417162231.10374-3-steplong@quicinc.com> Signed-off-by: Richard Henderson --- v2: Fix round bit type (laurent desnogues) --- target/arm/helper-sve.h | 8 ++++++++ target/arm/sve.decode | 2 ++ target/arm/sve_helper.c | 10 ++++++++++ target/arm/translate-sve.c | 2 ++ 4 files changed, 22 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 5cf5473487..4d395d3965 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2517,6 +2517,14 @@ DEF_HELPER_FLAGS_4(sve2_addhnt_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve2_addhnt_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve2_addhnt_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_raddhnb_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_raddhnb_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_raddhnb_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_raddhnt_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_raddhnt_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_raddhnt_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve2_match_ppzz_b, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve2_match_ppzz_h, TCG_CALL_NO_RWG, diff --git a/target/arm/sve.decode b/target/arm/sve.decode index af9e87e88d..a33825066c 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1324,6 +1324,8 @@ UQRSHRNT 01000101 .. 1 ..... 00 1111 ..... ..... @rd_rn_tszimm_shr ADDHNB 01000101 .. 1 ..... 011 000 ..... ..... @rd_rn_rm ADDHNT 01000101 .. 1 ..... 011 001 ..... ..... @rd_rn_rm +RADDHNB 01000101 .. 1 ..... 011 010 ..... ..... @rd_rn_rm +RADDHNT 01000101 .. 1 ..... 011 011 ..... ..... @rd_rn_rm ### SVE2 Character Match diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 6b1c0ad266..b778fad142 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -2134,6 +2134,7 @@ void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ } #define DO_ADDHN(N, M, SH) ((N + M) >> SH) +#define DO_RADDHN(N, M, SH) ((N + M + ((__typeof(N))1 << (SH - 1))) >> SH) DO_BINOPNB(sve2_addhnb_h, uint16_t, uint8_t, 8, DO_ADDHN) DO_BINOPNB(sve2_addhnb_s, uint32_t, uint16_t, 16, DO_ADDHN) @@ -2143,6 +2144,15 @@ DO_BINOPNT(sve2_addhnt_h, uint16_t, uint8_t, 8, H1_2, H1, DO_ADDHN) DO_BINOPNT(sve2_addhnt_s, uint32_t, uint16_t, 16, H1_4, H1_2, DO_ADDHN) DO_BINOPNT(sve2_addhnt_d, uint64_t, uint32_t, 32, , H1_4, DO_ADDHN) +DO_BINOPNB(sve2_raddhnb_h, uint16_t, uint8_t, 8, DO_RADDHN) +DO_BINOPNB(sve2_raddhnb_s, uint32_t, uint16_t, 16, DO_RADDHN) +DO_BINOPNB(sve2_raddhnb_d, uint64_t, uint32_t, 32, DO_RADDHN) + +DO_BINOPNT(sve2_raddhnt_h, uint16_t, uint8_t, 8, H1_2, H1, DO_RADDHN) +DO_BINOPNT(sve2_raddhnt_s, uint32_t, uint16_t, 16, H1_4, H1_2, DO_RADDHN) +DO_BINOPNT(sve2_raddhnt_d, uint64_t, uint32_t, 32, , H1_4, DO_RADDHN) + +#undef DO_RADDHN #undef DO_ADDHN #undef DO_BINOPNB diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index aa21aa49d0..e384bed108 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -7479,6 +7479,8 @@ static bool trans_##NAME(DisasContext *s, arg_rrr_esz *a) \ DO_SVE2_ZZZ_NARROW(ADDHNB, addhnb) DO_SVE2_ZZZ_NARROW(ADDHNT, addhnt) +DO_SVE2_ZZZ_NARROW(RADDHNB, raddhnb) +DO_SVE2_ZZZ_NARROW(RADDHNT, raddhnt) static bool do_sve2_ppzz_flags(DisasContext *s, arg_rprr_esz *a, gen_helper_gvec_flags_4 *fn) From patchwork Fri Sep 18 18:37:13 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 305016 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4D456C43464 for ; Fri, 18 Sep 2020 19:14:46 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id C925B21707 for ; Fri, 18 Sep 2020 19:14:45 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="nwRKsjRG" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org C925B21707 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:35608 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJLqG-0004ZV-Mr for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 15:14:44 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48444) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLHb-0000jL-RR for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:55 -0400 Received: from mail-pj1-x102c.google.com ([2607:f8b0:4864:20::102c]:32903) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLHZ-0007Bc-1Q for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:55 -0400 Received: by mail-pj1-x102c.google.com with SMTP id md22so4604216pjb.0 for ; Fri, 18 Sep 2020 11:38:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=D5b7IThx7KrAPMTWBgzwFwc83JSiI1zvkP9/DWeZF+E=; b=nwRKsjRG7fGQcHduWwMFBQPky771SAlT2zxVpnYWmwXJr1dEFo0umCL3cgdHOmnorE iNlQWukxs18l/J519b9yDbTjcolmxmfYJ+qUF4vWQgwAaeg0M+j4c2xOKX//uG33V1eQ ngebtyj46f1V8C4Ff2LtwrR/MJaAmClm+yazCDyuljeNVDKLNb2VnF/1fVy/D+qNuVRV nH8/7++DdgnSimnaJEyR3H8bcdjdVmftqUqBrUNSS+ixmHMVrrGbobRvtGs4oVL8Hw/O q1Fuk60g6RNouA54PY7b552+rLqE30FUvkv0Ru15O4vtf+HSe5s0bOJ4dwKxbTVcS5EI vRUw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=D5b7IThx7KrAPMTWBgzwFwc83JSiI1zvkP9/DWeZF+E=; b=XXBcQOjuHszgxlFucMROgFFXB/r6jpAx4oCFaks2RADsN+dENVSEHbJ8qIfbrQh8RX vHOth2zLuY7sYSV8zf4ZlNxMnEppiHjfO6j4W0sPwt1DneW+SO1jqXTb/mutG6mdAd2E PtvgiqKu0fYOKRaoYM13zialmCjAuec8dYKVnksyHO0Nc6uPEs5pRpewWB5Z/sK4G4TJ g/icSYtgxSsxXnAQMBxFePLMLreQlYgVik+QVhpyEvDQm/tscrf2vjzap96gEMmVVk/N gDXasP4HK0gPutePAkzlhWQ5/DWXIPrNVP/5e24hXPKLjF0y+AOm79tqPkmyPozc6NTE EWCw== X-Gm-Message-State: AOAM5314g0H7LNt5SXzqYtmPmistmfQqMJdkXWTr+uF5ih8PegqaqB4b Ob/5ZNrWdVwkiS55F6z9sds98aNlK6a/tA== X-Google-Smtp-Source: ABdhPJySGPu6pscR1KuWFq/5JIKAbZGUkI0ziLSJobfEW0/lNkO82W2ALWaOtRjLQUTWX03aBvGi2Q== X-Received: by 2002:a17:90a:f992:: with SMTP id cq18mr14414551pjb.136.1600454329805; Fri, 18 Sep 2020 11:38:49 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.38.48 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:38:49 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 43/81] target/arm: Implement SVE2 SUBHNB, SUBHNT Date: Fri, 18 Sep 2020 11:37:13 -0700 Message-Id: <20200918183751.2787647-44-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::102c; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x102c.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org, Stephen Long Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" From: Stephen Long Signed-off-by: Stephen Long Message-Id: <20200417162231.10374-4-steplong@quicinc.com> Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 8 ++++++++ target/arm/sve.decode | 2 ++ target/arm/sve_helper.c | 10 ++++++++++ target/arm/translate-sve.c | 3 +++ 4 files changed, 23 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 4d395d3965..4c57dde9a9 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2525,6 +2525,14 @@ DEF_HELPER_FLAGS_4(sve2_raddhnt_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve2_raddhnt_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve2_raddhnt_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_subhnb_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_subhnb_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_subhnb_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_subhnt_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_subhnt_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_subhnt_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve2_match_ppzz_b, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve2_match_ppzz_h, TCG_CALL_NO_RWG, diff --git a/target/arm/sve.decode b/target/arm/sve.decode index a33825066c..8ad2698bcf 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1326,6 +1326,8 @@ ADDHNB 01000101 .. 1 ..... 011 000 ..... ..... @rd_rn_rm ADDHNT 01000101 .. 1 ..... 011 001 ..... ..... @rd_rn_rm RADDHNB 01000101 .. 1 ..... 011 010 ..... ..... @rd_rn_rm RADDHNT 01000101 .. 1 ..... 011 011 ..... ..... @rd_rn_rm +SUBHNB 01000101 .. 1 ..... 011 100 ..... ..... @rd_rn_rm +SUBHNT 01000101 .. 1 ..... 011 101 ..... ..... @rd_rn_rm ### SVE2 Character Match diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index b778fad142..35681cf0c6 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -2135,6 +2135,7 @@ void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ #define DO_ADDHN(N, M, SH) ((N + M) >> SH) #define DO_RADDHN(N, M, SH) ((N + M + ((__typeof(N))1 << (SH - 1))) >> SH) +#define DO_SUBHN(N, M, SH) ((N - M) >> SH) DO_BINOPNB(sve2_addhnb_h, uint16_t, uint8_t, 8, DO_ADDHN) DO_BINOPNB(sve2_addhnb_s, uint32_t, uint16_t, 16, DO_ADDHN) @@ -2152,6 +2153,15 @@ DO_BINOPNT(sve2_raddhnt_h, uint16_t, uint8_t, 8, H1_2, H1, DO_RADDHN) DO_BINOPNT(sve2_raddhnt_s, uint32_t, uint16_t, 16, H1_4, H1_2, DO_RADDHN) DO_BINOPNT(sve2_raddhnt_d, uint64_t, uint32_t, 32, , H1_4, DO_RADDHN) +DO_BINOPNB(sve2_subhnb_h, uint16_t, uint8_t, 8, DO_SUBHN) +DO_BINOPNB(sve2_subhnb_s, uint32_t, uint16_t, 16, DO_SUBHN) +DO_BINOPNB(sve2_subhnb_d, uint64_t, uint32_t, 32, DO_SUBHN) + +DO_BINOPNT(sve2_subhnt_h, uint16_t, uint8_t, 8, H1_2, H1, DO_SUBHN) +DO_BINOPNT(sve2_subhnt_s, uint32_t, uint16_t, 16, H1_4, H1_2, DO_SUBHN) +DO_BINOPNT(sve2_subhnt_d, uint64_t, uint32_t, 32, , H1_4, DO_SUBHN) + +#undef DO_SUBHN #undef DO_RADDHN #undef DO_ADDHN diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index e384bed108..1c730a835f 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -7482,6 +7482,9 @@ DO_SVE2_ZZZ_NARROW(ADDHNT, addhnt) DO_SVE2_ZZZ_NARROW(RADDHNB, raddhnb) DO_SVE2_ZZZ_NARROW(RADDHNT, raddhnt) +DO_SVE2_ZZZ_NARROW(SUBHNB, subhnb) +DO_SVE2_ZZZ_NARROW(SUBHNT, subhnt) + static bool do_sve2_ppzz_flags(DisasContext *s, arg_rprr_esz *a, gen_helper_gvec_flags_4 *fn) { From patchwork Fri Sep 18 18:37:14 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 305002 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 697CEC43465 for ; Fri, 18 Sep 2020 19:39:26 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id EC5DD21741 for ; Fri, 18 Sep 2020 19:39:25 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="nChQr6y6" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org EC5DD21741 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:44944 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJME9-0007Uc-0c for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 15:39:25 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48440) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLHb-0000hf-AP for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:55 -0400 Received: from mail-pf1-x443.google.com ([2607:f8b0:4864:20::443]:43805) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLHZ-0007Bo-3d for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:54 -0400 Received: by mail-pf1-x443.google.com with SMTP id f18so3983889pfa.10 for ; Fri, 18 Sep 2020 11:38:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=cXCOeTAaB0nGgVpUoiSwTXXV2445mwtT8Qe11dK9pgk=; b=nChQr6y6YygYzNhN20wQJdCknD2jA8Hm+sE2TMEXz3UrjjD0KVbjZLe5LgtTl/e2/q s9YnpoF7Qd1H7b4cuTLylXdSVaVJhBZ/dISxQ6BgrZmxGre9ja1QImSCWkvA/qK52E1C zsmPhL5C0KkZ2gjUgoCxWtX+vDreBZKHrbHvxR8yrkfhqtW4g5Fgwtafd1YbBrdzJmUU dbU9gTwSdJX2l+eoGcUJKgker3catPHYz741XoaPBqWMkoKiL/FexFBxREdeAwGW4f/X K/XDe+jfSHurT1WmxJcC4f2sPTrAnT4ZSreh2VLPUOWC7bXgK41NxshKINq3B+9Aw9lA NT4A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=cXCOeTAaB0nGgVpUoiSwTXXV2445mwtT8Qe11dK9pgk=; b=LRcRBNQF77s0rRqly2r9/AfcaRXd+6Z1834CCKFMjvSRFpIPKZHX+p03mUeCpJEW5s BD8NGKnUMeLxrBzy3XYWLM8x10j4AQ8T4CRuYvcFwL7FDxkKOF1sWznWYQNYeJ5b5fM+ 54SpYWrXeNhP1wNWiuKrLOjSbybK/T800VCIYMad1xpLbIz0dS2LL/89ytyvXm0LQKhs p5oFnJQsdg4iS7TPVKGz1PRbnFQaUL3zFZL4678E5vT3xOa/6o1W4HscFG5QsF9qK4Si YY9QDo6MEvFAYq/dmxz+8IfdWknzI7r/xDEubrnI7VzrGau3sAZA0zkCVy97gznHs1hZ Vulw== X-Gm-Message-State: AOAM530KTEanWmIaboyeqylCmSwOxcIgBLWEj6N7Hv9M3Ye071WRZMky jQiEmxSmJGQjBIzHnBoPSsj9UrXXKOvj2A== X-Google-Smtp-Source: ABdhPJyi+KbfQKMdab9EcSQoRVxjhcfSp1e97THYoCAafHs2agn4V4Yd/4PUq5UeTinEmztLOiuAIQ== X-Received: by 2002:a62:503:0:b029:13e:d13d:a0f9 with SMTP id 3-20020a6205030000b029013ed13da0f9mr32927998pff.21.1600454330930; Fri, 18 Sep 2020 11:38:50 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.38.49 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:38:50 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 44/81] target/arm: Implement SVE2 RSUBHNB, RSUBHNT Date: Fri, 18 Sep 2020 11:37:14 -0700 Message-Id: <20200918183751.2787647-45-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::443; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x443.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org, Stephen Long Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" From: Stephen Long This completes the section 'SVE2 integer add/subtract narrow high part' Signed-off-by: Stephen Long Message-Id: <20200417162231.10374-5-steplong@quicinc.com> Signed-off-by: Richard Henderson --- v2: Fix round bit type (laurent desnogues) --- target/arm/helper-sve.h | 8 ++++++++ target/arm/sve.decode | 2 ++ target/arm/sve_helper.c | 10 ++++++++++ target/arm/translate-sve.c | 2 ++ 4 files changed, 22 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 4c57dde9a9..9e8641e1c0 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2533,6 +2533,14 @@ DEF_HELPER_FLAGS_4(sve2_subhnt_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve2_subhnt_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve2_subhnt_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_rsubhnb_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_rsubhnb_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_rsubhnb_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_rsubhnt_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_rsubhnt_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_rsubhnt_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve2_match_ppzz_b, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve2_match_ppzz_h, TCG_CALL_NO_RWG, diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 8ad2698bcf..3121eabbf8 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1328,6 +1328,8 @@ RADDHNB 01000101 .. 1 ..... 011 010 ..... ..... @rd_rn_rm RADDHNT 01000101 .. 1 ..... 011 011 ..... ..... @rd_rn_rm SUBHNB 01000101 .. 1 ..... 011 100 ..... ..... @rd_rn_rm SUBHNT 01000101 .. 1 ..... 011 101 ..... ..... @rd_rn_rm +RSUBHNB 01000101 .. 1 ..... 011 110 ..... ..... @rd_rn_rm +RSUBHNT 01000101 .. 1 ..... 011 111 ..... ..... @rd_rn_rm ### SVE2 Character Match diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 35681cf0c6..19fbf94189 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -2136,6 +2136,7 @@ void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ #define DO_ADDHN(N, M, SH) ((N + M) >> SH) #define DO_RADDHN(N, M, SH) ((N + M + ((__typeof(N))1 << (SH - 1))) >> SH) #define DO_SUBHN(N, M, SH) ((N - M) >> SH) +#define DO_RSUBHN(N, M, SH) ((N - M + ((__typeof(N))1 << (SH - 1))) >> SH) DO_BINOPNB(sve2_addhnb_h, uint16_t, uint8_t, 8, DO_ADDHN) DO_BINOPNB(sve2_addhnb_s, uint32_t, uint16_t, 16, DO_ADDHN) @@ -2161,6 +2162,15 @@ DO_BINOPNT(sve2_subhnt_h, uint16_t, uint8_t, 8, H1_2, H1, DO_SUBHN) DO_BINOPNT(sve2_subhnt_s, uint32_t, uint16_t, 16, H1_4, H1_2, DO_SUBHN) DO_BINOPNT(sve2_subhnt_d, uint64_t, uint32_t, 32, , H1_4, DO_SUBHN) +DO_BINOPNB(sve2_rsubhnb_h, uint16_t, uint8_t, 8, DO_RSUBHN) +DO_BINOPNB(sve2_rsubhnb_s, uint32_t, uint16_t, 16, DO_RSUBHN) +DO_BINOPNB(sve2_rsubhnb_d, uint64_t, uint32_t, 32, DO_RSUBHN) + +DO_BINOPNT(sve2_rsubhnt_h, uint16_t, uint8_t, 8, H1_2, H1, DO_RSUBHN) +DO_BINOPNT(sve2_rsubhnt_s, uint32_t, uint16_t, 16, H1_4, H1_2, DO_RSUBHN) +DO_BINOPNT(sve2_rsubhnt_d, uint64_t, uint32_t, 32, , H1_4, DO_RSUBHN) + +#undef DO_RSUBHN #undef DO_SUBHN #undef DO_RADDHN #undef DO_ADDHN diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 1c730a835f..e947a0ff25 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -7484,6 +7484,8 @@ DO_SVE2_ZZZ_NARROW(RADDHNT, raddhnt) DO_SVE2_ZZZ_NARROW(SUBHNB, subhnb) DO_SVE2_ZZZ_NARROW(SUBHNT, subhnt) +DO_SVE2_ZZZ_NARROW(RSUBHNB, rsubhnb) +DO_SVE2_ZZZ_NARROW(RSUBHNT, rsubhnt) static bool do_sve2_ppzz_flags(DisasContext *s, arg_rprr_esz *a, gen_helper_gvec_flags_4 *fn) From patchwork Fri Sep 18 18:37:15 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 305025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A4D5AC43465 for ; Fri, 18 Sep 2020 19:00:00 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 0AED621534 for ; Fri, 18 Sep 2020 19:00:00 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="mjdtbTKH" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 0AED621534 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:55556 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJLbz-0005zW-4o for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 14:59:59 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48442) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLHb-0000iO-Iq for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:55 -0400 Received: from mail-pj1-x1042.google.com ([2607:f8b0:4864:20::1042]:35210) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLHZ-0007Bt-LA for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:55 -0400 Received: by mail-pj1-x1042.google.com with SMTP id jw11so3471636pjb.0 for ; Fri, 18 Sep 2020 11:38:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=akvCKazypQZMb717CSMcQwlZML/Ps2x2LftSTD0dLoQ=; b=mjdtbTKHv2Bv4vBgDVsaNKRZcrcS6Y6zIBBFqKiUc2yVu0v6roIaWa4SKV/Sru2cHu wPRCMEvZNOU/8L40EL8knbHhLOP7gMewbT3kIsHXUteeWEBBPhV1v1F8eRAjuduwf7sD xXfrcMIFmvWxMqhl413zsHlBAfU9hWN8ws4DlxfwYHdkxHjy9PmJRZFkPvIxueLP51OH dp8aafE4FzsiacGYZhuUHMmLd3ZXXYE1/wBWBZqn7cJrMj/jFgiOVG7MVD8STsnthihS 9++bdpAJNeC3EFJcpAqZ1v6IyNTjVSSvgm8n/x7g+XDaP+g5gZFA3KxFk+5OJeJFyD1E CsZg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=akvCKazypQZMb717CSMcQwlZML/Ps2x2LftSTD0dLoQ=; b=jCZYCXcucXJE0++Z6pRNPMYaVzXaT0Upjg1bVfPgRdjbfAvan+vbzK+KXdWmFRjX6+ /WjmUF5Jeeqz4WHjfeNWBITL0hI2+u7TPBS2HLt+UVvaxTe55s/nAx0rg1QgtkrTZrse 4VvooeYWoKnnhYzvBahcO0Ett6aAF4IjMPbKCuFTRjYXIjUT/0hgWQuJZJGf+xOoT+fE zEPp/HMfn6X13SGPZr9lAHKQxOxqAHnQIP2qNRsHSeY44DZPTWRCdzHs+auv8ToI/0dF 6wEpKtiSk8V/zb3sajukOGHbFRROYhOJaUZxJu3bEJUbHw3xGZq6yz8EqvPVEg2cg/nU VHfQ== X-Gm-Message-State: AOAM5337xMI1nubvB8P2ItUQRRGRZSb4oSHUFP8vWRIo6TfBPabYVaJ+ yCgfV7e1MhPEYQYtHSFj7h9hoNFFWwanpA== X-Google-Smtp-Source: ABdhPJyxOiY+Z/5K5LIvdsFMy9dOJzP1bP7xS+Fog5i9aQttGqBjW9Il4+hu7NRjIDlz3y2cuafErw== X-Received: by 2002:a17:90a:4cc6:: with SMTP id k64mr14079840pjh.103.1600454332042; Fri, 18 Sep 2020 11:38:52 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.38.51 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:38:51 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 45/81] target/arm: Implement SVE2 HISTCNT, HISTSEG Date: Fri, 18 Sep 2020 11:37:15 -0700 Message-Id: <20200918183751.2787647-46-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::1042; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x1042.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org, Stephen Long Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" From: Stephen Long Signed-off-by: Stephen Long Message-Id: <20200416173109.8856-1-steplong@quicinc.com> Signed-off-by: Richard Henderson --- v2: Fix overlap between output and input vectors. --- target/arm/helper-sve.h | 7 +++ target/arm/sve.decode | 6 ++ target/arm/sve_helper.c | 124 +++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 19 ++++++ 4 files changed, 156 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 9e8641e1c0..34bbb767ef 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2551,6 +2551,13 @@ DEF_HELPER_FLAGS_5(sve2_nmatch_ppzz_b, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(sve2_nmatch_ppzz_h, TCG_CALL_NO_RWG, i32, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_histcnt_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_histcnt_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_histseg, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_6(sve2_faddp_zpzz_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_6(sve2_faddp_zpzz_s, TCG_CALL_NO_RWG, diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 3121eabbf8..0edb72d4fb 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -146,6 +146,7 @@ &rprrr_esz rn=%reg_movprfx @rdn_pg_rm_ra ........ esz:2 . ra:5 ... pg:3 rm:5 rd:5 \ &rprrr_esz rn=%reg_movprfx +@rd_pg_rn_rm ........ esz:2 . rm:5 ... pg:3 rn:5 rd:5 &rprr_esz # One register operand, with governing predicate, vector element size @rd_pg_rn ........ esz:2 ... ... ... pg:3 rn:5 rd:5 &rpr_esz @@ -1336,6 +1337,11 @@ RSUBHNT 01000101 .. 1 ..... 011 111 ..... ..... @rd_rn_rm MATCH 01000101 .. 1 ..... 100 ... ..... 0 .... @pd_pg_rn_rm NMATCH 01000101 .. 1 ..... 100 ... ..... 1 .... @pd_pg_rn_rm +### SVE2 Histogram Computation + +HISTCNT 01000101 .. 1 ..... 110 ... ..... ..... @rd_pg_rn_rm +HISTSEG 01000101 .. 1 ..... 101 000 ..... ..... @rd_rn_rm + ## SVE2 floating-point pairwise operations FADDP 01100100 .. 010 00 0 100 ... ..... ..... @rdn_pg_rm diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 19fbf94189..fa4848bc5c 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -7095,3 +7095,127 @@ DO_PPZZ_MATCH(sve2_nmatch_ppzz_b, MO_8, true) DO_PPZZ_MATCH(sve2_nmatch_ppzz_h, MO_16, true) #undef DO_PPZZ_MATCH + +void HELPER(sve2_histcnt_s)(void *vd, void *vn, void *vm, void *vg, + uint32_t desc) +{ + ARMVectorReg scratch; + intptr_t i, j; + intptr_t opr_sz = simd_oprsz(desc); + uint32_t *d = vd, *n = vn, *m = vm; + uint8_t *pg = vg; + + if (d == n) { + n = memcpy(&scratch, n, opr_sz); + if (d == m) { + m = n; + } + } else if (d == m) { + m = memcpy(&scratch, m, opr_sz); + } + + for (i = 0; i < opr_sz; i += 4) { + uint64_t count = 0; + uint8_t pred; + + pred = pg[H1(i >> 3)] >> (i & 7); + if (pred & 1) { + uint32_t nn = n[H4(i >> 2)]; + + for (j = 0; j <= i; j += 4) { + pred = pg[H1(j >> 3)] >> (j & 7); + if ((pred & 1) && nn == m[H4(j >> 2)]) { + ++count; + } + } + } + d[H4(i >> 2)] = count; + } +} + +void HELPER(sve2_histcnt_d)(void *vd, void *vn, void *vm, void *vg, + uint32_t desc) +{ + ARMVectorReg scratch; + intptr_t i, j; + intptr_t opr_sz = simd_oprsz(desc); + uint64_t *d = vd, *n = vn, *m = vm; + uint8_t *pg = vg; + + if (d == n) { + n = memcpy(&scratch, n, opr_sz); + if (d == m) { + m = n; + } + } else if (d == m) { + m = memcpy(&scratch, m, opr_sz); + } + + for (i = 0; i < opr_sz / 8; ++i) { + uint64_t count = 0; + if (pg[H1(i)] & 1) { + uint64_t nn = n[i]; + for (j = 0; j <= i; ++j) { + if ((pg[H1(j)] & 1) && nn == m[j]) { + ++count; + } + } + } + d[i] = count; + } +} + +/* + * Returns the number of bytes in m0 and m1 that match n. + * See comment for do_match2(). + * */ +static inline uint64_t do_histseg_cnt(uint8_t n, uint64_t m0, uint64_t m1) +{ + int esz = MO_8; + int bits = 8 << esz; + uint64_t ones = dup_const(esz, 1); + uint64_t signs = ones << (bits - 1); + uint64_t cmp0, cmp1; + + cmp1 = dup_const(esz, n); + cmp0 = cmp1 ^ m0; + cmp1 = cmp1 ^ m1; + cmp0 = (cmp0 - ones) & ~cmp0 & signs; + cmp1 = (cmp1 - ones) & ~cmp1 & signs; + + /* + * Combine the two compares in a way that the bits do + * not overlap, and so preserves the count of set bits. + * If the host has an efficient instruction for ctpop, + * then ctpop(x) + ctpop(y) has the same number of + * operations as ctpop(x | (y >> 1)). If the host does + * not have an efficient ctpop, then we only want to + * use it once. + */ + return ctpop64(cmp0 | (cmp1 >> 1)); +} + +void HELPER(sve2_histseg)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, j; + intptr_t opr_sz = simd_oprsz(desc); + + for (i = 0; i < opr_sz; i += 16) { + uint64_t n0 = *(uint64_t *)(vn + i); + uint64_t m0 = *(uint64_t *)(vm + i); + uint64_t n1 = *(uint64_t *)(vn + i + 8); + uint64_t m1 = *(uint64_t *)(vm + i + 8); + uint64_t out0 = 0; + uint64_t out1 = 0; + + for (j = 0; j < 64; j += 8) { + uint64_t cnt0 = do_histseg_cnt(n0 >> j, m0, m1); + uint64_t cnt1 = do_histseg_cnt(n1 >> j, m0, m1); + out0 |= cnt0 << j; + out1 |= cnt1 << j; + } + + *(uint64_t *)(vd + i) = out0; + *(uint64_t *)(vd + i + 8) = out1; + } +} diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index e947a0ff25..4e792c9b9a 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -7509,6 +7509,25 @@ static bool trans_##NAME(DisasContext *s, arg_rprr_esz *a) \ DO_SVE2_PPZZ_MATCH(MATCH, match) DO_SVE2_PPZZ_MATCH(NMATCH, nmatch) +static bool trans_HISTCNT(DisasContext *s, arg_rprr_esz *a) +{ + static gen_helper_gvec_4 * const fns[2] = { + gen_helper_sve2_histcnt_s, gen_helper_sve2_histcnt_d + }; + if (a->esz < 2) { + return false; + } + return do_sve2_zpzz_ool(s, a, fns[a->esz - 2]); +} + +static bool trans_HISTSEG(DisasContext *s, arg_rrr_esz *a) +{ + if (a->esz != 0) { + return false; + } + return do_sve2_zzz_ool(s, a, gen_helper_sve2_histseg); +} + static bool do_sve2_zpzz_fp(DisasContext *s, arg_rprr_esz *a, gen_helper_gvec_4_ptr *fn) { From patchwork Fri Sep 18 18:37:16 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 305006 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0E363C43465 for ; Fri, 18 Sep 2020 19:29:49 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 7025922208 for ; Fri, 18 Sep 2020 19:29:48 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="UzRt3rp3" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 7025922208 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:50218 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJM4p-0005hG-EP for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 15:29:47 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48472) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLHe-0000kv-2a for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:39:03 -0400 Received: from mail-pj1-x1041.google.com ([2607:f8b0:4864:20::1041]:38628) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLHb-0007C5-0u for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:38:57 -0400 Received: by mail-pj1-x1041.google.com with SMTP id u3so3459672pjr.3 for ; Fri, 18 Sep 2020 11:38:54 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=5lA9h6er1TNE/JRaTF1NKAfHTgmAILWUVDozjnr4f8U=; b=UzRt3rp31xcbPhPYyEyqXIP6yLfII0C3fx0sL98LinwU2gZ6mYUnP1qL0ReIc4ihIj pZu9MnpIUGW3P3+1YBopezQ4hNXXPsQ5nKhpJVNkMxr7KKhiRbaAldkm/c2kbdB+ht8Y +bezrmUxMOKxwZx4HW6wRopujzTtkzuJqW1WRnD9SDcs4i9d5V8bymSXqmlBzna+jZbC W6h/t0LmmAMIL0t0Tazpp+iM9EJDRD/QEnGZ/DIkHhAn0nKL1/cxhxpquy9XaQdyLjUH 5Nhq/p1kLWp2uN7J3klqyiBPfaCJha8CDLb5p3NzsEjSj23JDhblgUffTbGRbHDs3ty6 UhYg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=5lA9h6er1TNE/JRaTF1NKAfHTgmAILWUVDozjnr4f8U=; b=ezWmbBqJhj/aNr4eECdKP+ahDau5oZS6Qj/RdUj1OTHlWBMIRbNu3svthWgRUMC+uX hdjWENkmfUBa0AvmX/xNKYYZDHr4pvhSvA2kJS7H5C5bDnhFMXZ0avqwmWGUJW8/uIxj Qu3/B+wH+//7eiP2R2Zt1eNFkCkHc65vZ34fFSnA5dU5b3jO8LMg68kd1BxTaF2UCWP8 Ckhqq/Tzt0ykOmdPljZzVaYH8NrEX/PTlKCtsp5PUXSK2dw0xfkpcK7u8Mo+SUwIaKsg 53oxbXo+IqBztp5vQuHseo8ZJNY3pd80fVxNmetlrKeRrYO2lErLvRobnA97zR/y7dtl WGnQ== X-Gm-Message-State: AOAM5335o+DcScG6X/IwICvkkdznfUp4Myd8YWqgq+sdeMXKPSWYNmaN XWU4abBtE+d8Y4yqg9xYZ2gkwiinDQ/zsA== X-Google-Smtp-Source: ABdhPJw7hmjmM4CzWuTiES7k4fpBtWN7MBHMBboWUAIQRB0blzUbOUn6jJ0kMn6uu85/AaNmtdL5uQ== X-Received: by 2002:a17:902:c394:b029:d2:4ca:2e22 with SMTP id g20-20020a170902c394b02900d204ca2e22mr6316135plg.77.1600454333233; Fri, 18 Sep 2020 11:38:53 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.38.52 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:38:52 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 46/81] target/arm: Implement SVE2 XAR Date: Fri, 18 Sep 2020 11:37:16 -0700 Message-Id: <20200918183751.2787647-47-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::1041; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x1041.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" In addition, use the same vector generator interface for AdvSIMD. This fixes a bug in which the AdvSIMD insn failed to clear the high bits of the SVE register. Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 4 ++ target/arm/helper.h | 2 + target/arm/translate-a64.h | 3 ++ target/arm/sve.decode | 4 ++ target/arm/sve_helper.c | 39 ++++++++++++++ target/arm/translate-a64.c | 25 ++------- target/arm/translate-sve.c | 104 +++++++++++++++++++++++++++++++++++++ target/arm/vec_helper.c | 12 +++++ 8 files changed, 172 insertions(+), 21 deletions(-) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 34bbb767ef..d472fc009d 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2558,6 +2558,10 @@ DEF_HELPER_FLAGS_5(sve2_histcnt_d, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_4(sve2_histseg, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_xar_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_xar_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_xar_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_6(sve2_faddp_zpzz_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_6(sve2_faddp_zpzz_s, TCG_CALL_NO_RWG, diff --git a/target/arm/helper.h b/target/arm/helper.h index c8eb165989..8294055cab 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -940,6 +940,8 @@ DEF_HELPER_FLAGS_5(neon_sqrdmulh_h, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(neon_sqrdmulh_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(gvec_xar_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + #ifdef TARGET_AARCH64 #include "helper-a64.h" #include "helper-sve.h" diff --git a/target/arm/translate-a64.h b/target/arm/translate-a64.h index 2e0d16da25..83729d0280 100644 --- a/target/arm/translate-a64.h +++ b/target/arm/translate-a64.h @@ -122,5 +122,8 @@ bool disas_sve(DisasContext *, uint32_t); void gen_gvec_rax1(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, uint32_t rm_ofs, uint32_t opr_sz, uint32_t max_sz); +void gen_gvec_xar(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, int64_t shift, + uint32_t opr_sz, uint32_t max_sz); #endif /* TARGET_ARM_TRANSLATE_A64_H */ diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 0edb72d4fb..a375ce31f1 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -65,6 +65,7 @@ &rr_dbm rd rn dbm &rrri rd rn rm imm &rri_esz rd rn imm esz +&rrri_esz rd rn rm imm esz &rrr_esz rd rn rm esz &rpr_esz rd pg rn esz &rpr_s rd pg rn s @@ -384,6 +385,9 @@ ORR_zzz 00000100 01 1 ..... 001 100 ..... ..... @rd_rn_rm_e0 EOR_zzz 00000100 10 1 ..... 001 100 ..... ..... @rd_rn_rm_e0 BIC_zzz 00000100 11 1 ..... 001 100 ..... ..... @rd_rn_rm_e0 +XAR 00000100 .. 1 ..... 001 101 rm:5 rd:5 &rrri_esz \ + rn=%reg_movprfx esz=%tszimm16_esz imm=%tszimm16_shr + # SVE2 bitwise ternary operations EOR3 00000100 00 1 ..... 001 110 ..... ..... @rdn_ra_rm_e0 BSL 00000100 00 1 ..... 001 111 ..... ..... @rdn_ra_rm_e0 diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index fa4848bc5c..e0bd0fbe78 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -7219,3 +7219,42 @@ void HELPER(sve2_histseg)(void *vd, void *vn, void *vm, uint32_t desc) *(uint64_t *)(vd + i + 8) = out1; } } + +void HELPER(sve2_xar_b)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc) / 8; + int shr = simd_data(desc); + int shl = 8 - shr; + uint64_t mask = dup_const(MO_8, 0xff >> shr); + uint64_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz; ++i) { + uint64_t t = n[i] ^ m[i]; + d[i] = ((t >> shr) & mask) | ((t << shl) & ~mask); + } +} + +void HELPER(sve2_xar_h)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc) / 8; + int shr = simd_data(desc); + int shl = 16 - shr; + uint64_t mask = dup_const(MO_16, 0xffff >> shr); + uint64_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz; ++i) { + uint64_t t = n[i] ^ m[i]; + d[i] = ((t >> shr) & mask) | ((t << shl) & ~mask); + } +} + +void HELPER(sve2_xar_s)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc) / 4; + int shr = simd_data(desc); + uint32_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz; ++i) { + d[i] = ror32(n[i] ^ m[i], shr); + } +} diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c index 7188808341..76e54c1a4e 100644 --- a/target/arm/translate-a64.c +++ b/target/arm/translate-a64.c @@ -14368,8 +14368,6 @@ static void disas_crypto_xar(DisasContext *s, uint32_t insn) int imm6 = extract32(insn, 10, 6); int rn = extract32(insn, 5, 5); int rd = extract32(insn, 0, 5); - TCGv_i64 tcg_op1, tcg_op2, tcg_res[2]; - int pass; if (!dc_isar_feature(aa64_sha3, s)) { unallocated_encoding(s); @@ -14380,25 +14378,10 @@ static void disas_crypto_xar(DisasContext *s, uint32_t insn) return; } - tcg_op1 = tcg_temp_new_i64(); - tcg_op2 = tcg_temp_new_i64(); - tcg_res[0] = tcg_temp_new_i64(); - tcg_res[1] = tcg_temp_new_i64(); - - for (pass = 0; pass < 2; pass++) { - read_vec_element(s, tcg_op1, rn, pass, MO_64); - read_vec_element(s, tcg_op2, rm, pass, MO_64); - - tcg_gen_xor_i64(tcg_res[pass], tcg_op1, tcg_op2); - tcg_gen_rotri_i64(tcg_res[pass], tcg_res[pass], imm6); - } - write_vec_element(s, tcg_res[0], rd, 0, MO_64); - write_vec_element(s, tcg_res[1], rd, 1, MO_64); - - tcg_temp_free_i64(tcg_op1); - tcg_temp_free_i64(tcg_op2); - tcg_temp_free_i64(tcg_res[0]); - tcg_temp_free_i64(tcg_res[1]); + gen_gvec_xar(MO_64, vec_full_reg_offset(s, rd), + vec_full_reg_offset(s, rn), + vec_full_reg_offset(s, rm), imm6, 16, + vec_full_reg_size(s)); } /* Crypto three-reg imm2 diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 4e792c9b9a..0d0e0c3b46 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -340,6 +340,110 @@ static bool trans_BIC_zzz(DisasContext *s, arg_rrr_esz *a) return do_zzz_fn(s, a, tcg_gen_gvec_andc); } +static void gen_xar8_i64(TCGv_i64 d, TCGv_i64 n, TCGv_i64 m, int64_t sh) +{ + TCGv_i64 t = tcg_temp_new_i64(); + uint64_t mask = dup_const(MO_8, 0xff >> sh); + + tcg_gen_xor_i64(t, n, m); + tcg_gen_shri_i64(d, t, sh); + tcg_gen_shli_i64(t, t, 8 - sh); + tcg_gen_andi_i64(d, d, mask); + tcg_gen_andi_i64(t, t, ~mask); + tcg_gen_or_i64(d, d, t); + tcg_temp_free_i64(t); +} + +static void gen_xar16_i64(TCGv_i64 d, TCGv_i64 n, TCGv_i64 m, int64_t sh) +{ + TCGv_i64 t = tcg_temp_new_i64(); + uint64_t mask = dup_const(MO_16, 0xffff >> sh); + + tcg_gen_xor_i64(t, n, m); + tcg_gen_shri_i64(d, t, sh); + tcg_gen_shli_i64(t, t, 16 - sh); + tcg_gen_andi_i64(d, d, mask); + tcg_gen_andi_i64(t, t, ~mask); + tcg_gen_or_i64(d, d, t); + tcg_temp_free_i64(t); +} + +static void gen_xar_i32(TCGv_i32 d, TCGv_i32 n, TCGv_i32 m, int32_t sh) +{ + tcg_gen_xor_i32(d, n, m); + tcg_gen_rotri_i32(d, d, sh); +} + +static void gen_xar_i64(TCGv_i64 d, TCGv_i64 n, TCGv_i64 m, int64_t sh) +{ + tcg_gen_xor_i64(d, n, m); + tcg_gen_rotri_i64(d, d, sh); +} + +static void gen_xar_vec(unsigned vece, TCGv_vec d, TCGv_vec n, + TCGv_vec m, int64_t sh) +{ + tcg_gen_xor_vec(vece, d, n, m); + tcg_gen_rotri_vec(vece, d, d, sh); +} + +void gen_gvec_xar(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, + uint32_t rm_ofs, int64_t shift, + uint32_t opr_sz, uint32_t max_sz) +{ + static const TCGOpcode vecop[] = { INDEX_op_rotli_vec, 0 }; + static const GVecGen3i ops[4] = { + { .fni8 = gen_xar8_i64, + .fniv = gen_xar_vec, + .fno = gen_helper_sve2_xar_b, + .opt_opc = vecop, + .vece = MO_8 }, + { .fni8 = gen_xar16_i64, + .fniv = gen_xar_vec, + .fno = gen_helper_sve2_xar_h, + .opt_opc = vecop, + .vece = MO_16 }, + { .fni4 = gen_xar_i32, + .fniv = gen_xar_vec, + .fno = gen_helper_sve2_xar_s, + .opt_opc = vecop, + .vece = MO_32 }, + { .fni8 = gen_xar_i64, + .fniv = gen_xar_vec, + .fno = gen_helper_gvec_xar_d, + .opt_opc = vecop, + .vece = MO_64 } + }; + int esize = 8 << vece; + + /* The SVE2 range is 1 .. esize; the AdvSIMD range is 0 .. esize-1. */ + tcg_debug_assert(shift >= 0); + tcg_debug_assert(shift <= esize); + shift &= esize - 1; + + if (shift == 0) { + /* xar with no rotate devolves to xor. */ + tcg_gen_gvec_xor(vece, rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz); + } else { + tcg_gen_gvec_3i(rd_ofs, rn_ofs, rm_ofs, opr_sz, max_sz, + shift, &ops[vece]); + } +} + +static bool trans_XAR(DisasContext *s, arg_rrri_esz *a) +{ + if (a->esz < 0 || !dc_isar_feature(aa64_sve2, s)) { + return false; + } + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + gen_gvec_xar(a->esz, vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), a->imm, vsz, vsz); + } + return true; +} + static bool do_sve2_zzzz_fn(DisasContext *s, arg_rrrr_esz *a, GVecGen4Fn *fn) { if (!dc_isar_feature(aa64_sve2, s)) { diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index ed33e1ee36..32a4403256 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -2205,3 +2205,15 @@ void HELPER(gvec_umulh_d)(void *vd, void *vn, void *vm, uint32_t desc) } clear_tail(d, opr_sz, simd_maxsz(desc)); } + +void HELPER(gvec_xar_d)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc) / 8; + int shr = simd_data(desc); + uint64_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz; ++i) { + d[i] = ror64(n[i] ^ m[i], shr); + } + clear_tail(d, opr_sz * 8, simd_maxsz(desc)); +} From patchwork Fri Sep 18 18:37:17 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 273341 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 80AAFC43467 for ; Fri, 18 Sep 2020 19:18:43 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 12C3121D7F for ; Fri, 18 Sep 2020 19:18:43 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="mtnAcR53" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 12C3121D7F Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:43786 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJLu5-00086A-VP for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 15:18:41 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48514) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLHj-0000lh-O1 for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:39:03 -0400 Received: from mail-pl1-x62f.google.com ([2607:f8b0:4864:20::62f]:43072) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLHc-0007CI-5x for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:39:03 -0400 Received: by mail-pl1-x62f.google.com with SMTP id e4so3423262pln.10 for ; Fri, 18 Sep 2020 11:38:55 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=IXCdiBhItEWZqfoFfbOxAWF3meHLpHpXy0aUL/QPPJ0=; b=mtnAcR53HrNNLZ9zKYsef5jKHSgoJrdoP3h7zFcyQY5Yu6/YTShJNPA4RwjEpwd7+t Sb6q5YJZn233wU+I6l0ZIxoGCUqPn6nMpBF0EO/Y+nhznEV9AOAKlkAooXzycpt49Pk+ aUposYWh8h9M/OOsMvg40wcWn80oq+WRY8LTNoGLhDgFXnWpp5r6KtyK/AIENNxgpbuf mqYL55eaCVVs3Z1pAiQb5H4YUIWOPdPWrTBCX/6ZwzyS4tEslAH/bGv7+fkG0w9nknM1 ByJMmC0Kr6ZCeHKaSzc4iTUrAMLqReHJsJLgc0AcmN5wrTK78rGJqpRTkcPHM9N0Whjb NBxw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=IXCdiBhItEWZqfoFfbOxAWF3meHLpHpXy0aUL/QPPJ0=; b=gH/3ReIzq0jlZnRoYYVf5/7tYFHjUlL6w6dIALm0qvgb66glVV6aETftQpCxed/Cv+ jPaMPlu7C2lB5SUxCBm1tq6hF4ZjKlMoJjG7YPzyjmgDBfJ3oid1+MDseBhwhdB7Mr62 irlmsvy5PHD13FQ+lzud1smeZ8VhvqifDwsZjX3oCeZMfAwJoHPBGRRam+tu969Ox9Qb AWsCZYBg7KAFYt27f/fbw1yNmhvLBG5WQnBYy2k+IT7rYGDfNHRsF4lFP9yYR7nIxJ8n skdi6R8zYdOHws8/n3GveqD0QiQb1+87tJfyOrqi1VmZz0v89HynSyZix8i98oB4Ig7D buVg== X-Gm-Message-State: AOAM53207J7N9pjo/x2DtVn5A7c592xn7zp1bykPdEiostkKHLnR4BP+ 1Aq/BEH54fnxl2NLJ7iPdRKBUF5eJ/ZJeA== X-Google-Smtp-Source: ABdhPJxfx1lONS1Zy81r0gasNea8sVDvC5CKyvgROEHhA2QQb/pCs3elNnwBOu3DI2ooL2EiHCsPiA== X-Received: by 2002:a17:90a:1548:: with SMTP id y8mr14154247pja.113.1600454334501; Fri, 18 Sep 2020 11:38:54 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.38.53 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:38:53 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 47/81] target/arm: Implement SVE2 scatter store insns Date: Fri, 18 Sep 2020 11:37:17 -0700 Message-Id: <20200918183751.2787647-48-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::62f; envelope-from=richard.henderson@linaro.org; helo=mail-pl1-x62f.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org, Stephen Long Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" From: Stephen Long Add decoding logic for SVE2 64-bit/32-bit scatter non-temporal store insns. 64-bit * STNT1B (vector plus scalar) * STNT1H (vector plus scalar) * STNT1W (vector plus scalar) * STNT1D (vector plus scalar) 32-bit * STNT1B (vector plus scalar) * STNT1H (vector plus scalar) * STNT1W (vector plus scalar) Signed-off-by: Stephen Long Message-Id: <20200422141553.8037-1-steplong@quicinc.com> Signed-off-by: Richard Henderson --- target/arm/sve.decode | 10 ++++++++++ target/arm/translate-sve.c | 8 ++++++++ 2 files changed, 18 insertions(+) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index a375ce31f1..dc784dcabe 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1388,3 +1388,13 @@ UMLSLT_zzzw 01000100 .. 0 ..... 010 111 ..... ..... @rda_rn_rm CMLA_zzzz 01000100 esz:2 0 rm:5 0010 rot:2 rn:5 rd:5 ra=%reg_movprfx SQRDCMLAH_zzzz 01000100 esz:2 0 rm:5 0011 rot:2 rn:5 rd:5 ra=%reg_movprfx + +### SVE2 Memory Store Group + +# SVE2 64-bit scatter non-temporal store (vector plus scalar) +STNT1_zprz 1110010 .. 00 ..... 001 ... ..... ..... \ + @rprr_scatter_store xs=2 esz=3 scale=0 + +# SVE2 32-bit scatter non-temporal store (vector plus scalar) +STNT1_zprz 1110010 .. 10 ..... 001 ... ..... ..... \ + @rprr_scatter_store xs=0 esz=2 scale=0 diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 0d0e0c3b46..af8feff707 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -6172,6 +6172,14 @@ static bool trans_ST1_zpiz(DisasContext *s, arg_ST1_zpiz *a) return true; } +static bool trans_STNT1_zprz(DisasContext *s, arg_ST1_zprz *a) +{ + if (!dc_isar_feature(aa64_sve2, s)) { + return false; + } + return trans_ST1_zprz(s, a); +} + /* * Prefetches */ From patchwork Fri Sep 18 18:37:18 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 305004 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B2582C43463 for ; Fri, 18 Sep 2020 19:34:01 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 0A29A21734 for ; Fri, 18 Sep 2020 19:34:01 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="AECXOMj6" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 0A29A21734 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:58444 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJM8u-0000xu-1x for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 15:34:00 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48526) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLHk-0000nA-K3 for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:39:05 -0400 Received: from mail-pf1-x434.google.com ([2607:f8b0:4864:20::434]:41131) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLHd-0007CW-OM for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:39:04 -0400 Received: by mail-pf1-x434.google.com with SMTP id z19so3986295pfn.8 for ; Fri, 18 Sep 2020 11:38:57 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=aEKi5HB5DZ4Vz/ADDH+O3l8Z8WFjuqPobP8PDP6SNRs=; b=AECXOMj6GxW9kYWw6LBi2ll54mYHkrx1QWyYG55UsfqTqtcyKYaaPKnobYbWxNf6M+ FzINm1r+5Qc+vKZatD/kOOQWilEeB0UjL1A4uW9QnvF76AsQj/Uc3zXzkgJNlfHlKpTo tz8PU0Wj0OwqupkvQlLta1Pd/zPBFj7fN/nu8ANSkNfTDe/i+FbLWSte5mIjRO4dOcZd Z6QufyfbbR2W0NiaLYePSGZ3MXL/1CuylkFBIX9maOTrgfoMAzCf16bd4fapM1Ay/txc e7JNMCGo613P5pVXCxaTjox4Yep6ps5ejxxZijIi3svveU+z3NJq9iH289u8xsKFskb4 L5Kw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=aEKi5HB5DZ4Vz/ADDH+O3l8Z8WFjuqPobP8PDP6SNRs=; b=FRHrdNm6MQtV9cT1gnkslqw3yKejUMgePnbYyc7HJVuRtJr6X3et9TjU2dOWMLCQCh UZx0ceR0U63UtAoC35llfv6HFxBDNVlS7xmP7CJQ9irveAgOE23dAoVU6RwRGbMY5heg z7eQDE3lJCWqQVyHlp53MkeZlEBJY1KhVQxGvQkYWI+EfJbf4zkltbdVbyzFcUkmrk7U JWwBQexl5WOGhpCA899CSTf+7GvuDBNR/H4BINmAEpuQdnQYeKhdnC7JSMXfeanubSVS eo9RnfPyFG6XEJU8K7OgRQi99QWLlKwPSTq3x1EhjHCFM4SCGAYknM+eYmVB3xzf01JG z7Tw== X-Gm-Message-State: AOAM530BfUMFjzNpRyMMaEBrWL2u2KGLvcIkJakbJc6sTPS/DN/8pKs7 5jADfcAtlNRlKZnuQZyFqZOnZo5HQM0jlw== X-Google-Smtp-Source: ABdhPJyEpd/jrYOLeo5vUxNc6FjJek6NGfwZZMz4tXUeGbAy2KzlDJXxYVKAO8Ui5jYVZG+HTl3HuA== X-Received: by 2002:a63:e708:: with SMTP id b8mr408090pgi.77.1600454336044; Fri, 18 Sep 2020 11:38:56 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.38.54 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:38:55 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 48/81] target/arm: Implement SVE2 gather load insns Date: Fri, 18 Sep 2020 11:37:18 -0700 Message-Id: <20200918183751.2787647-49-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::434; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x434.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org, Stephen Long Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" From: Stephen Long Add decoding logic for SVE2 64-bit/32-bit gather non-temporal load insns. 64-bit * LDNT1SB * LDNT1B (vector plus scalar) * LDNT1SH * LDNT1H (vector plus scalar) * LDNT1SW * LDNT1W (vector plus scalar) * LDNT1D (vector plus scalar) 32-bit * LDNT1SB * LDNT1B (vector plus scalar) * LDNT1SH * LDNT1H (vector plus scalar) * LDNT1W (vector plus scalar) Signed-off-by: Stephen Long Message-Id: <20200422152343.12493-1-steplong@quicinc.com> Signed-off-by: Richard Henderson --- target/arm/sve.decode | 11 +++++++++++ target/arm/translate-sve.c | 8 ++++++++ 2 files changed, 19 insertions(+) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index dc784dcabe..1b5bd2d193 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1389,6 +1389,17 @@ UMLSLT_zzzw 01000100 .. 0 ..... 010 111 ..... ..... @rda_rn_rm CMLA_zzzz 01000100 esz:2 0 rm:5 0010 rot:2 rn:5 rd:5 ra=%reg_movprfx SQRDCMLAH_zzzz 01000100 esz:2 0 rm:5 0011 rot:2 rn:5 rd:5 ra=%reg_movprfx +### SVE2 Memory Gather Load Group + +# SVE2 64-bit gather non-temporal load +# (scalar plus unpacked 32-bit unscaled offsets) +LDNT1_zprz 1100010 msz:2 00 rm:5 1 u:1 0 pg:3 rn:5 rd:5 \ + &rprr_gather_load xs=0 esz=3 scale=0 ff=0 + +# SVE2 32-bit gather non-temporal load (scalar plus 32-bit unscaled offsets) +LDNT1_zprz 1000010 msz:2 00 rm:5 10 u:1 pg:3 rn:5 rd:5 \ + &rprr_gather_load xs=0 esz=2 scale=0 ff=0 + ### SVE2 Memory Store Group # SVE2 64-bit scatter non-temporal store (vector plus scalar) diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index af8feff707..0cb10ac3e2 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -6020,6 +6020,14 @@ static bool trans_LD1_zpiz(DisasContext *s, arg_LD1_zpiz *a) return true; } +static bool trans_LDNT1_zprz(DisasContext *s, arg_LD1_zprz *a) +{ + if (!dc_isar_feature(aa64_sve2, s)) { + return false; + } + return trans_LD1_zprz(s, a); +} + /* Indexed by [mte][be][xs][msz]. */ static gen_helper_gvec_mem_scatter * const scatter_store_fn32[2][2][2][3] = { { /* MTE Inactive */ From patchwork Fri Sep 18 18:37:19 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 305013 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 299D3C43464 for ; Fri, 18 Sep 2020 19:21:41 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 9AD8821D7F for ; Fri, 18 Sep 2020 19:21:40 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="VYS1pFZD" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 9AD8821D7F Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:52006 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJLwx-00037j-G7 for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 15:21:39 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48600) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLHq-0000oT-Gd for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:39:15 -0400 Received: from mail-pf1-x42b.google.com ([2607:f8b0:4864:20::42b]:37881) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLHj-0007Cg-D5 for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:39:10 -0400 Received: by mail-pf1-x42b.google.com with SMTP id w7so3995290pfi.4 for ; Fri, 18 Sep 2020 11:38:58 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=F9WZxYtN1t58DJuwcxgcE7EyJ0JsU31eFbA5w0cg3Y0=; b=VYS1pFZDx12BJTVstF/LJ4p8Mleko9uRxS3mzWWmag0XVEXWtHOMLepVt3148bNZsP m/OtHBQzgd5cl9i6p1e01yFLOUfPRbkEhLEpQ1RjrXn5dzobvvmOevf2hEz/1wVfmj4E SrfbgxgoXZNO/S5OphKUsB9Vyh1Svh4dmMy2JoH+IdWAykKII7UsOVN0VP5zEyknFqJq CHdfrHy709E5A00vIfCdNzl3aWVrWBX89ycch8Xcs/ltOecuYbsZOUg2giaIHbsQ7H4n 0jdBWaBWfgH1YH0g9sbQt/G6oqffJ3R/fVlWygvV1sBOky9QBZbk4+WsLZ5w58MMF0x4 wD8g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=F9WZxYtN1t58DJuwcxgcE7EyJ0JsU31eFbA5w0cg3Y0=; b=AwMW7uiZkor/IQe9Q7wjHAKq+gajDFAL5mH3RPMBnr3qX7zquPpKAXFNcqH0hoshnK e1e4QlRgjuP08O2oePjjk3jVOFYM8UugE4jV6D6kHZcxcHQ5Aq7FyW7QUSJNXs7Wk/Ks UF1HQi6U5Qh9jYsLX5buDP30i3HIIPd0zfCqXgHwHU1uBXwILN1dB2WjJG2xga1bOTey ch2dnCzXngzltw9K+x1gvux9I4r2PtcwLRl8yw+vl7he6VGeiXtIwxYUGUu4HXHV4IaA GGVWgoIrkebsLj58Mqfhp72h6fC4MF37oEk/iLm88oCPM0uYdd0DsYD2N47cPeYkGcKq SDJw== X-Gm-Message-State: AOAM5336KZjuTOetOncaL2hJ7QjeIKwNTZB/EHHx/iDSk5XAtTqeT4qa X0xB88qOrLMA9AgJii3r8ZGMqn12xCeYVA== X-Google-Smtp-Source: ABdhPJyGMQ0spxS3OXzCQR3X4lF/WMlJgFZfvjfJOdCPXxEVCa2KXUqhN9Q6jsEWzYFGxGfGL9alXg== X-Received: by 2002:a62:406:0:b029:142:2501:3978 with SMTP id 6-20020a6204060000b029014225013978mr16388530pfe.61.1600454337103; Fri, 18 Sep 2020 11:38:57 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.38.56 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:38:56 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 49/81] target/arm: Implement SVE2 FMMLA Date: Fri, 18 Sep 2020 11:37:19 -0700 Message-Id: <20200918183751.2787647-50-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::42b; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x42b.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org, Stephen Long Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" From: Stephen Long Signed-off-by: Stephen Long Message-Id: <20200422165503.13511-1-steplong@quicinc.com> [rth: Fix indexing in helpers, expand macro to straight functions.] Signed-off-by: Richard Henderson --- target/arm/cpu.h | 10 ++++++ target/arm/helper-sve.h | 3 ++ target/arm/sve.decode | 4 +++ target/arm/sve_helper.c | 74 ++++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 34 ++++++++++++++++++ 5 files changed, 125 insertions(+) diff --git a/target/arm/cpu.h b/target/arm/cpu.h index 5ed2f2c65b..aaefbf0167 100644 --- a/target/arm/cpu.h +++ b/target/arm/cpu.h @@ -3917,6 +3917,16 @@ static inline bool isar_feature_aa64_sve2_bitperm(const ARMISARegisters *id) return FIELD_EX64(id->id_aa64zfr0, ID_AA64ZFR0, BITPERM) != 0; } +static inline bool isar_feature_aa64_sve2_f32mm(const ARMISARegisters *id) +{ + return FIELD_EX64(id->id_aa64zfr0, ID_AA64ZFR0, F32MM) != 0; +} + +static inline bool isar_feature_aa64_sve2_f64mm(const ARMISARegisters *id) +{ + return FIELD_EX64(id->id_aa64zfr0, ID_AA64ZFR0, F64MM) != 0; +} + /* * Feature tests for "does this exist in either 32-bit or 64-bit?" */ diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index d472fc009d..2c6d3431ec 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2662,3 +2662,6 @@ DEF_HELPER_FLAGS_5(sve2_sqrdcmlah_zzzz_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve2_sqrdcmlah_zzzz_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_6(fmmla_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(fmmla_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 1b5bd2d193..11e724d3a2 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1389,6 +1389,10 @@ UMLSLT_zzzw 01000100 .. 0 ..... 010 111 ..... ..... @rda_rn_rm CMLA_zzzz 01000100 esz:2 0 rm:5 0010 rot:2 rn:5 rd:5 ra=%reg_movprfx SQRDCMLAH_zzzz 01000100 esz:2 0 rm:5 0011 rot:2 rn:5 rd:5 ra=%reg_movprfx +### SVE2 floating point matrix multiply accumulate + +FMMLA 01100100 .. 1 ..... 111001 ..... ..... @rda_rn_rm + ### SVE2 Memory Gather Load Group # SVE2 64-bit gather non-temporal load diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index e0bd0fbe78..aaf336118b 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -7258,3 +7258,77 @@ void HELPER(sve2_xar_s)(void *vd, void *vn, void *vm, uint32_t desc) d[i] = ror32(n[i] ^ m[i], shr); } } + +void HELPER(fmmla_s)(void *vd, void *vn, void *vm, void *va, + void *status, uint32_t desc) +{ + intptr_t s, opr_sz = simd_oprsz(desc) / (sizeof(float32) * 4); + + for (s = 0; s < opr_sz; ++s) { + float32 *n = vn + s * sizeof(float32) * 4; + float32 *m = vm + s * sizeof(float32) * 4; + float32 *a = va + s * sizeof(float32) * 4; + float32 *d = vd + s * sizeof(float32) * 4; + float32 n00 = n[H4(0)], n01 = n[H4(1)]; + float32 n10 = n[H4(2)], n11 = n[H4(3)]; + float32 m00 = m[H4(0)], m01 = m[H4(1)]; + float32 m10 = m[H4(2)], m11 = m[H4(3)]; + float32 p0, p1; + + /* i = 0, j = 0 */ + p0 = float32_mul(n00, m00, status); + p1 = float32_mul(n01, m01, status); + d[H4(0)] = float32_add(a[H4(0)], float32_add(p0, p1, status), status); + + /* i = 0, j = 1 */ + p0 = float32_mul(n00, m10, status); + p1 = float32_mul(n01, m11, status); + d[H4(1)] = float32_add(a[H4(1)], float32_add(p0, p1, status), status); + + /* i = 1, j = 0 */ + p0 = float32_mul(n10, m00, status); + p1 = float32_mul(n11, m01, status); + d[H4(2)] = float32_add(a[H4(2)], float32_add(p0, p1, status), status); + + /* i = 1, j = 1 */ + p0 = float32_mul(n10, m10, status); + p1 = float32_mul(n11, m11, status); + d[H4(3)] = float32_add(a[H4(3)], float32_add(p0, p1, status), status); + } +} + +void HELPER(fmmla_d)(void *vd, void *vn, void *vm, void *va, + void *status, uint32_t desc) +{ + intptr_t s, opr_sz = simd_oprsz(desc) / (sizeof(float64) * 4); + + for (s = 0; s < opr_sz; ++s) { + float64 *n = vn + s * sizeof(float64) * 4; + float64 *m = vm + s * sizeof(float64) * 4; + float64 *a = va + s * sizeof(float64) * 4; + float64 *d = vd + s * sizeof(float64) * 4; + float64 n00 = n[0], n01 = n[1], n10 = n[2], n11 = n[3]; + float64 m00 = m[0], m01 = m[1], m10 = m[2], m11 = m[3]; + float64 p0, p1; + + /* i = 0, j = 0 */ + p0 = float64_mul(n00, m00, status); + p1 = float64_mul(n01, m01, status); + d[0] = float64_add(a[0], float64_add(p0, p1, status), status); + + /* i = 0, j = 1 */ + p0 = float64_mul(n00, m10, status); + p1 = float64_mul(n01, m11, status); + d[1] = float64_add(a[1], float64_add(p0, p1, status), status); + + /* i = 1, j = 0 */ + p0 = float64_mul(n10, m00, status); + p1 = float64_mul(n11, m01, status); + d[2] = float64_add(a[2], float64_add(p0, p1, status), status); + + /* i = 1, j = 1 */ + p0 = float64_mul(n10, m10, status); + p1 = float64_mul(n11, m11, status); + d[3] = float64_add(a[3], float64_add(p0, p1, status), status); + } +} diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 0cb10ac3e2..63acb25921 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -7677,6 +7677,40 @@ DO_SVE2_ZPZZ_FP(FMINP, fminp) * SVE Integer Multiply-Add (unpredicated) */ +static bool trans_FMMLA(DisasContext *s, arg_rrrr_esz *a) +{ + gen_helper_gvec_4_ptr *fn; + + switch (a->esz) { + case MO_32: + if (!dc_isar_feature(aa64_sve2_f32mm, s)) { + return false; + } + fn = gen_helper_fmmla_s; + break; + case MO_64: + if (!dc_isar_feature(aa64_sve2_f64mm, s)) { + return false; + } + fn = gen_helper_fmmla_d; + break; + default: + return false; + } + + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + TCGv_ptr status = fpstatus_ptr(a->esz == MO_16 ? FPST_FPCR_F16 : FPST_FPCR); + tcg_gen_gvec_4_ptr(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + vec_full_reg_offset(s, a->ra), + status, vsz, vsz, 0, fn); + tcg_temp_free_ptr(status); + } + return true; +} + static bool do_sqdmlal_zzzw(DisasContext *s, arg_rrrr_esz *a, bool sel1, bool sel2) { From patchwork Fri Sep 18 18:37:20 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 305022 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 16874C43465 for ; Fri, 18 Sep 2020 19:05:24 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 9803F21707 for ; Fri, 18 Sep 2020 19:05:23 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="SxywB5ra" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 9803F21707 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:38614 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJLhC-0002N0-N6 for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 15:05:22 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48576) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLHo-0000nl-J0 for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:39:10 -0400 Received: from mail-pl1-x642.google.com ([2607:f8b0:4864:20::642]:33105) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLHj-0007Cn-DN for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:39:08 -0400 Received: by mail-pl1-x642.google.com with SMTP id d19so3450878pld.0 for ; Fri, 18 Sep 2020 11:38:59 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=5ZRxRihoVdRLpqF0F3gozSyDHbCAhl2YZuePR17xmiA=; b=SxywB5ra8Z28Oz/tQ45mwVcuJ9sxykj8+VBSMyrkJ/qxFcJSx38rk3bna/VVXPK925 Rbarsn6KHxwGeQ+ApT4aghgIcGPtMh6BoRdXFM2PwDSZnoIOlk8wu0EF7i24JiUrO/DK /Ah2d6Wj3JDJRxDPfedv7hwAS5H6CBNEiHkPtSfpNGH4zYYGugS88SYOxXyOZq5DOjyv l5x39Q9+zpDEu5zIOZ+whKluQBjvIi94YHwE0QuRkroGq5HwXrnQ++qf3VKY37WlR0GN W4MBhkGG34o0Sjmjuq2FgzqF+lezUgrC4TpqMRk4uMYBZfRgV8qlUkieYq4k7HJmE0JH TuMQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=5ZRxRihoVdRLpqF0F3gozSyDHbCAhl2YZuePR17xmiA=; b=awRI9ODdEpj0O1MN+NRyYQRo4CfAI15Bh8VqrHfXcUV0gwhQf2zLisLhxnuRkKndkA AdnqBVoQlGQ6DC7xEd8QADGi5B6WN5kYwdL8HbgRLwBxpgCh2cimpqpFxn2z0rMropKv 3k/0x2OYt8b/bUzMas2iAx+NO/3VsvYbW8FnST+sGFJ+B49PVzxZ6SAdRiw8AU2TJmBm ECBfczLbdNveiZYMJ2XCw3apnf1IkqodClZc+4/gwzu8W7Gw4hXfxbZaS34TYSVuY4V4 UZBiqgv7YIe/4YkVbS8aQfAZPuv6Ofk8Lho8h6PrAItsD5Jw7RHdI1GF35QQdx2B/lQO Cv8A== X-Gm-Message-State: AOAM531Nd8Vpn+wlLgGKr+3Wh8kfrzsZzQq0OJdUb9P25McgfOY9G/Xe Woxw+yom7VBIWQT/tvvAkIeD1+DLs0a8Qg== X-Google-Smtp-Source: ABdhPJzGdzG6PI8aVnOToVhOY+jyqJCDJlTsi33r2NzZ8BVtNiizwqfc/mH5LQJlE8Zql22scjNYCA== X-Received: by 2002:a17:90a:9403:: with SMTP id r3mr14677177pjo.52.1600454338371; Fri, 18 Sep 2020 11:38:58 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.38.57 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:38:57 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 50/81] target/arm: Implement SVE2 SPLICE, EXT Date: Fri, 18 Sep 2020 11:37:20 -0700 Message-Id: <20200918183751.2787647-51-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::642; envelope-from=richard.henderson@linaro.org; helo=mail-pl1-x642.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org, Stephen Long Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" From: Stephen Long Signed-off-by: Stephen Long Message-Id: <20200423180347.9403-1-steplong@quicinc.com> [rth: Rename the trans_* functions to *_sve2.] Signed-off-by: Richard Henderson --- target/arm/sve.decode | 11 +++++++++-- target/arm/translate-sve.c | 35 ++++++++++++++++++++++++++++++----- 2 files changed, 39 insertions(+), 7 deletions(-) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 11e724d3a2..0688dae450 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -494,10 +494,14 @@ CPY_z_i 00000101 .. 01 .... 00 . ........ ..... @rdn_pg4 imm=%sh8_i8s ### SVE Permute - Extract Group -# SVE extract vector (immediate offset) +# SVE extract vector (destructive) EXT 00000101 001 ..... 000 ... rm:5 rd:5 \ &rrri rn=%reg_movprfx imm=%imm8_16_10 +# SVE2 extract vector (constructive) +EXT_sve2 00000101 011 ..... 000 ... rn:5 rd:5 \ + &rri imm=%imm8_16_10 + ### SVE Permute - Unpredicated Group # SVE broadcast general register @@ -588,9 +592,12 @@ REVH 00000101 .. 1001 01 100 ... ..... ..... @rd_pg_rn REVW 00000101 .. 1001 10 100 ... ..... ..... @rd_pg_rn RBIT 00000101 .. 1001 11 100 ... ..... ..... @rd_pg_rn -# SVE vector splice (predicated) +# SVE vector splice (predicated, destructive) SPLICE 00000101 .. 101 100 100 ... ..... ..... @rdn_pg_rm +# SVE2 vector splice (predicated, constructive) +SPLICE_sve2 00000101 .. 101 101 100 ... ..... ..... @rd_pg_rn + ### SVE Select Vectors Group # SVE select vector elements (predicated) diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 63acb25921..5e8291e44b 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -2266,18 +2266,18 @@ static bool trans_CPY_z_i(DisasContext *s, arg_CPY_z_i *a) *** SVE Permute Extract Group */ -static bool trans_EXT(DisasContext *s, arg_EXT *a) +static bool do_EXT(DisasContext *s, int rd, int rn, int rm, int imm) { if (!sve_access_check(s)) { return true; } unsigned vsz = vec_full_reg_size(s); - unsigned n_ofs = a->imm >= vsz ? 0 : a->imm; + unsigned n_ofs = imm >= vsz ? 0 : imm; unsigned n_siz = vsz - n_ofs; - unsigned d = vec_full_reg_offset(s, a->rd); - unsigned n = vec_full_reg_offset(s, a->rn); - unsigned m = vec_full_reg_offset(s, a->rm); + unsigned d = vec_full_reg_offset(s, rd); + unsigned n = vec_full_reg_offset(s, rn); + unsigned m = vec_full_reg_offset(s, rm); /* Use host vector move insns if we have appropriate sizes * and no unfortunate overlap. @@ -2296,6 +2296,19 @@ static bool trans_EXT(DisasContext *s, arg_EXT *a) return true; } +static bool trans_EXT(DisasContext *s, arg_EXT *a) +{ + return do_EXT(s, a->rd, a->rn, a->rm, a->imm); +} + +static bool trans_EXT_sve2(DisasContext *s, arg_rri *a) +{ + if (!dc_isar_feature(aa64_sve2, s)) { + return false; + } + return do_EXT(s, a->rd, a->rn, (a->rn + 1) % 32, a->imm); +} + /* *** SVE Permute - Unpredicated Group */ @@ -3023,6 +3036,18 @@ static bool trans_SPLICE(DisasContext *s, arg_rprr_esz *a) return true; } +static bool trans_SPLICE_sve2(DisasContext *s, arg_rpr_esz *a) +{ + if (!dc_isar_feature(aa64_sve2, s)) { + return false; + } + if (sve_access_check(s)) { + gen_gvec_ool_zzzp(s, gen_helper_sve_splice, + a->rd, a->rn, (a->rn + 1) % 32, a->pg, a->esz); + } + return true; +} + /* *** SVE Integer Compare - Vectors Group */ From patchwork Fri Sep 18 18:37:21 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 273337 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.5 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 04189C4363C for ; Fri, 18 Sep 2020 19:23:45 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 8645D221EC for ; Fri, 18 Sep 2020 19:23:44 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="upiQyLvO" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 8645D221EC Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:60304 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJLyx-0006Wb-LA for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 15:23:43 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48666) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLHy-0000r5-9e for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:39:19 -0400 Received: from mail-pl1-x644.google.com ([2607:f8b0:4864:20::644]:42500) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLHj-0007Cv-D4 for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:39:16 -0400 Received: by mail-pl1-x644.google.com with SMTP id y6so3423364plt.9 for ; Fri, 18 Sep 2020 11:39:01 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=rFMth/G5vjuMI2/yW8CGjNFUXe0PlR2vFrByOQBMY5U=; b=upiQyLvOVGaLzF4kMWC3+YGg8hIMH2CD9oKUHcaEiy+nFvPFwSC+bYudzvciuRGwC3 1PRG9LxGIVfd5JC1ITnkAbc8IBt8c+mnzIz5LVstz+n6KV5vVAZzW+36aY+6azP+7Iir J1Di+AHfMPP/shdMRzJUp0j2P+xVHCCj5RVR4wqe01c+5kwXwaGBldgdG07WuNcwFyMp wj+RpA44d/mfsAHh70rHN2WPi+BaHeW+zc7uxDkAREhgpHr/aFrvVlYmUGip0L0UQ8F9 42ne6w6UohiJsAGFRgsV1w1feS1m7v4VQkQDaCYZJVJ1B2Nt0PHj0XWEV6DUJ4Z2BOvX wE+w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=rFMth/G5vjuMI2/yW8CGjNFUXe0PlR2vFrByOQBMY5U=; b=hYFIO0S7Xhu44aAMEZk0iZsYBI7k3ot7edliwyvhO4ZEi8WF1ysvdgfVrQTIWc+12p KkkFSPJdrup+4KtuqneXb/JrqWmwbYM3uogN1nP+ryFnrXyDMT12qgqq3e7//0oZT9tN haQCoMAx+5Lw4a1xy7bEyd2Mqs/E+ABcDBiCzh2p/cDSE74muKWn4oKeYfHZExzXURIM EyWqNAFonY8XK7GYGRYTjLsxNlzQwqjmKTMCDyjmth8SoiodjfrJOB+qq8SlHomec7gm i7/oMurb6MsRnDs6SHb2Gx65fkrIuA0ZZ7p52SgB4a3nzY8Xd12QbjD/dYOUFyHjsyTq fMZw== X-Gm-Message-State: AOAM530TvBXdXxyzo7hH574o+pivnzJ6fmvgqVYi6dmDKe44RiJKstQS rD/Bfbpc1HCj4MvM1wCmgmZNQ4ayYbHpXA== X-Google-Smtp-Source: ABdhPJy65Om140TqIeCi1AaJ+vDsVWdbZB1eM5XZ/1kRTMWXEq1njlDGcrY61zLZwONgq2CLumGVdw== X-Received: by 2002:a17:90a:d18b:: with SMTP id fu11mr13775675pjb.203.1600454339729; Fri, 18 Sep 2020 11:38:59 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.38.58 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:38:59 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 51/81] target/arm: Pass separate addend to {U, S}DOT helpers Date: Fri, 18 Sep 2020 11:37:21 -0700 Message-Id: <20200918183751.2787647-52-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::644; envelope-from=richard.henderson@linaro.org; helo=mail-pl1-x644.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" For SVE, we potentially have a 4th argument coming from the movprfx instruction. Currently we do not optimize movprfx, so the problem is not visible. Signed-off-by: Richard Henderson --- target/arm/helper.h | 20 +++--- target/arm/sve.decode | 7 +- target/arm/translate-a64.c | 15 ++++- target/arm/translate-sve.c | 13 ++-- target/arm/vec_helper.c | 112 ++++++++++++++++++-------------- target/arm/translate-neon.c.inc | 10 +-- 6 files changed, 105 insertions(+), 72 deletions(-) diff --git a/target/arm/helper.h b/target/arm/helper.h index 8294055cab..97222bd256 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -595,15 +595,19 @@ DEF_HELPER_FLAGS_5(sve2_sqrdmlah_d, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(sve2_sqrdmlsh_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) -DEF_HELPER_FLAGS_4(gvec_sdot_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) -DEF_HELPER_FLAGS_4(gvec_udot_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) -DEF_HELPER_FLAGS_4(gvec_sdot_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) -DEF_HELPER_FLAGS_4(gvec_udot_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_sdot_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_udot_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_sdot_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_udot_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) -DEF_HELPER_FLAGS_4(gvec_sdot_idx_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) -DEF_HELPER_FLAGS_4(gvec_udot_idx_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) -DEF_HELPER_FLAGS_4(gvec_sdot_idx_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) -DEF_HELPER_FLAGS_4(gvec_udot_idx_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_sdot_idx_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_udot_idx_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_sdot_idx_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_udot_idx_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(gvec_fcaddh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 0688dae450..5815ba9b1c 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -756,12 +756,13 @@ UMIN_zzi 00100101 .. 101 011 110 ........ ..... @rdn_i8u MUL_zzi 00100101 .. 110 000 110 ........ ..... @rdn_i8s # SVE integer dot product (unpredicated) -DOT_zzz 01000100 1 sz:1 0 rm:5 00000 u:1 rn:5 rd:5 ra=%reg_movprfx +DOT_zzzz 01000100 1 sz:1 0 rm:5 00000 u:1 rn:5 rd:5 \ + ra=%reg_movprfx # SVE integer dot product (indexed) -DOT_zzx 01000100 101 index:2 rm:3 00000 u:1 rn:5 rd:5 \ +DOT_zzxw 01000100 101 index:2 rm:3 00000 u:1 rn:5 rd:5 \ sz=0 ra=%reg_movprfx -DOT_zzx 01000100 111 index:1 rm:4 00000 u:1 rn:5 rd:5 \ +DOT_zzxw 01000100 111 index:1 rm:4 00000 u:1 rn:5 rd:5 \ sz=1 ra=%reg_movprfx # SVE floating-point complex add (predicated) diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c index 76e54c1a4e..1a9251db67 100644 --- a/target/arm/translate-a64.c +++ b/target/arm/translate-a64.c @@ -692,6 +692,17 @@ static void gen_gvec_op3_qc(DisasContext *s, bool is_q, int rd, int rn, tcg_temp_free_ptr(qc_ptr); } +/* Expand a 4-operand operation using an out-of-line helper. */ +static void gen_gvec_op4_ool(DisasContext *s, bool is_q, int rd, int rn, + int rm, int ra, int data, gen_helper_gvec_4 *fn) +{ + tcg_gen_gvec_4_ool(vec_full_reg_offset(s, rd), + vec_full_reg_offset(s, rn), + vec_full_reg_offset(s, rm), + vec_full_reg_offset(s, ra), + is_q ? 16 : 8, vec_full_reg_size(s), data, fn); +} + /* Set ZF and NF based on a 64 bit result. This is alas fiddlier * than the 32 bit equivalent. */ @@ -12202,7 +12213,7 @@ static void disas_simd_three_reg_same_extra(DisasContext *s, uint32_t insn) return; case 0x2: /* SDOT / UDOT */ - gen_gvec_op3_ool(s, is_q, rd, rn, rm, 0, + gen_gvec_op4_ool(s, is_q, rd, rn, rm, rd, 0, u ? gen_helper_gvec_udot_b : gen_helper_gvec_sdot_b); return; @@ -13461,7 +13472,7 @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn) switch (16 * u + opcode) { case 0x0e: /* SDOT */ case 0x1e: /* UDOT */ - gen_gvec_op3_ool(s, is_q, rd, rn, rm, index, + gen_gvec_op4_ool(s, is_q, rd, rn, rm, rd, index, u ? gen_helper_gvec_udot_idx_b : gen_helper_gvec_sdot_idx_b); return; diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 5e8291e44b..66303dac54 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3804,28 +3804,29 @@ DO_ZZI(UMIN, umin) #undef DO_ZZI -static bool trans_DOT_zzz(DisasContext *s, arg_DOT_zzz *a) +static bool trans_DOT_zzzz(DisasContext *s, arg_DOT_zzzz *a) { - static gen_helper_gvec_3 * const fns[2][2] = { + static gen_helper_gvec_4 * const fns[2][2] = { { gen_helper_gvec_sdot_b, gen_helper_gvec_sdot_h }, { gen_helper_gvec_udot_b, gen_helper_gvec_udot_h } }; if (sve_access_check(s)) { - gen_gvec_ool_zzz(s, fns[a->u][a->sz], a->rd, a->rn, a->rm, 0); + gen_gvec_ool_zzzz(s, fns[a->u][a->sz], a->rd, a->rn, a->rm, a->ra, 0); } return true; } -static bool trans_DOT_zzx(DisasContext *s, arg_DOT_zzx *a) +static bool trans_DOT_zzxw(DisasContext *s, arg_DOT_zzxw *a) { - static gen_helper_gvec_3 * const fns[2][2] = { + static gen_helper_gvec_4 * const fns[2][2] = { { gen_helper_gvec_sdot_idx_b, gen_helper_gvec_sdot_idx_h }, { gen_helper_gvec_udot_idx_b, gen_helper_gvec_udot_idx_h } }; if (sve_access_check(s)) { - gen_gvec_ool_zzz(s, fns[a->u][a->sz], a->rd, a->rn, a->rm, a->index); + gen_gvec_ool_zzzz(s, fns[a->u][a->sz], a->rd, a->rn, a->rm, + a->ra, a->index); } return true; } diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index 32a4403256..d7ef31915b 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -375,71 +375,76 @@ void HELPER(sve2_sqrdmlsh_d)(void *vd, void *vn, void *vm, * All elements are treated equally, no matter where they are. */ -void HELPER(gvec_sdot_b)(void *vd, void *vn, void *vm, uint32_t desc) +void HELPER(gvec_sdot_b)(void *vd, void *vn, void *vm, void *va, uint32_t desc) { intptr_t i, opr_sz = simd_oprsz(desc); - uint32_t *d = vd; + int32_t *d = vd, *a = va; int8_t *n = vn, *m = vm; for (i = 0; i < opr_sz / 4; ++i) { - d[i] += n[i * 4 + 0] * m[i * 4 + 0] - + n[i * 4 + 1] * m[i * 4 + 1] - + n[i * 4 + 2] * m[i * 4 + 2] - + n[i * 4 + 3] * m[i * 4 + 3]; + d[i] = (a[i] + + n[i * 4 + 0] * m[i * 4 + 0] + + n[i * 4 + 1] * m[i * 4 + 1] + + n[i * 4 + 2] * m[i * 4 + 2] + + n[i * 4 + 3] * m[i * 4 + 3]); } clear_tail(d, opr_sz, simd_maxsz(desc)); } -void HELPER(gvec_udot_b)(void *vd, void *vn, void *vm, uint32_t desc) +void HELPER(gvec_udot_b)(void *vd, void *vn, void *vm, void *va, uint32_t desc) { intptr_t i, opr_sz = simd_oprsz(desc); - uint32_t *d = vd; + uint32_t *d = vd, *a = va; uint8_t *n = vn, *m = vm; for (i = 0; i < opr_sz / 4; ++i) { - d[i] += n[i * 4 + 0] * m[i * 4 + 0] - + n[i * 4 + 1] * m[i * 4 + 1] - + n[i * 4 + 2] * m[i * 4 + 2] - + n[i * 4 + 3] * m[i * 4 + 3]; + d[i] = (a[i] + + n[i * 4 + 0] * m[i * 4 + 0] + + n[i * 4 + 1] * m[i * 4 + 1] + + n[i * 4 + 2] * m[i * 4 + 2] + + n[i * 4 + 3] * m[i * 4 + 3]); } clear_tail(d, opr_sz, simd_maxsz(desc)); } -void HELPER(gvec_sdot_h)(void *vd, void *vn, void *vm, uint32_t desc) +void HELPER(gvec_sdot_h)(void *vd, void *vn, void *vm, void *va, uint32_t desc) { intptr_t i, opr_sz = simd_oprsz(desc); - uint64_t *d = vd; + int64_t *d = vd, *a = va; int16_t *n = vn, *m = vm; for (i = 0; i < opr_sz / 8; ++i) { - d[i] += (int64_t)n[i * 4 + 0] * m[i * 4 + 0] - + (int64_t)n[i * 4 + 1] * m[i * 4 + 1] - + (int64_t)n[i * 4 + 2] * m[i * 4 + 2] - + (int64_t)n[i * 4 + 3] * m[i * 4 + 3]; + d[i] = (a[i] + + (int64_t)n[i * 4 + 0] * m[i * 4 + 0] + + (int64_t)n[i * 4 + 1] * m[i * 4 + 1] + + (int64_t)n[i * 4 + 2] * m[i * 4 + 2] + + (int64_t)n[i * 4 + 3] * m[i * 4 + 3]); } clear_tail(d, opr_sz, simd_maxsz(desc)); } -void HELPER(gvec_udot_h)(void *vd, void *vn, void *vm, uint32_t desc) +void HELPER(gvec_udot_h)(void *vd, void *vn, void *vm, void *va, uint32_t desc) { intptr_t i, opr_sz = simd_oprsz(desc); - uint64_t *d = vd; + uint64_t *d = vd, *a = va; uint16_t *n = vn, *m = vm; for (i = 0; i < opr_sz / 8; ++i) { - d[i] += (uint64_t)n[i * 4 + 0] * m[i * 4 + 0] - + (uint64_t)n[i * 4 + 1] * m[i * 4 + 1] - + (uint64_t)n[i * 4 + 2] * m[i * 4 + 2] - + (uint64_t)n[i * 4 + 3] * m[i * 4 + 3]; + d[i] = (a[i] + + (uint64_t)n[i * 4 + 0] * m[i * 4 + 0] + + (uint64_t)n[i * 4 + 1] * m[i * 4 + 1] + + (uint64_t)n[i * 4 + 2] * m[i * 4 + 2] + + (uint64_t)n[i * 4 + 3] * m[i * 4 + 3]); } clear_tail(d, opr_sz, simd_maxsz(desc)); } -void HELPER(gvec_sdot_idx_b)(void *vd, void *vn, void *vm, uint32_t desc) +void HELPER(gvec_sdot_idx_b)(void *vd, void *vn, void *vm, + void *va, uint32_t desc) { intptr_t i, segend, opr_sz = simd_oprsz(desc), opr_sz_4 = opr_sz / 4; intptr_t index = simd_data(desc); - uint32_t *d = vd; + int32_t *d = vd, *a = va; int8_t *n = vn; int8_t *m_indexed = (int8_t *)vm + index * 4; @@ -455,10 +460,11 @@ void HELPER(gvec_sdot_idx_b)(void *vd, void *vn, void *vm, uint32_t desc) int8_t m3 = m_indexed[i * 4 + 3]; do { - d[i] += n[i * 4 + 0] * m0 - + n[i * 4 + 1] * m1 - + n[i * 4 + 2] * m2 - + n[i * 4 + 3] * m3; + d[i] = (a[i] + + n[i * 4 + 0] * m0 + + n[i * 4 + 1] * m1 + + n[i * 4 + 2] * m2 + + n[i * 4 + 3] * m3); } while (++i < segend); segend = i + 4; } while (i < opr_sz_4); @@ -466,11 +472,12 @@ void HELPER(gvec_sdot_idx_b)(void *vd, void *vn, void *vm, uint32_t desc) clear_tail(d, opr_sz, simd_maxsz(desc)); } -void HELPER(gvec_udot_idx_b)(void *vd, void *vn, void *vm, uint32_t desc) +void HELPER(gvec_udot_idx_b)(void *vd, void *vn, void *vm, + void *va, uint32_t desc) { intptr_t i, segend, opr_sz = simd_oprsz(desc), opr_sz_4 = opr_sz / 4; intptr_t index = simd_data(desc); - uint32_t *d = vd; + uint32_t *d = vd, *a = va; uint8_t *n = vn; uint8_t *m_indexed = (uint8_t *)vm + index * 4; @@ -486,10 +493,11 @@ void HELPER(gvec_udot_idx_b)(void *vd, void *vn, void *vm, uint32_t desc) uint8_t m3 = m_indexed[i * 4 + 3]; do { - d[i] += n[i * 4 + 0] * m0 - + n[i * 4 + 1] * m1 - + n[i * 4 + 2] * m2 - + n[i * 4 + 3] * m3; + d[i] = (a[i] + + n[i * 4 + 0] * m0 + + n[i * 4 + 1] * m1 + + n[i * 4 + 2] * m2 + + n[i * 4 + 3] * m3); } while (++i < segend); segend = i + 4; } while (i < opr_sz_4); @@ -497,11 +505,12 @@ void HELPER(gvec_udot_idx_b)(void *vd, void *vn, void *vm, uint32_t desc) clear_tail(d, opr_sz, simd_maxsz(desc)); } -void HELPER(gvec_sdot_idx_h)(void *vd, void *vn, void *vm, uint32_t desc) +void HELPER(gvec_sdot_idx_h)(void *vd, void *vn, void *vm, + void *va, uint32_t desc) { intptr_t i, opr_sz = simd_oprsz(desc), opr_sz_8 = opr_sz / 8; intptr_t index = simd_data(desc); - uint64_t *d = vd; + int64_t *d = vd, *a = va; int16_t *n = vn; int16_t *m_indexed = (int16_t *)vm + index * 4; @@ -509,14 +518,17 @@ void HELPER(gvec_sdot_idx_h)(void *vd, void *vn, void *vm, uint32_t desc) * Process the entire segment all at once, writing back the results * only after we've consumed all of the inputs. */ - for (i = 0; i < opr_sz_8 ; i += 2) { - uint64_t d0, d1; + for (i = 0; i < opr_sz_8; i += 2) { + int64_t d0, d1; - d0 = n[i * 4 + 0] * (int64_t)m_indexed[i * 4 + 0]; + d0 = a[i + 0]; + d0 += n[i * 4 + 0] * (int64_t)m_indexed[i * 4 + 0]; d0 += n[i * 4 + 1] * (int64_t)m_indexed[i * 4 + 1]; d0 += n[i * 4 + 2] * (int64_t)m_indexed[i * 4 + 2]; d0 += n[i * 4 + 3] * (int64_t)m_indexed[i * 4 + 3]; - d1 = n[i * 4 + 4] * (int64_t)m_indexed[i * 4 + 0]; + + d1 = a[i + 1]; + d1 += n[i * 4 + 4] * (int64_t)m_indexed[i * 4 + 0]; d1 += n[i * 4 + 5] * (int64_t)m_indexed[i * 4 + 1]; d1 += n[i * 4 + 6] * (int64_t)m_indexed[i * 4 + 2]; d1 += n[i * 4 + 7] * (int64_t)m_indexed[i * 4 + 3]; @@ -524,15 +536,15 @@ void HELPER(gvec_sdot_idx_h)(void *vd, void *vn, void *vm, uint32_t desc) d[i + 0] += d0; d[i + 1] += d1; } - clear_tail(d, opr_sz, simd_maxsz(desc)); } -void HELPER(gvec_udot_idx_h)(void *vd, void *vn, void *vm, uint32_t desc) +void HELPER(gvec_udot_idx_h)(void *vd, void *vn, void *vm, + void *va, uint32_t desc) { intptr_t i, opr_sz = simd_oprsz(desc), opr_sz_8 = opr_sz / 8; intptr_t index = simd_data(desc); - uint64_t *d = vd; + uint64_t *d = vd, *a = va; uint16_t *n = vn; uint16_t *m_indexed = (uint16_t *)vm + index * 4; @@ -540,14 +552,17 @@ void HELPER(gvec_udot_idx_h)(void *vd, void *vn, void *vm, uint32_t desc) * Process the entire segment all at once, writing back the results * only after we've consumed all of the inputs. */ - for (i = 0; i < opr_sz_8 ; i += 2) { + for (i = 0; i < opr_sz_8; i += 2) { uint64_t d0, d1; - d0 = n[i * 4 + 0] * (uint64_t)m_indexed[i * 4 + 0]; + d0 = a[i + 0]; + d0 += n[i * 4 + 0] * (uint64_t)m_indexed[i * 4 + 0]; d0 += n[i * 4 + 1] * (uint64_t)m_indexed[i * 4 + 1]; d0 += n[i * 4 + 2] * (uint64_t)m_indexed[i * 4 + 2]; d0 += n[i * 4 + 3] * (uint64_t)m_indexed[i * 4 + 3]; - d1 = n[i * 4 + 4] * (uint64_t)m_indexed[i * 4 + 0]; + + d1 = a[i + 1]; + d1 += n[i * 4 + 4] * (uint64_t)m_indexed[i * 4 + 0]; d1 += n[i * 4 + 5] * (uint64_t)m_indexed[i * 4 + 1]; d1 += n[i * 4 + 6] * (uint64_t)m_indexed[i * 4 + 2]; d1 += n[i * 4 + 7] * (uint64_t)m_indexed[i * 4 + 3]; @@ -555,7 +570,6 @@ void HELPER(gvec_udot_idx_h)(void *vd, void *vn, void *vm, uint32_t desc) d[i + 0] += d0; d[i + 1] += d1; } - clear_tail(d, opr_sz, simd_maxsz(desc)); } diff --git a/target/arm/translate-neon.c.inc b/target/arm/translate-neon.c.inc index 4d1a292981..7efe3d9556 100644 --- a/target/arm/translate-neon.c.inc +++ b/target/arm/translate-neon.c.inc @@ -240,7 +240,7 @@ static bool trans_VCADD(DisasContext *s, arg_VCADD *a) static bool trans_VDOT(DisasContext *s, arg_VDOT *a) { int opr_sz; - gen_helper_gvec_3 *fn_gvec; + gen_helper_gvec_4 *fn_gvec; if (!dc_isar_feature(aa32_dp, s)) { return false; @@ -262,9 +262,10 @@ static bool trans_VDOT(DisasContext *s, arg_VDOT *a) opr_sz = (1 + a->q) * 8; fn_gvec = a->u ? gen_helper_gvec_udot_b : gen_helper_gvec_sdot_b; - tcg_gen_gvec_3_ool(vfp_reg_offset(1, a->vd), + tcg_gen_gvec_4_ool(vfp_reg_offset(1, a->vd), vfp_reg_offset(1, a->vn), vfp_reg_offset(1, a->vm), + vfp_reg_offset(1, a->vd), opr_sz, opr_sz, 0, fn_gvec); return true; } @@ -342,7 +343,7 @@ static bool trans_VCMLA_scalar(DisasContext *s, arg_VCMLA_scalar *a) static bool trans_VDOT_scalar(DisasContext *s, arg_VDOT_scalar *a) { - gen_helper_gvec_3 *fn_gvec; + gen_helper_gvec_4 *fn_gvec; int opr_sz; TCGv_ptr fpst; @@ -367,9 +368,10 @@ static bool trans_VDOT_scalar(DisasContext *s, arg_VDOT_scalar *a) fn_gvec = a->u ? gen_helper_gvec_udot_idx_b : gen_helper_gvec_sdot_idx_b; opr_sz = (1 + a->q) * 8; fpst = fpstatus_ptr(FPST_STD); - tcg_gen_gvec_3_ool(vfp_reg_offset(1, a->vd), + tcg_gen_gvec_4_ool(vfp_reg_offset(1, a->vd), vfp_reg_offset(1, a->vn), vfp_reg_offset(1, a->rm), + vfp_reg_offset(1, a->vd), opr_sz, opr_sz, a->index, fn_gvec); tcg_temp_free_ptr(fpst); return true; From patchwork Fri Sep 18 18:37:22 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 273329 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 62DCEC43464 for ; Fri, 18 Sep 2020 19:37:13 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id DB50922208 for ; Fri, 18 Sep 2020 19:37:12 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="uZVQy1Vi" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org DB50922208 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:38468 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJMC0-0004nx-0s for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 15:37:12 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48560) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLHl-0000nC-Qx for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:39:08 -0400 Received: from mail-pj1-x1042.google.com ([2607:f8b0:4864:20::1042]:38630) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLHj-0007Cz-DS for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:39:05 -0400 Received: by mail-pj1-x1042.google.com with SMTP id u3so3459832pjr.3 for ; Fri, 18 Sep 2020 11:39:02 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=58nX2w8XZEafvNzYoIWAzJayUresjRPt/UjUBIAFJac=; b=uZVQy1ViuyfYDbMPPE7YvK4Y9+FB5XDD/dFSgvzno4pQ5vlab2TJQkxvIJZptzjoN9 x5nxeV97veXDh+naEomoLsGqabsVAU/bDB2dP9NXrNRvqf4jsIjAx4EAA2w1f+Z7xQug r/epfqTKe9TyGyPb3yd555gW/5tsozhERJbI2El8w8QMvbk6OoPHZR/5dRciN7aZfi8D K7l1EZs/ho1TeWbEmYB8NLSdebckzzDTW7cE6txm4CZq3Lrl4EgzzSrJMsjbht32Z8rz N5Lm4WS8xrYPtqwxoNHLSWvItNGedl63alkON14tUU85uzgjRkxaenB1QMSwddc1/euY e9Jw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=58nX2w8XZEafvNzYoIWAzJayUresjRPt/UjUBIAFJac=; b=QDvEwqmHtIwAUcLIDIXmBSGk4QuHUO+4bFKjwBk8gwu+wAgaXq8WtllmkMwP5MlqoE DcIp3QF967+oK/QzSKVXnyI98QOTIdcQQvfZu649Q2W5wLMRWHsUlDMS2Ad6P5MoVlBx VG5354NFVaerGMZ1GxIQZCl28oRC6Y1lHxSMpScVLHBIbUOFC+hRx1zA5nCAIWcMc9n/ 4W6fcIejeO7Z7EifyOHtq93Paj4Kk17T+aYOZKlR4JAxDoofIzGlVhrWDi6A2BgHjyED awWWsYEK3WpFJE3/pVOMrQQ8kuPveQe2YGQDRKTUejkvxFTzX9dsUmVtrCBOrq/eJLuC PtZg== X-Gm-Message-State: AOAM531zPxR+VTiH1o7Vm/7lfq1veVtiqPXfq1p9NVGrFWgRim2lVSFE ueDACfTzC56uwZSAUJU5zAnDOvkdAd8b/A== X-Google-Smtp-Source: ABdhPJyiXMWdG03k5SQEj8GkPmy8d2JwryT/Hbsf1z8+jTPBtXS7VW5LpAeGYhSVMAPxIsnqSTA8Og== X-Received: by 2002:a17:902:b206:b029:d2:e7b:2e65 with SMTP id t6-20020a170902b206b02900d20e7b2e65mr1902901plr.18.1600454341115; Fri, 18 Sep 2020 11:39:01 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.38.59 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:39:00 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 52/81] target/arm: Pass separate addend to FCMLA helpers Date: Fri, 18 Sep 2020 11:37:22 -0700 Message-Id: <20200918183751.2787647-53-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::1042; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x1042.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" For SVE, we potentially have a 4th argument coming from the movprfx instruction. Currently we do not optimize movprfx, so the problem is not visible. Signed-off-by: Richard Henderson --- target/arm/helper.h | 20 ++++++------- target/arm/translate-a64.c | 27 ++++++++++++++---- target/arm/translate-sve.c | 5 ++-- target/arm/vec_helper.c | 50 +++++++++++++-------------------- target/arm/translate-neon.c.inc | 10 ++++--- 5 files changed, 61 insertions(+), 51 deletions(-) diff --git a/target/arm/helper.h b/target/arm/helper.h index 97222bd256..8b3dc9b4e3 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -616,16 +616,16 @@ DEF_HELPER_FLAGS_5(gvec_fcadds, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(gvec_fcaddd, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) -DEF_HELPER_FLAGS_5(gvec_fcmlah, TCG_CALL_NO_RWG, - void, ptr, ptr, ptr, ptr, i32) -DEF_HELPER_FLAGS_5(gvec_fcmlah_idx, TCG_CALL_NO_RWG, - void, ptr, ptr, ptr, ptr, i32) -DEF_HELPER_FLAGS_5(gvec_fcmlas, TCG_CALL_NO_RWG, - void, ptr, ptr, ptr, ptr, i32) -DEF_HELPER_FLAGS_5(gvec_fcmlas_idx, TCG_CALL_NO_RWG, - void, ptr, ptr, ptr, ptr, i32) -DEF_HELPER_FLAGS_5(gvec_fcmlad, TCG_CALL_NO_RWG, - void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(gvec_fcmlah, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(gvec_fcmlah_idx, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(gvec_fcmlas, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(gvec_fcmlas_idx, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(gvec_fcmlad, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(neon_paddh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(neon_pmaxh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c index 1a9251db67..0a463058f0 100644 --- a/target/arm/translate-a64.c +++ b/target/arm/translate-a64.c @@ -703,6 +703,22 @@ static void gen_gvec_op4_ool(DisasContext *s, bool is_q, int rd, int rn, is_q ? 16 : 8, vec_full_reg_size(s), data, fn); } +/* Expand a 4-operand + fpstatus pointer + simd data value operation using + * an out-of-line helper. + */ +static void gen_gvec_op4_fpst(DisasContext *s, bool is_q, int rd, int rn, + int rm, int ra, bool is_fp16, int data, + gen_helper_gvec_4_ptr *fn) +{ + TCGv_ptr fpst = fpstatus_ptr(is_fp16 ? FPST_FPCR_F16 : FPST_FPCR); + tcg_gen_gvec_4_ptr(vec_full_reg_offset(s, rd), + vec_full_reg_offset(s, rn), + vec_full_reg_offset(s, rm), + vec_full_reg_offset(s, ra), fpst, + is_q ? 16 : 8, vec_full_reg_size(s), data, fn); + tcg_temp_free_ptr(fpst); +} + /* Set ZF and NF based on a 64 bit result. This is alas fiddlier * than the 32 bit equivalent. */ @@ -12224,15 +12240,15 @@ static void disas_simd_three_reg_same_extra(DisasContext *s, uint32_t insn) rot = extract32(opcode, 0, 2); switch (size) { case 1: - gen_gvec_op3_fpst(s, is_q, rd, rn, rm, true, rot, + gen_gvec_op4_fpst(s, is_q, rd, rn, rm, rd, true, rot, gen_helper_gvec_fcmlah); break; case 2: - gen_gvec_op3_fpst(s, is_q, rd, rn, rm, false, rot, + gen_gvec_op4_fpst(s, is_q, rd, rn, rm, rd, false, rot, gen_helper_gvec_fcmlas); break; case 3: - gen_gvec_op3_fpst(s, is_q, rd, rn, rm, false, rot, + gen_gvec_op4_fpst(s, is_q, rd, rn, rm, rd, false, rot, gen_helper_gvec_fcmlad); break; default: @@ -13483,9 +13499,10 @@ static void disas_simd_indexed(DisasContext *s, uint32_t insn) { int rot = extract32(insn, 13, 2); int data = (index << 2) | rot; - tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, rd), + tcg_gen_gvec_4_ptr(vec_full_reg_offset(s, rd), vec_full_reg_offset(s, rn), - vec_full_reg_offset(s, rm), fpst, + vec_full_reg_offset(s, rm), + vec_full_reg_offset(s, rd), fpst, is_q ? 16 : 8, vec_full_reg_size(s), data, size == MO_64 ? gen_helper_gvec_fcmlas_idx diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 66303dac54..10200f0207 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -4387,7 +4387,7 @@ static bool trans_FCMLA_zpzzz(DisasContext *s, arg_FCMLA_zpzzz *a) static bool trans_FCMLA_zzxz(DisasContext *s, arg_FCMLA_zzxz *a) { - static gen_helper_gvec_3_ptr * const fns[2] = { + static gen_helper_gvec_4_ptr * const fns[2] = { gen_helper_gvec_fcmlah_idx, gen_helper_gvec_fcmlas_idx, }; @@ -4397,9 +4397,10 @@ static bool trans_FCMLA_zzxz(DisasContext *s, arg_FCMLA_zzxz *a) if (sve_access_check(s)) { unsigned vsz = vec_full_reg_size(s); TCGv_ptr status = fpstatus_ptr(a->esz == MO_16 ? FPST_FPCR_F16 : FPST_FPCR); - tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, a->rd), + tcg_gen_gvec_4_ptr(vec_full_reg_offset(s, a->rd), vec_full_reg_offset(s, a->rn), vec_full_reg_offset(s, a->rm), + vec_full_reg_offset(s, a->ra), status, vsz, vsz, a->index * 4 + a->rot, fns[a->esz - 1]); diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index d7ef31915b..82e12b51c1 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -657,13 +657,11 @@ void HELPER(gvec_fcaddd)(void *vd, void *vn, void *vm, clear_tail(d, opr_sz, simd_maxsz(desc)); } -void HELPER(gvec_fcmlah)(void *vd, void *vn, void *vm, +void HELPER(gvec_fcmlah)(void *vd, void *vn, void *vm, void *va, void *vfpst, uint32_t desc) { uintptr_t opr_sz = simd_oprsz(desc); - float16 *d = vd; - float16 *n = vn; - float16 *m = vm; + float16 *d = vd, *n = vn, *m = vm, *a = va; float_status *fpst = vfpst; intptr_t flip = extract32(desc, SIMD_DATA_SHIFT, 1); uint32_t neg_imag = extract32(desc, SIMD_DATA_SHIFT + 1, 1); @@ -680,19 +678,17 @@ void HELPER(gvec_fcmlah)(void *vd, void *vn, void *vm, float16 e4 = e2; float16 e3 = m[H2(i + 1 - flip)] ^ neg_imag; - d[H2(i)] = float16_muladd(e2, e1, d[H2(i)], 0, fpst); - d[H2(i + 1)] = float16_muladd(e4, e3, d[H2(i + 1)], 0, fpst); + d[H2(i)] = float16_muladd(e2, e1, a[H2(i)], 0, fpst); + d[H2(i + 1)] = float16_muladd(e4, e3, a[H2(i + 1)], 0, fpst); } clear_tail(d, opr_sz, simd_maxsz(desc)); } -void HELPER(gvec_fcmlah_idx)(void *vd, void *vn, void *vm, +void HELPER(gvec_fcmlah_idx)(void *vd, void *vn, void *vm, void *va, void *vfpst, uint32_t desc) { uintptr_t opr_sz = simd_oprsz(desc); - float16 *d = vd; - float16 *n = vn; - float16 *m = vm; + float16 *d = vd, *n = vn, *m = vm, *a = va; float_status *fpst = vfpst; intptr_t flip = extract32(desc, SIMD_DATA_SHIFT, 1); uint32_t neg_imag = extract32(desc, SIMD_DATA_SHIFT + 1, 1); @@ -716,20 +712,18 @@ void HELPER(gvec_fcmlah_idx)(void *vd, void *vn, void *vm, float16 e2 = n[H2(j + flip)]; float16 e4 = e2; - d[H2(j)] = float16_muladd(e2, e1, d[H2(j)], 0, fpst); - d[H2(j + 1)] = float16_muladd(e4, e3, d[H2(j + 1)], 0, fpst); + d[H2(j)] = float16_muladd(e2, e1, a[H2(j)], 0, fpst); + d[H2(j + 1)] = float16_muladd(e4, e3, a[H2(j + 1)], 0, fpst); } } clear_tail(d, opr_sz, simd_maxsz(desc)); } -void HELPER(gvec_fcmlas)(void *vd, void *vn, void *vm, +void HELPER(gvec_fcmlas)(void *vd, void *vn, void *vm, void *va, void *vfpst, uint32_t desc) { uintptr_t opr_sz = simd_oprsz(desc); - float32 *d = vd; - float32 *n = vn; - float32 *m = vm; + float32 *d = vd, *n = vn, *m = vm, *a = va; float_status *fpst = vfpst; intptr_t flip = extract32(desc, SIMD_DATA_SHIFT, 1); uint32_t neg_imag = extract32(desc, SIMD_DATA_SHIFT + 1, 1); @@ -746,19 +740,17 @@ void HELPER(gvec_fcmlas)(void *vd, void *vn, void *vm, float32 e4 = e2; float32 e3 = m[H4(i + 1 - flip)] ^ neg_imag; - d[H4(i)] = float32_muladd(e2, e1, d[H4(i)], 0, fpst); - d[H4(i + 1)] = float32_muladd(e4, e3, d[H4(i + 1)], 0, fpst); + d[H4(i)] = float32_muladd(e2, e1, a[H4(i)], 0, fpst); + d[H4(i + 1)] = float32_muladd(e4, e3, a[H4(i + 1)], 0, fpst); } clear_tail(d, opr_sz, simd_maxsz(desc)); } -void HELPER(gvec_fcmlas_idx)(void *vd, void *vn, void *vm, +void HELPER(gvec_fcmlas_idx)(void *vd, void *vn, void *vm, void *va, void *vfpst, uint32_t desc) { uintptr_t opr_sz = simd_oprsz(desc); - float32 *d = vd; - float32 *n = vn; - float32 *m = vm; + float32 *d = vd, *n = vn, *m = vm, *a = va; float_status *fpst = vfpst; intptr_t flip = extract32(desc, SIMD_DATA_SHIFT, 1); uint32_t neg_imag = extract32(desc, SIMD_DATA_SHIFT + 1, 1); @@ -782,20 +774,18 @@ void HELPER(gvec_fcmlas_idx)(void *vd, void *vn, void *vm, float32 e2 = n[H4(j + flip)]; float32 e4 = e2; - d[H4(j)] = float32_muladd(e2, e1, d[H4(j)], 0, fpst); - d[H4(j + 1)] = float32_muladd(e4, e3, d[H4(j + 1)], 0, fpst); + d[H4(j)] = float32_muladd(e2, e1, a[H4(j)], 0, fpst); + d[H4(j + 1)] = float32_muladd(e4, e3, a[H4(j + 1)], 0, fpst); } } clear_tail(d, opr_sz, simd_maxsz(desc)); } -void HELPER(gvec_fcmlad)(void *vd, void *vn, void *vm, +void HELPER(gvec_fcmlad)(void *vd, void *vn, void *vm, void *va, void *vfpst, uint32_t desc) { uintptr_t opr_sz = simd_oprsz(desc); - float64 *d = vd; - float64 *n = vn; - float64 *m = vm; + float64 *d = vd, *n = vn, *m = vm, *a = va; float_status *fpst = vfpst; intptr_t flip = extract32(desc, SIMD_DATA_SHIFT, 1); uint64_t neg_imag = extract32(desc, SIMD_DATA_SHIFT + 1, 1); @@ -812,8 +802,8 @@ void HELPER(gvec_fcmlad)(void *vd, void *vn, void *vm, float64 e4 = e2; float64 e3 = m[i + 1 - flip] ^ neg_imag; - d[i] = float64_muladd(e2, e1, d[i], 0, fpst); - d[i + 1] = float64_muladd(e4, e3, d[i + 1], 0, fpst); + d[i] = float64_muladd(e2, e1, a[i], 0, fpst); + d[i + 1] = float64_muladd(e4, e3, a[i + 1], 0, fpst); } clear_tail(d, opr_sz, simd_maxsz(desc)); } diff --git a/target/arm/translate-neon.c.inc b/target/arm/translate-neon.c.inc index 7efe3d9556..499ebeb325 100644 --- a/target/arm/translate-neon.c.inc +++ b/target/arm/translate-neon.c.inc @@ -165,7 +165,7 @@ static bool trans_VCMLA(DisasContext *s, arg_VCMLA *a) { int opr_sz; TCGv_ptr fpst; - gen_helper_gvec_3_ptr *fn_gvec_ptr; + gen_helper_gvec_4_ptr *fn_gvec_ptr; if (!dc_isar_feature(aa32_vcma, s) || (a->size == MO_16 && !dc_isar_feature(aa32_fp16_arith, s))) { @@ -190,9 +190,10 @@ static bool trans_VCMLA(DisasContext *s, arg_VCMLA *a) fpst = fpstatus_ptr(a->size == MO_16 ? FPST_STD_F16 : FPST_STD); fn_gvec_ptr = (a->size == MO_16) ? gen_helper_gvec_fcmlah : gen_helper_gvec_fcmlas; - tcg_gen_gvec_3_ptr(vfp_reg_offset(1, a->vd), + tcg_gen_gvec_4_ptr(vfp_reg_offset(1, a->vd), vfp_reg_offset(1, a->vn), vfp_reg_offset(1, a->vm), + vfp_reg_offset(1, a->vd), fpst, opr_sz, opr_sz, a->rot, fn_gvec_ptr); tcg_temp_free_ptr(fpst); @@ -303,7 +304,7 @@ static bool trans_VFML(DisasContext *s, arg_VFML *a) static bool trans_VCMLA_scalar(DisasContext *s, arg_VCMLA_scalar *a) { - gen_helper_gvec_3_ptr *fn_gvec_ptr; + gen_helper_gvec_4_ptr *fn_gvec_ptr; int opr_sz; TCGv_ptr fpst; @@ -332,9 +333,10 @@ static bool trans_VCMLA_scalar(DisasContext *s, arg_VCMLA_scalar *a) gen_helper_gvec_fcmlah_idx : gen_helper_gvec_fcmlas_idx; opr_sz = (1 + a->q) * 8; fpst = fpstatus_ptr(a->size == MO_16 ? FPST_STD_F16 : FPST_STD); - tcg_gen_gvec_3_ptr(vfp_reg_offset(1, a->vd), + tcg_gen_gvec_4_ptr(vfp_reg_offset(1, a->vd), vfp_reg_offset(1, a->vn), vfp_reg_offset(1, a->vm), + vfp_reg_offset(1, a->vd), fpst, opr_sz, opr_sz, (a->index << 2) | a->rot, fn_gvec_ptr); tcg_temp_free_ptr(fpst); From patchwork Fri Sep 18 18:37:23 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 305001 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 833F7C43465 for ; Fri, 18 Sep 2020 19:40:12 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 20E0D21741 for ; Fri, 18 Sep 2020 19:40:12 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="CuPysYGj" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 20E0D21741 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:46902 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJMEt-0008JO-8b for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 15:40:11 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48636) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLHv-0000p4-NF for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:39:15 -0400 Received: from mail-pj1-x1042.google.com ([2607:f8b0:4864:20::1042]:53497) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLHk-0007D5-Bu for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:39:14 -0400 Received: by mail-pj1-x1042.google.com with SMTP id t7so3625173pjd.3 for ; Fri, 18 Sep 2020 11:39:03 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=MepNhYFr1+IOeR1yUlz4CllcCZZho0JnjR8wxuWPCCs=; b=CuPysYGjqEO0eaJBqR/yOj+GBnaAP/WwJNo+v8RHGa2xgQDUWLontjg2vzXXX4Bmp5 5GR7xTMfYdBYN6Yvt552xPVUc0miD2mcvLgpyUakmzIW9orUtzXGluerhRs8lgCFlxaE 8P/DHU23w13gWf+gQJzVEhStdVat7RYr930cTW+3OzUeiU18PkwvY4VGmEn/u/U3aUW7 vBIm4bwswTiMpf0icdi7Fe0CVbtRFfrZDB/p4NqnAU+Pb+x9xKsIEMA5TVZzJ52KB+Lh edjpa980eGtxyP4KfKVG+mLpdH2+jhZNX8X2+4j9DF2ahBM6ZUF1pPj6omN/gdM1INEj cPHA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=MepNhYFr1+IOeR1yUlz4CllcCZZho0JnjR8wxuWPCCs=; b=nqcohE1kY/75G/MlzmbjkpUgR8F90IO3UjuimmF9dzDxDgMsEynbzt3cHjkJhkoYgY 9IIG+oWWBXb3aC1Sn3BMJtIsJy2n2O8U9L6II3pynh2Fn+8QCduG13Et5Ne85/Z0xyiA iti94acGVOTk2e13Oi8oe871eA8fScBryQIDHtkassEo1DdmLYh1m64HviJmUkaE518m 0uJRllr4RfLUMOe8Kdn6Qtd/Yb8RDXHXv5ly1e+yzUPR7BLiUMqR3CUOQJrP7eRNT9Kc Ktaep5MHStJNLjbXhoE9XnGIaKWpZBnpzSIma3x9bSd8CxjE+x7VByZ8a2dvBjl8L+yB EhMQ== X-Gm-Message-State: AOAM532FDr/hGKiGKkXbBQLTxQ2XeSeph8SvLpJakm0p+fdKI5OMDlRA gbMoHUwIrRWnaUTvCdfvwxUL6hRcL06gMA== X-Google-Smtp-Source: ABdhPJxkMej0qKoQvMai6i/iaVC34KE4nWAwf4sqc+3izr99g1+GjI3KNWPL0Ko0cfOVK+MSSunZoQ== X-Received: by 2002:a17:902:7608:b029:d0:cbe1:e70e with SMTP id k8-20020a1709027608b02900d0cbe1e70emr34642367pll.28.1600454342681; Fri, 18 Sep 2020 11:39:02 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.39.01 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:39:01 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 53/81] target/arm: Split out formats for 2 vectors + 1 index Date: Fri, 18 Sep 2020 11:37:23 -0700 Message-Id: <20200918183751.2787647-54-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::1042; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x1042.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Currently only used by FMUL, but will shortly be used more. Signed-off-by: Richard Henderson --- target/arm/sve.decode | 16 ++++++++++++---- 1 file changed, 12 insertions(+), 4 deletions(-) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 5815ba9b1c..a121e55f07 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -67,6 +67,7 @@ &rri_esz rd rn imm esz &rrri_esz rd rn rm imm esz &rrr_esz rd rn rm esz +&rrx_esz rd rn rm index esz &rpr_esz rd pg rn esz &rpr_s rd pg rn s &rprr_s rd pg rn rm s @@ -245,6 +246,14 @@ @rpri_scatter_store ....... msz:2 .. imm:5 ... pg:3 rn:5 rd:5 \ &rpri_scatter_store +# Two registers and a scalar by index +@rrx_h ........ 0. . .. rm:3 ...... rn:5 rd:5 \ + &rrx_esz index=%index3_22_19 esz=1 +@rrx_s ........ 10 . index:2 rm:3 ...... rn:5 rd:5 \ + &rrx_esz esz=2 +@rrx_d ........ 11 . index:1 rm:4 ...... rn:5 rd:5 \ + &rrx_esz esz=3 + ########################################################################### # Instruction patterns. Grouped according to the SVE encodingindex.xhtml. @@ -792,10 +801,9 @@ FMLA_zzxz 01100100 111 index:1 rm:4 00000 sub:1 rn:5 rd:5 \ ### SVE FP Multiply Indexed Group # SVE floating-point multiply (indexed) -FMUL_zzx 01100100 0.1 .. rm:3 001000 rn:5 rd:5 \ - index=%index3_22_19 esz=1 -FMUL_zzx 01100100 101 index:2 rm:3 001000 rn:5 rd:5 esz=2 -FMUL_zzx 01100100 111 index:1 rm:4 001000 rn:5 rd:5 esz=3 +FMUL_zzx 01100100 .. 1 ..... 001000 ..... ..... @rrx_h +FMUL_zzx 01100100 .. 1 ..... 001000 ..... ..... @rrx_s +FMUL_zzx 01100100 .. 1 ..... 001000 ..... ..... @rrx_d ### SVE FP Fast Reduction Group From patchwork Fri Sep 18 18:37:24 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 305009 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D17D4C43463 for ; Fri, 18 Sep 2020 19:26:56 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 46156208DB for ; Fri, 18 Sep 2020 19:26:56 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="Ql8amN+x" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 46156208DB Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:40408 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJM23-0001ZK-Ce for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 15:26:55 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48766) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLI7-0000uI-9o for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:39:27 -0400 Received: from mail-pj1-x1044.google.com ([2607:f8b0:4864:20::1044]:33132) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLHo-0007DQ-7J for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:39:23 -0400 Received: by mail-pj1-x1044.google.com with SMTP id md22so4604441pjb.0 for ; Fri, 18 Sep 2020 11:39:05 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=tpgwU+O2Pyt/u4rQ1PaoM0R+miZyXm0Qzr3tAtW7L6c=; b=Ql8amN+xWtuJhkPjoOXWPeOMnsejcjZJtsBpVEVjwfAGbata+yob1JQv16IaGld45x Qb3de2irou0ZZL+WR7e/y6mrs/9/VwKoeW4rePjalii5bEgApG7sIiLK3d6rD1fpMGRU dWiC1kT5KcbdyVihov9fo2l2HKMMWZW7OzcK76mAhV0zV1yrqMYHRSp0wxCxOTs+5G+w EKtui3HcZX3aT6mo4CzFkleFL7DesUWAtfTnnx1BPgbXZIlHfSoIw//v6bAc5TQqdxK1 uBWudgFEHTpSKxKvKfeViiXqa3yygeFxIS1erD95qjcN2Hxt7eBxLKNq4Oq/tL6Y/bT1 u0pg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=tpgwU+O2Pyt/u4rQ1PaoM0R+miZyXm0Qzr3tAtW7L6c=; b=bKpP+Kxr+PFYTQmiIOXjLx7caQNyYDPbisqyycaIe3T5xmZHV6wdsSwD4AopZBIgBp 86q3t4ZUzGiGuaw5N3Oqw+Ia6qUB9BQJfZaDaWjB7fEzuESO6ERb43Bfipu0sdqtvp7K O0ivORgVu7E3Oz+hDwq1WJiNfCE63bzfMe9vVDZxMFQSmATggk90VGcmbcJ/rWYfAC4u 9Xw+rejLDM5VT4Wit1VvHeeY8+/MRDJWBtHUk4QnL5kZWG/riaP5FmBk2m9AlRSJ9kRQ mGJHvTC9rpFF//d8axfVqipnha8QezPYR7MmpP/qanLZZoMA1BRDAyXzYVg8bT5xrHnO KkKA== X-Gm-Message-State: AOAM530l9Mjem5qLkv2smb32fb0HCvvfeenOUyyeHzdovUtsQ7tl9db9 sphlMcOmkg8ODHNBvxQr3ge8qkgrduNeDw== X-Google-Smtp-Source: ABdhPJwaagueNwgQmf5LdgaXYAVZBEN80kU4itwiVW8562wFKNB+4GAo+E7B57evy8cSBrUSlxBV6g== X-Received: by 2002:a17:90a:ea09:: with SMTP id w9mr13600298pjy.203.1600454344420; Fri, 18 Sep 2020 11:39:04 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.39.02 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:39:03 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 54/81] target/arm: Split out formats for 3 vectors + 1 index Date: Fri, 18 Sep 2020 11:37:24 -0700 Message-Id: <20200918183751.2787647-55-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::1044; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x1044.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Used by FMLA and DOT, but will shortly be used more. Split FMLA from FMLS to avoid an extra sub field; similarly for SDOT from UDOT. Signed-off-by: Richard Henderson --- target/arm/sve.decode | 29 +++++++++++++++++++---------- target/arm/translate-sve.c | 38 ++++++++++++++++++++++++++++---------- 2 files changed, 47 insertions(+), 20 deletions(-) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index a121e55f07..ea6ec5f198 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -73,6 +73,7 @@ &rprr_s rd pg rn rm s &rprr_esz rd pg rn rm esz &rrrr_esz rd ra rn rm esz +&rrxr_esz rd rn rm ra index esz &rprrr_esz rd pg rn rm ra esz &rpri_esz rd pg rn imm esz &ptrue rd esz pat s @@ -254,6 +255,14 @@ @rrx_d ........ 11 . index:1 rm:4 ...... rn:5 rd:5 \ &rrx_esz esz=3 +# Three registers and a scalar by index +@rrxr_h ........ 0. . .. rm:3 ...... rn:5 rd:5 \ + &rrxr_esz ra=%reg_movprfx index=%index3_22_19 esz=1 +@rrxr_s ........ 10 . index:2 rm:3 ...... rn:5 rd:5 \ + &rrxr_esz ra=%reg_movprfx esz=2 +@rrxr_d ........ 11 . index:1 rm:4 ...... rn:5 rd:5 \ + &rrxr_esz ra=%reg_movprfx esz=3 + ########################################################################### # Instruction patterns. Grouped according to the SVE encodingindex.xhtml. @@ -769,10 +778,10 @@ DOT_zzzz 01000100 1 sz:1 0 rm:5 00000 u:1 rn:5 rd:5 \ ra=%reg_movprfx # SVE integer dot product (indexed) -DOT_zzxw 01000100 101 index:2 rm:3 00000 u:1 rn:5 rd:5 \ - sz=0 ra=%reg_movprfx -DOT_zzxw 01000100 111 index:1 rm:4 00000 u:1 rn:5 rd:5 \ - sz=1 ra=%reg_movprfx +SDOT_zzxw_s 01000100 .. 1 ..... 000000 ..... ..... @rrxr_s +SDOT_zzxw_d 01000100 .. 1 ..... 000000 ..... ..... @rrxr_d +UDOT_zzxw_s 01000100 .. 1 ..... 000001 ..... ..... @rrxr_s +UDOT_zzxw_d 01000100 .. 1 ..... 000001 ..... ..... @rrxr_d # SVE floating-point complex add (predicated) FCADD 01100100 esz:2 00000 rot:1 100 pg:3 rm:5 rd:5 \ @@ -791,12 +800,12 @@ FCMLA_zzxz 01100100 11 1 index:1 rm:4 0001 rot:2 rn:5 rd:5 \ ### SVE FP Multiply-Add Indexed Group # SVE floating-point multiply-add (indexed) -FMLA_zzxz 01100100 0.1 .. rm:3 00000 sub:1 rn:5 rd:5 \ - ra=%reg_movprfx index=%index3_22_19 esz=1 -FMLA_zzxz 01100100 101 index:2 rm:3 00000 sub:1 rn:5 rd:5 \ - ra=%reg_movprfx esz=2 -FMLA_zzxz 01100100 111 index:1 rm:4 00000 sub:1 rn:5 rd:5 \ - ra=%reg_movprfx esz=3 +FMLA_zzxz 01100100 .. 1 ..... 000000 ..... ..... @rrxr_h +FMLA_zzxz 01100100 .. 1 ..... 000000 ..... ..... @rrxr_s +FMLA_zzxz 01100100 .. 1 ..... 000000 ..... ..... @rrxr_d +FMLS_zzxz 01100100 .. 1 ..... 000001 ..... ..... @rrxr_h +FMLS_zzxz 01100100 .. 1 ..... 000001 ..... ..... @rrxr_s +FMLS_zzxz 01100100 .. 1 ..... 000001 ..... ..... @rrxr_d ### SVE FP Multiply Indexed Group diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 10200f0207..7c50298302 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3817,26 +3817,34 @@ static bool trans_DOT_zzzz(DisasContext *s, arg_DOT_zzzz *a) return true; } -static bool trans_DOT_zzxw(DisasContext *s, arg_DOT_zzxw *a) +static bool do_zzxz_ool(DisasContext *s, arg_rrxr_esz *a, + gen_helper_gvec_4 *fn) { - static gen_helper_gvec_4 * const fns[2][2] = { - { gen_helper_gvec_sdot_idx_b, gen_helper_gvec_sdot_idx_h }, - { gen_helper_gvec_udot_idx_b, gen_helper_gvec_udot_idx_h } - }; - + if (fn == NULL) { + return false; + } if (sve_access_check(s)) { - gen_gvec_ool_zzzz(s, fns[a->u][a->sz], a->rd, a->rn, a->rm, - a->ra, a->index); + gen_gvec_ool_zzzz(s, fn, a->rd, a->rn, a->rm, a->ra, a->index); } return true; } +#define DO_RRXR(NAME, FUNC) \ + static bool NAME(DisasContext *s, arg_rrxr_esz *a) \ + { return do_zzxz_ool(s, a, FUNC); } + +DO_RRXR(trans_SDOT_zzxw_s, gen_helper_gvec_sdot_idx_b) +DO_RRXR(trans_SDOT_zzxw_d, gen_helper_gvec_sdot_idx_h) +DO_RRXR(trans_UDOT_zzxw_s, gen_helper_gvec_udot_idx_b) +DO_RRXR(trans_UDOT_zzxw_d, gen_helper_gvec_udot_idx_h) + +#undef DO_RRXR /* *** SVE Floating Point Multiply-Add Indexed Group */ -static bool trans_FMLA_zzxz(DisasContext *s, arg_FMLA_zzxz *a) +static bool do_FMLA_zzxz(DisasContext *s, arg_rrxr_esz *a, bool sub) { static gen_helper_gvec_4_ptr * const fns[3] = { gen_helper_gvec_fmla_idx_h, @@ -3851,13 +3859,23 @@ static bool trans_FMLA_zzxz(DisasContext *s, arg_FMLA_zzxz *a) vec_full_reg_offset(s, a->rn), vec_full_reg_offset(s, a->rm), vec_full_reg_offset(s, a->ra), - status, vsz, vsz, (a->index << 1) | a->sub, + status, vsz, vsz, (a->index << 1) | sub, fns[a->esz - 1]); tcg_temp_free_ptr(status); } return true; } +static bool trans_FMLA_zzxz(DisasContext *s, arg_FMLA_zzxz *a) +{ + return do_FMLA_zzxz(s, a, false); +} + +static bool trans_FMLS_zzxz(DisasContext *s, arg_FMLA_zzxz *a) +{ + return do_FMLA_zzxz(s, a, true); +} + /* *** SVE Floating Point Multiply Indexed Group */ From patchwork Fri Sep 18 18:37:25 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 304999 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3C095C43464 for ; Fri, 18 Sep 2020 19:44:32 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id B57E121D42 for ; Fri, 18 Sep 2020 19:44:31 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="fXoEylwy" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org B57E121D42 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:54984 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJMJ4-0003O4-Kb for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 15:44:30 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48760) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLI7-0000to-6K for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:39:27 -0400 Received: from mail-pg1-x543.google.com ([2607:f8b0:4864:20::543]:41044) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLHo-0007Dd-AU for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:39:23 -0400 Received: by mail-pg1-x543.google.com with SMTP id y1so3976513pgk.8 for ; Fri, 18 Sep 2020 11:39:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=ubmLqueedmlYGL75RBWrWqOt2/2zwwVKDDZSU+bMx/Y=; b=fXoEylwy1B37nMDqxlmcELoNNiE5a9We1itCOrLEgA6D1pQO8s3ReRPdVfevZsMH7p 7KwXo/VgAxeB1javwMwZ389qhKbmVdYPaos/vULmDrqtcyPJMh0n8YcLe7dDpXfHD4ZK rSBlTG7U7czIjcAv3FQM2osae/Q27dJW9+BODQtOeC+FlLO+LWZdclM9Jn7MeQQXAzcI yJxc1RXzKCOiL579/uw8S4AAAq+2WOcolmfn6JpZuhlRi/BcflpT3tRlg6R20evvcWXA 3Eh3b75WJW2cfs814n4d/6RqhsC+VTV8bggwcQq5V2vybzLsuxwjaao2VL5Gu1rHBnEw HLwQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=ubmLqueedmlYGL75RBWrWqOt2/2zwwVKDDZSU+bMx/Y=; b=Q0h/FNWT7xR1kzGSsTkYsxJAN+GkOGV00VudCeQjfkQEsfw1kqpmgG6AzEWCMSKfBf c3MHRTGHY1ElAlOyipsvu44YIfKT98gmFTsSfQRFq9kIvYVlvLbpGpdl0/yokOKXRrH8 hrC8KuMMzy3uPTt0UVSFg7IWWh6bpus6z+ZF/5XDzY0XbOiBl9XDYckKd27iMIvWWQW8 w7SVYAZr48XTaNqvHEu/q9NESCWwN/uyMoRqOWbM6nd2oqtuv/v25z9J22uzDObTSi8B er9Lzn3fk5y2xv0K4qNpN4bbYS8b76i2CBPwZNA7sEwJaPdKqzCKzF5wRtPjANzQCO4s 7bMQ== X-Gm-Message-State: AOAM5304cW5Ho15p5OWJDi131AaPgqsFn7iaJJTpNcUM1S8mBb64Se19 9dbOZQyZr8Gy3PYKX3Vwkjl9L+0yoEYpbg== X-Google-Smtp-Source: ABdhPJyhmAfkRXdnRffrMHMaBH5yvt9A5znsYzKL7jegd45xeNXvkdMLrru2vHxzThn8eMSfcwzBrg== X-Received: by 2002:aa7:869a:0:b029:142:2501:34d1 with SMTP id d26-20020aa7869a0000b0290142250134d1mr17696070pfo.42.1600454345619; Fri, 18 Sep 2020 11:39:05 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.39.04 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:39:04 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 55/81] target/arm: Implement SVE2 integer multiply (indexed) Date: Fri, 18 Sep 2020 11:37:25 -0700 Message-Id: <20200918183751.2787647-56-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::543; envelope-from=richard.henderson@linaro.org; helo=mail-pg1-x543.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/sve.decode | 7 +++++++ target/arm/translate-sve.c | 30 ++++++++++++++++++++++++++++++ 2 files changed, 37 insertions(+) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index ea6ec5f198..fa0a572da6 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -777,12 +777,19 @@ MUL_zzi 00100101 .. 110 000 110 ........ ..... @rdn_i8s DOT_zzzz 01000100 1 sz:1 0 rm:5 00000 u:1 rn:5 rd:5 \ ra=%reg_movprfx +#### SVE Multiply - Indexed + # SVE integer dot product (indexed) SDOT_zzxw_s 01000100 .. 1 ..... 000000 ..... ..... @rrxr_s SDOT_zzxw_d 01000100 .. 1 ..... 000000 ..... ..... @rrxr_d UDOT_zzxw_s 01000100 .. 1 ..... 000001 ..... ..... @rrxr_s UDOT_zzxw_d 01000100 .. 1 ..... 000001 ..... ..... @rrxr_d +# SVE2 integer multiply (indexed) +MUL_zzx_h 01000100 .. 1 ..... 111110 ..... ..... @rrx_h +MUL_zzx_s 01000100 .. 1 ..... 111110 ..... ..... @rrx_s +MUL_zzx_d 01000100 .. 1 ..... 111110 ..... ..... @rrx_d + # SVE floating-point complex add (predicated) FCADD 01100100 esz:2 00000 rot:1 100 pg:3 rm:5 rd:5 \ rn=%reg_movprfx diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 7c50298302..90f0b1ff9b 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3817,6 +3817,10 @@ static bool trans_DOT_zzzz(DisasContext *s, arg_DOT_zzzz *a) return true; } +/* + * SVE Multiply - Indexed + */ + static bool do_zzxz_ool(DisasContext *s, arg_rrxr_esz *a, gen_helper_gvec_4 *fn) { @@ -3840,6 +3844,32 @@ DO_RRXR(trans_UDOT_zzxw_d, gen_helper_gvec_udot_idx_h) #undef DO_RRXR +static bool do_sve2_zzx_ool(DisasContext *s, arg_rrx_esz *a, + gen_helper_gvec_3 *fn) +{ + if (fn == NULL || !dc_isar_feature(aa64_sve2, s)) { + return false; + } + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + tcg_gen_gvec_3_ool(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + vsz, vsz, a->index, fn); + } + return true; +} + +#define DO_SVE2_RRX(NAME, FUNC) \ + static bool NAME(DisasContext *s, arg_rrx_esz *a) \ + { return do_sve2_zzx_ool(s, a, FUNC); } + +DO_SVE2_RRX(trans_MUL_zzx_h, gen_helper_gvec_mul_idx_h) +DO_SVE2_RRX(trans_MUL_zzx_s, gen_helper_gvec_mul_idx_s) +DO_SVE2_RRX(trans_MUL_zzx_d, gen_helper_gvec_mul_idx_d) + +#undef DO_SVE2_RRX + /* *** SVE Floating Point Multiply-Add Indexed Group */ From patchwork Fri Sep 18 18:37:26 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 305020 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 15128C43464 for ; Fri, 18 Sep 2020 19:08:25 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id B5A4E21707 for ; Fri, 18 Sep 2020 19:08:24 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="J73Hc7k8" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org B5A4E21707 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:46898 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJLk7-0005uT-Oe for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 15:08:23 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48690) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLI0-0000s3-B1 for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:39:21 -0400 Received: from mail-pg1-x533.google.com ([2607:f8b0:4864:20::533]:36858) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLHq-0007Dj-62 for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:39:19 -0400 Received: by mail-pg1-x533.google.com with SMTP id f2so3991113pgd.3 for ; Fri, 18 Sep 2020 11:39:08 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=r8CExXyc5Biic3ZXnX/VF5QU/88e8T+vsiHKGnc7erQ=; b=J73Hc7k8aprsC2Kyh7FWm8qs+wlrmokdslblzsDefg1yvyGeQg+MaaQgfvmro9E98J g588OxBgAABuD0iWM4XW5uWh1JmL5a+bdljEs4IaRRbpslOc0BtugdUBbZlfxSCPwQFx ZkXK5Wvdp/gU9ekjLNNqxCJmUXAtCpmLFvKIWZ48qaAbZkRD5sLCmNcANXQnk/7Aji4K Mi3LJ1YX7LRmZKw4aOHonMdkjz+gI50Cl2bWF6HgkLyLqXe5WldG1yruWDmLIH1ueyM7 WBJqSPSMLIWHdgErbyIIJiseUHKEnV4pkpp1NfEOJ6iIOCq7krOpXkRoUtHsLDr41sZO i1Fg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=r8CExXyc5Biic3ZXnX/VF5QU/88e8T+vsiHKGnc7erQ=; b=D6hzmCH91G7kWd33hHL9hyU0BzXByElwmTrB3xE2SUNBj3y6B2eGhF7oHji7N6rZsO TIc/434KrLLbEwGoWh403qOcPc7xj40Oa1lM/f7bZzLf57iUT4bL5gCUKn8QTXFPZhj6 MEDEx0wKc8/b91+1PEq9J7xe8cQMgzUL4qNSWoCNrkuztvXid/GeyDvgZXJjp2N/3Lpz gU2iMyZbxyoJQjaAoyryKRABLCDb8vsDmngD1+9XUHBvA5J20/BFzLlkAulSrbeTmkDx zdjHeAv1Nnu/UrkHMPOnZvouHbs51bi8OYzpkAh79fe4qGIrFnvTU74Adi7rAh62Dj3e grYg== X-Gm-Message-State: AOAM5318VZRDodQETAbhuTO7TLj/1qrQYrwwOLpIyXZeARJ4MvH92jxO U9O0APv6rrYu9AQ/ppBqv3hPeJbggNAohw== X-Google-Smtp-Source: ABdhPJxoYtrSLFPYDViw1wf2Agzdt/HHvbOFziCP6RXpjrXivWPOkjE8M7d2jf00y9Mx6pDQMn1tqg== X-Received: by 2002:a62:2c8:0:b029:13e:9ee9:5b25 with SMTP id 191-20020a6202c80000b029013e9ee95b25mr33655368pfc.1.1600454347295; Fri, 18 Sep 2020 11:39:07 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.39.06 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:39:06 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 56/81] target/arm: Implement SVE2 integer multiply-add (indexed) Date: Fri, 18 Sep 2020 11:37:26 -0700 Message-Id: <20200918183751.2787647-57-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::533; envelope-from=richard.henderson@linaro.org; helo=mail-pg1-x533.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/sve.decode | 8 ++++++++ target/arm/translate-sve.c | 23 +++++++++++++++++++++++ 2 files changed, 31 insertions(+) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index fa0a572da6..467a93052f 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -785,6 +785,14 @@ SDOT_zzxw_d 01000100 .. 1 ..... 000000 ..... ..... @rrxr_d UDOT_zzxw_s 01000100 .. 1 ..... 000001 ..... ..... @rrxr_s UDOT_zzxw_d 01000100 .. 1 ..... 000001 ..... ..... @rrxr_d +# SVE2 integer multiply-add (indexed) +MLA_zzxz_h 01000100 .. 1 ..... 000010 ..... ..... @rrxr_h +MLA_zzxz_s 01000100 .. 1 ..... 000010 ..... ..... @rrxr_s +MLA_zzxz_d 01000100 .. 1 ..... 000010 ..... ..... @rrxr_d +MLS_zzxz_h 01000100 .. 1 ..... 000011 ..... ..... @rrxr_h +MLS_zzxz_s 01000100 .. 1 ..... 000011 ..... ..... @rrxr_s +MLS_zzxz_d 01000100 .. 1 ..... 000011 ..... ..... @rrxr_d + # SVE2 integer multiply (indexed) MUL_zzx_h 01000100 .. 1 ..... 111110 ..... ..... @rrx_h MUL_zzx_s 01000100 .. 1 ..... 111110 ..... ..... @rrx_s diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 90f0b1ff9b..a6235d78d5 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3870,6 +3870,29 @@ DO_SVE2_RRX(trans_MUL_zzx_d, gen_helper_gvec_mul_idx_d) #undef DO_SVE2_RRX +static bool do_sve2_zzxz_ool(DisasContext *s, arg_rrxr_esz *a, + gen_helper_gvec_4 *fn) +{ + if (!dc_isar_feature(aa64_sve2, s)) { + return false; + } + return do_zzxz_ool(s, a, fn); +} + +#define DO_SVE2_RRXR(NAME, FUNC) \ + static bool NAME(DisasContext *s, arg_rrxr_esz *a) \ + { return do_sve2_zzxz_ool(s, a, FUNC); } + +DO_SVE2_RRXR(trans_MLA_zzxz_h, gen_helper_gvec_mla_idx_h) +DO_SVE2_RRXR(trans_MLA_zzxz_s, gen_helper_gvec_mla_idx_s) +DO_SVE2_RRXR(trans_MLA_zzxz_d, gen_helper_gvec_mla_idx_d) + +DO_SVE2_RRXR(trans_MLS_zzxz_h, gen_helper_gvec_mls_idx_h) +DO_SVE2_RRXR(trans_MLS_zzxz_s, gen_helper_gvec_mls_idx_s) +DO_SVE2_RRXR(trans_MLS_zzxz_d, gen_helper_gvec_mls_idx_d) + +#undef DO_SVE2_RRXR + /* *** SVE Floating Point Multiply-Add Indexed Group */ From patchwork Fri Sep 18 18:37:27 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 305007 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4A0E8C43463 for ; Fri, 18 Sep 2020 19:29:22 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id CF76C208DB for ; Fri, 18 Sep 2020 19:29:21 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="pjuzX5QV" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org CF76C208DB Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:48544 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJM4O-0004zD-Ry for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 15:29:20 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48770) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLI7-0000uV-Cb for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:39:27 -0400 Received: from mail-pg1-x536.google.com ([2607:f8b0:4864:20::536]:41909) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLHv-0007Dv-G2 for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:39:27 -0400 Received: by mail-pg1-x536.google.com with SMTP id y1so3976581pgk.8 for ; Fri, 18 Sep 2020 11:39:10 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=KFpgXc3wDM39uIXZhQQA2BpO+t4rSB0v97/ARha6/jc=; b=pjuzX5QVtmD6pC9IaFEHD3+4352P66KU2f/Kkxdbg7XEs3IqlUYSdmYZ4i5ktmJu7h /fAN9EwfQB3k/3mjVBo4rukIyQxVu6b1lU08xllC83DS9IpPgkEHCeUO2rYp37dk2EDI fy3bEDUG0zkdus4uzUtW5vuj6pedyWfZxZ38OOS7k9nLsD5itekXrRczEFE4xm6Qis5t k9Mug3I+m/iSfzN5cr4xR1X+2VRCGk8XneDtonH3X2/M7VSSbwNwbZOnu6ml5Rm5HZ0/ M4nmC+o59QoflDZAnsUNN7uF6WRcfujaoFRp2vhimzdrneUxVqzGsnfAu4MOYVHVuwWU ro6A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=KFpgXc3wDM39uIXZhQQA2BpO+t4rSB0v97/ARha6/jc=; b=Fl3ucPBautwhc7l/VMsS0MAAHpicVRd1ftDZELOK/L6UHl1G4Ac6pxrOSGB+qzahHP A6A6Hba5hNaoyLai4dVjxPufuPl15dbgelJaT+OxUjaZqhbfh29HOEN1aFwOgwqy0oKa +NAsOK9XYbQdEztShloQqdcQiv1ROfIC850YZaBs8sKjudT0/LTJ6dB+kH7bsK4f7LTx gsHKouT/yAIykVYiq2k/AbZ7BUhBBcnKOKaRC8ykEZr7T01ol+owCtbrg2jgBcq9X1dV ejeCfeiRYENQLdx/7+sFI2Cs/UWaQIbm4PNyd+kV7YF67abBNYL5RzjJML/zRZhpAVyT FxWg== X-Gm-Message-State: AOAM531z5fLeSdIsqjdxrmUCaxO8p7p2GOA8WRf9yM6C1ty3/TNL42xX m8GvocesgbzfrNDYxMu92a8mwOCFR+IJSQ== X-Google-Smtp-Source: ABdhPJxW4VsLdtNrJJ/rcjNiLNQs1MYK3LW/jZj6yLTqYWd4rG95tF53/SnDrfe1bJIbVW/TrrKR+g== X-Received: by 2002:aa7:911a:0:b029:13e:d13d:a13d with SMTP id 26-20020aa7911a0000b029013ed13da13dmr32966820pfh.37.1600454348766; Fri, 18 Sep 2020 11:39:08 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.39.07 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:39:08 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 57/81] target/arm: Implement SVE2 saturating multiply-add high (indexed) Date: Fri, 18 Sep 2020 11:37:27 -0700 Message-Id: <20200918183751.2787647-58-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::536; envelope-from=richard.henderson@linaro.org; helo=mail-pg1-x536.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 14 +++++++++++++ target/arm/sve.decode | 8 ++++++++ target/arm/sve_helper.c | 40 ++++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 8 ++++++++ 4 files changed, 70 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 2c6d3431ec..4ff7298f67 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2665,3 +2665,17 @@ DEF_HELPER_FLAGS_5(sve2_sqrdcmlah_zzzz_d, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_6(fmmla_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_6(fmmla_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_sqrdmlah_idx_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqrdmlah_idx_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqrdmlah_idx_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_sqrdmlsh_idx_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqrdmlsh_idx_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqrdmlsh_idx_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 467a93052f..5fc76b7fc3 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -793,6 +793,14 @@ MLS_zzxz_h 01000100 .. 1 ..... 000011 ..... ..... @rrxr_h MLS_zzxz_s 01000100 .. 1 ..... 000011 ..... ..... @rrxr_s MLS_zzxz_d 01000100 .. 1 ..... 000011 ..... ..... @rrxr_d +# SVE2 saturating multiply-add high (indexed) +SQRDMLAH_zzxz_h 01000100 .. 1 ..... 000100 ..... ..... @rrxr_h +SQRDMLAH_zzxz_s 01000100 .. 1 ..... 000100 ..... ..... @rrxr_s +SQRDMLAH_zzxz_d 01000100 .. 1 ..... 000100 ..... ..... @rrxr_d +SQRDMLSH_zzxz_h 01000100 .. 1 ..... 000101 ..... ..... @rrxr_h +SQRDMLSH_zzxz_s 01000100 .. 1 ..... 000101 ..... ..... @rrxr_s +SQRDMLSH_zzxz_d 01000100 .. 1 ..... 000101 ..... ..... @rrxr_d + # SVE2 integer multiply (indexed) MUL_zzx_h 01000100 .. 1 ..... 111110 ..... ..... @rrx_h MUL_zzx_s 01000100 .. 1 ..... 111110 ..... ..... @rrx_s diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index aaf336118b..1facec8ce0 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1487,9 +1487,49 @@ DO_CMLA(sve2_sqrdcmlah_zzzz_h, int16_t, H2, DO_SQRDMLAH_H) DO_CMLA(sve2_sqrdcmlah_zzzz_s, int32_t, H4, DO_SQRDMLAH_S) DO_CMLA(sve2_sqrdcmlah_zzzz_d, int64_t, , DO_SQRDMLAH_D) +#undef DO_SQRDMLAH_B +#undef DO_SQRDMLAH_H +#undef DO_SQRDMLAH_S +#undef DO_SQRDMLAH_D #undef do_cmla #undef DO_CMLA +#define DO_ZZXZ(NAME, TYPE, H, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, void *va, uint32_t desc) \ +{ \ + intptr_t oprsz = simd_oprsz(desc), segment = 16 / sizeof(TYPE); \ + intptr_t i, j, idx = simd_data(desc); \ + TYPE *d = vd, *a = va, *n = vn, *m = (TYPE *)vm + H(idx); \ + for (i = 0; i < oprsz / sizeof(TYPE); i += segment) { \ + TYPE mm = m[i]; \ + for (j = 0; j < segment; j++) { \ + d[i + j] = OP(n[i + j], mm, a[i + j]); \ + } \ + } \ +} + +#define DO_SQRDMLAH_H(N, M, A) \ + ({ uint32_t discard; do_sqrdmlah_h(N, M, A, false, true, &discard); }) +#define DO_SQRDMLAH_S(N, M, A) \ + ({ uint32_t discard; do_sqrdmlah_s(N, M, A, false, true, &discard); }) +#define DO_SQRDMLAH_D(N, M, A) do_sqrdmlah_d(N, M, A, false, true) + +DO_ZZXZ(sve2_sqrdmlah_idx_h, int16_t, H2, DO_SQRDMLAH_H) +DO_ZZXZ(sve2_sqrdmlah_idx_s, int32_t, H4, DO_SQRDMLAH_S) +DO_ZZXZ(sve2_sqrdmlah_idx_d, int64_t, , DO_SQRDMLAH_D) + +#define DO_SQRDMLSH_H(N, M, A) \ + ({ uint32_t discard; do_sqrdmlah_h(N, M, A, true, true, &discard); }) +#define DO_SQRDMLSH_S(N, M, A) \ + ({ uint32_t discard; do_sqrdmlah_s(N, M, A, true, true, &discard); }) +#define DO_SQRDMLSH_D(N, M, A) do_sqrdmlah_d(N, M, A, true, true) + +DO_ZZXZ(sve2_sqrdmlsh_idx_h, int16_t, H2, DO_SQRDMLSH_H) +DO_ZZXZ(sve2_sqrdmlsh_idx_s, int32_t, H4, DO_SQRDMLSH_S) +DO_ZZXZ(sve2_sqrdmlsh_idx_d, int64_t, , DO_SQRDMLSH_D) + +#undef DO_ZZXZ + #define DO_BITPERM(NAME, TYPE, OP) \ void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ { \ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index a6235d78d5..1dd03ae21e 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3891,6 +3891,14 @@ DO_SVE2_RRXR(trans_MLS_zzxz_h, gen_helper_gvec_mls_idx_h) DO_SVE2_RRXR(trans_MLS_zzxz_s, gen_helper_gvec_mls_idx_s) DO_SVE2_RRXR(trans_MLS_zzxz_d, gen_helper_gvec_mls_idx_d) +DO_SVE2_RRXR(trans_SQRDMLAH_zzxz_h, gen_helper_sve2_sqrdmlah_idx_h) +DO_SVE2_RRXR(trans_SQRDMLAH_zzxz_s, gen_helper_sve2_sqrdmlah_idx_s) +DO_SVE2_RRXR(trans_SQRDMLAH_zzxz_d, gen_helper_sve2_sqrdmlah_idx_d) + +DO_SVE2_RRXR(trans_SQRDMLSH_zzxz_h, gen_helper_sve2_sqrdmlsh_idx_h) +DO_SVE2_RRXR(trans_SQRDMLSH_zzxz_s, gen_helper_sve2_sqrdmlsh_idx_s) +DO_SVE2_RRXR(trans_SQRDMLSH_zzxz_d, gen_helper_sve2_sqrdmlsh_idx_d) + #undef DO_SVE2_RRXR /* From patchwork Fri Sep 18 18:37:28 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 273343 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7FF4BC43463 for ; Fri, 18 Sep 2020 19:14:42 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 0A3AD21707 for ; Fri, 18 Sep 2020 19:14:41 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="N4aD2f4P" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 0A3AD21707 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:35418 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJLqC-0004Uj-QK for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 15:14:40 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48792) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLI8-0000yA-Pq for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:39:28 -0400 Received: from mail-pg1-x544.google.com ([2607:f8b0:4864:20::544]:45306) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLHv-0007E6-Ev for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:39:28 -0400 Received: by mail-pg1-x544.google.com with SMTP id 67so3962333pgd.12 for ; Fri, 18 Sep 2020 11:39:11 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=1+q4eXbYiQuNqCAaqZWB5nUYF2MXnwNvxRtcvSEp0SQ=; b=N4aD2f4PfBlol7HpPNZM1BbgHyJzk7wCA0iI8KI6HJu5o3WCqIVqhM5qlN49LdIWsS 398Fc1RuiO+pHsLJ5VFJaKxdJw5EPJAElkXHJ9zks2C+2Hyto2l4cv4j6gTat/HpBiDp vVpJCxndIcRnE0RE+Z1FDLhtPSfHfoIJfT/2SaCG5zjLBbl0Rs5EiJWQ2G5HyHHyI4jQ O4ASKHepbN/T2RXL7xGfB8Ccvf//+bI4f1MSs73zFFtbcZY+ZBlm1uzrtiCZfuAhibjb b+Yw461oa5WHf4WPwLf1/Ivj3kasa0xVEKIjVIulCkJ3ynrQAx/2yYWonc+0gg5Ji8pA cogg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=1+q4eXbYiQuNqCAaqZWB5nUYF2MXnwNvxRtcvSEp0SQ=; b=oV4hVw9UOkQO6fupPsqrDrLXy2whQOvbi7IZiPXM56yqjM5xufDkiTA79dE9sST+Gl z+Navj0AW/nTmkGK6if+GTZ0CnIdxyPp82xavi0GmZc4VtuJ5VZpVE132h7h6RH7Kvxn d+Z0ddDIK30I/zQBptXFwoSfgIZKz43d+MndbjveFervFgVWEGvOXM9XsVLzA6Y20diq 4/7eznffD0bhnrmgmaw5BMSB2Q5GzHVhd34M8e5j9RySyusbjobxhfTfVLDzwVgb/q7Q 50eGZxFhScj3vcSmOrA5YLK8bl2WhzNHPj92LPpVIQFqaed4ZL0Ci6XE0Vx6nP7yojSL ze3g== X-Gm-Message-State: AOAM531aWfQHVYncJhhwHXPhYXa4KC45il1AN/n86DYYhq+K8xDbEAHb xkbbi9Ha4klQE4nfbbeixI+xAFjV84W9Tw== X-Google-Smtp-Source: ABdhPJwPxMtxAspuYv/tFWhAQ+Y+QkU5a8Zp/NixXW04rAQUJl5eN+n2m6B4yB6NR47aGw7LmsZHLw== X-Received: by 2002:a63:c9:: with SMTP id 192mr28683181pga.37.1600454350333; Fri, 18 Sep 2020 11:39:10 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.39.09 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:39:09 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 58/81] target/arm: Implement SVE2 saturating multiply-add (indexed) Date: Fri, 18 Sep 2020 11:37:28 -0700 Message-Id: <20200918183751.2787647-59-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::544; envelope-from=richard.henderson@linaro.org; helo=mail-pg1-x544.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 9 +++++++++ target/arm/sve.decode | 18 ++++++++++++++++++ target/arm/sve_helper.c | 30 ++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 32 ++++++++++++++++++++++++-------- 4 files changed, 81 insertions(+), 8 deletions(-) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 4ff7298f67..d6e94a8b9c 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2679,3 +2679,12 @@ DEF_HELPER_FLAGS_5(sve2_sqrdmlsh_idx_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve2_sqrdmlsh_idx_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_sqdmlal_idx_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqdmlal_idx_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqdmlsl_idx_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqdmlsl_idx_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 5fc76b7fc3..36cdd9dab4 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -30,6 +30,8 @@ %size_23 23:2 %dtype_23_13 23:2 13:2 %index3_22_19 22:1 19:2 +%index3_19_11 19:2 11:1 +%index2_20_11 20:1 11:1 # A combination of tsz:imm3 -- extract esize. %tszimm_esz 22:2 5:5 !function=tszimm_esz @@ -263,6 +265,12 @@ @rrxr_d ........ 11 . index:1 rm:4 ...... rn:5 rd:5 \ &rrxr_esz ra=%reg_movprfx esz=3 +# Three registers and a scalar by index, wide +@rrxw_s ........ 10 ... rm:3 ...... rn:5 rd:5 \ + &rrxr_esz ra=%reg_movprfx index=%index3_19_11 esz=2 +@rrxw_d ........ 11 .. rm:4 ...... rn:5 rd:5 \ + &rrxr_esz ra=%reg_movprfx index=%index2_20_11 esz=3 + ########################################################################### # Instruction patterns. Grouped according to the SVE encodingindex.xhtml. @@ -801,6 +809,16 @@ SQRDMLSH_zzxz_h 01000100 .. 1 ..... 000101 ..... ..... @rrxr_h SQRDMLSH_zzxz_s 01000100 .. 1 ..... 000101 ..... ..... @rrxr_s SQRDMLSH_zzxz_d 01000100 .. 1 ..... 000101 ..... ..... @rrxr_d +# SVE2 saturating multiply-add (indexed) +SQDMLALB_zzxw_s 01000100 .. 1 ..... 0010.0 ..... ..... @rrxw_s +SQDMLALB_zzxw_d 01000100 .. 1 ..... 0010.0 ..... ..... @rrxw_d +SQDMLALT_zzxw_s 01000100 .. 1 ..... 0010.1 ..... ..... @rrxw_s +SQDMLALT_zzxw_d 01000100 .. 1 ..... 0010.1 ..... ..... @rrxw_d +SQDMLSLB_zzxw_s 01000100 .. 1 ..... 0011.0 ..... ..... @rrxw_s +SQDMLSLB_zzxw_d 01000100 .. 1 ..... 0011.0 ..... ..... @rrxw_d +SQDMLSLT_zzxw_s 01000100 .. 1 ..... 0011.1 ..... ..... @rrxw_s +SQDMLSLT_zzxw_d 01000100 .. 1 ..... 0011.1 ..... ..... @rrxw_d + # SVE2 integer multiply (indexed) MUL_zzx_h 01000100 .. 1 ..... 111110 ..... ..... @rrx_h MUL_zzx_s 01000100 .. 1 ..... 111110 ..... ..... @rrx_s diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 1facec8ce0..125da036b8 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1530,6 +1530,36 @@ DO_ZZXZ(sve2_sqrdmlsh_idx_d, int64_t, , DO_SQRDMLSH_D) #undef DO_ZZXZ +#define DO_ZZXW(NAME, TYPEW, TYPEN, HW, HN, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, void *va, uint32_t desc) \ +{ \ + intptr_t i, j, oprsz = simd_oprsz(desc); \ + intptr_t sel = extract32(desc, SIMD_DATA_SHIFT, 1) * sizeof(TYPEN); \ + intptr_t idx = extract32(desc, SIMD_DATA_SHIFT + 1, 3) * sizeof(TYPEN); \ + for (i = 0; i < oprsz; i += 16) { \ + TYPEW mm = *(TYPEN *)(vm + i + idx); \ + for (j = 0; j < 16; j += sizeof(TYPEW)) { \ + TYPEW nn = *(TYPEN *)(vn + HN(i + j + sel)); \ + TYPEW aa = *(TYPEW *)(va + HW(i + j)); \ + *(TYPEW *)(vd + HW(i + j)) = OP(nn, mm, aa); \ + } \ + } \ +} + +#define DO_SQDMLAL_S(N, M, A) DO_SQADD_S(A, do_sqdmull_s(N, M)) +#define DO_SQDMLAL_D(N, M, A) do_sqadd_d(A, do_sqdmull_d(N, M)) + +DO_ZZXW(sve2_sqdmlal_idx_s, int32_t, int16_t, H1_4, H1_2, DO_SQDMLAL_S) +DO_ZZXW(sve2_sqdmlal_idx_d, int64_t, int32_t, , H1_4, DO_SQDMLAL_D) + +#define DO_SQDMLSL_S(N, M, A) DO_SQSUB_S(A, do_sqdmull_s(N, M)) +#define DO_SQDMLSL_D(N, M, A) do_sqsub_d(A, do_sqdmull_d(N, M)) + +DO_ZZXW(sve2_sqdmlsl_idx_s, int32_t, int16_t, H1_4, H1_2, DO_SQDMLSL_S) +DO_ZZXW(sve2_sqdmlsl_idx_d, int64_t, int32_t, , H1_4, DO_SQDMLSL_D) + +#undef DO_ZZXW + #define DO_BITPERM(NAME, TYPE, OP) \ void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ { \ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 1dd03ae21e..5b694a9243 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3821,21 +3821,21 @@ static bool trans_DOT_zzzz(DisasContext *s, arg_DOT_zzzz *a) * SVE Multiply - Indexed */ -static bool do_zzxz_ool(DisasContext *s, arg_rrxr_esz *a, - gen_helper_gvec_4 *fn) +static bool do_zzxz_data(DisasContext *s, arg_rrxr_esz *a, + gen_helper_gvec_4 *fn, int data) { if (fn == NULL) { return false; } if (sve_access_check(s)) { - gen_gvec_ool_zzzz(s, fn, a->rd, a->rn, a->rm, a->ra, a->index); + gen_gvec_ool_zzzz(s, fn, a->rd, a->rn, a->rm, a->ra, data); } return true; } #define DO_RRXR(NAME, FUNC) \ static bool NAME(DisasContext *s, arg_rrxr_esz *a) \ - { return do_zzxz_ool(s, a, FUNC); } + { return do_zzxz_data(s, a, FUNC, a->index); } DO_RRXR(trans_SDOT_zzxw_s, gen_helper_gvec_sdot_idx_b) DO_RRXR(trans_SDOT_zzxw_d, gen_helper_gvec_sdot_idx_h) @@ -3870,18 +3870,18 @@ DO_SVE2_RRX(trans_MUL_zzx_d, gen_helper_gvec_mul_idx_d) #undef DO_SVE2_RRX -static bool do_sve2_zzxz_ool(DisasContext *s, arg_rrxr_esz *a, - gen_helper_gvec_4 *fn) +static bool do_sve2_zzxz_data(DisasContext *s, arg_rrxr_esz *a, + gen_helper_gvec_4 *fn, int data) { if (!dc_isar_feature(aa64_sve2, s)) { return false; } - return do_zzxz_ool(s, a, fn); + return do_zzxz_data(s, a, fn, data); } #define DO_SVE2_RRXR(NAME, FUNC) \ static bool NAME(DisasContext *s, arg_rrxr_esz *a) \ - { return do_sve2_zzxz_ool(s, a, FUNC); } + { return do_sve2_zzxz_data(s, a, FUNC, a->index); } DO_SVE2_RRXR(trans_MLA_zzxz_h, gen_helper_gvec_mla_idx_h) DO_SVE2_RRXR(trans_MLA_zzxz_s, gen_helper_gvec_mla_idx_s) @@ -3901,6 +3901,22 @@ DO_SVE2_RRXR(trans_SQRDMLSH_zzxz_d, gen_helper_sve2_sqrdmlsh_idx_d) #undef DO_SVE2_RRXR +#define DO_SVE2_RRXR_TB(NAME, FUNC, TOP) \ + static bool NAME(DisasContext *s, arg_rrxr_esz *a) \ + { return do_sve2_zzxz_data(s, a, FUNC, (a->index << 1) | TOP); } + +DO_SVE2_RRXR_TB(trans_SQDMLALB_zzxw_s, gen_helper_sve2_sqdmlal_idx_s, false) +DO_SVE2_RRXR_TB(trans_SQDMLALB_zzxw_d, gen_helper_sve2_sqdmlal_idx_d, false) +DO_SVE2_RRXR_TB(trans_SQDMLALT_zzxw_s, gen_helper_sve2_sqdmlal_idx_s, true) +DO_SVE2_RRXR_TB(trans_SQDMLALT_zzxw_d, gen_helper_sve2_sqdmlal_idx_d, true) + +DO_SVE2_RRXR_TB(trans_SQDMLSLB_zzxw_s, gen_helper_sve2_sqdmlsl_idx_s, false) +DO_SVE2_RRXR_TB(trans_SQDMLSLB_zzxw_d, gen_helper_sve2_sqdmlsl_idx_d, false) +DO_SVE2_RRXR_TB(trans_SQDMLSLT_zzxw_s, gen_helper_sve2_sqdmlsl_idx_s, true) +DO_SVE2_RRXR_TB(trans_SQDMLSLT_zzxw_d, gen_helper_sve2_sqdmlsl_idx_d, true) + +#undef DO_SVE2_RRXR_TB + /* *** SVE Floating Point Multiply-Add Indexed Group */ From patchwork Fri Sep 18 18:37:29 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 273345 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6098BC43463 for ; Fri, 18 Sep 2020 19:12:18 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id D7FD421707 for ; Fri, 18 Sep 2020 19:12:17 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="gc20q2Io" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org D7FD421707 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:55256 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJLns-00013z-EI for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 15:12:16 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48710) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLI2-0000tL-Bb for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:39:26 -0400 Received: from mail-pg1-x542.google.com ([2607:f8b0:4864:20::542]:45305) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLHv-0007EA-01 for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:39:22 -0400 Received: by mail-pg1-x542.google.com with SMTP id 67so3962352pgd.12 for ; Fri, 18 Sep 2020 11:39:12 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=dxBQWue3d1N+N0crqPmWVlgMt5psWuE6mkZzqPJgF8s=; b=gc20q2IoSyIGduQXDt5bP2MkvJBnu8aeVmz7/vY6cLVSOGpKQ8HyZXBFnE0Pm8tenZ Z6AUky/eslcazE98rravXRKjA1WvF+MOtneD9zXIZ94Ds3+jmC2HRotC1Jbf0BrmGIPj Am9PNwjFaeteNZylcFilQGkU/SL1MKnBOrz6c03WP46ZoYwl0UjbxrdSqaeBKEWY7X5d 41Rbbt0tEEKDetV93haBdPBaF4vSzl1bdHaJmnHb0JzI8lAgbIb+rtZiEhvsNQiUjARb skAzMGtTU9mbCf5pht5nAduYMFFlXdS0PV2ljgA90XhiSSzyHkf32GtfT6pxqlJQ+XQs P5xQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=dxBQWue3d1N+N0crqPmWVlgMt5psWuE6mkZzqPJgF8s=; b=pYGvhfiLwVODcnf9Ss2mD+j75ovTla11BmkwMpNNg7C6M/lETwaCKUKL0E9p66h/LC NSG1YS8HMu8yk6dyLT5xtTTW5OJcaBzmPxaiBspc9eraWva57rIhjsC82AHONfRGu+mj 3wFeLK5+S5gax6x1arfPN9pr+AeWeP44Z/Y3Dnh+bV+NM4ZgkVyAskmUd+YEGm6PGR7I v7Mrg33zkfg9bDfrKR7TDjR1gWCmzcjOUIfIgqPY+RxqNRaKDJ3GnumIFobrQmInAqKM UGSZb4o3xjeqyiXqlbRfW4PPt5+FEUbLIkJVl4uiRKdmzf0KQmX2xew36umVP1c7RsfI MIuw== X-Gm-Message-State: AOAM532JYtSQK7VanCZ3CMEJsvUcjkoawu+ru6vxUyZ1IfCGVcoZ8BwE DH+ELDCJP7dOd96PMuCFiOpE2d53ErXczQ== X-Google-Smtp-Source: ABdhPJxTImP8HFqYWUCfdztWoFplGTo3nKLS6/GdxiXa3cgmXTbMOqwP9F82qbGezjvM+FPDwB/RTg== X-Received: by 2002:a63:2022:: with SMTP id g34mr27236745pgg.378.1600454351513; Fri, 18 Sep 2020 11:39:11 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.39.10 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:39:10 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 59/81] target/arm: Implement SVE2 integer multiply long (indexed) Date: Fri, 18 Sep 2020 11:37:29 -0700 Message-Id: <20200918183751.2787647-60-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::542; envelope-from=richard.henderson@linaro.org; helo=mail-pg1-x542.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 5 +++++ target/arm/sve.decode | 16 ++++++++++++++++ target/arm/sve_helper.c | 23 +++++++++++++++++++++++ target/arm/translate-sve.c | 24 ++++++++++++++++++++---- 4 files changed, 64 insertions(+), 4 deletions(-) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index d6e94a8b9c..d11598a75a 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2688,3 +2688,8 @@ DEF_HELPER_FLAGS_5(sve2_sqdmlsl_idx_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve2_sqdmlsl_idx_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_smull_idx_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_smull_idx_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_umull_idx_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_umull_idx_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 36cdd9dab4..f0a4d86428 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -257,6 +257,12 @@ @rrx_d ........ 11 . index:1 rm:4 ...... rn:5 rd:5 \ &rrx_esz esz=3 +# Two registers and a scalar by index, wide +@rrxl_s ........ 10 ... rm:3 ...... rn:5 rd:5 \ + &rrx_esz index=%index3_19_11 esz=2 +@rrxl_d ........ 11 .. rm:4 ...... rn:5 rd:5 \ + &rrx_esz index=%index2_20_11 esz=3 + # Three registers and a scalar by index @rrxr_h ........ 0. . .. rm:3 ...... rn:5 rd:5 \ &rrxr_esz ra=%reg_movprfx index=%index3_22_19 esz=1 @@ -819,6 +825,16 @@ SQDMLSLB_zzxw_d 01000100 .. 1 ..... 0011.0 ..... ..... @rrxw_d SQDMLSLT_zzxw_s 01000100 .. 1 ..... 0011.1 ..... ..... @rrxw_s SQDMLSLT_zzxw_d 01000100 .. 1 ..... 0011.1 ..... ..... @rrxw_d +# SVE2 integer multiply long (indexed) +SMULLB_zzx_s 01000100 .. 1 ..... 1100.0 ..... ..... @rrxl_s +SMULLB_zzx_d 01000100 .. 1 ..... 1100.0 ..... ..... @rrxl_d +SMULLT_zzx_s 01000100 .. 1 ..... 1100.1 ..... ..... @rrxl_s +SMULLT_zzx_d 01000100 .. 1 ..... 1100.1 ..... ..... @rrxl_d +UMULLB_zzx_s 01000100 .. 1 ..... 1101.0 ..... ..... @rrxl_s +UMULLB_zzx_d 01000100 .. 1 ..... 1101.0 ..... ..... @rrxl_d +UMULLT_zzx_s 01000100 .. 1 ..... 1101.1 ..... ..... @rrxl_s +UMULLT_zzx_d 01000100 .. 1 ..... 1101.1 ..... ..... @rrxl_d + # SVE2 integer multiply (indexed) MUL_zzx_h 01000100 .. 1 ..... 111110 ..... ..... @rrx_h MUL_zzx_s 01000100 .. 1 ..... 111110 ..... ..... @rrx_s diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 125da036b8..80449cc569 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1560,6 +1560,29 @@ DO_ZZXW(sve2_sqdmlsl_idx_d, int64_t, int32_t, , H1_4, DO_SQDMLSL_D) #undef DO_ZZXW +#define DO_ZZX(NAME, TYPEW, TYPEN, HW, HN, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ +{ \ + intptr_t i, j, oprsz = simd_oprsz(desc); \ + intptr_t sel = extract32(desc, SIMD_DATA_SHIFT, 1) * sizeof(TYPEN); \ + intptr_t idx = extract32(desc, SIMD_DATA_SHIFT + 1, 3) * sizeof(TYPEN); \ + for (i = 0; i < oprsz; i += 16) { \ + TYPEW mm = *(TYPEN *)(vm + i + idx); \ + for (j = 0; j < 16; j += sizeof(TYPEW)) { \ + TYPEW nn = *(TYPEN *)(vn + HN(i + j + sel)); \ + *(TYPEW *)(vd + HW(i + j)) = OP(nn, mm); \ + } \ + } \ +} + +DO_ZZX(sve2_smull_idx_s, int32_t, int16_t, H1_4, H1_2, DO_MUL) +DO_ZZX(sve2_smull_idx_d, int64_t, int32_t, , H1_4, DO_MUL) + +DO_ZZX(sve2_umull_idx_s, uint32_t, uint16_t, H1_4, H1_2, DO_MUL) +DO_ZZX(sve2_umull_idx_d, uint64_t, uint32_t, , H1_4, DO_MUL) + +#undef DO_ZZX + #define DO_BITPERM(NAME, TYPE, OP) \ void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ { \ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 5b694a9243..39e5160e25 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3844,8 +3844,8 @@ DO_RRXR(trans_UDOT_zzxw_d, gen_helper_gvec_udot_idx_h) #undef DO_RRXR -static bool do_sve2_zzx_ool(DisasContext *s, arg_rrx_esz *a, - gen_helper_gvec_3 *fn) +static bool do_sve2_zzx_data(DisasContext *s, arg_rrx_esz *a, + gen_helper_gvec_3 *fn, int data) { if (fn == NULL || !dc_isar_feature(aa64_sve2, s)) { return false; @@ -3855,14 +3855,14 @@ static bool do_sve2_zzx_ool(DisasContext *s, arg_rrx_esz *a, tcg_gen_gvec_3_ool(vec_full_reg_offset(s, a->rd), vec_full_reg_offset(s, a->rn), vec_full_reg_offset(s, a->rm), - vsz, vsz, a->index, fn); + vsz, vsz, data, fn); } return true; } #define DO_SVE2_RRX(NAME, FUNC) \ static bool NAME(DisasContext *s, arg_rrx_esz *a) \ - { return do_sve2_zzx_ool(s, a, FUNC); } + { return do_sve2_zzx_data(s, a, FUNC, a->index); } DO_SVE2_RRX(trans_MUL_zzx_h, gen_helper_gvec_mul_idx_h) DO_SVE2_RRX(trans_MUL_zzx_s, gen_helper_gvec_mul_idx_s) @@ -3870,6 +3870,22 @@ DO_SVE2_RRX(trans_MUL_zzx_d, gen_helper_gvec_mul_idx_d) #undef DO_SVE2_RRX +#define DO_SVE2_RRX_TB(NAME, FUNC, TOP) \ + static bool NAME(DisasContext *s, arg_rrx_esz *a) \ + { return do_sve2_zzx_data(s, a, FUNC, (a->index << 1) | TOP); } + +DO_SVE2_RRX_TB(trans_SMULLB_zzx_s, gen_helper_sve2_smull_idx_s, false) +DO_SVE2_RRX_TB(trans_SMULLB_zzx_d, gen_helper_sve2_smull_idx_d, false) +DO_SVE2_RRX_TB(trans_SMULLT_zzx_s, gen_helper_sve2_smull_idx_s, true) +DO_SVE2_RRX_TB(trans_SMULLT_zzx_d, gen_helper_sve2_smull_idx_d, true) + +DO_SVE2_RRX_TB(trans_UMULLB_zzx_s, gen_helper_sve2_umull_idx_s, false) +DO_SVE2_RRX_TB(trans_UMULLB_zzx_d, gen_helper_sve2_umull_idx_d, false) +DO_SVE2_RRX_TB(trans_UMULLT_zzx_s, gen_helper_sve2_umull_idx_s, true) +DO_SVE2_RRX_TB(trans_UMULLT_zzx_d, gen_helper_sve2_umull_idx_d, true) + +#undef DO_SVE2_RRX_TB + static bool do_sve2_zzxz_data(DisasContext *s, arg_rrxr_esz *a, gen_helper_gvec_4 *fn, int data) { From patchwork Fri Sep 18 18:37:30 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 305000 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 129A0C43464 for ; Fri, 18 Sep 2020 19:43:42 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 9E3CB21D42 for ; Fri, 18 Sep 2020 19:43:41 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="OB9XYUM7" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 9E3CB21D42 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:53280 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJMIG-0002fo-KA for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 15:43:40 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48794) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLI8-0000yJ-TO for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:39:28 -0400 Received: from mail-pf1-x442.google.com ([2607:f8b0:4864:20::442]:33504) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLHv-0007EG-Fw for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:39:28 -0400 Received: by mail-pf1-x442.google.com with SMTP id z18so4009941pfg.0 for ; Fri, 18 Sep 2020 11:39:14 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=9SaDxjAes0OJRLPIkAIggjDGEtTmMvajhiHB5LibM7I=; b=OB9XYUM7ZUCJLaOBUQwdrQ2aT8rlTYih4sTelB8UIFhi9c3i0IP+4RAn5DpDlKVdoE 09ETP74f1Twkg1RJ1KY2GfmL4PUCwaRHTbo2xHSj10P4/ZNAgI5QdzK0Y8l1mEgsyrI+ Trnmqm0KfcEM9xliYZCLrcluRXWnyiTiZBQSFB7OQA2ez8/gnNrBOrasdHwilbi2CIHA 5onAVGaXxPOpuufDfnEhONWucIQMNMPApp4ffd+0+2MwpgU9bPIUo3D3gZi0eMpV7k+i upvquRGJvswP3byKoEwMfxjsguZdx1GdInSdhyE+9DJ370Q++3BIMuRicURhimtysyUH oXNg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=9SaDxjAes0OJRLPIkAIggjDGEtTmMvajhiHB5LibM7I=; b=KcHwKKIH/TRRHLn0PreS5Zznw0v8QII4bIAQac/WqKMdc8KvjdIENLNwW9NvItqMYy JWiQS3VS7Jt59axmwpc3LH85TqOGvdavXUuivQXzsB3fMkAYYlhUHN9x7uXxhC9rJ/X5 Q9Qgmcefx5CiFjc/9skuw84prA7MyOOPqDAAO+PaHYKvsuw0vQJuByLZPd3K4irHXazw +HhB7sVjW1giqC54w0qjvbQV89QosNge0LSEoaMPFHG5AIVUl7+l8N6xnun9rLS6tPfb yl/X5W/lBqA+50pdpIQ3I0q3Xwuyfhn6g84pDCD97mNx8kHG3m0FVNJdA+6/GzDu3/0g Uu6Q== X-Gm-Message-State: AOAM531Z7LSYkoddpCrOGLd/WctHbus69DaizoCXpHngp8rCPAQbi0I+ gK7j2Gpu59yetV7CDfQYHy18b+mNzkteBg== X-Google-Smtp-Source: ABdhPJx0+ByJNnXbxdhIUgB3Aiz1IgqP1Wtpjo3zOCdWLzeJiiehBz1gewBl1ZfbMXO53f11h5Ez0g== X-Received: by 2002:a62:2985:0:b029:142:2501:34d6 with SMTP id p127-20020a6229850000b0290142250134d6mr16933373pfp.47.1600454352836; Fri, 18 Sep 2020 11:39:12 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.39.11 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:39:11 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 60/81] target/arm: Implement SVE2 saturating multiply (indexed) Date: Fri, 18 Sep 2020 11:37:30 -0700 Message-Id: <20200918183751.2787647-61-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::442; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x442.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 5 +++++ target/arm/sve.decode | 6 ++++++ target/arm/sve_helper.c | 3 +++ target/arm/translate-sve.c | 5 +++++ 4 files changed, 19 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index d11598a75a..ff85e1327b 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2693,3 +2693,8 @@ DEF_HELPER_FLAGS_4(sve2_smull_idx_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve2_smull_idx_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve2_umull_idx_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve2_umull_idx_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_sqdmull_idx_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqdmull_idx_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index f0a4d86428..400940a18d 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -835,6 +835,12 @@ UMULLB_zzx_d 01000100 .. 1 ..... 1101.0 ..... ..... @rrxl_d UMULLT_zzx_s 01000100 .. 1 ..... 1101.1 ..... ..... @rrxl_s UMULLT_zzx_d 01000100 .. 1 ..... 1101.1 ..... ..... @rrxl_d +# SVE2 saturating multiply (indexed) +SQDMULLB_zzx_s 01000100 .. 1 ..... 1110.0 ..... ..... @rrxl_s +SQDMULLB_zzx_d 01000100 .. 1 ..... 1110.0 ..... ..... @rrxl_d +SQDMULLT_zzx_s 01000100 .. 1 ..... 1110.1 ..... ..... @rrxl_s +SQDMULLT_zzx_d 01000100 .. 1 ..... 1110.1 ..... ..... @rrxl_d + # SVE2 integer multiply (indexed) MUL_zzx_h 01000100 .. 1 ..... 111110 ..... ..... @rrx_h MUL_zzx_s 01000100 .. 1 ..... 111110 ..... ..... @rrx_s diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 80449cc569..926fe98eb5 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1581,6 +1581,9 @@ DO_ZZX(sve2_smull_idx_d, int64_t, int32_t, , H1_4, DO_MUL) DO_ZZX(sve2_umull_idx_s, uint32_t, uint16_t, H1_4, H1_2, DO_MUL) DO_ZZX(sve2_umull_idx_d, uint64_t, uint32_t, , H1_4, DO_MUL) +DO_ZZX(sve2_sqdmull_idx_s, int32_t, int16_t, H1_4, H1_2, do_sqdmull_s) +DO_ZZX(sve2_sqdmull_idx_d, int64_t, int32_t, , H1_4, do_sqdmull_d) + #undef DO_ZZX #define DO_BITPERM(NAME, TYPE, OP) \ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 39e5160e25..c30aed211f 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3884,6 +3884,11 @@ DO_SVE2_RRX_TB(trans_UMULLB_zzx_d, gen_helper_sve2_umull_idx_d, false) DO_SVE2_RRX_TB(trans_UMULLT_zzx_s, gen_helper_sve2_umull_idx_s, true) DO_SVE2_RRX_TB(trans_UMULLT_zzx_d, gen_helper_sve2_umull_idx_d, true) +DO_SVE2_RRX_TB(trans_SQDMULLB_zzx_s, gen_helper_sve2_sqdmull_idx_s, false) +DO_SVE2_RRX_TB(trans_SQDMULLB_zzx_d, gen_helper_sve2_sqdmull_idx_d, false) +DO_SVE2_RRX_TB(trans_SQDMULLT_zzx_s, gen_helper_sve2_sqdmull_idx_s, true) +DO_SVE2_RRX_TB(trans_SQDMULLT_zzx_d, gen_helper_sve2_sqdmull_idx_d, true) + #undef DO_SVE2_RRX_TB static bool do_sve2_zzxz_data(DisasContext *s, arg_rrxr_esz *a, From patchwork Fri Sep 18 18:37:31 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 273324 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4AECFC43463 for ; Fri, 18 Sep 2020 19:47:39 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 9B81521D42 for ; Fri, 18 Sep 2020 19:47:38 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="bxr3bq6e" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 9B81521D42 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:33296 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJMM5-0006Bk-Or for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 15:47:37 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48820) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLI9-00010O-Qi for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:39:29 -0400 Received: from mail-pg1-x544.google.com ([2607:f8b0:4864:20::544]:42447) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLHy-0007EV-3Q for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:39:29 -0400 Received: by mail-pg1-x544.google.com with SMTP id k14so3973227pgi.9 for ; Fri, 18 Sep 2020 11:39:16 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=lFJ8kTFmIiX5tpGTknCea588jF44LhIYKzuEd0gTcfs=; b=bxr3bq6eWIPh4qOlnsTf0jmJiEbtbfjt6recMWPnZUBh4K9EA853OJrZJEqhf4zeOi sDSRSV4oqn/TSbdSkXBlEzInCS0C6kzH2p6h/l6AJ+itGIv18Mt9A6u8KTSieiHa/HLI wpxSMACTKKDVdcDtPaB17Gy6B73TNHiuxS87yJvuSVwUcerTIBknaGMSGTha/XWHZdTM +vjjL+1Df/0LMrqvJ/rlBb8Fng6wsQ4D+jA2UD/sN3KiackpPU7/2YhuTX2YvxV264Z+ Togcg142LDEcRWCFWu9u8G6BZlTzY1Jklw2iRixtY4i8Aq7dSs/eZ16et2dCOYcOjCwu ulXQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=lFJ8kTFmIiX5tpGTknCea588jF44LhIYKzuEd0gTcfs=; b=NlyaUBFFZif5G6xyfGFJNhLD6RxjJLo820nxO7PJlKaZN+tosvI6vxA8J/3rU/3qEZ VMVKlRhFIPd577wD/apMrJeuB+A89yzc6cWritNeo+T+DKE9vQy/FfwYQuX6wpj6ueLn caKiq2yBcPpKXSrwsjJ1bgWv7/+xseoOwIQJTgevtNK7XiV8qaiN2+ny7daDhwyb8TWF z2ZBKVJpl+CWgLvQLvmYIv5v4WczrlLIwoQECaJmxJfqy0oAayzM0sgU3QMECkr3AFBs EI9KALk4C2TMOf4hKdGk88osraos7TLC34kfCa1UFr3fzVs9Md31rV+hGMMQ1htZCva9 th1A== X-Gm-Message-State: AOAM533WGJ40HSmQWl9XCKD28eG9MPEvm99r+841C1hAtXBTrsrjUfnJ nZvVcTOJisoFHWrvQdK0dYx3C4yeaCRTjg== X-Google-Smtp-Source: ABdhPJyh2iVlx8kkZdmvYawNnls3IM2hIllweNMd41N2eaf91iQQ8rxyhwvlfKHKX0iwQo4yRt+46w== X-Received: by 2002:a63:4b63:: with SMTP id k35mr29021279pgl.235.1600454354815; Fri, 18 Sep 2020 11:39:14 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.39.13 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:39:14 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 61/81] target/arm: Implement SVE2 signed saturating doubling multiply high Date: Fri, 18 Sep 2020 11:37:31 -0700 Message-Id: <20200918183751.2787647-62-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::544; envelope-from=richard.henderson@linaro.org; helo=mail-pg1-x544.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper.h | 10 +++++ target/arm/sve.decode | 4 ++ target/arm/translate-sve.c | 18 ++++++++ target/arm/vec_helper.c | 84 ++++++++++++++++++++++++++++++++++++++ 4 files changed, 116 insertions(+) diff --git a/target/arm/helper.h b/target/arm/helper.h index 8b3dc9b4e3..fa7b2b7dd7 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -944,6 +944,16 @@ DEF_HELPER_FLAGS_5(neon_sqrdmulh_h, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(neon_sqrdmulh_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqdmulh_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqdmulh_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqdmulh_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqdmulh_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_sqrdmulh_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqrdmulh_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqrdmulh_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqrdmulh_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_4(gvec_xar_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) #ifdef TARGET_AARCH64 diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 400940a18d..6879870cc1 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1214,6 +1214,10 @@ SMULH_zzz 00000100 .. 1 ..... 0110 10 ..... ..... @rd_rn_rm UMULH_zzz 00000100 .. 1 ..... 0110 11 ..... ..... @rd_rn_rm PMUL_zzz 00000100 00 1 ..... 0110 01 ..... ..... @rd_rn_rm_e0 +# SVE2 signed saturating doubling multiply high (unpredicated) +SQDMULH_zzz 00000100 .. 1 ..... 0111 00 ..... ..... @rd_rn_rm +SQRDMULH_zzz 00000100 .. 1 ..... 0111 01 ..... ..... @rd_rn_rm + ### SVE2 Integer - Predicated SADALP_zpzz 01000100 .. 000 100 101 ... ..... ..... @rdm_pg_rn diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index c30aed211f..298c328ce0 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -6434,6 +6434,24 @@ static bool trans_PMUL_zzz(DisasContext *s, arg_rrr_esz *a) return do_sve2_zzz_ool(s, a, gen_helper_gvec_pmul_b); } +static bool trans_SQDMULH_zzz(DisasContext *s, arg_rrr_esz *a) +{ + static gen_helper_gvec_3 * const fns[4] = { + gen_helper_sve2_sqdmulh_b, gen_helper_sve2_sqdmulh_h, + gen_helper_sve2_sqdmulh_s, gen_helper_sve2_sqdmulh_d, + }; + return do_sve2_zzz_ool(s, a, fns[a->esz]); +} + +static bool trans_SQRDMULH_zzz(DisasContext *s, arg_rrr_esz *a) +{ + static gen_helper_gvec_3 * const fns[4] = { + gen_helper_sve2_sqrdmulh_b, gen_helper_sve2_sqrdmulh_h, + gen_helper_sve2_sqrdmulh_s, gen_helper_sve2_sqrdmulh_d, + }; + return do_sve2_zzz_ool(s, a, fns[a->esz]); +} + /* * SVE2 Integer - Predicated */ diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index 82e12b51c1..f3a19821cd 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -81,6 +81,26 @@ void HELPER(sve2_sqrdmlsh_b)(void *vd, void *vn, void *vm, } } +void HELPER(sve2_sqdmulh_b)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + int8_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz; ++i) { + d[i] = do_sqrdmlah_b(n[i], m[i], 0, false, false); + } +} + +void HELPER(sve2_sqrdmulh_b)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + int8_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz; ++i) { + d[i] = do_sqrdmlah_b(n[i], m[i], 0, false, true); + } +} + /* Signed saturating rounding doubling multiply-accumulate high half, 16-bit */ int16_t do_sqrdmlah_h(int16_t src1, int16_t src2, int16_t src3, bool neg, bool round, uint32_t *sat) @@ -198,6 +218,28 @@ void HELPER(sve2_sqrdmlsh_h)(void *vd, void *vn, void *vm, } } +void HELPER(sve2_sqdmulh_h)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + int16_t *d = vd, *n = vn, *m = vm; + uint32_t discard; + + for (i = 0; i < opr_sz / 2; ++i) { + d[i] = do_sqrdmlah_h(n[i], m[i], 0, false, false, &discard); + } +} + +void HELPER(sve2_sqrdmulh_h)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + int16_t *d = vd, *n = vn, *m = vm; + uint32_t discard; + + for (i = 0; i < opr_sz / 2; ++i) { + d[i] = do_sqrdmlah_h(n[i], m[i], 0, false, true, &discard); + } +} + /* Signed saturating rounding doubling multiply-accumulate high half, 32-bit */ int32_t do_sqrdmlah_s(int32_t src1, int32_t src2, int32_t src3, bool neg, bool round, uint32_t *sat) @@ -309,6 +351,28 @@ void HELPER(sve2_sqrdmlsh_s)(void *vd, void *vn, void *vm, } } +void HELPER(sve2_sqdmulh_s)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + int32_t *d = vd, *n = vn, *m = vm; + uint32_t discard; + + for (i = 0; i < opr_sz / 4; ++i) { + d[i] = do_sqrdmlah_s(n[i], m[i], 0, false, false, &discard); + } +} + +void HELPER(sve2_sqrdmulh_s)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + int32_t *d = vd, *n = vn, *m = vm; + uint32_t discard; + + for (i = 0; i < opr_sz / 4; ++i) { + d[i] = do_sqrdmlah_s(n[i], m[i], 0, false, true, &discard); + } +} + /* Signed saturating rounding doubling multiply-accumulate high half, 64-bit */ static int64_t do_sat128_d(Int128 r) { @@ -368,6 +432,26 @@ void HELPER(sve2_sqrdmlsh_d)(void *vd, void *vn, void *vm, } } +void HELPER(sve2_sqdmulh_d)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + int64_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz / 8; ++i) { + d[i] = do_sqrdmlah_d(n[i], m[i], 0, false, false); + } +} + +void HELPER(sve2_sqrdmulh_d)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + int64_t *d = vd, *n = vn, *m = vm; + + for (i = 0; i < opr_sz / 8; ++i) { + d[i] = do_sqrdmlah_d(n[i], m[i], 0, false, true); + } +} + /* Integer 8 and 16-bit dot-product. * * Note that for the loops herein, host endianness does not matter From patchwork Fri Sep 18 18:37:32 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 305005 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7365EC43466 for ; Fri, 18 Sep 2020 19:33:13 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id EE83D2311D for ; Fri, 18 Sep 2020 19:33:12 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="h/KZq2LS" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org EE83D2311D Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:56720 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJM87-0000F8-UV for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 15:33:11 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48786) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLI8-0000wn-7Z for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:39:28 -0400 Received: from mail-pj1-x1041.google.com ([2607:f8b0:4864:20::1041]:35216) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLHz-0007El-U5 for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:39:27 -0400 Received: by mail-pj1-x1041.google.com with SMTP id jw11so3472161pjb.0 for ; Fri, 18 Sep 2020 11:39:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=Qqw4EmrceewPlrjWl8VNLxD1qnB49LtMy4mOdSvYd6s=; b=h/KZq2LSU04u7W0Bg1orMTK7cHKXNHZhPE6qhoS0jCd3iTw5bJn+oefworx1yRn45H mT7AD3rug9dEwYGYoX6zjgWLJw67+4K91wszt+0YOVj3HtG85hHB/enLqG5z5DdcWO9B cbSobKwYF94mZIHXvFvIFZzS4AOGEkfgWBdFtAEABXTF9z0oRJXHPTHTdyF10asVfAM6 Cht0s2fd/wWYZ0/hMjgbD7ZyHEHHwQI8bHBEy6L2zHGXClpg0hxZXzXUJ1NrKqaaXFAq xHY1LpC4M/ZB9Uc/UkkFEZ+qwlEOGjphC4ng/3GIEFcTa/ZFB5mCY6uFPq1psuEldpq6 aIdQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=Qqw4EmrceewPlrjWl8VNLxD1qnB49LtMy4mOdSvYd6s=; b=nnl9S0lMT2jkIXJbwWkXkjCM/yDwwucp1R063xh27QwWkNi/1lbEDxqji/50ZnI/Ft TQX8jT9t9ZRUqUobA+ALsFNy7byLCnyOQ/XA/zsE34QVW0g+IOlbLAbl9j41PYGVABiv MAYN5I/dGVcDqtJAPeG4Z1UjBQPkT5bhoo458W/qjNG8eSFuUVoQ73jJM0lhid7IUCmL ZZGr4RHiRGKyNG8dKXimI+DBDPV6kz06SQwxVybvGhSVJ9IoVFunjEfnNAtlzwx+d47g s4MhmSJ1WvOBvsGAD7RpkQJ6CyC1wT0QIVgbYZq9DzwQCwBWx8UNtmnvmN+1B2J18dNi ngWw== X-Gm-Message-State: AOAM530GaS3obSlt2U2+uOaZ1YPnG36qn564KsL+/JIblTmPLoEMcNOL mw2gFyQFEihMc3ggLa7IRCLj+ENwDigxRg== X-Google-Smtp-Source: ABdhPJzTez+PHrIvUqx+VMW50Br6SydidFdKajbjcgFIpOm2iiDkOvuOZslmIOvYhJNBkoWbFshK5A== X-Received: by 2002:a17:902:7083:b029:d2:4ca:281a with SMTP id z3-20020a1709027083b02900d204ca281amr6115818plk.58.1600454356081; Fri, 18 Sep 2020 11:39:16 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.39.14 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:39:15 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 62/81] target/arm: Implement SVE2 saturating multiply high (indexed) Date: Fri, 18 Sep 2020 11:37:32 -0700 Message-Id: <20200918183751.2787647-63-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::1041; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x1041.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper.h | 14 ++++++ target/arm/sve.decode | 8 ++++ target/arm/translate-sve.c | 8 ++++ target/arm/vec_helper.c | 88 ++++++++++++++++++++++++++++++++++++++ 4 files changed, 118 insertions(+) diff --git a/target/arm/helper.h b/target/arm/helper.h index fa7b2b7dd7..828fcc0fc9 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -954,6 +954,20 @@ DEF_HELPER_FLAGS_4(sve2_sqrdmulh_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve2_sqrdmulh_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve2_sqrdmulh_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqdmulh_idx_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqdmulh_idx_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqdmulh_idx_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_sqrdmulh_idx_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqrdmulh_idx_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqrdmulh_idx_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_4(gvec_xar_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) #ifdef TARGET_AARCH64 diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 6879870cc1..80d76982e8 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -841,6 +841,14 @@ SQDMULLB_zzx_d 01000100 .. 1 ..... 1110.0 ..... ..... @rrxl_d SQDMULLT_zzx_s 01000100 .. 1 ..... 1110.1 ..... ..... @rrxl_s SQDMULLT_zzx_d 01000100 .. 1 ..... 1110.1 ..... ..... @rrxl_d +# SVE2 saturating multiply high (indexed) +SQDMULH_zzx_h 01000100 .. 1 ..... 111100 ..... ..... @rrx_h +SQDMULH_zzx_s 01000100 .. 1 ..... 111100 ..... ..... @rrx_s +SQDMULH_zzx_d 01000100 .. 1 ..... 111100 ..... ..... @rrx_d +SQRDMULH_zzx_h 01000100 .. 1 ..... 111101 ..... ..... @rrx_h +SQRDMULH_zzx_s 01000100 .. 1 ..... 111101 ..... ..... @rrx_s +SQRDMULH_zzx_d 01000100 .. 1 ..... 111101 ..... ..... @rrx_d + # SVE2 integer multiply (indexed) MUL_zzx_h 01000100 .. 1 ..... 111110 ..... ..... @rrx_h MUL_zzx_s 01000100 .. 1 ..... 111110 ..... ..... @rrx_s diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 298c328ce0..bdee644c78 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3868,6 +3868,14 @@ DO_SVE2_RRX(trans_MUL_zzx_h, gen_helper_gvec_mul_idx_h) DO_SVE2_RRX(trans_MUL_zzx_s, gen_helper_gvec_mul_idx_s) DO_SVE2_RRX(trans_MUL_zzx_d, gen_helper_gvec_mul_idx_d) +DO_SVE2_RRX(trans_SQDMULH_zzx_h, gen_helper_sve2_sqdmulh_idx_h) +DO_SVE2_RRX(trans_SQDMULH_zzx_s, gen_helper_sve2_sqdmulh_idx_s) +DO_SVE2_RRX(trans_SQDMULH_zzx_d, gen_helper_sve2_sqdmulh_idx_d) + +DO_SVE2_RRX(trans_SQRDMULH_zzx_h, gen_helper_sve2_sqrdmulh_idx_h) +DO_SVE2_RRX(trans_SQRDMULH_zzx_s, gen_helper_sve2_sqrdmulh_idx_s) +DO_SVE2_RRX(trans_SQRDMULH_zzx_d, gen_helper_sve2_sqrdmulh_idx_d) + #undef DO_SVE2_RRX #define DO_SVE2_RRX_TB(NAME, FUNC, TOP) \ diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index f3a19821cd..339e722383 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -240,6 +240,36 @@ void HELPER(sve2_sqrdmulh_h)(void *vd, void *vn, void *vm, uint32_t desc) } } +void HELPER(sve2_sqdmulh_idx_h)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, j, opr_sz = simd_oprsz(desc); + int idx = simd_data(desc); + int16_t *d = vd, *n = vn, *m = (int16_t *)vm + H2(idx); + uint32_t discard; + + for (i = 0; i < opr_sz / 2; i += 16 / 2) { + int16_t mm = m[i]; + for (j = 0; j < 16 / 2; ++j) { + d[i + j] = do_sqrdmlah_h(n[i + j], mm, 0, false, false, &discard); + } + } +} + +void HELPER(sve2_sqrdmulh_idx_h)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, j, opr_sz = simd_oprsz(desc); + int idx = simd_data(desc); + int16_t *d = vd, *n = vn, *m = (int16_t *)vm + H2(idx); + uint32_t discard; + + for (i = 0; i < opr_sz / 2; i += 16 / 2) { + int16_t mm = m[i]; + for (j = 0; j < 16 / 2; ++j) { + d[i + j] = do_sqrdmlah_h(n[i + j], mm, 0, false, true, &discard); + } + } +} + /* Signed saturating rounding doubling multiply-accumulate high half, 32-bit */ int32_t do_sqrdmlah_s(int32_t src1, int32_t src2, int32_t src3, bool neg, bool round, uint32_t *sat) @@ -373,6 +403,36 @@ void HELPER(sve2_sqrdmulh_s)(void *vd, void *vn, void *vm, uint32_t desc) } } +void HELPER(sve2_sqdmulh_idx_s)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, j, opr_sz = simd_oprsz(desc); + int idx = simd_data(desc); + int32_t *d = vd, *n = vn, *m = (int32_t *)vm + H4(idx); + uint32_t discard; + + for (i = 0; i < opr_sz / 4; i += 16 / 4) { + int32_t mm = m[i]; + for (j = 0; j < 16 / 4; ++j) { + d[i + j] = do_sqrdmlah_s(n[i + j], mm, 0, false, false, &discard); + } + } +} + +void HELPER(sve2_sqrdmulh_idx_s)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, j, opr_sz = simd_oprsz(desc); + int idx = simd_data(desc); + int32_t *d = vd, *n = vn, *m = (int32_t *)vm + H4(idx); + uint32_t discard; + + for (i = 0; i < opr_sz / 4; i += 16 / 4) { + int32_t mm = m[i]; + for (j = 0; j < 16 / 4; ++j) { + d[i + j] = do_sqrdmlah_s(n[i + j], mm, 0, false, true, &discard); + } + } +} + /* Signed saturating rounding doubling multiply-accumulate high half, 64-bit */ static int64_t do_sat128_d(Int128 r) { @@ -452,6 +512,34 @@ void HELPER(sve2_sqrdmulh_d)(void *vd, void *vn, void *vm, uint32_t desc) } } +void HELPER(sve2_sqdmulh_idx_d)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, j, opr_sz = simd_oprsz(desc); + int idx = simd_data(desc); + int64_t *d = vd, *n = vn, *m = (int64_t *)vm + idx; + + for (i = 0; i < opr_sz / 8; i += 16 / 8) { + int64_t mm = m[i]; + for (j = 0; j < 16 / 8; ++j) { + d[i + j] = do_sqrdmlah_d(n[i + j], mm, 0, false, false); + } + } +} + +void HELPER(sve2_sqrdmulh_idx_d)(void *vd, void *vn, void *vm, uint32_t desc) +{ + intptr_t i, j, opr_sz = simd_oprsz(desc); + int idx = simd_data(desc); + int64_t *d = vd, *n = vn, *m = (int64_t *)vm + idx; + + for (i = 0; i < opr_sz / 8; i += 16 / 8) { + int64_t mm = m[i]; + for (j = 0; j < 16 / 8; ++j) { + d[i + j] = do_sqrdmlah_d(n[i + j], mm, 0, false, true); + } + } +} + /* Integer 8 and 16-bit dot-product. * * Note that for the loops herein, host endianness does not matter From patchwork Fri Sep 18 18:37:33 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 305015 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 37262C43465 for ; Fri, 18 Sep 2020 19:18:43 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id B1DB62311A for ; Fri, 18 Sep 2020 19:18:42 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="LA7TRKyV" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org B1DB62311A Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:43776 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJLu5-00085y-Oo for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 15:18:41 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48806) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLI9-0000zi-Eo for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:39:29 -0400 Received: from mail-pj1-x1032.google.com ([2607:f8b0:4864:20::1032]:34499) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLHz-0007G4-W0 for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:39:29 -0400 Received: by mail-pj1-x1032.google.com with SMTP id s14so4600284pju.1 for ; Fri, 18 Sep 2020 11:39:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=6pRicJYepiyqiE8M6Kw+idIsYMGa+wHuvAaulUdBdwc=; b=LA7TRKyVV+MSF0vedJX/wfNekYOcv+odUK+08KoI1/9saFwUXrXpxSvTQRJ+/LaHHk X55+OTA7EEjPwN1qL4KGklu+G09IwjAKmkS1/fOdpjOrIZFQVvUjDyQAhDk6NiLwtkgt OfWycUaRb3B6PfW1ydwDIE4ZV5A99Rudb3BCHgrOFTHiEc/B2Kf4lbXO6685rRvCH1jL 4iKmW2iRthJ6OXDM3RPoxgoKjq+ZuGJ6y9DeR6DNlnGJ/YfBZ7hZT3/9Q8h2SXRuE6mu 0EH6nZoNtIYCjKc/9PrciKcDkTKMouum8yiRWdnIYUfhXNjYJdrsNBbsiECtWRpZAuYF xkkA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=6pRicJYepiyqiE8M6Kw+idIsYMGa+wHuvAaulUdBdwc=; b=cM7JiNGINRlG7l5DKKMhH2HlMy49VQi119YA1Nqbyv/jqexfYEk7TBK6Lgh2Sh45rA lhYNVkkHVJ7L7YpiSloqh2qPi3iqQu3xl/+ySbO0lr3JNbb/Pr1l2n+3gmrWH0jkjbxu Vhv3V50w7X7awb5cb9JhUDKbP1v1J7zlhu9toINxU2qKC+Pa7MYYYvPSHQEBEHORShVJ /YBya8PyAgbtCPucbNkp96zIrWAZYq5l6oDXCW51GDDl8A4vHJ/17tn1nptblfosfAew YozPJ/enPXe5iDTbCY6acWRvIK6TkdghtYMJgyB2zGc/HSKyK7Rk8N+5hQwB2UwstniO CN8w== X-Gm-Message-State: AOAM530G1ee1wHIyvkZ4Pxe9U153Ltpfo8KbDR/Gm5TVH1K7bK3rM5p2 ZDay8oGGkWKob5SvuLnEGi71kwenIZWlJQ== X-Google-Smtp-Source: ABdhPJz+jKNwop4NZVSYJN+ZYhnbnszzwXyPDxeAkRorYUR26OCHFHbGcomX0z4hzD4r8uJlZGv4pA== X-Received: by 2002:a17:90a:d704:: with SMTP id y4mr14847592pju.159.1600454357440; Fri, 18 Sep 2020 11:39:17 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.39.16 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:39:16 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 63/81] target/arm: Implement SVE2 multiply-add long (indexed) Date: Fri, 18 Sep 2020 11:37:33 -0700 Message-Id: <20200918183751.2787647-64-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::1032; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x1032.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 17 +++++++++++++++++ target/arm/sve.decode | 18 ++++++++++++++++++ target/arm/sve_helper.c | 16 ++++++++++++++++ target/arm/translate-sve.c | 20 ++++++++++++++++++++ 4 files changed, 71 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index ff85e1327b..90872fa0e6 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2698,3 +2698,20 @@ DEF_HELPER_FLAGS_4(sve2_sqdmull_idx_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve2_sqdmull_idx_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_smlal_idx_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_smlal_idx_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_smlsl_idx_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_smlsl_idx_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_umlal_idx_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_umlal_idx_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_umlsl_idx_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_umlsl_idx_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 80d76982e8..da77ad689f 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -825,6 +825,24 @@ SQDMLSLB_zzxw_d 01000100 .. 1 ..... 0011.0 ..... ..... @rrxw_d SQDMLSLT_zzxw_s 01000100 .. 1 ..... 0011.1 ..... ..... @rrxw_s SQDMLSLT_zzxw_d 01000100 .. 1 ..... 0011.1 ..... ..... @rrxw_d +# SVE2 multiply-add long (indexed) +SMLALB_zzxw_s 01000100 .. 1 ..... 1000.0 ..... ..... @rrxw_s +SMLALB_zzxw_d 01000100 .. 1 ..... 1000.0 ..... ..... @rrxw_d +SMLALT_zzxw_s 01000100 .. 1 ..... 1000.1 ..... ..... @rrxw_s +SMLALT_zzxw_d 01000100 .. 1 ..... 1000.1 ..... ..... @rrxw_d +UMLALB_zzxw_s 01000100 .. 1 ..... 1001.0 ..... ..... @rrxw_s +UMLALB_zzxw_d 01000100 .. 1 ..... 1001.0 ..... ..... @rrxw_d +UMLALT_zzxw_s 01000100 .. 1 ..... 1001.1 ..... ..... @rrxw_s +UMLALT_zzxw_d 01000100 .. 1 ..... 1001.1 ..... ..... @rrxw_d +SMLSLB_zzxw_s 01000100 .. 1 ..... 1010.0 ..... ..... @rrxw_s +SMLSLB_zzxw_d 01000100 .. 1 ..... 1010.0 ..... ..... @rrxw_d +SMLSLT_zzxw_s 01000100 .. 1 ..... 1010.1 ..... ..... @rrxw_s +SMLSLT_zzxw_d 01000100 .. 1 ..... 1010.1 ..... ..... @rrxw_d +UMLSLB_zzxw_s 01000100 .. 1 ..... 1011.0 ..... ..... @rrxw_s +UMLSLB_zzxw_d 01000100 .. 1 ..... 1011.0 ..... ..... @rrxw_d +UMLSLT_zzxw_s 01000100 .. 1 ..... 1011.1 ..... ..... @rrxw_s +UMLSLT_zzxw_d 01000100 .. 1 ..... 1011.1 ..... ..... @rrxw_d + # SVE2 integer multiply long (indexed) SMULLB_zzx_s 01000100 .. 1 ..... 1100.0 ..... ..... @rrxl_s SMULLB_zzx_d 01000100 .. 1 ..... 1100.0 ..... ..... @rrxl_d diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 926fe98eb5..f71df89cb7 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1546,6 +1546,20 @@ void HELPER(NAME)(void *vd, void *vn, void *vm, void *va, uint32_t desc) \ } \ } +#define DO_MLA(N, M, A) (A + N * M) + +DO_ZZXW(sve2_smlal_idx_s, int32_t, int16_t, H1_4, H1_2, DO_MLA) +DO_ZZXW(sve2_smlal_idx_d, int64_t, int32_t, , H1_4, DO_MLA) +DO_ZZXW(sve2_umlal_idx_s, uint32_t, uint16_t, H1_4, H1_2, DO_MLA) +DO_ZZXW(sve2_umlal_idx_d, uint64_t, uint32_t, , H1_4, DO_MLA) + +#define DO_MLS(N, M, A) (A - N * M) + +DO_ZZXW(sve2_smlsl_idx_s, int32_t, int16_t, H1_4, H1_2, DO_MLS) +DO_ZZXW(sve2_smlsl_idx_d, int64_t, int32_t, , H1_4, DO_MLS) +DO_ZZXW(sve2_umlsl_idx_s, uint32_t, uint16_t, H1_4, H1_2, DO_MLS) +DO_ZZXW(sve2_umlsl_idx_d, uint64_t, uint32_t, , H1_4, DO_MLS) + #define DO_SQDMLAL_S(N, M, A) DO_SQADD_S(A, do_sqdmull_s(N, M)) #define DO_SQDMLAL_D(N, M, A) do_sqadd_d(A, do_sqdmull_d(N, M)) @@ -1558,6 +1572,8 @@ DO_ZZXW(sve2_sqdmlal_idx_d, int64_t, int32_t, , H1_4, DO_SQDMLAL_D) DO_ZZXW(sve2_sqdmlsl_idx_s, int32_t, int16_t, H1_4, H1_2, DO_SQDMLSL_S) DO_ZZXW(sve2_sqdmlsl_idx_d, int64_t, int32_t, , H1_4, DO_SQDMLSL_D) +#undef DO_MLA +#undef DO_MLS #undef DO_ZZXW #define DO_ZZX(NAME, TYPEW, TYPEN, HW, HN, OP) \ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index bdee644c78..248fe4de42 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3944,6 +3944,26 @@ DO_SVE2_RRXR_TB(trans_SQDMLSLB_zzxw_d, gen_helper_sve2_sqdmlsl_idx_d, false) DO_SVE2_RRXR_TB(trans_SQDMLSLT_zzxw_s, gen_helper_sve2_sqdmlsl_idx_s, true) DO_SVE2_RRXR_TB(trans_SQDMLSLT_zzxw_d, gen_helper_sve2_sqdmlsl_idx_d, true) +DO_SVE2_RRXR_TB(trans_SMLALB_zzxw_s, gen_helper_sve2_smlal_idx_s, false) +DO_SVE2_RRXR_TB(trans_SMLALB_zzxw_d, gen_helper_sve2_smlal_idx_d, false) +DO_SVE2_RRXR_TB(trans_SMLALT_zzxw_s, gen_helper_sve2_smlal_idx_s, true) +DO_SVE2_RRXR_TB(trans_SMLALT_zzxw_d, gen_helper_sve2_smlal_idx_d, true) + +DO_SVE2_RRXR_TB(trans_UMLALB_zzxw_s, gen_helper_sve2_umlal_idx_s, false) +DO_SVE2_RRXR_TB(trans_UMLALB_zzxw_d, gen_helper_sve2_umlal_idx_d, false) +DO_SVE2_RRXR_TB(trans_UMLALT_zzxw_s, gen_helper_sve2_umlal_idx_s, true) +DO_SVE2_RRXR_TB(trans_UMLALT_zzxw_d, gen_helper_sve2_umlal_idx_d, true) + +DO_SVE2_RRXR_TB(trans_SMLSLB_zzxw_s, gen_helper_sve2_smlsl_idx_s, false) +DO_SVE2_RRXR_TB(trans_SMLSLB_zzxw_d, gen_helper_sve2_smlsl_idx_d, false) +DO_SVE2_RRXR_TB(trans_SMLSLT_zzxw_s, gen_helper_sve2_smlsl_idx_s, true) +DO_SVE2_RRXR_TB(trans_SMLSLT_zzxw_d, gen_helper_sve2_smlsl_idx_d, true) + +DO_SVE2_RRXR_TB(trans_UMLSLB_zzxw_s, gen_helper_sve2_umlsl_idx_s, false) +DO_SVE2_RRXR_TB(trans_UMLSLB_zzxw_d, gen_helper_sve2_umlsl_idx_d, false) +DO_SVE2_RRXR_TB(trans_UMLSLT_zzxw_s, gen_helper_sve2_umlsl_idx_s, true) +DO_SVE2_RRXR_TB(trans_UMLSLT_zzxw_d, gen_helper_sve2_umlsl_idx_d, true) + #undef DO_SVE2_RRXR_TB /* From patchwork Fri Sep 18 18:37:34 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 273328 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A5883C43463 for ; Fri, 18 Sep 2020 19:39:26 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 18D9C22208 for ; Fri, 18 Sep 2020 19:39:25 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="M7abj4+f" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 18D9C22208 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:44948 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJME9-0007Ul-12 for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 15:39:25 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48858) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLIA-000131-Vz for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:39:31 -0400 Received: from mail-pj1-x1033.google.com ([2607:f8b0:4864:20::1033]:34500) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLI2-0007GG-0Y for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:39:30 -0400 Received: by mail-pj1-x1033.google.com with SMTP id s14so4600306pju.1 for ; Fri, 18 Sep 2020 11:39:20 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=egH0xasIOAz9Z6IwlRMxrw/89U1wEaObGVqcH5V7c9M=; b=M7abj4+fHWWP2lYJIvZLb4aPG1MzQ69wWNxSMOKZ0PX7y9vZgqETR49ZR/GKYLnres JYI0goQ8xyhg1A/Tcn3ySkr/3mdgD3IspE3sC5bxypmY43HpA1oBmjRGcTjwZq/Cfi5X sriDPcys0m9QEsYmKlY+hWDqwIQWgBTYPwBP3wlzyzFSZdhwAgdY6JpIhVcDp7ejWQyL 1QSQ7zcLH4kG/zU3YM6KUzTfpsNXO1celZxeyxt5t+zyW9tkwb/s9/5iMM2VC510WJkg IgZ1u+uApffp6w4siSkCGtHTJI4T0/M1bh0HE6B5KZsGxgaRO3j9Kufr0hmmmkFzkqHd jt6A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=egH0xasIOAz9Z6IwlRMxrw/89U1wEaObGVqcH5V7c9M=; b=XuG30uph2cbldUhqGbHKtLsnIJjmj91V8zloiQOPI9Jq0/I8+Xq/vC7Xz4HCP2/9TE EUoErCp/iCbhtRS+aHsQSqdlZ67jucSvJwEm9PjVv3yDX5tDGJ/gtVCqFoT6YLMp89VW XyFsC4anvtOZ/xh5tnca42kP65kqAzEqUOjR1+c8PCCkF7Ma/qVUdgrsYKrrVhBoez0h /Z4Bunh44VFbTHZmPXRAD5br675QByMbgQPIC8pDHJmW7Hr5xuI/GpIvHGANsjOZwWEk 5rCjzciFnEpvqBtr9ZGn1y4iJBR6EnzHCTlpPI7uQtbRe9QjOa9FxSiKPrq+JZpozPji ERWg== X-Gm-Message-State: AOAM5333iRafqfxCnSknxDEB30Ieg7DH2/uq9BON7U/M5SZ9Y4efq269 VFcyIx83qA1J+BfkB/WaGxn9yWyKRsjalw== X-Google-Smtp-Source: ABdhPJxi0GbsJga4Z3Y9GZPUrNjhm6NYsAD03x+qnMC3z8Wb/s/DrtqLZJgAySIqoyu/OYb1ieOAtw== X-Received: by 2002:a17:90b:a51:: with SMTP id gw17mr14714220pjb.118.1600454358930; Fri, 18 Sep 2020 11:39:18 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.39.17 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:39:18 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 64/81] target/arm: Implement SVE2 complex integer multiply-add (indexed) Date: Fri, 18 Sep 2020 11:37:34 -0700 Message-Id: <20200918183751.2787647-65-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::1033; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x1033.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 9 +++ target/arm/sve.decode | 12 ++++ target/arm/sve_helper.c | 142 +++++++++++++++++++++++++++++++------ target/arm/translate-sve.c | 38 +++++++--- 4 files changed, 169 insertions(+), 32 deletions(-) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 90872fa0e6..671b3a8804 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2715,3 +2715,12 @@ DEF_HELPER_FLAGS_5(sve2_umlsl_idx_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve2_umlsl_idx_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_cmla_idx_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_cmla_idx_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqrdcmlah_idx_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_sqrdcmlah_idx_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index da77ad689f..e8011fe91b 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -825,6 +825,18 @@ SQDMLSLB_zzxw_d 01000100 .. 1 ..... 0011.0 ..... ..... @rrxw_d SQDMLSLT_zzxw_s 01000100 .. 1 ..... 0011.1 ..... ..... @rrxw_s SQDMLSLT_zzxw_d 01000100 .. 1 ..... 0011.1 ..... ..... @rrxw_d +# SVE2 complex integer multiply-add (indexed) +CMLA_zzxz_h 01000100 10 1 index:2 rm:3 0110 rot:2 rn:5 rd:5 \ + ra=%reg_movprfx +CMLA_zzxz_s 01000100 11 1 index:1 rm:4 0110 rot:2 rn:5 rd:5 \ + ra=%reg_movprfx + +# SVE2 complex saturating integer multiply-add (indexed) +SQRDCMLAH_zzxz_h 01000100 10 1 index:2 rm:3 0111 rot:2 rn:5 rd:5 \ + ra=%reg_movprfx +SQRDCMLAH_zzxz_s 01000100 11 1 index:1 rm:4 0111 rot:2 rn:5 rd:5 \ + ra=%reg_movprfx + # SVE2 multiply-add long (indexed) SMLALB_zzxw_s 01000100 .. 1 ..... 1000.0 ..... ..... @rrxw_s SMLALB_zzxw_d 01000100 .. 1 ..... 1000.0 ..... ..... @rrxw_d diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index f71df89cb7..b2f24d5cfd 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1466,34 +1466,132 @@ void HELPER(NAME)(void *vd, void *vn, void *vm, void *va, uint32_t desc) \ } \ } -#define do_cmla(N, M, A, S) (A + (N * M) * (S ? -1 : 1)) +static int8_t do_cmla_b(int8_t n, int8_t m, int8_t a, bool sub) +{ + return n * m * (sub ? -1 : 1) + a; +} -DO_CMLA(sve2_cmla_zzzz_b, uint8_t, H1, do_cmla) -DO_CMLA(sve2_cmla_zzzz_h, uint16_t, H2, do_cmla) -DO_CMLA(sve2_cmla_zzzz_s, uint32_t, H4, do_cmla) -DO_CMLA(sve2_cmla_zzzz_d, uint64_t, , do_cmla) +static int16_t do_cmla_h(int16_t n, int16_t m, int16_t a, bool sub) +{ + return n * m * (sub ? -1 : 1) + a; +} -#define DO_SQRDMLAH_B(N, M, A, S) \ - do_sqrdmlah_b(N, M, A, S, true) -#define DO_SQRDMLAH_H(N, M, A, S) \ - ({ uint32_t discard; do_sqrdmlah_h(N, M, A, S, true, &discard); }) -#define DO_SQRDMLAH_S(N, M, A, S) \ - ({ uint32_t discard; do_sqrdmlah_s(N, M, A, S, true, &discard); }) -#define DO_SQRDMLAH_D(N, M, A, S) \ - do_sqrdmlah_d(N, M, A, S, true) +static int32_t do_cmla_s(int32_t n, int32_t m, int32_t a, bool sub) +{ + return n * m * (sub ? -1 : 1) + a; +} -DO_CMLA(sve2_sqrdcmlah_zzzz_b, int8_t, H1, DO_SQRDMLAH_B) -DO_CMLA(sve2_sqrdcmlah_zzzz_h, int16_t, H2, DO_SQRDMLAH_H) -DO_CMLA(sve2_sqrdcmlah_zzzz_s, int32_t, H4, DO_SQRDMLAH_S) -DO_CMLA(sve2_sqrdcmlah_zzzz_d, int64_t, , DO_SQRDMLAH_D) +static int64_t do_cmla_d(int64_t n, int64_t m, int64_t a, bool sub) +{ + return n * m * (sub ? -1 : 1) + a; +} + +DO_CMLA(sve2_cmla_zzzz_b, uint8_t, H1, do_cmla_b) +DO_CMLA(sve2_cmla_zzzz_h, uint16_t, H2, do_cmla_h) +DO_CMLA(sve2_cmla_zzzz_s, uint32_t, H4, do_cmla_s) +DO_CMLA(sve2_cmla_zzzz_d, uint64_t, , do_cmla_d) + +static int8_t do_sqrdcmlah_b(int8_t n, int8_t m, int8_t a, bool sub) +{ + return do_sqrdmlah_b(n, m, a, sub, true); +} + +static int16_t do_sqrdcmlah_h(int16_t n, int16_t m, int16_t a, bool sub) +{ + uint32_t discard; + return do_sqrdmlah_h(n, m, a, sub, true, &discard); +} + +static int32_t do_sqrdcmlah_s(int32_t n, int32_t m, int32_t a, bool sub) +{ + uint32_t discard; + return do_sqrdmlah_s(n, m, a, sub, true, &discard); +} + +static int64_t do_sqrdcmlah_d(int64_t n, int64_t m, int64_t a, bool sub) +{ + return do_sqrdmlah_d(n, m, a, sub, true); +} + +DO_CMLA(sve2_sqrdcmlah_zzzz_b, int8_t, H1, do_sqrdcmlah_b) +DO_CMLA(sve2_sqrdcmlah_zzzz_h, int16_t, H2, do_sqrdcmlah_h) +DO_CMLA(sve2_sqrdcmlah_zzzz_s, int32_t, H4, do_sqrdcmlah_s) +DO_CMLA(sve2_sqrdcmlah_zzzz_d, int64_t, , do_sqrdcmlah_d) -#undef DO_SQRDMLAH_B -#undef DO_SQRDMLAH_H -#undef DO_SQRDMLAH_S -#undef DO_SQRDMLAH_D -#undef do_cmla #undef DO_CMLA +static void do_cmla_idx_h(int16_t *d, int16_t *n, int16_t *m, + int16_t *a, uint32_t desc, + int16_t (*fn)(int16_t, int16_t, int16_t, bool)) +{ + intptr_t i, j, oprsz = simd_oprsz(desc); + int rot = extract32(desc, SIMD_DATA_SHIFT, 2); + int idx = extract32(desc, SIMD_DATA_SHIFT + 2, 2) * 2; + int sel_a = rot & 1, sel_b = sel_a ^ 1; + bool sub_r = rot == 1 || rot == 2; + bool sub_i = rot >= 2; + + for (i = 0; i < oprsz / 2; i += 16 / 2) { + int16_t elt2_a = m[H2(i + idx + sel_a)]; + int16_t elt2_b = m[H2(i + idx + sel_b)]; + + for (j = 0; j < 16 / 2; j += 2) { + int16_t elt1_a = n[H2(i + j + sel_a)]; + + d[H2(i + j)] = fn(elt1_a, elt2_a, a[H2(i + j)], sub_r); + d[H2(i + j + 1)] = fn(elt1_a, elt2_b, a[H2(i + j + 1)], sub_i); + } + } +} + +void HELPER(sve2_cmla_idx_h)(void *vd, void *vn, void *vm, + void *va, uint32_t desc) +{ + do_cmla_idx_h(vd, vn, vm, va, desc, do_cmla_h); +} + +void HELPER(sve2_sqrdcmlah_idx_h)(void *vd, void *vn, void *vm, + void *va, uint32_t desc) +{ + do_cmla_idx_h(vd, vn, vm, va, desc, do_sqrdcmlah_h); +} + +static void do_cmla_idx_s(int32_t *d, int32_t *n, int32_t *m, + int32_t *a, uint32_t desc, + int32_t (*fn)(int32_t, int32_t, int32_t, bool)) +{ + intptr_t i, j, oprsz = simd_oprsz(desc); + int rot = extract32(desc, SIMD_DATA_SHIFT, 2); + int idx = extract32(desc, SIMD_DATA_SHIFT + 2, 2) * 2; + int sel_a = rot & 1, sel_b = sel_a ^ 1; + bool sub_r = rot == 1 || rot == 2; + bool sub_i = rot >= 2; + + for (i = 0; i < oprsz / 4; i += 16 / 4) { + int32_t elt2_a = m[H4(i + idx + sel_a)]; + int32_t elt2_b = m[H4(i + idx + sel_b)]; + + for (j = 0; j < 16 / 4; j += 2) { + int32_t elt1_a = n[H4(i + j + sel_a)]; + + d[H4(i + j)] = fn(elt1_a, elt2_a, a[H4(i + j)], sub_r); + d[H4(i + j + 1)] = fn(elt1_a, elt2_b, a[H4(i + j + 1)], sub_i); + } + } +} + +void HELPER(sve2_cmla_idx_s)(void *vd, void *vn, void *vm, + void *va, uint32_t desc) +{ + do_cmla_idx_s(vd, vn, vm, va, desc, do_cmla_s); +} + +void HELPER(sve2_sqrdcmlah_idx_s)(void *vd, void *vn, void *vm, + void *va, uint32_t desc) +{ + do_cmla_idx_s(vd, vn, vm, va, desc, do_sqrdcmlah_s); +} + #define DO_ZZXZ(NAME, TYPE, H, OP) \ void HELPER(NAME)(void *vd, void *vn, void *vm, void *va, uint32_t desc) \ { \ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 248fe4de42..4628198b76 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3821,21 +3821,21 @@ static bool trans_DOT_zzzz(DisasContext *s, arg_DOT_zzzz *a) * SVE Multiply - Indexed */ -static bool do_zzxz_data(DisasContext *s, arg_rrxr_esz *a, +static bool do_zzxz_data(DisasContext *s, int rd, int rn, int rm, int ra, gen_helper_gvec_4 *fn, int data) { if (fn == NULL) { return false; } if (sve_access_check(s)) { - gen_gvec_ool_zzzz(s, fn, a->rd, a->rn, a->rm, a->ra, data); + gen_gvec_ool_zzzz(s, fn, rd, rn, rm, ra, data); } return true; } #define DO_RRXR(NAME, FUNC) \ static bool NAME(DisasContext *s, arg_rrxr_esz *a) \ - { return do_zzxz_data(s, a, FUNC, a->index); } + { return do_zzxz_data(s, a->rd, a->rn, a->rm, a->ra, FUNC, a->index); } DO_RRXR(trans_SDOT_zzxw_s, gen_helper_gvec_sdot_idx_b) DO_RRXR(trans_SDOT_zzxw_d, gen_helper_gvec_sdot_idx_h) @@ -3899,18 +3899,18 @@ DO_SVE2_RRX_TB(trans_SQDMULLT_zzx_d, gen_helper_sve2_sqdmull_idx_d, true) #undef DO_SVE2_RRX_TB -static bool do_sve2_zzxz_data(DisasContext *s, arg_rrxr_esz *a, - gen_helper_gvec_4 *fn, int data) +static bool do_sve2_zzxz_data(DisasContext *s, int rd, int rn, int rm, + int ra, gen_helper_gvec_4 *fn, int data) { if (!dc_isar_feature(aa64_sve2, s)) { return false; } - return do_zzxz_data(s, a, fn, data); + return do_zzxz_data(s, rd, rn, rm, ra, fn, data); } #define DO_SVE2_RRXR(NAME, FUNC) \ - static bool NAME(DisasContext *s, arg_rrxr_esz *a) \ - { return do_sve2_zzxz_data(s, a, FUNC, a->index); } +static bool NAME(DisasContext *s, arg_rrxr_esz *a) \ +{ return do_sve2_zzxz_data(s, a->rd, a->rn, a->rm, a->ra, FUNC, a->index); } DO_SVE2_RRXR(trans_MLA_zzxz_h, gen_helper_gvec_mla_idx_h) DO_SVE2_RRXR(trans_MLA_zzxz_s, gen_helper_gvec_mla_idx_s) @@ -3931,8 +3931,11 @@ DO_SVE2_RRXR(trans_SQRDMLSH_zzxz_d, gen_helper_sve2_sqrdmlsh_idx_d) #undef DO_SVE2_RRXR #define DO_SVE2_RRXR_TB(NAME, FUNC, TOP) \ - static bool NAME(DisasContext *s, arg_rrxr_esz *a) \ - { return do_sve2_zzxz_data(s, a, FUNC, (a->index << 1) | TOP); } +static bool NAME(DisasContext *s, arg_rrxr_esz *a) \ +{ \ + return do_sve2_zzxz_data(s, a->rd, a->rn, a->rm, a->ra, \ + FUNC, (a->index << 1) | TOP); \ +} DO_SVE2_RRXR_TB(trans_SQDMLALB_zzxw_s, gen_helper_sve2_sqdmlal_idx_s, false) DO_SVE2_RRXR_TB(trans_SQDMLALB_zzxw_d, gen_helper_sve2_sqdmlal_idx_d, false) @@ -3966,6 +3969,21 @@ DO_SVE2_RRXR_TB(trans_UMLSLT_zzxw_d, gen_helper_sve2_umlsl_idx_d, true) #undef DO_SVE2_RRXR_TB +#define DO_SVE2_RRXR_ROT(NAME, FUNC) \ +static bool trans_##NAME(DisasContext *s, arg_##NAME *a) \ +{ \ + return do_sve2_zzxz_data(s, a->rd, a->rn, a->rm, a->ra, \ + FUNC, (a->index << 2) | a->rot); \ +} + +DO_SVE2_RRXR_ROT(CMLA_zzxz_h, gen_helper_sve2_cmla_idx_h) +DO_SVE2_RRXR_ROT(CMLA_zzxz_s, gen_helper_sve2_cmla_idx_s) + +DO_SVE2_RRXR_ROT(SQRDCMLAH_zzxz_h, gen_helper_sve2_sqrdcmlah_idx_h) +DO_SVE2_RRXR_ROT(SQRDCMLAH_zzxz_s, gen_helper_sve2_sqrdcmlah_idx_s) + +#undef DO_SVE2_RRXR_ROT + /* *** SVE Floating Point Multiply-Add Indexed Group */ From patchwork Fri Sep 18 18:37:35 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 273330 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id E20C5C43464 for ; Fri, 18 Sep 2020 19:36:43 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 53C4721741 for ; Fri, 18 Sep 2020 19:36:43 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="dSYr5dhO" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 53C4721741 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:36762 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJMBW-00047d-BZ for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 15:36:42 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48842) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLIA-000125-IE for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:39:30 -0400 Received: from mail-pj1-x1043.google.com ([2607:f8b0:4864:20::1043]:52248) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLI2-0007GP-1O for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:39:30 -0400 Received: by mail-pj1-x1043.google.com with SMTP id bw23so786464pjb.2 for ; Fri, 18 Sep 2020 11:39:21 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=h6MHIdnd/k4ep8IYhrC6CDpmus0bNBpHI413VcRuYZI=; b=dSYr5dhO5Q/k1KdyIe5UvLoj0996z6qZfq2FEv8c0UJI2t3NFar3V8o4aAZ4KbkS7/ 7n6S1YFsjqTMxgyygeU57hlpWaG9vCPtQ6UPrWQxdpecJr/6nQe4iQNPUupjKutnjakK B2bxWB9okM+TNjTY2fYNYYHlUUAg0VySTxoeZBakSVoPzKpdw7Ys9gwFrmd4ruwYvMeU mn3qffP0dLOWHdzA0JuPdytRwzpaoepeBwxZ59RE+C1hnqKnd0BAK0c2wPyGeHCASwaC 2PJbJBSfO4/nRHURMqNGL6FFHM5lw5nmzhpN3tC1SGKD570hQpIuKrspPU3AN4X0BxVt WlmA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=h6MHIdnd/k4ep8IYhrC6CDpmus0bNBpHI413VcRuYZI=; b=Sbrw+RMvfs2D71do4Mn2znkrprm431S8Fa6a5Sc685I92AvIriFVsJG4uJeQh40CRW Tbekhhg4OwU/TQX4y09SMjGtcqftYdaRl4xXMrCnAbYLYVzKiJ47Y9dY9PEYkJylY/Xp PwAfj7Xdd34/6u7lJVSsF4tqGnkA6dUd1iaqjb0Bi9zPimM4REhR5OkfMHcwOki76Duj Hmicq6ecGyYiUGNZZsb6P4zG9t+VSnSYO/78Vn8SXOoTaBvY8kOSvIReCsqodbB+plCV n92tNeHFLR6j32kKlz89GcHfRHGOWdrtttYiIdYmYdXdyUn8J3LEwBmMH18fLOJrxbf2 Gk7Q== X-Gm-Message-State: AOAM532Db/56efWW7rlOamYA85b1N6PyQf2yZUELWUsvRIEVTz82CPne DNPqbV9olCJo+oCM53ym6wDnMtvuWZq/iQ== X-Google-Smtp-Source: ABdhPJzHBMsQsXJBlG1VnUFr+8PdaVTVIHEvMM5IkEZf9nn1ebkXdgelmaIRQQwMlwDOptfJYsrl1Q== X-Received: by 2002:a17:90b:1b50:: with SMTP id nv16mr14322114pjb.153.1600454359943; Fri, 18 Sep 2020 11:39:19 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.39.19 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:39:19 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 65/81] target/arm: Implement SVE mixed sign dot product (indexed) Date: Fri, 18 Sep 2020 11:37:35 -0700 Message-Id: <20200918183751.2787647-66-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::1043; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x1043.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/cpu.h | 5 +++ target/arm/helper.h | 4 +++ target/arm/sve.decode | 4 +++ target/arm/translate-sve.c | 18 +++++++++++ target/arm/vec_helper.c | 66 ++++++++++++++++++++++++++++++++++++++ 5 files changed, 97 insertions(+) diff --git a/target/arm/cpu.h b/target/arm/cpu.h index aaefbf0167..b394e15c0c 100644 --- a/target/arm/cpu.h +++ b/target/arm/cpu.h @@ -3917,6 +3917,11 @@ static inline bool isar_feature_aa64_sve2_bitperm(const ARMISARegisters *id) return FIELD_EX64(id->id_aa64zfr0, ID_AA64ZFR0, BITPERM) != 0; } +static inline bool isar_feature_aa64_sve2_i8mm(const ARMISARegisters *id) +{ + return FIELD_EX64(id->id_aa64zfr0, ID_AA64ZFR0, I8MM) != 0; +} + static inline bool isar_feature_aa64_sve2_f32mm(const ARMISARegisters *id) { return FIELD_EX64(id->id_aa64zfr0, ID_AA64ZFR0, F32MM) != 0; diff --git a/target/arm/helper.h b/target/arm/helper.h index 828fcc0fc9..0f17b59f45 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -608,6 +608,10 @@ DEF_HELPER_FLAGS_5(gvec_sdot_idx_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(gvec_udot_idx_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_sudot_idx_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_usdot_idx_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(gvec_fcaddh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index e8011fe91b..51acbfa797 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -815,6 +815,10 @@ SQRDMLSH_zzxz_h 01000100 .. 1 ..... 000101 ..... ..... @rrxr_h SQRDMLSH_zzxz_s 01000100 .. 1 ..... 000101 ..... ..... @rrxr_s SQRDMLSH_zzxz_d 01000100 .. 1 ..... 000101 ..... ..... @rrxr_d +# SVE mixed sign dot product (indexed) +USDOT_zzxw_s 01000100 .. 1 ..... 000110 ..... ..... @rrxr_s +SUDOT_zzxw_s 01000100 .. 1 ..... 000111 ..... ..... @rrxr_s + # SVE2 saturating multiply-add (indexed) SQDMLALB_zzxw_s 01000100 .. 1 ..... 0010.0 ..... ..... @rrxw_s SQDMLALB_zzxw_d 01000100 .. 1 ..... 0010.0 ..... ..... @rrxw_d diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 4628198b76..ae61659b1d 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -3842,6 +3842,24 @@ DO_RRXR(trans_SDOT_zzxw_d, gen_helper_gvec_sdot_idx_h) DO_RRXR(trans_UDOT_zzxw_s, gen_helper_gvec_udot_idx_b) DO_RRXR(trans_UDOT_zzxw_d, gen_helper_gvec_udot_idx_h) +static bool trans_SUDOT_zzxw_s(DisasContext *s, arg_rrxr_esz *a) +{ + if (!dc_isar_feature(aa64_sve2_i8mm, s)) { + return false; + } + return do_zzxz_data(s, a->rd, a->rn, a->rm, a->ra, + gen_helper_gvec_sudot_idx_b, a->index); +} + +static bool trans_USDOT_zzxw_s(DisasContext *s, arg_rrxr_esz *a) +{ + if (!dc_isar_feature(aa64_sve2_i8mm, s)) { + return false; + } + return do_zzxz_data(s, a->rd, a->rn, a->rm, a->ra, + gen_helper_gvec_usdot_idx_b, a->index); +} + #undef DO_RRXR static bool do_sve2_zzx_data(DisasContext *s, arg_rrx_esz *a, diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index 339e722383..8dccc6789d 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -677,6 +677,72 @@ void HELPER(gvec_udot_idx_b)(void *vd, void *vn, void *vm, clear_tail(d, opr_sz, simd_maxsz(desc)); } +void HELPER(gvec_sudot_idx_b)(void *vd, void *vn, void *vm, + void *va, uint32_t desc) +{ + intptr_t i, segend, opr_sz = simd_oprsz(desc), opr_sz_4 = opr_sz / 4; + intptr_t index = simd_data(desc); + int32_t *d = vd, *a = va; + int8_t *n = vn; + uint8_t *m_indexed = (uint8_t *)vm + index * 4; + + /* Notice the special case of opr_sz == 8, from aa64/aa32 advsimd. + * Otherwise opr_sz is a multiple of 16. + */ + segend = MIN(4, opr_sz_4); + i = 0; + do { + uint8_t m0 = m_indexed[i * 4 + 0]; + uint8_t m1 = m_indexed[i * 4 + 1]; + uint8_t m2 = m_indexed[i * 4 + 2]; + uint8_t m3 = m_indexed[i * 4 + 3]; + + do { + d[i] = (a[i] + + n[i * 4 + 0] * m0 + + n[i * 4 + 1] * m1 + + n[i * 4 + 2] * m2 + + n[i * 4 + 3] * m3); + } while (++i < segend); + segend = i + 4; + } while (i < opr_sz_4); + + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + +void HELPER(gvec_usdot_idx_b)(void *vd, void *vn, void *vm, + void *va, uint32_t desc) +{ + intptr_t i, segend, opr_sz = simd_oprsz(desc), opr_sz_4 = opr_sz / 4; + intptr_t index = simd_data(desc); + uint32_t *d = vd, *a = va; + uint8_t *n = vn; + int8_t *m_indexed = (int8_t *)vm + index * 4; + + /* Notice the special case of opr_sz == 8, from aa64/aa32 advsimd. + * Otherwise opr_sz is a multiple of 16. + */ + segend = MIN(4, opr_sz_4); + i = 0; + do { + int8_t m0 = m_indexed[i * 4 + 0]; + int8_t m1 = m_indexed[i * 4 + 1]; + int8_t m2 = m_indexed[i * 4 + 2]; + int8_t m3 = m_indexed[i * 4 + 3]; + + do { + d[i] = (a[i] + + n[i * 4 + 0] * m0 + + n[i * 4 + 1] * m1 + + n[i * 4 + 2] * m2 + + n[i * 4 + 3] * m3); + } while (++i < segend); + segend = i + 4; + } while (i < opr_sz_4); + + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + void HELPER(gvec_sdot_idx_h)(void *vd, void *vn, void *vm, void *va, uint32_t desc) { From patchwork Fri Sep 18 18:37:36 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 273338 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id E551AC43463 for ; Fri, 18 Sep 2020 19:22:26 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 589FE221EC for ; Fri, 18 Sep 2020 19:22:26 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="UrK7WOsi" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 589FE221EC Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:55214 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJLxh-0004SA-CR for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 15:22:25 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48870) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLIB-000144-Cs for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:39:31 -0400 Received: from mail-pg1-x52a.google.com ([2607:f8b0:4864:20::52a]:47038) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLI7-0007GV-0Z for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:39:31 -0400 Received: by mail-pg1-x52a.google.com with SMTP id 34so3965882pgo.13 for ; Fri, 18 Sep 2020 11:39:22 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=Myz+Lsz2iwOPmsiQ4nxJKVJHSA2kWzSrZKdUlA8XURc=; b=UrK7WOsiz3KescKGh9Dx2c1IKhdkjEh2Mycnv4zxd39k25cFF5UbLerm8Ninkp1Snd TiBvygQ47Cybxcb7PHujx9StUNRnXH4R0Dfw7wMcumxiM4Ie8TOwPYHkeLnTzYQVzGPi e8f6jO8wbZulvztpKMf5Sb+AhpYGeq7kAr0LjZAZ3P25xfKMVApX7mKZ9KfEH8Utf7hS 0qwmOgAOBiQggFTX83PQMEq5kze1u5utTt9M4ADch8xQIHqsEY+5UwO8iafXFYpkb7xc ecwIcf62o70mbm0q/uE1KNhd4tp8YEgLgPQILoN3MxMv0R5VJ0C6+UH0Boef6HdGkTbm DywQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=Myz+Lsz2iwOPmsiQ4nxJKVJHSA2kWzSrZKdUlA8XURc=; b=AEbGvg7YS95xcgAFmKVSvrwo6A0En5rMGZoOEGPs5Pe/0C0u/E8vGu4S1F4xjhsDoC quxCPu2ZJ6lGgiAySQSAYvuy9QQDPjHHdQWeNb9Ik5WOCbe0l3uMTk6V36YeknnLnYxi hb3a2GKLAL5Z6YKqLCH8MK2q9YAMr38ylA26wI9DD4ek837p92Ab+GYhMe0/UCsarQyO L2QhB0JKorsezgLaS0QiCh+79AUfiW9t86/mFbz9HcZwSWMnuTaPrS02+NyOrHb7+N9G LhIYyERuDo7ExuslIuoj5FPSXU/AHpd0X9AebVK1jXMvSMZgDXo8FHQ2lMA19C+bOmk1 Du1Q== X-Gm-Message-State: AOAM530KdWgjoqxnN0uRcWAGKuoEK0UB5kUT1DKjG4mIDmxVHtUnQzRL 7nxLOqjTzgzfJJzuyFU5WMKofxwz5ZP85Q== X-Google-Smtp-Source: ABdhPJwMWOF61vqmUTlETHYUf6V9hPuy6IlOw+V0Y3nCK8plpdUJeqyGLdywK/2VDcYKtrPnqPKiqw== X-Received: by 2002:a62:5a04:0:b029:142:2501:397f with SMTP id o4-20020a625a040000b02901422501397fmr17135229pfb.68.1600454361056; Fri, 18 Sep 2020 11:39:21 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.39.20 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:39:20 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 66/81] target/arm: Implement SVE mixed sign dot product Date: Fri, 18 Sep 2020 11:37:36 -0700 Message-Id: <20200918183751.2787647-67-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::52a; envelope-from=richard.henderson@linaro.org; helo=mail-pg1-x52a.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper.h | 2 ++ target/arm/sve.decode | 4 ++++ target/arm/translate-sve.c | 16 ++++++++++++++++ target/arm/vec_helper.c | 18 ++++++++++++++++++ 4 files changed, 40 insertions(+) diff --git a/target/arm/helper.h b/target/arm/helper.h index 0f17b59f45..868c2e9121 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -599,6 +599,8 @@ DEF_HELPER_FLAGS_5(gvec_sdot_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(gvec_udot_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(gvec_sdot_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(gvec_udot_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(gvec_usdot_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(gvec_sdot_idx_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 51acbfa797..0be8a020f6 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1523,6 +1523,10 @@ UMLSLT_zzzw 01000100 .. 0 ..... 010 111 ..... ..... @rda_rn_rm CMLA_zzzz 01000100 esz:2 0 rm:5 0010 rot:2 rn:5 rd:5 ra=%reg_movprfx SQRDCMLAH_zzzz 01000100 esz:2 0 rm:5 0011 rot:2 rn:5 rd:5 ra=%reg_movprfx +## SVE mixed sign dot product + +USDOT_zzzz 01000100 .. 0 ..... 011 110 ..... ..... @rda_rn_rm + ### SVE2 floating point matrix multiply accumulate FMMLA 01100100 .. 1 ..... 111001 ..... ..... @rda_rn_rm diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index ae61659b1d..4341b83414 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -8111,3 +8111,19 @@ static bool trans_SQRDCMLAH_zzzz(DisasContext *s, arg_SQRDCMLAH_zzzz *a) } return true; } + +static bool trans_USDOT_zzzz(DisasContext *s, arg_USDOT_zzzz *a) +{ + if (a->esz != 2 || !dc_isar_feature(aa64_sve2_i8mm, s)) { + return false; + } + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + tcg_gen_gvec_4_ool(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + vec_full_reg_offset(s, a->ra), + vsz, vsz, 0, gen_helper_gvec_usdot_b); + } + return true; +} diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index 8dccc6789d..c8d1656797 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -579,6 +579,24 @@ void HELPER(gvec_udot_b)(void *vd, void *vn, void *vm, void *va, uint32_t desc) clear_tail(d, opr_sz, simd_maxsz(desc)); } +void HELPER(gvec_usdot_b)(void *vd, void *vn, void *vm, + void *va, uint32_t desc) +{ + intptr_t i, opr_sz = simd_oprsz(desc); + int32_t *d = vd, *a = va; + uint8_t *n = vn; + int8_t *m = vm; + + for (i = 0; i < opr_sz / 4; ++i) { + d[i] = (a[i] + + n[i * 4 + 0] * m[i * 4 + 0] + + n[i * 4 + 1] * m[i * 4 + 1] + + n[i * 4 + 2] * m[i * 4 + 2] + + n[i * 4 + 3] * m[i * 4 + 3]); + } + clear_tail(d, opr_sz, simd_maxsz(desc)); +} + void HELPER(gvec_sdot_h)(void *vd, void *vn, void *vm, void *va, uint32_t desc) { intptr_t i, opr_sz = simd_oprsz(desc); From patchwork Fri Sep 18 18:37:37 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 273325 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3D461C43464 for ; Fri, 18 Sep 2020 19:46:46 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id A9D0821D42 for ; Fri, 18 Sep 2020 19:46:45 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="VYyt2Zyz" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org A9D0821D42 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:60148 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJMLE-0005bf-Pw for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 15:46:44 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48892) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLIC-000162-F5 for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:39:32 -0400 Received: from mail-pg1-x529.google.com ([2607:f8b0:4864:20::529]:37671) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLI7-0007Gk-2B for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:39:32 -0400 Received: by mail-pg1-x529.google.com with SMTP id l71so3988702pge.4 for ; Fri, 18 Sep 2020 11:39:23 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=Eth0Qo7lxODtnYEXQSTKXdT5egBkmtCsDYgIFXZmsyU=; b=VYyt2ZyzRh8URWydhUC2wro4KU2q+TVs6aLIUyO3mDwU8L0id/jtHtsfioA3pgVpNb D3phoHJyywAW5dZJ1KSCU4zv7gUwcnCRRGEIfLchT7OIjSDE8B/BIlCMwa0n72UeyHGi g14FRwV/4RRSIyjlCPnFJ1ivCxY46M/Lm/3UE6f5x/er2d48LBE11s7yc3uNt2vCMiJe GEYoSpXpMVp6w6bdi4w5c4jC5d/LWpgW+gSsZ6xhz/STbhSJG5KjnKVis39mmoumYQMk YQ8qYBGgakkksqm99MkPZoPpt2F582lFOMQbqr0PH7ur30fGjo20ykDw49I+IzHeFbQ6 uz7w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=Eth0Qo7lxODtnYEXQSTKXdT5egBkmtCsDYgIFXZmsyU=; b=LA27qlnbtMBVNZG61SIuAwmkGV6jumk+ZdfCtPmU4aQHlA6euGIyqDMAZSOGRBoyGw G4onb0NhrfJ+o9vldK/iq8seDeUJf7s5gS+dDSKyARPiVNuc8QQlcdCAcuxZHe9z7xOA yRqlx6AjkZLaiQxDBn/GETwybO5EPTtCsu5K7NOcBAmf7MGt+iD0VHnr5xWFi7UOgvSG q38ztzOj0wKHTF3YwvVqyTic2vOJOencjfu/QatjF2E0VlmtmhGeu9Bl5JIUQ30wRrbj a/f3sBsjx4+p+wM7yONZbdkh2Ac7weKTTrVW/EltwT4HtYbRanLwKhlWpRMAbNgfwpUt 303g== X-Gm-Message-State: AOAM533rxlsV3jlN2YIprv0wmvxwmuIMG11gxzQxDc95NgQVfZ72jTI2 gtzYJKulxIpPMUU5/tpvCHoWo/0GFqc1Yg== X-Google-Smtp-Source: ABdhPJwQRyw6+u8vzCg0DqkO+2bYsXrUqZ1XdCNqvJp89Ls7oeAG3mhW16CnZFAUZr4mtJKmeRpEyg== X-Received: by 2002:a62:3387:0:b029:13f:ed60:b208 with SMTP id z129-20020a6233870000b029013fed60b208mr23851107pfz.24.1600454362214; Fri, 18 Sep 2020 11:39:22 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.39.21 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:39:21 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 67/81] target/arm: Implement SVE2 crypto unary operations Date: Fri, 18 Sep 2020 11:37:37 -0700 Message-Id: <20200918183751.2787647-68-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::529; envelope-from=richard.henderson@linaro.org; helo=mail-pg1-x529.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/sve.decode | 6 ++++++ target/arm/translate-sve.c | 11 +++++++++++ 2 files changed, 17 insertions(+) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 0be8a020f6..9b0d0f3a5d 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1551,3 +1551,9 @@ STNT1_zprz 1110010 .. 00 ..... 001 ... ..... ..... \ # SVE2 32-bit scatter non-temporal store (vector plus scalar) STNT1_zprz 1110010 .. 10 ..... 001 ... ..... ..... \ @rprr_scatter_store xs=0 esz=2 scale=0 + +### SVE2 Crypto Extensions + +# SVE2 crypto unary operations +# AESMC and AESIMC +AESMC 01000101 00 10000011100 decrypt:1 00000 rd:5 diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 4341b83414..3c46fd7e2b 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -8127,3 +8127,14 @@ static bool trans_USDOT_zzzz(DisasContext *s, arg_USDOT_zzzz *a) } return true; } + +static bool trans_AESMC(DisasContext *s, arg_AESMC *a) +{ + if (!dc_isar_feature(aa64_sve2_aes, s)) { + return false; + } + if (sve_access_check(s)) { + gen_gvec_ool_zz(s, gen_helper_crypto_aesmc, a->rd, a->rd, a->decrypt); + } + return true; +} From patchwork Fri Sep 18 18:37:38 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 273323 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 284B5C43464 for ; Fri, 18 Sep 2020 19:51:18 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id A3A1F21481 for ; Fri, 18 Sep 2020 19:51:17 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="Fz+RIHwG" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org A3A1F21481 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:39992 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJMPc-0000YU-Hw for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 15:51:16 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48882) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLIC-00015J-5h for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:39:32 -0400 Received: from mail-pf1-x441.google.com ([2607:f8b0:4864:20::441]:46711) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLI7-0007Gu-1f for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:39:31 -0400 Received: by mail-pf1-x441.google.com with SMTP id b124so3979287pfg.13 for ; Fri, 18 Sep 2020 11:39:26 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=fTY9EyKnQ6jXRZCchX3NoLEfu3cIBUQyjUkKt8lEvrs=; b=Fz+RIHwG6QgcVH3Y6Ol8HK+nEfIJJhGvu8hDaquy4k0z7qd4Vxu95C8LtKj7GSqZ21 Xrkf7mkhOf8FqoJXMwMdoXBWv/QMT3+gkJCs8dsoxG9x4FqbAMA7UO5lc++LsJ101aFa 2q8+Qj33+PPps+n+lnXwr2LzmoZhdXHOsBwk/Fr35bjUlmF5/F4gdfTBD6zzGpMP4mJ0 E2MtZr27QvW2eKEq2yXCZDmQJX5Mgxu1gTUunRr+ZLD8KP/On0vHBa5G9jK1SZCqpxMU 3ATNQo2tHOWTGshcBDALd0D0ndaLXppfedOOVCp1qwy0WMReqKg65Jg4+zKGrag6xXOD uetw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=fTY9EyKnQ6jXRZCchX3NoLEfu3cIBUQyjUkKt8lEvrs=; b=lmjR54mxx2jQkK6JmH55YjkCSRwoYbPqGqIP2pqkuJZGBgEjp9Mrm8jcZqxM/shznG h3dfXqgoEfFwZqsXwDRP5YeTBXmMteF3Hn2MIw4EdsyZNViTwfsgYU9RkgpCa4UacCsU JQ7AbGsNVe/ss770MRKaiiSrJS925UqUQHyhe6prxH2AKBpoXxDgln8Abb1A+2fqi8g0 P3TNiGMAz5s6Yo3Jd5xzFkwKyGnjKW3PLJ6y4r1SlFzoJ879BCq/ohfLMRt8G/T7RsOR UzrpViomSUDkatHugC+as5q/g/REBiUtuLF4LS0HKehfwbVMcbd3el4O42eNTKcbKREr Lrcg== X-Gm-Message-State: AOAM533wrep6N4q64P25eIV7xgjH0PbINW9MvwB3Y8I+CAY69qeoMj/D m/D3fahsWxmbNet334Sk724bqAf+bT6mCQ== X-Google-Smtp-Source: ABdhPJzWR/51lPbi35CXdNBBzcEaEKf/VIoPeDGChhP4u6WZivw6ABdf3MJ9cNUpIJQ12QfXI24ohw== X-Received: by 2002:a62:1dc1:0:b029:13e:d13d:a051 with SMTP id d184-20020a621dc10000b029013ed13da051mr32290667pfd.23.1600454363236; Fri, 18 Sep 2020 11:39:23 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.39.22 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:39:22 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 68/81] target/arm: Implement SVE2 crypto destructive binary operations Date: Fri, 18 Sep 2020 11:37:38 -0700 Message-Id: <20200918183751.2787647-69-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::441; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x441.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/cpu.h | 5 +++++ target/arm/sve.decode | 7 +++++++ target/arm/translate-sve.c | 38 ++++++++++++++++++++++++++++++++++++++ 3 files changed, 50 insertions(+) diff --git a/target/arm/cpu.h b/target/arm/cpu.h index b394e15c0c..f5052a58a4 100644 --- a/target/arm/cpu.h +++ b/target/arm/cpu.h @@ -3917,6 +3917,11 @@ static inline bool isar_feature_aa64_sve2_bitperm(const ARMISARegisters *id) return FIELD_EX64(id->id_aa64zfr0, ID_AA64ZFR0, BITPERM) != 0; } +static inline bool isar_feature_aa64_sve2_sm4(const ARMISARegisters *id) +{ + return FIELD_EX64(id->id_aa64zfr0, ID_AA64ZFR0, SM4) != 0; +} + static inline bool isar_feature_aa64_sve2_i8mm(const ARMISARegisters *id) { return FIELD_EX64(id->id_aa64zfr0, ID_AA64ZFR0, I8MM) != 0; diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 9b0d0f3a5d..2ebf65f376 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -118,6 +118,8 @@ @pd_pn_pm ........ esz:2 .. rm:4 ....... rn:4 . rd:4 &rrr_esz @rdn_rm ........ esz:2 ...... ...... rm:5 rd:5 \ &rrr_esz rn=%reg_movprfx +@rdn_rm_e0 ........ .. ...... ...... rm:5 rd:5 \ + &rrr_esz rn=%reg_movprfx esz=0 @rdn_sh_i8u ........ esz:2 ...... ...... ..... rd:5 \ &rri_esz rn=%reg_movprfx imm=%sh8_i8u @rdn_i8u ........ esz:2 ...... ... imm:8 rd:5 \ @@ -1557,3 +1559,8 @@ STNT1_zprz 1110010 .. 10 ..... 001 ... ..... ..... \ # SVE2 crypto unary operations # AESMC and AESIMC AESMC 01000101 00 10000011100 decrypt:1 00000 rd:5 + +# SVE2 crypto destructive binary operations +AESE 01000101 00 10001 0 11100 0 ..... ..... @rdn_rm_e0 +AESD 01000101 00 10001 0 11100 1 ..... ..... @rdn_rm_e0 +SM4E 01000101 00 10001 1 11100 0 ..... ..... @rdn_rm_e0 diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 3c46fd7e2b..aa459ed30a 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -8138,3 +8138,41 @@ static bool trans_AESMC(DisasContext *s, arg_AESMC *a) } return true; } + +static bool do_aese(DisasContext *s, arg_rrr_esz *a, bool decrypt) +{ + if (!dc_isar_feature(aa64_sve2_aes, s)) { + return false; + } + if (sve_access_check(s)) { + gen_gvec_ool_zzz(s, gen_helper_crypto_aese, + a->rd, a->rn, a->rm, decrypt); + } + return true; +} + +static bool trans_AESE(DisasContext *s, arg_rrr_esz *a) +{ + return do_aese(s, a, false); +} + +static bool trans_AESD(DisasContext *s, arg_rrr_esz *a) +{ + return do_aese(s, a, true); +} + +static bool do_sm4(DisasContext *s, arg_rrr_esz *a, gen_helper_gvec_3 *fn) +{ + if (!dc_isar_feature(aa64_sve2_sm4, s)) { + return false; + } + if (sve_access_check(s)) { + gen_gvec_ool_zzz(s, fn, a->rd, a->rn, a->rm, 0); + } + return true; +} + +static bool trans_SM4E(DisasContext *s, arg_rrr_esz *a) +{ + return do_sm4(s, a, gen_helper_crypto_sm4e); +} From patchwork Fri Sep 18 18:37:39 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 273336 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 85104C4346E for ; Fri, 18 Sep 2020 19:24:46 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 1F2E3221EC for ; Fri, 18 Sep 2020 19:24:46 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="e2XZ1IAR" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 1F2E3221EC Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:35214 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJLzx-0007lf-6Y for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 15:24:45 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48876) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLIB-00014P-Np for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:39:31 -0400 Received: from mail-pl1-x62a.google.com ([2607:f8b0:4864:20::62a]:44558) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLI7-0007Gs-0x for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:39:31 -0400 Received: by mail-pl1-x62a.google.com with SMTP id j7so3419555plk.11 for ; Fri, 18 Sep 2020 11:39:25 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=l4iKSLJO0P4KO8p8doEML+s4OAJMNAjz9gFYt+6w6ZQ=; b=e2XZ1IAREOy6JNtBbiXfnWHxRtlwD7gdk3vfeHa1GJiHYrd+O6YFppKneZybOYQVdq D3JAhCi2gaJxtxbnvabW2dH2k/Ej/l4l3AlrmcPr67Ea2cQ3EJAinE31G0KfFf2ERmld jV3shRXJW1ak0ParjhynMjyKRblJRVt0okwAk5li/vbOSKLm0XSItpLVv9o7i5XFNQAd m2D9Aq273ojekoMTUqQETrx+zCqr7n6Gplr0cx07W9zUweLpJaPG0tzBVjBIjPQdRcnS uQr5FVy/MnjeVcWyCkUqHDZukYugJVjY1HByd7kt6C0DrFwxPfbab1kzqi0TxOkgxs3T fcYA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=l4iKSLJO0P4KO8p8doEML+s4OAJMNAjz9gFYt+6w6ZQ=; b=pS5iNy0cz7OguZCgJVoUpRscev7hAscsBxtCPRka+reDVWGEfr/v4eBcRx2ilBhjGt Hni3SmB9VKRsUeJW46lhFjlYuDgubBZ/2OrxLS3QA92Q/l2q/1adxgTIbzD1SMLsixXq ROOvNmskhB9HAr2wpWHbHbRqv/Sl6OPW6I8iYSji4Ck+969IdlSV9PhCCdeGlIM+CwjO reZySWDmk6uIRp/Ml5xmIfLhImVxo+kQ+M9OteFvvYz3hL0yq/8IExml3EyLxbP3AiTy JPsrSct0fxClQ8rg025CnMVGnOg+9C7fuxA14GMrN1qTKQjlo9ztJtA9u81EDiRTzZ9g SjlA== X-Gm-Message-State: AOAM530gmuXEjs16vQOnMJuxsBKgqutqcTmzwMPrGEKBxNmfI2k5SgwK EOQ8wTxaE2/px2q1LDFQc/ml4N03VIIItQ== X-Google-Smtp-Source: ABdhPJzYzRXP61hGoe44hLUsEliGZGI3iUQ0IaHSb0QyQjTr4mAjJVGjzbDRmxL3wYomBCKPzg+B8g== X-Received: by 2002:a17:90a:c501:: with SMTP id k1mr12830330pjt.170.1600454364474; Fri, 18 Sep 2020 11:39:24 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.39.23 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:39:23 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 69/81] target/arm: Implement SVE2 crypto constructive binary operations Date: Fri, 18 Sep 2020 11:37:39 -0700 Message-Id: <20200918183751.2787647-70-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::62a; envelope-from=richard.henderson@linaro.org; helo=mail-pl1-x62a.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/cpu.h | 5 +++++ target/arm/sve.decode | 4 ++++ target/arm/translate-sve.c | 16 ++++++++++++++++ 3 files changed, 25 insertions(+) diff --git a/target/arm/cpu.h b/target/arm/cpu.h index f5052a58a4..67b2834035 100644 --- a/target/arm/cpu.h +++ b/target/arm/cpu.h @@ -3917,6 +3917,11 @@ static inline bool isar_feature_aa64_sve2_bitperm(const ARMISARegisters *id) return FIELD_EX64(id->id_aa64zfr0, ID_AA64ZFR0, BITPERM) != 0; } +static inline bool isar_feature_aa64_sve2_sha3(const ARMISARegisters *id) +{ + return FIELD_EX64(id->id_aa64zfr0, ID_AA64ZFR0, SHA3) != 0; +} + static inline bool isar_feature_aa64_sve2_sm4(const ARMISARegisters *id) { return FIELD_EX64(id->id_aa64zfr0, ID_AA64ZFR0, SM4) != 0; diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 2ebf65f376..5f2fad4754 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1564,3 +1564,7 @@ AESMC 01000101 00 10000011100 decrypt:1 00000 rd:5 AESE 01000101 00 10001 0 11100 0 ..... ..... @rdn_rm_e0 AESD 01000101 00 10001 0 11100 1 ..... ..... @rdn_rm_e0 SM4E 01000101 00 10001 1 11100 0 ..... ..... @rdn_rm_e0 + +# SVE2 crypto constructive binary operations +SM4EKEY 01000101 00 1 ..... 11110 0 ..... ..... @rd_rn_rm_e0 +RAX1 01000101 00 1 ..... 11110 1 ..... ..... @rd_rn_rm_e0 diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index aa459ed30a..cb493d838d 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -8176,3 +8176,19 @@ static bool trans_SM4E(DisasContext *s, arg_rrr_esz *a) { return do_sm4(s, a, gen_helper_crypto_sm4e); } + +static bool trans_SM4EKEY(DisasContext *s, arg_rrr_esz *a) +{ + return do_sm4(s, a, gen_helper_crypto_sm4ekey); +} + +static bool trans_RAX1(DisasContext *s, arg_rrr_esz *a) +{ + if (!dc_isar_feature(aa64_sve2_sha3, s)) { + return false; + } + if (sve_access_check(s)) { + gen_gvec_fn_zzz(s, gen_gvec_rax1, MO_64, a->rd, a->rn, a->rm); + } + return true; +} From patchwork Fri Sep 18 18:37:40 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 273322 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D0C80C43463 for ; Fri, 18 Sep 2020 19:53:35 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 5F80821D42 for ; Fri, 18 Sep 2020 19:53:35 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="pWb2PWZ/" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 5F80821D42 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:47836 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJMRq-0003rI-GZ for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 15:53:34 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48930) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLID-00019q-LW for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:39:33 -0400 Received: from mail-pf1-x442.google.com ([2607:f8b0:4864:20::442]:46712) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLI7-0007H6-RA for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:39:33 -0400 Received: by mail-pf1-x442.google.com with SMTP id b124so3979312pfg.13 for ; Fri, 18 Sep 2020 11:39:27 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=EL2EE96fcJ4VPdkqktR5iGr09g3MJ0ZNNIfN1LzNyPw=; b=pWb2PWZ/y58gDWTwmtUfT0uUIo6Zrg8B+baEEpeAAJhq4d6ZScn0g9HpxUL/vMbKt5 TyzISfpwLPSEetTXpzjIvuAIbDjC24QIDn44lWcOrtelayvAIUF7urbLMNB0iOzG4SUf sgneTx9aVQnZ30xOOAK8eqcM9HyuatTfQF2NXhgB8NrUxfW419IHtv6eYSXZGQaQtT/P aaz4cceY3infUNU5dLjWY5YWW0Pmq+H5yFrGWkePLdc+pnx+4TST5Adau4NHk+Sr7/9M SBr083ON6M9uQxcAGu37keE4YtabHaGeXPNTPRiASZbgNJYLpdG0/VjzP2d6rWN5bxcm vBow== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=EL2EE96fcJ4VPdkqktR5iGr09g3MJ0ZNNIfN1LzNyPw=; b=OVxtQ3iwVgbcivfDFeOXTNDa0Q47yt0vU8pAD8kA94gapF5+GQUNPJlmQnwAMyDf66 W6JRBuVJpDXcoeRM4o53G1GqW8v+DjYmW7cCp/7JsmZOURvW9JAZd34fUldfL5zAxtOW W9kvhdS7mlObMO/yqdEwUoAztrNvCtfFaQIQStTx3LbTq+ErawQk/yfT7KZfpuf6K2pg j5X0VCRuP3eUp79sp48JGY5qc0o5U8vISnYMIucxd2wIZj3afI28ZB8P9QrGupO1ats1 MjztG3awOOluUCB02EvPUhk08tsLYZ0JMVzBh3bKj+6zhWcC/0Uwz5pqc9OWapS3Rjjl SwaQ== X-Gm-Message-State: AOAM533W3phplWAMDT0xZNAvWssEvoqwU5d7I//T9o5bmTj4OPnkcIcA 8wFTHn8+2yYkyZrGM+OzAjfu3TpZLPwJMw== X-Google-Smtp-Source: ABdhPJzr2epTQuX43U9d8TFRvNKch2ch0OS9dwogYUg0jVZh/dTQ4G28eLImCoOyQOvYsgCgWl2uhg== X-Received: by 2002:a62:2c09:0:b029:142:2501:34f8 with SMTP id s9-20020a622c090000b0290142250134f8mr17057257pfs.81.1600454366034; Fri, 18 Sep 2020 11:39:26 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.39.24 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:39:25 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 70/81] target/arm: Implement SVE2 TBL, TBX Date: Fri, 18 Sep 2020 11:37:40 -0700 Message-Id: <20200918183751.2787647-71-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::442; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x442.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org, Stephen Long Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" From: Stephen Long Signed-off-by: Stephen Long Message-Id: <20200428144352.9275-1-steplong@quicinc.com> [rth: rearrange the macros a little and rebase] Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 10 +++++ target/arm/sve.decode | 5 +++ target/arm/sve_helper.c | 90 ++++++++++++++++++++++++++++++-------- target/arm/translate-sve.c | 33 ++++++++++++++ 4 files changed, 119 insertions(+), 19 deletions(-) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 671b3a8804..b46723ebe3 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -661,6 +661,16 @@ DEF_HELPER_FLAGS_4(sve_tbl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_tbl_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_tbl_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_tbl_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_tbl_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_tbl_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_tbl_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_tbx_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_tbx_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_tbx_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_tbx_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_3(sve_sunpk_h, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve_sunpk_s, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(sve_sunpk_d, TCG_CALL_NO_RWG, void, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 5f2fad4754..609ecce520 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -560,6 +560,11 @@ TBL 00000101 .. 1 ..... 001100 ..... ..... @rd_rn_rm # SVE unpack vector elements UNPK 00000101 esz:2 1100 u:1 h:1 001110 rn:5 rd:5 +# SVE2 Table Lookup (three sources) + +TBL_sve2 00000101 .. 1 ..... 001010 ..... ..... @rd_rn_rm +TBX 00000101 .. 1 ..... 001011 ..... ..... @rd_rn_rm + ### SVE Permute - Predicates Group # SVE permute predicate elements diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index b2f24d5cfd..e651c0e9f1 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -3034,28 +3034,80 @@ void HELPER(sve_rev_d)(void *vd, void *vn, uint32_t desc) } } -#define DO_TBL(NAME, TYPE, H) \ -void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ -{ \ - intptr_t i, opr_sz = simd_oprsz(desc); \ - uintptr_t elem = opr_sz / sizeof(TYPE); \ - TYPE *d = vd, *n = vn, *m = vm; \ - ARMVectorReg tmp; \ - if (unlikely(vd == vn)) { \ - n = memcpy(&tmp, vn, opr_sz); \ - } \ - for (i = 0; i < elem; i++) { \ - TYPE j = m[H(i)]; \ - d[H(i)] = j < elem ? n[H(j)] : 0; \ - } \ +typedef void tb_impl_fn(void *, void *, void *, void *, uintptr_t, bool); + +static inline void do_tbl1(void *vd, void *vn, void *vm, uint32_t desc, + bool is_tbx, tb_impl_fn *fn) +{ + ARMVectorReg scratch; + uintptr_t oprsz = simd_oprsz(desc); + + if (unlikely(vd == vn)) { + vn = memcpy(&scratch, vn, oprsz); + } + + fn(vd, vn, NULL, vm, oprsz, is_tbx); } -DO_TBL(sve_tbl_b, uint8_t, H1) -DO_TBL(sve_tbl_h, uint16_t, H2) -DO_TBL(sve_tbl_s, uint32_t, H4) -DO_TBL(sve_tbl_d, uint64_t, ) +static inline void do_tbl2(void *vd, void *vn0, void *vn1, void *vm, + uint32_t desc, bool is_tbx, tb_impl_fn *fn) +{ + ARMVectorReg scratch; + uintptr_t oprsz = simd_oprsz(desc); -#undef TBL + if (unlikely(vd == vn0)) { + vn0 = memcpy(&scratch, vn0, oprsz); + if (vd == vn1) { + vn1 = vn0; + } + } else if (unlikely(vd == vn1)) { + vn1 = memcpy(&scratch, vn1, oprsz); + } + + fn(vd, vn0, vn1, vm, oprsz, is_tbx); +} + +#define DO_TB(SUFF, TYPE, H) \ +static inline void do_tb_##SUFF(void *vd, void *vt0, void *vt1, \ + void *vm, uintptr_t oprsz, bool is_tbx) \ +{ \ + TYPE *d = vd, *tbl0 = vt0, *tbl1 = vt1, *indexes = vm; \ + uintptr_t i, nelem = oprsz / sizeof(TYPE); \ + for (i = 0; i < nelem; ++i) { \ + TYPE index = indexes[H1(i)], val = 0; \ + if (index < nelem) { \ + val = tbl0[H(index)]; \ + } else { \ + index -= nelem; \ + if (tbl1 && index < nelem) { \ + val = tbl1[H(index)]; \ + } else if (is_tbx) { \ + continue; \ + } \ + } \ + d[H(i)] = val; \ + } \ +} \ +void HELPER(sve_tbl_##SUFF)(void *vd, void *vn, void *vm, uint32_t desc) \ +{ \ + do_tbl1(vd, vn, vm, desc, false, do_tb_##SUFF); \ +} \ +void HELPER(sve2_tbl_##SUFF)(void *vd, void *vn0, void *vn1, \ + void *vm, uint32_t desc) \ +{ \ + do_tbl2(vd, vn0, vn1, vm, desc, false, do_tb_##SUFF); \ +} \ +void HELPER(sve2_tbx_##SUFF)(void *vd, void *vn, void *vm, uint32_t desc) \ +{ \ + do_tbl1(vd, vn, vm, desc, true, do_tb_##SUFF); \ +} + +DO_TB(b, uint8_t, H1) +DO_TB(h, uint16_t, H2) +DO_TB(s, uint32_t, H4) +DO_TB(d, uint64_t, ) + +#undef DO_TB #define DO_UNPK(NAME, TYPED, TYPES, HD, HS) \ void HELPER(NAME)(void *vd, void *vn, uint32_t desc) \ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index cb493d838d..e8fef28750 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -2417,6 +2417,39 @@ static bool trans_TBL(DisasContext *s, arg_rrr_esz *a) return true; } +static bool trans_TBL_sve2(DisasContext *s, arg_rrr_esz *a) +{ + static gen_helper_gvec_4 * const fns[4] = { + gen_helper_sve2_tbl_b, gen_helper_sve2_tbl_h, + gen_helper_sve2_tbl_s, gen_helper_sve2_tbl_d + }; + + if (!dc_isar_feature(aa64_sve2, s)) { + return false; + } + if (sve_access_check(s)) { + gen_gvec_ool_zzzz(s, fns[a->esz], a->rd, a->rn, + (a->rn + 1) % 32, a->rm, 0); + } + return true; +} + +static bool trans_TBX(DisasContext *s, arg_rrr_esz *a) +{ + static gen_helper_gvec_3 * const fns[4] = { + gen_helper_sve2_tbx_b, gen_helper_sve2_tbx_h, + gen_helper_sve2_tbx_s, gen_helper_sve2_tbx_d + }; + + if (!dc_isar_feature(aa64_sve2, s)) { + return false; + } + if (sve_access_check(s)) { + gen_gvec_ool_zzz(s, fns[a->esz], a->rd, a->rn, a->rm, 0); + } + return true; +} + static bool trans_UNPK(DisasContext *s, arg_UNPK *a) { static gen_helper_gvec_2 * const fns[4][2] = { From patchwork Fri Sep 18 18:37:41 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 304997 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 67DC1C43464 for ; Fri, 18 Sep 2020 19:50:05 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id C99BF22208 for ; Fri, 18 Sep 2020 19:50:04 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="TEI+vfFb" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org C99BF22208 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:38316 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJMOR-0008Hb-QL for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 15:50:03 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48944) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLIE-0001B6-3w for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:39:34 -0400 Received: from mail-pg1-x533.google.com ([2607:f8b0:4864:20::533]:44122) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLI9-0007HQ-CP for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:39:33 -0400 Received: by mail-pg1-x533.google.com with SMTP id 7so3963812pgm.11 for ; Fri, 18 Sep 2020 11:39:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=TPMHxOAC6GxefuCPbOOvlQhsSO4RGUrDVkoB6eDwNSg=; b=TEI+vfFb2zqzg0Y0OCuRE2g7LWu+9DKNIkTHWRCq+i+ROEq1QzTBfFhin/yFL1jZYr 6JNWbYHmbnKXdCHrH1RkRLJ/HZNNcPK4lCNHKUi7RCaRN4DlEy+wRBZztswa3LBq/JOG 7wBRX5l+lCB8LNIRsZuBWZtf+jdZ71ip3P7kB3MyAtA1QckH4brFwgIyV/ihI1XwU7gs oXeFbHzd3TGnE9xZRhh+sJDWlwODSSs+vCnKAS33VgTOR+4nVvJvYNysIVwp2NhVc6p8 HLqsL6sUClCa654xTd1Qchp6Zs/iMMlfVbAbqW6O/vJJ12giPRGsvPGG77mxtqv61pl4 m2Jw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=TPMHxOAC6GxefuCPbOOvlQhsSO4RGUrDVkoB6eDwNSg=; b=MnRPUW5fZeM+u6chb9yKVxwxs3++F3DYIwSgsuk1+T5PehFkgb4ElIYxYmk/lYu67p zTKCRZ3+drquKTK9ggwye0ylnXt5n9wWOby/KZOU0da+zIPmsSxsphVSZ+DztewVshjS ueNR99IOHTb5nVYysJiOT31UicandrxL2ik/BQk9Y3P2YyhoCZIaa5nPDRXm0iXylCiF m7lW8I9qqKaQGaeh/lAaRVmWQ9UZXVakB+lePlkfl2Lly9KQWt8PtOGa73PxBIiF4AaJ QO5WTyPmhYpOpN7eSjOD6qpgWfVrq0/H6E4KkQcl8zkib+40EcDtC42Nmgtx7EhvzLX0 nl6g== X-Gm-Message-State: AOAM530HenVsfz9tgeSBsRib9vaCnsZxjPbLdJUx+31cA+41Vn798K/9 /Qwn7LmJr8F4SCV1zprcmRsr995NQmH1DQ== X-Google-Smtp-Source: ABdhPJxi2toKnMRSsnHbhnMWZdWTuDFcZ1dzUXVoGnGR046qsmsTKQQrIXTNfcWKpNgyLvoibi36kg== X-Received: by 2002:aa7:9548:0:b029:13e:d13d:a08d with SMTP id w8-20020aa795480000b029013ed13da08dmr32866330pfq.36.1600454367488; Fri, 18 Sep 2020 11:39:27 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.39.26 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:39:26 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 71/81] target/arm: Implement SVE2 FCVTNT Date: Fri, 18 Sep 2020 11:37:41 -0700 Message-Id: <20200918183751.2787647-72-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::533; envelope-from=richard.henderson@linaro.org; helo=mail-pg1-x533.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org, Stephen Long Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" From: Stephen Long Signed-off-by: Stephen Long Message-Id: <20200428174332.17162-2-steplong@quicinc.com> Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 5 +++++ target/arm/sve.decode | 4 ++++ target/arm/sve_helper.c | 20 ++++++++++++++++++++ target/arm/translate-sve.c | 16 ++++++++++++++++ 4 files changed, 45 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index b46723ebe3..0d8667a7d1 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2734,3 +2734,8 @@ DEF_HELPER_FLAGS_5(sve2_sqrdcmlah_idx_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve2_sqrdcmlah_idx_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_fcvtnt_sh, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_fcvtnt_ds, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 609ecce520..9ba4bb476e 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1573,3 +1573,7 @@ SM4E 01000101 00 10001 1 11100 0 ..... ..... @rdn_rm_e0 # SVE2 crypto constructive binary operations SM4EKEY 01000101 00 1 ..... 11110 0 ..... ..... @rd_rn_rm_e0 RAX1 01000101 00 1 ..... 11110 1 ..... ..... @rd_rn_rm_e0 + +### SVE2 floating-point convert precision odd elements +FCVTNT_sh 01100100 10 0010 00 101 ... ..... ..... @rd_pg_rn_e0 +FCVTNT_ds 01100100 11 0010 10 101 ... ..... ..... @rd_pg_rn_e0 diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index e651c0e9f1..21023c2f7f 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -7594,3 +7594,23 @@ void HELPER(fmmla_d)(void *vd, void *vn, void *vm, void *va, d[3] = float64_add(a[3], float64_add(p0, p1, status), status); } } + +#define DO_FCVTNT(NAME, TYPEW, TYPEN, HW, HN, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vg, void *status, uint32_t desc) \ +{ \ + intptr_t i = simd_oprsz(desc); \ + uint64_t *g = vg; \ + do { \ + uint64_t pg = g[(i - 1) >> 6]; \ + do { \ + i -= sizeof(TYPEW); \ + if (likely((pg >> (i & 63)) & 1)) { \ + TYPEW nn = *(TYPEW *)(vn + HW(i)); \ + *(TYPEN *)(vd + HN(i + sizeof(TYPEN))) = OP(nn, status); \ + } \ + } while (i & 63); \ + } while (i != 0); \ +} + +DO_FCVTNT(sve2_fcvtnt_sh, uint32_t, uint16_t, H1_4, H1_2, sve_f32_to_f16) +DO_FCVTNT(sve2_fcvtnt_ds, uint64_t, uint32_t, H1_4, H1_2, float64_to_float32) diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index e8fef28750..8a3a646bf5 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -8225,3 +8225,19 @@ static bool trans_RAX1(DisasContext *s, arg_rrr_esz *a) } return true; } + +static bool trans_FCVTNT_sh(DisasContext *s, arg_rpr_esz *a) +{ + if (!dc_isar_feature(aa64_sve2, s)) { + return false; + } + return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve2_fcvtnt_sh); +} + +static bool trans_FCVTNT_ds(DisasContext *s, arg_rpr_esz *a) +{ + if (!dc_isar_feature(aa64_sve2, s)) { + return false; + } + return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve2_fcvtnt_ds); +} From patchwork Fri Sep 18 18:37:42 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 304996 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C28A1C43463 for ; Fri, 18 Sep 2020 19:52:57 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 4646B21D42 for ; Fri, 18 Sep 2020 19:52:57 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="tjxtsOe0" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 4646B21D42 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:46638 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJMRE-0003Nl-C6 for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 15:52:56 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48976) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLIF-0001E4-9e for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:39:35 -0400 Received: from mail-pj1-x102f.google.com ([2607:f8b0:4864:20::102f]:56224) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLIA-0007Hn-NF for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:39:34 -0400 Received: by mail-pj1-x102f.google.com with SMTP id q4so3621246pjh.5 for ; Fri, 18 Sep 2020 11:39:30 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=R1UphA/dMSFyR0KX0MUh9VNFX6WB6sLnHEkivop59CQ=; b=tjxtsOe0r7nuFFUUCucNknIqW4n4prX31YJmIGCYXh8C6CpoxWmrNx46CM7fyArSxE X/UZPoy1ktfGI0Fms0LNz6/2Tf4vUS0EeECVOPGHBvPSzbfZbRqUsD9eA0yrz8hlsQmY Pp265aIwIMY5XKq7s1ciQFhzw6Myhvc+1YjrlE9Z47w78rdoJkVK/H0QZNBVtTmqyqbD RGvA5APK//SH6Vij9yNU+SG0r9qC4AFBSFcqMQQPBT4JQ3+w+eqVC3Mti2jlgTJxKJSQ KOcWp39jOgqzBQzAsKrsjAkZ68k5yHXYaYsOy6HSQ9NHBC/Gx2kljyHHieSfKOt98JfH EoJQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=R1UphA/dMSFyR0KX0MUh9VNFX6WB6sLnHEkivop59CQ=; b=b/dIfl9JiIjzyNPtZsqtnXMxiKhPjqUCKEBxmGXXvXduiVJFxjo2sJiqAzC+pqLgn3 QIbEXO3GnDDLLHnhreMRGEw4J14Rpti4y+RiW5ZjgKFA6+AHRdDeYvSJWc9ux8nYtvtY ZFVdEf7R92bVlkU7FwGLOXVaYtPsuDvSfL3fAuuQUliqvnJOeQ9hFOVZ/m47Od9smodM LIMjBbWGVhJBxKFZ6SBay1VSnlJlZ0dEkmJ767ue56T5rET69pGCB0qgI/tu0b45mvUx pZ+A4CElVQtuozm+0DB3X/jh3YaUCZKO9iTGgWnakYwv3bTm9KHMmqcHKJrJAp+/Ly6f e49Q== X-Gm-Message-State: AOAM530+dbjHU9tk6g9HzIieOv1moIdqDyZ4cfzHf+4HvVc+deDPeU7l 6KB6qjGfPeMWS93tAQYsqWZ6pmdJpqO+Og== X-Google-Smtp-Source: ABdhPJwoczrxPd13N/SxXy1vkcV9OxEKhf76cV3DBqCuyqVBNrUKWwgjxTXYgrcgMM3EWQWAdMnmCw== X-Received: by 2002:a17:90a:d315:: with SMTP id p21mr14805065pju.88.1600454368778; Fri, 18 Sep 2020 11:39:28 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.39.27 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:39:27 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 72/81] target/arm: Implement SVE2 FCVTLT Date: Fri, 18 Sep 2020 11:37:42 -0700 Message-Id: <20200918183751.2787647-73-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::102f; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x102f.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org, Stephen Long Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" From: Stephen Long Signed-off-by: Stephen Long Message-Id: <20200428174332.17162-3-steplong@quicinc.com> Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 5 +++++ target/arm/sve.decode | 2 ++ target/arm/sve_helper.c | 23 +++++++++++++++++++++++ target/arm/translate-sve.c | 16 ++++++++++++++++ 4 files changed, 46 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 0d8667a7d1..df141aec41 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2739,3 +2739,8 @@ DEF_HELPER_FLAGS_5(sve2_fcvtnt_sh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve2_fcvtnt_ds, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_fcvtlt_hs, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_fcvtlt_sd, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 9ba4bb476e..abe26f2424 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1576,4 +1576,6 @@ RAX1 01000101 00 1 ..... 11110 1 ..... ..... @rd_rn_rm_e0 ### SVE2 floating-point convert precision odd elements FCVTNT_sh 01100100 10 0010 00 101 ... ..... ..... @rd_pg_rn_e0 +FCVTLT_hs 01100100 10 0010 01 101 ... ..... ..... @rd_pg_rn_e0 FCVTNT_ds 01100100 11 0010 10 101 ... ..... ..... @rd_pg_rn_e0 +FCVTLT_sd 01100100 11 0010 11 101 ... ..... ..... @rd_pg_rn_e0 diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 21023c2f7f..8539493c72 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -7614,3 +7614,26 @@ void HELPER(NAME)(void *vd, void *vn, void *vg, void *status, uint32_t desc) \ DO_FCVTNT(sve2_fcvtnt_sh, uint32_t, uint16_t, H1_4, H1_2, sve_f32_to_f16) DO_FCVTNT(sve2_fcvtnt_ds, uint64_t, uint32_t, H1_4, H1_2, float64_to_float32) + +#define DO_FCVTLT(NAME, TYPEW, TYPEN, HW, HN, OP) \ +void HELPER(NAME)(void *vd, void *vn, void *vg, void *status, uint32_t desc) \ +{ \ + intptr_t i = simd_oprsz(desc); \ + uint64_t *g = vg; \ + do { \ + uint64_t pg = g[(i - 1) >> 6]; \ + do { \ + i -= sizeof(TYPEW); \ + if (likely((pg >> (i & 63)) & 1)) { \ + TYPEN nn = *(TYPEN *)(vn + HN(i + sizeof(TYPEN))); \ + *(TYPEW *)(vd + HW(i)) = OP(nn, status); \ + } \ + } while (i & 63); \ + } while (i != 0); \ +} + +DO_FCVTLT(sve2_fcvtlt_hs, uint32_t, uint16_t, H1_4, H1_2, sve_f16_to_f32) +DO_FCVTLT(sve2_fcvtlt_sd, uint64_t, uint32_t, H1_4, H1_2, float32_to_float64) + +#undef DO_FCVTLT +#undef DO_FCVTNT diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 8a3a646bf5..247288e164 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -8241,3 +8241,19 @@ static bool trans_FCVTNT_ds(DisasContext *s, arg_rpr_esz *a) } return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve2_fcvtnt_ds); } + +static bool trans_FCVTLT_hs(DisasContext *s, arg_rpr_esz *a) +{ + if (!dc_isar_feature(aa64_sve2, s)) { + return false; + } + return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve2_fcvtlt_hs); +} + +static bool trans_FCVTLT_sd(DisasContext *s, arg_rpr_esz *a) +{ + if (!dc_isar_feature(aa64_sve2, s)) { + return false; + } + return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve2_fcvtlt_sd); +} From patchwork Fri Sep 18 18:37:43 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 273321 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4458FC43463 for ; Fri, 18 Sep 2020 19:56:05 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id CA94E21D42 for ; Fri, 18 Sep 2020 19:56:04 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="L0Tujru3" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org CA94E21D42 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:53118 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJMUF-00068k-9p for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 15:56:03 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48986) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLIF-0001Eq-Ky for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:39:35 -0400 Received: from mail-pg1-x542.google.com ([2607:f8b0:4864:20::542]:38218) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLIB-0007I9-Qn for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:39:35 -0400 Received: by mail-pg1-x542.google.com with SMTP id l191so3982502pgd.5 for ; Fri, 18 Sep 2020 11:39:31 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=LA/sCu8vYjGaVpIHisxYPWbBbKXoMAD3QsjC3nOh9eg=; b=L0Tujru3CR/bek1utfYnMpyyIUtdqd+jKlReBBhrgsPLNTcVJ5+CmY5wJfO0sBJXfB B0qq+bORd3SZzWRCpeRrvwtLk2gOsDrRqL92SqE8XrsDKoq5f5LKuEE6qu+854WmFtLH kvBq8ArjseP70/e0RQ5YF5xADEH5ocXhHSsC2TuN50waprXhCFaOe03kIAVrMjMFa4Ve 0dra77wft0mSGhFeaThM9M95PwgnYaqv2DBzuA8yq56fTHUh9kHrMvUueq6M2wo0806c CvbTVU2G5hlITq4PAN2tBOhimhpFljlxnuqSJFh1e9xzY/9exTgEjDPrcoGa1PKLlpQa PmoA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=LA/sCu8vYjGaVpIHisxYPWbBbKXoMAD3QsjC3nOh9eg=; b=o8mbRyJF0ybJ1pEkybevC5L0fI5IQAPxPnO1nyvWgv8mOV0WlqA6jaDL0FmrXx71Gu LT9P7CuHYYhHcTkfqW6EyuSf6JVNMbMXSihVvHDd5rd3CeG0aK5WSK8wHGem7gm2Za1o VfV1RNbhguzeKYyyQBgJ4xU5p1a9UIkmqJ4r1WCgVgEQRU6GEqC9bss8redEmhtcnYQa 8DRZn7fKqTJEKAW2d2FuSjjEjreClk+1XSB/I9p4ppp7kkfMhMvXH8pM4jjzuNkS+CRK PooiRrgfTjTpDbxiWXiHrk2YIHlrYSxUqN8evBNA0fGKhlhLuKxzGk34QrDpSQTphOyE Qkow== X-Gm-Message-State: AOAM530hCLQVajzWm0tloTatsr0UHxYZkJ+WIkiyFO/opzWz+B4Ugii+ MuaE1tl37ooE/t5JZ/+I3WAY9P4Ksisf1Q== X-Google-Smtp-Source: ABdhPJxXqnh83S1+E0A9jOGeX1Y903Et3Vw+EWdSQeDxIMqJNKJiQNGH3vJM5FczKKr/YkzrNKtEkg== X-Received: by 2002:aa7:95a6:0:b029:13c:1611:6524 with SMTP id a6-20020aa795a60000b029013c16116524mr33648497pfk.4.1600454370080; Fri, 18 Sep 2020 11:39:30 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.39.28 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:39:29 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 73/81] target/arm: Implement SVE2 FCVTXNT, FCVTX Date: Fri, 18 Sep 2020 11:37:43 -0700 Message-Id: <20200918183751.2787647-74-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::542; envelope-from=richard.henderson@linaro.org; helo=mail-pg1-x542.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org, Stephen Long Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" From: Stephen Long Signed-off-by: Stephen Long Message-Id: <20200428174332.17162-4-steplong@quicinc.com> [rth: Use do_frint_mode, which avoids a specific runtime helper.] Signed-off-by: Richard Henderson --- target/arm/sve.decode | 2 ++ target/arm/translate-sve.c | 49 ++++++++++++++++++++++++++++++-------- 2 files changed, 41 insertions(+), 10 deletions(-) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index abe26f2424..6c0e39d553 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1575,6 +1575,8 @@ SM4EKEY 01000101 00 1 ..... 11110 0 ..... ..... @rd_rn_rm_e0 RAX1 01000101 00 1 ..... 11110 1 ..... ..... @rd_rn_rm_e0 ### SVE2 floating-point convert precision odd elements +FCVTXNT_ds 01100100 00 0010 10 101 ... ..... ..... @rd_pg_rn_e0 +FCVTX_ds 01100101 00 0010 10 101 ... ..... ..... @rd_pg_rn_e0 FCVTNT_sh 01100100 10 0010 00 101 ... ..... ..... @rd_pg_rn_e0 FCVTLT_hs 01100100 10 0010 01 101 ... ..... ..... @rd_pg_rn_e0 FCVTNT_ds 01100100 11 0010 10 101 ... ..... ..... @rd_pg_rn_e0 diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 247288e164..7cdb128a5b 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -4769,11 +4769,9 @@ static bool trans_FRINTX(DisasContext *s, arg_rpr_esz *a) return do_zpz_ptr(s, a->rd, a->rn, a->pg, a->esz == MO_16, fns[a->esz - 1]); } -static bool do_frint_mode(DisasContext *s, arg_rpr_esz *a, int mode) +static bool do_frint_mode(DisasContext *s, arg_rpr_esz *a, + int mode, gen_helper_gvec_3_ptr *fn) { - if (a->esz == 0) { - return false; - } if (sve_access_check(s)) { unsigned vsz = vec_full_reg_size(s); TCGv_i32 tmode = tcg_const_i32(mode); @@ -4784,7 +4782,7 @@ static bool do_frint_mode(DisasContext *s, arg_rpr_esz *a, int mode) tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, a->rd), vec_full_reg_offset(s, a->rn), pred_full_reg_offset(s, a->pg), - status, vsz, vsz, 0, frint_fns[a->esz - 1]); + status, vsz, vsz, 0, fn); gen_helper_set_rmode(tmode, tmode, status); tcg_temp_free_i32(tmode); @@ -4795,27 +4793,42 @@ static bool do_frint_mode(DisasContext *s, arg_rpr_esz *a, int mode) static bool trans_FRINTN(DisasContext *s, arg_rpr_esz *a) { - return do_frint_mode(s, a, float_round_nearest_even); + if (a->esz == 0) { + return false; + } + return do_frint_mode(s, a, float_round_nearest_even, frint_fns[a->esz - 1]); } static bool trans_FRINTP(DisasContext *s, arg_rpr_esz *a) { - return do_frint_mode(s, a, float_round_up); + if (a->esz == 0) { + return false; + } + return do_frint_mode(s, a, float_round_up, frint_fns[a->esz - 1]); } static bool trans_FRINTM(DisasContext *s, arg_rpr_esz *a) { - return do_frint_mode(s, a, float_round_down); + if (a->esz == 0) { + return false; + } + return do_frint_mode(s, a, float_round_down, frint_fns[a->esz - 1]); } static bool trans_FRINTZ(DisasContext *s, arg_rpr_esz *a) { - return do_frint_mode(s, a, float_round_to_zero); + if (a->esz == 0) { + return false; + } + return do_frint_mode(s, a, float_round_to_zero, frint_fns[a->esz - 1]); } static bool trans_FRINTA(DisasContext *s, arg_rpr_esz *a) { - return do_frint_mode(s, a, float_round_ties_away); + if (a->esz == 0) { + return false; + } + return do_frint_mode(s, a, float_round_ties_away, frint_fns[a->esz - 1]); } static bool trans_FRECPX(DisasContext *s, arg_rpr_esz *a) @@ -8257,3 +8270,19 @@ static bool trans_FCVTLT_sd(DisasContext *s, arg_rpr_esz *a) } return do_zpz_ptr(s, a->rd, a->rn, a->pg, false, gen_helper_sve2_fcvtlt_sd); } + +static bool trans_FCVTX_ds(DisasContext *s, arg_rpr_esz *a) +{ + if (!dc_isar_feature(aa64_sve2, s)) { + return false; + } + return do_frint_mode(s, a, float_round_to_odd, gen_helper_sve_fcvt_ds); +} + +static bool trans_FCVTXNT_ds(DisasContext *s, arg_rpr_esz *a) +{ + if (!dc_isar_feature(aa64_sve2, s)) { + return false; + } + return do_frint_mode(s, a, float_round_to_odd, gen_helper_sve2_fcvtnt_ds); +} From patchwork Fri Sep 18 18:37:44 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 304995 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8E429C43464 for ; Fri, 18 Sep 2020 19:55:47 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 0C98920878 for ; Fri, 18 Sep 2020 19:55:46 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="DBYDQH6x" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 0C98920878 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:52188 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJMTy-0005fb-4c for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 15:55:46 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:48988) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLIF-0001FW-Qp for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:39:35 -0400 Received: from mail-pl1-x62c.google.com ([2607:f8b0:4864:20::62c]:37362) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLID-0007IR-FC for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:39:35 -0400 Received: by mail-pl1-x62c.google.com with SMTP id u9so3436627plk.4 for ; Fri, 18 Sep 2020 11:39:32 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=Z1orP7HFlYzRLm2Pcn97VdxmYxBZDG4R6KQe+ppPCwg=; b=DBYDQH6xrLJW86330Fzwf7Shu3z5ioNzQdeYW/GUL0dj0kYBP0iDplqglhIRWe4mxd cw6pxaYuKTqZnxXzV+XwfMcGcR2iuyPrSDt7W0VhVPK3iTZmmw+DAEsimQP2Bo6i9+p3 x6qjTRiJIehn8AxNoUfrrCJytgZha+btSUcFL9TWtZRE3HeKFL2n2uokgGXiRIy8Hds7 qkGkxZG+5a9o/OJ0MeRjsbVWtog/imwq0YvVlxFWm7Z2ErKK64tUsliSRHnws9BpTIEQ TVK+VEHDWQdp+QhCXIb7MU6R2zjE4DIyfflNfHwIQY7bLVgFfj5o8RM9RDZJ0ujRj/CM 487A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=Z1orP7HFlYzRLm2Pcn97VdxmYxBZDG4R6KQe+ppPCwg=; b=k/01um9TaBCVbFm+MGMwvH7o7rzlPlPQM3z0g96pQ7UjPSt4rNexob0r/+lZOjM0J3 yh1p+s0qDu202LTMgpfc03Ndv1sGSSNOfQ1YHB0RIWKfachcIFrVWtcakPbt4kCIYUOv NuTAeh0UrBm2oTj6gNSYRDumjWj47jshSpwBpUR+j7B3q/kWpTo2sV6pZjg2e0eXcnK6 WzHp0iGYEnCkYAuLAz3dZjzV5dgouygma5JN7UBRwDbhsrQJ8QT9187q4CR0E37YTZgL TQyufpKejxxPYMp+xwNvUlAv0cRgfbHZHK4ASRFnYPPT0NpcNOKKyLr71bQ9qy3hhfZr 0z7Q== X-Gm-Message-State: AOAM5335Uow55xs2cIvVOGG62pChtmsF6F9DRbEMSyKL18gEWl4YR5xp JCy4Wr8DWXkOmomyKbaPhqbiYb4cWgJorw== X-Google-Smtp-Source: ABdhPJzE818py0OGl7HGtkq8t3F6HNawK9CCeqMOYgK2kPlkT0+DX6X7vPTUdrYkk35iJliB1v+CnA== X-Received: by 2002:a17:902:b109:b029:d1:e5e7:be53 with SMTP id q9-20020a170902b109b02900d1e5e7be53mr17160356plr.45.1600454371671; Fri, 18 Sep 2020 11:39:31 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.39.30 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:39:30 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 74/81] target/arm: Implement SVE2 FLOGB Date: Fri, 18 Sep 2020 11:37:44 -0700 Message-Id: <20200918183751.2787647-75-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::62c; envelope-from=richard.henderson@linaro.org; helo=mail-pl1-x62c.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org, Stephen Long Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" From: Stephen Long Signed-off-by: Stephen Long Message-Id: <20200430191405.21641-1-steplong@quicinc.com> Signed-off-by: Richard Henderson --- v2: Fixed esz index and c++ comments v3: Fixed denormal arithmetic and raise invalid. --- target/arm/helper-sve.h | 4 +++ target/arm/sve.decode | 3 +++ target/arm/sve_helper.c | 52 ++++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 24 ++++++++++++++++++ 4 files changed, 83 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index df141aec41..f25b403ddb 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2744,3 +2744,7 @@ DEF_HELPER_FLAGS_5(sve2_fcvtlt_hs, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve2_fcvtlt_sd, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(flogb_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(flogb_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(flogb_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 6c0e39d553..6808ff4194 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1581,3 +1581,6 @@ FCVTNT_sh 01100100 10 0010 00 101 ... ..... ..... @rd_pg_rn_e0 FCVTLT_hs 01100100 10 0010 01 101 ... ..... ..... @rd_pg_rn_e0 FCVTNT_ds 01100100 11 0010 10 101 ... ..... ..... @rd_pg_rn_e0 FCVTLT_sd 01100100 11 0010 11 101 ... ..... ..... @rd_pg_rn_e0 + +### SVE2 floating-point convert to integer +FLOGB 01100101 00 011 esz:2 0101 pg:3 rn:5 rd:5 &rpr_esz diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 8539493c72..dc51cb75fc 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -4696,6 +4696,58 @@ DO_ZPZ_FP(sve_ucvt_dh, uint64_t, , uint64_to_float16) DO_ZPZ_FP(sve_ucvt_ds, uint64_t, , uint64_to_float32) DO_ZPZ_FP(sve_ucvt_dd, uint64_t, , uint64_to_float64) +static int16_t do_float16_logb_as_int(float16 a, float_status *s) +{ + if (float16_is_normal(a)) { + return extract16(a, 10, 5) - 15; + } else if (float16_is_infinity(a)) { + return INT16_MAX; + } else if (float16_is_any_nan(a) || float16_is_zero(a)) { + float_raise(float_flag_invalid, s); + return INT16_MIN; + } else { + /* + * denormal: bias - fractional_zeros + * = bias + masked_zeros - uint32_zeros + */ + return -15 + 22 - clz32(extract16(a, 0, 10)); + } +} + +static int32_t do_float32_logb_as_int(float32 a, float_status *s) +{ + if (float32_is_normal(a)) { + return extract32(a, 23, 8) - 127; + } else if (float32_is_infinity(a)) { + return INT32_MAX; + } else if (float32_is_any_nan(a) || float32_is_zero(a)) { + float_raise(float_flag_invalid, s); + return INT32_MIN; + } else { + /* denormal (see above) */ + return -127 + 9 - clz32(extract32(a, 0, 23)); + } +} + +static int64_t do_float64_logb_as_int(float64 a, float_status *s) +{ + if (float64_is_normal(a)) { + return extract64(a, 52, 11) - 1023; + } else if (float64_is_infinity(a)) { + return INT64_MAX; + } else if (float64_is_any_nan(a) || float64_is_zero(a)) { + float_raise(float_flag_invalid, s); + return INT64_MIN; + } else { + /* denormal (see above) */ + return -1023 + 12 - clz64(extract64(a, 0, 52)); + } +} + +DO_ZPZ_FP(flogb_h, float16, H1_2, do_float16_logb_as_int) +DO_ZPZ_FP(flogb_s, float32, H1_4, do_float32_logb_as_int) +DO_ZPZ_FP(flogb_d, float64, , do_float64_logb_as_int) + #undef DO_ZPZ_FP static void do_fmla_zpzzz_h(void *vd, void *vn, void *vm, void *va, void *vg, diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 7cdb128a5b..d6719d99d5 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -8286,3 +8286,27 @@ static bool trans_FCVTXNT_ds(DisasContext *s, arg_rpr_esz *a) } return do_frint_mode(s, a, float_round_to_odd, gen_helper_sve2_fcvtnt_ds); } + +static bool trans_FLOGB(DisasContext *s, arg_rpr_esz *a) +{ + static gen_helper_gvec_3_ptr * const fns[] = { + NULL, gen_helper_flogb_h, + gen_helper_flogb_s, gen_helper_flogb_d + }; + + if (!dc_isar_feature(aa64_sve2, s) || fns[a->esz] == NULL) { + return false; + } + if (sve_access_check(s)) { + TCGv_ptr status = + fpstatus_ptr(a->esz == MO_16 ? FPST_FPCR_F16 : FPST_FPCR); + unsigned vsz = vec_full_reg_size(s); + + tcg_gen_gvec_3_ptr(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + pred_full_reg_offset(s, a->pg), + status, vsz, vsz, 0, fns[a->esz]); + tcg_temp_free_ptr(status); + } + return true; +} From patchwork Fri Sep 18 18:37:45 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 273326 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5BBAAC43463 for ; Fri, 18 Sep 2020 19:43:48 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id D961D21D42 for ; Fri, 18 Sep 2020 19:43:47 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="TJP+7xrP" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org D961D21D42 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:53442 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJMIM-0002lW-UY for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 15:43:46 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:49014) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLII-0001KX-05 for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:39:38 -0400 Received: from mail-pl1-x641.google.com ([2607:f8b0:4864:20::641]:35190) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLIF-0007Iw-Dm for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:39:37 -0400 Received: by mail-pl1-x641.google.com with SMTP id bg9so3442895plb.2 for ; Fri, 18 Sep 2020 11:39:34 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=fN08oCx6IsIsE+2nRMY1biK7h4DO3wFZOWz01E+dS1w=; b=TJP+7xrP3Rjd2PknVwTmYKhsiFw9pW3RfIPElL+JjvoRtAL8DUpYpp/RD8p/PyAB7s TcReht6b7U4p2vu4NpcawVBpHhSkqCcfoyB3/yTZ7jIgG4oe5L2nnJU0+IgOw2S9yzPm cAZ7RRWptnxFaDpATA9ri2XzzKOmmVeGUA1wm+bl5RQz5DhkUD6qz24eOVUiPB9KYxw5 jS8maK/TP+1x75H43ElqRbJulsm5+2VMrBEX36h1bxTCEOSr1ch/++VH7U54cCcbR8it VjW1Bveg/Zepx3rlWRU9lpu6XxAX9+kWSIj07j7a5xB1ZP9dQ8Wjs1fT19C1rJp20w6N 7xlg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=fN08oCx6IsIsE+2nRMY1biK7h4DO3wFZOWz01E+dS1w=; b=cRXtcoU/rfj5N2NvlmcDrumga3cttksfZdo28I/9efWfAD28B1WxBosRuMLB65WGBB 5cdZe8LNfDHeu41B2O40AhfH+IwK6KGqKX2qH1iBATmfh9PBQ8oPRlI3mZMvMcoOra1I jj1CuYB5eyYOwOrZpAkrBe8yWJKSWW1ErauZcdVvsr0e+JNMNR8nh0l1dagST19lpQz5 XHWb08EX32unBnYFggVPliNgaJf/XbmBptbIMcAikmZAdYeR4pQA5dRt12cJgL2eas2y o6lddPMf8X1RJXwT24qx78QLOdcTsvTbjdBVwpJQSBp6dXsX2+hNlSR0+twMla6gnDpj WkMw== X-Gm-Message-State: AOAM532V4I0i7tKTAvGpXvsyNe7mLVCKD9mSZiDCBem6fAmsBNrFFBB8 J4o5llxyspiCzbNWZnqy9/2uaj3C+UTlaQ== X-Google-Smtp-Source: ABdhPJzLC6x0kfth2lnRWzGtC3Hj58C2QIH3bVMiIQib2BQzrBd/AVgtg8qdQGB+12fyvcnljqn4Kg== X-Received: by 2002:a17:902:c151:b029:d1:9bd3:6855 with SMTP id 17-20020a170902c151b02900d19bd36855mr34068740plj.38.1600454373412; Fri, 18 Sep 2020 11:39:33 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.39.32 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:39:32 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 75/81] target/arm: Share table of sve load functions Date: Fri, 18 Sep 2020 11:37:45 -0700 Message-Id: <20200918183751.2787647-76-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::641; envelope-from=richard.henderson@linaro.org; helo=mail-pl1-x641.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" The table used by do_ldrq is a subset of the table used by do_ld_zpa; we can share them by passing dtype instead of msz to do_ldrq. Signed-off-by: Richard Henderson --- target/arm/translate-sve.c | 254 ++++++++++++++++++------------------- 1 file changed, 126 insertions(+), 128 deletions(-) diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index d6719d99d5..79af31f12e 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -5208,128 +5208,130 @@ static void do_mem_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr, tcg_temp_free_i32(t_desc); } +/* Indexed by [mte][be][dtype][nreg] */ +static gen_helper_gvec_mem * const ldr_fns[2][2][16][4] = { + { /* mte inactive, little-endian */ + { { gen_helper_sve_ld1bb_r, gen_helper_sve_ld2bb_r, + gen_helper_sve_ld3bb_r, gen_helper_sve_ld4bb_r }, + { gen_helper_sve_ld1bhu_r, NULL, NULL, NULL }, + { gen_helper_sve_ld1bsu_r, NULL, NULL, NULL }, + { gen_helper_sve_ld1bdu_r, NULL, NULL, NULL }, + + { gen_helper_sve_ld1sds_le_r, NULL, NULL, NULL }, + { gen_helper_sve_ld1hh_le_r, gen_helper_sve_ld2hh_le_r, + gen_helper_sve_ld3hh_le_r, gen_helper_sve_ld4hh_le_r }, + { gen_helper_sve_ld1hsu_le_r, NULL, NULL, NULL }, + { gen_helper_sve_ld1hdu_le_r, NULL, NULL, NULL }, + + { gen_helper_sve_ld1hds_le_r, NULL, NULL, NULL }, + { gen_helper_sve_ld1hss_le_r, NULL, NULL, NULL }, + { gen_helper_sve_ld1ss_le_r, gen_helper_sve_ld2ss_le_r, + gen_helper_sve_ld3ss_le_r, gen_helper_sve_ld4ss_le_r }, + { gen_helper_sve_ld1sdu_le_r, NULL, NULL, NULL }, + + { gen_helper_sve_ld1bds_r, NULL, NULL, NULL }, + { gen_helper_sve_ld1bss_r, NULL, NULL, NULL }, + { gen_helper_sve_ld1bhs_r, NULL, NULL, NULL }, + { gen_helper_sve_ld1dd_le_r, gen_helper_sve_ld2dd_le_r, + gen_helper_sve_ld3dd_le_r, gen_helper_sve_ld4dd_le_r } }, + + /* mte inactive, big-endian */ + { { gen_helper_sve_ld1bb_r, gen_helper_sve_ld2bb_r, + gen_helper_sve_ld3bb_r, gen_helper_sve_ld4bb_r }, + { gen_helper_sve_ld1bhu_r, NULL, NULL, NULL }, + { gen_helper_sve_ld1bsu_r, NULL, NULL, NULL }, + { gen_helper_sve_ld1bdu_r, NULL, NULL, NULL }, + + { gen_helper_sve_ld1sds_be_r, NULL, NULL, NULL }, + { gen_helper_sve_ld1hh_be_r, gen_helper_sve_ld2hh_be_r, + gen_helper_sve_ld3hh_be_r, gen_helper_sve_ld4hh_be_r }, + { gen_helper_sve_ld1hsu_be_r, NULL, NULL, NULL }, + { gen_helper_sve_ld1hdu_be_r, NULL, NULL, NULL }, + + { gen_helper_sve_ld1hds_be_r, NULL, NULL, NULL }, + { gen_helper_sve_ld1hss_be_r, NULL, NULL, NULL }, + { gen_helper_sve_ld1ss_be_r, gen_helper_sve_ld2ss_be_r, + gen_helper_sve_ld3ss_be_r, gen_helper_sve_ld4ss_be_r }, + { gen_helper_sve_ld1sdu_be_r, NULL, NULL, NULL }, + + { gen_helper_sve_ld1bds_r, NULL, NULL, NULL }, + { gen_helper_sve_ld1bss_r, NULL, NULL, NULL }, + { gen_helper_sve_ld1bhs_r, NULL, NULL, NULL }, + { gen_helper_sve_ld1dd_be_r, gen_helper_sve_ld2dd_be_r, + gen_helper_sve_ld3dd_be_r, gen_helper_sve_ld4dd_be_r } } }, + + { /* mte active, little-endian */ + { { gen_helper_sve_ld1bb_r_mte, + gen_helper_sve_ld2bb_r_mte, + gen_helper_sve_ld3bb_r_mte, + gen_helper_sve_ld4bb_r_mte }, + { gen_helper_sve_ld1bhu_r_mte, NULL, NULL, NULL }, + { gen_helper_sve_ld1bsu_r_mte, NULL, NULL, NULL }, + { gen_helper_sve_ld1bdu_r_mte, NULL, NULL, NULL }, + + { gen_helper_sve_ld1sds_le_r_mte, NULL, NULL, NULL }, + { gen_helper_sve_ld1hh_le_r_mte, + gen_helper_sve_ld2hh_le_r_mte, + gen_helper_sve_ld3hh_le_r_mte, + gen_helper_sve_ld4hh_le_r_mte }, + { gen_helper_sve_ld1hsu_le_r_mte, NULL, NULL, NULL }, + { gen_helper_sve_ld1hdu_le_r_mte, NULL, NULL, NULL }, + + { gen_helper_sve_ld1hds_le_r_mte, NULL, NULL, NULL }, + { gen_helper_sve_ld1hss_le_r_mte, NULL, NULL, NULL }, + { gen_helper_sve_ld1ss_le_r_mte, + gen_helper_sve_ld2ss_le_r_mte, + gen_helper_sve_ld3ss_le_r_mte, + gen_helper_sve_ld4ss_le_r_mte }, + { gen_helper_sve_ld1sdu_le_r_mte, NULL, NULL, NULL }, + + { gen_helper_sve_ld1bds_r_mte, NULL, NULL, NULL }, + { gen_helper_sve_ld1bss_r_mte, NULL, NULL, NULL }, + { gen_helper_sve_ld1bhs_r_mte, NULL, NULL, NULL }, + { gen_helper_sve_ld1dd_le_r_mte, + gen_helper_sve_ld2dd_le_r_mte, + gen_helper_sve_ld3dd_le_r_mte, + gen_helper_sve_ld4dd_le_r_mte } }, + + /* mte active, big-endian */ + { { gen_helper_sve_ld1bb_r_mte, + gen_helper_sve_ld2bb_r_mte, + gen_helper_sve_ld3bb_r_mte, + gen_helper_sve_ld4bb_r_mte }, + { gen_helper_sve_ld1bhu_r_mte, NULL, NULL, NULL }, + { gen_helper_sve_ld1bsu_r_mte, NULL, NULL, NULL }, + { gen_helper_sve_ld1bdu_r_mte, NULL, NULL, NULL }, + + { gen_helper_sve_ld1sds_be_r_mte, NULL, NULL, NULL }, + { gen_helper_sve_ld1hh_be_r_mte, + gen_helper_sve_ld2hh_be_r_mte, + gen_helper_sve_ld3hh_be_r_mte, + gen_helper_sve_ld4hh_be_r_mte }, + { gen_helper_sve_ld1hsu_be_r_mte, NULL, NULL, NULL }, + { gen_helper_sve_ld1hdu_be_r_mte, NULL, NULL, NULL }, + + { gen_helper_sve_ld1hds_be_r_mte, NULL, NULL, NULL }, + { gen_helper_sve_ld1hss_be_r_mte, NULL, NULL, NULL }, + { gen_helper_sve_ld1ss_be_r_mte, + gen_helper_sve_ld2ss_be_r_mte, + gen_helper_sve_ld3ss_be_r_mte, + gen_helper_sve_ld4ss_be_r_mte }, + { gen_helper_sve_ld1sdu_be_r_mte, NULL, NULL, NULL }, + + { gen_helper_sve_ld1bds_r_mte, NULL, NULL, NULL }, + { gen_helper_sve_ld1bss_r_mte, NULL, NULL, NULL }, + { gen_helper_sve_ld1bhs_r_mte, NULL, NULL, NULL }, + { gen_helper_sve_ld1dd_be_r_mte, + gen_helper_sve_ld2dd_be_r_mte, + gen_helper_sve_ld3dd_be_r_mte, + gen_helper_sve_ld4dd_be_r_mte } } }, +}; + static void do_ld_zpa(DisasContext *s, int zt, int pg, TCGv_i64 addr, int dtype, int nreg) { - static gen_helper_gvec_mem * const fns[2][2][16][4] = { - { /* mte inactive, little-endian */ - { { gen_helper_sve_ld1bb_r, gen_helper_sve_ld2bb_r, - gen_helper_sve_ld3bb_r, gen_helper_sve_ld4bb_r }, - { gen_helper_sve_ld1bhu_r, NULL, NULL, NULL }, - { gen_helper_sve_ld1bsu_r, NULL, NULL, NULL }, - { gen_helper_sve_ld1bdu_r, NULL, NULL, NULL }, - - { gen_helper_sve_ld1sds_le_r, NULL, NULL, NULL }, - { gen_helper_sve_ld1hh_le_r, gen_helper_sve_ld2hh_le_r, - gen_helper_sve_ld3hh_le_r, gen_helper_sve_ld4hh_le_r }, - { gen_helper_sve_ld1hsu_le_r, NULL, NULL, NULL }, - { gen_helper_sve_ld1hdu_le_r, NULL, NULL, NULL }, - - { gen_helper_sve_ld1hds_le_r, NULL, NULL, NULL }, - { gen_helper_sve_ld1hss_le_r, NULL, NULL, NULL }, - { gen_helper_sve_ld1ss_le_r, gen_helper_sve_ld2ss_le_r, - gen_helper_sve_ld3ss_le_r, gen_helper_sve_ld4ss_le_r }, - { gen_helper_sve_ld1sdu_le_r, NULL, NULL, NULL }, - - { gen_helper_sve_ld1bds_r, NULL, NULL, NULL }, - { gen_helper_sve_ld1bss_r, NULL, NULL, NULL }, - { gen_helper_sve_ld1bhs_r, NULL, NULL, NULL }, - { gen_helper_sve_ld1dd_le_r, gen_helper_sve_ld2dd_le_r, - gen_helper_sve_ld3dd_le_r, gen_helper_sve_ld4dd_le_r } }, - - /* mte inactive, big-endian */ - { { gen_helper_sve_ld1bb_r, gen_helper_sve_ld2bb_r, - gen_helper_sve_ld3bb_r, gen_helper_sve_ld4bb_r }, - { gen_helper_sve_ld1bhu_r, NULL, NULL, NULL }, - { gen_helper_sve_ld1bsu_r, NULL, NULL, NULL }, - { gen_helper_sve_ld1bdu_r, NULL, NULL, NULL }, - - { gen_helper_sve_ld1sds_be_r, NULL, NULL, NULL }, - { gen_helper_sve_ld1hh_be_r, gen_helper_sve_ld2hh_be_r, - gen_helper_sve_ld3hh_be_r, gen_helper_sve_ld4hh_be_r }, - { gen_helper_sve_ld1hsu_be_r, NULL, NULL, NULL }, - { gen_helper_sve_ld1hdu_be_r, NULL, NULL, NULL }, - - { gen_helper_sve_ld1hds_be_r, NULL, NULL, NULL }, - { gen_helper_sve_ld1hss_be_r, NULL, NULL, NULL }, - { gen_helper_sve_ld1ss_be_r, gen_helper_sve_ld2ss_be_r, - gen_helper_sve_ld3ss_be_r, gen_helper_sve_ld4ss_be_r }, - { gen_helper_sve_ld1sdu_be_r, NULL, NULL, NULL }, - - { gen_helper_sve_ld1bds_r, NULL, NULL, NULL }, - { gen_helper_sve_ld1bss_r, NULL, NULL, NULL }, - { gen_helper_sve_ld1bhs_r, NULL, NULL, NULL }, - { gen_helper_sve_ld1dd_be_r, gen_helper_sve_ld2dd_be_r, - gen_helper_sve_ld3dd_be_r, gen_helper_sve_ld4dd_be_r } } }, - - { /* mte active, little-endian */ - { { gen_helper_sve_ld1bb_r_mte, - gen_helper_sve_ld2bb_r_mte, - gen_helper_sve_ld3bb_r_mte, - gen_helper_sve_ld4bb_r_mte }, - { gen_helper_sve_ld1bhu_r_mte, NULL, NULL, NULL }, - { gen_helper_sve_ld1bsu_r_mte, NULL, NULL, NULL }, - { gen_helper_sve_ld1bdu_r_mte, NULL, NULL, NULL }, - - { gen_helper_sve_ld1sds_le_r_mte, NULL, NULL, NULL }, - { gen_helper_sve_ld1hh_le_r_mte, - gen_helper_sve_ld2hh_le_r_mte, - gen_helper_sve_ld3hh_le_r_mte, - gen_helper_sve_ld4hh_le_r_mte }, - { gen_helper_sve_ld1hsu_le_r_mte, NULL, NULL, NULL }, - { gen_helper_sve_ld1hdu_le_r_mte, NULL, NULL, NULL }, - - { gen_helper_sve_ld1hds_le_r_mte, NULL, NULL, NULL }, - { gen_helper_sve_ld1hss_le_r_mte, NULL, NULL, NULL }, - { gen_helper_sve_ld1ss_le_r_mte, - gen_helper_sve_ld2ss_le_r_mte, - gen_helper_sve_ld3ss_le_r_mte, - gen_helper_sve_ld4ss_le_r_mte }, - { gen_helper_sve_ld1sdu_le_r_mte, NULL, NULL, NULL }, - - { gen_helper_sve_ld1bds_r_mte, NULL, NULL, NULL }, - { gen_helper_sve_ld1bss_r_mte, NULL, NULL, NULL }, - { gen_helper_sve_ld1bhs_r_mte, NULL, NULL, NULL }, - { gen_helper_sve_ld1dd_le_r_mte, - gen_helper_sve_ld2dd_le_r_mte, - gen_helper_sve_ld3dd_le_r_mte, - gen_helper_sve_ld4dd_le_r_mte } }, - - /* mte active, big-endian */ - { { gen_helper_sve_ld1bb_r_mte, - gen_helper_sve_ld2bb_r_mte, - gen_helper_sve_ld3bb_r_mte, - gen_helper_sve_ld4bb_r_mte }, - { gen_helper_sve_ld1bhu_r_mte, NULL, NULL, NULL }, - { gen_helper_sve_ld1bsu_r_mte, NULL, NULL, NULL }, - { gen_helper_sve_ld1bdu_r_mte, NULL, NULL, NULL }, - - { gen_helper_sve_ld1sds_be_r_mte, NULL, NULL, NULL }, - { gen_helper_sve_ld1hh_be_r_mte, - gen_helper_sve_ld2hh_be_r_mte, - gen_helper_sve_ld3hh_be_r_mte, - gen_helper_sve_ld4hh_be_r_mte }, - { gen_helper_sve_ld1hsu_be_r_mte, NULL, NULL, NULL }, - { gen_helper_sve_ld1hdu_be_r_mte, NULL, NULL, NULL }, - - { gen_helper_sve_ld1hds_be_r_mte, NULL, NULL, NULL }, - { gen_helper_sve_ld1hss_be_r_mte, NULL, NULL, NULL }, - { gen_helper_sve_ld1ss_be_r_mte, - gen_helper_sve_ld2ss_be_r_mte, - gen_helper_sve_ld3ss_be_r_mte, - gen_helper_sve_ld4ss_be_r_mte }, - { gen_helper_sve_ld1sdu_be_r_mte, NULL, NULL, NULL }, - - { gen_helper_sve_ld1bds_r_mte, NULL, NULL, NULL }, - { gen_helper_sve_ld1bss_r_mte, NULL, NULL, NULL }, - { gen_helper_sve_ld1bhs_r_mte, NULL, NULL, NULL }, - { gen_helper_sve_ld1dd_be_r_mte, - gen_helper_sve_ld2dd_be_r_mte, - gen_helper_sve_ld3dd_be_r_mte, - gen_helper_sve_ld4dd_be_r_mte } } }, - }; gen_helper_gvec_mem *fn - = fns[s->mte_active[0]][s->be_data == MO_BE][dtype][nreg]; + = ldr_fns[s->mte_active[0]][s->be_data == MO_BE][dtype][nreg]; /* * While there are holes in the table, they are not @@ -5567,14 +5569,8 @@ static bool trans_LDNF1_zpri(DisasContext *s, arg_rpri_load *a) return true; } -static void do_ldrq(DisasContext *s, int zt, int pg, TCGv_i64 addr, int msz) +static void do_ldrq(DisasContext *s, int zt, int pg, TCGv_i64 addr, int dtype) { - static gen_helper_gvec_mem * const fns[2][4] = { - { gen_helper_sve_ld1bb_r, gen_helper_sve_ld1hh_le_r, - gen_helper_sve_ld1ss_le_r, gen_helper_sve_ld1dd_le_r }, - { gen_helper_sve_ld1bb_r, gen_helper_sve_ld1hh_be_r, - gen_helper_sve_ld1ss_be_r, gen_helper_sve_ld1dd_be_r }, - }; unsigned vsz = vec_full_reg_size(s); TCGv_ptr t_pg; TCGv_i32 t_desc; @@ -5606,7 +5602,9 @@ static void do_ldrq(DisasContext *s, int zt, int pg, TCGv_i64 addr, int msz) t_pg = tcg_temp_new_ptr(); tcg_gen_addi_ptr(t_pg, cpu_env, poff); - fns[s->be_data == MO_BE][msz](cpu_env, t_pg, addr, t_desc); + gen_helper_gvec_mem *fn + = ldr_fns[s->mte_active[0]][s->be_data == MO_BE][dtype][0]; + fn(cpu_env, t_pg, addr, t_desc); tcg_temp_free_ptr(t_pg); tcg_temp_free_i32(t_desc); @@ -5628,7 +5626,7 @@ static bool trans_LD1RQ_zprr(DisasContext *s, arg_rprr_load *a) TCGv_i64 addr = new_tmp_a64(s); tcg_gen_shli_i64(addr, cpu_reg(s, a->rm), msz); tcg_gen_add_i64(addr, addr, cpu_reg_sp(s, a->rn)); - do_ldrq(s, a->rd, a->pg, addr, msz); + do_ldrq(s, a->rd, a->pg, addr, a->dtype); } return true; } @@ -5638,7 +5636,7 @@ static bool trans_LD1RQ_zpri(DisasContext *s, arg_rpri_load *a) if (sve_access_check(s)) { TCGv_i64 addr = new_tmp_a64(s); tcg_gen_addi_i64(addr, cpu_reg_sp(s, a->rn), a->imm * 16); - do_ldrq(s, a->rd, a->pg, addr, dtype_msz(a->dtype)); + do_ldrq(s, a->rd, a->pg, addr, a->dtype); } return true; } From patchwork Fri Sep 18 18:37:46 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 304998 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 21F13C43464 for ; Fri, 18 Sep 2020 19:46:54 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id AEBFF21D42 for ; Fri, 18 Sep 2020 19:46:53 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="wdp5jhjC" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org AEBFF21D42 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:60338 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJMLM-0005hd-QL for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 15:46:52 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:49024) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLIJ-0001Lr-2H for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:39:44 -0400 Received: from mail-pj1-x1041.google.com ([2607:f8b0:4864:20::1041]:34579) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLIG-0007JK-Rc for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:39:38 -0400 Received: by mail-pj1-x1041.google.com with SMTP id s14so4600544pju.1 for ; Fri, 18 Sep 2020 11:39:36 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=uqRFNthjY19DaXy+bhh6PHCChxy0wtOZHHVokv5xS8o=; b=wdp5jhjCLrTN4ISPLKg1ygpdiU76waF5AzzDL1phmVGLfKCdjSE6N/g2G6D36dkzCl Yu3KUTRkwCl1UxdBv6K6Pq1H0NjARd58U0pzdqS79cMiwJWZk9ubU8o4bgvl3CCUv+Od LMMsc9Z+6VTk1saLWIQugLfwpTIXEYR2aK8AU9qRO8cTmjCIHSOeoiJnwRfrIG9PshhB F4Jj2/vZggRu9LTM6M/HQYwUzS596jgdlw6mL8nU4wPI4OzZ7m00bEFB0DQ70aYKPRJ9 tkfSeoUDP7Pc/7+N15oYWKDn2KJayVAQlejKTHCIXGH1B3Zh7jOSKvEJ7dJ1bodo8zd1 vMbg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=uqRFNthjY19DaXy+bhh6PHCChxy0wtOZHHVokv5xS8o=; b=BwgeP41DrIObDDTc1U8c6UZ2P4u8F//5PV/0fuoLcWI+o9+hQ59qwVVKAeaSWErHVN BhSExNiQ+ux2UQZ6ixfJmvwMZe6uqz+lwfcaeoDZrSoTuucmW4ZOcd++TD55iRLexsK8 cGcse11lUvAQFKDysPQ55e+DKeb4CG1jioy5yw0CRlfA1hQwy9NRK7VIoXjEqlYANi76 7RZimSqRqo44Y397ra09XJCOLW++iNOweycBfSvWqy4AXyqv62XNzW9OqbMoaeM9KisS j5S6dQb0kCIGOu395oILqCtxIahH3p7K/TKwZgmgJWbsJL4A68Hxr2kWZhmJu6t9sokU 1frQ== X-Gm-Message-State: AOAM533oH+C3hHaDAhd6oyexYDUXi/1KOhev4MVrboLMbuJCMQOiORyF G9b1Yya/T/e+BvdgDH2NQUpYrq0vjZOJIw== X-Google-Smtp-Source: ABdhPJwaYIoOCMIW8iUwZxHVqyed0x4+amtwrD+9b/pVq6K3m10v+O09cHsd/cAmxlCblBmT/z4uxA== X-Received: by 2002:a17:902:a50b:b029:d1:e5e7:bdd2 with SMTP id s11-20020a170902a50bb02900d1e5e7bdd2mr16865535plq.50.1600454374937; Fri, 18 Sep 2020 11:39:34 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.39.33 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:39:34 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 76/81] target/arm: Implement SVE2 LD1RO Date: Fri, 18 Sep 2020 11:37:46 -0700 Message-Id: <20200918183751.2787647-77-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::1041; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x1041.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/sve.decode | 4 ++ target/arm/translate-sve.c | 97 ++++++++++++++++++++++++++++++++++++++ 2 files changed, 101 insertions(+) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 6808ff4194..e0d093c5d7 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1119,11 +1119,15 @@ LD_zpri 1010010 .. nreg:2 0.... 111 ... ..... ..... @rpri_load_msz # SVE load and broadcast quadword (scalar plus scalar) LD1RQ_zprr 1010010 .. 00 ..... 000 ... ..... ..... \ @rprr_load_msz nreg=0 +LD1RO_zprr 1010010 .. 01 ..... 000 ... ..... ..... \ + @rprr_load_msz nreg=0 # SVE load and broadcast quadword (scalar plus immediate) # LD1RQB, LD1RQH, LD1RQS, LD1RQD LD1RQ_zpri 1010010 .. 00 0.... 001 ... ..... ..... \ @rpri_load_msz nreg=0 +LD1RO_zpri 1010010 .. 01 0.... 001 ... ..... ..... \ + @rpri_load_msz nreg=0 # SVE 32-bit gather prefetch (scalar plus 32-bit scaled offsets) PRF 1000010 00 -1 ----- 0-- --- ----- 0 ---- diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 79af31f12e..7c6a6a3bda 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -5641,6 +5641,103 @@ static bool trans_LD1RQ_zpri(DisasContext *s, arg_rpri_load *a) return true; } +static void do_ldro(DisasContext *s, int zt, int pg, TCGv_i64 addr, int dtype) +{ + unsigned vsz = vec_full_reg_size(s); + unsigned vsz_r32; + TCGv_ptr t_pg; + TCGv_i32 t_desc; + int desc, poff, doff; + + if (vsz < 32) { + /* + * Note that this UNDEFINED check comes after CheckSVEEnabled() + * in the ARM pseudocode, which is the sve_access_check() done + * in our caller. We should not now return false from the caller. + */ + unallocated_encoding(s); + return; + } + + /* Load the first octaword using the normal predicated load helpers. */ + + poff = pred_full_reg_offset(s, pg); + if (vsz > 32) { + /* + * Zero-extend the first 32 bits of the predicate into a temporary. + * This avoids triggering an assert making sure we don't have bits + * set within a predicate beyond VQ, but we have lowered VQ to 2 + * for this load operation. + */ + TCGv_i64 tmp = tcg_temp_new_i64(); +#ifdef HOST_WORDS_BIGENDIAN + poff += 4; +#endif + tcg_gen_ld32u_i64(tmp, cpu_env, poff); + + poff = offsetof(CPUARMState, vfp.preg_tmp); + tcg_gen_st_i64(tmp, cpu_env, poff); + tcg_temp_free_i64(tmp); + } + + t_pg = tcg_temp_new_ptr(); + tcg_gen_addi_ptr(t_pg, cpu_env, poff); + + desc = simd_desc(32, 32, zt); + t_desc = tcg_const_i32(desc); + + gen_helper_gvec_mem *fn + = ldr_fns[s->mte_active[0]][s->be_data == MO_BE][dtype][0]; + fn(cpu_env, t_pg, addr, t_desc); + + tcg_temp_free_ptr(t_pg); + tcg_temp_free_i32(t_desc); + + /* + * Replicate that first octaword. + * The replication happens in units of 32; if the full vector size + * is not a multiple of 32, the final bits are zeroed. + */ + doff = vec_full_reg_offset(s, zt); + vsz_r32 = QEMU_ALIGN_DOWN(vsz, 32); + if (vsz >= 64) { + tcg_gen_gvec_dup_mem(5, doff + 32, doff, vsz_r32 - 32, vsz - 32); + } else if (vsz > vsz_r32) { + /* Nop move, with side effect of clearing the tail. */ + tcg_gen_gvec_mov(MO_64, doff, doff, vsz_r32, vsz); + } +} + +static bool trans_LD1RO_zprr(DisasContext *s, arg_rprr_load *a) +{ + if (!dc_isar_feature(aa64_sve2_f64mm, s)) { + return false; + } + if (a->rm == 31) { + return false; + } + if (sve_access_check(s)) { + TCGv_i64 addr = new_tmp_a64(s); + tcg_gen_shli_i64(addr, cpu_reg(s, a->rm), dtype_msz(a->dtype)); + tcg_gen_add_i64(addr, addr, cpu_reg_sp(s, a->rn)); + do_ldro(s, a->rd, a->pg, addr, a->dtype); + } + return true; +} + +static bool trans_LD1RO_zpri(DisasContext *s, arg_rpri_load *a) +{ + if (!dc_isar_feature(aa64_sve2_f64mm, s)) { + return false; + } + if (sve_access_check(s)) { + TCGv_i64 addr = new_tmp_a64(s); + tcg_gen_addi_i64(addr, cpu_reg_sp(s, a->rn), a->imm * 32); + do_ldro(s, a->rd, a->pg, addr, a->dtype); + } + return true; +} + /* Load and broadcast element. */ static bool trans_LD1R_zpri(DisasContext *s, arg_rpri_load *a) { From patchwork Fri Sep 18 18:37:47 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 273334 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id F2A9DC43463 for ; Fri, 18 Sep 2020 19:27:53 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 8771B208DB for ; Fri, 18 Sep 2020 19:27:53 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="mq+5oxtB" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 8771B208DB Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:43588 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJM2y-0002wb-MG for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 15:27:52 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:49076) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLIO-0001NZ-Rw for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:39:46 -0400 Received: from mail-pg1-x52b.google.com ([2607:f8b0:4864:20::52b]:45991) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLII-0007JQ-13 for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:39:44 -0400 Received: by mail-pg1-x52b.google.com with SMTP id 67so3962894pgd.12 for ; Fri, 18 Sep 2020 11:39:37 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=Ot4ZSaRqcI5khQZrBXNt0ZSLnRQc4wVu31DCD3bP540=; b=mq+5oxtBogkuL1a/YMae3ZZ3R5/PhVuE1l+1T0swgmE+1OO2RXATXF2yNC8ABSJubC Mb8XNdlqkdxkA5QruqVkAnMikIKUuc2qeQZa5WpNmMQcUtKvpVqA0n8MPhKTcnznjxN7 jIiaheLgounOiJ8Ns/iO44P0T0P0G4vWOFXRQ0M8ZDD8dkoXqQ0fLNCPYps02t7Id594 cC8oBvMRApyHUGga0ems1yRy44ELkGBndsiNR7Fe7T0SQ5woiTKnXqXFpFZNCNfu2fjz /DxaD/WyX/TnUlQF3R6kB/yOD065l3RpzaBgEWiJerGY8zqbYWD/KkgO8o4Pjo3o3vX9 81Sg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=Ot4ZSaRqcI5khQZrBXNt0ZSLnRQc4wVu31DCD3bP540=; b=CoqJ/j0Djp1qSrsi8S1ImOi514OTkvRTzA9xbG9V7QIbfYaBys8ytklfchkXM1l/rt N7an4xVz2l2THXKs4N3P07WvdrYKXgHuOnB4DVKPXbMqyHiG+eqn6WiKOM4F5T3AAGO2 JsO0+wW8NUQP2UVg9r8EDVXlwTRGV9TXqSdFRCdI2sgL3YFU4654lJHX/aSQUsXRz/GK /m/BbYj+pptofrvqwzRcYtJgEDOTCL4c2jGRbhdYy765KTn+5OBHN8pXa6KZEO4uMt2e xiSjGEoo0hsDckp+6UojBdQKZtaY7vnWGZoGcUF+Rs4MbTbKOK06uVUBHY2SrVdlk2WD G1HQ== X-Gm-Message-State: AOAM533/tzIh11bMiPisWv3p6w/wJ3IK1msmkjO0hRGaepA3Q18D6BdE 3XxrMHWSGimPm3z7s7aws5pNdnMW5GKC8w== X-Google-Smtp-Source: ABdhPJxHQQlzLc1CdEs7XZ8vQufxmQSKKKHMicg74UBF2q5fqN3D/blTpuYbfuctG8pGfNB2FoVOjw== X-Received: by 2002:aa7:87da:0:b029:13c:1611:66bf with SMTP id i26-20020aa787da0000b029013c161166bfmr33485895pfo.10.1600454376200; Fri, 18 Sep 2020 11:39:36 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.39.35 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:39:35 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 77/81] target/arm: Implement 128-bit ZIP, UZP, TRN Date: Fri, 18 Sep 2020 11:37:47 -0700 Message-Id: <20200918183751.2787647-78-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::52b; envelope-from=richard.henderson@linaro.org; helo=mail-pg1-x52b.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 3 ++ target/arm/sve.decode | 8 ++++++ target/arm/sve_helper.c | 29 +++++++++++++------ target/arm/translate-sve.c | 58 ++++++++++++++++++++++++++++++++++++++ 4 files changed, 90 insertions(+), 8 deletions(-) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index f25b403ddb..feed23bbd1 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -689,16 +689,19 @@ DEF_HELPER_FLAGS_4(sve_zip_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_zip_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_zip_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_zip_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_zip_q, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_uzp_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_uzp_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_uzp_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_uzp_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_uzp_q, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_trn_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_trn_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_trn_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_trn_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_trn_q, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_compact_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_4(sve_compact_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index e0d093c5d7..4e21274dc4 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -592,6 +592,14 @@ UZP2_z 00000101 .. 1 ..... 011 011 ..... ..... @rd_rn_rm TRN1_z 00000101 .. 1 ..... 011 100 ..... ..... @rd_rn_rm TRN2_z 00000101 .. 1 ..... 011 101 ..... ..... @rd_rn_rm +# SVE2 permute vector segments +ZIP1_q 00000101 10 1 ..... 000 000 ..... ..... @rd_rn_rm_e0 +ZIP2_q 00000101 10 1 ..... 000 001 ..... ..... @rd_rn_rm_e0 +UZP1_q 00000101 10 1 ..... 000 010 ..... ..... @rd_rn_rm_e0 +UZP2_q 00000101 10 1 ..... 000 011 ..... ..... @rd_rn_rm_e0 +TRN1_q 00000101 10 1 ..... 000 110 ..... ..... @rd_rn_rm_e0 +TRN2_q 00000101 10 1 ..... 000 111 ..... ..... @rd_rn_rm_e0 + ### SVE Permute - Predicated Group # SVE compress active elements diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index dc51cb75fc..12d21dcd65 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -3457,36 +3457,45 @@ void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ *(TYPE *)(vd + H(2 * i + 0)) = *(TYPE *)(vn + H(i)); \ *(TYPE *)(vd + H(2 * i + sizeof(TYPE))) = *(TYPE *)(vm + H(i)); \ } \ + if (sizeof(TYPE) == 16 && unlikely(oprsz & 16)) { \ + memset(vd + oprsz - 16, 0, 16); \ + } \ } DO_ZIP(sve_zip_b, uint8_t, H1) DO_ZIP(sve_zip_h, uint16_t, H1_2) DO_ZIP(sve_zip_s, uint32_t, H1_4) DO_ZIP(sve_zip_d, uint64_t, ) +DO_ZIP(sve2_zip_q, Int128, ) #define DO_UZP(NAME, TYPE, H) \ void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ { \ intptr_t oprsz = simd_oprsz(desc); \ - intptr_t oprsz_2 = oprsz / 2; \ intptr_t odd_ofs = simd_data(desc); \ - intptr_t i; \ + intptr_t i, p; \ ARMVectorReg tmp_m; \ if (unlikely((vm - vd) < (uintptr_t)oprsz)) { \ vm = memcpy(&tmp_m, vm, oprsz); \ } \ - for (i = 0; i < oprsz_2; i += sizeof(TYPE)) { \ - *(TYPE *)(vd + H(i)) = *(TYPE *)(vn + H(2 * i + odd_ofs)); \ - } \ - for (i = 0; i < oprsz_2; i += sizeof(TYPE)) { \ - *(TYPE *)(vd + H(oprsz_2 + i)) = *(TYPE *)(vm + H(2 * i + odd_ofs)); \ - } \ + i = 0, p = odd_ofs; \ + do { \ + *(TYPE *)(vd + H(i)) = *(TYPE *)(vn + H(p)); \ + i += sizeof(TYPE), p += 2 * sizeof(TYPE); \ + } while (p < oprsz); \ + p -= oprsz; \ + do { \ + *(TYPE *)(vd + H(i)) = *(TYPE *)(vm + H(p)); \ + i += sizeof(TYPE), p += 2 * sizeof(TYPE); \ + } while (p < oprsz); \ + tcg_debug_assert(i == oprsz); \ } DO_UZP(sve_uzp_b, uint8_t, H1) DO_UZP(sve_uzp_h, uint16_t, H1_2) DO_UZP(sve_uzp_s, uint32_t, H1_4) DO_UZP(sve_uzp_d, uint64_t, ) +DO_UZP(sve2_uzp_q, Int128, ) #define DO_TRN(NAME, TYPE, H) \ void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ @@ -3500,12 +3509,16 @@ void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \ *(TYPE *)(vd + H(i + 0)) = ae; \ *(TYPE *)(vd + H(i + sizeof(TYPE))) = be; \ } \ + if (sizeof(TYPE) == 16 && unlikely(oprsz & 16)) { \ + memset(vd + oprsz - 16, 0, 16); \ + } \ } DO_TRN(sve_trn_b, uint8_t, H1) DO_TRN(sve_trn_h, uint16_t, H1_2) DO_TRN(sve_trn_s, uint32_t, H1_4) DO_TRN(sve_trn_d, uint64_t, ) +DO_TRN(sve2_trn_q, Int128, ) #undef DO_ZIP #undef DO_UZP diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 7c6a6a3bda..214788ea23 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -2633,6 +2633,32 @@ static bool trans_ZIP2_z(DisasContext *s, arg_rrr_esz *a) return do_zip(s, a, true); } +static bool do_zip_q(DisasContext *s, arg_rrr_esz *a, bool high) +{ + if (!dc_isar_feature(aa64_sve2_f64mm, s)) { + return false; + } + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + unsigned high_ofs = high ? QEMU_ALIGN_DOWN(vsz, 32) / 2 : 0; + tcg_gen_gvec_3_ool(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn) + high_ofs, + vec_full_reg_offset(s, a->rm) + high_ofs, + vsz, vsz, 0, gen_helper_sve2_zip_q); + } + return true; +} + +static bool trans_ZIP1_q(DisasContext *s, arg_rrr_esz *a) +{ + return do_zip_q(s, a, false); +} + +static bool trans_ZIP2_q(DisasContext *s, arg_rrr_esz *a) +{ + return do_zip_q(s, a, true); +} + static gen_helper_gvec_3 * const uzp_fns[4] = { gen_helper_sve_uzp_b, gen_helper_sve_uzp_h, gen_helper_sve_uzp_s, gen_helper_sve_uzp_d, @@ -2648,6 +2674,22 @@ static bool trans_UZP2_z(DisasContext *s, arg_rrr_esz *a) return do_zzz_data_ool(s, a, 1 << a->esz, uzp_fns[a->esz]); } +static bool trans_UZP1_q(DisasContext *s, arg_rrr_esz *a) +{ + if (!dc_isar_feature(aa64_sve2_f64mm, s)) { + return false; + } + return do_zzz_data_ool(s, a, 0, gen_helper_sve2_uzp_q); +} + +static bool trans_UZP2_q(DisasContext *s, arg_rrr_esz *a) +{ + if (!dc_isar_feature(aa64_sve2_f64mm, s)) { + return false; + } + return do_zzz_data_ool(s, a, 16, gen_helper_sve2_uzp_q); +} + static gen_helper_gvec_3 * const trn_fns[4] = { gen_helper_sve_trn_b, gen_helper_sve_trn_h, gen_helper_sve_trn_s, gen_helper_sve_trn_d, @@ -2663,6 +2705,22 @@ static bool trans_TRN2_z(DisasContext *s, arg_rrr_esz *a) return do_zzz_data_ool(s, a, 1 << a->esz, trn_fns[a->esz]); } +static bool trans_TRN1_q(DisasContext *s, arg_rrr_esz *a) +{ + if (!dc_isar_feature(aa64_sve2_f64mm, s)) { + return false; + } + return do_zzz_data_ool(s, a, 0, gen_helper_sve2_trn_q); +} + +static bool trans_TRN2_q(DisasContext *s, arg_rrr_esz *a) +{ + if (!dc_isar_feature(aa64_sve2_f64mm, s)) { + return false; + } + return do_zzz_data_ool(s, a, 16, gen_helper_sve2_trn_q); +} + /* *** SVE Permute Vector - Predicated Group */ From patchwork Fri Sep 18 18:37:48 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 273320 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 43B29C43464 for ; Fri, 18 Sep 2020 20:00:31 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id AC36E2067D for ; Fri, 18 Sep 2020 20:00:30 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="Z5ki9XWL" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org AC36E2067D Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:60438 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJMYV-0000ox-IT for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 16:00:29 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:49106) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLIQ-0001OE-VA for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:39:46 -0400 Received: from mail-pf1-x42f.google.com ([2607:f8b0:4864:20::42f]:46243) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLIJ-0007Je-Av for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:39:46 -0400 Received: by mail-pf1-x42f.google.com with SMTP id b124so3979573pfg.13 for ; Fri, 18 Sep 2020 11:39:38 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=uWrP993e91NZdCWoZj580puJQVNQnFvVcyTMSpehn0g=; b=Z5ki9XWLvCwVriNIcIn+QVFUsXS1b/KN64STsLKcltH/2QLG3FB71xsqun2Gcq5FOl ebAtB1WgwXA7jHb765Y1A1va/3xaFnCPLOBf8/gAE+R828AtezOUJdKUvmShvHrWrcE0 dIUKzzn54f37Y0aZ4qUruTTiK6CFosTkmR57ZvDq9U/IRm4DfQcS+ALbLqfo0Cg1hdmv TwwVLt0RmhA6/CS30ZLy0Df6KVCuoXMAC8BYasWXN6WsldOSwm819KARhuGvhNixYMI7 uE/wECIiaQn+K3Z8mN/RAzbDgplQBc2CWkDpTsd1GN0nFwtIbFTO0zmj1QbjI1wl1lQa N7mQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=uWrP993e91NZdCWoZj580puJQVNQnFvVcyTMSpehn0g=; b=A2BBPfn/qFkfkrcTBoEQDVK+jXw0Upzs2G+QV79pilYcvZAQ3u0THZ2q+lqzD5128m BwuV0soyOF8NceNAoMENx5aVwEqApj92WexBgwYPUcyVTd2PtG8uQUP4gxDducBEMgZP xeeD5OkEhQDY2u4JEZ4k6oc8W5GaApYNTsk5JDR42h/abm5OkGWzVKG73BODHmL+rUoI 2PF89YtKWKT1TIxOpyg9gIr77zca8qD7yhJW0eimKSmHFhpnsG34mmZmE+vftIizSofD tFGBCmcjLC6YbofB5d3/OtpWp3h9VfDIv43QghptFYz4Z6FyKcQOLXwvCJnV9z3c5QiP Z4Ew== X-Gm-Message-State: AOAM532N8G88eKub8A0K93n1A+U7xdnQVLQqzN/BP3oi2WGzIB6jRBDZ EdhjHwzAhTbeNoxWjxY/UV5gIyOkjLTHnA== X-Google-Smtp-Source: ABdhPJzKKKZyqAadQBL8IEAcZAG+pvL8iXPnWxL42/bd7VMUTypL9hN0UuNqfQ016N/0yhaz4ouYUw== X-Received: by 2002:aa7:96c7:0:b029:142:38cd:13de with SMTP id h7-20020aa796c70000b029014238cd13demr14042045pfq.66.1600454377587; Fri, 18 Sep 2020 11:39:37 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.39.36 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:39:36 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 78/81] target/arm: Implement SVE2 bitwise shift immediate Date: Fri, 18 Sep 2020 11:37:48 -0700 Message-Id: <20200918183751.2787647-79-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::42f; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x42f.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org, Stephen Long Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" From: Stephen Long Implements SQSHL/UQSHL, SRSHR/URSHR, and SQSHLU Signed-off-by: Stephen Long Message-Id: <20200430194159.24064-1-steplong@quicinc.com> Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 33 +++++++++++++++++++++ target/arm/sve.decode | 5 ++++ target/arm/sve_helper.c | 35 ++++++++++++++++++++++ target/arm/translate-sve.c | 60 ++++++++++++++++++++++++++++++++++++++ 4 files changed, 133 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index feed23bbd1..9542f01c42 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2738,6 +2738,39 @@ DEF_HELPER_FLAGS_5(sve2_sqrdcmlah_idx_h, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(sve2_sqrdcmlah_idx_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqshl_zpzi_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqshl_zpzi_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqshl_zpzi_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqshl_zpzi_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_uqshl_zpzi_b, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_uqshl_zpzi_h, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_uqshl_zpzi_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_uqshl_zpzi_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_srshr_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_srshr_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_srshr_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_srshr_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_urshr_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_urshr_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_urshr_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_urshr_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_4(sve2_sqshlu_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqshlu_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqshlu_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_4(sve2_sqshlu_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_5(sve2_fcvtnt_sh, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(sve2_fcvtnt_ds, TCG_CALL_NO_RWG, diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 4e21274dc4..d2f33d96f3 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -342,6 +342,11 @@ ASR_zpzi 00000100 .. 000 000 100 ... .. ... ..... @rdn_pg_tszimm_shr LSR_zpzi 00000100 .. 000 001 100 ... .. ... ..... @rdn_pg_tszimm_shr LSL_zpzi 00000100 .. 000 011 100 ... .. ... ..... @rdn_pg_tszimm_shl ASRD 00000100 .. 000 100 100 ... .. ... ..... @rdn_pg_tszimm_shr +SQSHL_zpzi 00000100 .. 000 110 100 ... .. ... ..... @rdn_pg_tszimm_shl +UQSHL_zpzi 00000100 .. 000 111 100 ... .. ... ..... @rdn_pg_tszimm_shl +SRSHR 00000100 .. 001 100 100 ... .. ... ..... @rdn_pg_tszimm_shr +URSHR 00000100 .. 001 101 100 ... .. ... ..... @rdn_pg_tszimm_shr +SQSHLU 00000100 .. 001 111 100 ... .. ... ..... @rdn_pg_tszimm_shl # SVE bitwise shift by vector (predicated) ASR_zpzz 00000100 .. 010 000 100 ... ..... ..... @rdn_pg_rm diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index 12d21dcd65..ef04a0f95a 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -2203,6 +2203,41 @@ DO_ZPZI(sve_asrd_h, int16_t, H1_2, DO_ASRD) DO_ZPZI(sve_asrd_s, int32_t, H1_4, DO_ASRD) DO_ZPZI_D(sve_asrd_d, int64_t, DO_ASRD) +/* SVE2 bitwise shift by immediate */ +DO_ZPZI(sve2_sqshl_zpzi_b, int8_t, H1, do_sqshl_b) +DO_ZPZI(sve2_sqshl_zpzi_h, int16_t, H1_2, do_sqshl_h) +DO_ZPZI(sve2_sqshl_zpzi_s, int32_t, H1_4, do_sqshl_s) +DO_ZPZI_D(sve2_sqshl_zpzi_d, int64_t, do_sqshl_d) + +DO_ZPZI(sve2_uqshl_zpzi_b, uint8_t, H1, do_uqshl_b) +DO_ZPZI(sve2_uqshl_zpzi_h, uint16_t, H1_2, do_uqshl_h) +DO_ZPZI(sve2_uqshl_zpzi_s, uint32_t, H1_4, do_uqshl_s) +DO_ZPZI_D(sve2_uqshl_zpzi_d, uint64_t, do_uqshl_d) + +DO_ZPZI(sve2_srshr_b, int8_t, H1, do_srshr) +DO_ZPZI(sve2_srshr_h, int16_t, H1_2, do_srshr) +DO_ZPZI(sve2_srshr_s, int32_t, H1_4, do_srshr) +DO_ZPZI_D(sve2_srshr_d, int64_t, do_srshr) + +DO_ZPZI(sve2_urshr_b, uint8_t, H1, do_urshr) +DO_ZPZI(sve2_urshr_h, uint16_t, H1_2, do_urshr) +DO_ZPZI(sve2_urshr_s, uint32_t, H1_4, do_urshr) +DO_ZPZI_D(sve2_urshr_d, uint64_t, do_urshr) + +#define do_suqrshl_b(n, m) \ + ({ uint32_t discard; do_suqrshl_bhs(n, (int8_t)m, 8, false, &discard); }) +#define do_suqrshl_h(n, m) \ + ({ uint32_t discard; do_suqrshl_bhs(n, (int16_t)m, 16, false, &discard); }) +#define do_suqrshl_s(n, m) \ + ({ uint32_t discard; do_suqrshl_bhs(n, m, 32, false, &discard); }) +#define do_suqrshl_d(n, m) \ + ({ uint32_t discard; do_suqrshl_d(n, m, false, &discard); }) + +DO_ZPZI(sve2_sqshlu_b, int8_t, H1, do_suqrshl_b) +DO_ZPZI(sve2_sqshlu_h, int16_t, H1_2, do_suqrshl_h) +DO_ZPZI(sve2_sqshlu_s, int32_t, H1_4, do_suqrshl_s) +DO_ZPZI_D(sve2_sqshlu_d, int64_t, do_suqrshl_d) + #undef DO_ASRD #undef DO_ZPZI #undef DO_ZPZI_D diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 214788ea23..acd6d53724 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -1044,6 +1044,66 @@ static bool trans_ASRD(DisasContext *s, arg_rpri_esz *a) } } +static bool trans_SQSHL_zpzi(DisasContext *s, arg_rpri_esz *a) +{ + static gen_helper_gvec_3 * const fns[4] = { + gen_helper_sve2_sqshl_zpzi_b, gen_helper_sve2_sqshl_zpzi_h, + gen_helper_sve2_sqshl_zpzi_s, gen_helper_sve2_sqshl_zpzi_d, + }; + if (a->esz < 0 || !dc_isar_feature(aa64_sve2, s)) { + return false; + } + return do_zpzi_ool(s, a, fns[a->esz]); +} + +static bool trans_UQSHL_zpzi(DisasContext *s, arg_rpri_esz *a) +{ + static gen_helper_gvec_3 * const fns[4] = { + gen_helper_sve2_uqshl_zpzi_b, gen_helper_sve2_uqshl_zpzi_h, + gen_helper_sve2_uqshl_zpzi_s, gen_helper_sve2_uqshl_zpzi_d, + }; + if (a->esz < 0 || !dc_isar_feature(aa64_sve2, s)) { + return false; + } + return do_zpzi_ool(s, a, fns[a->esz]); +} + +static bool trans_SRSHR(DisasContext *s, arg_rpri_esz *a) +{ + static gen_helper_gvec_3 * const fns[4] = { + gen_helper_sve2_srshr_b, gen_helper_sve2_srshr_h, + gen_helper_sve2_srshr_s, gen_helper_sve2_srshr_d, + }; + if (a->esz < 0 || !dc_isar_feature(aa64_sve2, s)) { + return false; + } + return do_zpzi_ool(s, a, fns[a->esz]); +} + +static bool trans_URSHR(DisasContext *s, arg_rpri_esz *a) +{ + static gen_helper_gvec_3 * const fns[4] = { + gen_helper_sve2_urshr_b, gen_helper_sve2_urshr_h, + gen_helper_sve2_urshr_s, gen_helper_sve2_urshr_d, + }; + if (a->esz < 0 || !dc_isar_feature(aa64_sve2, s)) { + return false; + } + return do_zpzi_ool(s, a, fns[a->esz]); +} + +static bool trans_SQSHLU(DisasContext *s, arg_rpri_esz *a) +{ + static gen_helper_gvec_3 * const fns[4] = { + gen_helper_sve2_sqshlu_b, gen_helper_sve2_sqshlu_h, + gen_helper_sve2_sqshlu_s, gen_helper_sve2_sqshlu_d, + }; + if (a->esz < 0 || !dc_isar_feature(aa64_sve2, s)) { + return false; + } + return do_zpzi_ool(s, a, fns[a->esz]); +} + /* *** SVE Bitwise Shift - Predicated Group */ From patchwork Fri Sep 18 18:37:49 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 304994 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7B62AC43464 for ; Fri, 18 Sep 2020 19:58:55 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id A6A0221D42 for ; Fri, 18 Sep 2020 19:58:54 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="gINBo+6z" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org A6A0221D42 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:56952 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJMWz-0007oK-L5 for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 15:58:53 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:49126) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLIS-0001PF-RV for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:39:48 -0400 Received: from mail-pl1-x62e.google.com ([2607:f8b0:4864:20::62e]:42927) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLIO-0007K0-I4 for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:39:47 -0400 Received: by mail-pl1-x62e.google.com with SMTP id y6so3424105plt.9 for ; Fri, 18 Sep 2020 11:39:40 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=n5cEerfSjky0vwMwc7ItxI+SlyXUQaAUfRUD7bto4dk=; b=gINBo+6zt2AknZRFyEchitFKOVYGvSqaQdqbn9VDVpMCFvdp3XH5VtNPuSgVxXmphR /CjgUsBQ8DO5be2zGwFIJKf7lkwr1n1TxG0MTWcgVLUla8RlMEH0j/EPzn3Q+vdMLI28 nTSIFILdJDsOuSswzWGMdoFcI+J0s8vp/lxc6/u9CgPHV1VYFRBZJ0d36/YJa/7TLGRB KenMcSv6A2tVIo5LFDFqkjGWEwiG7ym0aQ93H1tLiNqO3t7CWBU8qZ54Om13t80Y4XP6 9fOXmJh8yD9XpXNn25TPYa41HUSe4xnque3AaEFJIqTWTYFgkzDq57dv5YacF+GD89Nz aV5Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=n5cEerfSjky0vwMwc7ItxI+SlyXUQaAUfRUD7bto4dk=; b=Niiq2IW/f2x4qXa5EpCTl0skEWFisRHYT1EMYM/RXduPMVb4Ai0460W3fo/lqT/9/P ttOc2mvi5cZze2zbUoAJvqyjpDA7igTVRtyDNQ0u3B0G6IbjkPX3QVnmH2VltrHlta1e eNZxItUE382635Jkiv3hZaNixhzhzUoPkSde51uSwN/029KAe2ODWBwX0SSo54ej03wo YHNE8YG+h6JDVb/buAc5xr+2WmpBlNIB6CM09rczaBBB8Z7eRvieh6qLzQdU6i2KtW4T pzDVLEkRL/rtePNVL3pDo4/Kf0OypTdQQd1NdpPWLOvTaQ9yb2swKDTo9m1UCjK9Bu2Q mqpQ== X-Gm-Message-State: AOAM533pLrtzYtMlxUwx9oMHxNJl3SH8hL/N1g3kT9ZyxWELY1iWYcu3 hRBYc3906ITuGjn8LiLYAeFE5jXd+fDz7Q== X-Google-Smtp-Source: ABdhPJyey6fBZDBcj6gFrB2O8I6gkX30G7pS6yNdo9cq6/34y3c5jI4kkDbAGzqf26zib9B5+TzduQ== X-Received: by 2002:a17:90a:1548:: with SMTP id y8mr14156412pja.113.1600454378704; Fri, 18 Sep 2020 11:39:38 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.39.37 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:39:38 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 79/81] target/arm: Implement SVE2 fp multiply-add long Date: Fri, 18 Sep 2020 11:37:49 -0700 Message-Id: <20200918183751.2787647-80-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::62e; envelope-from=richard.henderson@linaro.org; helo=mail-pl1-x62e.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org, Stephen Long Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" From: Stephen Long Implements both vectored and indexed FMLALB, FMLALT, FMLSLB, FMLSLT Signed-off-by: Stephen Long Message-Id: <20200504171240.11220-1-steplong@quicinc.com> [rth: Rearrange to use float16_to_float32_by_bits.] Signed-off-by: Richard Henderson --- target/arm/helper.h | 5 +++ target/arm/sve.decode | 12 ++++++ target/arm/translate-sve.c | 75 ++++++++++++++++++++++++++++++++++++++ target/arm/vec_helper.c | 51 ++++++++++++++++++++++++++ 4 files changed, 143 insertions(+) diff --git a/target/arm/helper.h b/target/arm/helper.h index 868c2e9121..41f114da2a 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -974,6 +974,11 @@ DEF_HELPER_FLAGS_4(sve2_sqrdmulh_idx_s, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_4(sve2_sqrdmulh_idx_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve2_fmlal_zzzw_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_6(sve2_fmlal_zzxw_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, ptr, i32) + DEF_HELPER_FLAGS_4(gvec_xar_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32) #ifdef TARGET_AARCH64 diff --git a/target/arm/sve.decode b/target/arm/sve.decode index d2f33d96f3..6708c048e0 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -1601,3 +1601,15 @@ FCVTLT_sd 01100100 11 0010 11 101 ... ..... ..... @rd_pg_rn_e0 ### SVE2 floating-point convert to integer FLOGB 01100101 00 011 esz:2 0101 pg:3 rn:5 rd:5 &rpr_esz + +### SVE2 floating-point multiply-add long (vectors) +FMLALB_zzzw 01100100 .. 1 ..... 10 0 00 0 ..... ..... @rda_rn_rm +FMLALT_zzzw 01100100 .. 1 ..... 10 0 00 1 ..... ..... @rda_rn_rm +FMLSLB_zzzw 01100100 .. 1 ..... 10 1 00 0 ..... ..... @rda_rn_rm +FMLSLT_zzzw 01100100 .. 1 ..... 10 1 00 1 ..... ..... @rda_rn_rm + +### SVE2 floating-point multiply-add long (indexed) +FMLALB_zzxw 01100100 .. 1 ..... 0100.0 ..... ..... @rrxw_s +FMLALT_zzxw 01100100 .. 1 ..... 0100.1 ..... ..... @rrxw_s +FMLSLB_zzxw 01100100 .. 1 ..... 0110.0 ..... ..... @rrxw_s +FMLSLT_zzxw 01100100 .. 1 ..... 0110.1 ..... ..... @rrxw_s diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index acd6d53724..92d1297a5c 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -8523,3 +8523,78 @@ static bool trans_FLOGB(DisasContext *s, arg_rpr_esz *a) } return true; } + +static bool do_FMLAL_zzzw(DisasContext *s, arg_rrrr_esz *a, bool sub, bool sel) +{ + if (a->esz != MO_32 || !dc_isar_feature(aa64_sve2, s)) { + return false; + } + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + tcg_gen_gvec_4_ptr(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + vec_full_reg_offset(s, a->ra), + cpu_env, vsz, vsz, (sel << 1) | sub, + gen_helper_sve2_fmlal_zzzw_s); + } + return true; +} + +static bool trans_FMLALB_zzzw(DisasContext *s, arg_rrrr_esz *a) +{ + return do_FMLAL_zzzw(s, a, false, false); +} + +static bool trans_FMLALT_zzzw(DisasContext *s, arg_rrrr_esz *a) +{ + return do_FMLAL_zzzw(s, a, false, true); +} + +static bool trans_FMLSLB_zzzw(DisasContext *s, arg_rrrr_esz *a) +{ + return do_FMLAL_zzzw(s, a, true, false); +} + +static bool trans_FMLSLT_zzzw(DisasContext *s, arg_rrrr_esz *a) +{ + return do_FMLAL_zzzw(s, a, true, true); +} + +static bool do_FMLAL_zzxw(DisasContext *s, arg_rrxr_esz *a, bool sub, bool sel) +{ + if (a->esz != MO_32 || !dc_isar_feature(aa64_sve2, s)) { + return false; + } + if (sve_access_check(s)) { + unsigned vsz = vec_full_reg_size(s); + tcg_gen_gvec_4_ptr(vec_full_reg_offset(s, a->rd), + vec_full_reg_offset(s, a->rn), + vec_full_reg_offset(s, a->rm), + vec_full_reg_offset(s, a->ra), + cpu_env, vsz, vsz, + (a->index << 2) | (sel << 1) | sub, + gen_helper_sve2_fmlal_zzxw_s); + } + return true; +} + +static bool trans_FMLALB_zzxw(DisasContext *s, arg_rrxr_esz *a) +{ + return do_FMLAL_zzxw(s, a, false, false); +} + +static bool trans_FMLALT_zzxw(DisasContext *s, arg_rrxr_esz *a) +{ + return do_FMLAL_zzxw(s, a, false, true); +} + +static bool trans_FMLSLB_zzxw(DisasContext *s, arg_rrxr_esz *a) +{ + return do_FMLAL_zzxw(s, a, true, false); +} + +static bool trans_FMLSLT_zzxw(DisasContext *s, arg_rrxr_esz *a) +{ + return do_FMLAL_zzxw(s, a, true, true); +} diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c index c8d1656797..74874ae3c1 100644 --- a/target/arm/vec_helper.c +++ b/target/arm/vec_helper.c @@ -29,10 +29,14 @@ so addressing units smaller than that needs a host-endian fixup. */ #ifdef HOST_WORDS_BIGENDIAN #define H1(x) ((x) ^ 7) +#define H1_2(x) ((x) ^ 6) +#define H1_4(x) ((x) ^ 4) #define H2(x) ((x) ^ 3) #define H4(x) ((x) ^ 1) #else #define H1(x) (x) +#define H1_2(x) (x) +#define H1_4(x) (x) #define H2(x) (x) #define H4(x) (x) #endif @@ -1905,6 +1909,27 @@ void HELPER(gvec_fmlal_a64)(void *vd, void *vn, void *vm, get_flush_inputs_to_zero(&env->vfp.fp_status_f16)); } +void HELPER(sve2_fmlal_zzzw_s)(void *vd, void *vn, void *vm, void *va, + void *venv, uint32_t desc) +{ + intptr_t i, oprsz = simd_oprsz(desc); + uint16_t negn = extract32(desc, SIMD_DATA_SHIFT, 1) << 15; + intptr_t sel = extract32(desc, SIMD_DATA_SHIFT + 1, 1) * sizeof(float16); + CPUARMState *env = venv; + float_status *status = &env->vfp.fp_status; + bool fz16 = get_flush_inputs_to_zero(&env->vfp.fp_status_f16); + + for (i = 0; i < oprsz; i += sizeof(float32)) { + float16 nn_16 = *(float16 *)(vn + H1_2(i + sel)) ^ negn; + float16 mm_16 = *(float16 *)(vm + H1_2(i + sel)); + float32 nn = float16_to_float32_by_bits(nn_16, fz16); + float32 mm = float16_to_float32_by_bits(mm_16, fz16); + float32 aa = *(float32 *)(va + H1_4(i)); + + *(float32 *)(vd + H1_4(i)) = float32_muladd(nn, mm, aa, 0, status); + } +} + static void do_fmlal_idx(float32 *d, void *vn, void *vm, float_status *fpst, uint32_t desc, bool fz16) { @@ -1949,6 +1974,32 @@ void HELPER(gvec_fmlal_idx_a64)(void *vd, void *vn, void *vm, get_flush_inputs_to_zero(&env->vfp.fp_status_f16)); } +void HELPER(sve2_fmlal_zzxw_s)(void *vd, void *vn, void *vm, void *va, + void *venv, uint32_t desc) +{ + intptr_t i, j, oprsz = simd_oprsz(desc); + uint16_t negn = extract32(desc, SIMD_DATA_SHIFT, 1) << 15; + intptr_t sel = extract32(desc, SIMD_DATA_SHIFT + 1, 1) * sizeof(float16); + intptr_t idx = extract32(desc, SIMD_DATA_SHIFT + 2, 3) * sizeof(float16); + CPUARMState *env = venv; + float_status *status = &env->vfp.fp_status; + bool fz16 = get_flush_inputs_to_zero(&env->vfp.fp_status_f16); + + for (i = 0; i < oprsz; i += 16) { + float16 mm_16 = *(float16 *)(vm + i + idx); + float32 mm = float16_to_float32_by_bits(mm_16, fz16); + + for (j = 0; j < 16; j += sizeof(float32)) { + float16 nn_16 = *(float16 *)(vn + H1_2(i + j + sel)) ^ negn; + float32 nn = float16_to_float32_by_bits(nn_16, fz16); + float32 aa = *(float32 *)(va + H1_4(i + j)); + + *(float32 *)(vd + H1_4(i + j)) = + float32_muladd(nn, mm, aa, 0, status); + } + } +} + void HELPER(gvec_sshl_b)(void *vd, void *vn, void *vm, uint32_t desc) { intptr_t i, opr_sz = simd_oprsz(desc); From patchwork Fri Sep 18 18:37:50 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 273319 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 05B9BC43464 for ; Fri, 18 Sep 2020 20:04:58 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 2E11322208 for ; Fri, 18 Sep 2020 20:04:57 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="Y4e12Sbu" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 2E11322208 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:39532 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJMcp-0004JJ-VA for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 16:04:55 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:49138) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLIU-0001QW-Rn for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:39:56 -0400 Received: from mail-pf1-x442.google.com ([2607:f8b0:4864:20::442]:37368) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLIO-0007K6-H1 for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:39:50 -0400 Received: by mail-pf1-x442.google.com with SMTP id w7so3996317pfi.4 for ; Fri, 18 Sep 2020 11:39:41 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=dgqa26KZVdKCtc03FsTaVbtOnIm76ZLnoLwFdbk/1bU=; b=Y4e12SbuIi8Jn/JBd/8XyYpSJ6yTFkJO6choFAuLQ7uC8u9NtV5ySFs2S1nDwZ3wSV nqv2Y49zRlWKmn9hCYol1ehE+KWS7z3QbgArJXM6F4blO36wMKQMlFr6OLMEUkpnXr4O HV5pBggmVtk1baLSPHONBQ+SCKJgYmSfScoHjZvbS8HWt5G56byqtcfMYO0aa/y5zCP/ 4wv0iZ+bYZ+MiyfOXVoRVNgsWgju+t5wZqNSq3FaGeYmN7mL4XaIcgzPBDoIhaiFbi4j 1ygQEiOZTg6RM3voYigPB31tcsxysIxZ/Wn12cZ7c6WNFzGo1RUHr3zO8xh3dWoaezb9 9qvg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=dgqa26KZVdKCtc03FsTaVbtOnIm76ZLnoLwFdbk/1bU=; b=qv+qqmDxc/bguw43c7FFi0QUaDyqRFqGEe78RmmNAvOJCmGlnCZWt+gwEkCJlyqJyf 8LVUSbcNDFQmPmCx30zJUyX9d6fhxwqKb9qnd3CMwQRD0u/rfKmkGXCv9ag/gFWQEAVy a4TTBauiM59k5rw/836Fg20iGtsh6/J8O7dMf5tUMPU+gUcGO+4+RTiVdqqtpjAfzvLO SFS+CaT0v2+Qf61XNLIe51M1edsZsVcc4HD0sWyztwVB/jk5LjcQYAO0bfv3m0nWAsKW P2/0nUpjCL33wavZskOKIhX8qPjqiiEVca5rFasAT1pd081rYEqq6EoD+ld7QdObCA88 IUPQ== X-Gm-Message-State: AOAM532QeHc3+3hZ2ezOr73dCLOjQfOdXst4xzpOXQTyo80qNyWp8vur AlohhpQkppzuaYw+XKmOqAdlo185hNDjGw== X-Google-Smtp-Source: ABdhPJyR7zPMTurcuPHoZLWYvSlnVkKPjXV/Q8wT+AEUsxD24crWalceBiYTe9sO2cG6lKeoAltvcg== X-Received: by 2002:a62:503:0:b029:13e:d13d:a0f9 with SMTP id 3-20020a6205030000b029013ed13da0f9mr32930391pff.21.1600454379946; Fri, 18 Sep 2020 11:39:39 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.39.38 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:39:39 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 80/81] target/arm: Implement SVE2 complex integer dot product Date: Fri, 18 Sep 2020 11:37:50 -0700 Message-Id: <20200918183751.2787647-81-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::442; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x442.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/helper-sve.h | 10 ++++ target/arm/sve.decode | 9 ++++ target/arm/sve_helper.c | 99 ++++++++++++++++++++++++++++++++++++++ target/arm/translate-sve.c | 17 +++++++ 4 files changed, 135 insertions(+) diff --git a/target/arm/helper-sve.h b/target/arm/helper-sve.h index 9542f01c42..4af5e1a5ce 100644 --- a/target/arm/helper-sve.h +++ b/target/arm/helper-sve.h @@ -2784,3 +2784,13 @@ DEF_HELPER_FLAGS_5(sve2_fcvtlt_sd, TCG_CALL_NO_RWG, DEF_HELPER_FLAGS_5(flogb_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(flogb_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) DEF_HELPER_FLAGS_5(flogb_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_cdot_zzzz_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_cdot_zzzz_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) + +DEF_HELPER_FLAGS_5(sve2_cdot_idx_s, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) +DEF_HELPER_FLAGS_5(sve2_cdot_idx_d, TCG_CALL_NO_RWG, + void, ptr, ptr, ptr, ptr, i32) diff --git a/target/arm/sve.decode b/target/arm/sve.decode index 6708c048e0..84232ff9e5 100644 --- a/target/arm/sve.decode +++ b/target/arm/sve.decode @@ -811,6 +811,9 @@ MUL_zzi 00100101 .. 110 000 110 ........ ..... @rdn_i8s DOT_zzzz 01000100 1 sz:1 0 rm:5 00000 u:1 rn:5 rd:5 \ ra=%reg_movprfx +# SVE2 complex dot product (vectors) +CDOT_zzzz 01000100 esz:2 0 rm:5 0001 rot:2 rn:5 rd:5 ra=%reg_movprfx + #### SVE Multiply - Indexed # SVE integer dot product (indexed) @@ -849,6 +852,12 @@ SQDMLSLB_zzxw_d 01000100 .. 1 ..... 0011.0 ..... ..... @rrxw_d SQDMLSLT_zzxw_s 01000100 .. 1 ..... 0011.1 ..... ..... @rrxw_s SQDMLSLT_zzxw_d 01000100 .. 1 ..... 0011.1 ..... ..... @rrxw_d +# SVE2 complex integer dot product (indexed) +CDOT_zzxw_s 01000100 10 1 index:2 rm:3 0100 rot:2 rn:5 rd:5 \ + ra=%reg_movprfx +CDOT_zzxw_d 01000100 11 1 index:1 rm:4 0100 rot:2 rn:5 rd:5 \ + ra=%reg_movprfx + # SVE2 complex integer multiply-add (indexed) CMLA_zzxz_h 01000100 10 1 index:2 rm:3 0110 rot:2 rn:5 rd:5 \ ra=%reg_movprfx diff --git a/target/arm/sve_helper.c b/target/arm/sve_helper.c index ef04a0f95a..0ff5e0baf8 100644 --- a/target/arm/sve_helper.c +++ b/target/arm/sve_helper.c @@ -1592,6 +1592,105 @@ void HELPER(sve2_sqrdcmlah_idx_s)(void *vd, void *vn, void *vm, do_cmla_idx_s(vd, vn, vm, va, desc, do_sqrdcmlah_s); } +/* Note N and M are 4 elements bundled into one unit. */ +static int32_t do_cdot_s(uint32_t n, uint32_t m, int32_t a, + int sel_a, int sel_b, int sub_i) +{ + for (int i = 0; i <= 1; i++) { + int32_t elt1_r = (int8_t)(n >> (16 * i)); + int32_t elt1_i = (int8_t)(n >> (16 * i + 8)); + int32_t elt2_a = (int8_t)(m >> (16 * i + 8 * sel_a)); + int32_t elt2_b = (int8_t)(m >> (16 * i + 8 * sel_b)); + + a += elt1_r * elt2_a + elt1_i * elt2_b * sub_i; + } + return a; +} + +static int64_t do_cdot_d(uint64_t n, uint64_t m, int64_t a, + int sel_a, int sel_b, int sub_i) +{ + for (int i = 0; i <= 1; i++) { + int64_t elt1_r = (int16_t)(n >> (32 * i + 0)); + int64_t elt1_i = (int16_t)(n >> (32 * i + 16)); + int64_t elt2_a = (int16_t)(m >> (32 * i + 16 * sel_a)); + int64_t elt2_b = (int16_t)(m >> (32 * i + 16 * sel_b)); + + a += elt1_r * elt2_a + elt1_i * elt2_b * sub_i; + } + return a; +} + +void HELPER(sve2_cdot_zzzz_s)(void *vd, void *vn, void *vm, + void *va, uint32_t desc) +{ + int opr_sz = simd_oprsz(desc); + int rot = simd_data(desc); + int sel_a = rot & 1; + int sel_b = sel_a ^ 1; + int sub_i = (rot == 0 || rot == 3 ? -1 : 1); + uint32_t *d = vd, *n = vn, *m = vm, *a = va; + + for (int e = 0; e < opr_sz / 4; e++) { + d[e] = do_cdot_s(n[e], m[e], a[e], sel_a, sel_b, sub_i); + } +} + +void HELPER(sve2_cdot_zzzz_d)(void *vd, void *vn, void *vm, + void *va, uint32_t desc) +{ + int opr_sz = simd_oprsz(desc); + int rot = simd_data(desc); + int sel_a = rot & 1; + int sel_b = sel_a ^ 1; + int sub_i = (rot == 0 || rot == 3 ? -1 : 1); + uint64_t *d = vd, *n = vn, *m = vm, *a = va; + + for (int e = 0; e < opr_sz / 8; e++) { + d[e] = do_cdot_d(n[e], m[e], a[e], sel_a, sel_b, sub_i); + } +} + +void HELPER(sve2_cdot_idx_s)(void *vd, void *vn, void *vm, + void *va, uint32_t desc) +{ + int opr_sz = simd_oprsz(desc); + int rot = extract32(desc, SIMD_DATA_SHIFT, 2); + int idx = H4(extract32(desc, SIMD_DATA_SHIFT + 2, 2)); + int sel_a = rot & 1; + int sel_b = sel_a ^ 1; + int sub_i = (rot == 0 || rot == 3 ? -1 : 1); + uint32_t *d = vd, *n = vn, *m = vm, *a = va; + + for (int seg = 0; seg < opr_sz / 4; seg += 4) { + uint32_t seg_m = m[seg + idx]; + for (int e = 0; e < 4; e++) { + d[seg + e] = do_cdot_s(n[seg + e], seg_m, a[seg + e], + sel_a, sel_b, sub_i); + } + } +} + +void HELPER(sve2_cdot_idx_d)(void *vd, void *vn, void *vm, + void *va, uint32_t desc) +{ + int seg, opr_sz = simd_oprsz(desc); + int rot = extract32(desc, SIMD_DATA_SHIFT, 2); + int idx = extract32(desc, SIMD_DATA_SHIFT + 2, 2); + int sel_a = rot & 1; + int sel_b = sel_a ^ 1; + int sub_i = (rot == 0 || rot == 3 ? -1 : 1); + uint64_t *d = vd, *n = vn, *m = vm, *a = va; + + for (seg = 0; seg < opr_sz / 8; seg += 2) { + uint64_t seg_m = m[seg + idx]; + for (int e = 0; e < 2; e++) { + d[seg + e] = do_cdot_d(n[seg + e], seg_m, a[seg + e], + sel_a, sel_b, sub_i); + } + } +} + #define DO_ZZXZ(NAME, TYPE, H, OP) \ void HELPER(NAME)(void *vd, void *vn, void *vm, void *va, uint32_t desc) \ { \ diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c index 92d1297a5c..4893968cdb 100644 --- a/target/arm/translate-sve.c +++ b/target/arm/translate-sve.c @@ -4151,6 +4151,9 @@ DO_SVE2_RRXR_ROT(CMLA_zzxz_s, gen_helper_sve2_cmla_idx_s) DO_SVE2_RRXR_ROT(SQRDCMLAH_zzxz_h, gen_helper_sve2_sqrdcmlah_idx_h) DO_SVE2_RRXR_ROT(SQRDCMLAH_zzxz_s, gen_helper_sve2_sqrdcmlah_idx_s) +DO_SVE2_RRXR_ROT(CDOT_zzxw_s, gen_helper_sve2_cdot_idx_s) +DO_SVE2_RRXR_ROT(CDOT_zzxw_d, gen_helper_sve2_cdot_idx_d) + #undef DO_SVE2_RRXR_ROT /* @@ -8355,6 +8358,20 @@ static bool trans_CMLA_zzzz(DisasContext *s, arg_CMLA_zzzz *a) return true; } +static bool trans_CDOT_zzzz(DisasContext *s, arg_CMLA_zzzz *a) +{ + if (!dc_isar_feature(aa64_sve2, s) || a->esz < MO_32) { + return false; + } + if (sve_access_check(s)) { + gen_helper_gvec_4 *fn = (a->esz == MO_32 + ? gen_helper_sve2_cdot_zzzz_s + : gen_helper_sve2_cdot_zzzz_d); + gen_gvec_ool_zzzz(s, fn, a->rd, a->rn, a->rm, a->ra, a->rot); + } + return true; +} + static bool trans_SQRDCMLAH_zzzz(DisasContext *s, arg_SQRDCMLAH_zzzz *a) { static gen_helper_gvec_4 * const fns[] = { From patchwork Fri Sep 18 18:37:51 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 273332 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3DCC6C43463 for ; Fri, 18 Sep 2020 19:31:07 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id C12B021734 for ; Fri, 18 Sep 2020 19:31:06 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=linaro.org header.i=@linaro.org header.b="ZMAnrja1" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org C12B021734 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linaro.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:51642 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kJM65-0006HA-Qj for qemu-devel@archiver.kernel.org; Fri, 18 Sep 2020 15:31:05 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:49152) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kJLIa-0001Qq-Id for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:39:56 -0400 Received: from mail-pl1-x642.google.com ([2607:f8b0:4864:20::642]:44932) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1kJLIO-0007KD-Io for qemu-devel@nongnu.org; Fri, 18 Sep 2020 14:39:51 -0400 Received: by mail-pl1-x642.google.com with SMTP id j7so3419852plk.11 for ; Fri, 18 Sep 2020 11:39:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=4XHIUjh2oBWbEwnuE02Cz+2r9FHl4yCsiyFloX+fVF0=; b=ZMAnrja1m8Y6VXSQfSzs3oxknICpQ42HabfUfVxgFNQUU3n5tXw/LDNpLMHc3KrYFk +edzLvrDZBi1rr5G9tdZN0eNnihTfHss0FMHopb0W5zmatsHZFE9Ag1VZnXjUNC1n2ae mTMNql/rx2aMd2r2tBL6MQD3kpNcCAkUy/gkRB8bnZTti2RfJ8i6I++MtvJ4Kvc5l1nz QlUkHDG1nrK+a9dN2K4mGPq1f6YcSQLcVnwy0buqeDDTGSj90hbqN6wKqQZAwFHS2hqo PggcLPLxrTbsI58IytNc307lzZ9/M96zOHMKQAFWjbJQ7LRMR5v+gtd8bCZVk1Gwjj1J 03Tg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=4XHIUjh2oBWbEwnuE02Cz+2r9FHl4yCsiyFloX+fVF0=; b=cF7egUYRPHmF+5IEO8dIW9qmOqRcxuLX5fP/AXJq4KaLgErq/0nzGnc/cx2yiQnMOF rcg3uSj1e5J6dov53T1yC7x0Rk48vHDS/7gUxKUauRtt+nHszs38APuFV6Ru+ACo61K1 ZNGJxMUZcflJ3jfmgH7RakzqD4tXdndPzxJbkDmIKKiJNm+VWQ5mlJpBrO8XcoqkAKqt E7w7/DVWrTbsojH8ps/Nw9xsNKolTpYMbOzhC2qpYF2h5NDAuLABBNNJ05TdFNLD/5hA 4sbkGKvxQ/KGNjNxXSRMrCdwg4tZLu3tSNwwQ5i18ZxXxyc4yXSyzaVrwVyP7l4udOXJ zLOg== X-Gm-Message-State: AOAM532pa73Jwcs84vGFsF/ZBpjWNRFXh05hlNwUhKwEi2KCzNpI5QfD CcjBaPJbl6nyf5VErwUo7HfSynNrgVyBvQ== X-Google-Smtp-Source: ABdhPJyzD7nhYr+TLIhKUr8VubcHX0HcbTUWRpdVMMrl7WfObprLdSwK4Qg0aL96gl16UDhpXLJzvg== X-Received: by 2002:a17:902:aa85:b029:d0:cbe1:e70d with SMTP id d5-20020a170902aa85b02900d0cbe1e70dmr34662636plr.27.1600454381254; Fri, 18 Sep 2020 11:39:41 -0700 (PDT) Received: from localhost.localdomain ([71.212.141.89]) by smtp.gmail.com with ESMTPSA id f4sm3680723pfj.147.2020.09.18.11.39.40 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Sep 2020 11:39:40 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH v3 81/81] target/arm: Enable SVE2 and some extensions Date: Fri, 18 Sep 2020 11:37:51 -0700 Message-Id: <20200918183751.2787647-82-richard.henderson@linaro.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200918183751.2787647-1-richard.henderson@linaro.org> References: <20200918183751.2787647-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::642; envelope-from=richard.henderson@linaro.org; helo=mail-pl1-x642.google.com X-detected-operating-system: by eggs.gnu.org: No matching host in p0f cache. That's all we know. X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, alex.bennee@linaro.org Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: Richard Henderson --- target/arm/cpu64.c | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c index 3c2b3d9599..46b8a3908c 100644 --- a/target/arm/cpu64.c +++ b/target/arm/cpu64.c @@ -667,6 +667,17 @@ static void aarch64_max_initfn(Object *obj) t = FIELD_DP64(t, ID_AA64MMFR2, CNP, 1); /* TTCNP */ cpu->isar.id_aa64mmfr2 = t; + t = cpu->isar.id_aa64zfr0; + t = FIELD_DP64(t, ID_AA64ZFR0, SVEVER, 1); + t = FIELD_DP64(t, ID_AA64ZFR0, AES, 2); /* PMULL */ + t = FIELD_DP64(t, ID_AA64ZFR0, BITPERM, 1); + t = FIELD_DP64(t, ID_AA64ZFR0, SHA3, 1); + t = FIELD_DP64(t, ID_AA64ZFR0, SM4, 1); + t = FIELD_DP64(t, ID_AA64ZFR0, I8MM, 1); + t = FIELD_DP64(t, ID_AA64ZFR0, F32MM, 1); + t = FIELD_DP64(t, ID_AA64ZFR0, F64MM, 1); + cpu->isar.id_aa64zfr0 = t; + /* Replicate the same data to the 32-bit id registers. */ u = cpu->isar.id_isar5; u = FIELD_DP32(u, ID_ISAR5, AES, 2); /* AES + PMULL */