From patchwork Sat Feb 1 16:39:50 2025
X-Patchwork-Submitter: Peter Maydell
X-Patchwork-Id: 861274
From: Peter Maydell
To: qemu-arm@nongnu.org, qemu-devel@nongnu.org
Subject: [PATCH v2 47/69] target/arm: Handle FPCR.AH in FCMLA by index
Date: Sat, 1 Feb 2025 16:39:50 +0000
Message-Id: <20250201164012.1660228-48-peter.maydell@linaro.org>
In-Reply-To: <20250201164012.1660228-1-peter.maydell@linaro.org>
References: <20250201164012.1660228-1-peter.maydell@linaro.org>

From: Richard Henderson

The negation step in FCMLA by index mustn't negate a NaN when
FPCR.AH is set. Use the same approach as vector FCMLA of passing in
FPCR.AH and using it to select whether to negate by XOR or by the
muladd negate_product flag.

Signed-off-by: Richard Henderson
Message-id: 20250129013857.135256-27-richard.henderson@linaro.org
[PMM: Expanded commit message]
Reviewed-by: Peter Maydell
Signed-off-by: Peter Maydell
---
 target/arm/tcg/translate-a64.c |  2 +-
 target/arm/tcg/vec_helper.c    | 44 ++++++++++++++++++++--------------
 2 files changed, 27 insertions(+), 19 deletions(-)

diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index c45a9822281..e8eab1eabdc 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -6927,7 +6927,7 @@ static bool trans_FCMLA_vi(DisasContext *s, arg_FCMLA_vi *a)
     if (fp_access_check(s)) {
         gen_gvec_op4_fpst(s, a->q, a->rd, a->rn, a->rm, a->rd,
                           a->esz == MO_16 ? FPST_A64_F16 : FPST_A64,
-                          (a->idx << 2) | a->rot, fn);
+                          (s->fpcr_ah << 4) | (a->idx << 2) | a->rot, fn);
     }
     return true;
 }
diff --git a/target/arm/tcg/vec_helper.c b/target/arm/tcg/vec_helper.c
index 630513f00b2..c2f98a5c67e 100644
--- a/target/arm/tcg/vec_helper.c
+++ b/target/arm/tcg/vec_helper.c
@@ -995,29 +995,33 @@ void HELPER(gvec_fcmlah_idx)(void *vd, void *vn, void *vm, void *va,
     uintptr_t opr_sz = simd_oprsz(desc);
     float16 *d = vd, *n = vn, *m = vm, *a = va;
     intptr_t flip = extract32(desc, SIMD_DATA_SHIFT, 1);
-    uint32_t neg_imag = extract32(desc, SIMD_DATA_SHIFT + 1, 1);
+    uint32_t negf_imag = extract32(desc, SIMD_DATA_SHIFT + 1, 1);
     intptr_t index = extract32(desc, SIMD_DATA_SHIFT + 2, 2);
-    uint32_t neg_real = flip ^ neg_imag;
+    uint32_t fpcr_ah = extract32(desc, SIMD_DATA_SHIFT + 4, 1);
+    uint32_t negf_real = flip ^ negf_imag;
     intptr_t elements = opr_sz / sizeof(float16);
     intptr_t eltspersegment = MIN(16 / sizeof(float16), elements);
+    float16 negx_imag, negx_real;
     intptr_t i, j;
 
-    /* Shift boolean to the sign bit so we can xor to negate. */
-    neg_real <<= 15;
-    neg_imag <<= 15;
+    /* With AH=0, use negx; with AH=1 use negf. */
+    negx_real = (negf_real & ~fpcr_ah) << 15;
+    negx_imag = (negf_imag & ~fpcr_ah) << 15;
+    negf_real = (negf_real & fpcr_ah ? float_muladd_negate_product : 0);
+    negf_imag = (negf_imag & fpcr_ah ? float_muladd_negate_product : 0);
 
     for (i = 0; i < elements; i += eltspersegment) {
         float16 mr = m[H2(i + 2 * index + 0)];
         float16 mi = m[H2(i + 2 * index + 1)];
-        float16 e1 = neg_real ^ (flip ? mi : mr);
-        float16 e3 = neg_imag ^ (flip ? mr : mi);
+        float16 e1 = negx_real ^ (flip ? mi : mr);
+        float16 e3 = negx_imag ^ (flip ? mr : mi);
 
         for (j = i; j < i + eltspersegment; j += 2) {
             float16 e2 = n[H2(j + flip)];
             float16 e4 = e2;
 
-            d[H2(j)] = float16_muladd(e2, e1, a[H2(j)], 0, fpst);
-            d[H2(j + 1)] = float16_muladd(e4, e3, a[H2(j + 1)], 0, fpst);
+            d[H2(j)] = float16_muladd(e2, e1, a[H2(j)], negf_real, fpst);
+            d[H2(j + 1)] = float16_muladd(e4, e3, a[H2(j + 1)], negf_imag, fpst);
         }
     }
     clear_tail(d, opr_sz, simd_maxsz(desc));
@@ -1059,29 +1063,33 @@ void HELPER(gvec_fcmlas_idx)(void *vd, void *vn, void *vm, void *va,
     uintptr_t opr_sz = simd_oprsz(desc);
     float32 *d = vd, *n = vn, *m = vm, *a = va;
     intptr_t flip = extract32(desc, SIMD_DATA_SHIFT, 1);
-    uint32_t neg_imag = extract32(desc, SIMD_DATA_SHIFT + 1, 1);
+    uint32_t negf_imag = extract32(desc, SIMD_DATA_SHIFT + 1, 1);
     intptr_t index = extract32(desc, SIMD_DATA_SHIFT + 2, 2);
-    uint32_t neg_real = flip ^ neg_imag;
+    uint32_t fpcr_ah = extract32(desc, SIMD_DATA_SHIFT + 4, 1);
+    uint32_t negf_real = flip ^ negf_imag;
     intptr_t elements = opr_sz / sizeof(float32);
     intptr_t eltspersegment = MIN(16 / sizeof(float32), elements);
+    float32 negx_imag, negx_real;
     intptr_t i, j;
 
-    /* Shift boolean to the sign bit so we can xor to negate. */
-    neg_real <<= 31;
-    neg_imag <<= 31;
+    /* With AH=0, use negx; with AH=1 use negf. */
+    negx_real = (negf_real & ~fpcr_ah) << 31;
+    negx_imag = (negf_imag & ~fpcr_ah) << 31;
+    negf_real = (negf_real & fpcr_ah ? float_muladd_negate_product : 0);
+    negf_imag = (negf_imag & fpcr_ah ? float_muladd_negate_product : 0);
 
     for (i = 0; i < elements; i += eltspersegment) {
         float32 mr = m[H4(i + 2 * index + 0)];
         float32 mi = m[H4(i + 2 * index + 1)];
-        float32 e1 = neg_real ^ (flip ? mi : mr);
-        float32 e3 = neg_imag ^ (flip ? mr : mi);
+        float32 e1 = negx_real ^ (flip ? mi : mr);
+        float32 e3 = negx_imag ^ (flip ? mr : mi);
 
         for (j = i; j < i + eltspersegment; j += 2) {
             float32 e2 = n[H4(j + flip)];
             float32 e4 = e2;
 
-            d[H4(j)] = float32_muladd(e2, e1, a[H4(j)], 0, fpst);
-            d[H4(j + 1)] = float32_muladd(e4, e3, a[H4(j + 1)], 0, fpst);
+            d[H4(j)] = float32_muladd(e2, e1, a[H4(j)], negf_real, fpst);
+            d[H4(j + 1)] = float32_muladd(e4, e3, a[H4(j + 1)], negf_imag, fpst);
         }
     }
     clear_tail(d, opr_sz, simd_maxsz(desc));
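
Illustration only, not part of the patch: the choice between negating by
XOR and negating via the muladd flag can be modelled in isolation on raw
float32 bit patterns. This is a minimal sketch, assuming single-bit
inputs as in the helpers. NegPlan, plan_negation, apply_negx and
SKETCH_NEGATE_PRODUCT are hypothetical names made up for the example;
SKETCH_NEGATE_PRODUCT merely stands in for float_muladd_negate_product
and its value is arbitrary.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for QEMU's float_muladd_negate_product flag. */
enum { SKETCH_NEGATE_PRODUCT = 1 };

/* How one negation request is carried out: XOR mask or muladd flag. */
typedef struct {
    uint32_t negx;  /* XORed into the float32 multiplicand bits */
    int negf;       /* flags value that would be handed to the muladd */
} NegPlan;

/* neg and fpcr_ah are single bits (0 or 1), as extracted from desc. */
static NegPlan plan_negation(uint32_t neg, uint32_t fpcr_ah)
{
    NegPlan p;

    assert(neg <= 1 && fpcr_ah <= 1);
    /* AH=0: flip the operand's sign bit, whether or not it is a NaN. */
    p.negx = (neg & ~fpcr_ah) << 31;
    /* AH=1: leave the operand alone and request product negation. */
    p.negf = (neg & fpcr_ah) ? SKETCH_NEGATE_PRODUCT : 0;
    return p;
}

/* Apply the XOR half of the plan to a raw float32 bit pattern. */
static uint32_t apply_negx(uint32_t f32_bits, NegPlan p)
{
    return f32_bits ^ p.negx;
}
```

With neg=1 and AH=0 the plan is a sign-bit mask, so a quiet NaN such as
0x7fc00000 reaches the muladd as 0xffc00000. With neg=1 and AH=1 the
mask is zero and the same NaN is passed through unchanged, with the
negation request carried in the flags instead.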