From patchwork Fri Jan 24 16:27:43 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Maydell X-Patchwork-Id: 859578 Delivered-To: patch@linaro.org Received: by 2002:a05:6000:18c8:b0:385:e875:8a9e with SMTP id w8csp882617wrq; Fri, 24 Jan 2025 08:34:24 -0800 (PST) X-Forwarded-Encrypted: i=2; AJvYcCWLodcahljgsp3rBtiBcv112ohGT1ifaeWcPvx0D6/8z4YHqDoFi3BwuyGNOCMpQiVq8HDKZg==@linaro.org X-Google-Smtp-Source: AGHT+IFdniSCFBdgc+iMcGjBisCpI3sYagzkKH9fEsR/JQ9Ql6SX26Rc3+R4X/SA2D3uyvLysbwU X-Received: by 2002:a05:6102:3b86:b0:4b2:bc6b:c73c with SMTP id ada2fe7eead31-4b709bc9ff8mr3659292137.6.1737736464700; Fri, 24 Jan 2025 08:34:24 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1737736464; cv=none; d=google.com; s=arc-20240605; b=fqY4kEqIjaQh7L8zIg1qdlfppPpX2tlH+KbIJ7Gxp2xhU/MPEkxPkGJnAcKl2KjDw2 PJ5Abe/Ntwj8j6UQz/v/ojeKmbzw6DkfIVDF/zJQGTpRg13cL1gEon7JPds4+vx3KWvJ KgOsUfEnWUyyULJIwsjYNc9ViiMiE20p/8AbuWJcDP6UNtWKnAcRBk601+FMLzsy2fTh xB+MfHNIJ2mgyNQpPdaw2qWsoPHkQbg4DNyuLxI+BOs5jtadU2RmBXX3T9rdR8kGJ8KN CTJ/qNMkyOkSZOjKhfNWe/qT9eaupwPo75eh7B400ZkLFGlIPt/MrtryV/gH6Eqm7HPZ fyAA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20240605; h=sender:errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:to:from :dkim-signature; bh=9vOW3vjTeJp39sy4OsCwvMlVmTE6vsBeLOiYEKAZ7/E=; fh=vAjDKhWG2H0thFNX+75au94WBnSp3B6z+NwN13Igj74=; b=UoVacZgfoGjQPX0Op8U0YuytR1jbmTmFcBgFpUdhfDctdGP5iBhKRk8iXOj9B+10kH 7WCowiYMX/Iu55lPo0wa6E/a4aKUeX0/jw2BmnYP/WTQC6w7h9RT5pOqesKayAK6AEDE RO6cEiw7AjyFX1cVXxMWsBnFfSNLVNDZEQ0zR5Hiq6OdpUdaz9liCWYC7uqAfa9vlz7L kVj4svpau8zYj1+obcAKsNCB5qP6SKv2hdtkUlTa3GMtoNS/7MRJcg/qhsOUogbahoy4 QLJUoCf/CWujfQWvAxn4E1MoSGXHLJ1CnLawVNKeU4aUdk78zsjRyNx5lAihWQktY4f8 MPsQ==; dara=google.com ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=qwHXbkY8; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org; dara=neutral header.i=@linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [209.51.188.17]) by mx.google.com with ESMTPS id ada2fe7eead31-4b7099fb5e8si858985137.531.2025.01.24.08.34.24 for (version=TLS1_2 cipher=ECDHE-ECDSA-CHACHA20-POLY1305 bits=256/256); Fri, 24 Jan 2025 08:34:24 -0800 (PST) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; Authentication-Results: mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=qwHXbkY8; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org; dara=neutral header.i=@linaro.org Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1tbMal-00089v-S4; Fri, 24 Jan 2025 11:31:37 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tbMYS-0003xi-IW for qemu-devel@nongnu.org; Fri, 24 Jan 2025 11:29:15 -0500 Received: from mail-wm1-x32f.google.com ([2a00:1450:4864:20::32f]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1tbMYO-0005WC-E4 for qemu-devel@nongnu.org; Fri, 24 Jan 2025 11:29:11 -0500 Received: by mail-wm1-x32f.google.com with SMTP id 5b1f17b1804b1-4361f664af5so25670765e9.1 for ; Fri, 24 Jan 2025 08:29:08 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; t=1737736147; x=1738340947; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=9vOW3vjTeJp39sy4OsCwvMlVmTE6vsBeLOiYEKAZ7/E=; b=qwHXbkY8I90WENTXp1qepXef0a386Ob8O1pZN/BVLnqC9byZEFTl9yd/A3Lye6CeUS ZVM5dXWlwaLDXpaXeZimiZhhFJBMHNLxHPYdJHdijqXPnaDhSeSS/ejk305IHf8KtEgM PSBSlhOqXkoUmIx2tsOtIQgNgG0DVtbLPeIYQuzgq9ttipZUOG9WDMzURMLFLJb6MG2g 0wk5h+sDHYHaitf4CcnvWGkdKeTgkgFa0laHoPTEpl7cNrPpNCQhJ/Vg0jYuQjPLEPnH 2zmQJNU+9wS3wWZ0sDm1Y5w52/Fb26qwW3T8O3ueTNOPKEi3ghXXdTb7nLjEtEgkNtbt l4pg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1737736147; x=1738340947; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=9vOW3vjTeJp39sy4OsCwvMlVmTE6vsBeLOiYEKAZ7/E=; b=B4jb4VsVj38tH1uE1v1j80wvqGFyZfM7wjT03Z+c6RU9/BnAq8VmSkcipg+nhNPxxG 1EJ26uJOi8OHCqS37ZNH896ErdwN6Dyzi1lQ2n6hPVxJvnowTHDjdHCwSchKSiWa2JUP Y3HHP4kG6Ezk7dOFnSpoqYXt5H2XJZJwRFfSjt27EDV8CsTTYwhBQN57OjsGeVxc8AzV MyOdsZKT/n49S+wsPulUhsUbVUQC4+HtdbvMv6DJpTt3uqa+5P9ZJe6WJt7sH3wtU6mQ P6iQ76kpG0yQ2ThtBvubR09t/l5f0jG79HDEXv0VKwdgpzMxwcDIoChbqQxQvagB/rfx PKMA== X-Forwarded-Encrypted: i=1; AJvYcCXIsycZ7jB3cUQpDkXTLTfAtPsCnZFyeL4JCxVurn20ZsrF91YVDb2TWgsYjRU/O/kAuua2ASolA5II@nongnu.org X-Gm-Message-State: AOJu0Yx/MgugDXFMlxn/M6dsgRE12nrCKnv+SMSTJLOjwE9zzWJGlYJH LuoRrYKwDZChEIiY4uKBKVFdVAwGIVfO7cG2ZRqhDoS/ejgOb/6MwdNZAP5ICyWVW4gmvrSzMVD 0 X-Gm-Gg: ASbGncuLUne4Zdq9ghYnHYxjVxaoSLyFhmTFjjFGmyDfL1zqcBuDrSRi+mol0E4wpUm 63Zw98haNmzSfrM9pNzNQ8wNNc5k+wKA42LTBkPz65YPKK2/u3VRh6gXPQpLU3Yq4GZKQ8xdD62 IavNRGNNkAWFKgmPmO82DOEEegex54hffY9KIrW2WuPS7Th//rGzGKllrk5r/WWQt63kQyOP2uE gYZgTLCwJKhoiiTQwChAqQFjkumwkYTbmvJzwidbng/lfytBYDbzRIjyBErmcrNYiIzOu+HFQaJ hdnM0W9NDkw= X-Received: by 2002:a05:600c:1d1d:b0:434:a815:2b5d with SMTP id 5b1f17b1804b1-4389141c211mr253299595e9.24.1737736147035; Fri, 24 Jan 2025 08:29:07 -0800 (PST) Received: from orth.archaic.org.uk (orth.archaic.org.uk. [2001:8b0:1d0::2]) by smtp.gmail.com with ESMTPSA id 5b1f17b1804b1-438bd47eecasm31683025e9.6.2025.01.24.08.29.06 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 24 Jan 2025 08:29:06 -0800 (PST) From: Peter Maydell To: qemu-arm@nongnu.org, qemu-devel@nongnu.org Subject: [PATCH 23/76] fpu: Implement float_flag_input_denormal_used Date: Fri, 24 Jan 2025 16:27:43 +0000 Message-Id: <20250124162836.2332150-24-peter.maydell@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20250124162836.2332150-1-peter.maydell@linaro.org> References: <20250124162836.2332150-1-peter.maydell@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2a00:1450:4864:20::32f; envelope-from=peter.maydell@linaro.org; helo=mail-wm1-x32f.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: qemu-devel-bounces+patch=linaro.org@nongnu.org For the x86 and the Arm FEAT_AFP semantics, we need to be able to tell the target code that the FPU operation has used an input denormal. Implement this; when it happens we set the new float_flag_denormal_input_used. Note that we only set this when an input denormal is actually used by the operation: if the operation results in Invalid Operation or Divide By Zero or the result is a NaN because some other input was a NaN then we never needed to look at the input denormal and do not set denormal_input_used. We mostly do not need to adjust the hardfloat codepaths to deal with this flag, because almost all hardfloat operations are already gated on the input not being a denormal, and will fall back to softfloat for a denormal input. The only exception is the comparison operations, where we need to add the check for input denormals, which must now fall back to softfloat where they did not before. Signed-off-by: Peter Maydell --- include/fpu/softfloat-types.h | 7 ++++ fpu/softfloat.c | 37 +++++++++++++++++-- fpu/softfloat-parts.c.inc | 68 ++++++++++++++++++++++++++++++++++- 3 files changed, 108 insertions(+), 4 deletions(-) diff --git a/include/fpu/softfloat-types.h b/include/fpu/softfloat-types.h index c177923e319..b9b4e8e55fc 100644 --- a/include/fpu/softfloat-types.h +++ b/include/fpu/softfloat-types.h @@ -165,6 +165,13 @@ enum { float_flag_invalid_sqrt = 0x0800, /* sqrt(-x) */ float_flag_invalid_cvti = 0x1000, /* non-nan to integer */ float_flag_invalid_snan = 0x2000, /* any operand was snan */ + /* + * An input was denormal and we used it (without flushing it to zero). + * Not set if we do not actually use the denormal input (e.g. + * because some other input was a NaN, or because the operation + * wasn't actually carried out (divide-by-zero; invalid)) + */ + float_flag_input_denormal_used = 0x4000, }; /* diff --git a/fpu/softfloat.c b/fpu/softfloat.c index 03a604c38ec..1b4046e81a9 100644 --- a/fpu/softfloat.c +++ b/fpu/softfloat.c @@ -2718,8 +2718,10 @@ static void parts_float_to_ahp(FloatParts64 *a, float_status *s) float16_params_ahp.frac_size + 1); break; - case float_class_normal: case float_class_denormal: + float_raise(float_flag_input_denormal_used, s); + break; + case float_class_normal: case float_class_zero: break; @@ -2733,6 +2735,9 @@ static void parts64_float_to_float(FloatParts64 *a, float_status *s) if (is_nan(a->cls)) { parts_return_nan(a, s); } + if (a->cls == float_class_denormal) { + float_raise(float_flag_input_denormal_used, s); + } } static void parts128_float_to_float(FloatParts128 *a, float_status *s) @@ -2740,6 +2745,9 @@ static void parts128_float_to_float(FloatParts128 *a, float_status *s) if (is_nan(a->cls)) { parts_return_nan(a, s); } + if (a->cls == float_class_denormal) { + float_raise(float_flag_input_denormal_used, s); + } } #define parts_float_to_float(P, S) \ @@ -2752,12 +2760,21 @@ static void parts_float_to_float_narrow(FloatParts64 *a, FloatParts128 *b, a->sign = b->sign; a->exp = b->exp; - if (is_anynorm(a->cls)) { + switch (a->cls) { + case float_class_denormal: + float_raise(float_flag_input_denormal_used, s); + /* fall through */ + case float_class_normal: frac_truncjam(a, b); - } else if (is_nan(a->cls)) { + break; + case float_class_snan: + case float_class_qnan: /* Discard the low bits of the NaN. */ a->frac = b->frac_hi; parts_return_nan(a, s); + break; + default: + break; } } @@ -2772,6 +2789,9 @@ static void parts_float_to_float_widen(FloatParts128 *a, FloatParts64 *b, if (is_nan(a->cls)) { parts_return_nan(a, s); } + if (a->cls == float_class_denormal) { + float_raise(float_flag_input_denormal_used, s); + } } float32 float16_to_float32(float16 a, bool ieee, float_status *s) @@ -4411,6 +4431,11 @@ float32_hs_compare(float32 xa, float32 xb, float_status *s, bool is_quiet) goto soft; } + if (unlikely(float32_is_denormal(ua.s) || float32_is_denormal(ub.s))) { + /* We may need to set the input_denormal_used flag */ + goto soft; + } + float32_input_flush2(&ua.s, &ub.s, s); if (isgreaterequal(ua.h, ub.h)) { if (isgreater(ua.h, ub.h)) { @@ -4462,6 +4487,12 @@ float64_hs_compare(float64 xa, float64 xb, float_status *s, bool is_quiet) } float64_input_flush2(&ua.s, &ub.s, s); + + if (unlikely(float64_is_denormal(ua.s) || float64_is_denormal(ub.s))) { + /* We may need to set the input_denormal_used flag */ + goto soft; + } + if (isgreaterequal(ua.h, ub.h)) { if (isgreater(ua.h, ub.h)) { return float_relation_greater; diff --git a/fpu/softfloat-parts.c.inc b/fpu/softfloat-parts.c.inc index 8621cb87185..0122b35008a 100644 --- a/fpu/softfloat-parts.c.inc +++ b/fpu/softfloat-parts.c.inc @@ -433,6 +433,15 @@ static FloatPartsN *partsN(addsub)(FloatPartsN *a, FloatPartsN *b, bool b_sign = b->sign ^ subtract; int ab_mask = float_cmask(a->cls) | float_cmask(b->cls); + /* + * For addition and subtraction, we will consume an + * input denormal unless the other input is a NaN. + */ + if ((ab_mask & (float_cmask_denormal | float_cmask_anynan)) == + float_cmask_denormal) { + float_raise(float_flag_input_denormal_used, s); + } + if (a->sign != b_sign) { /* Subtraction */ if (likely(cmask_is_only_normals(ab_mask))) { @@ -516,6 +525,10 @@ static FloatPartsN *partsN(mul)(FloatPartsN *a, FloatPartsN *b, if (likely(cmask_is_only_normals(ab_mask))) { FloatPartsW tmp; + if (ab_mask & float_cmask_denormal) { + float_raise(float_flag_input_denormal_used, s); + } + frac_mulw(&tmp, a, b); frac_truncjam(a, &tmp); @@ -541,6 +554,10 @@ static FloatPartsN *partsN(mul)(FloatPartsN *a, FloatPartsN *b, } /* Multiply by 0 or Inf */ + if (ab_mask & float_cmask_denormal) { + float_raise(float_flag_input_denormal_used, s); + } + if (ab_mask & float_cmask_inf) { a->cls = float_class_inf; a->sign = sign; @@ -664,6 +681,16 @@ static FloatPartsN *partsN(muladd_scalbn)(FloatPartsN *a, FloatPartsN *b, if (flags & float_muladd_negate_result) { a->sign ^= 1; } + + /* + * All result types except for "return the default NaN + * because this is an Invalid Operation" go through here; + * this matches the set of cases where we consumed a + * denormal input. + */ + if (abc_mask & float_cmask_denormal) { + float_raise(float_flag_input_denormal_used, s); + } return a; return_sub_zero: @@ -693,6 +720,9 @@ static FloatPartsN *partsN(div)(FloatPartsN *a, FloatPartsN *b, bool sign = a->sign ^ b->sign; if (likely(cmask_is_only_normals(ab_mask))) { + if (ab_mask & float_cmask_denormal) { + float_raise(float_flag_input_denormal_used, s); + } a->sign = sign; a->exp -= b->exp + frac_div(a, b); return a; @@ -713,6 +743,10 @@ static FloatPartsN *partsN(div)(FloatPartsN *a, FloatPartsN *b, return parts_pick_nan(a, b, s); } + if ((ab_mask & float_cmask_denormal) && b->cls != float_class_zero) { + float_raise(float_flag_input_denormal_used, s); + } + a->sign = sign; /* Inf / X */ @@ -751,6 +785,9 @@ static FloatPartsN *partsN(modrem)(FloatPartsN *a, FloatPartsN *b, int ab_mask = float_cmask(a->cls) | float_cmask(b->cls); if (likely(cmask_is_only_normals(ab_mask))) { + if (ab_mask & float_cmask_denormal) { + float_raise(float_flag_input_denormal_used, s); + } frac_modrem(a, b, mod_quot); return a; } @@ -771,6 +808,10 @@ static FloatPartsN *partsN(modrem)(FloatPartsN *a, FloatPartsN *b, return a; } + if (ab_mask & float_cmask_denormal) { + float_raise(float_flag_input_denormal_used, s); + } + /* N % Inf; 0 % N */ g_assert(b->cls == float_class_inf || a->cls == float_class_zero); return a; @@ -801,6 +842,10 @@ static void partsN(sqrt)(FloatPartsN *a, float_status *status, if (unlikely(a->cls != float_class_normal)) { switch (a->cls) { case float_class_denormal: + if (!a->sign) { + /* -ve denormal will be InvalidOperation */ + float_raise(float_flag_input_denormal_used, status); + } break; case float_class_snan: case float_class_qnan: @@ -1431,6 +1476,9 @@ static FloatPartsN *partsN(minmax)(FloatPartsN *a, FloatPartsN *b, if ((flags & (minmax_isnum | minmax_isnumber)) && !(ab_mask & float_cmask_snan) && (ab_mask & ~float_cmask_qnan)) { + if (ab_mask & float_cmask_denormal) { + float_raise(float_flag_input_denormal_used, s); + } return is_nan(a->cls) ? b : a; } @@ -1455,6 +1503,10 @@ static FloatPartsN *partsN(minmax)(FloatPartsN *a, FloatPartsN *b, return parts_pick_nan(a, b, s); } + if (ab_mask & float_cmask_denormal) { + float_raise(float_flag_input_denormal_used, s); + } + a_exp = a->exp; b_exp = b->exp; @@ -1524,6 +1576,10 @@ static FloatRelation partsN(compare)(FloatPartsN *a, FloatPartsN *b, if (likely(cmask_is_only_normals(ab_mask))) { FloatRelation cmp; + if (ab_mask & float_cmask_denormal) { + float_raise(float_flag_input_denormal_used, s); + } + if (a->sign != b->sign) { goto a_sign; } @@ -1549,6 +1605,10 @@ static FloatRelation partsN(compare)(FloatPartsN *a, FloatPartsN *b, return float_relation_unordered; } + if (ab_mask & float_cmask_denormal) { + float_raise(float_flag_input_denormal_used, s); + } + if (ab_mask & float_cmask_zero) { if (ab_mask == float_cmask_zero) { return float_relation_equal; @@ -1588,8 +1648,10 @@ static void partsN(scalbn)(FloatPartsN *a, int n, float_status *s) case float_class_zero: case float_class_inf: break; - case float_class_normal: case float_class_denormal: + float_raise(float_flag_input_denormal_used, s); + /* fall through */ + case float_class_normal: a->exp += MIN(MAX(n, -0x10000), 0x10000); break; default: @@ -1609,6 +1671,10 @@ static void partsN(log2)(FloatPartsN *a, float_status *s, const FloatFmt *fmt) if (unlikely(a->cls != float_class_normal)) { switch (a->cls) { case float_class_denormal: + if (!a->sign) { + /* -ve denormal will be InvalidOperation */ + float_raise(float_flag_input_denormal_used, s); + } break; case float_class_snan: case float_class_qnan: