From patchwork Sun Dec 22 16:23:55 2024
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 852905
From: Richard Henderson <richard.henderson@linaro.org>
To: qemu-devel@nongnu.org
Subject: [PATCH v3 00/51] tcg: Remove in-flight mask data from OptContext
Date: Sun, 22 Dec 2024 08:23:55 -0800
Message-ID: <20241222162446.2415717-1-richard.henderson@linaro.org>

The desire is to start re-using some of the fold_* functions while
lowering or simplifying operations during tcg_optimize.  Many of these
fold_* functions set z_mask, s_mask, and a_mask, which hang around
until the end of the tcg_optimize loop and are applied by
finish_folding.  This disconnect between set and apply is a problem:
we would no longer be applying the masks to the correct opcode.

Fix this by making the masks local variables, passed down to be
applied immediately to the opcode being processed.

Changes for v3:
- Testing on non-x86 hosts with full-featured deposit ops showed
  issues with some of the new sign-mask computations, which turned out
  to be a representational problem that invites easy off-by-one
  errors.
  Change the representation of s_mask to include the sign bit itself,
  not just its repetitions.  Disable optimizations based on s_mask
  until the conversion is complete.  Bitwise logical operations
  require no change.  Extensions, and sign-extending loads, now use
  INTn_MIN instead of MAKE_64BIT_MASK(n, 64 - n).  The existing shift
  and the new deposit s_mask operations require no change, but only
  because they were buggy before.
- Patches 6 and 49 are new.  Minor changes are scattered in between.

r~

Richard Henderson (51):
  tcg/optimize: Split out finish_bb, finish_ebb
  tcg/optimize: Split out fold_affected_mask
  tcg/optimize: Copy mask writeback to fold_masks
  tcg/optimize: Split out fold_masks_zs
  tcg/optimize: Augment s_mask from z_mask in fold_masks_zs
  tcg/optimize: Change representation of s_mask
  tcg/optimize: Use finish_folding in fold_add, fold_add_vec,
    fold_addsub2
  tcg/optimize: Use fold_masks_zs in fold_and
  tcg/optimize: Use fold_masks_zs in fold_andc
  tcg/optimize: Use fold_masks_zs in fold_bswap
  tcg/optimize: Use fold_masks_zs in fold_count_zeros
  tcg/optimize: Use fold_masks_z in fold_ctpop
  tcg/optimize: Use fold_and and fold_masks_z in fold_deposit
  tcg/optimize: Compute sign mask in fold_deposit
  tcg/optimize: Use finish_folding in fold_divide
  tcg/optimize: Use finish_folding in fold_dup, fold_dup2
  tcg/optimize: Use fold_masks_s in fold_eqv
  tcg/optimize: Use fold_masks_z in fold_extract
  tcg/optimize: Use finish_folding in fold_extract2
  tcg/optimize: Use fold_masks_zs in fold_exts
  tcg/optimize: Use fold_masks_z in fold_extu
  tcg/optimize: Use fold_masks_zs in fold_movcond
  tcg/optimize: Use finish_folding in fold_mul*
  tcg/optimize: Use fold_masks_s in fold_nand
  tcg/optimize: Use fold_masks_z in fold_neg_no_const
  tcg/optimize: Use fold_masks_s in fold_nor
  tcg/optimize: Use fold_masks_s in fold_not
  tcg/optimize: Use fold_masks_zs in fold_or
  tcg/optimize: Use fold_masks_zs in fold_orc
  tcg/optimize: Use fold_masks_zs in fold_qemu_ld
  tcg/optimize: Return true from fold_qemu_st, fold_tcg_st
  tcg/optimize: Use finish_folding in fold_remainder
  tcg/optimize: Distinguish simplification in fold_setcond_zmask
  tcg/optimize: Use fold_masks_z in fold_setcond
  tcg/optimize: Use fold_masks_s in fold_negsetcond
  tcg/optimize: Use fold_masks_z in fold_setcond2
  tcg/optimize: Use finish_folding in fold_cmp_vec
  tcg/optimize: Use finish_folding in fold_cmpsel_vec
  tcg/optimize: Use fold_masks_zs in fold_sextract
  tcg/optimize: Use fold_masks_zs, fold_masks_s in fold_shift
  tcg/optimize: Simplify sign bit test in fold_shift
  tcg/optimize: Use finish_folding in fold_sub, fold_sub_vec
  tcg/optimize: Use fold_masks_zs in fold_tcg_ld
  tcg/optimize: Use finish_folding in fold_tcg_ld_memcopy
  tcg/optimize: Use fold_masks_zs in fold_xor
  tcg/optimize: Use finish_folding in fold_bitsel_vec
  tcg/optimize: Use finish_folding as default in tcg_optimize
  tcg/optimize: Remove z_mask, s_mask from OptContext
  tcg/optimize: Re-enable sign-mask optimizations
  tcg/optimize: Move fold_bitsel_vec into alphabetic sort
  tcg/optimize: Move fold_cmp_vec, fold_cmpsel_vec into alphabetic sort

 tcg/optimize.c | 834 +++++++++++++++++++++++++------------------------
 1 file changed, 432 insertions(+), 402 deletions(-)