From patchwork Fri Nov 18 09:47:26 2022
From: Richard Henderson <richard.henderson@linaro.org>
To: qemu-devel@nongnu.org
Subject: [PATCH for-8.0 01/29] include/qemu/cpuid: Introduce xgetbv_low
Date: Fri, 18 Nov 2022 01:47:26 -0800
Message-Id: <20221118094754.242910-2-richard.henderson@linaro.org>
In-Reply-To: <20221118094754.242910-1-richard.henderson@linaro.org>

Replace the two uses of asm to expand xgetbv with an inline function.

Since one of the two has been using the mnemonic, assume that the
comment about "older versions of the assembler" is obsolete, as even
that is 4 years old.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé
---
 include/qemu/cpuid.h      |  7 +++++++
 util/bufferiszero.c       |  3 +--
 tcg/i386/tcg-target.c.inc | 11 ++++-------
 3 files changed, 12 insertions(+), 9 deletions(-)

diff --git a/include/qemu/cpuid.h b/include/qemu/cpuid.h
index 7adb12d320..1451e8ef2f 100644
--- a/include/qemu/cpuid.h
+++ b/include/qemu/cpuid.h
@@ -71,4 +71,11 @@
 #define bit_LZCNT       (1 << 5)
 #endif
 
+static inline unsigned xgetbv_low(unsigned c)
+{
+    unsigned a, d;
+    asm("xgetbv" : "=a"(a), "=d"(d) : "c"(c));
+    return a;
+}
+
 #endif /* QEMU_CPUID_H */
diff --git a/util/bufferiszero.c b/util/bufferiszero.c
index ec3cd4ca15..b0660d484d 100644
--- a/util/bufferiszero.c
+++ b/util/bufferiszero.c
@@ -287,8 +287,7 @@ static void __attribute__((constructor)) init_cpuid_cache(void)
 
     /* We must check that AVX is not just available, but usable. */
     if ((c & bit_OSXSAVE) && (c & bit_AVX) && max >= 7) {
-        int bv;
-        __asm("xgetbv" : "=a"(bv), "=d"(d) : "c"(0));
+        unsigned bv = xgetbv_low(0);
         __cpuid_count(7, 0, a, b, c, d);
         if ((bv & 0x6) == 0x6 && (b & bit_AVX2)) {
             cache |= CACHE_AVX2;
diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
index c96b5a6f43..1361960156 100644
--- a/tcg/i386/tcg-target.c.inc
+++ b/tcg/i386/tcg-target.c.inc
@@ -4148,12 +4148,9 @@ static void tcg_target_init(TCGContext *s)
     /* There are a number of things we must check before we can be sure
        of not hitting invalid opcode. */
     if (c & bit_OSXSAVE) {
-        unsigned xcrl, xcrh;
-        /* The xgetbv instruction is not available to older versions of
-         * the assembler, so we encode the instruction manually.
-         */
-        asm(".byte 0x0f, 0x01, 0xd0" : "=a" (xcrl), "=d" (xcrh) : "c" (0));
-        if ((xcrl & 6) == 6) {
+        unsigned bv = xgetbv_low(0);
+
+        if ((bv & 6) == 6) {
             have_avx1 = (c & bit_AVX) != 0;
             have_avx2 = (b7 & bit_AVX2) != 0;
 
@@ -4164,7 +4161,7 @@ static void tcg_target_init(TCGContext *s)
              * check that OPMASK and all extended ZMM state are enabled
              * even if we're not using them -- the insns will fault.
              */
-            if ((xcrl & 0xe0) == 0xe0
+            if ((bv & 0xe0) == 0xe0
                 && (b7 & bit_AVX512F)
                 && (b7 & bit_AVX512VL)) {
                 have_avx512vl = true;
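[Illustration, not part of the patch: a minimal standalone sketch of how
xgetbv_low pairs with cpuid to test AVX usability, the same check the two
call sites above perform. Assumes an x86 host with GCC or Clang and an
assembler that accepts the xgetbv mnemonic:]

    #include <stdio.h>
    #include <cpuid.h>

    /* Same shape as the helper added to include/qemu/cpuid.h. */
    static inline unsigned xgetbv_low(unsigned c)
    {
        unsigned a, d;
        asm("xgetbv" : "=a"(a), "=d"(d) : "c"(c));
        return a;
    }

    int main(void)
    {
        unsigned a, b, c, d;

        __cpuid(1, a, b, c, d);
        /* XGETBV raises #UD unless the OS has enabled XSAVE (OSXSAVE). */
        if (c & bit_OSXSAVE) {
            unsigned bv = xgetbv_low(0);
            /* XCR0 bits 1 and 2: SSE and AVX state enabled by the OS. */
            printf("AVX usable: %s\n", (bv & 6) == 6 ? "yes" : "no");
        } else {
            printf("OSXSAVE not enabled\n");
        }
        return 0;
    }

[Checking OSXSAVE before executing xgetbv is what makes the helper safe to
call from feature-detection code.]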
From patchwork Fri Nov 18 09:47:27 2022
From: Richard Henderson <richard.henderson@linaro.org>
To: qemu-devel@nongnu.org
Subject: [PATCH for-8.0 02/29] include/exec/memop: Add bits describing atomicity
Date: Fri, 18 Nov 2022 01:47:27 -0800
Message-Id: <20221118094754.242910-3-richard.henderson@linaro.org>
In-Reply-To: <20221118094754.242910-1-richard.henderson@linaro.org>

These bits may be used to describe the precise atomicity requirements
of the guest, which may then be used to constrain the methods by which
it may be emulated by the host.

For instance, the AArch64 LDP (32-bit) instruction changes semantics
with ARMv8.4 LSE2, from

    MO_64 | MO_ATMAX_4 | MO_ATOM_IFALIGN
    (64-bits, single-copy atomic only on 4 byte units,
     nonatomic if not aligned by 4)

to

    MO_64 | MO_ATMAX_SIZE | MO_ATOM_WITHIN16
    (64-bits, single-copy atomic within a 16 byte block)

The former may be implemented with two 4 byte loads, or a single 8 byte
load if that happens to be efficient on the host.  The latter may not,
and may also require a helper when misaligned.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 include/exec/memop.h | 36 ++++++++++++++++++++++++++++++++++++
 1 file changed, 36 insertions(+)

diff --git a/include/exec/memop.h b/include/exec/memop.h
index 25d027434a..04e4048f0b 100644
--- a/include/exec/memop.h
+++ b/include/exec/memop.h
@@ -81,6 +81,42 @@ typedef enum MemOp {
     MO_ALIGN_32 = 5 << MO_ASHIFT,
     MO_ALIGN_64 = 6 << MO_ASHIFT,
 
+    /*
+     * MO_ATOM_* describes the atomicity requirements of the operation:
+     * MO_ATOM_IFALIGN: the operation must be single-copy atomic if and
+     *    only if it is aligned; if unaligned there is no atomicity.
+     * MO_ATOM_NONE: the operation has no atomicity requirements.
+     * MO_ATOM_SUBALIGN: the operation is single-copy atomic by parts
+     *    by the alignment.  E.g. if the address is 0 mod 4, then each
+     *    4-byte subobject is single-copy atomic.
+     *    This is the atomicity of IBM Power and S390X processors.
+     * MO_ATOM_WITHIN16: the operation is single-copy atomic, even if it
+     *    is unaligned, so long as it does not cross a 16-byte boundary;
+     *    if it crosses a 16-byte boundary there is no atomicity.
+     *    This is the atomicity of Arm FEAT_LSE2.
+     *
+     * MO_ATMAX_* describes the maximum atomicity unit required:
+     * MO_ATMAX_SIZE: the entire operation, i.e. MO_SIZE.
+     * MO_ATMAX_[248]: units of N bytes.
+     *
+     * Note the default (i.e. 0) values are single-copy atomic to the
+     * size of the operation, if aligned.  This retains the behaviour
+     * from before these were introduced.
+     */
+    MO_ATOM_SHIFT    = 8,
+    MO_ATOM_MASK     = 0x3 << MO_ATOM_SHIFT,
+    MO_ATOM_IFALIGN  = 0 << MO_ATOM_SHIFT,
+    MO_ATOM_NONE     = 1 << MO_ATOM_SHIFT,
+    MO_ATOM_SUBALIGN = 2 << MO_ATOM_SHIFT,
+    MO_ATOM_WITHIN16 = 3 << MO_ATOM_SHIFT,
+
+    MO_ATMAX_SHIFT = 10,
+    MO_ATMAX_MASK  = 0x3 << MO_ATMAX_SHIFT,
+    MO_ATMAX_SIZE  = 0 << MO_ATMAX_SHIFT,
+    MO_ATMAX_2     = 1 << MO_ATMAX_SHIFT,
+    MO_ATMAX_4     = 2 << MO_ATMAX_SHIFT,
+    MO_ATMAX_8     = 3 << MO_ATMAX_SHIFT,
+
     /* Combinations of the above, for ease of use. */
     MO_UB    = MO_8,
     MO_UW    = MO_16,
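[Illustration, not part of the patch: a minimal sketch of how the new
fields pack alongside MO_SIZE and how a consumer might decode the
atomicity unit in bytes. The constants are transcribed from the hunk
above; atmax_bytes is a hypothetical helper, not a QEMU function:]

    #include <stdio.h>

    /* Transcribed from the hunk above. */
    enum {
        MO_8 = 0, MO_16 = 1, MO_32 = 2, MO_64 = 3,
        MO_SIZE = 3,

        MO_ATOM_SHIFT    = 8,
        MO_ATOM_IFALIGN  = 0 << MO_ATOM_SHIFT,
        MO_ATOM_WITHIN16 = 3 << MO_ATOM_SHIFT,

        MO_ATMAX_SHIFT = 10,
        MO_ATMAX_MASK  = 0x3 << MO_ATMAX_SHIFT,
        MO_ATMAX_SIZE  = 0 << MO_ATMAX_SHIFT,
        MO_ATMAX_4     = 2 << MO_ATMAX_SHIFT,
    };

    /* Hypothetical decode: atomicity unit in bytes for one operation. */
    static int atmax_bytes(int memop)
    {
        int atmax = (memop & MO_ATMAX_MASK) >> MO_ATMAX_SHIFT;
        if (atmax == 0) {              /* MO_ATMAX_SIZE: whole operation */
            return 1 << (memop & MO_SIZE);
        }
        return 1 << atmax;             /* MO_ATMAX_{2,4,8} -> 2, 4, 8 */
    }

    int main(void)
    {
        int pre_lse2  = MO_64 | MO_ATMAX_4 | MO_ATOM_IFALIGN;
        int with_lse2 = MO_64 | MO_ATMAX_SIZE | MO_ATOM_WITHIN16;

        /* The LDP example from the commit message: 4 vs 8 bytes. */
        printf("LDP pre-LSE2:  atomic in %d-byte units\n",
               atmax_bytes(pre_lse2));
        printf("LDP with LSE2: atomic in %d-byte units\n",
               atmax_bytes(with_lse2));
        return 0;
    }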
From patchwork Fri Nov 18 09:47:28 2022
From: Richard Henderson <richard.henderson@linaro.org>
To: qemu-devel@nongnu.org
Subject: [PATCH for-8.0 03/29] accel/tcg: Add cpu_in_serial_context
Date: Fri, 18 Nov 2022 01:47:28 -0800
Message-Id: <20221118094754.242910-4-richard.henderson@linaro.org>
In-Reply-To: <20221118094754.242910-1-richard.henderson@linaro.org>

Like cpu_in_exclusive_context, but also true if there is no other
cpu against which we could race.

Use it in tb_flush as a direct replacement.  Use it in
cpu_loop_exit_atomic to ensure that there is no loop against
cpu_exec_step_atomic.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 accel/tcg/internal.h        | 5 +++++
 accel/tcg/cpu-exec-common.c | 3 +++
 accel/tcg/tb-maint.c        | 2 +-
 3 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/accel/tcg/internal.h b/accel/tcg/internal.h
index cb13bade4f..f06bf58e7a 100644
--- a/accel/tcg/internal.h
+++ b/accel/tcg/internal.h
@@ -119,4 +119,9 @@ static inline target_ulong log_pc(CPUState *cpu, const TranslationBlock *tb)
 #endif
 }
 
+static inline bool cpu_in_serial_context(CPUState *cs)
+{
+    return !(cs->tcg_cflags & CF_PARALLEL) || cpu_in_exclusive_context(cs);
+}
+
 #endif /* ACCEL_TCG_INTERNAL_H */
diff --git a/accel/tcg/cpu-exec-common.c b/accel/tcg/cpu-exec-common.c
index c7bc8c6efa..2fb4454c7a 100644
--- a/accel/tcg/cpu-exec-common.c
+++ b/accel/tcg/cpu-exec-common.c
@@ -21,6 +21,7 @@
 #include "sysemu/cpus.h"
 #include "sysemu/tcg.h"
 #include "exec/exec-all.h"
+#include "internal.h"
 
 bool tcg_allowed;
 
@@ -78,6 +79,8 @@ void cpu_loop_exit_restore(CPUState *cpu, uintptr_t pc)
 
 void cpu_loop_exit_atomic(CPUState *cpu, uintptr_t pc)
 {
+    /* Prevent looping if already executing in a serial context. */
+    g_assert(!cpu_in_serial_context(cpu));
     cpu->exception_index = EXCP_ATOMIC;
     cpu_loop_exit_restore(cpu, pc);
 }
diff --git a/accel/tcg/tb-maint.c b/accel/tcg/tb-maint.c
index 0cdb35548c..a7c067628c 100644
--- a/accel/tcg/tb-maint.c
+++ b/accel/tcg/tb-maint.c
@@ -119,7 +119,7 @@ void tb_flush(CPUState *cpu)
     if (tcg_enabled()) {
         unsigned tb_flush_count = qatomic_mb_read(&tb_ctx.tb_flush_count);
 
-        if (cpu_in_exclusive_context(cpu)) {
+        if (cpu_in_serial_context(cpu)) {
            do_tb_flush(cpu, RUN_ON_CPU_HOST_INT(tb_flush_count));
        } else {
            async_safe_run_on_cpu(cpu, do_tb_flush,
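[Illustration, not part of the patch: the decision pattern that
cpu_in_serial_context enables, reduced to a self-contained sketch. The
struct and flag below are stand-ins for QEMU's CPUState and CF_PARALLEL:]

    #include <stdbool.h>
    #include <stdio.h>

    #define CF_PARALLEL (1u << 0)   /* stand-in for QEMU's cflag bit */

    struct cpu {
        unsigned tcg_cflags;
        bool in_exclusive_context;
    };

    /* Same shape as the helper added to accel/tcg/internal.h. */
    static bool cpu_in_serial_context(const struct cpu *cs)
    {
        return !(cs->tcg_cflags & CF_PARALLEL) || cs->in_exclusive_context;
    }

    static void flush_now(void)   { puts("flush directly"); }
    static void flush_async(void) { puts("schedule async-safe flush"); }

    int main(void)
    {
        struct cpu serial = { 0, false };
        struct cpu parallel = { CF_PARALLEL, false };

        /* No other vCPU can race with us: safe to do the work inline. */
        (cpu_in_serial_context(&serial) ? flush_now : flush_async)();
        /* Other vCPUs may be running: defer to an exclusive window. */
        (cpu_in_serial_context(&parallel) ? flush_now : flush_async)();
        return 0;
    }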
From patchwork Fri Nov 18 09:47:29 2022
From: Richard Henderson <richard.henderson@linaro.org>
To: qemu-devel@nongnu.org
Subject: [PATCH for-8.0 04/29] accel/tcg: Introduce tlb_read_idx
Date: Fri, 18 Nov 2022 01:47:29 -0800
Message-Id: <20221118094754.242910-5-richard.henderson@linaro.org>
In-Reply-To: <20221118094754.242910-1-richard.henderson@linaro.org>

Instead of playing with offsetof in various places, use MMUAccessType
to index an array.  This is easily defined instead of the previous
dummy padding array in the union.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé
---
 include/exec/cpu-defs.h |   7 ++-
 include/exec/cpu_ldst.h |  26 ++++++++--
 accel/tcg/cputlb.c      | 104 +++++++++++++---------------------
 3 files changed, 59 insertions(+), 78 deletions(-)

diff --git a/include/exec/cpu-defs.h b/include/exec/cpu-defs.h
index 21309cf567..7ce3bcb06b 100644
--- a/include/exec/cpu-defs.h
+++ b/include/exec/cpu-defs.h
@@ -128,8 +128,11 @@ typedef struct CPUTLBEntry {
                use the corresponding iotlb value.  */
             uintptr_t addend;
         };
-        /* padding to get a power of two size */
-        uint8_t dummy[1 << CPU_TLB_ENTRY_BITS];
+        /*
+         * Padding to get a power of two size, as well as index
+         * access to addr_{read,write,code}.
+         */
+        target_ulong addr_idx[(1 << CPU_TLB_ENTRY_BITS) / TARGET_LONG_SIZE];
     };
 } CPUTLBEntry;
 
diff --git a/include/exec/cpu_ldst.h b/include/exec/cpu_ldst.h
index 09b55cc0ee..fad6efc0ad 100644
--- a/include/exec/cpu_ldst.h
+++ b/include/exec/cpu_ldst.h
@@ -360,13 +360,29 @@ static inline void clear_helper_retaddr(void)
 /* Needed for TCG_OVERSIZED_GUEST */
 #include "tcg/tcg.h"
 
+static inline target_ulong tlb_read_idx(const CPUTLBEntry *entry,
+                                        MMUAccessType access_type)
+{
+    /* Do not rearrange the CPUTLBEntry structure members. */
+    QEMU_BUILD_BUG_ON(offsetof(CPUTLBEntry, addr_read) !=
+                      MMU_DATA_LOAD * TARGET_LONG_SIZE);
+    QEMU_BUILD_BUG_ON(offsetof(CPUTLBEntry, addr_write) !=
+                      MMU_DATA_STORE * TARGET_LONG_SIZE);
+    QEMU_BUILD_BUG_ON(offsetof(CPUTLBEntry, addr_code) !=
+                      MMU_INST_FETCH * TARGET_LONG_SIZE);
+
+    const target_ulong *ptr = &entry->addr_idx[access_type];
+#if TCG_OVERSIZED_GUEST
+    return *ptr;
+#else
+    /* ofs might correspond to .addr_write, so use qatomic_read */
+    return qatomic_read(ptr);
+#endif
+}
+
 static inline target_ulong tlb_addr_write(const CPUTLBEntry *entry)
 {
-#if TCG_OVERSIZED_GUEST
-    return entry->addr_write;
-#else
-    return qatomic_read(&entry->addr_write);
-#endif
+    return tlb_read_idx(entry, MMU_DATA_STORE);
 }
 
 /* Find the TLB index corresponding to the mmu_idx + address pair. */
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index d177afcad6..00a2b217e5 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -1437,34 +1437,17 @@ static void io_writex(CPUArchState *env, CPUTLBEntryFull *full,
     }
 }
 
-static inline target_ulong tlb_read_ofs(CPUTLBEntry *entry, size_t ofs)
-{
-#if TCG_OVERSIZED_GUEST
-    return *(target_ulong *)((uintptr_t)entry + ofs);
-#else
-    /* ofs might correspond to .addr_write, so use qatomic_read */
-    return qatomic_read((target_ulong *)((uintptr_t)entry + ofs));
-#endif
-}
-
 /* Return true if ADDR is present in the victim tlb, and has been copied
    back to the main tlb.  */
 static bool victim_tlb_hit(CPUArchState *env, size_t mmu_idx, size_t index,
-                           size_t elt_ofs, target_ulong page)
+                           MMUAccessType access_type, target_ulong page)
 {
     size_t vidx;
 
     assert_cpu_is_self(env_cpu(env));
     for (vidx = 0; vidx < CPU_VTLB_SIZE; ++vidx) {
         CPUTLBEntry *vtlb = &env_tlb(env)->d[mmu_idx].vtable[vidx];
-        target_ulong cmp;
-
-        /* elt_ofs might correspond to .addr_write, so use qatomic_read */
-#if TCG_OVERSIZED_GUEST
-        cmp = *(target_ulong *)((uintptr_t)vtlb + elt_ofs);
-#else
-        cmp = qatomic_read((target_ulong *)((uintptr_t)vtlb + elt_ofs));
-#endif
+        target_ulong cmp = tlb_read_idx(vtlb, access_type);
 
         if (cmp == page) {
             /* Found entry in victim tlb, swap tlb and iotlb.  */
@@ -1486,11 +1469,6 @@ static bool victim_tlb_hit(CPUArchState *env, size_t mmu_idx, size_t index,
     return false;
 }
 
-/* Macro to call the above, with local variables from the use context.  */
-#define VICTIM_TLB_HIT(TY, ADDR) \
-  victim_tlb_hit(env, mmu_idx, index, offsetof(CPUTLBEntry, TY), \
-                 (ADDR) & TARGET_PAGE_MASK)
-
 static void notdirty_write(CPUState *cpu, vaddr mem_vaddr, unsigned size,
                            CPUTLBEntryFull *full, uintptr_t retaddr)
 {
@@ -1526,29 +1504,12 @@ static int probe_access_internal(CPUArchState *env, target_ulong addr,
 {
     uintptr_t index = tlb_index(env, mmu_idx, addr);
     CPUTLBEntry *entry = tlb_entry(env, mmu_idx, addr);
-    target_ulong tlb_addr, page_addr;
-    size_t elt_ofs;
-    int flags;
+    target_ulong tlb_addr = tlb_read_idx(entry, access_type);
+    target_ulong page_addr = addr & TARGET_PAGE_MASK;
+    int flags = TLB_FLAGS_MASK;
 
-    switch (access_type) {
-    case MMU_DATA_LOAD:
-        elt_ofs = offsetof(CPUTLBEntry, addr_read);
-        break;
-    case MMU_DATA_STORE:
-        elt_ofs = offsetof(CPUTLBEntry, addr_write);
-        break;
-    case MMU_INST_FETCH:
-        elt_ofs = offsetof(CPUTLBEntry, addr_code);
-        break;
-    default:
-        g_assert_not_reached();
-    }
-    tlb_addr = tlb_read_ofs(entry, elt_ofs);
-
-    flags = TLB_FLAGS_MASK;
-    page_addr = addr & TARGET_PAGE_MASK;
     if (!tlb_hit_page(tlb_addr, page_addr)) {
-        if (!victim_tlb_hit(env, mmu_idx, index, elt_ofs, page_addr)) {
+        if (!victim_tlb_hit(env, mmu_idx, index, access_type, page_addr)) {
             CPUState *cs = env_cpu(env);
 
             if (!cs->cc->tcg_ops->tlb_fill(cs, addr, fault_size, access_type,
@@ -1570,7 +1531,7 @@ static int probe_access_internal(CPUArchState *env, target_ulong addr,
              */
             flags &= ~TLB_INVALID_MASK;
         }
-        tlb_addr = tlb_read_ofs(entry, elt_ofs);
+        tlb_addr = tlb_read_idx(entry, access_type);
     }
 
     flags &= tlb_addr;
@@ -1784,7 +1745,8 @@ static void *atomic_mmu_lookup(CPUArchState *env, target_ulong addr,
     if (prot & PAGE_WRITE) {
         tlb_addr = tlb_addr_write(tlbe);
         if (!tlb_hit(tlb_addr, addr)) {
-            if (!VICTIM_TLB_HIT(addr_write, addr)) {
+            if (!victim_tlb_hit(env, mmu_idx, index, MMU_DATA_STORE,
+                                addr & TARGET_PAGE_MASK)) {
                 tlb_fill(env_cpu(env), addr, size,
                          MMU_DATA_STORE, mmu_idx, retaddr);
                 index = tlb_index(env, mmu_idx, addr);
@@ -1808,7 +1770,8 @@ static void *atomic_mmu_lookup(CPUArchState *env, target_ulong addr,
     } else /* if (prot & PAGE_READ) */ {
         tlb_addr = tlbe->addr_read;
         if (!tlb_hit(tlb_addr, addr)) {
-            if (!VICTIM_TLB_HIT(addr_write, addr)) {
+            if (!victim_tlb_hit(env, mmu_idx, index, MMU_DATA_LOAD,
+                                addr & TARGET_PAGE_MASK)) {
                 tlb_fill(env_cpu(env), addr, size,
                          MMU_DATA_LOAD, mmu_idx, retaddr);
                 index = tlb_index(env, mmu_idx, addr);
@@ -1894,13 +1857,9 @@ load_memop(const void *haddr, MemOp op)
 
 static inline uint64_t QEMU_ALWAYS_INLINE
 load_helper(CPUArchState *env, target_ulong addr, MemOpIdx oi,
-            uintptr_t retaddr, MemOp op, bool code_read,
+            uintptr_t retaddr, MemOp op, MMUAccessType access_type,
             FullLoadHelper *full_load)
 {
-    const size_t tlb_off = code_read ?
-        offsetof(CPUTLBEntry, addr_code) : offsetof(CPUTLBEntry, addr_read);
-    const MMUAccessType access_type =
-        code_read ? MMU_INST_FETCH : MMU_DATA_LOAD;
     const unsigned a_bits = get_alignment_bits(get_memop(oi));
     const size_t size = memop_size(op);
     uintptr_t mmu_idx = get_mmuidx(oi);
@@ -1920,18 +1879,18 @@ load_helper(CPUArchState *env, target_ulong addr, MemOpIdx oi,
 
     index = tlb_index(env, mmu_idx, addr);
     entry = tlb_entry(env, mmu_idx, addr);
-    tlb_addr = code_read ? entry->addr_code : entry->addr_read;
+    tlb_addr = tlb_read_idx(entry, access_type);
 
     /* If the TLB entry is for a different page, reload and try again.  */
     if (!tlb_hit(tlb_addr, addr)) {
-        if (!victim_tlb_hit(env, mmu_idx, index, tlb_off,
+        if (!victim_tlb_hit(env, mmu_idx, index, access_type,
                             addr & TARGET_PAGE_MASK)) {
            tlb_fill(env_cpu(env), addr, size,
                     access_type, mmu_idx, retaddr);
            index = tlb_index(env, mmu_idx, addr);
            entry = tlb_entry(env, mmu_idx, addr);
         }
-        tlb_addr = code_read ? entry->addr_code : entry->addr_read;
+        tlb_addr = tlb_read_idx(entry, access_type);
         tlb_addr &= ~TLB_INVALID_MASK;
     }
 
@@ -2017,7 +1976,8 @@ static uint64_t full_ldub_mmu(CPUArchState *env, target_ulong addr,
                               MemOpIdx oi, uintptr_t retaddr)
 {
     validate_memop(oi, MO_UB);
-    return load_helper(env, addr, oi, retaddr, MO_UB, false, full_ldub_mmu);
+    return load_helper(env, addr, oi, retaddr, MO_UB, MMU_DATA_LOAD,
+                       full_ldub_mmu);
 }
 
 tcg_target_ulong helper_ret_ldub_mmu(CPUArchState *env, target_ulong addr,
@@ -2030,7 +1990,7 @@ static uint64_t full_le_lduw_mmu(CPUArchState *env, target_ulong addr,
                                  MemOpIdx oi, uintptr_t retaddr)
 {
     validate_memop(oi, MO_LEUW);
-    return load_helper(env, addr, oi, retaddr, MO_LEUW, false,
+    return load_helper(env, addr, oi, retaddr, MO_LEUW, MMU_DATA_LOAD,
                        full_le_lduw_mmu);
 }
 
@@ -2044,7 +2004,7 @@ static uint64_t full_be_lduw_mmu(CPUArchState *env, target_ulong addr,
                                  MemOpIdx oi, uintptr_t retaddr)
 {
     validate_memop(oi, MO_BEUW);
-    return load_helper(env, addr, oi, retaddr, MO_BEUW, false,
+    return load_helper(env, addr, oi, retaddr, MO_BEUW, MMU_DATA_LOAD,
                        full_be_lduw_mmu);
 }
 
@@ -2058,7 +2018,7 @@ static uint64_t full_le_ldul_mmu(CPUArchState *env, target_ulong addr,
                                  MemOpIdx oi, uintptr_t retaddr)
 {
     validate_memop(oi, MO_LEUL);
-    return load_helper(env, addr, oi, retaddr, MO_LEUL, false,
+    return load_helper(env, addr, oi, retaddr, MO_LEUL, MMU_DATA_LOAD,
                        full_le_ldul_mmu);
 }
 
@@ -2072,7 +2032,7 @@ static uint64_t full_be_ldul_mmu(CPUArchState *env, target_ulong addr,
                                  MemOpIdx oi, uintptr_t retaddr)
 {
     validate_memop(oi, MO_BEUL);
-    return load_helper(env, addr, oi, retaddr, MO_BEUL, false,
+    return load_helper(env, addr, oi, retaddr, MO_BEUL, MMU_DATA_LOAD,
                        full_be_ldul_mmu);
 }
 
@@ -2086,7 +2046,7 @@ uint64_t helper_le_ldq_mmu(CPUArchState *env, target_ulong addr,
                            MemOpIdx oi, uintptr_t retaddr)
 {
     validate_memop(oi, MO_LEUQ);
-    return load_helper(env, addr, oi, retaddr, MO_LEUQ, false,
+    return load_helper(env, addr, oi, retaddr, MO_LEUQ, MMU_DATA_LOAD,
                        helper_le_ldq_mmu);
 }
 
@@ -2094,7 +2054,7 @@ uint64_t helper_be_ldq_mmu(CPUArchState *env, target_ulong addr,
                            MemOpIdx oi, uintptr_t retaddr)
 {
     validate_memop(oi, MO_BEUQ);
-    return load_helper(env, addr, oi, retaddr, MO_BEUQ, false,
+    return load_helper(env, addr, oi, retaddr, MO_BEUQ, MMU_DATA_LOAD,
                        helper_be_ldq_mmu);
 }
 
@@ -2290,7 +2250,6 @@ store_helper_unaligned(CPUArchState *env, target_ulong addr, uint64_t val,
                        uintptr_t retaddr, size_t size, uintptr_t mmu_idx,
                        bool big_endian)
 {
-    const size_t tlb_off = offsetof(CPUTLBEntry, addr_write);
     uintptr_t index, index2;
     CPUTLBEntry *entry, *entry2;
     target_ulong page1, page2, tlb_addr, tlb_addr2;
@@ -2312,7 +2271,7 @@ store_helper_unaligned(CPUArchState *env, target_ulong addr, uint64_t val,
 
     tlb_addr2 = tlb_addr_write(entry2);
     if (page1 != page2 && !tlb_hit_page(tlb_addr2, page2)) {
-        if (!victim_tlb_hit(env, mmu_idx, index2, tlb_off, page2)) {
+        if (!victim_tlb_hit(env, mmu_idx, index2, MMU_DATA_STORE, page2)) {
             tlb_fill(env_cpu(env), page2, size2, MMU_DATA_STORE,
                      mmu_idx, retaddr);
             index2 = tlb_index(env, mmu_idx, page2);
@@ -2365,7 +2324,6 @@ static inline void QEMU_ALWAYS_INLINE
 store_helper(CPUArchState *env, target_ulong addr, uint64_t val,
              MemOpIdx oi, uintptr_t retaddr, MemOp op)
 {
-    const size_t tlb_off = offsetof(CPUTLBEntry, addr_write);
     const unsigned a_bits = get_alignment_bits(get_memop(oi));
     const size_t size = memop_size(op);
     uintptr_t mmu_idx = get_mmuidx(oi);
@@ -2388,7 +2346,7 @@ store_helper(CPUArchState *env, target_ulong addr, uint64_t val,
 
     /* If the TLB entry is for a different page, reload and try again.  */
     if (!tlb_hit(tlb_addr, addr)) {
-        if (!victim_tlb_hit(env, mmu_idx, index, tlb_off,
+        if (!victim_tlb_hit(env, mmu_idx, index, MMU_DATA_STORE,
                             addr & TARGET_PAGE_MASK)) {
             tlb_fill(env_cpu(env), addr, size, MMU_DATA_STORE,
                      mmu_idx, retaddr);
@@ -2694,7 +2652,8 @@ void cpu_st16_le_mmu(CPUArchState *env, abi_ptr addr, Int128 val,
 static uint64_t full_ldub_code(CPUArchState *env, target_ulong addr,
                                MemOpIdx oi, uintptr_t retaddr)
 {
-    return load_helper(env, addr, oi, retaddr, MO_8, true, full_ldub_code);
+    return load_helper(env, addr, oi, retaddr, MO_8,
+                       MMU_INST_FETCH, full_ldub_code);
 }
 
 uint32_t cpu_ldub_code(CPUArchState *env, abi_ptr addr)
@@ -2706,7 +2665,8 @@ uint32_t cpu_ldub_code(CPUArchState *env, abi_ptr addr)
 static uint64_t full_lduw_code(CPUArchState *env, target_ulong addr,
                                MemOpIdx oi, uintptr_t retaddr)
 {
-    return load_helper(env, addr, oi, retaddr, MO_TEUW, true, full_lduw_code);
+    return load_helper(env, addr, oi, retaddr, MO_TEUW,
+                       MMU_INST_FETCH, full_lduw_code);
 }
 
 uint32_t cpu_lduw_code(CPUArchState *env, abi_ptr addr)
@@ -2718,7 +2678,8 @@ uint32_t cpu_lduw_code(CPUArchState *env, abi_ptr addr)
 static uint64_t full_ldl_code(CPUArchState *env, target_ulong addr,
                               MemOpIdx oi, uintptr_t retaddr)
 {
-    return load_helper(env, addr, oi, retaddr, MO_TEUL, true, full_ldl_code);
+    return load_helper(env, addr, oi, retaddr, MO_TEUL,
+                       MMU_INST_FETCH, full_ldl_code);
 }
 
 uint32_t cpu_ldl_code(CPUArchState *env, abi_ptr addr)
@@ -2730,7 +2691,8 @@ uint32_t cpu_ldl_code(CPUArchState *env, abi_ptr addr)
 static uint64_t full_ldq_code(CPUArchState *env, target_ulong addr,
                               MemOpIdx oi, uintptr_t retaddr)
 {
-    return load_helper(env, addr, oi, retaddr, MO_TEUQ, true, full_ldq_code);
+    return load_helper(env, addr, oi, retaddr, MO_TEUQ,
+                       MMU_INST_FETCH, full_ldq_code);
 }
 
 uint64_t cpu_ldq_code(CPUArchState *env, abi_ptr addr)
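[Illustration, not part of the patch: the union-as-array technique in
miniature. Named members are aliased by an array so a runtime access
type can select one without offsetof arithmetic, with static asserts
pinning the layout just as the QEMU_BUILD_BUG_ON checks above do. A
minimal C11 sketch; the names are invented for the example:]

    #include <assert.h>
    #include <stddef.h>
    #include <stdint.h>
    #include <stdio.h>

    typedef enum { ACC_LOAD = 0, ACC_STORE = 1, ACC_FETCH = 2 } AccessType;

    typedef struct {
        union {
            struct {
                uint64_t addr_read;
                uint64_t addr_write;
                uint64_t addr_code;
            };
            uint64_t addr_idx[3];   /* aliases the three members above */
        };
    } Entry;

    static uint64_t read_idx(const Entry *e, AccessType t)
    {
        /* The layout must not be rearranged for the aliasing to hold. */
        static_assert(offsetof(Entry, addr_read) ==
                      ACC_LOAD * sizeof(uint64_t), "layout");
        static_assert(offsetof(Entry, addr_write) ==
                      ACC_STORE * sizeof(uint64_t), "layout");
        static_assert(offsetof(Entry, addr_code) ==
                      ACC_FETCH * sizeof(uint64_t), "layout");
        return e->addr_idx[t];
    }

    int main(void)
    {
        Entry e = { .addr_read = 0x1000, .addr_write = 0x2000,
                    .addr_code = 0x3000 };
        printf("%llx %llx %llx\n",
               (unsigned long long)read_idx(&e, ACC_LOAD),
               (unsigned long long)read_idx(&e, ACC_STORE),
               (unsigned long long)read_idx(&e, ACC_FETCH));
        return 0;
    }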
From patchwork Fri Nov 18 09:47:30 2022
From: Richard Henderson <richard.henderson@linaro.org>
To: qemu-devel@nongnu.org
Subject: [PATCH for-8.0 05/29] accel/tcg: Reorg system mode load helpers
Date: Fri, 18 Nov 2022 01:47:30 -0800
Message-Id: <20221118094754.242910-6-richard.henderson@linaro.org>
In-Reply-To: <20221118094754.242910-1-richard.henderson@linaro.org>

Instead of trying to unify all operations on uint64_t, pull out
mmu_lookup() to perform the basic tlb hit and resolution.
Create individual functions to handle access by size.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 accel/tcg/cputlb.c | 606 ++++++++++++++++++++++++++++++---------------
 1 file changed, 413 insertions(+), 193 deletions(-)

diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index 00a2b217e5..c05647f1ba 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -1699,6 +1699,182 @@ bool tlb_plugin_lookup(CPUState *cpu, target_ulong addr, int mmu_idx,
 
 #endif
 
+/*
+ * Probe for a load/store operation.
+ * Return the host address and into @flags.
+ */
+
+typedef struct MMULookupPageData {
+    CPUTLBEntryFull *full;
+    void *haddr;
+    target_ulong addr;
+    int flags;
+    int size;
+} MMULookupPageData;
+
+typedef struct MMULookupLocals {
+    MMULookupPageData page[2];
+    MemOp memop;
+    int mmu_idx;
+} MMULookupLocals;
+
+/**
+ * mmu_lookup1: translate one page
+ * @env: cpu context
+ * @data: lookup parameters
+ * @mmu_idx: virtual address context
+ * @access_type: load/store/code
+ * @ra: return address into tcg generated code, or 0
+ *
+ * Resolve the translation for the one page at @data.addr, filling in
+ * the rest of @data with the results.  If the translation fails,
+ * tlb_fill will longjmp out.  Return true if the softmmu tlb for
+ * @mmu_idx may have resized.
+ */ +static bool mmu_lookup1(CPUArchState *env, MMULookupPageData *data, + int mmu_idx, MMUAccessType access_type, uintptr_t ra) +{ + target_ulong addr = data->addr; + uintptr_t index = tlb_index(env, mmu_idx, addr); + CPUTLBEntry *entry = tlb_entry(env, mmu_idx, addr); + target_ulong tlb_addr = tlb_read_idx(entry, access_type); + bool maybe_resized = false; + + /* If the TLB entry is for a different page, reload and try again. */ + if (!tlb_hit(tlb_addr, addr)) { + if (!victim_tlb_hit(env, mmu_idx, index, access_type, + addr & TARGET_PAGE_MASK)) { + tlb_fill(env_cpu(env), addr, data->size, access_type, mmu_idx, ra); + maybe_resized = true; + index = tlb_index(env, mmu_idx, addr); + entry = tlb_entry(env, mmu_idx, addr); + } + tlb_addr = tlb_read_idx(entry, access_type) & ~TLB_INVALID_MASK; + } + + data->flags = tlb_addr & TLB_FLAGS_MASK; + data->full = &env_tlb(env)->d[mmu_idx].fulltlb[index]; + /* Compute haddr speculatively; depending on flags it might be invalid. */ + data->haddr = (void *)((uintptr_t)addr + entry->addend); + + return maybe_resized; +} + +/** + * mmu_watch_or_dirty + * @env: cpu context + * @data: lookup parameters + * @access_type: load/store/code + * @ra: return address into tcg generated code, or 0 + * + * Trigger watchpoints for @data.addr:@data.size; + * record writes to protected clean pages. + */ +static void mmu_watch_or_dirty(CPUArchState *env, MMULookupPageData *data, + MMUAccessType access_type, uintptr_t ra) +{ + CPUTLBEntryFull *full = data->full; + target_ulong addr = data->addr; + int flags = data->flags; + int size = data->size; + + /* On watchpoint hit, this will longjmp out. */ + if (flags & TLB_WATCHPOINT) { + int wp = access_type == MMU_DATA_STORE ? BP_MEM_WRITE : BP_MEM_READ; + cpu_check_watchpoint(env_cpu(env), addr, size, full->attrs, wp, ra); + flags &= ~TLB_WATCHPOINT; + } + + if (flags & TLB_NOTDIRTY) { + notdirty_write(env_cpu(env), addr, size, full, ra); + flags &= ~TLB_NOTDIRTY; + } + data->flags = flags; +} + +/** + * mmu_lookup: translate page(s) + * @env: cpu context + * @addr: virtual address + * @oi: combined mmu_idx and MemOp + * @ra: return address into tcg generated code, or 0 + * @access_type: load/store/code + * @l: output result + * + * Resolve the translation for the page(s) beginning at @addr, for MemOp.size + * bytes. Return true if the lookup crosses a page boundary. + */ +static inline bool QEMU_ALWAYS_INLINE +mmu_lookup(CPUArchState *env, target_ulong addr, MemOpIdx oi, uintptr_t ra, + MMUAccessType access_type, MMULookupLocals *l) +{ + unsigned a_bits; + bool crosspage; + int flags; + + l->memop = get_memop(oi); + l->mmu_idx = get_mmuidx(oi); + + tcg_debug_assert(l->mmu_idx < NB_MMU_MODES); + + /* Handle CPU specific unaligned behaviour */ + a_bits = get_alignment_bits(l->memop); + if (addr & ((1 << a_bits) - 1)) { + cpu_unaligned_access(env_cpu(env), addr, access_type, l->mmu_idx, ra); + } + + l->page[0].addr = addr; + l->page[0].size = memop_size(l->memop); + l->page[1].addr = (addr + l->page[0].size - 1) & TARGET_PAGE_MASK; + l->page[1].size = 0; + crosspage = (addr ^ l->page[1].addr) & TARGET_PAGE_MASK; + + if (likely(!crosspage)) { + mmu_lookup1(env, &l->page[0], l->mmu_idx, access_type, ra); + + flags = l->page[0].flags; + if (unlikely(flags & (TLB_WATCHPOINT | TLB_NOTDIRTY))) { + mmu_watch_or_dirty(env, &l->page[0], access_type, ra); + } + } else { + /* Finish compute of page crossing. 
*/ + int size1 = l->page[1].addr - addr; + l->page[1].size = l->page[0].size - size1; + l->page[0].size = size1; + + /* + * Lookup both pages, recognizing exceptions from either. If the + * second lookup potentially resized, refresh first CPUTLBEntryFull. + */ + mmu_lookup1(env, &l->page[0], l->mmu_idx, access_type, ra); + if (mmu_lookup1(env, &l->page[1], l->mmu_idx, access_type, ra)) { + uintptr_t index = tlb_index(env, l->mmu_idx, addr); + l->page[0].full = &env_tlb(env)->d[l->mmu_idx].fulltlb[index]; + } + + flags = l->page[0].flags | l->page[1].flags; + if (unlikely(flags & (TLB_WATCHPOINT | TLB_NOTDIRTY))) { + mmu_watch_or_dirty(env, &l->page[0], access_type, ra); + mmu_watch_or_dirty(env, &l->page[1], access_type, ra); + } + } + + return crosspage; +} + +/* + * Since target/sparc is the only user of TLB_BSWAP, and all + * Sparc accesses are aligned, any treatment across two pages + * would be arbitrary. Refuse it until there's a use. + */ +#define assert_no_tlb_bswap_(F) \ + tcg_debug_assert((F & TLB_BSWAP) == 0) +#define assert_no_tlb_bswap \ + do { \ + assert_no_tlb_bswap_(l.page[0].flags); \ + assert_no_tlb_bswap_(l.page[1].flags); \ + } while (0) + /* * Probe for an atomic operation. Do not allow unaligned operations, * or io operations to proceed. Return the host address. @@ -1855,113 +2031,6 @@ load_memop(const void *haddr, MemOp op) } } -static inline uint64_t QEMU_ALWAYS_INLINE -load_helper(CPUArchState *env, target_ulong addr, MemOpIdx oi, - uintptr_t retaddr, MemOp op, MMUAccessType access_type, - FullLoadHelper *full_load) -{ - const unsigned a_bits = get_alignment_bits(get_memop(oi)); - const size_t size = memop_size(op); - uintptr_t mmu_idx = get_mmuidx(oi); - uintptr_t index; - CPUTLBEntry *entry; - target_ulong tlb_addr; - void *haddr; - uint64_t res; - - tcg_debug_assert(mmu_idx < NB_MMU_MODES); - - /* Handle CPU specific unaligned behaviour */ - if (addr & ((1 << a_bits) - 1)) { - cpu_unaligned_access(env_cpu(env), addr, access_type, - mmu_idx, retaddr); - } - - index = tlb_index(env, mmu_idx, addr); - entry = tlb_entry(env, mmu_idx, addr); - tlb_addr = tlb_read_idx(entry, access_type); - - /* If the TLB entry is for a different page, reload and try again. */ - if (!tlb_hit(tlb_addr, addr)) { - if (!victim_tlb_hit(env, mmu_idx, index, access_type, - addr & TARGET_PAGE_MASK)) { - tlb_fill(env_cpu(env), addr, size, - access_type, mmu_idx, retaddr); - index = tlb_index(env, mmu_idx, addr); - entry = tlb_entry(env, mmu_idx, addr); - } - tlb_addr = tlb_read_idx(entry, access_type); - tlb_addr &= ~TLB_INVALID_MASK; - } - - /* Handle anything that isn't just a straight memory access. */ - if (unlikely(tlb_addr & ~TARGET_PAGE_MASK)) { - CPUTLBEntryFull *full; - bool need_swap; - - /* For anything that is unaligned, recurse through full_load. */ - if ((addr & (size - 1)) != 0) { - goto do_unaligned_access; - } - - full = &env_tlb(env)->d[mmu_idx].fulltlb[index]; - - /* Handle watchpoints. */ - if (unlikely(tlb_addr & TLB_WATCHPOINT)) { - /* On watchpoint hit, this will longjmp out. */ - cpu_check_watchpoint(env_cpu(env), addr, size, - full->attrs, BP_MEM_READ, retaddr); - } - - need_swap = size > 1 && (tlb_addr & TLB_BSWAP); - - /* Handle I/O access. 
*/ - if (likely(tlb_addr & TLB_MMIO)) { - return io_readx(env, full, mmu_idx, addr, retaddr, - access_type, op ^ (need_swap * MO_BSWAP)); - } - - haddr = (void *)((uintptr_t)addr + entry->addend); - - /* - * Keep these two load_memop separate to ensure that the compiler - * is able to fold the entire function to a single instruction. - * There is a build-time assert inside to remind you of this. ;-) - */ - if (unlikely(need_swap)) { - return load_memop(haddr, op ^ MO_BSWAP); - } - return load_memop(haddr, op); - } - - /* Handle slow unaligned access (it spans two pages or IO). */ - if (size > 1 - && unlikely((addr & ~TARGET_PAGE_MASK) + size - 1 - >= TARGET_PAGE_SIZE)) { - target_ulong addr1, addr2; - uint64_t r1, r2; - unsigned shift; - do_unaligned_access: - addr1 = addr & ~((target_ulong)size - 1); - addr2 = addr1 + size; - r1 = full_load(env, addr1, oi, retaddr); - r2 = full_load(env, addr2, oi, retaddr); - shift = (addr & (size - 1)) * 8; - - if (memop_big_endian(op)) { - /* Big-endian combine. */ - res = (r1 << shift) | (r2 >> ((size * 8) - shift)); - } else { - /* Little-endian combine. */ - res = (r1 >> shift) | (r2 << ((size * 8) - shift)); - } - return res & MAKE_64BIT_MASK(0, size * 8); - } - - haddr = (void *)((uintptr_t)addr + entry->addend); - return load_memop(haddr, op); -} - /* * For the benefit of TCG generated code, we want to avoid the * complication of ABI-specific return type promotion and always * return a value extended to the register size of the host. @@ -1972,90 +2041,240 @@ load_helper(CPUArchState *env, target_ulong addr, MemOpIdx oi, * We don't bother with this widened value for SOFTMMU_CODE_ACCESS. */ -static uint64_t full_ldub_mmu(CPUArchState *env, target_ulong addr, - MemOpIdx oi, uintptr_t retaddr) +/** + * do_ld_mmio_beN: + * @env: cpu context + * @p: translation parameters + * @ret_be: accumulated data + * @mmu_idx: virtual address context + * @ra: return address into tcg generated code, or 0 + * + * Load @p->size bytes from @p->addr, which is memory-mapped i/o. + * The bytes are concatenated in big-endian order with @ret_be. + */ +static uint64_t do_ld_mmio_beN(CPUArchState *env, MMULookupPageData *p, + uint64_t ret_be, int mmu_idx, + MMUAccessType type, uintptr_t ra) { - validate_memop(oi, MO_UB); - return load_helper(env, addr, oi, retaddr, MO_UB, MMU_DATA_LOAD, - full_ldub_mmu); + CPUTLBEntryFull *full = p->full; + target_ulong addr = p->addr; + int i, size = p->size; + + QEMU_IOTHREAD_LOCK_GUARD(); + for (i = 0; i < size; i++) { + uint8_t x = io_readx(env, full, mmu_idx, addr + i, ra, type, MO_UB); + ret_be = (ret_be << 8) | x; + } + return ret_be; +} + +/** + * do_ld_bytes_beN + * @p: translation parameters + * @ret_be: accumulated data + * + * Load @p->size bytes from @p->haddr, which is RAM. + * The bytes are concatenated in big-endian order with @ret_be. + */ +static uint64_t do_ld_bytes_beN(MMULookupPageData *p, uint64_t ret_be) +{ + uint8_t *haddr = p->haddr; + int i, size = p->size; + + for (i = 0; i < size; i++) { + ret_be = (ret_be << 8) | haddr[i]; + } + return ret_be; +} + +/* + * Wrapper for the above.
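Both helpers above use the same big-endian accumulation, which is what lets a value chain across the two pages: each byte shifts the running result left by 8. A stand-alone model in plain C (not QEMU code):

    #include <assert.h>
    #include <stdint.h>

    static uint64_t acc_beN(const uint8_t *bytes, int size, uint64_t ret_be)
    {
        for (int i = 0; i < size; i++) {
            ret_be = (ret_be << 8) | bytes[i];   /* same step as the helpers */
        }
        return ret_be;
    }

    int main(void)
    {
        const uint8_t part0[2] = { 0x12, 0x34 };   /* bytes on the first page */
        const uint8_t part1[2] = { 0x56, 0x78 };   /* bytes on the second page */
        assert(acc_beN(part1, 2, acc_beN(part0, 2, 0)) == 0x12345678);
        return 0;
    }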
+ */ +static uint64_t do_ld_beN(CPUArchState *env, MMULookupPageData *p, + uint64_t ret_be, int mmu_idx, + MMUAccessType type, uintptr_t ra) +{ + if (unlikely(p->flags & TLB_MMIO)) { + return do_ld_mmio_beN(env, p, ret_be, mmu_idx, type, ra); + } else { + return do_ld_bytes_beN(p, ret_be); + } +} + +static uint8_t do_ld_1(CPUArchState *env, MMULookupPageData *p, int mmu_idx, + MMUAccessType type, uintptr_t ra) +{ + if (unlikely(p->flags & TLB_MMIO)) { + return io_readx(env, p->full, mmu_idx, p->addr, ra, type, MO_UB); + } else { + return *(uint8_t *)p->haddr; + } +} + +static uint8_t do_ld1_mmu(CPUArchState *env, target_ulong addr, MemOpIdx oi, + uintptr_t ra, MMUAccessType access_type) +{ + MMULookupLocals l; + bool crosspage; + + crosspage = mmu_lookup(env, addr, oi, ra, access_type, &l); + tcg_debug_assert(!crosspage); + + return do_ld_1(env, &l.page[0], l.mmu_idx, access_type, ra); } tcg_target_ulong helper_ret_ldub_mmu(CPUArchState *env, target_ulong addr, MemOpIdx oi, uintptr_t retaddr) { - return full_ldub_mmu(env, addr, oi, retaddr); + validate_memop(oi, MO_UB); + return do_ld1_mmu(env, addr, oi, retaddr, MMU_DATA_LOAD); } -static uint64_t full_le_lduw_mmu(CPUArchState *env, target_ulong addr, - MemOpIdx oi, uintptr_t retaddr) +static uint16_t do_ld2_mmu(CPUArchState *env, target_ulong addr, MemOpIdx oi, + uintptr_t ra, MMUAccessType access_type) { - validate_memop(oi, MO_LEUW); - return load_helper(env, addr, oi, retaddr, MO_LEUW, MMU_DATA_LOAD, - full_le_lduw_mmu); + MMULookupLocals l; + bool crosspage; + uint16_t ret; + + crosspage = mmu_lookup(env, addr, oi, ra, access_type, &l); + if (likely(!crosspage)) { + if (l.page[0].flags & TLB_BSWAP) { + l.memop ^= MO_BSWAP; + } + if (unlikely(l.page[0].flags & TLB_MMIO)) { + ret = io_readx(env, l.page[0].full, l.mmu_idx, addr, ra, + access_type, l.memop); + } else { + /* Perform the load host endian, then swap if necessary. 
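The load-host-endian-then-swap idiom used here can be checked in isolation. A stand-alone sketch (not QEMU code) for a big-endian 2-byte access; in QEMU's MemOp encoding, MO_BSWAP is set exactly when guest and host byte orders differ:

    #include <assert.h>
    #include <stdint.h>
    #include <string.h>

    static uint16_t bswap16_model(uint16_t x)
    {
        return (uint16_t)((x << 8) | (x >> 8));
    }

    int main(void)
    {
        const uint8_t mem[2] = { 0x12, 0x34 };   /* big-endian 0x1234 */
        uint16_t ret;

        memcpy(&ret, mem, 2);                    /* host-endian load */
        if (*(const uint8_t *)&(uint16_t){ 1 } == 1) {
            ret = bswap16_model(ret);            /* swap on LE hosts */
        }
        assert(ret == 0x1234);
        return 0;
    }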
*/ + ret = load_memop(l.page[0].haddr, MO_UW); + if (l.memop & MO_BSWAP) { + ret = bswap16(ret); + } + } + } else { + uint8_t a, b; + + assert_no_tlb_bswap; + + a = do_ld_1(env, &l.page[0], l.mmu_idx, access_type, ra); + b = do_ld_1(env, &l.page[1], l.mmu_idx, access_type, ra); + + if ((l.memop & MO_BSWAP) == MO_LE) { + ret = a | (b << 8); + } else { + ret = b | (a << 8); + } + } + return ret; } tcg_target_ulong helper_le_lduw_mmu(CPUArchState *env, target_ulong addr, MemOpIdx oi, uintptr_t retaddr) { - return full_le_lduw_mmu(env, addr, oi, retaddr); -} - -static uint64_t full_be_lduw_mmu(CPUArchState *env, target_ulong addr, - MemOpIdx oi, uintptr_t retaddr) -{ - validate_memop(oi, MO_BEUW); - return load_helper(env, addr, oi, retaddr, MO_BEUW, MMU_DATA_LOAD, - full_be_lduw_mmu); + validate_memop(oi, MO_LEUW); + return do_ld2_mmu(env, addr, oi, retaddr, MMU_DATA_LOAD); } tcg_target_ulong helper_be_lduw_mmu(CPUArchState *env, target_ulong addr, MemOpIdx oi, uintptr_t retaddr) { - return full_be_lduw_mmu(env, addr, oi, retaddr); + validate_memop(oi, MO_BEUW); + return do_ld2_mmu(env, addr, oi, retaddr, MMU_DATA_LOAD); } -static uint64_t full_le_ldul_mmu(CPUArchState *env, target_ulong addr, - MemOpIdx oi, uintptr_t retaddr) +static uint32_t do_ld4_mmu(CPUArchState *env, target_ulong addr, MemOpIdx oi, + uintptr_t ra, MMUAccessType access_type) { - validate_memop(oi, MO_LEUL); - return load_helper(env, addr, oi, retaddr, MO_LEUL, MMU_DATA_LOAD, - full_le_ldul_mmu); + MMULookupLocals l; + bool crosspage; + uint32_t ret; + + crosspage = mmu_lookup(env, addr, oi, ra, access_type, &l); + if (likely(!crosspage)) { + if (l.page[0].flags & TLB_BSWAP) { + l.memop ^= MO_BSWAP; + } + if (unlikely(l.page[0].flags & TLB_MMIO)) { + ret = io_readx(env, l.page[0].full, l.mmu_idx, addr, ra, + access_type, l.memop); + } else { + /* Perform the load host endian. */ + ret = load_memop(l.page[0].haddr, MO_UL); + if (l.memop & MO_BSWAP) { + ret = bswap32(ret); + } + } + } else { + assert_no_tlb_bswap; + ret = do_ld_beN(env, &l.page[0], 0, l.mmu_idx, access_type, ra); + ret = do_ld_beN(env, &l.page[1], ret, l.mmu_idx, access_type, ra); + if ((l.memop & MO_BSWAP) == MO_LE) { + ret = bswap32(ret); + } + } + return ret; } tcg_target_ulong helper_le_ldul_mmu(CPUArchState *env, target_ulong addr, MemOpIdx oi, uintptr_t retaddr) { - return full_le_ldul_mmu(env, addr, oi, retaddr); -} - -static uint64_t full_be_ldul_mmu(CPUArchState *env, target_ulong addr, - MemOpIdx oi, uintptr_t retaddr) -{ - validate_memop(oi, MO_BEUL); - return load_helper(env, addr, oi, retaddr, MO_BEUL, MMU_DATA_LOAD, - full_be_ldul_mmu); + validate_memop(oi, MO_LEUL); + return do_ld4_mmu(env, addr, oi, retaddr, MMU_DATA_LOAD); } tcg_target_ulong helper_be_ldul_mmu(CPUArchState *env, target_ulong addr, MemOpIdx oi, uintptr_t retaddr) { - return full_be_ldul_mmu(env, addr, oi, retaddr); + validate_memop(oi, MO_BEUL); + return do_ld4_mmu(env, addr, oi, retaddr, MMU_DATA_LOAD); +} + +static uint64_t do_ld8_mmu(CPUArchState *env, target_ulong addr, MemOpIdx oi, + uintptr_t ra, MMUAccessType access_type) +{ + MMULookupLocals l; + bool crosspage; + uint64_t ret; + + crosspage = mmu_lookup(env, addr, oi, ra, access_type, &l); + if (likely(!crosspage)) { + if (l.page[0].flags & TLB_BSWAP) { + l.memop ^= MO_BSWAP; + } + if (unlikely(l.page[0].flags & TLB_MMIO)) { + ret = io_readx(env, l.page[0].full, l.mmu_idx, addr, ra, + access_type, l.memop); + } else { + /* Perform the load host endian. 
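For the cross-page else-branch above (and the identical one in do_ld8_mmu below), the two do_ld_beN calls always assemble a big-endian value, and a single swap at the end produces the little-endian result. A stand-alone model:

    #include <assert.h>
    #include <stdint.h>

    static uint32_t bswap32_model(uint32_t x)
    {
        return (x >> 24) | ((x >> 8) & 0x0000ff00)
             | ((x << 8) & 0x00ff0000) | (x << 24);
    }

    int main(void)
    {
        /* Bytes 12 34 56 78 split across two pages. */
        const uint8_t bytes[4] = { 0x12, 0x34, 0x56, 0x78 };
        uint32_t ret = 0;

        for (int i = 0; i < 4; i++) {
            ret = (ret << 8) | bytes[i];          /* big-endian assembly */
        }
        assert(ret == 0x12345678);                /* MO_BE result */
        assert(bswap32_model(ret) == 0x78563412); /* MO_LE result */
        return 0;
    }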
*/ + ret = load_memop(l.page[0].haddr, MO_UQ); + if (l.memop & MO_BSWAP) { + ret = bswap64(ret); + } + } + } else { + assert_no_tlb_bswap; + ret = do_ld_beN(env, &l.page[0], 0, l.mmu_idx, access_type, ra); + ret = do_ld_beN(env, &l.page[1], ret, l.mmu_idx, access_type, ra); + if ((l.memop & MO_BSWAP) == MO_LE) { + ret = bswap64(ret); + } + } + return ret; } uint64_t helper_le_ldq_mmu(CPUArchState *env, target_ulong addr, MemOpIdx oi, uintptr_t retaddr) { validate_memop(oi, MO_LEUQ); - return load_helper(env, addr, oi, retaddr, MO_LEUQ, MMU_DATA_LOAD, - helper_le_ldq_mmu); + return do_ld8_mmu(env, addr, oi, retaddr, MMU_DATA_LOAD); } uint64_t helper_be_ldq_mmu(CPUArchState *env, target_ulong addr, MemOpIdx oi, uintptr_t retaddr) { validate_memop(oi, MO_BEUQ); - return load_helper(env, addr, oi, retaddr, MO_BEUQ, MMU_DATA_LOAD, - helper_be_ldq_mmu); + return do_ld8_mmu(env, addr, oi, retaddr, MMU_DATA_LOAD); } /* @@ -2098,56 +2317,85 @@ tcg_target_ulong helper_be_ldsl_mmu(CPUArchState *env, target_ulong addr, * Load helpers for cpu_ldst.h. */ -static inline uint64_t cpu_load_helper(CPUArchState *env, abi_ptr addr, - MemOpIdx oi, uintptr_t retaddr, - FullLoadHelper *full_load) +static void plugin_load_cb(CPUArchState *env, abi_ptr addr, MemOpIdx oi) { - uint64_t ret; - - ret = full_load(env, addr, oi, retaddr); qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_R); - return ret; } uint8_t cpu_ldb_mmu(CPUArchState *env, abi_ptr addr, MemOpIdx oi, uintptr_t ra) { - return cpu_load_helper(env, addr, oi, ra, full_ldub_mmu); + uint8_t ret; + + validate_memop(oi, MO_UB); + ret = do_ld1_mmu(env, addr, oi, ra, MMU_DATA_LOAD); + plugin_load_cb(env, addr, oi); + return ret; } uint16_t cpu_ldw_be_mmu(CPUArchState *env, abi_ptr addr, MemOpIdx oi, uintptr_t ra) { - return cpu_load_helper(env, addr, oi, ra, full_be_lduw_mmu); + uint16_t ret; + + validate_memop(oi, MO_BEUW); + ret = do_ld2_mmu(env, addr, oi, ra, MMU_DATA_LOAD); + plugin_load_cb(env, addr, oi); + return ret; } uint32_t cpu_ldl_be_mmu(CPUArchState *env, abi_ptr addr, MemOpIdx oi, uintptr_t ra) { - return cpu_load_helper(env, addr, oi, ra, full_be_ldul_mmu); + uint32_t ret; + + validate_memop(oi, MO_BEUL); + ret = do_ld4_mmu(env, addr, oi, ra, MMU_DATA_LOAD); + plugin_load_cb(env, addr, oi); + return ret; } uint64_t cpu_ldq_be_mmu(CPUArchState *env, abi_ptr addr, MemOpIdx oi, uintptr_t ra) { - return cpu_load_helper(env, addr, oi, ra, helper_be_ldq_mmu); + uint64_t ret; + + validate_memop(oi, MO_BEUQ); + ret = do_ld8_mmu(env, addr, oi, ra, MMU_DATA_LOAD); + plugin_load_cb(env, addr, oi); + return ret; } uint16_t cpu_ldw_le_mmu(CPUArchState *env, abi_ptr addr, MemOpIdx oi, uintptr_t ra) { - return cpu_load_helper(env, addr, oi, ra, full_le_lduw_mmu); + uint16_t ret; + + validate_memop(oi, MO_LEUW); + ret = do_ld2_mmu(env, addr, oi, ra, MMU_DATA_LOAD); + plugin_load_cb(env, addr, oi); + return ret; } uint32_t cpu_ldl_le_mmu(CPUArchState *env, abi_ptr addr, MemOpIdx oi, uintptr_t ra) { - return cpu_load_helper(env, addr, oi, ra, full_le_ldul_mmu); + uint32_t ret; + + validate_memop(oi, MO_LEUL); + ret = do_ld4_mmu(env, addr, oi, ra, MMU_DATA_LOAD); + plugin_load_cb(env, addr, oi); + return ret; } uint64_t cpu_ldq_le_mmu(CPUArchState *env, abi_ptr addr, MemOpIdx oi, uintptr_t ra) { - return cpu_load_helper(env, addr, oi, ra, helper_le_ldq_mmu); + uint64_t ret; + + validate_memop(oi, MO_LEUQ); + ret = do_ld8_mmu(env, addr, oi, ra, MMU_DATA_LOAD); + plugin_load_cb(env, addr, oi); + return ret; } Int128 cpu_ld16_be_mmu(CPUArchState *env, 
abi_ptr addr, @@ -2649,54 +2897,26 @@ void cpu_st16_le_mmu(CPUArchState *env, abi_ptr addr, Int128 val, /* Code access functions. */ -static uint64_t full_ldub_code(CPUArchState *env, target_ulong addr, - MemOpIdx oi, uintptr_t retaddr) -{ - return load_helper(env, addr, oi, retaddr, MO_8, - MMU_INST_FETCH, full_ldub_code); -} - uint32_t cpu_ldub_code(CPUArchState *env, abi_ptr addr) { MemOpIdx oi = make_memop_idx(MO_UB, cpu_mmu_index(env, true)); - return full_ldub_code(env, addr, oi, 0); -} - -static uint64_t full_lduw_code(CPUArchState *env, target_ulong addr, - MemOpIdx oi, uintptr_t retaddr) -{ - return load_helper(env, addr, oi, retaddr, MO_TEUW, - MMU_INST_FETCH, full_lduw_code); + return do_ld1_mmu(env, addr, oi, 0, MMU_INST_FETCH); } uint32_t cpu_lduw_code(CPUArchState *env, abi_ptr addr) { MemOpIdx oi = make_memop_idx(MO_TEUW, cpu_mmu_index(env, true)); - return full_lduw_code(env, addr, oi, 0); -} - -static uint64_t full_ldl_code(CPUArchState *env, target_ulong addr, - MemOpIdx oi, uintptr_t retaddr) -{ - return load_helper(env, addr, oi, retaddr, MO_TEUL, - MMU_INST_FETCH, full_ldl_code); + return do_ld2_mmu(env, addr, oi, 0, MMU_INST_FETCH); } uint32_t cpu_ldl_code(CPUArchState *env, abi_ptr addr) { MemOpIdx oi = make_memop_idx(MO_TEUL, cpu_mmu_index(env, true)); - return full_ldl_code(env, addr, oi, 0); -} - -static uint64_t full_ldq_code(CPUArchState *env, target_ulong addr, - MemOpIdx oi, uintptr_t retaddr) -{ - return load_helper(env, addr, oi, retaddr, MO_TEUQ, - MMU_INST_FETCH, full_ldq_code); + return do_ld4_mmu(env, addr, oi, 0, MMU_INST_FETCH); } uint64_t cpu_ldq_code(CPUArchState *env, abi_ptr addr) { MemOpIdx oi = make_memop_idx(MO_TEUQ, cpu_mmu_index(env, true)); - return full_ldq_code(env, addr, oi, 0); + return do_ld8_mmu(env, addr, oi, 0, MMU_INST_FETCH); }

From patchwork Fri Nov 18 09:47:31 2022
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH for-8.0 06/29] accel/tcg: Reorg system mode store helpers
Date: Fri, 18 Nov 2022 01:47:31 -0800
Message-Id: <20221118094754.242910-7-richard.henderson@linaro.org>
In-Reply-To: <20221118094754.242910-1-richard.henderson@linaro.org>
References: <20221118094754.242910-1-richard.henderson@linaro.org>

Instead of trying to unify all operations on uint64_t, use mmu_lookup() to perform the basic tlb hit and resolution. Create individual functions to handle access by size. Signed-off-by: Richard Henderson --- accel/tcg/cputlb.c | 399 ++++++++++++++++++++------------------------- 1 file changed, 181 insertions(+), 218 deletions(-) diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c index c05647f1ba..5562fb82d6 100644 --- a/accel/tcg/cputlb.c +++ b/accel/tcg/cputlb.c @@ -2490,322 +2490,285 @@ store_memop(void *haddr, uint64_t val, MemOp op) } } -static void full_stb_mmu(CPUArchState *env, target_ulong addr, uint64_t val, - MemOpIdx oi, uintptr_t retaddr); - -static void __attribute__((noinline)) -store_helper_unaligned(CPUArchState *env, target_ulong addr, uint64_t val, - uintptr_t retaddr, size_t size, uintptr_t mmu_idx, - bool big_endian) +/** + * do_st_mmio_leN: + * @env: cpu context + * @p: translation parameters + * @val_le: data to store + * @mmu_idx: virtual address context + * @ra: return address into tcg generated code, or 0 + * + * Store @p->size bytes at @p->addr, which is memory-mapped i/o. + * The bytes to store are extracted in little-endian order from @val_le; + * return the bytes of @val_le beyond @p->size that have not been stored. + */ +static uint64_t do_st_mmio_leN(CPUArchState *env, MMULookupPageData *p, + uint64_t val_le, int mmu_idx, uintptr_t ra) { - uintptr_t index, index2; - CPUTLBEntry *entry, *entry2; - target_ulong page1, page2, tlb_addr, tlb_addr2; - MemOpIdx oi; - size_t size2; - int i; + CPUTLBEntryFull *full = p->full; + target_ulong addr = p->addr; + int i, size = p->size; - /* - * Ensure the second page is in the TLB. Note that the first page - * is already guaranteed to be filled, and that the second page - * cannot evict the first. An exception to this rule is PAGE_WRITE_INV - * handling: the first page could have evicted itself.
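The return-value convention documented above, where the not-yet-stored high bytes come back shifted down, is what lets the two page halves chain. A stand-alone model in plain C (not QEMU code):

    #include <assert.h>
    #include <stdint.h>

    /* Store the low size bytes, return the remainder shifted down,
     * ready to hand to the store for the second page. */
    static uint64_t st_leN_model(uint8_t *dst, int size, uint64_t val_le)
    {
        for (int i = 0; i < size; i++, val_le >>= 8) {
            dst[i] = (uint8_t)val_le;
        }
        return val_le;
    }

    int main(void)
    {
        uint8_t page0[2], page1[2];
        uint64_t rest = st_leN_model(page0, 2, 0x12345678);

        (void)st_leN_model(page1, 2, rest);
        assert(page0[0] == 0x78 && page0[1] == 0x56);
        assert(page1[0] == 0x34 && page1[1] == 0x12);
        return 0;
    }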
- */ - page1 = addr & TARGET_PAGE_MASK; - page2 = (addr + size) & TARGET_PAGE_MASK; - size2 = (addr + size) & ~TARGET_PAGE_MASK; - index2 = tlb_index(env, mmu_idx, page2); - entry2 = tlb_entry(env, mmu_idx, page2); - - tlb_addr2 = tlb_addr_write(entry2); - if (page1 != page2 && !tlb_hit_page(tlb_addr2, page2)) { - if (!victim_tlb_hit(env, mmu_idx, index2, MMU_DATA_STORE, page2)) { - tlb_fill(env_cpu(env), page2, size2, MMU_DATA_STORE, - mmu_idx, retaddr); - index2 = tlb_index(env, mmu_idx, page2); - entry2 = tlb_entry(env, mmu_idx, page2); - } - tlb_addr2 = tlb_addr_write(entry2); + QEMU_IOTHREAD_LOCK_GUARD(); + for (i = 0; i < size; i++, val_le >>= 8) { + io_writex(env, full, mmu_idx, val_le, addr + i, ra, MO_UB); } + return val_le; +} - index = tlb_index(env, mmu_idx, addr); - entry = tlb_entry(env, mmu_idx, addr); - tlb_addr = tlb_addr_write(entry); +/** + * do_st_bytes_leN: + * @p: translation parameters + * @val_le: data to store + * + * Store @p->size bytes at @p->haddr, which is RAM. + * The bytes to store are extracted in little-endian order from @val_le; + * return the bytes of @val_le beyond @p->size that have not been stored. + */ +static uint64_t do_st_bytes_leN(MMULookupPageData *p, uint64_t val_le) +{ + uint8_t *haddr = p->haddr; + int i, size = p->size; - /* - * Handle watchpoints. Since this may trap, all checks - * must happen before any store. - */ - if (unlikely(tlb_addr & TLB_WATCHPOINT)) { - cpu_check_watchpoint(env_cpu(env), addr, size - size2, - env_tlb(env)->d[mmu_idx].fulltlb[index].attrs, - BP_MEM_WRITE, retaddr); - } - if (unlikely(tlb_addr2 & TLB_WATCHPOINT)) { - cpu_check_watchpoint(env_cpu(env), page2, size2, - env_tlb(env)->d[mmu_idx].fulltlb[index2].attrs, - BP_MEM_WRITE, retaddr); + for (i = 0; i < size; i++, val_le >>= 8) { + haddr[i] = val_le; } + return val_le; +} - /* - * XXX: not efficient, but simple. - * This loop must go in the forward direction to avoid issues - * with self-modifying code in Windows 64-bit. - */ - oi = make_memop_idx(MO_UB, mmu_idx); - if (big_endian) { - for (i = 0; i < size; ++i) { - /* Big-endian extract. */ - uint8_t val8 = val >> (((size - 1) * 8) - (i * 8)); - full_stb_mmu(env, addr + i, val8, oi, retaddr); - } +/* + * Wrapper for the above. + */ +static uint64_t do_st_leN(CPUArchState *env, MMULookupPageData *p, + uint64_t val_le, int mmu_idx, uintptr_t ra) +{ + if (unlikely(p->flags & TLB_MMIO)) { + return do_st_mmio_leN(env, p, val_le, mmu_idx, ra); } else { - for (i = 0; i < size; ++i) { - /* Little-endian extract. 
*/ - uint8_t val8 = val >> (i * 8); - full_stb_mmu(env, addr + i, val8, oi, retaddr); - } + return do_st_bytes_leN(p, val_le); } } -static inline void QEMU_ALWAYS_INLINE -store_helper(CPUArchState *env, target_ulong addr, uint64_t val, - MemOpIdx oi, uintptr_t retaddr, MemOp op) +static void do_st_1(CPUArchState *env, MMULookupPageData *p, uint8_t val, + int mmu_idx, uintptr_t ra) { - const unsigned a_bits = get_alignment_bits(get_memop(oi)); - const size_t size = memop_size(op); - uintptr_t mmu_idx = get_mmuidx(oi); - uintptr_t index; - CPUTLBEntry *entry; - target_ulong tlb_addr; - void *haddr; - - tcg_debug_assert(mmu_idx < NB_MMU_MODES); - - /* Handle CPU specific unaligned behaviour */ - if (addr & ((1 << a_bits) - 1)) { - cpu_unaligned_access(env_cpu(env), addr, MMU_DATA_STORE, - mmu_idx, retaddr); + if (unlikely(p->flags & TLB_MMIO)) { + io_writex(env, p->full, mmu_idx, val, p->addr, ra, MO_UB); + } else { + *(uint8_t *)p->haddr = val; } - - index = tlb_index(env, mmu_idx, addr); - entry = tlb_entry(env, mmu_idx, addr); - tlb_addr = tlb_addr_write(entry); - - /* If the TLB entry is for a different page, reload and try again. */ - if (!tlb_hit(tlb_addr, addr)) { - if (!victim_tlb_hit(env, mmu_idx, index, MMU_DATA_STORE, - addr & TARGET_PAGE_MASK)) { - tlb_fill(env_cpu(env), addr, size, MMU_DATA_STORE, - mmu_idx, retaddr); - index = tlb_index(env, mmu_idx, addr); - entry = tlb_entry(env, mmu_idx, addr); - } - tlb_addr = tlb_addr_write(entry) & ~TLB_INVALID_MASK; - } - - /* Handle anything that isn't just a straight memory access. */ - if (unlikely(tlb_addr & ~TARGET_PAGE_MASK)) { - CPUTLBEntryFull *full; - bool need_swap; - - /* For anything that is unaligned, recurse through byte stores. */ - if ((addr & (size - 1)) != 0) { - goto do_unaligned_access; - } - - full = &env_tlb(env)->d[mmu_idx].fulltlb[index]; - - /* Handle watchpoints. */ - if (unlikely(tlb_addr & TLB_WATCHPOINT)) { - /* On watchpoint hit, this will longjmp out. */ - cpu_check_watchpoint(env_cpu(env), addr, size, - full->attrs, BP_MEM_WRITE, retaddr); - } - - need_swap = size > 1 && (tlb_addr & TLB_BSWAP); - - /* Handle I/O access. */ - if (tlb_addr & TLB_MMIO) { - io_writex(env, full, mmu_idx, val, addr, retaddr, - op ^ (need_swap * MO_BSWAP)); - return; - } - - /* Ignore writes to ROM. */ - if (unlikely(tlb_addr & TLB_DISCARD_WRITE)) { - return; - } - - /* Handle clean RAM pages. */ - if (tlb_addr & TLB_NOTDIRTY) { - notdirty_write(env_cpu(env), addr, size, full, retaddr); - } - - haddr = (void *)((uintptr_t)addr + entry->addend); - - /* - * Keep these two store_memop separate to ensure that the compiler - * is able to fold the entire function to a single instruction. - * There is a build-time assert inside to remind you of this. ;-) - */ - if (unlikely(need_swap)) { - store_memop(haddr, val, op ^ MO_BSWAP); - } else { - store_memop(haddr, val, op); - } - return; - } - - /* Handle slow unaligned access (it spans two pages or IO). 
*/ - if (size > 1 - && unlikely((addr & ~TARGET_PAGE_MASK) + size - 1 - >= TARGET_PAGE_SIZE)) { - do_unaligned_access: - store_helper_unaligned(env, addr, val, retaddr, size, - mmu_idx, memop_big_endian(op)); - return; - } - - haddr = (void *)((uintptr_t)addr + entry->addend); - store_memop(haddr, val, op); -} - -static void __attribute__((noinline)) -full_stb_mmu(CPUArchState *env, target_ulong addr, uint64_t val, - MemOpIdx oi, uintptr_t retaddr) -{ - validate_memop(oi, MO_UB); - store_helper(env, addr, val, oi, retaddr, MO_UB); } void helper_ret_stb_mmu(CPUArchState *env, target_ulong addr, uint8_t val, - MemOpIdx oi, uintptr_t retaddr) + MemOpIdx oi, uintptr_t ra) { - full_stb_mmu(env, addr, val, oi, retaddr); + MMULookupLocals l; + bool crosspage; + + validate_memop(oi, MO_UB); + crosspage = mmu_lookup(env, addr, oi, ra, MMU_DATA_STORE, &l); + tcg_debug_assert(!crosspage); + + do_st_1(env, &l.page[0], val, l.mmu_idx, ra); } -static void full_le_stw_mmu(CPUArchState *env, target_ulong addr, uint64_t val, - MemOpIdx oi, uintptr_t retaddr) +static void do_st2_mmu(CPUArchState *env, target_ulong addr, uint16_t val, + MemOpIdx oi, uintptr_t ra) { - validate_memop(oi, MO_LEUW); - store_helper(env, addr, val, oi, retaddr, MO_LEUW); + MMULookupLocals l; + bool crosspage; + + crosspage = mmu_lookup(env, addr, oi, ra, MMU_DATA_STORE, &l); + if (likely(!crosspage)) { + if (l.page[0].flags & TLB_BSWAP) { + l.memop ^= MO_BSWAP; + } + if (unlikely(l.page[0].flags & TLB_MMIO)) { + io_writex(env, l.page[0].full, l.mmu_idx, val, addr, ra, l.memop); + } else { + /* Swap to host endian if necessary, then store. */ + if (l.memop & MO_BSWAP) { + val = bswap16(val); + } + store_memop(l.page[0].haddr, val, MO_UW); + } + } else { + uint8_t a, b; + + assert_no_tlb_bswap; + + if ((l.memop & MO_BSWAP) == MO_LE) { + a = val, b = val >> 8; + } else { + b = val, a = val >> 8; + } + do_st_1(env, &l.page[0], a, l.mmu_idx, ra); + do_st_1(env, &l.page[1], b, l.mmu_idx, ra); + } } void helper_le_stw_mmu(CPUArchState *env, target_ulong addr, uint16_t val, MemOpIdx oi, uintptr_t retaddr) { - full_le_stw_mmu(env, addr, val, oi, retaddr); -} - -static void full_be_stw_mmu(CPUArchState *env, target_ulong addr, uint64_t val, - MemOpIdx oi, uintptr_t retaddr) -{ - validate_memop(oi, MO_BEUW); - store_helper(env, addr, val, oi, retaddr, MO_BEUW); + validate_memop(oi, MO_LEUW); + do_st2_mmu(env, addr, val, oi, retaddr); } void helper_be_stw_mmu(CPUArchState *env, target_ulong addr, uint16_t val, MemOpIdx oi, uintptr_t retaddr) { - full_be_stw_mmu(env, addr, val, oi, retaddr); + validate_memop(oi, MO_BEUW); + do_st2_mmu(env, addr, val, oi, retaddr); } -static void full_le_stl_mmu(CPUArchState *env, target_ulong addr, uint64_t val, - MemOpIdx oi, uintptr_t retaddr) +static void do_st4_mmu(CPUArchState *env, target_ulong addr, uint32_t val, + MemOpIdx oi, uintptr_t ra) { - validate_memop(oi, MO_LEUL); - store_helper(env, addr, val, oi, retaddr, MO_LEUL); + MMULookupLocals l; + bool crosspage; + + crosspage = mmu_lookup(env, addr, oi, ra, MMU_DATA_STORE, &l); + if (likely(!crosspage)) { + if (l.page[0].flags & TLB_BSWAP) { + l.memop ^= MO_BSWAP; + } + if (unlikely(l.page[0].flags & TLB_MMIO)) { + io_writex(env, l.page[0].full, l.mmu_idx, val, addr, ra, l.memop); + } else { + /* Swap to host endian if necessary, then store. */ + if (l.memop & MO_BSWAP) { + val = bswap32(val); + } + store_memop(l.page[0].haddr, val, MO_UL); + } + } else { + assert_no_tlb_bswap; + + /* Swap to little endian for simplicity, then store by bytes. 
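The byte split in do_st2_mmu above decides which half of the value lands on each page; checked in isolation (stand-alone sketch, not QEMU code):

    #include <assert.h>
    #include <stdint.h>

    int main(void)
    {
        const uint16_t val = 0x1234;
        uint8_t a, b;        /* a goes to page[0], b to page[1] */

        /* MO_LE: the low byte comes first in memory. */
        a = (uint8_t)val; b = (uint8_t)(val >> 8);
        assert(a == 0x34 && b == 0x12);

        /* Big-endian: mirrored. */
        b = (uint8_t)val; a = (uint8_t)(val >> 8);
        assert(a == 0x12 && b == 0x34);
        return 0;
    }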
*/ + if ((l.memop & MO_BSWAP) != MO_LE) { + val = bswap32(val); + } + val = do_st_leN(env, &l.page[0], val, l.mmu_idx, ra); + (void) do_st_leN(env, &l.page[1], val, l.mmu_idx, ra); + } } void helper_le_stl_mmu(CPUArchState *env, target_ulong addr, uint32_t val, MemOpIdx oi, uintptr_t retaddr) { - full_le_stl_mmu(env, addr, val, oi, retaddr); -} - -static void full_be_stl_mmu(CPUArchState *env, target_ulong addr, uint64_t val, - MemOpIdx oi, uintptr_t retaddr) -{ - validate_memop(oi, MO_BEUL); - store_helper(env, addr, val, oi, retaddr, MO_BEUL); + validate_memop(oi, MO_LEUL); + do_st4_mmu(env, addr, val, oi, retaddr); } void helper_be_stl_mmu(CPUArchState *env, target_ulong addr, uint32_t val, MemOpIdx oi, uintptr_t retaddr) { - full_be_stl_mmu(env, addr, val, oi, retaddr); + validate_memop(oi, MO_BEUL); + do_st4_mmu(env, addr, val, oi, retaddr); +} + +static void do_st8_mmu(CPUArchState *env, target_ulong addr, uint64_t val, + MemOpIdx oi, uintptr_t ra) +{ + MMULookupLocals l; + bool crosspage; + + crosspage = mmu_lookup(env, addr, oi, ra, MMU_DATA_STORE, &l); + if (likely(!crosspage)) { + if (l.page[0].flags & TLB_BSWAP) { + l.memop ^= MO_BSWAP; + } + if (unlikely(l.page[0].flags & TLB_MMIO)) { + io_writex(env, l.page[0].full, l.mmu_idx, val, addr, ra, l.memop); + } else { + /* Swap to host endian if necessary, then store. */ + if (l.memop & MO_BSWAP) { + val = bswap64(val); + } + store_memop(l.page[0].haddr, val, MO_UQ); + } + } else { + assert_no_tlb_bswap; + + /* Swap to little endian for simplicity, then store by bytes. */ + if ((l.memop & MO_BSWAP) != MO_LE) { + val = bswap64(val); + } + val = do_st_leN(env, &l.page[0], val, l.mmu_idx, ra); + (void) do_st_leN(env, &l.page[1], val, l.mmu_idx, ra); + } } void helper_le_stq_mmu(CPUArchState *env, target_ulong addr, uint64_t val, MemOpIdx oi, uintptr_t retaddr) { validate_memop(oi, MO_LEUQ); - store_helper(env, addr, val, oi, retaddr, MO_LEUQ); + do_st8_mmu(env, addr, val, oi, retaddr); } void helper_be_stq_mmu(CPUArchState *env, target_ulong addr, uint64_t val, MemOpIdx oi, uintptr_t retaddr) { validate_memop(oi, MO_BEUQ); - store_helper(env, addr, val, oi, retaddr, MO_BEUQ); + do_st8_mmu(env, addr, val, oi, retaddr); } /* * Store Helpers for cpu_ldst.h */ -typedef void FullStoreHelper(CPUArchState *env, target_ulong addr, - uint64_t val, MemOpIdx oi, uintptr_t retaddr); - -static inline void cpu_store_helper(CPUArchState *env, target_ulong addr, - uint64_t val, MemOpIdx oi, uintptr_t ra, - FullStoreHelper *full_store) +static void plugin_store_cb(CPUArchState *env, abi_ptr addr, MemOpIdx oi) { - full_store(env, addr, val, oi, ra); qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_W); } void cpu_stb_mmu(CPUArchState *env, target_ulong addr, uint8_t val, MemOpIdx oi, uintptr_t retaddr) { - cpu_store_helper(env, addr, val, oi, retaddr, full_stb_mmu); + helper_ret_stb_mmu(env, addr, val, oi, retaddr); + plugin_store_cb(env, addr, oi); } void cpu_stw_be_mmu(CPUArchState *env, target_ulong addr, uint16_t val, MemOpIdx oi, uintptr_t retaddr) { - cpu_store_helper(env, addr, val, oi, retaddr, full_be_stw_mmu); + helper_be_stw_mmu(env, addr, val, oi, retaddr); + plugin_store_cb(env, addr, oi); } void cpu_stl_be_mmu(CPUArchState *env, target_ulong addr, uint32_t val, MemOpIdx oi, uintptr_t retaddr) { - cpu_store_helper(env, addr, val, oi, retaddr, full_be_stl_mmu); + helper_be_stl_mmu(env, addr, val, oi, retaddr); + plugin_store_cb(env, addr, oi); } void cpu_stq_be_mmu(CPUArchState *env, target_ulong addr, uint64_t val, MemOpIdx 
oi, uintptr_t retaddr) { - cpu_store_helper(env, addr, val, oi, retaddr, helper_be_stq_mmu); + helper_be_stq_mmu(env, addr, val, oi, retaddr); + plugin_store_cb(env, addr, oi); } void cpu_stw_le_mmu(CPUArchState *env, target_ulong addr, uint16_t val, MemOpIdx oi, uintptr_t retaddr) { - cpu_store_helper(env, addr, val, oi, retaddr, full_le_stw_mmu); + helper_le_stw_mmu(env, addr, val, oi, retaddr); + plugin_store_cb(env, addr, oi); } void cpu_stl_le_mmu(CPUArchState *env, target_ulong addr, uint32_t val, MemOpIdx oi, uintptr_t retaddr) { - cpu_store_helper(env, addr, val, oi, retaddr, full_le_stl_mmu); + helper_le_stl_mmu(env, addr, val, oi, retaddr); + plugin_store_cb(env, addr, oi); } void cpu_stq_le_mmu(CPUArchState *env, target_ulong addr, uint64_t val, MemOpIdx oi, uintptr_t retaddr) { - cpu_store_helper(env, addr, val, oi, retaddr, helper_le_stq_mmu); + helper_le_stq_mmu(env, addr, val, oi, retaddr); + plugin_store_cb(env, addr, oi); } void cpu_st16_be_mmu(CPUArchState *env, abi_ptr addr, Int128 val,

From patchwork Fri Nov 18 09:47:32 2022
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH for-8.0 07/29] accel/tcg: Honor atomicity of loads
Date: Fri, 18 Nov 2022 01:47:32 -0800
Message-Id: <20221118094754.242910-8-richard.henderson@linaro.org>
In-Reply-To: <20221118094754.242910-1-richard.henderson@linaro.org>
References: <20221118094754.242910-1-richard.henderson@linaro.org>

Create ldst_atomicity.c.inc. Not required for user-only code loads, because we've ensured that the page is read-only before beginning to translate code. Signed-off-by: Richard Henderson --- accel/tcg/cputlb.c | 174 ++++++++--- accel/tcg/user-exec.c | 26 +- accel/tcg/ldst_atomicity.c.inc | 546 +++++++++++++++++++++++++++++++++ 3 files changed, 695 insertions(+), 51 deletions(-) create mode 100644 accel/tcg/ldst_atomicity.c.inc diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c index 5562fb82d6..cdc109b473 100644 --- a/accel/tcg/cputlb.c +++ b/accel/tcg/cputlb.c @@ -1651,6 +1651,9 @@ tb_page_addr_t get_page_addr_code_hostp(CPUArchState *env, target_ulong addr, return qemu_ram_addr_from_host_nofail(p); } +/* Load/store with atomicity primitives. */ +#include "ldst_atomicity.c.inc" + #ifdef CONFIG_PLUGIN /* * Perform a TLB lookup and populate the qemu_plugin_hwaddr structure. @@ -2003,35 +2006,7 @@ static void validate_memop(MemOpIdx oi, MemOp expected) * specifically for reading instructions from system memory. It is * called by the translation loop and in some helpers where the code * is disassembled. It shouldn't be called directly by guest code. - */ - -typedef uint64_t FullLoadHelper(CPUArchState *env, target_ulong addr, - MemOpIdx oi, uintptr_t retaddr); - -static inline uint64_t QEMU_ALWAYS_INLINE -load_memop(const void *haddr, MemOp op) -{ - switch (op) { - case MO_UB: - return ldub_p(haddr); - case MO_BEUW: - return lduw_be_p(haddr); - case MO_LEUW: - return lduw_le_p(haddr); - case MO_BEUL: - return (uint32_t)ldl_be_p(haddr); - case MO_LEUL: - return (uint32_t)ldl_le_p(haddr); - case MO_BEUQ: - return ldq_be_p(haddr); - case MO_LEUQ: - return ldq_le_p(haddr); - default: - qemu_build_not_reached(); - } -} - -/* + * * For the benefit of TCG generated code, we want to avoid the * complication of ABI-specific return type promotion and always * return a value extended to the register size of the host.
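The central idea of the new file is that an aligned, power-of-two sized load can be made single-copy atomic with a single atomic host load; the load_atomic2/4 helpers added below do this with qatomic_read. A minimal stand-alone C11 sketch of the same idea (not QEMU code; C11 atomics stand in for QEMU's qatomic macros):

    #include <assert.h>
    #include <stdatomic.h>
    #include <stdint.h>

    /* One aligned atomic load gives single-copy atomicity. */
    static uint32_t load_atomic4_model(_Atomic uint32_t *p)
    {
        return atomic_load_explicit(p, memory_order_relaxed);
    }

    int main(void)
    {
        _Atomic uint32_t x = 0xdeadbeef;
        assert(load_atomic4_model(&x) == 0xdeadbeef);
        return 0;
    }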
This is @@ -2087,17 +2062,134 @@ static uint64_t do_ld_bytes_beN(MMULookupPageData *p, uint64_t ret_be) return ret_be; } +/** + * do_ld_parts_beN + * @p: translation parameters + * @ret_be: accumulated data + * + * As do_ld_bytes_beN, but atomically on each aligned part. + */ +static uint64_t do_ld_parts_beN(MMULookupPageData *p, uint64_t ret_be) +{ + void *haddr = p->haddr; + int size = p->size; + + do { + uint64_t x; + int n; + + /* + * Find minimum of alignment and size. + * This is slightly stronger than required by MO_ATOM_SUBALIGN, which + * would have only checked the low bits of addr|size once at the start, + * but is just as easy. + */ + switch (((uintptr_t)haddr | size) & 7) { + case 4: + x = cpu_to_be32(load_atomic4(haddr)); + ret_be = (ret_be << 32) | x; + n = 4; + break; + case 2: + case 6: + x = cpu_to_be16(load_atomic2(haddr)); + ret_be = (ret_be << 16) | x; + n = 2; + break; + default: + x = *(uint8_t *)haddr; + ret_be = (ret_be << 8) | x; + n = 1; + break; + case 0: + g_assert_not_reached(); + } + haddr += n; + size -= n; + } while (size != 0); + return ret_be; +} + +/** + * do_ld_whole_be4 + * @p: translation parameters + * @ret_be: accumulated data + * + * As do_ld_bytes_beN, but with one atomic load. + * Four aligned bytes are guaranteed to cover the load. + */ +static uint64_t do_ld_whole_be4(MMULookupPageData *p, uint64_t ret_be) +{ + int o = p->addr & 3; + uint32_t x = load_atomic4(p->haddr - o); + + x = cpu_to_be32(x); + x <<= o * 8; + x >>= (4 - p->size) * 8; + return (ret_be << (p->size * 8)) | x; +} + +/** + * do_ld_whole_be8 + * @p: translation parameters + * @ret_be: accumulated data + * + * As do_ld_bytes_beN, but with one atomic load. + * Eight aligned bytes are guaranteed to cover the load. + */ +static uint64_t do_ld_whole_be8(CPUArchState *env, uintptr_t ra, + MMULookupPageData *p, uint64_t ret_be) +{ + int o = p->addr & 7; + uint64_t x = load_atomic8_or_exit(env, ra, p->haddr - o); + + x = cpu_to_be64(x); + x <<= o * 8; + x >>= (8 - p->size) * 8; + return (ret_be << (p->size * 8)) | x; +} + /* * Wrapper for the above. */ static uint64_t do_ld_beN(CPUArchState *env, MMULookupPageData *p, - uint64_t ret_be, int mmu_idx, - MMUAccessType type, uintptr_t ra) + uint64_t ret_be, int mmu_idx, MMUAccessType type, + MemOp mop, uintptr_t ra) { + MemOp atmax; + if (unlikely(p->flags & TLB_MMIO)) { return do_ld_mmio_beN(env, p, ret_be, mmu_idx, type, ra); - } else { + } + + switch (mop & MO_ATOM_MASK) { + case MO_ATOM_WITHIN16: + /* + * It is a given that we cross a page and therefore there is no + * atomicity for the load as a whole, but there may be a subobject + * as defined by ATMAX which does not cross a 16-byte boundary. + */ + atmax = mop & MO_ATMAX_MASK; + if (atmax == MO_ATMAX_SIZE) { + atmax = mop & MO_SIZE; + } else { + atmax >>= MO_ATMAX_SHIFT; + } + if (unlikely(p->size >= (1 << atmax))) { + if (!HAVE_al8_fast && p->size < 4) { + return do_ld_whole_be4(p, ret_be); + } else { + return do_ld_whole_be8(env, ra, p, ret_be); + } + } + /* fall through */ + case MO_ATOM_IFALIGN: + case MO_ATOM_NONE: return do_ld_bytes_beN(p, ret_be); + case MO_ATOM_SUBALIGN: + return do_ld_parts_beN(p, ret_be); + default: + g_assert_not_reached(); } } @@ -2147,7 +2239,7 @@ static uint16_t do_ld2_mmu(CPUArchState *env, target_ulong addr, MemOpIdx oi, access_type, l.memop); } else { /* Perform the load host endian, then swap if necessary.
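The shift arithmetic in do_ld_whole_be4 above is easy to check with concrete numbers (stand-alone sketch): with the loaded word already in big-endian byte order, shifting left drops the leading neighbour bytes and shifting right keeps only the wanted ones.

    #include <assert.h>
    #include <stdint.h>

    int main(void)
    {
        uint32_t x = 0x11223344;   /* word in big-endian byte order */
        int o = 1, size = 2;       /* access starts at byte 1, 2 bytes wide */

        x <<= o * 8;               /* drop leading bytes: 0x22334400 */
        x >>= (4 - size) * 8;      /* keep wanted bytes:  0x00002233 */
        assert(x == 0x2233);
        return 0;
    }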
*/ - ret = load_memop(l.page[0].haddr, MO_UW); + ret = load_atom_2(env, ra, l.page[0].haddr, l.memop); if (l.memop & MO_BSWAP) { ret = bswap16(ret); } @@ -2200,15 +2292,17 @@ static uint32_t do_ld4_mmu(CPUArchState *env, target_ulong addr, MemOpIdx oi, access_type, l.memop); } else { /* Perform the load host endian. */ - ret = load_memop(l.page[0].haddr, MO_UL); + ret = load_atom_4(env, ra, l.page[0].haddr, l.memop); if (l.memop & MO_BSWAP) { ret = bswap32(ret); } } } else { assert_no_tlb_bswap; - ret = do_ld_beN(env, &l.page[0], 0, l.mmu_idx, access_type, ra); - ret = do_ld_beN(env, &l.page[1], ret, l.mmu_idx, access_type, ra); + ret = do_ld_beN(env, &l.page[0], 0, l.mmu_idx, + access_type, l.memop, ra); + ret = do_ld_beN(env, &l.page[1], ret, l.mmu_idx, + access_type, l.memop, ra); if ((l.memop & MO_BSWAP) == MO_LE) { ret = bswap32(ret); } @@ -2247,15 +2341,17 @@ static uint64_t do_ld8_mmu(CPUArchState *env, target_ulong addr, MemOpIdx oi, access_type, l.memop); } else { /* Perform the load host endian. */ - ret = load_memop(l.page[0].haddr, MO_UQ); + ret = load_atom_8(env, ra, l.page[0].haddr, l.memop); if (l.memop & MO_BSWAP) { ret = bswap64(ret); } } } else { assert_no_tlb_bswap; - ret = do_ld_beN(env, &l.page[0], 0, l.mmu_idx, access_type, ra); - ret = do_ld_beN(env, &l.page[1], ret, l.mmu_idx, access_type, ra); + ret = do_ld_beN(env, &l.page[0], 0, l.mmu_idx, + access_type, l.memop, ra); + ret = do_ld_beN(env, &l.page[1], ret, l.mmu_idx, + access_type, l.memop, ra); if ((l.memop & MO_BSWAP) == MO_LE) { ret = bswap64(ret); } diff --git a/accel/tcg/user-exec.c b/accel/tcg/user-exec.c index a52c7ef826..ec721e5097 100644 --- a/accel/tcg/user-exec.c +++ b/accel/tcg/user-exec.c @@ -296,6 +296,8 @@ static void *cpu_mmu_lookup(CPUArchState *env, target_ulong addr, return ret; } +#include "ldst_atomicity.c.inc" + uint8_t cpu_ldb_mmu(CPUArchState *env, abi_ptr addr, MemOpIdx oi, uintptr_t ra) { @@ -318,10 +320,10 @@ uint16_t cpu_ldw_be_mmu(CPUArchState *env, abi_ptr addr, validate_memop(oi, MO_BEUW); haddr = cpu_mmu_lookup(env, addr, oi, ra, MMU_DATA_LOAD); - ret = lduw_be_p(haddr); + ret = load_atom_2(env, ra, haddr, get_memop(oi)); clear_helper_retaddr(); qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_R); - return ret; + return cpu_to_be16(ret); } uint32_t cpu_ldl_be_mmu(CPUArchState *env, abi_ptr addr, @@ -332,10 +334,10 @@ uint32_t cpu_ldl_be_mmu(CPUArchState *env, abi_ptr addr, validate_memop(oi, MO_BEUL); haddr = cpu_mmu_lookup(env, addr, oi, ra, MMU_DATA_LOAD); - ret = ldl_be_p(haddr); + ret = load_atom_4(env, ra, haddr, get_memop(oi)); clear_helper_retaddr(); qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_R); - return ret; + return cpu_to_be32(ret); } uint64_t cpu_ldq_be_mmu(CPUArchState *env, abi_ptr addr, @@ -346,10 +348,10 @@ uint64_t cpu_ldq_be_mmu(CPUArchState *env, abi_ptr addr, validate_memop(oi, MO_BEUQ); haddr = cpu_mmu_lookup(env, addr, oi, ra, MMU_DATA_LOAD); - ret = ldq_be_p(haddr); + ret = load_atom_8(env, ra, haddr, get_memop(oi)); clear_helper_retaddr(); qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_R); - return ret; + return cpu_to_be64(ret); } uint16_t cpu_ldw_le_mmu(CPUArchState *env, abi_ptr addr, @@ -360,10 +362,10 @@ uint16_t cpu_ldw_le_mmu(CPUArchState *env, abi_ptr addr, validate_memop(oi, MO_LEUW); haddr = cpu_mmu_lookup(env, addr, oi, ra, MMU_DATA_LOAD); - ret = lduw_le_p(haddr); + ret = load_atom_2(env, ra, haddr, get_memop(oi)); clear_helper_retaddr(); qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, 
QEMU_PLUGIN_MEM_R); - return ret; + return cpu_to_le16(ret); } uint32_t cpu_ldl_le_mmu(CPUArchState *env, abi_ptr addr, @@ -374,10 +376,10 @@ uint32_t cpu_ldl_le_mmu(CPUArchState *env, abi_ptr addr, validate_memop(oi, MO_LEUL); haddr = cpu_mmu_lookup(env, addr, oi, ra, MMU_DATA_LOAD); - ret = ldl_le_p(haddr); + ret = load_atom_4(env, ra, haddr, get_memop(oi)); clear_helper_retaddr(); qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_R); - return ret; + return cpu_to_le32(ret); } uint64_t cpu_ldq_le_mmu(CPUArchState *env, abi_ptr addr, @@ -388,10 +390,10 @@ uint64_t cpu_ldq_le_mmu(CPUArchState *env, abi_ptr addr, validate_memop(oi, MO_LEUQ); haddr = cpu_mmu_lookup(env, addr, oi, ra, MMU_DATA_LOAD); - ret = ldq_le_p(haddr); + ret = load_atom_8(env, ra, haddr, get_memop(oi)); clear_helper_retaddr(); qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_R); - return ret; + return cpu_to_le64(ret); } Int128 cpu_ld16_be_mmu(CPUArchState *env, abi_ptr addr, diff --git a/accel/tcg/ldst_atomicity.c.inc b/accel/tcg/ldst_atomicity.c.inc new file mode 100644 index 0000000000..decc9a2a16 --- /dev/null +++ b/accel/tcg/ldst_atomicity.c.inc @@ -0,0 +1,546 @@ +/* + * Routines common to user and system emulation of load/store. + * + * Copyright (c) 2022 Linaro, Ltd. + * + * SPDX-License-Identifier: GPL-2.0-or-later + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + */ + +#ifdef CONFIG_ATOMIC64 +# define HAVE_al8 true +#else +# define HAVE_al8 false +#endif +#define HAVE_al8_fast (ATOMIC_REG_SIZE >= 8) + +#if defined(CONFIG_ATOMIC128) +# define HAVE_al16_fast true +#else +# define HAVE_al16_fast false +#endif + +/** + * required_atomicity: + * + * Return the lg2 bytes of atomicity required by @memop for @p. + * If the operation must be split into two operations to be + * examined separately for atomicity, return -lg2. + */ +static int required_atomicity(CPUArchState *env, uintptr_t p, MemOp memop) +{ + int atmax = memop & MO_ATMAX_MASK; + int size = memop & MO_SIZE; + unsigned tmp; + + if (atmax == MO_ATMAX_SIZE) { + atmax = size; + } else { + atmax >>= MO_ATMAX_SHIFT; + } + + switch (memop & MO_ATOM_MASK) { + case MO_ATOM_IFALIGN: + tmp = (1 << atmax) - 1; + if (p & tmp) { + return MO_8; + } + break; + case MO_ATOM_NONE: + return MO_8; + case MO_ATOM_SUBALIGN: + tmp = p & -p; + if (tmp != 0 && tmp < atmax) { + atmax = tmp; + } + break; + case MO_ATOM_WITHIN16: + tmp = p & 15; + if (tmp + (1 << size) <= 16) { + atmax = size; + } else if (atmax < size && tmp + (1 << atmax) != 16) { + /* + * Paired load/store, where the pairs aren't aligned. + * One of the two must still be handled atomically. + */ + atmax = -atmax; + } + break; + default: + g_assert_not_reached(); + } + + /* + * Here we have the architectural atomicity of the operation. + * However, when executing in a serial context, we need no extra + * host atomicity in order to avoid racing. This reduction + * avoids looping with cpu_loop_exit_atomic. + */ + if (cpu_in_serial_context(env_cpu(env))) { + return MO_8; + } + return atmax; +} + +/** + * load_atomic2: + * @pv: host address + * + * Atomically load 2 aligned bytes from @pv. + */ +static inline uint16_t load_atomic2(void *pv) +{ + uint16_t *p = __builtin_assume_aligned(pv, 2); + return qatomic_read(p); +} + +/** + * load_atomic4: + * @pv: host address + * + * Atomically load 4 aligned bytes from @pv. 
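The MO_ATOM_IFALIGN case of required_atomicity() above reduces to a simple alignment test. A stand-alone model (assuming, as in QEMU, that MO_8 is 0):

    #include <assert.h>
    #include <stdint.h>

    /* Keep the full atomicity if aligned, else degrade to bytes. */
    static int ifalign_atomicity(uintptr_t p, int atmax_lg2)
    {
        uintptr_t mask = ((uintptr_t)1 << atmax_lg2) - 1;
        return (p & mask) ? 0 /* MO_8 */ : atmax_lg2;
    }

    int main(void)
    {
        assert(ifalign_atomicity(0x1000, 2) == 2);  /* aligned 4 bytes */
        assert(ifalign_atomicity(0x1001, 2) == 0);  /* misaligned */
        return 0;
    }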
+ */ +static inline uint32_t load_atomic4(void *pv) +{ + uint32_t *p = __builtin_assume_aligned(pv, 4); + return qatomic_read(p); +} + +/** + * load_atomic8: + * @pv: host address + * + * Atomically load 8 aligned bytes from @pv. + */ +static inline uint64_t load_atomic8(void *pv) +{ + uint64_t *p = __builtin_assume_aligned(pv, 8); + + qemu_build_assert(HAVE_al8); + return qatomic_read__nocheck(p); +} + +/** + * load_atomic16: + * @pv: host address + * + * Atomically load 16 aligned bytes from @pv. + */ +static inline Int128 load_atomic16(void *pv) +{ +#ifdef CONFIG_ATOMIC128 + __uint128_t *p = __builtin_assume_aligned(pv, 16); + Int128Alias r; + + r.u = qatomic_read__nocheck(p); + return r.s; +#else + qemu_build_not_reached(); +#endif +} + +/** + * load_atomic8_or_exit: + * @env: cpu context + * @ra: host unwind address + * @pv: host address + * + * Atomically load 8 aligned bytes from @pv. + * If this is not possible, longjmp out to restart serially. + */ +static uint64_t load_atomic8_or_exit(CPUArchState *env, uintptr_t ra, void *pv) +{ + uint64_t *p = __builtin_assume_aligned(pv, 8); + + if (HAVE_al8) { + return load_atomic8(p); + } + +#ifdef CONFIG_USER_ONLY + /* + * If the page is not writable, then assume the value is immutable + * and requires no locking. This ignores the case of MAP_SHARED with + * another process, because the fallback start_exclusive solution + * provides no protection across processes. + */ + if (!page_check_range(h2g(p), 8, PAGE_WRITE)) { + return *p; + } +#endif + + /* Ultimate fallback: re-execute in serial context. */ + cpu_loop_exit_atomic(env_cpu(env), ra); +} + +/** + * load_atomic16_or_exit: + * @env: cpu context + * @ra: host unwind address + * @pv: host address + * + * Atomically load 16 aligned bytes from @pv. + * If this is not possible, longjmp out to restart serially. + */ +static Int128 load_atomic16_or_exit(CPUArchState *env, uintptr_t ra, void *pv) +{ + Int128 *p = __builtin_assume_aligned(pv, 16); + + if (HAVE_al16_fast) { + return load_atomic16(p); + } + +#ifdef CONFIG_USER_ONLY + /* + * We can only use cmpxchg to emulate a load if the page is writable. + * If the page is not writable, then assume the value is immutable + * and requires no locking. This ignores the case of MAP_SHARED with + * another process, because the fallback start_exclusive solution + * provides no protection across processes. + */ + if (!page_check_range(h2g(p), 16, PAGE_WRITE)) { + return *p; + } +#endif + + /* + * In system mode all guest pages are writable, and for user-only + * we have just checked writability. Try cmpxchg. + */ +#if defined(CONFIG_CMPXCHG128) + /* Swap 0 with 0, with the side-effect of returning the old value. */ + { + Int128Alias r; + r.u = __sync_val_compare_and_swap_16((__uint128_t *)p, 0, 0); + return r.s; + } +#endif + + /* Ultimate fallback: re-execute in serial context. */ + cpu_loop_exit_atomic(env_cpu(env), ra); +} + +/** + * load_atom_extract_al4x2: + * @pv: host address + * + * Load 4 bytes from @p, from two sequential atomic 4-byte loads. + */ +static uint32_t load_atom_extract_al4x2(void *pv) +{ + uintptr_t pi = (uintptr_t)pv; + int sh = (pi & 3) * 8; + uint32_t a, b; + + pv = (void *)(pi & ~3); + a = load_atomic4(pv); + b = load_atomic4(pv + 4); + + if (HOST_BIG_ENDIAN) { + return (a << sh) | (b >> (-sh & 31)); + } else { + return (a >> sh) | (b << (-sh & 31)); + } +} + +/** + * load_atom_extract_al8x2: + * @pv: host address + * + * Load 8 bytes from @p, from two sequential atomic 8-byte loads. 
+ */ +static uint64_t load_atom_extract_al8x2(void *pv) +{ + uintptr_t pi = (uintptr_t)pv; + int sh = (pi & 7) * 8; + uint64_t a, b; + + pv = (void *)(pi & ~7); + a = load_atomic8(pv); + b = load_atomic8(pv + 8); + + if (HOST_BIG_ENDIAN) { + return (a << sh) | (b >> (-sh & 63)); + } else { + return (a >> sh) | (b << (-sh & 63)); + } +} + +/** + * load_atom_extract_al8: + * @pv: host address + * @s: object size in bytes, @s <= 4. + * + * Atomically load @s bytes from @p, when p % s != 0, and [p, p+s-1] does + * not cross an 8-byte boundary. This means that we can perform an atomic + * 8-byte load and extract. + * The value is returned in the low bits of a uint32_t. + */ +static uint32_t load_atom_extract_al8(void *pv, int s) +{ + uintptr_t pi = (uintptr_t)pv; + int o = pi & 7; + int shr = (HOST_BIG_ENDIAN ? 8 - s - o : o) * 8; + + pv = (void *)(pi & ~7); + return load_atomic8(pv) >> shr; +} + +/** + * load_atom_extract_al16_or_exit: + * @env: cpu context + * @ra: host unwind address + * @p: host address + * @s: object size in bytes, @s <= 8. + * + * Atomically load @s bytes from @p, when p % 16 < 8 + * and p % 16 + s > 8. I.e. does not cross a 16-byte + * boundary, but *does* cross an 8-byte boundary. + * This is the slow version, so we must have eliminated + * any faster load_atom_extract_al8 case. + * + * If this is not possible, longjmp out to restart serially. + */ +static uint64_t load_atom_extract_al16_or_exit(CPUArchState *env, uintptr_t ra, + void *pv, int s) +{ + uintptr_t pi = (uintptr_t)pv; + int o = pi & 7; + int shr = (HOST_BIG_ENDIAN ? 16 - s - o : o) * 8; + Int128 r; + + /* + * Note constraints above: p & 8 must be clear. + * Provoke SIGBUS if possible otherwise. + */ + pv = (void *)(pi & ~7); + r = load_atomic16_or_exit(env, ra, pv); + + r = int128_urshift(r, shr); + return int128_getlo(r); +} + +/** + * load_atom_extract_al16_or_al8: + * @p: host address + * @s: object size in bytes, @s <= 8. + * + * Load @s bytes from @p, when p % s != 0. If [p, p+s-1] does not + * cross an 16-byte boundary then the access must be 16-byte atomic, + * otherwise the access must be 8-byte atomic. + */ +static inline uint64_t load_atom_extract_al16_or_al8(void *pv, int s) +{ +#if defined(CONFIG_ATOMIC128) + uintptr_t pi = (uintptr_t)pv; + int o = pi & 7; + int shr = (HOST_BIG_ENDIAN ? 16 - s - o : o) * 8; + __uint128_t r; + + pv = (void *)(pi & ~7); + if (pi & 8) { + uint64_t *p8 = __builtin_assume_aligned(pv, 16, 8); + uint64_t a = qatomic_read__nocheck(p8); + uint64_t b = qatomic_read__nocheck(p8 + 1); + + if (HOST_BIG_ENDIAN) { + r = ((__uint128_t)a << 64) | b; + } else { + r = ((__uint128_t)b << 64) | a; + } + } else { + __uint128_t *p16 = __builtin_assume_aligned(pv, 16, 0); + r = qatomic_read__nocheck(p16); + } + return r >> shr; +#else + qemu_build_not_reached(); +#endif +} + +/** + * load_atom_4_by_2: + * @pv: host address + * + * Load 4 bytes from @pv, with two 2-byte atomic loads. + */ +static inline uint32_t load_atom_4_by_2(void *pv) +{ + uint32_t a = load_atomic2(pv); + uint32_t b = load_atomic2(pv + 2); + + if (HOST_BIG_ENDIAN) { + return (a << 16) | b; + } else { + return (b << 16) | a; + } +} + +/** + * load_atom_8_by_2: + * @pv: host address + * + * Load 8 bytes from @pv, with four 2-byte atomic loads. 
+ */ +static inline uint64_t load_atom_8_by_2(void *pv) +{ + uint32_t a = load_atom_4_by_2(pv); + uint32_t b = load_atom_4_by_2(pv + 4); + + if (HOST_BIG_ENDIAN) { + return ((uint64_t)a << 32) | b; + } else { + return ((uint64_t)b << 32) | a; + } +} + +/** + * load_atom_8_by_4: + * @pv: host address + * + * Load 8 bytes from @pv, with two 4-byte atomic loads. + */ +static inline uint64_t load_atom_8_by_4(void *pv) +{ + uint32_t a = load_atomic4(pv); + uint32_t b = load_atomic4(pv + 4); + + if (HOST_BIG_ENDIAN) { + return ((uint64_t)a << 32) | b; + } else { + return ((uint64_t)b << 32) | a; + } +} + +/** + * load_atom_2: + * @p: host address + * @memop: the full memory op + * + * Load 2 bytes from @p, honoring the atomicity of @memop. + */ +static uint16_t load_atom_2(CPUArchState *env, uintptr_t ra, + void *pv, MemOp memop) +{ + uintptr_t pi = (uintptr_t)pv; + int atmax; + + if (likely((pi & 1) == 0)) { + return load_atomic2(pv); + } + if (HAVE_al16_fast) { + return load_atom_extract_al16_or_al8(pv, 2); + } + + atmax = required_atomicity(env, pi, memop); + switch (atmax) { + case MO_8: + return lduw_he_p(pv); + case MO_16: + /* The only case remaining is MO_ATOM_WITHIN16. */ + if (!HAVE_al8_fast && (pi & 3) == 1) { + /* Big or little endian, we want the middle two bytes. */ + return load_atomic4(pv - 1) >> 8; + } + if (unlikely((pi & 15) != 7)) { + return load_atom_extract_al8(pv, 2); + } + return load_atom_extract_al16_or_exit(env, ra, pv, 2); + default: + g_assert_not_reached(); + } +} + +/** + * load_atom_4: + * @p: host address + * @memop: the full memory op + * + * Load 4 bytes from @p, honoring the atomicity of @memop. + */ +static uint32_t load_atom_4(CPUArchState *env, uintptr_t ra, + void *pv, MemOp memop) +{ + uintptr_t pi = (uintptr_t)pv; + int atmax; + + if (likely((pi & 3) == 0)) { + return load_atomic4(pv); + } + if (HAVE_al16_fast) { + return load_atom_extract_al16_or_al8(pv, 4); + } + + atmax = required_atomicity(env, pi, memop); + switch (atmax) { + case MO_8: + case MO_16: + case -MO_16: + /* + * For MO_ATOM_IFALIGN, this is more atomicity than required, + * but it's trivially supported on all hosts, better than 4 + * individual byte loads (when the host requires alignment), + * and overlaps with the MO_ATOM_SUBALIGN case of p % 2 == 0. + */ + return load_atom_extract_al4x2(pv); + case MO_32: + if (!(pi & 4)) { + return load_atom_extract_al8(pv, 4); + } + return load_atom_extract_al16_or_exit(env, ra, pv, 4); + default: + g_assert_not_reached(); + } +} + +/** + * load_atom_8: + * @p: host address + * @memop: the full memory op + * + * Load 8 bytes from @p, honoring the atomicity of @memop. + */ +static uint64_t load_atom_8(CPUArchState *env, uintptr_t ra, + void *pv, MemOp memop) +{ + uintptr_t pi = (uintptr_t)pv; + int atmax; + + /* + * If the host does not support 8-byte atomics, wait until we have + * examined the atomicity parameters below. 
+ */
+    if (HAVE_al8 && likely((pi & 7) == 0)) {
+        return load_atomic8(pv);
+    }
+    if (HAVE_al16_fast) {
+        return load_atom_extract_al16_or_al8(pv, 8);
+    }
+
+    atmax = required_atomicity(env, pi, memop);
+    if (atmax == MO_64) {
+        if (!HAVE_al8 && (pi & 7) == 0) {
+            return load_atomic8_or_exit(env, ra, pv);
+        }
+        return load_atom_extract_al16_or_exit(env, ra, pv, 8);
+    }
+    if (HAVE_al8_fast) {
+        return load_atom_extract_al8x2(pv);
+    }
+    switch (atmax) {
+    case MO_8:
+        return ldq_he_p(pv);
+    case MO_16:
+        return load_atom_8_by_2(pv);
+    case MO_32:
+        return load_atom_8_by_4(pv);
+    case -MO_32:
+        if (HAVE_al8) {
+            return load_atom_extract_al8x2(pv);
+        }
+        cpu_loop_exit_atomic(env_cpu(env), ra);
+    default:
+        g_assert_not_reached();
+    }
+}
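The extract trick used by load_atom_extract_al4x2 and load_atom_extract_al8x2 above is easiest to see in isolation. Below is a minimal standalone sketch of the 4-byte case; it is not part of the patch, uses raw __atomic builtins instead of QEMU's qatomic_* wrappers, assumes a little-endian GCC/Clang host, and the demo_* names are invented for illustration.

#include <stdint.h>
#include <string.h>
#include <stdio.h>

/* Two aligned atomic 4-byte loads bracket the misaligned value;
   shifts merge the two halves.  Same shape as load_atom_extract_al4x2. */
static uint32_t demo_extract_al4x2(const void *pv)
{
    uintptr_t pi = (uintptr_t)pv;
    int sh = (pi & 3) * 8;      /* only reached with sh != 0 */
    const uint32_t *p = (const uint32_t *)(pi & ~(uintptr_t)3);
    uint32_t a = __atomic_load_n(p, __ATOMIC_RELAXED);
    uint32_t b = __atomic_load_n(p + 1, __ATOMIC_RELAXED);

    /* Little-endian merge; the real code also handles big-endian. */
    return (a >> sh) | (b << (-sh & 31));
}

int main(void)
{
    _Alignas(8) uint8_t buf[8] = { 0, 1, 2, 3, 4, 5, 6, 7 };
    uint32_t got = demo_extract_al4x2(buf + 1);
    uint32_t ref;

    memcpy(&ref, buf + 1, sizeof(ref));   /* non-atomic reference */
    printf("got %08x, want %08x\n", got, ref);
    return got != ref;
}

Each 4-byte half is read atomically, so a concurrent aligned 4-byte store is never observed torn within a half; that is the guarantee the MO_ATMAX machinery asks for when the required atomicity is smaller than the full access.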
From patchwork Fri Nov 18 09:47:33 2022
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 626102
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH for-8.0 08/29] accel/tcg: Honor atomicity of stores
Date: Fri, 18 Nov 2022 01:47:33 -0800
Message-Id: <20221118094754.242910-9-richard.henderson@linaro.org>
In-Reply-To: <20221118094754.242910-1-richard.henderson@linaro.org>
References: <20221118094754.242910-1-richard.henderson@linaro.org>

Signed-off-by: Richard Henderson
---
 accel/tcg/cputlb.c             | 177 +++++++++++++----
 accel/tcg/user-exec.c          |  12 +-
 accel/tcg/ldst_atomicity.c.inc | 336 +++++++++++++++++++++++++++++++++
 3 files changed, 480 insertions(+), 45 deletions(-)

diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index cdc109b473..69f8a25a7f 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -2556,36 +2556,6 @@ Int128 cpu_ld16_le_mmu(CPUArchState *env, abi_ptr addr,
  * Store Helpers
  */
 
-static inline void QEMU_ALWAYS_INLINE
-store_memop(void *haddr, uint64_t val, MemOp op)
-{
-    switch (op) {
-    case MO_UB:
-        stb_p(haddr, val);
-        break;
-    case MO_BEUW:
-        stw_be_p(haddr, val);
-        break;
-    case MO_LEUW:
-        stw_le_p(haddr, val);
-        break;
-    case MO_BEUL:
-        stl_be_p(haddr, val);
-        break;
-    case MO_LEUL:
-        stl_le_p(haddr, val);
-        break;
-    case MO_BEUQ:
-        stq_be_p(haddr, val);
-        break;
-    case MO_LEUQ:
-        stq_le_p(haddr, val);
-        break;
-    default:
-        qemu_build_not_reached();
-    }
-}
-
 /**
  * do_st_mmio_leN:
  * @env: cpu context
@@ -2632,16 +2602,145 @@ static uint64_t do_st_bytes_leN(MMULookupPageData *p, uint64_t val_le)
     return val_le;
 }
 
+/**
+ * do_st_parts_leN
+ * @p: translation parameters
+ * @val_le: data to store
+ *
+ * As do_st_bytes_leN, but atomically on each aligned part.
+ */
+static uint64_t do_st_parts_leN(MMULookupPageData *p, uint64_t val_le)
+{
+    void *haddr = p->haddr;
+    int size = p->size;
+
+    do {
+        int n;
+
+        /* Find minimum of alignment and size */
+        switch (((uintptr_t)haddr | size) & 7) {
+        case 4:
+            store_atomic4(haddr, le32_to_cpu(val_le));
+            val_le >>= 32;
+            n = 4;
+            break;
+        case 2:
+        case 6:
+            store_atomic2(haddr, le16_to_cpu(val_le));
+            val_le >>= 16;
+            n = 2;
+            break;
+        default:
+            stb_p(haddr, val_le);
+            val_le >>= 8;
+            n = 1;
+            break;
+        case 0:
+            g_assert_not_reached();
+        }
+        haddr += n;
+        size -= n;
+    } while (size != 0);
+    return val_le;
+}
+
+/**
+ * do_st_whole_le4
+ * @p: translation parameters
+ * @val_le: data to store
+ *
+ * As do_st_bytes_leN, but atomically on each aligned part.
+ * Four aligned bytes are guaranteed to cover the store.
+ */ +static uint64_t do_st_whole_le4(MMULookupPageData *p, uint64_t val_le) +{ + int sz = p->size * 8; + int o = p->addr & 3; + int sh = o * 8; + uint32_t m = MAKE_64BIT_MASK(0, sz); + uint32_t v; + + if (HOST_BIG_ENDIAN) { + v = bswap32(val_le) >> sh; + m = bswap32(m) >> sh; + } else { + v = val_le << sh; + m <<= sh; + } + store_atom_insert_al4(p->haddr - o, v, m); + return val_le >> sz; +} + +/** + * do_st_whole_le8 + * @p: translation parameters + * @val_le: data to store + * + * As do_st_bytes_leN, but atomically on each aligned part. + * Eight aligned bytes are guaranteed to cover the store. + */ +static uint64_t do_st_whole_le8(MMULookupPageData *p, uint64_t val_le) +{ + int sz = p->size * 8; + int o = p->addr & 7; + int sh = o * 8; + uint64_t m = MAKE_64BIT_MASK(0, sz); + uint64_t v; + + if (HOST_BIG_ENDIAN) { + v = bswap64(val_le) >> sh; + m = bswap64(m) >> sh; + } else { + v = val_le << sh; + m <<= sh; + } + store_atom_insert_al8(p->haddr - o, v, m); + return val_le >> sz; +} + /* * Wrapper for the above. */ static uint64_t do_st_leN(CPUArchState *env, MMULookupPageData *p, - uint64_t val_le, int mmu_idx, uintptr_t ra) + uint64_t val_le, int mmu_idx, + MemOp mop, uintptr_t ra) { + MemOp atmax; + if (unlikely(p->flags & TLB_MMIO)) { return do_st_mmio_leN(env, p, val_le, mmu_idx, ra); - } else { + } + + switch (mop & MO_ATOM_MASK) { + case MO_ATOM_WITHIN16: + /* + * It is a given that we cross a page and therefore there is no + * atomicity for the load as a whole, but there may be a subobject + * as defined by ATMAX which does not cross a 16-byte boundary. + */ + atmax = mop & MO_ATMAX_MASK; + if (atmax == MO_ATMAX_SIZE) { + atmax = mop & MO_SIZE; + } else { + atmax >>= MO_ATMAX_SHIFT; + } + if (unlikely(p->size >= (1 << atmax))) { + if (!HAVE_al8_fast && p->size <= 4) { + return do_st_whole_le4(p, val_le); + } else if (HAVE_al8) { + return do_st_whole_le8(p, val_le); + } else { + cpu_loop_exit_atomic(env_cpu(env), ra); + } + } + /* fall through */ + case MO_ATOM_IFALIGN: + case MO_ATOM_NONE: return do_st_bytes_leN(p, val_le); + case MO_ATOM_SUBALIGN: + return do_st_parts_leN(p, val_le); + default: + g_assert_not_reached(); } } @@ -2686,7 +2785,7 @@ static void do_st2_mmu(CPUArchState *env, target_ulong addr, uint16_t val, if (l.memop & MO_BSWAP) { val = bswap16(val); } - store_memop(l.page[0].haddr, val, MO_UW); + store_atom_2(env, ra, l.page[0].haddr, l.memop, val); } } else { uint8_t a, b; @@ -2735,7 +2834,7 @@ static void do_st4_mmu(CPUArchState *env, target_ulong addr, uint32_t val, if (l.memop & MO_BSWAP) { val = bswap32(val); } - store_memop(l.page[0].haddr, val, MO_UL); + store_atom_4(env, ra, l.page[0].haddr, l.memop, val); } } else { assert_no_tlb_bswap; @@ -2744,8 +2843,8 @@ static void do_st4_mmu(CPUArchState *env, target_ulong addr, uint32_t val, if ((l.memop & MO_BSWAP) != MO_LE) { val = bswap32(val); } - val = do_st_leN(env, &l.page[0], val, l.mmu_idx, ra); - (void) do_st_leN(env, &l.page[1], val, l.mmu_idx, ra); + val = do_st_leN(env, &l.page[0], val, l.mmu_idx, l.memop, ra); + (void) do_st_leN(env, &l.page[1], val, l.mmu_idx, l.memop, ra); } } @@ -2781,7 +2880,7 @@ static void do_st8_mmu(CPUArchState *env, target_ulong addr, uint64_t val, if (l.memop & MO_BSWAP) { val = bswap64(val); } - store_memop(l.page[0].haddr, val, MO_UQ); + store_atom_8(env, ra, l.page[0].haddr, l.memop, val); } } else { assert_no_tlb_bswap; @@ -2790,8 +2889,8 @@ static void do_st8_mmu(CPUArchState *env, target_ulong addr, uint64_t val, if ((l.memop & MO_BSWAP) != MO_LE) { val = 
bswap64(val); } - val = do_st_leN(env, &l.page[0], val, l.mmu_idx, ra); - (void) do_st_leN(env, &l.page[1], val, l.mmu_idx, ra); + val = do_st_leN(env, &l.page[0], val, l.mmu_idx, l.memop, ra); + (void) do_st_leN(env, &l.page[1], val, l.mmu_idx, l.memop, ra); } } diff --git a/accel/tcg/user-exec.c b/accel/tcg/user-exec.c index ec721e5097..ddba8c9dd7 100644 --- a/accel/tcg/user-exec.c +++ b/accel/tcg/user-exec.c @@ -451,7 +451,7 @@ void cpu_stw_be_mmu(CPUArchState *env, abi_ptr addr, uint16_t val, validate_memop(oi, MO_BEUW); haddr = cpu_mmu_lookup(env, addr, oi, ra, MMU_DATA_STORE); - stw_be_p(haddr, val); + store_atom_2(env, ra, haddr, get_memop(oi), be16_to_cpu(val)); clear_helper_retaddr(); qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_W); } @@ -463,7 +463,7 @@ void cpu_stl_be_mmu(CPUArchState *env, abi_ptr addr, uint32_t val, validate_memop(oi, MO_BEUL); haddr = cpu_mmu_lookup(env, addr, oi, ra, MMU_DATA_STORE); - stl_be_p(haddr, val); + store_atom_4(env, ra, haddr, get_memop(oi), be32_to_cpu(val)); clear_helper_retaddr(); qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_W); } @@ -475,7 +475,7 @@ void cpu_stq_be_mmu(CPUArchState *env, abi_ptr addr, uint64_t val, validate_memop(oi, MO_BEUQ); haddr = cpu_mmu_lookup(env, addr, oi, ra, MMU_DATA_STORE); - stq_be_p(haddr, val); + store_atom_8(env, ra, haddr, get_memop(oi), be64_to_cpu(val)); clear_helper_retaddr(); qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_W); } @@ -487,7 +487,7 @@ void cpu_stw_le_mmu(CPUArchState *env, abi_ptr addr, uint16_t val, validate_memop(oi, MO_LEUW); haddr = cpu_mmu_lookup(env, addr, oi, ra, MMU_DATA_STORE); - stw_le_p(haddr, val); + store_atom_2(env, ra, haddr, get_memop(oi), le16_to_cpu(val)); clear_helper_retaddr(); qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_W); } @@ -499,7 +499,7 @@ void cpu_stl_le_mmu(CPUArchState *env, abi_ptr addr, uint32_t val, validate_memop(oi, MO_LEUL); haddr = cpu_mmu_lookup(env, addr, oi, ra, MMU_DATA_STORE); - stl_le_p(haddr, val); + store_atom_4(env, ra, haddr, get_memop(oi), le32_to_cpu(val)); clear_helper_retaddr(); qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_W); } @@ -511,7 +511,7 @@ void cpu_stq_le_mmu(CPUArchState *env, abi_ptr addr, uint64_t val, validate_memop(oi, MO_LEUQ); haddr = cpu_mmu_lookup(env, addr, oi, ra, MMU_DATA_STORE); - stq_le_p(haddr, val); + store_atom_8(env, ra, haddr, get_memop(oi), le64_to_cpu(val)); clear_helper_retaddr(); qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_W); } diff --git a/accel/tcg/ldst_atomicity.c.inc b/accel/tcg/ldst_atomicity.c.inc index decc9a2a16..8876c16371 100644 --- a/accel/tcg/ldst_atomicity.c.inc +++ b/accel/tcg/ldst_atomicity.c.inc @@ -21,6 +21,12 @@ #else # define HAVE_al16_fast false #endif +#if defined(CONFIG_ATOMIC128) || defined(CONFIG_CMPXCHG128) +# define HAVE_al16 true +#else +# define HAVE_al16 false +#endif + /** * required_atomicity: @@ -544,3 +550,333 @@ static uint64_t load_atom_8(CPUArchState *env, uintptr_t ra, g_assert_not_reached(); } } + +/** + * store_atomic2: + * @pv: host address + * @val: value to store + * + * Atomically store 2 aligned bytes to @pv. + */ +static inline void store_atomic2(void *pv, uint16_t val) +{ + uint16_t *p = __builtin_assume_aligned(pv, 2); + qatomic_set(p, val); +} + +/** + * store_atomic4: + * @pv: host address + * @val: value to store + * + * Atomically store 4 aligned bytes to @pv. 
+ */
+static inline void store_atomic4(void *pv, uint32_t val)
+{
+    uint32_t *p = __builtin_assume_aligned(pv, 4);
+    qatomic_set(p, val);
+}
+
+/**
+ * store_atomic8:
+ * @pv: host address
+ * @val: value to store
+ *
+ * Atomically store 8 aligned bytes to @pv.
+ */
+static inline void store_atomic8(void *pv, uint64_t val)
+{
+    uint64_t *p = __builtin_assume_aligned(pv, 8);
+
+    qemu_build_assert(HAVE_al8);
+    qatomic_set__nocheck(p, val);
+}
+
+/**
+ * store_atom_4_by_2:
+ * @pv: host address
+ * @val: value to store
+ *
+ * Store 4 bytes to @pv, with two 2-byte atomic stores.
+ */
+static inline void store_atom_4_by_2(void *pv, uint32_t val)
+{
+    store_atomic2(pv, val >> (HOST_BIG_ENDIAN ? 16 : 0));
+    store_atomic2(pv + 2, val >> (HOST_BIG_ENDIAN ? 0 : 16));
+}
+
+/**
+ * store_atom_8_by_2:
+ * @pv: host address
+ * @val: value to store
+ *
+ * Store 8 bytes to @pv, with four 2-byte atomic stores.
+ */
+static inline void store_atom_8_by_2(void *pv, uint64_t val)
+{
+    store_atom_4_by_2(pv, val >> (HOST_BIG_ENDIAN ? 32 : 0));
+    store_atom_4_by_2(pv + 4, val >> (HOST_BIG_ENDIAN ? 0 : 32));
+}
+
+/**
+ * store_atom_8_by_4:
+ * @pv: host address
+ * @val: value to store
+ *
+ * Store 8 bytes to @pv, with two 4-byte atomic stores.
+ */
+static inline void store_atom_8_by_4(void *pv, uint64_t val)
+{
+    store_atomic4(pv, val >> (HOST_BIG_ENDIAN ? 32 : 0));
+    store_atomic4(pv + 4, val >> (HOST_BIG_ENDIAN ? 0 : 32));
+}
+
+/**
+ * store_atom_insert_al4:
+ * @p: host address
+ * @val: shifted value to store
+ * @msk: mask for value to store
+ *
+ * Atomically store @val to @p, masked by @msk.
+ */
+static void store_atom_insert_al4(uint32_t *p, uint32_t val, uint32_t msk)
+{
+    uint32_t old, new;
+
+    p = __builtin_assume_aligned(p, 4);
+    old = qatomic_read(p);
+    do {
+        new = (old & ~msk) | val;
+    } while (!__atomic_compare_exchange_n(p, &old, new, true,
+                                          __ATOMIC_RELAXED, __ATOMIC_RELAXED));
+}
+
+/**
+ * store_atom_insert_al8:
+ * @p: host address
+ * @val: shifted value to store
+ * @msk: mask for value to store
+ *
+ * Atomically store @val to @p masked by @msk.
+ */
+static void store_atom_insert_al8(uint64_t *p, uint64_t val, uint64_t msk)
+{
+    uint64_t old, new;
+
+    qemu_build_assert(HAVE_al8);
+    p = __builtin_assume_aligned(p, 8);
+    old = qatomic_read__nocheck(p);
+    do {
+        new = (old & ~msk) | val;
+    } while (!__atomic_compare_exchange_n(p, &old, new, true,
+                                          __ATOMIC_RELAXED, __ATOMIC_RELAXED));
+}
+
+/**
+ * store_atom_insert_al16:
+ * @p: host address
+ * @val: shifted value to store
+ * @msk: mask for value to store
+ *
+ * Atomically store @val to @p masked by @msk.
+ */
+static void store_atom_insert_al16(Int128 *ps, Int128Alias val, Int128Alias msk)
+{
+#if defined(CONFIG_ATOMIC128)
+    __uint128_t *pu, old, new;
+
+    /* With CONFIG_ATOMIC128, we can avoid the memory barriers. */
+    pu = __builtin_assume_aligned(ps, 16);
+    old = *pu;
+    do {
+        new = (old & ~msk.u) | val.u;
+    } while (!__atomic_compare_exchange_n(pu, &old, new, true,
+                                          __ATOMIC_RELAXED, __ATOMIC_RELAXED));
+#elif defined(CONFIG_CMPXCHG128)
+    __uint128_t *pu, old, new;
+
+    /*
+     * Without CONFIG_ATOMIC128, __atomic_compare_exchange_n will always
+     * defer to libatomic, so we must use __sync_val_compare_and_swap_16
+     * and accept the sequential consistency that comes with it.
+ */ + pu = __builtin_assume_aligned(ps, 16); + do { + old = *pu; + new = (old & ~msk.u) | val.u; + } while (!__sync_bool_compare_and_swap_16(pu, old, new)); +#else + qemu_build_not_reached(); +#endif +} + +/** + * store_atom_2: + * @p: host address + * @val: the value to store + * @memop: the full memory op + * + * Store 2 bytes to @p, honoring the atomicity of @memop. + */ +static void store_atom_2(CPUArchState *env, uintptr_t ra, + void *pv, MemOp memop, uint16_t val) +{ + uintptr_t pi = (uintptr_t)pv; + MemOp atmax; + + if (likely((pi & 1) == 0)) { + store_atomic2(pv, val); + return; + } + + atmax = required_atomicity(env, pi, memop); + if (atmax == MO_8) { + stw_he_p(pv, val); + return; + } + + /* The only case remaining is MO_ATOM_WITHIN16. */ + if (!HAVE_al8_fast && (pi & 3) == 1) { + /* Big or little endian, we want the middle two bytes. */ + store_atom_insert_al4(pv - 1, val << 8, 0x00ffff00); + return; + } + + if ((pi & 15) != 7) { + if (HAVE_al8) { + int sh = (pi & 7) * 8; + uint64_t v, m; + + pv = (void *)(pi & ~7); + if (HOST_BIG_ENDIAN) { + v = (uint64_t)val << (48 - sh); + m = 0xffffull << (48 - sh); + } else { + v = (uint64_t)val << sh; + m = 0xffffull << sh; + } + store_atom_insert_al8(pv, v, m); + return; + } + } else { + if (HAVE_al16) { + Int128 v, m; + + /* Big or little endian, we want the middle two bytes. */ + v = int128_lshift(int128_make64(val), 56); + m = int128_lshift(int128_make64(0xffff), 56); + store_atom_insert_al16(pv - 7, v, m); + return; + } + } + + cpu_loop_exit_atomic(env_cpu(env), ra); +} + +/** + * store_atom_4: + * @p: host address + * @val: the value to store + * @memop: the full memory op + * + * Store 4 bytes to @p, honoring the atomicity of @memop. + */ +static void store_atom_4(CPUArchState *env, uintptr_t ra, + void *pv, MemOp memop, uint32_t val) +{ + uintptr_t pi = (uintptr_t)pv; + MemOp atmax; + + if (likely((pi & 3) == 0)) { + store_atomic4(pv, val); + return; + } + + atmax = required_atomicity(env, pi, memop); + switch (atmax) { + case MO_8: + stl_he_p(pv, val); + return; + case MO_16: + store_atom_4_by_2(pv, val); + return; + case MO_32: + if ((pi & 7) < 4) { + if (HAVE_al8) { + int sh = (pi & 7) * 8; + uint64_t v, m; + + pv = (void *)(pi & ~7); + if (HOST_BIG_ENDIAN) { + v = (uint64_t)val << (32 - sh); + m = 0xffffffffull << (32 - sh); + } else { + v = (uint64_t)val << sh; + m = 0xffffffffull << sh; + } + store_atom_insert_al8(pv, v, m); + return; + } + } else { + if (HAVE_al16) { + int sh = (pi & 7) * 8; + Int128 v, m; + + v = int128_make64(val); + m = int128_make64(0xffffffffull); + v = int128_lshift(v, HOST_BIG_ENDIAN ? 96 - sh : sh); + m = int128_lshift(m, HOST_BIG_ENDIAN ? 96 - sh : sh); + + pv = (void *)(pi & ~15); + store_atom_insert_al16(pv, v, m); + return; + } + } + cpu_loop_exit_atomic(env_cpu(env), ra); + default: + g_assert_not_reached(); + } +} + +/** + * store_atom_8: + * @p: host address + * @val: the value to store + * @memop: the full memory op + * + * Store 8 bytes to @p, honoring the atomicity of @memop. 
+ */
+static void store_atom_8(CPUArchState *env, uintptr_t ra,
+                         void *pv, MemOp memop, uint64_t val)
+{
+    uintptr_t pi = (uintptr_t)pv;
+    MemOp atmax;
+
+    if (HAVE_al8 && likely((pi & 7) == 0)) {
+        store_atomic8(pv, val);
+        return;
+    }
+
+    atmax = required_atomicity(env, pi, memop);
+    switch (atmax) {
+    case MO_8:
+        stq_he_p(pv, val);
+        return;
+    case MO_16:
+        store_atom_8_by_2(pv, val);
+        return;
+    case MO_32:
+        store_atom_8_by_4(pv, val);
+        return;
+    case MO_64:
+        if (HAVE_al16) {
+            int sh = (pi & 7) * 8;
+            Int128 v, m;
+
+            v = int128_make64(val);
+            m = int128_make64(-1ull);
+            v = int128_lshift(v, HOST_BIG_ENDIAN ? 64 - sh : sh);
+            m = int128_lshift(m, HOST_BIG_ENDIAN ? 64 - sh : sh);
+
+            pv = (void *)(pi & ~15);
+            store_atom_insert_al16(pv, v, m);
+            return;
+        }
+        cpu_loop_exit_atomic(env_cpu(env), ra);
+    default:
+        g_assert_not_reached();
+    }
+}
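The masked compare-and-swap loop behind store_atom_insert_al4 and store_atom_insert_al8 can likewise be demonstrated standalone. The sketch below is not part of the patch; it uses raw __atomic builtins and the invented name demo_insert_al8. If any other byte of the aligned word changes between the read and the cmpxchg, the exchange fails and the loop retries with the fresh value, so neighbouring bytes are never clobbered.

#include <stdint.h>
#include <stdio.h>

/* Same loop shape as store_atom_insert_al8: read, merge, cmpxchg. */
static void demo_insert_al8(uint64_t *p, uint64_t val, uint64_t msk)
{
    uint64_t old = __atomic_load_n(p, __ATOMIC_RELAXED);
    uint64_t new;

    do {
        new = (old & ~msk) | (val & msk);
    } while (!__atomic_compare_exchange_n(p, &old, new, true,
                                          __ATOMIC_RELAXED, __ATOMIC_RELAXED));
}

int main(void)
{
    _Alignas(8) uint64_t word = 0x1122334455667788ull;

    /* Insert the 16-bit value 0xaabb at byte offset 3 (LE host view). */
    demo_insert_al8(&word, (uint64_t)0xaabb << 24, 0xffffull << 24);
    printf("%016llx\n", (unsigned long long)word);  /* 112233aabb667788 */
    return 0;
}

The 16-byte variant in the patch needs CONFIG_ATOMIC128 or CONFIG_CMPXCHG128 for the same reason: the compare-and-swap itself must be natively supported by the host for the insert to be atomic.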
From patchwork Fri Nov 18 09:47:34 2022
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 626119
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH for-8.0 09/29] tcg/tci: Use cpu_{ld,st}_mmu
Date: Fri, 18 Nov 2022 01:47:34 -0800
Message-Id: <20221118094754.242910-10-richard.henderson@linaro.org>
In-Reply-To: <20221118094754.242910-1-richard.henderson@linaro.org>
References: <20221118094754.242910-1-richard.henderson@linaro.org>

Unify the softmmu and the user-only paths by using the
official memory interface.  Avoid double logging of memory
operations to plugins by relying on the ones within the
cpu_*_mmu functions.

Signed-off-by: Richard Henderson
Reviewed-by: Philippe Mathieu-Daudé
---
 tcg/tcg-op.c |   9 +++-
 tcg/tci.c    | 127 ++++++++------------------------------------------
 2 files changed, 26 insertions(+), 110 deletions(-)

diff --git a/tcg/tcg-op.c b/tcg/tcg-op.c
index e7e4951a3c..1f81c3dbb3 100644
--- a/tcg/tcg-op.c
+++ b/tcg/tcg-op.c
@@ -2914,7 +2914,12 @@ static void tcg_gen_req_mo(TCGBar type)
 
 static inline TCGv plugin_prep_mem_callbacks(TCGv vaddr)
 {
-#ifdef CONFIG_PLUGIN
+    /*
+     * With TCI, we get memory tracing via cpu_{ld,st}_mmu.
+     * No need to instrument memory operations inline, and
+     * we don't want to log the same memory operation twice.
+     */
+#if defined(CONFIG_PLUGIN) && !defined(CONFIG_TCG_INTERPRETER)
     if (tcg_ctx->plugin_insn != NULL) {
         /* Save a copy of the vaddr for use after a load.
*/ TCGv temp = tcg_temp_new(); @@ -2928,7 +2933,7 @@ static inline TCGv plugin_prep_mem_callbacks(TCGv vaddr) static void plugin_gen_mem_callbacks(TCGv vaddr, MemOpIdx oi, enum qemu_plugin_mem_rw rw) { -#ifdef CONFIG_PLUGIN +#if defined(CONFIG_PLUGIN) && !defined(CONFIG_TCG_INTERPRETER) if (tcg_ctx->plugin_insn != NULL) { qemu_plugin_meminfo_t info = make_plugin_meminfo(oi, rw); plugin_gen_empty_mem_callback(vaddr, info); diff --git a/tcg/tci.c b/tcg/tci.c index 022fe9d0f8..52fdd3f5ec 100644 --- a/tcg/tci.c +++ b/tcg/tci.c @@ -293,87 +293,34 @@ static uint64_t tci_qemu_ld(CPUArchState *env, target_ulong taddr, MemOp mop = get_memop(oi); uintptr_t ra = (uintptr_t)tb_ptr; -#ifdef CONFIG_SOFTMMU switch (mop & (MO_BSWAP | MO_SSIZE)) { case MO_UB: - return helper_ret_ldub_mmu(env, taddr, oi, ra); + return cpu_ldb_mmu(env, taddr, oi, ra); case MO_SB: - return helper_ret_ldsb_mmu(env, taddr, oi, ra); + return (int8_t)cpu_ldb_mmu(env, taddr, oi, ra); case MO_LEUW: - return helper_le_lduw_mmu(env, taddr, oi, ra); + return cpu_ldw_le_mmu(env, taddr, oi, ra); case MO_LESW: - return helper_le_ldsw_mmu(env, taddr, oi, ra); + return (int16_t)cpu_ldw_le_mmu(env, taddr, oi, ra); case MO_LEUL: - return helper_le_ldul_mmu(env, taddr, oi, ra); + return cpu_ldl_le_mmu(env, taddr, oi, ra); case MO_LESL: - return helper_le_ldsl_mmu(env, taddr, oi, ra); + return (int32_t)cpu_ldl_le_mmu(env, taddr, oi, ra); case MO_LEUQ: - return helper_le_ldq_mmu(env, taddr, oi, ra); + return cpu_ldq_le_mmu(env, taddr, oi, ra); case MO_BEUW: - return helper_be_lduw_mmu(env, taddr, oi, ra); + return cpu_ldw_be_mmu(env, taddr, oi, ra); case MO_BESW: - return helper_be_ldsw_mmu(env, taddr, oi, ra); + return (int16_t)cpu_ldw_be_mmu(env, taddr, oi, ra); case MO_BEUL: - return helper_be_ldul_mmu(env, taddr, oi, ra); + return cpu_ldl_be_mmu(env, taddr, oi, ra); case MO_BESL: - return helper_be_ldsl_mmu(env, taddr, oi, ra); + return (int32_t)cpu_ldl_be_mmu(env, taddr, oi, ra); case MO_BEUQ: - return helper_be_ldq_mmu(env, taddr, oi, ra); + return cpu_ldq_be_mmu(env, taddr, oi, ra); default: g_assert_not_reached(); } -#else - void *haddr = g2h(env_cpu(env), taddr); - unsigned a_mask = (1u << get_alignment_bits(mop)) - 1; - uint64_t ret; - - set_helper_retaddr(ra); - if (taddr & a_mask) { - helper_unaligned_ld(env, taddr); - } - switch (mop & (MO_BSWAP | MO_SSIZE)) { - case MO_UB: - ret = ldub_p(haddr); - break; - case MO_SB: - ret = ldsb_p(haddr); - break; - case MO_LEUW: - ret = lduw_le_p(haddr); - break; - case MO_LESW: - ret = ldsw_le_p(haddr); - break; - case MO_LEUL: - ret = (uint32_t)ldl_le_p(haddr); - break; - case MO_LESL: - ret = (int32_t)ldl_le_p(haddr); - break; - case MO_LEUQ: - ret = ldq_le_p(haddr); - break; - case MO_BEUW: - ret = lduw_be_p(haddr); - break; - case MO_BESW: - ret = ldsw_be_p(haddr); - break; - case MO_BEUL: - ret = (uint32_t)ldl_be_p(haddr); - break; - case MO_BESL: - ret = (int32_t)ldl_be_p(haddr); - break; - case MO_BEUQ: - ret = ldq_be_p(haddr); - break; - default: - g_assert_not_reached(); - } - clear_helper_retaddr(); - return ret; -#endif } static void tci_qemu_st(CPUArchState *env, target_ulong taddr, uint64_t val, @@ -382,67 +329,31 @@ static void tci_qemu_st(CPUArchState *env, target_ulong taddr, uint64_t val, MemOp mop = get_memop(oi); uintptr_t ra = (uintptr_t)tb_ptr; -#ifdef CONFIG_SOFTMMU switch (mop & (MO_BSWAP | MO_SIZE)) { case MO_UB: - helper_ret_stb_mmu(env, taddr, val, oi, ra); + cpu_stb_mmu(env, taddr, val, oi, ra); break; case MO_LEUW: - helper_le_stw_mmu(env, taddr, val, oi, ra); + 
cpu_stw_le_mmu(env, taddr, val, oi, ra);
         break;
     case MO_LEUL:
-        helper_le_stl_mmu(env, taddr, val, oi, ra);
+        cpu_stl_le_mmu(env, taddr, val, oi, ra);
         break;
     case MO_LEUQ:
-        helper_le_stq_mmu(env, taddr, val, oi, ra);
+        cpu_stq_le_mmu(env, taddr, val, oi, ra);
         break;
     case MO_BEUW:
-        helper_be_stw_mmu(env, taddr, val, oi, ra);
+        cpu_stw_be_mmu(env, taddr, val, oi, ra);
         break;
     case MO_BEUL:
-        helper_be_stl_mmu(env, taddr, val, oi, ra);
+        cpu_stl_be_mmu(env, taddr, val, oi, ra);
         break;
     case MO_BEUQ:
-        helper_be_stq_mmu(env, taddr, val, oi, ra);
+        cpu_stq_be_mmu(env, taddr, val, oi, ra);
         break;
     default:
         g_assert_not_reached();
     }
-#else
-    void *haddr = g2h(env_cpu(env), taddr);
-    unsigned a_mask = (1u << get_alignment_bits(mop)) - 1;
-
-    set_helper_retaddr(ra);
-    if (taddr & a_mask) {
-        helper_unaligned_st(env, taddr);
-    }
-    switch (mop & (MO_BSWAP | MO_SIZE)) {
-    case MO_UB:
-        stb_p(haddr, val);
-        break;
-    case MO_LEUW:
-        stw_le_p(haddr, val);
-        break;
-    case MO_LEUL:
-        stl_le_p(haddr, val);
-        break;
-    case MO_LEUQ:
-        stq_le_p(haddr, val);
-        break;
-    case MO_BEUW:
-        stw_be_p(haddr, val);
-        break;
-    case MO_BEUL:
-        stl_be_p(haddr, val);
-        break;
-    case MO_BEUQ:
-        stq_be_p(haddr, val);
-        break;
-    default:
-        g_assert_not_reached();
-    }
-    clear_helper_retaddr();
-#endif
 }
 
 #if TCG_TARGET_REG_BITS == 64
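One detail of the tci_qemu_ld conversion above deserves a note: the signed cases (MO_SB, MO_LESW, MO_BESW, MO_LESL, MO_BESL) reuse the unsigned cpu_ld*_mmu entry points and recover sign extension with a cast, e.g. (int16_t)cpu_ldw_le_mmu(...). A minimal illustration, not part of the patch (demo_ldsw is an invented name):

#include <stdint.h>
#include <stdio.h>

/* Casting through int16_t sign-extends when the result is widened
   back to uint64_t, which is all a signed 16-bit load needs. */
static uint64_t demo_ldsw(uint16_t raw)
{
    return (int16_t)raw;
}

int main(void)
{
    printf("%016llx\n", (unsigned long long)demo_ldsw(0x8000));
    /* prints ffffffffffff8000 */
    return 0;
}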
From patchwork Fri Nov 18 09:47:35 2022
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 626118
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH for-8.0 10/29] tcg: Unify helper_{be,le}_{ld,st}*
Date: Fri, 18 Nov 2022 01:47:35 -0800
Message-Id: <20221118094754.242910-11-richard.henderson@linaro.org>
In-Reply-To: <20221118094754.242910-1-richard.henderson@linaro.org>
References: <20221118094754.242910-1-richard.henderson@linaro.org>

With the current structure of cputlb.c, there is no difference
between the little-endian and big-endian entry points, aside
from the assert.  Unify the pairs of functions.

Signed-off-by: Richard Henderson
Reviewed-by: Philippe Mathieu-Daudé
---
 include/tcg/tcg-ldst.h           |  60 ++++------
 accel/tcg/cputlb.c               | 190 ++++++++++---------------------
 docs/devel/loads-stores.rst      |  36 ++----
 tcg/aarch64/tcg-target.c.inc     |  39 +++----
 tcg/arm/tcg-target.c.inc         |  45 +++-----
 tcg/i386/tcg-target.c.inc        |  40 +++----
 tcg/loongarch64/tcg-target.c.inc |  25 ++--
 tcg/mips/tcg-target.c.inc        |  40 +++----
 tcg/ppc/tcg-target.c.inc         |  30 ++---
 tcg/riscv/tcg-target.c.inc       |  51 +++------
 tcg/s390x/tcg-target.c.inc       |  38 +++----
 tcg/sparc64/tcg-target.c.inc     |  37 +++---
 12 files changed, 226 insertions(+), 405 deletions(-)

diff --git a/include/tcg/tcg-ldst.h b/include/tcg/tcg-ldst.h
index 2ba22bd5fe..56fa7afe5e 100644
--- a/include/tcg/tcg-ldst.h
+++ b/include/tcg/tcg-ldst.h
@@ -28,47 +28,31 @@
 #ifdef CONFIG_SOFTMMU
 
 /* Value zero-extended to tcg register size.
*/ -tcg_target_ulong helper_ret_ldub_mmu(CPUArchState *env, target_ulong addr, - MemOpIdx oi, uintptr_t retaddr); -tcg_target_ulong helper_le_lduw_mmu(CPUArchState *env, target_ulong addr, - MemOpIdx oi, uintptr_t retaddr); -tcg_target_ulong helper_le_ldul_mmu(CPUArchState *env, target_ulong addr, - MemOpIdx oi, uintptr_t retaddr); -uint64_t helper_le_ldq_mmu(CPUArchState *env, target_ulong addr, - MemOpIdx oi, uintptr_t retaddr); -tcg_target_ulong helper_be_lduw_mmu(CPUArchState *env, target_ulong addr, - MemOpIdx oi, uintptr_t retaddr); -tcg_target_ulong helper_be_ldul_mmu(CPUArchState *env, target_ulong addr, - MemOpIdx oi, uintptr_t retaddr); -uint64_t helper_be_ldq_mmu(CPUArchState *env, target_ulong addr, - MemOpIdx oi, uintptr_t retaddr); +tcg_target_ulong helper_ldub_mmu(CPUArchState *env, target_ulong addr, + MemOpIdx oi, uintptr_t retaddr); +tcg_target_ulong helper_lduw_mmu(CPUArchState *env, target_ulong addr, + MemOpIdx oi, uintptr_t retaddr); +tcg_target_ulong helper_ldul_mmu(CPUArchState *env, target_ulong addr, + MemOpIdx oi, uintptr_t retaddr); +uint64_t helper_ldq_mmu(CPUArchState *env, target_ulong addr, + MemOpIdx oi, uintptr_t retaddr); /* Value sign-extended to tcg register size. */ -tcg_target_ulong helper_ret_ldsb_mmu(CPUArchState *env, target_ulong addr, - MemOpIdx oi, uintptr_t retaddr); -tcg_target_ulong helper_le_ldsw_mmu(CPUArchState *env, target_ulong addr, - MemOpIdx oi, uintptr_t retaddr); -tcg_target_ulong helper_le_ldsl_mmu(CPUArchState *env, target_ulong addr, - MemOpIdx oi, uintptr_t retaddr); -tcg_target_ulong helper_be_ldsw_mmu(CPUArchState *env, target_ulong addr, - MemOpIdx oi, uintptr_t retaddr); -tcg_target_ulong helper_be_ldsl_mmu(CPUArchState *env, target_ulong addr, - MemOpIdx oi, uintptr_t retaddr); +tcg_target_ulong helper_ldsb_mmu(CPUArchState *env, target_ulong addr, + MemOpIdx oi, uintptr_t retaddr); +tcg_target_ulong helper_ldsw_mmu(CPUArchState *env, target_ulong addr, + MemOpIdx oi, uintptr_t retaddr); +tcg_target_ulong helper_ldsl_mmu(CPUArchState *env, target_ulong addr, + MemOpIdx oi, uintptr_t retaddr); -void helper_ret_stb_mmu(CPUArchState *env, target_ulong addr, uint8_t val, - MemOpIdx oi, uintptr_t retaddr); -void helper_le_stw_mmu(CPUArchState *env, target_ulong addr, uint16_t val, - MemOpIdx oi, uintptr_t retaddr); -void helper_le_stl_mmu(CPUArchState *env, target_ulong addr, uint32_t val, - MemOpIdx oi, uintptr_t retaddr); -void helper_le_stq_mmu(CPUArchState *env, target_ulong addr, uint64_t val, - MemOpIdx oi, uintptr_t retaddr); -void helper_be_stw_mmu(CPUArchState *env, target_ulong addr, uint16_t val, - MemOpIdx oi, uintptr_t retaddr); -void helper_be_stl_mmu(CPUArchState *env, target_ulong addr, uint32_t val, - MemOpIdx oi, uintptr_t retaddr); -void helper_be_stq_mmu(CPUArchState *env, target_ulong addr, uint64_t val, - MemOpIdx oi, uintptr_t retaddr); +void helper_stb_mmu(CPUArchState *env, target_ulong addr, uint8_t val, + MemOpIdx oi, uintptr_t retaddr); +void helper_stw_mmu(CPUArchState *env, target_ulong addr, uint16_t val, + MemOpIdx oi, uintptr_t retaddr); +void helper_stl_mmu(CPUArchState *env, target_ulong addr, uint32_t val, + MemOpIdx oi, uintptr_t retaddr); +void helper_stq_mmu(CPUArchState *env, target_ulong addr, uint64_t val, + MemOpIdx oi, uintptr_t retaddr); #else diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c index 69f8a25a7f..3d32adc0e7 100644 --- a/accel/tcg/cputlb.c +++ b/accel/tcg/cputlb.c @@ -1980,25 +1980,6 @@ static void *atomic_mmu_lookup(CPUArchState *env, target_ulong addr, 
cpu_loop_exit_atomic(env_cpu(env), retaddr); } -/* - * Verify that we have passed the correct MemOp to the correct function. - * - * In the case of the helper_*_mmu functions, we will have done this by - * using the MemOp to look up the helper during code generation. - * - * In the case of the cpu_*_mmu functions, this is up to the caller. - * We could present one function to target code, and dispatch based on - * the MemOp, but so far we have worked hard to avoid an indirect function - * call along the memory path. - */ -static void validate_memop(MemOpIdx oi, MemOp expected) -{ -#ifdef CONFIG_DEBUG_TCG - MemOp have = get_memop(oi) & (MO_SIZE | MO_BSWAP); - assert(have == expected); -#endif -} - /* * Load Helpers * @@ -2215,10 +2196,10 @@ static uint8_t do_ld1_mmu(CPUArchState *env, target_ulong addr, MemOpIdx oi, return do_ld_1(env, &l.page[0], l.mmu_idx, access_type, ra); } -tcg_target_ulong helper_ret_ldub_mmu(CPUArchState *env, target_ulong addr, - MemOpIdx oi, uintptr_t retaddr) +tcg_target_ulong helper_ldub_mmu(CPUArchState *env, target_ulong addr, + MemOpIdx oi, uintptr_t retaddr) { - validate_memop(oi, MO_UB); + tcg_debug_assert((get_memop(oi) & MO_SIZE) == MO_8); return do_ld1_mmu(env, addr, oi, retaddr, MMU_DATA_LOAD); } @@ -2261,17 +2242,10 @@ static uint16_t do_ld2_mmu(CPUArchState *env, target_ulong addr, MemOpIdx oi, return ret; } -tcg_target_ulong helper_le_lduw_mmu(CPUArchState *env, target_ulong addr, - MemOpIdx oi, uintptr_t retaddr) +tcg_target_ulong helper_lduw_mmu(CPUArchState *env, target_ulong addr, + MemOpIdx oi, uintptr_t retaddr) { - validate_memop(oi, MO_LEUW); - return do_ld2_mmu(env, addr, oi, retaddr, MMU_DATA_LOAD); -} - -tcg_target_ulong helper_be_lduw_mmu(CPUArchState *env, target_ulong addr, - MemOpIdx oi, uintptr_t retaddr) -{ - validate_memop(oi, MO_BEUW); + tcg_debug_assert((get_memop(oi) & MO_SIZE) == MO_16); return do_ld2_mmu(env, addr, oi, retaddr, MMU_DATA_LOAD); } @@ -2310,17 +2284,10 @@ static uint32_t do_ld4_mmu(CPUArchState *env, target_ulong addr, MemOpIdx oi, return ret; } -tcg_target_ulong helper_le_ldul_mmu(CPUArchState *env, target_ulong addr, - MemOpIdx oi, uintptr_t retaddr) +tcg_target_ulong helper_ldul_mmu(CPUArchState *env, target_ulong addr, + MemOpIdx oi, uintptr_t retaddr) { - validate_memop(oi, MO_LEUL); - return do_ld4_mmu(env, addr, oi, retaddr, MMU_DATA_LOAD); -} - -tcg_target_ulong helper_be_ldul_mmu(CPUArchState *env, target_ulong addr, - MemOpIdx oi, uintptr_t retaddr) -{ - validate_memop(oi, MO_BEUL); + tcg_debug_assert((get_memop(oi) & MO_SIZE) == MO_32); return do_ld4_mmu(env, addr, oi, retaddr, MMU_DATA_LOAD); } @@ -2359,17 +2326,10 @@ static uint64_t do_ld8_mmu(CPUArchState *env, target_ulong addr, MemOpIdx oi, return ret; } -uint64_t helper_le_ldq_mmu(CPUArchState *env, target_ulong addr, - MemOpIdx oi, uintptr_t retaddr) +uint64_t helper_ldq_mmu(CPUArchState *env, target_ulong addr, + MemOpIdx oi, uintptr_t retaddr) { - validate_memop(oi, MO_LEUQ); - return do_ld8_mmu(env, addr, oi, retaddr, MMU_DATA_LOAD); -} - -uint64_t helper_be_ldq_mmu(CPUArchState *env, target_ulong addr, - MemOpIdx oi, uintptr_t retaddr) -{ - validate_memop(oi, MO_BEUQ); + tcg_debug_assert((get_memop(oi) & MO_SIZE) == MO_64); return do_ld8_mmu(env, addr, oi, retaddr, MMU_DATA_LOAD); } @@ -2378,35 +2338,22 @@ uint64_t helper_be_ldq_mmu(CPUArchState *env, target_ulong addr, * avoid this for 64-bit data, or for 32-bit data on 32-bit host. 
*/ - -tcg_target_ulong helper_ret_ldsb_mmu(CPUArchState *env, target_ulong addr, - MemOpIdx oi, uintptr_t retaddr) +tcg_target_ulong helper_ldsb_mmu(CPUArchState *env, target_ulong addr, + MemOpIdx oi, uintptr_t retaddr) { - return (int8_t)helper_ret_ldub_mmu(env, addr, oi, retaddr); + return (int8_t)helper_ldub_mmu(env, addr, oi, retaddr); } -tcg_target_ulong helper_le_ldsw_mmu(CPUArchState *env, target_ulong addr, - MemOpIdx oi, uintptr_t retaddr) +tcg_target_ulong helper_ldsw_mmu(CPUArchState *env, target_ulong addr, + MemOpIdx oi, uintptr_t retaddr) { - return (int16_t)helper_le_lduw_mmu(env, addr, oi, retaddr); + return (int16_t)helper_lduw_mmu(env, addr, oi, retaddr); } -tcg_target_ulong helper_be_ldsw_mmu(CPUArchState *env, target_ulong addr, - MemOpIdx oi, uintptr_t retaddr) +tcg_target_ulong helper_ldsl_mmu(CPUArchState *env, target_ulong addr, + MemOpIdx oi, uintptr_t retaddr) { - return (int16_t)helper_be_lduw_mmu(env, addr, oi, retaddr); -} - -tcg_target_ulong helper_le_ldsl_mmu(CPUArchState *env, target_ulong addr, - MemOpIdx oi, uintptr_t retaddr) -{ - return (int32_t)helper_le_ldul_mmu(env, addr, oi, retaddr); -} - -tcg_target_ulong helper_be_ldsl_mmu(CPUArchState *env, target_ulong addr, - MemOpIdx oi, uintptr_t retaddr) -{ - return (int32_t)helper_be_ldul_mmu(env, addr, oi, retaddr); + return (int32_t)helper_ldul_mmu(env, addr, oi, retaddr); } /* @@ -2422,7 +2369,7 @@ uint8_t cpu_ldb_mmu(CPUArchState *env, abi_ptr addr, MemOpIdx oi, uintptr_t ra) { uint8_t ret; - validate_memop(oi, MO_UB); + tcg_debug_assert((get_memop(oi) & MO_SIZE) == MO_UB); ret = do_ld1_mmu(env, addr, oi, ra, MMU_DATA_LOAD); plugin_load_cb(env, addr, oi); return ret; @@ -2433,7 +2380,7 @@ uint16_t cpu_ldw_be_mmu(CPUArchState *env, abi_ptr addr, { uint16_t ret; - validate_memop(oi, MO_BEUW); + tcg_debug_assert((get_memop(oi) & (MO_BSWAP | MO_SIZE)) == MO_BEUW); ret = do_ld2_mmu(env, addr, oi, ra, MMU_DATA_LOAD); plugin_load_cb(env, addr, oi); return ret; @@ -2444,7 +2391,7 @@ uint32_t cpu_ldl_be_mmu(CPUArchState *env, abi_ptr addr, { uint32_t ret; - validate_memop(oi, MO_BEUL); + tcg_debug_assert((get_memop(oi) & (MO_BSWAP | MO_SIZE)) == MO_BEUL); ret = do_ld4_mmu(env, addr, oi, ra, MMU_DATA_LOAD); plugin_load_cb(env, addr, oi); return ret; @@ -2455,7 +2402,7 @@ uint64_t cpu_ldq_be_mmu(CPUArchState *env, abi_ptr addr, { uint64_t ret; - validate_memop(oi, MO_BEUQ); + tcg_debug_assert((get_memop(oi) & (MO_BSWAP | MO_SIZE)) == MO_BEUQ); ret = do_ld8_mmu(env, addr, oi, ra, MMU_DATA_LOAD); plugin_load_cb(env, addr, oi); return ret; @@ -2466,7 +2413,7 @@ uint16_t cpu_ldw_le_mmu(CPUArchState *env, abi_ptr addr, { uint16_t ret; - validate_memop(oi, MO_LEUW); + tcg_debug_assert((get_memop(oi) & (MO_BSWAP | MO_SIZE)) == MO_LEUW); ret = do_ld2_mmu(env, addr, oi, ra, MMU_DATA_LOAD); plugin_load_cb(env, addr, oi); return ret; @@ -2477,7 +2424,7 @@ uint32_t cpu_ldl_le_mmu(CPUArchState *env, abi_ptr addr, { uint32_t ret; - validate_memop(oi, MO_LEUL); + tcg_debug_assert((get_memop(oi) & (MO_BSWAP | MO_SIZE)) == MO_LEUL); ret = do_ld4_mmu(env, addr, oi, ra, MMU_DATA_LOAD); plugin_load_cb(env, addr, oi); return ret; @@ -2488,7 +2435,7 @@ uint64_t cpu_ldq_le_mmu(CPUArchState *env, abi_ptr addr, { uint64_t ret; - validate_memop(oi, MO_LEUQ); + tcg_debug_assert((get_memop(oi) & (MO_BSWAP | MO_SIZE)) == MO_LEUQ); ret = do_ld8_mmu(env, addr, oi, ra, MMU_DATA_LOAD); plugin_load_cb(env, addr, oi); return ret; @@ -2516,8 +2463,8 @@ Int128 cpu_ld16_be_mmu(CPUArchState *env, abi_ptr addr, mop = (mop & ~(MO_SIZE | MO_AMASK)) | 
MO_64 | MO_UNALN; new_oi = make_memop_idx(mop, mmu_idx); - h = helper_be_ldq_mmu(env, addr, new_oi, ra); - l = helper_be_ldq_mmu(env, addr + 8, new_oi, ra); + h = helper_ldq_mmu(env, addr, new_oi, ra); + l = helper_ldq_mmu(env, addr + 8, new_oi, ra); qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_R); return int128_make128(l, h); @@ -2545,8 +2492,8 @@ Int128 cpu_ld16_le_mmu(CPUArchState *env, abi_ptr addr, mop = (mop & ~(MO_SIZE | MO_AMASK)) | MO_64 | MO_UNALN; new_oi = make_memop_idx(mop, mmu_idx); - l = helper_le_ldq_mmu(env, addr, new_oi, ra); - h = helper_le_ldq_mmu(env, addr + 8, new_oi, ra); + l = helper_ldq_mmu(env, addr, new_oi, ra); + h = helper_ldq_mmu(env, addr + 8, new_oi, ra); qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_R); return int128_make128(l, h); @@ -2754,13 +2701,13 @@ static void do_st_1(CPUArchState *env, MMULookupPageData *p, uint8_t val, } } -void helper_ret_stb_mmu(CPUArchState *env, target_ulong addr, uint8_t val, - MemOpIdx oi, uintptr_t ra) +void helper_stb_mmu(CPUArchState *env, target_ulong addr, uint8_t val, + MemOpIdx oi, uintptr_t ra) { MMULookupLocals l; bool crosspage; - validate_memop(oi, MO_UB); + tcg_debug_assert((get_memop(oi) & MO_SIZE) == MO_8); crosspage = mmu_lookup(env, addr, oi, ra, MMU_DATA_STORE, &l); tcg_debug_assert(!crosspage); @@ -2802,17 +2749,10 @@ static void do_st2_mmu(CPUArchState *env, target_ulong addr, uint16_t val, } } -void helper_le_stw_mmu(CPUArchState *env, target_ulong addr, uint16_t val, - MemOpIdx oi, uintptr_t retaddr) +void helper_stw_mmu(CPUArchState *env, target_ulong addr, uint16_t val, + MemOpIdx oi, uintptr_t retaddr) { - validate_memop(oi, MO_LEUW); - do_st2_mmu(env, addr, val, oi, retaddr); -} - -void helper_be_stw_mmu(CPUArchState *env, target_ulong addr, uint16_t val, - MemOpIdx oi, uintptr_t retaddr) -{ - validate_memop(oi, MO_BEUW); + tcg_debug_assert((get_memop(oi) & MO_SIZE) == MO_16); do_st2_mmu(env, addr, val, oi, retaddr); } @@ -2848,17 +2788,10 @@ static void do_st4_mmu(CPUArchState *env, target_ulong addr, uint32_t val, } } -void helper_le_stl_mmu(CPUArchState *env, target_ulong addr, uint32_t val, - MemOpIdx oi, uintptr_t retaddr) +void helper_stl_mmu(CPUArchState *env, target_ulong addr, uint32_t val, + MemOpIdx oi, uintptr_t retaddr) { - validate_memop(oi, MO_LEUL); - do_st4_mmu(env, addr, val, oi, retaddr); -} - -void helper_be_stl_mmu(CPUArchState *env, target_ulong addr, uint32_t val, - MemOpIdx oi, uintptr_t retaddr) -{ - validate_memop(oi, MO_BEUL); + tcg_debug_assert((get_memop(oi) & MO_SIZE) == MO_32); do_st4_mmu(env, addr, val, oi, retaddr); } @@ -2894,17 +2827,10 @@ static void do_st8_mmu(CPUArchState *env, target_ulong addr, uint64_t val, } } -void helper_le_stq_mmu(CPUArchState *env, target_ulong addr, uint64_t val, - MemOpIdx oi, uintptr_t retaddr) +void helper_stq_mmu(CPUArchState *env, target_ulong addr, uint64_t val, + MemOpIdx oi, uintptr_t retaddr) { - validate_memop(oi, MO_LEUQ); - do_st8_mmu(env, addr, val, oi, retaddr); -} - -void helper_be_stq_mmu(CPUArchState *env, target_ulong addr, uint64_t val, - MemOpIdx oi, uintptr_t retaddr) -{ - validate_memop(oi, MO_BEUQ); + tcg_debug_assert((get_memop(oi) & MO_SIZE) == MO_64); do_st8_mmu(env, addr, val, oi, retaddr); } @@ -2920,49 +2846,55 @@ static void plugin_store_cb(CPUArchState *env, abi_ptr addr, MemOpIdx oi) void cpu_stb_mmu(CPUArchState *env, target_ulong addr, uint8_t val, MemOpIdx oi, uintptr_t retaddr) { - helper_ret_stb_mmu(env, addr, val, oi, retaddr); + helper_stb_mmu(env, addr, val, oi, 
retaddr); plugin_store_cb(env, addr, oi); } void cpu_stw_be_mmu(CPUArchState *env, target_ulong addr, uint16_t val, MemOpIdx oi, uintptr_t retaddr) { - helper_be_stw_mmu(env, addr, val, oi, retaddr); + tcg_debug_assert((get_memop(oi) & (MO_BSWAP | MO_SIZE)) == MO_BEUW); + do_st2_mmu(env, addr, val, oi, retaddr); plugin_store_cb(env, addr, oi); } void cpu_stl_be_mmu(CPUArchState *env, target_ulong addr, uint32_t val, MemOpIdx oi, uintptr_t retaddr) { - helper_be_stl_mmu(env, addr, val, oi, retaddr); + tcg_debug_assert((get_memop(oi) & (MO_BSWAP | MO_SIZE)) == MO_BEUL); + do_st4_mmu(env, addr, val, oi, retaddr); plugin_store_cb(env, addr, oi); } void cpu_stq_be_mmu(CPUArchState *env, target_ulong addr, uint64_t val, MemOpIdx oi, uintptr_t retaddr) { - helper_be_stq_mmu(env, addr, val, oi, retaddr); + tcg_debug_assert((get_memop(oi) & (MO_BSWAP | MO_SIZE)) == MO_BEUQ); + do_st8_mmu(env, addr, val, oi, retaddr); plugin_store_cb(env, addr, oi); } void cpu_stw_le_mmu(CPUArchState *env, target_ulong addr, uint16_t val, MemOpIdx oi, uintptr_t retaddr) { - helper_le_stw_mmu(env, addr, val, oi, retaddr); + tcg_debug_assert((get_memop(oi) & (MO_BSWAP | MO_SIZE)) == MO_LEUW); + do_st2_mmu(env, addr, val, oi, retaddr); plugin_store_cb(env, addr, oi); } void cpu_stl_le_mmu(CPUArchState *env, target_ulong addr, uint32_t val, MemOpIdx oi, uintptr_t retaddr) { - helper_le_stl_mmu(env, addr, val, oi, retaddr); + tcg_debug_assert((get_memop(oi) & (MO_BSWAP | MO_SIZE)) == MO_LEUL); + do_st4_mmu(env, addr, val, oi, retaddr); plugin_store_cb(env, addr, oi); } void cpu_stq_le_mmu(CPUArchState *env, target_ulong addr, uint64_t val, MemOpIdx oi, uintptr_t retaddr) { - helper_le_stq_mmu(env, addr, val, oi, retaddr); + tcg_debug_assert((get_memop(oi) & (MO_BSWAP | MO_SIZE)) == MO_LEUQ); + do_st8_mmu(env, addr, val, oi, retaddr); plugin_store_cb(env, addr, oi); } @@ -2987,8 +2919,8 @@ void cpu_st16_be_mmu(CPUArchState *env, abi_ptr addr, Int128 val, mop = (mop & ~(MO_SIZE | MO_AMASK)) | MO_64 | MO_UNALN; new_oi = make_memop_idx(mop, mmu_idx); - helper_be_stq_mmu(env, addr, int128_gethi(val), new_oi, ra); - helper_be_stq_mmu(env, addr + 8, int128_getlo(val), new_oi, ra); + helper_stq_mmu(env, addr, int128_gethi(val), new_oi, ra); + helper_stq_mmu(env, addr + 8, int128_getlo(val), new_oi, ra); qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_W); } @@ -3014,8 +2946,8 @@ void cpu_st16_le_mmu(CPUArchState *env, abi_ptr addr, Int128 val, mop = (mop & ~(MO_SIZE | MO_AMASK)) | MO_64 | MO_UNALN; new_oi = make_memop_idx(mop, mmu_idx); - helper_le_stq_mmu(env, addr, int128_getlo(val), new_oi, ra); - helper_le_stq_mmu(env, addr + 8, int128_gethi(val), new_oi, ra); + helper_stq_mmu(env, addr, int128_getlo(val), new_oi, ra); + helper_stq_mmu(env, addr + 8, int128_gethi(val), new_oi, ra); qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_W); } diff --git a/docs/devel/loads-stores.rst b/docs/devel/loads-stores.rst index ad5dfe133e..d2cefc77a2 100644 --- a/docs/devel/loads-stores.rst +++ b/docs/devel/loads-stores.rst @@ -297,31 +297,20 @@ swap: ``translator_ld{sign}{size}_swap(env, ptr, swap)`` Regexes for git grep - ``\`` -``helper_*_{ld,st}*_mmu`` +``helper_{ld,st}*_mmu`` ~~~~~~~~~~~~~~~~~~~~~~~~~ These functions are intended primarily to be called by the code -generated by the TCG backend. They may also be called by target -CPU helper function code. Like the ``cpu_{ld,st}_mmuidx_ra`` functions -they perform accesses by guest virtual address, with a given ``mmuidx``. +generated by the TCG backend. 
Like the ``cpu_{ld,st}_mmu`` functions
+they perform accesses by guest virtual address, with a given ``MemOpIdx``.

-These functions specify an ``opindex`` parameter which encodes
-(among other things) the mmu index to use for the access.  This parameter
-should be created by calling ``make_memop_idx()``.
+They differ from ``cpu_{ld,st}_mmu`` in that they take the endianness
+of the operation only from the MemOpIdx, and loads extend the return
+value to the size of a host general register (``tcg_target_ulong``).

-The ``retaddr`` parameter should be the result of GETPC() called directly
-from the top level HELPER(foo) function (or 0 if no guest CPU state
-unwinding is required).
+load: ``helper_ld{sign}{size}_mmu(env, addr, opindex, retaddr)``

-**TODO** The names of these functions are a bit odd for historical
-reasons because they were originally expected to be called only from
-within generated code.  We should rename them to bring them more in
-line with the other memory access functions.  The explicit endianness
-is the only feature they have beyond ``*_mmuidx_ra``.
-
-load: ``helper_{endian}_ld{sign}{size}_mmu(env, addr, opindex, retaddr)``
-
-store: ``helper_{endian}_st{size}_mmu(env, addr, val, opindex, retaddr)``
+store: ``helper_st{size}_mmu(env, addr, val, opindex, retaddr)``

 ``sign``
 - (empty) : for 32 or 64 bit sizes
@@ -334,14 +323,9 @@ store: ``helper_{endian}_st{size}_mmu(env, addr, val, opindex, retaddr)``
 - ``l`` : 32 bits
 - ``q`` : 64 bits

-``endian``
- - ``le`` : little endian
- - ``be`` : big endian
- - ``ret`` : target endianness
-
 Regexes for git grep
- - ``\``
- - ``\``
+ - ``\``
+ - ``\``

 ``address_space_*``
 ~~~~~~~~~~~~~~~~~~~
diff --git a/tcg/aarch64/tcg-target.c.inc b/tcg/aarch64/tcg-target.c.inc
index dfe569dd8c..001a71bbc0 100644
--- a/tcg/aarch64/tcg-target.c.inc
+++ b/tcg/aarch64/tcg-target.c.inc
@@ -1564,37 +1564,26 @@ static void tcg_out_adr(TCGContext *s, TCGReg rd, const void *target)
 }

 #ifdef CONFIG_SOFTMMU
-/* helper signature: helper_ret_ld_mmu(CPUState *env, target_ulong addr,
- *                                     MemOpIdx oi, uintptr_t ra)
+/*
+ * helper signature: helper_ld*_mmu(CPUState *env, target_ulong addr,
+ *                                  MemOpIdx oi, uintptr_t ra)
  */
 static void * const qemu_ld_helpers[MO_SIZE + 1] = {
-    [MO_8]  = helper_ret_ldub_mmu,
-#if HOST_BIG_ENDIAN
-    [MO_16] = helper_be_lduw_mmu,
-    [MO_32] = helper_be_ldul_mmu,
-    [MO_64] = helper_be_ldq_mmu,
-#else
-    [MO_16] = helper_le_lduw_mmu,
-    [MO_32] = helper_le_ldul_mmu,
-    [MO_64] = helper_le_ldq_mmu,
-#endif
+    [MO_8]  = helper_ldub_mmu,
+    [MO_16] = helper_lduw_mmu,
+    [MO_32] = helper_ldul_mmu,
+    [MO_64] = helper_ldq_mmu,
 };

-/* helper signature: helper_ret_st_mmu(CPUState *env, target_ulong addr,
- *                                     uintxx_t val, MemOpIdx oi,
- *                                     uintptr_t ra)
+/*
+ * helper signature: helper_st*_mmu(CPUState *env, target_ulong addr,
+ *                                  uintxx_t val, MemOpIdx oi, uintptr_t ra)
  */
 static void * const qemu_st_helpers[MO_SIZE + 1] = {
-    [MO_8]  = helper_ret_stb_mmu,
-#if HOST_BIG_ENDIAN
-    [MO_16] = helper_be_stw_mmu,
-    [MO_32] = helper_be_stl_mmu,
-    [MO_64] = helper_be_stq_mmu,
-#else
-    [MO_16] = helper_le_stw_mmu,
-    [MO_32] = helper_le_stl_mmu,
-    [MO_64] = helper_le_stq_mmu,
-#endif
+    [MO_8]  = helper_stb_mmu,
+    [MO_16] = helper_stw_mmu,
+    [MO_32] = helper_stl_mmu,
+    [MO_64] = helper_stq_mmu,
 };

 static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *lb)
diff --git a/tcg/arm/tcg-target.c.inc b/tcg/arm/tcg-target.c.inc
index add90ddeb4..1f89745c86 100644
--- a/tcg/arm/tcg-target.c.inc
+++ b/tcg/arm/tcg-target.c.inc
@@ -1300,41 +1300,26 @@ static void
tcg_out_vldst(TCGContext *s, ARMInsn insn, } #ifdef CONFIG_SOFTMMU -/* helper signature: helper_ret_ld_mmu(CPUState *env, target_ulong addr, - * int mmu_idx, uintptr_t ra) +/* + * helper signature: helper_ld*_mmu(CPUState *env, target_ulong addr, + * int mmu_idx, uintptr_t ra) */ -static void * const qemu_ld_helpers[MO_SSIZE + 1] = { - [MO_UB] = helper_ret_ldub_mmu, - [MO_SB] = helper_ret_ldsb_mmu, -#if HOST_BIG_ENDIAN - [MO_UW] = helper_be_lduw_mmu, - [MO_UL] = helper_be_ldul_mmu, - [MO_UQ] = helper_be_ldq_mmu, - [MO_SW] = helper_be_ldsw_mmu, - [MO_SL] = helper_be_ldul_mmu, -#else - [MO_UW] = helper_le_lduw_mmu, - [MO_UL] = helper_le_ldul_mmu, - [MO_UQ] = helper_le_ldq_mmu, - [MO_SW] = helper_le_ldsw_mmu, - [MO_SL] = helper_le_ldul_mmu, -#endif +static void * const qemu_ld_helpers[MO_SIZE + 1] = { + [MO_UB] = helper_ldub_mmu, + [MO_UW] = helper_lduw_mmu, + [MO_UL] = helper_ldul_mmu, + [MO_UQ] = helper_ldq_mmu, }; -/* helper signature: helper_ret_st_mmu(CPUState *env, target_ulong addr, - * uintxx_t val, int mmu_idx, uintptr_t ra) +/* + * helper signature: helper_st*_mmu(CPUState *env, target_ulong addr, + * uintxx_t val, int mmu_idx, uintptr_t ra) */ static void * const qemu_st_helpers[MO_SIZE + 1] = { - [MO_8] = helper_ret_stb_mmu, -#if HOST_BIG_ENDIAN - [MO_16] = helper_be_stw_mmu, - [MO_32] = helper_be_stl_mmu, - [MO_64] = helper_be_stq_mmu, -#else - [MO_16] = helper_le_stw_mmu, - [MO_32] = helper_le_stl_mmu, - [MO_64] = helper_le_stq_mmu, -#endif + [MO_8] = helper_stb_mmu, + [MO_16] = helper_stw_mmu, + [MO_32] = helper_stl_mmu, + [MO_64] = helper_stq_mmu, }; /* Helper routines for marshalling helper function arguments into diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc index 1361960156..24e9efe631 100644 --- a/tcg/i386/tcg-target.c.inc +++ b/tcg/i386/tcg-target.c.inc @@ -1728,30 +1728,26 @@ static void tcg_out_nopn(TCGContext *s, int n) } #if defined(CONFIG_SOFTMMU) -/* helper signature: helper_ret_ld_mmu(CPUState *env, target_ulong addr, - * int mmu_idx, uintptr_t ra) +/* + * helper signature: helper_ld*_mmu(CPUState *env, target_ulong addr, + * int mmu_idx, uintptr_t ra) */ -static void * const qemu_ld_helpers[(MO_SIZE | MO_BSWAP) + 1] = { - [MO_UB] = helper_ret_ldub_mmu, - [MO_LEUW] = helper_le_lduw_mmu, - [MO_LEUL] = helper_le_ldul_mmu, - [MO_LEUQ] = helper_le_ldq_mmu, - [MO_BEUW] = helper_be_lduw_mmu, - [MO_BEUL] = helper_be_ldul_mmu, - [MO_BEUQ] = helper_be_ldq_mmu, +static void * const qemu_ld_helpers[MO_SIZE + 1] = { + [MO_UB] = helper_ldub_mmu, + [MO_UW] = helper_lduw_mmu, + [MO_UL] = helper_ldul_mmu, + [MO_UQ] = helper_ldq_mmu, }; -/* helper signature: helper_ret_st_mmu(CPUState *env, target_ulong addr, - * uintxx_t val, int mmu_idx, uintptr_t ra) +/* + * helper signature: helper_st*_mmu(CPUState *env, target_ulong addr, + * uintxx_t val, int mmu_idx, uintptr_t ra) */ -static void * const qemu_st_helpers[(MO_SIZE | MO_BSWAP) + 1] = { - [MO_UB] = helper_ret_stb_mmu, - [MO_LEUW] = helper_le_stw_mmu, - [MO_LEUL] = helper_le_stl_mmu, - [MO_LEUQ] = helper_le_stq_mmu, - [MO_BEUW] = helper_be_stw_mmu, - [MO_BEUL] = helper_be_stl_mmu, - [MO_BEUQ] = helper_be_stq_mmu, +static void * const qemu_st_helpers[MO_SIZE + 1] = { + [MO_UB] = helper_stb_mmu, + [MO_UW] = helper_stw_mmu, + [MO_UL] = helper_stl_mmu, + [MO_UQ] = helper_stq_mmu, }; /* Perform the TLB load and compare. 
@@ -1926,7 +1922,7 @@ static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l) (uintptr_t)l->raddr); } - tcg_out_branch(s, 1, qemu_ld_helpers[opc & (MO_BSWAP | MO_SIZE)]); + tcg_out_branch(s, 1, qemu_ld_helpers[opc & MO_SIZE]); data_reg = l->datalo_reg; switch (opc & MO_SSIZE) { @@ -2033,7 +2029,7 @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l) /* "Tail call" to the helper, with the return address back inline. */ tcg_out_push(s, retaddr); - tcg_out_jmp(s, qemu_st_helpers[opc & (MO_BSWAP | MO_SIZE)]); + tcg_out_jmp(s, qemu_st_helpers[opc & MO_SIZE]); return true; } #else diff --git a/tcg/loongarch64/tcg-target.c.inc b/tcg/loongarch64/tcg-target.c.inc index f6b0ed00bb..e73b48bd0f 100644 --- a/tcg/loongarch64/tcg-target.c.inc +++ b/tcg/loongarch64/tcg-target.c.inc @@ -655,26 +655,25 @@ static bool tcg_out_sti(TCGContext *s, TCGType type, TCGArg val, #if defined(CONFIG_SOFTMMU) /* - * helper signature: helper_ret_ld_mmu(CPUState *env, target_ulong addr, - * MemOpIdx oi, uintptr_t ra) + * helper signature: helper_ld*_mmu(CPUState *env, target_ulong addr, + * MemOpIdx oi, uintptr_t ra) */ static void * const qemu_ld_helpers[4] = { - [MO_8] = helper_ret_ldub_mmu, - [MO_16] = helper_le_lduw_mmu, - [MO_32] = helper_le_ldul_mmu, - [MO_64] = helper_le_ldq_mmu, + [MO_8] = helper_ldub_mmu, + [MO_16] = helper_lduw_mmu, + [MO_32] = helper_ldul_mmu, + [MO_64] = helper_ldq_mmu, }; /* - * helper signature: helper_ret_st_mmu(CPUState *env, target_ulong addr, - * uintxx_t val, MemOpIdx oi, - * uintptr_t ra) + * helper signature: helper_st*_mmu(CPUState *env, target_ulong addr, + * uintxx_t val, MemOpIdx oi, uintptr_t ra) */ static void * const qemu_st_helpers[4] = { - [MO_8] = helper_ret_stb_mmu, - [MO_16] = helper_le_stw_mmu, - [MO_32] = helper_le_stl_mmu, - [MO_64] = helper_le_stq_mmu, + [MO_8] = helper_stb_mmu, + [MO_16] = helper_stw_mmu, + [MO_32] = helper_stl_mmu, + [MO_64] = helper_stq_mmu, }; /* We expect to use a 12-bit negative offset from ENV. 
*/ diff --git a/tcg/mips/tcg-target.c.inc b/tcg/mips/tcg-target.c.inc index 92883176c6..a23e2c409f 100644 --- a/tcg/mips/tcg-target.c.inc +++ b/tcg/mips/tcg-target.c.inc @@ -1037,31 +1037,21 @@ static void tcg_out_call(TCGContext *s, const tcg_insn_unit *arg, } #if defined(CONFIG_SOFTMMU) -static void * const qemu_ld_helpers[(MO_SSIZE | MO_BSWAP) + 1] = { - [MO_UB] = helper_ret_ldub_mmu, - [MO_SB] = helper_ret_ldsb_mmu, - [MO_LEUW] = helper_le_lduw_mmu, - [MO_LESW] = helper_le_ldsw_mmu, - [MO_LEUL] = helper_le_ldul_mmu, - [MO_LEUQ] = helper_le_ldq_mmu, - [MO_BEUW] = helper_be_lduw_mmu, - [MO_BESW] = helper_be_ldsw_mmu, - [MO_BEUL] = helper_be_ldul_mmu, - [MO_BEUQ] = helper_be_ldq_mmu, -#if TCG_TARGET_REG_BITS == 64 - [MO_LESL] = helper_le_ldsl_mmu, - [MO_BESL] = helper_be_ldsl_mmu, -#endif +static void * const qemu_ld_helpers[MO_SSIZE + 1] = { + [MO_UB] = helper_ldub_mmu, + [MO_SB] = helper_ldsb_mmu, + [MO_UW] = helper_lduw_mmu, + [MO_SW] = helper_ldsw_mmu, + [MO_UL] = helper_ldul_mmu, + [MO_SL] = helper_ldsl_mmu, + [MO_UQ] = helper_ldq_mmu, }; -static void * const qemu_st_helpers[(MO_SIZE | MO_BSWAP) + 1] = { - [MO_UB] = helper_ret_stb_mmu, - [MO_LEUW] = helper_le_stw_mmu, - [MO_LEUL] = helper_le_stl_mmu, - [MO_LEUQ] = helper_le_stq_mmu, - [MO_BEUW] = helper_be_stw_mmu, - [MO_BEUL] = helper_be_stl_mmu, - [MO_BEUQ] = helper_be_stq_mmu, +static void * const qemu_st_helpers[MO_SIZE + 1] = { + [MO_UB] = helper_stb_mmu, + [MO_UW] = helper_stw_mmu, + [MO_UL] = helper_stl_mmu, + [MO_UQ] = helper_stq_mmu, }; /* Helper routines for marshalling helper function arguments into @@ -1267,7 +1257,7 @@ static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l) } i = tcg_out_call_iarg_imm(s, i, oi); i = tcg_out_call_iarg_imm(s, i, (intptr_t)l->raddr); - tcg_out_call_int(s, qemu_ld_helpers[opc & (MO_BSWAP | MO_SSIZE)], false); + tcg_out_call_int(s, qemu_ld_helpers[opc & MO_SSIZE], false); /* delay slot */ tcg_out_mov(s, TCG_TYPE_PTR, tcg_target_call_iarg_regs[0], TCG_AREG0); @@ -1345,7 +1335,7 @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l) computation to take place in the return address register. 
*/ tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_RA, (intptr_t)l->raddr); i = tcg_out_call_iarg_reg(s, i, TCG_REG_RA); - tcg_out_call_int(s, qemu_st_helpers[opc & (MO_BSWAP | MO_SIZE)], true); + tcg_out_call_int(s, qemu_st_helpers[opc & MO_SIZE], true); /* delay slot */ tcg_out_mov(s, TCG_TYPE_PTR, tcg_target_call_iarg_regs[0], TCG_AREG0); return true; diff --git a/tcg/ppc/tcg-target.c.inc b/tcg/ppc/tcg-target.c.inc index e86d4a5e78..3e9fc9bd25 100644 --- a/tcg/ppc/tcg-target.c.inc +++ b/tcg/ppc/tcg-target.c.inc @@ -2052,27 +2052,21 @@ static const uint32_t qemu_exts_opc[4] = { /* helper signature: helper_ld_mmu(CPUState *env, target_ulong addr, * int mmu_idx, uintptr_t ra) */ -static void * const qemu_ld_helpers[(MO_SIZE | MO_BSWAP) + 1] = { - [MO_UB] = helper_ret_ldub_mmu, - [MO_LEUW] = helper_le_lduw_mmu, - [MO_LEUL] = helper_le_ldul_mmu, - [MO_LEUQ] = helper_le_ldq_mmu, - [MO_BEUW] = helper_be_lduw_mmu, - [MO_BEUL] = helper_be_ldul_mmu, - [MO_BEUQ] = helper_be_ldq_mmu, +static void * const qemu_ld_helpers[MO_SIZE + 1] = { + [MO_UB] = helper_ldub_mmu, + [MO_UW] = helper_lduw_mmu, + [MO_UL] = helper_ldul_mmu, + [MO_UQ] = helper_ldq_mmu, }; /* helper signature: helper_st_mmu(CPUState *env, target_ulong addr, * uintxx_t val, int mmu_idx, uintptr_t ra) */ -static void * const qemu_st_helpers[(MO_SIZE | MO_BSWAP) + 1] = { - [MO_UB] = helper_ret_stb_mmu, - [MO_LEUW] = helper_le_stw_mmu, - [MO_LEUL] = helper_le_stl_mmu, - [MO_LEUQ] = helper_le_stq_mmu, - [MO_BEUW] = helper_be_stw_mmu, - [MO_BEUL] = helper_be_stl_mmu, - [MO_BEUQ] = helper_be_stq_mmu, +static void * const qemu_st_helpers[MO_SIZE + 1] = { + [MO_UB] = helper_stb_mmu, + [MO_UW] = helper_stw_mmu, + [MO_UL] = helper_stl_mmu, + [MO_UQ] = helper_stq_mmu, }; /* We expect to use a 16-bit negative offset from ENV. 
*/ @@ -2234,7 +2228,7 @@ static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *lb) tcg_out_movi(s, TCG_TYPE_I32, arg++, oi); tcg_out32(s, MFSPR | RT(arg) | LR); - tcg_out_call_int(s, LK, qemu_ld_helpers[opc & (MO_BSWAP | MO_SIZE)]); + tcg_out_call_int(s, LK, qemu_ld_helpers[opc & MO_SIZE]); lo = lb->datalo_reg; hi = lb->datahi_reg; @@ -2303,7 +2297,7 @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *lb) tcg_out_movi(s, TCG_TYPE_I32, arg++, oi); tcg_out32(s, MFSPR | RT(arg) | LR); - tcg_out_call_int(s, LK, qemu_st_helpers[opc & (MO_BSWAP | MO_SIZE)]); + tcg_out_call_int(s, LK, qemu_st_helpers[opc & MO_SIZE]); tcg_out_b(s, 0, lb->raddr); return true; diff --git a/tcg/riscv/tcg-target.c.inc b/tcg/riscv/tcg-target.c.inc index 417736cae7..7261f15197 100644 --- a/tcg/riscv/tcg-target.c.inc +++ b/tcg/riscv/tcg-target.c.inc @@ -858,46 +858,29 @@ static void tcg_out_mb(TCGContext *s, TCGArg a0) */ #if defined(CONFIG_SOFTMMU) -/* helper signature: helper_ret_ld_mmu(CPUState *env, target_ulong addr, - * MemOpIdx oi, uintptr_t ra) +/* + * helper signature: helper_ld*_mmu(CPUState *env, target_ulong addr, + * MemOpIdx oi, uintptr_t ra) */ static void * const qemu_ld_helpers[MO_SSIZE + 1] = { - [MO_UB] = helper_ret_ldub_mmu, - [MO_SB] = helper_ret_ldsb_mmu, -#if HOST_BIG_ENDIAN - [MO_UW] = helper_be_lduw_mmu, - [MO_SW] = helper_be_ldsw_mmu, - [MO_UL] = helper_be_ldul_mmu, -#if TCG_TARGET_REG_BITS == 64 - [MO_SL] = helper_be_ldsl_mmu, -#endif - [MO_UQ] = helper_be_ldq_mmu, -#else - [MO_UW] = helper_le_lduw_mmu, - [MO_SW] = helper_le_ldsw_mmu, - [MO_UL] = helper_le_ldul_mmu, -#if TCG_TARGET_REG_BITS == 64 - [MO_SL] = helper_le_ldsl_mmu, -#endif - [MO_UQ] = helper_le_ldq_mmu, -#endif + [MO_UB] = helper_ldub_mmu, + [MO_SB] = helper_ldsb_mmu, + [MO_UW] = helper_lduw_mmu, + [MO_SW] = helper_ldsw_mmu, + [MO_UL] = helper_ldul_mmu, + [MO_SL] = helper_ldsl_mmu, + [MO_UQ] = helper_ldq_mmu, }; -/* helper signature: helper_ret_st_mmu(CPUState *env, target_ulong addr, - * uintxx_t val, MemOpIdx oi, - * uintptr_t ra) +/* + * helper signature: helper_st*_mmu(CPUState *env, target_ulong addr, + * uintxx_t val, MemOpIdx oi, uintptr_t ra) */ static void * const qemu_st_helpers[MO_SIZE + 1] = { - [MO_8] = helper_ret_stb_mmu, -#if HOST_BIG_ENDIAN - [MO_16] = helper_be_stw_mmu, - [MO_32] = helper_be_stl_mmu, - [MO_64] = helper_be_stq_mmu, -#else - [MO_16] = helper_le_stw_mmu, - [MO_32] = helper_le_stl_mmu, - [MO_64] = helper_le_stq_mmu, -#endif + [MO_8] = helper_stb_mmu, + [MO_16] = helper_stw_mmu, + [MO_32] = helper_stl_mmu, + [MO_64] = helper_stq_mmu, }; /* We don't support oversize guests */ diff --git a/tcg/s390x/tcg-target.c.inc b/tcg/s390x/tcg-target.c.inc index 50655e9d1d..30556c430f 100644 --- a/tcg/s390x/tcg-target.c.inc +++ b/tcg/s390x/tcg-target.c.inc @@ -438,29 +438,21 @@ static const uint8_t tcg_cond_to_ltr_cond[] = { }; #ifdef CONFIG_SOFTMMU -static void * const qemu_ld_helpers[(MO_SSIZE | MO_BSWAP) + 1] = { - [MO_UB] = helper_ret_ldub_mmu, - [MO_SB] = helper_ret_ldsb_mmu, - [MO_LEUW] = helper_le_lduw_mmu, - [MO_LESW] = helper_le_ldsw_mmu, - [MO_LEUL] = helper_le_ldul_mmu, - [MO_LESL] = helper_le_ldsl_mmu, - [MO_LEUQ] = helper_le_ldq_mmu, - [MO_BEUW] = helper_be_lduw_mmu, - [MO_BESW] = helper_be_ldsw_mmu, - [MO_BEUL] = helper_be_ldul_mmu, - [MO_BESL] = helper_be_ldsl_mmu, - [MO_BEUQ] = helper_be_ldq_mmu, +static void * const qemu_ld_helpers[MO_SSIZE + 1] = { + [MO_UB] = helper_ldub_mmu, + [MO_SB] = helper_ldsb_mmu, + [MO_UW] = helper_lduw_mmu, + [MO_SW] = 
helper_ldsw_mmu, + [MO_UL] = helper_ldul_mmu, + [MO_SL] = helper_ldsl_mmu, + [MO_UQ] = helper_ldq_mmu, }; -static void * const qemu_st_helpers[(MO_SIZE | MO_BSWAP) + 1] = { - [MO_UB] = helper_ret_stb_mmu, - [MO_LEUW] = helper_le_stw_mmu, - [MO_LEUL] = helper_le_stl_mmu, - [MO_LEUQ] = helper_le_stq_mmu, - [MO_BEUW] = helper_be_stw_mmu, - [MO_BEUL] = helper_be_stl_mmu, - [MO_BEUQ] = helper_be_stq_mmu, +static void * const qemu_st_helpers[MO_SIZE + 1] = { + [MO_UB] = helper_stb_mmu, + [MO_UW] = helper_stw_mmu, + [MO_UL] = helper_stl_mmu, + [MO_UQ] = helper_stq_mmu, }; #endif @@ -1913,7 +1905,7 @@ static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *lb) } tcg_out_movi(s, TCG_TYPE_I32, TCG_REG_R4, oi); tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R5, (uintptr_t)lb->raddr); - tcg_out_call_int(s, qemu_ld_helpers[opc & (MO_BSWAP | MO_SSIZE)]); + tcg_out_call_int(s, qemu_ld_helpers[opc & MO_SSIZE]); tcg_out_mov(s, TCG_TYPE_I64, data_reg, TCG_REG_R2); tgen_gotoi(s, S390_CC_ALWAYS, lb->raddr); @@ -1954,7 +1946,7 @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *lb) } tcg_out_movi(s, TCG_TYPE_I32, TCG_REG_R5, oi); tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_R6, (uintptr_t)lb->raddr); - tcg_out_call_int(s, qemu_st_helpers[opc & (MO_BSWAP | MO_SIZE)]); + tcg_out_call_int(s, qemu_st_helpers[opc & MO_SIZE]); tgen_gotoi(s, S390_CC_ALWAYS, lb->raddr); return true; diff --git a/tcg/sparc64/tcg-target.c.inc b/tcg/sparc64/tcg-target.c.inc index 9b5afb8248..f9334b7c56 100644 --- a/tcg/sparc64/tcg-target.c.inc +++ b/tcg/sparc64/tcg-target.c.inc @@ -880,8 +880,8 @@ static void tcg_out_mb(TCGContext *s, TCGArg a0) } #ifdef CONFIG_SOFTMMU -static const tcg_insn_unit *qemu_ld_trampoline[(MO_SSIZE | MO_BSWAP) + 1]; -static const tcg_insn_unit *qemu_st_trampoline[(MO_SIZE | MO_BSWAP) + 1]; +static const tcg_insn_unit *qemu_ld_trampoline[MO_SSIZE + 1]; +static const tcg_insn_unit *qemu_st_trampoline[MO_SIZE + 1]; static void emit_extend(TCGContext *s, TCGReg r, int op) { @@ -907,25 +907,18 @@ static void emit_extend(TCGContext *s, TCGReg r, int op) static void build_trampolines(TCGContext *s) { static void * const qemu_ld_helpers[] = { - [MO_UB] = helper_ret_ldub_mmu, - [MO_SB] = helper_ret_ldsb_mmu, - [MO_LEUW] = helper_le_lduw_mmu, - [MO_LESW] = helper_le_ldsw_mmu, - [MO_LEUL] = helper_le_ldul_mmu, - [MO_LEUQ] = helper_le_ldq_mmu, - [MO_BEUW] = helper_be_lduw_mmu, - [MO_BESW] = helper_be_ldsw_mmu, - [MO_BEUL] = helper_be_ldul_mmu, - [MO_BEUQ] = helper_be_ldq_mmu, + [MO_UB] = helper_ldub_mmu, + [MO_SB] = helper_ldsb_mmu, + [MO_UW] = helper_lduw_mmu, + [MO_SW] = helper_ldsw_mmu, + [MO_UL] = helper_ldul_mmu, + [MO_UQ] = helper_ldq_mmu, }; static void * const qemu_st_helpers[] = { - [MO_UB] = helper_ret_stb_mmu, - [MO_LEUW] = helper_le_stw_mmu, - [MO_LEUL] = helper_le_stl_mmu, - [MO_LEUQ] = helper_le_stq_mmu, - [MO_BEUW] = helper_be_stw_mmu, - [MO_BEUL] = helper_be_stl_mmu, - [MO_BEUQ] = helper_be_stq_mmu, + [MO_UB] = helper_stb_mmu, + [MO_UW] = helper_stw_mmu, + [MO_UL] = helper_stl_mmu, + [MO_UQ] = helper_stq_mmu, }; int i; @@ -1196,9 +1189,9 @@ static void tcg_out_qemu_ld(TCGContext *s, TCGReg data, TCGReg addr, /* We use the helpers to extend SB and SW data, leaving the case of SL needing explicit extending below. 
      */
     if ((memop & MO_SSIZE) == MO_SL) {
-        func = qemu_ld_trampoline[memop & (MO_BSWAP | MO_SIZE)];
+        func = qemu_ld_trampoline[MO_UL];
     } else {
-        func = qemu_ld_trampoline[memop & (MO_BSWAP | MO_SSIZE)];
+        func = qemu_ld_trampoline[memop & MO_SSIZE];
     }
     tcg_debug_assert(func != NULL);
     tcg_out_call_nodelay(s, func, false);
@@ -1338,7 +1331,7 @@ static void tcg_out_qemu_st(TCGContext *s, TCGReg data, TCGReg addr,
     tcg_out_mov(s, TCG_TYPE_REG, TCG_REG_O1, addrz);
     tcg_out_mov(s, TCG_TYPE_REG, TCG_REG_O2, data);

-    func = qemu_st_trampoline[memop & (MO_BSWAP | MO_SIZE)];
+    func = qemu_st_trampoline[memop & MO_SIZE];
     tcg_debug_assert(func != NULL);
     tcg_out_call_nodelay(s, func, false);
     /* delay slot */
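[Editor's note: the sketch below is illustrative commentary, not part of
the patch.  It models, under simplified assumptions, the dispatch shape
this patch converges on: one helper per access size, selected by the
backend with "opc & MO_SIZE", with byte order carried in the operation
descriptor rather than baked into the helper's name.  The MemOp encoding,
helper signatures, and main() harness are invented stand-ins for the real
CPUArchState/MemOpIdx machinery.]

    #include <assert.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Simplified stand-ins for the MO_* flags used in the patch. */
    enum {
        MO_8 = 0, MO_16 = 1, MO_32 = 2, MO_64 = 3,
        MO_SIZE  = 3,        /* mask for the size field */
        MO_BSWAP = 1 << 3,   /* value needs a byte swap on this host */
    };

    static uint64_t helper_ldub(const uint8_t *p, unsigned mop)
    {
        /* Only the size is asserted; endianness is data, not a name. */
        assert((mop & MO_SIZE) == MO_8);
        return p[0];
    }

    static uint64_t helper_lduw(const uint8_t *p, unsigned mop)
    {
        assert((mop & MO_SIZE) == MO_16);
        uint16_t ret = (uint16_t)(p[0] | (p[1] << 8));
        if (mop & MO_BSWAP) {
            ret = (uint16_t)((ret >> 8) | (ret << 8));
        }
        return ret;
    }

    /* One table indexed by size only, as in the backend tables above. */
    static uint64_t (* const qemu_ld_helpers[MO_SIZE + 1])(const uint8_t *,
                                                           unsigned) = {
        [MO_8]  = helper_ldub,
        [MO_16] = helper_lduw,
    };

    int main(void)
    {
        const uint8_t mem[2] = { 0x34, 0x12 };
        unsigned opc = MO_16 | MO_BSWAP;  /* e.g. a byte-swapped uint16 */
        printf("0x%04x\n",
               (unsigned)qemu_ld_helpers[opc & MO_SIZE](mem, opc));
        return 0;
    }

[In QEMU itself the descriptor is built with make_memop_idx() and the
helpers take (env, addr, oi, retaddr); the sketch collapses that so it
stays self-contained and compilable.]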
From patchwork Fri Nov 18 09:47:36 2022
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH for-8.0 11/29] accel/tcg: Implement helper_{ld,st}*_mmu for user-only
Date: Fri, 18 Nov 2022 01:47:36 -0800
Message-Id: <20221118094754.242910-12-richard.henderson@linaro.org>
In-Reply-To: <20221118094754.242910-1-richard.henderson@linaro.org>
References: <20221118094754.242910-1-richard.henderson@linaro.org>

TCG backends may need to defer to a helper to implement the atomicity
required by a given operation.  Mirror the interface used in system mode.

Signed-off-by: Richard Henderson
---
 include/tcg/tcg-ldst.h |   6 +-
 accel/tcg/user-exec.c  | 392 ++++++++++++++++++++++++++++-------------
 2 files changed, 276 insertions(+), 122 deletions(-)

diff --git a/include/tcg/tcg-ldst.h b/include/tcg/tcg-ldst.h
index 56fa7afe5e..c1d945fd66 100644
--- a/include/tcg/tcg-ldst.h
+++ b/include/tcg/tcg-ldst.h
@@ -25,8 +25,6 @@
 #ifndef TCG_LDST_H
 #define TCG_LDST_H

-#ifdef CONFIG_SOFTMMU
-
 /* Value zero-extended to tcg register size. */
 tcg_target_ulong helper_ldub_mmu(CPUArchState *env, target_ulong addr,
                                  MemOpIdx oi, uintptr_t retaddr);
@@ -54,10 +52,10 @@ void helper_stl_mmu(CPUArchState *env, target_ulong addr, uint32_t val,
 void helper_stq_mmu(CPUArchState *env, target_ulong addr, uint64_t val,
                     MemOpIdx oi, uintptr_t retaddr);

-#else
+#ifdef CONFIG_USER_ONLY
 G_NORETURN void helper_unaligned_ld(CPUArchState *env, target_ulong addr);
 G_NORETURN void helper_unaligned_st(CPUArchState *env, target_ulong addr);
-#endif /* CONFIG_SOFTMMU */
+#endif /* CONFIG_USER_ONLY*/

 #endif /* TCG_LDST_H */
diff --git a/accel/tcg/user-exec.c b/accel/tcg/user-exec.c
index ddba8c9dd7..3455ff45a4 100644
--- a/accel/tcg/user-exec.c
+++ b/accel/tcg/user-exec.c
@@ -254,21 +254,6 @@ void *page_get_target_data(target_ulong address)

 /* The softmmu versions of these helpers are in cputlb.c. */

-/*
- * Verify that we have passed the correct MemOp to the correct function.
- *
- * We could present one function to target code, and dispatch based on
- * the MemOp, but so far we have worked hard to avoid an indirect function
- * call along the memory path.
- */ -static void validate_memop(MemOpIdx oi, MemOp expected) -{ -#ifdef CONFIG_DEBUG_TCG - MemOp have = get_memop(oi) & (MO_SIZE | MO_BSWAP); - assert(have == expected); -#endif -} - void helper_unaligned_ld(CPUArchState *env, target_ulong addr) { cpu_loop_exit_sigbus(env_cpu(env), addr, MMU_DATA_LOAD, GETPC()); @@ -279,10 +264,9 @@ void helper_unaligned_st(CPUArchState *env, target_ulong addr) cpu_loop_exit_sigbus(env_cpu(env), addr, MMU_DATA_STORE, GETPC()); } -static void *cpu_mmu_lookup(CPUArchState *env, target_ulong addr, - MemOpIdx oi, uintptr_t ra, MMUAccessType type) +static void *cpu_mmu_lookup(CPUArchState *env, abi_ptr addr, + MemOp mop, uintptr_t ra, MMUAccessType type) { - MemOp mop = get_memop(oi); int a_bits = get_alignment_bits(mop); void *ret; @@ -298,100 +282,206 @@ static void *cpu_mmu_lookup(CPUArchState *env, target_ulong addr, #include "ldst_atomicity.c.inc" -uint8_t cpu_ldb_mmu(CPUArchState *env, abi_ptr addr, - MemOpIdx oi, uintptr_t ra) +static uint8_t do_ld1_mmu(CPUArchState *env, abi_ptr addr, + MemOp mop, uintptr_t ra) { void *haddr; uint8_t ret; - validate_memop(oi, MO_UB); - haddr = cpu_mmu_lookup(env, addr, oi, ra, MMU_DATA_LOAD); + tcg_debug_assert((mop & MO_SIZE) == MO_8); + haddr = cpu_mmu_lookup(env, addr, mop, ra, MMU_DATA_LOAD); ret = ldub_p(haddr); clear_helper_retaddr(); + return ret; +} + +tcg_target_ulong helper_ldub_mmu(CPUArchState *env, target_ulong addr, + MemOpIdx oi, uintptr_t ra) +{ + return do_ld1_mmu(env, addr, get_memop(oi), ra); +} + +tcg_target_ulong helper_ldsb_mmu(CPUArchState *env, target_ulong addr, + MemOpIdx oi, uintptr_t ra) +{ + return (int8_t)do_ld1_mmu(env, addr, get_memop(oi), ra); +} + +uint8_t cpu_ldb_mmu(CPUArchState *env, abi_ptr addr, + MemOpIdx oi, uintptr_t ra) +{ + uint8_t ret = do_ld1_mmu(env, addr, get_memop(oi), ra); qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_R); return ret; } +static uint16_t do_ld2_he_mmu(CPUArchState *env, abi_ptr addr, + MemOp mop, uintptr_t ra) +{ + void *haddr; + uint16_t ret; + + tcg_debug_assert((mop & MO_SIZE) == MO_16); + haddr = cpu_mmu_lookup(env, addr, mop, ra, MMU_DATA_LOAD); + ret = load_atom_2(env, ra, haddr, mop); + clear_helper_retaddr(); + return ret; +} + +tcg_target_ulong helper_lduw_mmu(CPUArchState *env, target_ulong addr, + MemOpIdx oi, uintptr_t ra) +{ + MemOp mop = get_memop(oi); + uint16_t ret = do_ld2_he_mmu(env, addr, mop, ra); + + if (mop & MO_BSWAP) { + ret = bswap16(ret); + } + return ret; +} + +tcg_target_ulong helper_ldsw_mmu(CPUArchState *env, target_ulong addr, + MemOpIdx oi, uintptr_t ra) +{ + MemOp mop = get_memop(oi); + int16_t ret = do_ld2_he_mmu(env, addr, mop, ra); + + if (mop & MO_BSWAP) { + ret = bswap16(ret); + } + return ret; +} + uint16_t cpu_ldw_be_mmu(CPUArchState *env, abi_ptr addr, MemOpIdx oi, uintptr_t ra) { - void *haddr; + MemOp mop = get_memop(oi); uint16_t ret; - validate_memop(oi, MO_BEUW); - haddr = cpu_mmu_lookup(env, addr, oi, ra, MMU_DATA_LOAD); - ret = load_atom_2(env, ra, haddr, get_memop(oi)); - clear_helper_retaddr(); + tcg_debug_assert((mop & MO_BSWAP) == MO_BE); + ret = do_ld2_he_mmu(env, addr, mop, ra); qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_R); return cpu_to_be16(ret); } -uint32_t cpu_ldl_be_mmu(CPUArchState *env, abi_ptr addr, - MemOpIdx oi, uintptr_t ra) -{ - void *haddr; - uint32_t ret; - - validate_memop(oi, MO_BEUL); - haddr = cpu_mmu_lookup(env, addr, oi, ra, MMU_DATA_LOAD); - ret = load_atom_4(env, ra, haddr, get_memop(oi)); - clear_helper_retaddr(); - 
qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_R); - return cpu_to_be32(ret); -} - -uint64_t cpu_ldq_be_mmu(CPUArchState *env, abi_ptr addr, - MemOpIdx oi, uintptr_t ra) -{ - void *haddr; - uint64_t ret; - - validate_memop(oi, MO_BEUQ); - haddr = cpu_mmu_lookup(env, addr, oi, ra, MMU_DATA_LOAD); - ret = load_atom_8(env, ra, haddr, get_memop(oi)); - clear_helper_retaddr(); - qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_R); - return cpu_to_be64(ret); -} - uint16_t cpu_ldw_le_mmu(CPUArchState *env, abi_ptr addr, MemOpIdx oi, uintptr_t ra) { - void *haddr; + MemOp mop = get_memop(oi); uint16_t ret; - validate_memop(oi, MO_LEUW); - haddr = cpu_mmu_lookup(env, addr, oi, ra, MMU_DATA_LOAD); - ret = load_atom_2(env, ra, haddr, get_memop(oi)); - clear_helper_retaddr(); + tcg_debug_assert((mop & MO_BSWAP) == MO_LE); + ret = do_ld2_he_mmu(env, addr, mop, ra); qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_R); return cpu_to_le16(ret); } +static uint32_t do_ld4_he_mmu(CPUArchState *env, abi_ptr addr, + MemOp mop, uintptr_t ra) +{ + void *haddr; + uint32_t ret; + + tcg_debug_assert((mop & MO_SIZE) == MO_32); + haddr = cpu_mmu_lookup(env, addr, mop, ra, MMU_DATA_LOAD); + ret = load_atom_4(env, ra, haddr, mop); + clear_helper_retaddr(); + return ret; +} + +tcg_target_ulong helper_ldul_mmu(CPUArchState *env, target_ulong addr, + MemOpIdx oi, uintptr_t ra) +{ + MemOp mop = get_memop(oi); + uint32_t ret = do_ld4_he_mmu(env, addr, mop, ra); + + if (mop & MO_BSWAP) { + ret = bswap32(ret); + } + return ret; +} + +tcg_target_ulong helper_ldsl_mmu(CPUArchState *env, target_ulong addr, + MemOpIdx oi, uintptr_t ra) +{ + MemOp mop = get_memop(oi); + int32_t ret = do_ld4_he_mmu(env, addr, mop, ra); + + if (mop & MO_BSWAP) { + ret = bswap32(ret); + } + return ret; +} + +uint32_t cpu_ldl_be_mmu(CPUArchState *env, abi_ptr addr, + MemOpIdx oi, uintptr_t ra) +{ + MemOp mop = get_memop(oi); + uint32_t ret; + + tcg_debug_assert((mop & MO_BSWAP) == MO_BE); + ret = do_ld4_he_mmu(env, addr, mop, ra); + qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_R); + return cpu_to_be32(ret); +} + uint32_t cpu_ldl_le_mmu(CPUArchState *env, abi_ptr addr, MemOpIdx oi, uintptr_t ra) { - void *haddr; + MemOp mop = get_memop(oi); uint32_t ret; - validate_memop(oi, MO_LEUL); - haddr = cpu_mmu_lookup(env, addr, oi, ra, MMU_DATA_LOAD); - ret = load_atom_4(env, ra, haddr, get_memop(oi)); - clear_helper_retaddr(); + tcg_debug_assert((mop & MO_BSWAP) == MO_LE); + ret = do_ld4_he_mmu(env, addr, mop, ra); qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_R); return cpu_to_le32(ret); } +static uint64_t do_ld8_he_mmu(CPUArchState *env, abi_ptr addr, + MemOp mop, uintptr_t ra) +{ + void *haddr; + uint64_t ret; + + tcg_debug_assert((mop & MO_SIZE) == MO_64); + haddr = cpu_mmu_lookup(env, addr, mop, ra, MMU_DATA_LOAD); + ret = load_atom_8(env, ra, haddr, mop); + clear_helper_retaddr(); + return ret; +} + +uint64_t helper_ldq_mmu(CPUArchState *env, target_ulong addr, + MemOpIdx oi, uintptr_t ra) +{ + MemOp mop = get_memop(oi); + uint64_t ret = do_ld8_he_mmu(env, addr, mop, ra); + + if (mop & MO_BSWAP) { + ret = bswap64(ret); + } + return ret; +} + +uint64_t cpu_ldq_be_mmu(CPUArchState *env, abi_ptr addr, + MemOpIdx oi, uintptr_t ra) +{ + MemOp mop = get_memop(oi); + uint64_t ret; + + tcg_debug_assert((mop & MO_BSWAP) == MO_BE); + ret = do_ld8_he_mmu(env, addr, mop, ra); + qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_R); + return cpu_to_be64(ret); +} + 
uint64_t cpu_ldq_le_mmu(CPUArchState *env, abi_ptr addr, MemOpIdx oi, uintptr_t ra) { - void *haddr; + MemOp mop = get_memop(oi); uint64_t ret; - validate_memop(oi, MO_LEUQ); - haddr = cpu_mmu_lookup(env, addr, oi, ra, MMU_DATA_LOAD); - ret = load_atom_8(env, ra, haddr, get_memop(oi)); - clear_helper_retaddr(); + tcg_debug_assert((mop & MO_BSWAP) == MO_LE); + ret = do_ld8_he_mmu(env, addr, mop, ra); qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_R); return cpu_to_le64(ret); } @@ -402,7 +492,7 @@ Int128 cpu_ld16_be_mmu(CPUArchState *env, abi_ptr addr, void *haddr; Int128 ret; - validate_memop(oi, MO_128 | MO_BE); + tcg_debug_assert((get_memop(oi) & (MO_BSWAP | MO_SIZE)) == (MO_128 | MO_BE)); haddr = cpu_mmu_lookup(env, addr, oi, ra, MMU_DATA_LOAD); memcpy(&ret, haddr, 16); clear_helper_retaddr(); @@ -420,7 +510,7 @@ Int128 cpu_ld16_le_mmu(CPUArchState *env, abi_ptr addr, void *haddr; Int128 ret; - validate_memop(oi, MO_128 | MO_LE); + tcg_debug_assert((get_memop(oi) & (MO_BSWAP | MO_SIZE)) == (MO_128 | MO_LE)); haddr = cpu_mmu_lookup(env, addr, oi, ra, MMU_DATA_LOAD); memcpy(&ret, haddr, 16); clear_helper_retaddr(); @@ -432,87 +522,153 @@ Int128 cpu_ld16_le_mmu(CPUArchState *env, abi_ptr addr, return ret; } -void cpu_stb_mmu(CPUArchState *env, abi_ptr addr, uint8_t val, - MemOpIdx oi, uintptr_t ra) +static void do_st1_mmu(CPUArchState *env, abi_ptr addr, uint8_t val, + MemOp mop, uintptr_t ra) { void *haddr; - validate_memop(oi, MO_UB); - haddr = cpu_mmu_lookup(env, addr, oi, ra, MMU_DATA_STORE); + tcg_debug_assert((mop & MO_SIZE) == MO_8); + haddr = cpu_mmu_lookup(env, addr, mop, ra, MMU_DATA_STORE); stb_p(haddr, val); clear_helper_retaddr(); +} + +void helper_stb_mmu(CPUArchState *env, target_ulong addr, uint8_t val, + MemOpIdx oi, uintptr_t ra) +{ + do_st1_mmu(env, addr, val, get_memop(oi), ra); +} + +void cpu_stb_mmu(CPUArchState *env, abi_ptr addr, uint8_t val, + MemOpIdx oi, uintptr_t ra) +{ + do_st1_mmu(env, addr, val, get_memop(oi), ra); qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_W); } +static void do_st2_he_mmu(CPUArchState *env, abi_ptr addr, uint16_t val, + MemOp mop, uintptr_t ra) +{ + void *haddr; + + tcg_debug_assert((mop & MO_SIZE) == MO_16); + haddr = cpu_mmu_lookup(env, addr, mop, ra, MMU_DATA_STORE); + store_atom_2(env, ra, haddr, mop, val); + clear_helper_retaddr(); +} + +void helper_stw_mmu(CPUArchState *env, target_ulong addr, uint16_t val, + MemOpIdx oi, uintptr_t ra) +{ + MemOp mop = get_memop(oi); + + if (mop & MO_BSWAP) { + val = bswap16(val); + } + do_st2_he_mmu(env, addr, val, mop, ra); +} + void cpu_stw_be_mmu(CPUArchState *env, abi_ptr addr, uint16_t val, MemOpIdx oi, uintptr_t ra) { - void *haddr; + MemOp mop = get_memop(oi); - validate_memop(oi, MO_BEUW); - haddr = cpu_mmu_lookup(env, addr, oi, ra, MMU_DATA_STORE); - store_atom_2(env, ra, haddr, get_memop(oi), be16_to_cpu(val)); - clear_helper_retaddr(); - qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_W); -} - -void cpu_stl_be_mmu(CPUArchState *env, abi_ptr addr, uint32_t val, - MemOpIdx oi, uintptr_t ra) -{ - void *haddr; - - validate_memop(oi, MO_BEUL); - haddr = cpu_mmu_lookup(env, addr, oi, ra, MMU_DATA_STORE); - store_atom_4(env, ra, haddr, get_memop(oi), be32_to_cpu(val)); - clear_helper_retaddr(); - qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_W); -} - -void cpu_stq_be_mmu(CPUArchState *env, abi_ptr addr, uint64_t val, - MemOpIdx oi, uintptr_t ra) -{ - void *haddr; - - validate_memop(oi, MO_BEUQ); - haddr = cpu_mmu_lookup(env, 
addr, oi, ra, MMU_DATA_STORE); - store_atom_8(env, ra, haddr, get_memop(oi), be64_to_cpu(val)); - clear_helper_retaddr(); + tcg_debug_assert((mop & MO_BSWAP) == MO_BE); + do_st2_he_mmu(env, addr, be16_to_cpu(val), mop, ra); qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_W); } void cpu_stw_le_mmu(CPUArchState *env, abi_ptr addr, uint16_t val, MemOpIdx oi, uintptr_t ra) +{ + MemOp mop = get_memop(oi); + + tcg_debug_assert((mop & MO_BSWAP) == MO_LE); + do_st2_he_mmu(env, addr, le16_to_cpu(val), mop, ra); + qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_W); +} + +static void do_st4_he_mmu(CPUArchState *env, abi_ptr addr, uint32_t val, + MemOp mop, uintptr_t ra) { void *haddr; - validate_memop(oi, MO_LEUW); - haddr = cpu_mmu_lookup(env, addr, oi, ra, MMU_DATA_STORE); - store_atom_2(env, ra, haddr, get_memop(oi), le16_to_cpu(val)); + tcg_debug_assert((mop & MO_SIZE) == MO_32); + haddr = cpu_mmu_lookup(env, addr, mop, ra, MMU_DATA_STORE); + store_atom_4(env, ra, haddr, mop, val); clear_helper_retaddr(); +} + +void helper_stl_mmu(CPUArchState *env, target_ulong addr, uint32_t val, + MemOpIdx oi, uintptr_t ra) +{ + MemOp mop = get_memop(oi); + + if (mop & MO_BSWAP) { + val = bswap32(val); + } + do_st4_he_mmu(env, addr, val, mop, ra); +} + +void cpu_stl_be_mmu(CPUArchState *env, abi_ptr addr, uint32_t val, + MemOpIdx oi, uintptr_t ra) +{ + MemOp mop = get_memop(oi); + + tcg_debug_assert((mop & MO_BSWAP) == MO_BE); + do_st4_he_mmu(env, addr, be32_to_cpu(val), mop, ra); qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_W); } void cpu_stl_le_mmu(CPUArchState *env, abi_ptr addr, uint32_t val, MemOpIdx oi, uintptr_t ra) +{ + MemOp mop = get_memop(oi); + + tcg_debug_assert((mop & MO_BSWAP) == MO_LE); + do_st4_he_mmu(env, addr, le32_to_cpu(val), mop, ra); + qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_W); +} + +static void do_st8_he_mmu(CPUArchState *env, abi_ptr addr, uint64_t val, + MemOp mop, uintptr_t ra) { void *haddr; - validate_memop(oi, MO_LEUL); - haddr = cpu_mmu_lookup(env, addr, oi, ra, MMU_DATA_STORE); - store_atom_4(env, ra, haddr, get_memop(oi), le32_to_cpu(val)); + tcg_debug_assert((mop & MO_SIZE) == MO_64); + haddr = cpu_mmu_lookup(env, addr, mop, ra, MMU_DATA_STORE); + store_atom_8(env, ra, haddr, mop, val); clear_helper_retaddr(); +} + +void helper_stq_mmu(CPUArchState *env, target_ulong addr, uint64_t val, + MemOpIdx oi, uintptr_t ra) +{ + MemOp mop = get_memop(oi); + + if (mop & MO_BSWAP) { + val = bswap64(val); + } + do_st8_he_mmu(env, addr, val, mop, ra); +} + +void cpu_stq_be_mmu(CPUArchState *env, abi_ptr addr, uint64_t val, + MemOpIdx oi, uintptr_t ra) +{ + MemOp mop = get_memop(oi); + + tcg_debug_assert((mop & MO_BSWAP) == MO_BE); + do_st8_he_mmu(env, addr, cpu_to_be64(val), mop, ra); qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_W); } void cpu_stq_le_mmu(CPUArchState *env, abi_ptr addr, uint64_t val, MemOpIdx oi, uintptr_t ra) { - void *haddr; + MemOp mop = get_memop(oi); - validate_memop(oi, MO_LEUQ); - haddr = cpu_mmu_lookup(env, addr, oi, ra, MMU_DATA_STORE); - store_atom_8(env, ra, haddr, get_memop(oi), le64_to_cpu(val)); - clear_helper_retaddr(); + tcg_debug_assert((mop & MO_BSWAP) == MO_LE); + do_st8_he_mmu(env, addr, cpu_to_le64(val), mop, ra); qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_W); } @@ -521,7 +677,7 @@ void cpu_st16_be_mmu(CPUArchState *env, abi_ptr addr, { void *haddr; - validate_memop(oi, MO_128 | MO_BE); + tcg_debug_assert((get_memop(oi) & (MO_BSWAP | 
MO_SIZE)) == (MO_128 | MO_BE)); haddr = cpu_mmu_lookup(env, addr, oi, ra, MMU_DATA_STORE); if (!HOST_BIG_ENDIAN) { val = bswap128(val); @@ -536,7 +692,7 @@ void cpu_st16_le_mmu(CPUArchState *env, abi_ptr addr, { void *haddr; - validate_memop(oi, MO_128 | MO_LE); + tcg_debug_assert((get_memop(oi) & (MO_BSWAP | MO_SIZE)) == (MO_128 | MO_LE)); haddr = cpu_mmu_lookup(env, addr, oi, ra, MMU_DATA_STORE); if (HOST_BIG_ENDIAN) { val = bswap128(val);
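The assertions above check only the MO_BSWAP and MO_SIZE fields of the MemOpIdx. As a reminder of what get_memop() is unpacking, here is a minimal stand-alone sketch of the MemOpIdx packing, modelled on include/exec/memopidx.h; the simplified typedefs are assumptions of this sketch, not QEMU code:

    /* Sketch only: a MemOpIdx packs a MemOp together with an mmu index
     * held in the low 4 bits. */
    typedef unsigned MemOp;
    typedef unsigned MemOpIdx;

    static inline MemOpIdx make_memop_idx(MemOp op, unsigned idx)
    {
        return (op << 4) | idx;    /* idx must fit in 4 bits */
    }

    static inline MemOp get_memop(MemOpIdx oi)
    {
        return oi >> 4;
    }

    static inline unsigned get_mmuidx(MemOpIdx oi)
    {
        return oi & 15;
    }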
From patchwork Fri Nov 18 09:47:37 2022 From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH for-8.0 12/29] tcg: Add 128-bit guest memory primitives Date: Fri, 18 Nov 2022 01:47:37 -0800 Message-Id: <20221118094754.242910-13-richard.henderson@linaro.org> In-Reply-To: <20221118094754.242910-1-richard.henderson@linaro.org> References: <20221118094754.242910-1-richard.henderson@linaro.org> Signed-off-by: Richard Henderson --- accel/tcg/tcg-runtime.h | 3 + include/tcg/tcg-ldst.h | 4 + accel/tcg/cputlb.c | 480 +++++++++++++++++++++++++-------- accel/tcg/user-exec.c | 94 +++++-- tcg/tcg-op.c | 178 +++++++----- accel/tcg/ldst_atomicity.c.inc | 175 +++++++++++- 6 files changed, 729 insertions(+), 205 deletions(-) diff --git a/accel/tcg/tcg-runtime.h b/accel/tcg/tcg-runtime.h index e141a6ab24..a7a2038901 100644 --- a/accel/tcg/tcg-runtime.h +++ b/accel/tcg/tcg-runtime.h @@ -39,6 +39,9 @@ DEF_HELPER_FLAGS_1(exit_atomic, TCG_CALL_NO_WG, noreturn, env) DEF_HELPER_FLAGS_3(memset, TCG_CALL_NO_RWG, ptr, ptr, int, ptr) #endif /* IN_HELPER_PROTO */ +DEF_HELPER_FLAGS_3(ld_i128, TCG_CALL_NO_WG, i128, env, tl, i32) +DEF_HELPER_FLAGS_4(st_i128, TCG_CALL_NO_WG, void, env, tl, i128, i32) + DEF_HELPER_FLAGS_5(atomic_cmpxchgb, TCG_CALL_NO_WG, i32, env, tl, i32, i32, i32) DEF_HELPER_FLAGS_5(atomic_cmpxchgw_be, TCG_CALL_NO_WG, diff --git a/include/tcg/tcg-ldst.h b/include/tcg/tcg-ldst.h index c1d945fd66..3004e5292d 100644 --- a/include/tcg/tcg-ldst.h +++ b/include/tcg/tcg-ldst.h @@ -34,6 +34,8 @@ tcg_target_ulong helper_ldul_mmu(CPUArchState *env, target_ulong addr, MemOpIdx oi, uintptr_t retaddr); uint64_t helper_ldq_mmu(CPUArchState *env, target_ulong addr, MemOpIdx oi, uintptr_t retaddr); +Int128 helper_ld16_mmu(CPUArchState *env, target_ulong addr, + MemOpIdx oi, uintptr_t retaddr); /* Value sign-extended to tcg register size. */ tcg_target_ulong helper_ldsb_mmu(CPUArchState *env, target_ulong addr, @@ -51,6 +53,8 @@ void helper_stl_mmu(CPUArchState *env, target_ulong addr, uint32_t val, MemOpIdx oi, uintptr_t retaddr); void helper_stq_mmu(CPUArchState *env, target_ulong addr, uint64_t val, MemOpIdx oi, uintptr_t retaddr); +void helper_st16_mmu(CPUArchState *env, target_ulong addr, Int128 val, + MemOpIdx oi, uintptr_t retaddr); #ifdef CONFIG_USER_ONLY
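The new helpers traffic in Int128 values. For orientation, a stand-alone sketch of the accessors used throughout this series, following the native-__int128 branch of include/qemu/int128.h; the typedef here is an assumption of the sketch:

    #include <stdint.h>

    /* Sketch assuming a host compiler with native 128-bit integers. */
    typedef __int128_t Int128;

    static inline Int128 int128_make128(uint64_t lo, uint64_t hi)
    {
        /* Low half in bits 0..63, high half in bits 64..127. */
        return (__uint128_t)hi << 64 | lo;
    }

    static inline uint64_t int128_getlo(Int128 a)
    {
        return (uint64_t)a;
    }

    static inline uint64_t int128_gethi(Int128 a)
    {
        return (uint64_t)((__uint128_t)a >> 64);
    }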
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c index 3d32adc0e7..314dbfa83d 100644 --- a/accel/tcg/cputlb.c +++ b/accel/tcg/cputlb.c @@ -40,6 +40,7 @@ #include "qemu/plugin-memory.h" #endif #include "tcg/tcg-ldst.h" +#include "exec/helper-proto.h" /* DEBUG defines, enable DEBUG_TLB_LOG to log to the CPU_LOG_MMU target */ /* #define DEBUG_TLB */ @@ -2130,6 +2131,31 @@ static uint64_t do_ld_whole_be8(CPUArchState *env, uintptr_t ra, return (ret_be << (p->size * 8)) | x; } +/** + * do_ld_whole_be16 + * @p: translation parameters + * @ret_be: accumulated data + * + * As do_ld_bytes_beN, but with one atomic load. + * 16 aligned bytes are guaranteed to cover the load. + */ +static Int128 do_ld_whole_be16(CPUArchState *env, uintptr_t ra, + MMULookupPageData *p, uint64_t ret_be) +{ + int o = p->addr & 15; + Int128 x, y = load_atomic16_or_exit(env, ra, p->haddr - o); + int size = p->size; + + if (!HOST_BIG_ENDIAN) { + y = bswap128(y); + } + y = int128_lshift(y, o * 8); + y = int128_urshift(y, (16 - size) * 8); + x = int128_make64(ret_be); + x = int128_lshift(x, size * 8); + return int128_or(x, y); +} + /* * Wrapper for the above. */ @@ -2174,6 +2200,59 @@ static uint64_t do_ld_beN(CPUArchState *env, MMULookupPageData *p, } } +/* + * Wrapper for the above, for 8 < size < 16. + */ +static Int128 do_ld16_beN(CPUArchState *env, MMULookupPageData *p, + uint64_t a, int mmu_idx, MemOp mop, uintptr_t ra) +{ + int size = p->size; + uint64_t b; + MemOp atmax; + + if (unlikely(p->flags & TLB_MMIO)) { + p->size = size - 8; + a = do_ld_mmio_beN(env, p, a, mmu_idx, MMU_DATA_LOAD, ra); + p->addr += p->size; + p->size = 8; + b = do_ld_mmio_beN(env, p, 0, mmu_idx, MMU_DATA_LOAD, ra); + } else { + switch (mop & MO_ATOM_MASK) { + case MO_ATOM_WITHIN16: + /* + * It is a given that we cross a page and therefore there is no + * atomicity for the load as a whole, but there may be a subobject + * as defined by ATMAX which does not cross a 16-byte boundary. + */ + atmax = mop & MO_ATMAX_MASK; + if (atmax != MO_ATMAX_SIZE) { + atmax >>= MO_ATMAX_SHIFT; + if (unlikely(size >= (1 << atmax))) { + return do_ld_whole_be16(env, ra, p, a); + } + } + /* fall through */ + case MO_ATOM_IFALIGN: + case MO_ATOM_NONE: + p->size = size - 8; + a = do_ld_bytes_beN(p, a); + b = ldq_be_p(p->haddr + size - 8); + break; + case MO_ATOM_SUBALIGN: + p->size = size - 8; + a = do_ld_parts_beN(p, a); + p->haddr += size - 8; + p->size = 8; + b = do_ld_parts_beN(p, 0); + break; + default: + g_assert_not_reached(); + } + } + + return int128_make128(b, a); +} + static uint8_t do_ld_1(CPUArchState *env, MMULookupPageData *p, int mmu_idx, MMUAccessType type, uintptr_t ra) { @@ -2184,6 +2263,21 @@ static uint8_t do_ld_1(CPUArchState *env, MMULookupPageData *p, int mmu_idx, } } +static uint64_t do_ld_8(CPUArchState *env, MMULookupPageData *p, int mmu_idx, + MMUAccessType type, MemOp memop, uintptr_t ra) +{ + if (unlikely(p->flags & TLB_MMIO)) { + return io_readx(env, p->full, mmu_idx, p->addr, ra, type, memop); + } else { + /* Perform the load host endian.
*/ + uint64_t ret = load_atom_8(env, ra, p->haddr, memop); + if (memop & MO_BSWAP) { + ret = bswap64(ret); + } + return ret; + } +} + static uint8_t do_ld1_mmu(CPUArchState *env, target_ulong addr, MemOpIdx oi, uintptr_t ra, MMUAccessType access_type) { @@ -2303,16 +2397,7 @@ static uint64_t do_ld8_mmu(CPUArchState *env, target_ulong addr, MemOpIdx oi, if (l.page[0].flags & TLB_BSWAP) { l.memop ^= MO_BSWAP; } - if (unlikely(l.page[0].flags & TLB_MMIO)) { - ret = io_readx(env, l.page[0].full, l.mmu_idx, addr, ra, - access_type, l.memop); - } else { - /* Perform the load host endian. */ - ret = load_atom_8(env, ra, l.page[0].haddr, l.memop); - if (l.memop & MO_BSWAP) { - ret = bswap64(ret); - } - } + return do_ld_8(env, &l.page[0], l.mmu_idx, access_type, l.memop, ra); } else { assert_no_tlb_bswap; ret = do_ld_beN(env, &l.page[0], 0, l.mmu_idx, @@ -2356,6 +2441,83 @@ tcg_target_ulong helper_ldsl_mmu(CPUArchState *env, target_ulong addr, return (int32_t)helper_ldul_mmu(env, addr, oi, retaddr); } +static Int128 do_ld16_mmu(CPUArchState *env, target_ulong addr, + MemOpIdx oi, uintptr_t ra) +{ + MMULookupLocals l; + bool crosspage; + uint64_t a, b; + Int128 ret; + + crosspage = mmu_lookup(env, addr, oi, ra, MMU_DATA_LOAD, &l); + if (likely(!crosspage)) { + if (l.page[0].flags & TLB_BSWAP) { + l.memop ^= MO_BSWAP; + } + /* Perform the load host endian. */ + if (unlikely(l.page[0].flags & TLB_MMIO)) { + QEMU_IOTHREAD_LOCK_GUARD(); + a = io_readx(env, l.page[0].full, l.mmu_idx, addr, + ra, MMU_DATA_LOAD, MO_64); + b = io_readx(env, l.page[0].full, l.mmu_idx, addr + 8, + ra, MMU_DATA_LOAD, MO_64); + ret = int128_make128(HOST_BIG_ENDIAN ? b : a, + HOST_BIG_ENDIAN ? a : b); + } else { + ret = load_atom_16(env, ra, l.page[0].haddr, l.memop); + } + if (l.memop & MO_BSWAP) { + ret = bswap128(ret); + } + } else { + int first = l.page[0].size; + + assert_no_tlb_bswap; + + if (first == 8) { + MemOp mop8 = (l.memop & ~MO_SIZE) | MO_64; + + a = do_ld_8(env, &l.page[0], l.mmu_idx, MMU_DATA_LOAD, mop8, ra); + b = do_ld_8(env, &l.page[1], l.mmu_idx, MMU_DATA_LOAD, mop8, ra); + if ((mop8 & MO_BSWAP) == MO_LE) { + ret = int128_make128(a, b); + } else { + ret = int128_make128(b, a); + } + } else { + if (first < 8) { + a = do_ld_beN(env, &l.page[0], 0, l.mmu_idx, + MMU_DATA_LOAD, l.memop, ra); + ret = do_ld16_beN(env, &l.page[1], a, l.mmu_idx, l.memop, ra); + } else { + ret = do_ld16_beN(env, &l.page[0], 0, l.mmu_idx, l.memop, ra); + b = int128_getlo(ret); + ret = int128_lshift(ret, l.page[1].size * 8); + a = int128_gethi(ret); + b = do_ld_beN(env, &l.page[1], b, l.mmu_idx, + MMU_DATA_LOAD, l.memop, ra); + ret = int128_make128(b, a); + } + if ((l.memop & MO_BSWAP) == MO_LE) { + ret = bswap128(ret); + } + } + } + return ret; +} + +Int128 helper_ld16_mmu(CPUArchState *env, target_ulong addr, + uint32_t oi, uintptr_t retaddr) +{ + tcg_debug_assert((get_memop(oi) & MO_SIZE) == MO_128); + return do_ld16_mmu(env, addr, oi, retaddr); +} + +Int128 helper_ld_i128(CPUArchState *env, target_ulong addr, uint32_t oi) +{ + return helper_ld16_mmu(env, addr, oi, GETPC()); +} + /* * Load helpers for cpu_ldst.h.
*/ @@ -2444,59 +2606,23 @@ uint64_t cpu_ldq_le_mmu(CPUArchState *env, abi_ptr addr, Int128 cpu_ld16_be_mmu(CPUArchState *env, abi_ptr addr, MemOpIdx oi, uintptr_t ra) { - MemOp mop = get_memop(oi); - int mmu_idx = get_mmuidx(oi); - MemOpIdx new_oi; - unsigned a_bits; - uint64_t h, l; + Int128 ret; - tcg_debug_assert((mop & (MO_BSWAP|MO_SSIZE)) == (MO_BE|MO_128)); - a_bits = get_alignment_bits(mop); - - /* Handle CPU specific unaligned behaviour */ - if (addr & ((1 << a_bits) - 1)) { - cpu_unaligned_access(env_cpu(env), addr, MMU_DATA_LOAD, - mmu_idx, ra); - } - - /* Construct an unaligned 64-bit replacement MemOpIdx. */ - mop = (mop & ~(MO_SIZE | MO_AMASK)) | MO_64 | MO_UNALN; - new_oi = make_memop_idx(mop, mmu_idx); - - h = helper_ldq_mmu(env, addr, new_oi, ra); - l = helper_ldq_mmu(env, addr + 8, new_oi, ra); - - qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_R); - return int128_make128(l, h); + tcg_debug_assert((get_memop(oi) & (MO_BSWAP|MO_SIZE)) == (MO_BE|MO_128)); + ret = do_ld16_mmu(env, addr, oi, ra); + plugin_load_cb(env, addr, oi); + return ret; } Int128 cpu_ld16_le_mmu(CPUArchState *env, abi_ptr addr, MemOpIdx oi, uintptr_t ra) { - MemOp mop = get_memop(oi); - int mmu_idx = get_mmuidx(oi); - MemOpIdx new_oi; - unsigned a_bits; - uint64_t h, l; + Int128 ret; - tcg_debug_assert((mop & (MO_BSWAP|MO_SSIZE)) == (MO_LE|MO_128)); - a_bits = get_alignment_bits(mop); - - /* Handle CPU specific unaligned behaviour */ - if (addr & ((1 << a_bits) - 1)) { - cpu_unaligned_access(env_cpu(env), addr, MMU_DATA_LOAD, - mmu_idx, ra); - } - - /* Construct an unaligned 64-bit replacement MemOpIdx. */ - mop = (mop & ~(MO_SIZE | MO_AMASK)) | MO_64 | MO_UNALN; - new_oi = make_memop_idx(mop, mmu_idx); - - l = helper_ldq_mmu(env, addr, new_oi, ra); - h = helper_ldq_mmu(env, addr + 8, new_oi, ra); - - qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_R); - return int128_make128(l, h); + tcg_debug_assert((get_memop(oi) & (MO_BSWAP|MO_SIZE)) == (MO_LE|MO_128)); + ret = do_ld16_mmu(env, addr, oi, ra); + plugin_load_cb(env, addr, oi); + return ret; } /* @@ -2645,6 +2771,36 @@ static uint64_t do_st_whole_le8(MMULookupPageData *p, uint64_t val_le) return val_le >> sz; } +/** + * do_st_whole_le16 + * @p: translation parameters + * @val_le: data to store + * + * As do_st_bytes_leN, but atomically on each aligned part. + * 16 aligned bytes are guaranteed to cover the store. + */ +static uint64_t do_st_whole_le16(MMULookupPageData *p, Int128 val_le) +{ + int szm64 = (p->size * 8) - 64; + int o = p->addr & 15; + int sh = o * 8; + Int128 m, v; + + /* Like MAKE_64BIT_MASK(0, sz), but larger. */ + m = int128_make128(-1, MAKE_64BIT_MASK(0, szm64)); + + if (HOST_BIG_ENDIAN) { + v = int128_urshift(bswap128(val_le), sh); + m = int128_urshift(bswap128(m), sh); + } else { + v = int128_lshift(val_le, sh); + m = int128_lshift(m, sh); + } + store_atom_insert_al16(p->haddr - o, v, m); + + return int128_gethi(val_le) >> szm64; +} + /* * Wrapper for the above. */ @@ -2691,6 +2847,60 @@ static uint64_t do_st_leN(CPUArchState *env, MMULookupPageData *p, } } +/* + * Wrapper for the above, for 8 < size < 16. 
+ */ +static uint64_t do_st16_leN(CPUArchState *env, MMULookupPageData *p, + Int128 val_le, int mmu_idx, + MemOp mop, uintptr_t ra) +{ + int size = p->size; + MemOp atmax; + + if (unlikely(p->flags & TLB_MMIO)) { + p->size = 8; + do_st_mmio_leN(env, p, int128_getlo(val_le), mmu_idx, ra); + p->size = size - 8; + p->addr += 8; + return do_st_mmio_leN(env, p, int128_gethi(val_le), mmu_idx, ra); + } + + switch (mop & MO_ATOM_MASK) { + case MO_ATOM_WITHIN16: + /* + * It is a given that we cross a page and therefore there is no + * atomicity for the store as a whole, but there may be a subobject + * as defined by ATMAX which does not cross a 16-byte boundary. + */ + atmax = mop & MO_ATMAX_MASK; + if (atmax != MO_ATMAX_SIZE) { + atmax >>= MO_ATMAX_SHIFT; + if (unlikely(size >= (1 << atmax))) { + if (HAVE_al16) { + return do_st_whole_le16(p, val_le); + } else { + cpu_loop_exit_atomic(env_cpu(env), ra); + } + } + } + /* fall through */ + case MO_ATOM_IFALIGN: + case MO_ATOM_NONE: + stq_le_p(p->haddr, int128_getlo(val_le)); + p->size = size - 8; + p->haddr += 8; + return do_st_bytes_leN(p, int128_gethi(val_le)); + case MO_ATOM_SUBALIGN: + p->size = 8; + do_st_parts_leN(p, int128_getlo(val_le)); + p->size = size - 8; + p->haddr += 8; + return do_st_parts_leN(p, int128_gethi(val_le)); + default: + g_assert_not_reached(); + } +} + static void do_st_1(CPUArchState *env, MMULookupPageData *p, uint8_t val, int mmu_idx, uintptr_t ra) { @@ -2701,6 +2911,20 @@ static void do_st_1(CPUArchState *env, MMULookupPageData *p, uint8_t val, } } +static void do_st_8(CPUArchState *env, MMULookupPageData *p, uint64_t val, + int mmu_idx, MemOp memop, uintptr_t ra) +{ + if (unlikely(p->flags & TLB_MMIO)) { + io_writex(env, p->full, mmu_idx, val, p->addr, ra, memop); + } else { + /* Swap to host endian if necessary, then store. */ + if (memop & MO_BSWAP) { + val = bswap64(val); + } + store_atom_8(env, ra, p->haddr, memop, val); + } +} + void helper_stb_mmu(CPUArchState *env, target_ulong addr, uint8_t val, MemOpIdx oi, uintptr_t ra) { @@ -2806,15 +3030,7 @@ static void do_st8_mmu(CPUArchState *env, target_ulong addr, uint64_t val, if (l.page[0].flags & TLB_BSWAP) { l.memop ^= MO_BSWAP; } - if (unlikely(l.page[0].flags & TLB_MMIO)) { - io_writex(env, l.page[0].full, l.mmu_idx, val, addr, ra, l.memop); - } else { - /* Swap to host endian if necessary, then store. */ - if (l.memop & MO_BSWAP) { - val = bswap64(val); - } - store_atom_8(env, ra, l.page[0].haddr, l.memop, val); - } + do_st_8(env, &l.page[0], val, l.mmu_idx, l.memop, ra); } else { assert_no_tlb_bswap; @@ -2834,6 +3050,82 @@ void helper_stq_mmu(CPUArchState *env, target_ulong addr, uint64_t val, do_st8_mmu(env, addr, val, oi, retaddr); } +static void do_st16_mmu(CPUArchState *env, target_ulong addr, Int128 val, + MemOpIdx oi, uintptr_t ra) +{ + MMULookupLocals l; + bool crosspage; + uint64_t a, b; + + crosspage = mmu_lookup(env, addr, oi, ra, MMU_DATA_STORE, &l); + if (likely(!crosspage)) { + if (l.page[0].flags & TLB_BSWAP) { + l.memop ^= MO_BSWAP; + } + /* Swap to host endian if necessary, then store.
*/ + if (l.memop & MO_BSWAP) { + val = bswap128(val); + } + if (unlikely(l.page[0].flags & TLB_MMIO)) { + QEMU_IOTHREAD_LOCK_GUARD(); + if (HOST_BIG_ENDIAN) { + b = int128_getlo(val), a = int128_gethi(val); + } else { + a = int128_getlo(val), b = int128_gethi(val); + } + io_writex(env, l.page[0].full, l.mmu_idx, a, addr, ra, MO_64); + io_writex(env, l.page[0].full, l.mmu_idx, b, addr + 8, ra, MO_64); + } else { + store_atom_16(env, ra, l.page[0].haddr, l.memop, val); + } + } else { + int first = l.page[0].size; + + assert_no_tlb_bswap; + + if (first == 8) { + MemOp mop8 = (l.memop & ~(MO_SIZE | MO_BSWAP)) | MO_64; + + if (l.memop & MO_BSWAP) { + val = bswap128(val); + } + if (HOST_BIG_ENDIAN) { + b = int128_getlo(val), a = int128_gethi(val); + } else { + a = int128_getlo(val), b = int128_gethi(val); + } + do_st_8(env, &l.page[0], a, l.mmu_idx, mop8, ra); + do_st_8(env, &l.page[1], b, l.mmu_idx, mop8, ra); + } else { + if ((l.memop & MO_BSWAP) != MO_LE) { + val = bswap128(val); + } + if (first < 8) { + do_st_leN(env, &l.page[0], int128_getlo(val), + l.mmu_idx, l.memop, ra); + val = int128_urshift(val, first * 8); + do_st16_leN(env, &l.page[1], val, l.mmu_idx, l.memop, ra); + } else { + b = do_st16_leN(env, &l.page[0], val, l.mmu_idx, l.memop, ra); + do_st_leN(env, &l.page[1], b, l.mmu_idx, l.memop, ra); + } + } + } +} + +void helper_st16_mmu(CPUArchState *env, target_ulong addr, Int128 val, + MemOpIdx oi, uintptr_t retaddr) +{ + tcg_debug_assert((get_memop(oi) & MO_SIZE) == MO_128); + do_st16_mmu(env, addr, val, oi, retaddr); +} + +void helper_st_i128(CPUArchState *env, target_ulong addr, Int128 val, + MemOpIdx oi) +{ + helper_st16_mmu(env, addr, val, oi, GETPC()); +} + /* * Store Helpers for cpu_ldst.h */ @@ -2898,58 +3190,20 @@ void cpu_stq_le_mmu(CPUArchState *env, target_ulong addr, uint64_t val, plugin_store_cb(env, addr, oi); } -void cpu_st16_be_mmu(CPUArchState *env, abi_ptr addr, Int128 val, - MemOpIdx oi, uintptr_t ra) +void cpu_st16_be_mmu(CPUArchState *env, target_ulong addr, Int128 val, + MemOpIdx oi, uintptr_t retaddr) { - MemOp mop = get_memop(oi); - int mmu_idx = get_mmuidx(oi); - MemOpIdx new_oi; - unsigned a_bits; - - tcg_debug_assert((mop & (MO_BSWAP|MO_SSIZE)) == (MO_BE|MO_128)); - a_bits = get_alignment_bits(mop); - - /* Handle CPU specific unaligned behaviour */ - if (addr & ((1 << a_bits) - 1)) { - cpu_unaligned_access(env_cpu(env), addr, MMU_DATA_STORE, - mmu_idx, ra); - } - - /* Construct an unaligned 64-bit replacement MemOpIdx. 
*/ - mop = (mop & ~(MO_SIZE | MO_AMASK)) | MO_64 | MO_UNALN; - new_oi = make_memop_idx(mop, mmu_idx); - - helper_stq_mmu(env, addr, int128_gethi(val), new_oi, ra); - helper_stq_mmu(env, addr + 8, int128_getlo(val), new_oi, ra); - - qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_W); + tcg_debug_assert((get_memop(oi) & (MO_BSWAP|MO_SIZE)) == (MO_BE|MO_128)); + do_st16_mmu(env, addr, val, oi, retaddr); + plugin_store_cb(env, addr, oi); } -void cpu_st16_le_mmu(CPUArchState *env, abi_ptr addr, Int128 val, - MemOpIdx oi, uintptr_t ra) +void cpu_st16_le_mmu(CPUArchState *env, target_ulong addr, Int128 val, + MemOpIdx oi, uintptr_t retaddr) { - MemOp mop = get_memop(oi); - int mmu_idx = get_mmuidx(oi); - MemOpIdx new_oi; - unsigned a_bits; - - tcg_debug_assert((mop & (MO_BSWAP|MO_SSIZE)) == (MO_LE|MO_128)); - a_bits = get_alignment_bits(mop); - - /* Handle CPU specific unaligned behaviour */ - if (addr & ((1 << a_bits) - 1)) { - cpu_unaligned_access(env_cpu(env), addr, MMU_DATA_STORE, - mmu_idx, ra); - } - - /* Construct an unaligned 64-bit replacement MemOpIdx. */ - mop = (mop & ~(MO_SIZE | MO_AMASK)) | MO_64 | MO_UNALN; - new_oi = make_memop_idx(mop, mmu_idx); - - helper_stq_mmu(env, addr, int128_getlo(val), new_oi, ra); - helper_stq_mmu(env, addr + 8, int128_gethi(val), new_oi, ra); - - qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_W); + tcg_debug_assert((get_memop(oi) & (MO_BSWAP|MO_SIZE)) == (MO_LE|MO_128)); + do_st16_mmu(env, addr, val, oi, retaddr); + plugin_store_cb(env, addr, oi); } #include "ldst_common.c.inc" diff --git a/accel/tcg/user-exec.c b/accel/tcg/user-exec.c index 3455ff45a4..7ae88ccff1 100644 --- a/accel/tcg/user-exec.c +++ b/accel/tcg/user-exec.c @@ -486,18 +486,45 @@ uint64_t cpu_ldq_le_mmu(CPUArchState *env, abi_ptr addr, return cpu_to_le64(ret); } -Int128 cpu_ld16_be_mmu(CPUArchState *env, abi_ptr addr, - MemOpIdx oi, uintptr_t ra) +static Int128 do_ld16_he_mmu(CPUArchState *env, abi_ptr addr, + MemOp mop, uintptr_t ra) { void *haddr; Int128 ret; - tcg_debug_assert((get_memop(oi) & (MO_BSWAP | MO_SIZE)) == (MO_128 | MO_BE)); - haddr = cpu_mmu_lookup(env, addr, oi, ra, MMU_DATA_LOAD); - memcpy(&ret, haddr, 16); + tcg_debug_assert((mop & MO_SIZE) == MO_128); + haddr = cpu_mmu_lookup(env, addr, mop, ra, MMU_DATA_LOAD); + ret = load_atom_16(env, ra, haddr, mop); clear_helper_retaddr(); - qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_R); + return ret; +} +Int128 helper_ld16_mmu(CPUArchState *env, target_ulong addr, + MemOpIdx oi, uintptr_t ra) +{ + MemOp mop = get_memop(oi); + Int128 ret = do_ld16_he_mmu(env, addr, mop, ra); + + if (mop & MO_BSWAP) { + ret = bswap128(ret); + } + return ret; +} + +Int128 helper_ld_i128(CPUArchState *env, target_ulong addr, MemOpIdx oi) +{ + return helper_ld16_mmu(env, addr, oi, GETPC()); +} + +Int128 cpu_ld16_be_mmu(CPUArchState *env, abi_ptr addr, + MemOpIdx oi, uintptr_t ra) +{ + MemOp mop = get_memop(oi); + Int128 ret; + + tcg_debug_assert((mop & MO_BSWAP) == MO_BE); + ret = do_ld16_he_mmu(env, addr, mop, ra); + qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_R); if (!HOST_BIG_ENDIAN) { ret = bswap128(ret); } @@ -507,15 +534,12 @@ Int128 cpu_ld16_be_mmu(CPUArchState *env, abi_ptr addr, Int128 cpu_ld16_le_mmu(CPUArchState *env, abi_ptr addr, MemOpIdx oi, uintptr_t ra) { - void *haddr; + MemOp mop = get_memop(oi); Int128 ret; - tcg_debug_assert((get_memop(oi) & (MO_BSWAP | MO_SIZE)) == (MO_128 | MO_LE)); - haddr = cpu_mmu_lookup(env, addr, oi, ra, MMU_DATA_LOAD); - 
memcpy(&ret, haddr, 16); - clear_helper_retaddr(); + tcg_debug_assert((mop & MO_BSWAP) == MO_LE); + ret = do_ld16_he_mmu(env, addr, mop, ra); qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_R); - if (HOST_BIG_ENDIAN) { ret = bswap128(ret); } @@ -672,33 +696,57 @@ void cpu_stq_le_mmu(CPUArchState *env, abi_ptr addr, uint64_t val, qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_W); } -void cpu_st16_be_mmu(CPUArchState *env, abi_ptr addr, - Int128 val, MemOpIdx oi, uintptr_t ra) +static void do_st16_he_mmu(CPUArchState *env, abi_ptr addr, Int128 val, + MemOp mop, uintptr_t ra) { void *haddr; - tcg_debug_assert((get_memop(oi) & (MO_BSWAP | MO_SIZE)) == (MO_128 | MO_BE)); - haddr = cpu_mmu_lookup(env, addr, oi, ra, MMU_DATA_STORE); + tcg_debug_assert((mop & MO_SIZE) == MO_128); + haddr = cpu_mmu_lookup(env, addr, mop, ra, MMU_DATA_STORE); + store_atom_16(env, ra, haddr, mop, val); + clear_helper_retaddr(); +} + +void helper_st16_mmu(CPUArchState *env, target_ulong addr, Int128 val, + MemOpIdx oi, uintptr_t ra) +{ + MemOp mop = get_memop(oi); + + if (mop & MO_BSWAP) { + val = bswap128(val); + } + do_st16_he_mmu(env, addr, val, mop, ra); +} + +void helper_st_i128(CPUArchState *env, target_ulong addr, + Int128 val, MemOpIdx oi) +{ + helper_st16_mmu(env, addr, val, oi, GETPC()); +} + +void cpu_st16_be_mmu(CPUArchState *env, abi_ptr addr, + Int128 val, MemOpIdx oi, uintptr_t ra) +{ + MemOp mop = get_memop(oi); + + tcg_debug_assert((mop & MO_BSWAP) == MO_BE); if (!HOST_BIG_ENDIAN) { val = bswap128(val); } - memcpy(haddr, &val, 16); - clear_helper_retaddr(); + do_st16_he_mmu(env, addr, val, mop, ra); qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_W); } void cpu_st16_le_mmu(CPUArchState *env, abi_ptr addr, Int128 val, MemOpIdx oi, uintptr_t ra) { - void *haddr; + MemOp mop = get_memop(oi); - tcg_debug_assert((get_memop(oi) & (MO_BSWAP | MO_SIZE)) == (MO_128 | MO_LE)); - haddr = cpu_mmu_lookup(env, addr, oi, ra, MMU_DATA_STORE); + tcg_debug_assert((mop & MO_BSWAP) == MO_LE); if (HOST_BIG_ENDIAN) { val = bswap128(val); } - memcpy(haddr, &val, 16); - clear_helper_retaddr(); + do_st16_he_mmu(env, addr, val, mop, ra); qemu_plugin_vcpu_mem_cb(env_cpu(env), addr, oi, QEMU_PLUGIN_MEM_W); } diff --git a/tcg/tcg-op.c b/tcg/tcg-op.c index 1f81c3dbb3..bbb29bed2b 100644 --- a/tcg/tcg-op.c +++ b/tcg/tcg-op.c @@ -3112,6 +3112,48 @@ void tcg_gen_qemu_st_i64(TCGv_i64 val, TCGv addr, TCGArg idx, MemOp memop) } } +/* + * Return true if @mop, without knowledge of the pointer alignment, + * does not require 16-byte atomicity, and it would be advantageous + * to avoid a call to a helper function. + */ +static bool use_two_i64_for_i128(MemOp mop) +{ +#ifdef CONFIG_SOFTMMU + /* Two softmmu tlb lookups are larger than one function call. */ + return false; +#else + /* + * For user-only, two 64-bit operations may well be smaller than a call. + * Determine if that would be legal for the requested atomicity. + */ + MemOp atom = mop & MO_ATOM_MASK; + MemOp atmax = mop & MO_ATMAX_MASK; + + /* In a serialized context, no atomicity is required. */ + if (!(tcg_ctx->tb_cflags & CF_PARALLEL)) { + return true; + } + + if (atmax == MO_ATMAX_SIZE) { + atmax = mop & MO_SIZE; + } else { + atmax >>= MO_ATMAX_SHIFT; + } + switch (atom) { + case MO_ATOM_NONE: + return true; + case MO_ATOM_IFALIGN: + case MO_ATOM_SUBALIGN: + return atmax < MO_128; + case MO_ATOM_WITHIN16: + return atmax == MO_8; + default: + g_assert_not_reached(); + } +#endif +} + static void canonicalize_memop_i128_as_i64(MemOp ret[2], MemOp orig) { MemOp mop_1 = orig, mop_2; @@ -3159,91 +3201,105 @@ static void canonicalize_memop_i128_as_i64(MemOp ret[2], MemOp orig) void tcg_gen_qemu_ld_i128(TCGv_i128 val, TCGv addr, TCGArg idx, MemOp memop) { - MemOp mop[2]; - TCGv addr_p8; - TCGv_i64 x, y; + MemOpIdx oi = make_memop_idx(memop, idx); - canonicalize_memop_i128_as_i64(mop, memop); + tcg_debug_assert((memop & MO_SIZE) == MO_128); + tcg_debug_assert((memop & MO_SIGN) == 0); tcg_gen_req_mo(TCG_MO_LD_LD | TCG_MO_ST_LD); addr = plugin_prep_mem_callbacks(addr); - /* TODO: respect atomicity of the operation. */ /* TODO: allow the tcg backend to see the whole operation. */ - /* - * Since there are no global TCGv_i128, there is no visible state - * changed if the second load faults. Load directly into the two - * subwords. - */ - if ((memop & MO_BSWAP) == MO_LE) { - x = TCGV128_LOW(val); - y = TCGV128_HIGH(val); + if (use_two_i64_for_i128(memop)) { + MemOp mop[2]; + TCGv addr_p8; + TCGv_i64 x, y; + + canonicalize_memop_i128_as_i64(mop, memop); + + /* + * Since there are no global TCGv_i128, there is no visible state + * changed if the second load faults. Load directly into the two + * subwords. + */ + if ((memop & MO_BSWAP) == MO_LE) { + x = TCGV128_LOW(val); + y = TCGV128_HIGH(val); + } else { + x = TCGV128_HIGH(val); + y = TCGV128_LOW(val); + } + + gen_ldst_i64(INDEX_op_qemu_ld_i64, x, addr, mop[0], idx); + + if ((mop[0] ^ memop) & MO_BSWAP) { + tcg_gen_bswap64_i64(x, x); + } + + addr_p8 = tcg_temp_new(); + tcg_gen_addi_tl(addr_p8, addr, 8); + gen_ldst_i64(INDEX_op_qemu_ld_i64, y, addr_p8, mop[1], idx); + tcg_temp_free(addr_p8); + + if ((mop[0] ^ memop) & MO_BSWAP) { + tcg_gen_bswap64_i64(y, y); + } } else { - x = TCGV128_HIGH(val); - y = TCGV128_LOW(val); + gen_helper_ld_i128(val, cpu_env, addr, tcg_constant_i32(oi)); } - gen_ldst_i64(INDEX_op_qemu_ld_i64, x, addr, mop[0], idx); - - if ((mop[0] ^ memop) & MO_BSWAP) { - tcg_gen_bswap64_i64(x, x); - } - - addr_p8 = tcg_temp_new(); - tcg_gen_addi_tl(addr_p8, addr, 8); - gen_ldst_i64(INDEX_op_qemu_ld_i64, y, addr_p8, mop[1], idx); - tcg_temp_free(addr_p8); - - if ((mop[0] ^ memop) & MO_BSWAP) { - tcg_gen_bswap64_i64(y, y); - } - - plugin_gen_mem_callbacks(addr, make_memop_idx(memop, idx), - QEMU_PLUGIN_MEM_R); + plugin_gen_mem_callbacks(addr, oi, QEMU_PLUGIN_MEM_R); } void tcg_gen_qemu_st_i128(TCGv_i128 val, TCGv addr, TCGArg idx, MemOp memop) { - MemOp mop[2]; - TCGv addr_p8; - TCGv_i64 x, y; + MemOpIdx oi = make_memop_idx(memop, idx); - canonicalize_memop_i128_as_i64(mop, memop); + tcg_debug_assert((memop & MO_SIZE) == MO_128); + tcg_debug_assert((memop & MO_SIGN) == 0); tcg_gen_req_mo(TCG_MO_ST_LD | TCG_MO_ST_ST); addr = plugin_prep_mem_callbacks(addr); - /* TODO: respect atomicity of the operation. */ /* TODO: allow the tcg backend to see the whole operation.
*/ - if ((memop & MO_BSWAP) == MO_LE) { - x = TCGV128_LOW(val); - y = TCGV128_HIGH(val); + if (use_two_i64_for_i128(memop)) { + MemOp mop[2]; + TCGv addr_p8; + TCGv_i64 x, y; + + canonicalize_memop_i128_as_i64(mop, memop); + + if ((memop & MO_BSWAP) == MO_LE) { + x = TCGV128_LOW(val); + y = TCGV128_HIGH(val); + } else { + x = TCGV128_HIGH(val); + y = TCGV128_LOW(val); + } + + addr_p8 = tcg_temp_new(); + if ((mop[0] ^ memop) & MO_BSWAP) { + TCGv_i64 t = tcg_temp_new_i64(); + + tcg_gen_bswap64_i64(t, x); + gen_ldst_i64(INDEX_op_qemu_st_i64, t, addr, mop[0], idx); + tcg_gen_bswap64_i64(t, y); + tcg_gen_addi_tl(addr_p8, addr, 8); + gen_ldst_i64(INDEX_op_qemu_st_i64, t, addr_p8, mop[1], idx); + tcg_temp_free_i64(t); + } else { + gen_ldst_i64(INDEX_op_qemu_st_i64, x, addr, mop[0], idx); + tcg_gen_addi_tl(addr_p8, addr, 8); + gen_ldst_i64(INDEX_op_qemu_st_i64, y, addr_p8, mop[1], idx); + } + tcg_temp_free(addr_p8); } else { - x = TCGV128_HIGH(val); - y = TCGV128_LOW(val); + gen_helper_st_i128(cpu_env, addr, val, tcg_constant_i32(oi)); } - addr_p8 = tcg_temp_new(); - if ((mop[0] ^ memop) & MO_BSWAP) { - TCGv_i64 t = tcg_temp_new_i64(); - - tcg_gen_bswap64_i64(t, x); - gen_ldst_i64(INDEX_op_qemu_st_i64, t, addr, mop[0], idx); - tcg_gen_bswap64_i64(t, y); - tcg_gen_addi_tl(addr_p8, addr, 8); - gen_ldst_i64(INDEX_op_qemu_st_i64, t, addr_p8, mop[1], idx); - tcg_temp_free_i64(t); - } else { - gen_ldst_i64(INDEX_op_qemu_st_i64, x, addr, mop[0], idx); - tcg_gen_addi_tl(addr_p8, addr, 8); - gen_ldst_i64(INDEX_op_qemu_st_i64, y, addr_p8, mop[1], idx); - } - tcg_temp_free(addr_p8); - - plugin_gen_mem_callbacks(addr, make_memop_idx(memop, idx), - QEMU_PLUGIN_MEM_W); + plugin_gen_mem_callbacks(addr, oi, QEMU_PLUGIN_MEM_W); } static void tcg_gen_ext_i32(TCGv_i32 ret, TCGv_i32 val, MemOp opc) diff --git a/accel/tcg/ldst_atomicity.c.inc b/accel/tcg/ldst_atomicity.c.inc index 8876c16371..e6a7558399 100644 --- a/accel/tcg/ldst_atomicity.c.inc +++ b/accel/tcg/ldst_atomicity.c.inc @@ -419,6 +419,21 @@ static inline uint64_t load_atom_8_by_4(void *pv) } } +/** + * load_atom_8_by_8_or_4: + * @pv: host address + * + * Load 8 bytes from aligned @pv, with at least 4-byte atomicity. + */ +static inline uint64_t load_atom_8_by_8_or_4(void *pv) +{ + if (HAVE_al8_fast) { + return load_atomic8(pv); + } else { + return load_atom_8_by_4(pv); + } +} + /** * load_atom_2: * @p: host address @@ -551,6 +566,64 @@ static uint64_t load_atom_8(CPUArchState *env, uintptr_t ra, } } +/** + * load_atom_16: + * @p: host address + * @memop: the full memory op + * + * Load 16 bytes from @p, honoring the atomicity of @memop. + */ +static Int128 load_atom_16(CPUArchState *env, uintptr_t ra, + void *pv, MemOp memop) +{ + uintptr_t pi = (uintptr_t)pv; + int atmax; + Int128 r; + uint64_t a, b; + + /* + * If the host does not support 8-byte atomics, wait until we have + * examined the atomicity parameters below. 
+ */ + if (HAVE_al16_fast && likely((pi & 15) == 0)) { + return load_atomic16(pv); + } + + atmax = required_atomicity(env, pi, memop); + switch (atmax) { + case MO_8: + memcpy(&r, pv, 16); + return r; + case MO_16: + a = load_atom_8_by_2(pv); + b = load_atom_8_by_2(pv + 8); + break; + case MO_32: + a = load_atom_8_by_4(pv); + b = load_atom_8_by_4(pv + 8); + break; + case MO_64: + if (!HAVE_al8) { + cpu_loop_exit_atomic(env_cpu(env), ra); + } + a = load_atomic8(pv); + b = load_atomic8(pv + 8); + break; + case -MO_64: + if (!HAVE_al8) { + cpu_loop_exit_atomic(env_cpu(env), ra); + } + a = load_atom_extract_al8x2(pv); + b = load_atom_extract_al8x2(pv + 8); + break; + case MO_128: + return load_atomic16_or_exit(env, ra, pv); + default: + g_assert_not_reached(); + } + return int128_make128(HOST_BIG_ENDIAN ? b : a, HOST_BIG_ENDIAN ? a : b); +} + /** * store_atomic2: * @pv: host address @@ -592,6 +665,40 @@ static inline void store_atomic8(void *pv, uint64_t val) qatomic_set__nocheck(p, val); } +/** + * store_atomic16: + * @pv: host address + * @val: value to store + * + * Atomically store 16 aligned bytes to @pv. + */ +static inline void store_atomic16(void *pv, Int128 val) +{ +#if defined(CONFIG_ATOMIC128) + __uint128_t *pu = __builtin_assume_aligned(pv, 16); + Int128Alias new; + + new.s = val; + qatomic_set__nocheck(pu, new.u); +#elif defined(CONFIG_CMPXCHG128) + __uint128_t *pu = __builtin_assume_aligned(pv, 16); + __uint128_t o; + Int128Alias n; + + /* + * Without CONFIG_ATOMIC128, __atomic_compare_exchange_n will always + * defer to libatomic, so we must use __sync_val_compare_and_swap_16 + * and accept the sequential consistency that comes with it. + */ + n.s = val; + do { + o = *pu; + } while (!__sync_bool_compare_and_swap_16(pu, o, n.u)); +#else + qemu_build_not_reached(); +#endif +} + /** * store_atom_4x2 */ @@ -607,9 +714,8 @@ static inline void store_atom_4_by_2(void *pv, uint32_t val) */ static inline void store_atom_8_by_2(void *pv, uint64_t val) { - uint32_t *p = __builtin_assume_aligned(pv, 4); - qatomic_set(p, val >> (HOST_BIG_ENDIAN ? 32 : 0)); - qatomic_set(p + 4, val >> (HOST_BIG_ENDIAN ? 0 : 32)); + store_atom_4_by_2(pv, val >> (HOST_BIG_ENDIAN ? 32 : 0)); + store_atom_4_by_2(pv + 4, val >> (HOST_BIG_ENDIAN ? 0 : 32)); } /** @@ -617,11 +723,9 @@ static inline void store_atom_8_by_2(void *pv, uint64_t val) */ static inline void store_atom_8_by_4(void *pv, uint64_t val) { - uint16_t *p = __builtin_assume_aligned(pv, 2); - qatomic_set(p, val >> (HOST_BIG_ENDIAN ? 48 : 0)); - qatomic_set(p + 2, val >> (HOST_BIG_ENDIAN ? 32 : 16)); - qatomic_set(p + 4, val >> (HOST_BIG_ENDIAN ? 16 : 32)); - qatomic_set(p + 6, val >> (HOST_BIG_ENDIAN ? 0 : 48)); + uint32_t *p = __builtin_assume_aligned(pv, 4); + qatomic_set(p, val >> (HOST_BIG_ENDIAN ? 32 : 0)); + qatomic_set(p + 4, val >> (HOST_BIG_ENDIAN ? 0 : 32)); } /** @@ -880,3 +984,58 @@ static void store_atom_8(CPUArchState *env, uintptr_t ra, g_assert_not_reached(); } } + +/** + * store_atom_16: + * @p: host address + * @val: the value to store + * @memop: the full memory op + * + * Store 16 bytes to @p, honoring the atomicity of @memop. + */ +static void store_atom_16(CPUArchState *env, uintptr_t ra, + void *pv, MemOp memop, Int128 val) +{ + uintptr_t pi = (uintptr_t)pv; + uint64_t a, b; + MemOp atmax; + + if (HAVE_al16_fast && likely((pi & 15) == 0)) { + store_atomic16(pv, val); + return; + } + + atmax = required_atomicity(env, pi, memop); + + a = HOST_BIG_ENDIAN ? int128_gethi(val) : int128_getlo(val); + b = HOST_BIG_ENDIAN ? 
int128_getlo(val) : int128_gethi(val); + switch (atmax) { + case MO_8: + memcpy(pv, &val, 16); + return; + case MO_16: + store_atom_8_by_2(pv, a); + store_atom_8_by_2(pv + 8, b); + return; + case MO_32: + store_atom_8_by_4(pv, a); + store_atom_8_by_4(pv + 8, b); + return; + case MO_64: + if (HAVE_al8) { + store_atomic8(pv, a); + store_atomic8(pv + 8, b); + return; + } + break; + case MO_128: + if (HAVE_al16) { + store_atomic16(pv, val); + return; + } + break; + default: + g_assert_not_reached(); + } + cpu_loop_exit_atomic(env_cpu(env), ra); +}
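For hosts that reach the CONFIG_CMPXCHG128 path, the effect of store_atomic16() can be pictured with this stand-alone sketch; it assumes a target where __sync_bool_compare_and_swap_16 is usable, e.g. x86-64 built with -mcx16:

    /* Emulate an atomic 16-byte store with a compare-and-swap loop.
     * The successful swap publishes all 16 bytes as one unit, at the
     * cost of the sequential consistency noted in the patch. */
    static void store16_by_cas(unsigned __int128 *pu, unsigned __int128 val)
    {
        unsigned __int128 old;

        do {
            old = *pu;
        } while (!__sync_bool_compare_and_swap_16(pu, old, val));
    }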
From patchwork Fri Nov 18 09:47:38 2022 From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH for-8.0 13/29] meson: Detect atomic128 support with optimization Date: Fri, 18 Nov 2022 01:47:38 -0800 Message-Id: <20221118094754.242910-14-richard.henderson@linaro.org> In-Reply-To: <20221118094754.242910-1-richard.henderson@linaro.org> References: <20221118094754.242910-1-richard.henderson@linaro.org> There is an edge condition prior to gcc13 for which optimization is required to generate 16-byte atomic sequences. Detect this. Signed-off-by: Richard Henderson --- accel/tcg/ldst_atomicity.c.inc | 38 ++++++++++++++++++------- meson.build | 52 ++++++++++++++++++++++------------ 2 files changed, 61 insertions(+), 29 deletions(-) diff --git a/accel/tcg/ldst_atomicity.c.inc b/accel/tcg/ldst_atomicity.c.inc index e6a7558399..68edab4398 100644 --- a/accel/tcg/ldst_atomicity.c.inc +++ b/accel/tcg/ldst_atomicity.c.inc @@ -16,6 +16,23 @@ #endif #define HAVE_al8_fast (ATOMIC_REG_SIZE >= 8) +/* + * If __alignof(unsigned __int128) < 16, GCC may refuse to inline atomics + * that are supported by the host, e.g. s390x. We can force the pointer to + * have our known alignment with __builtin_assume_aligned, however prior to + * GCC 13 that was only reliable with optimization enabled. See + * https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107389 + */ +#if defined(CONFIG_ATOMIC128_OPT) +# if !defined(__OPTIMIZE__) +# define ATTRIBUTE_ATOMIC128_OPT __attribute__((optimize("O1"))) +# endif +# define CONFIG_ATOMIC128 +#endif +#ifndef ATTRIBUTE_ATOMIC128_OPT +# define ATTRIBUTE_ATOMIC128_OPT +#endif + #if defined(CONFIG_ATOMIC128) # define HAVE_al16_fast true #else @@ -134,7 +151,8 @@ static inline uint64_t load_atomic8(void *pv) * * Atomically load 16 aligned bytes from @pv. */ -static inline Int128 load_atomic16(void *pv) +static inline Int128 ATTRIBUTE_ATOMIC128_OPT +load_atomic16(void *pv) { #ifdef CONFIG_ATOMIC128 __uint128_t *p = __builtin_assume_aligned(pv, 16); @@ -336,7 +354,8 @@ static uint64_t load_atom_extract_al16_or_exit(CPUArchState *env, uintptr_t ra, * cross an 16-byte boundary then the access must be 16-byte atomic, * otherwise the access must be 8-byte atomic. */ -static inline uint64_t load_atom_extract_al16_or_al8(void *pv, int s) +static inline uint64_t ATTRIBUTE_ATOMIC128_OPT +load_atom_extract_al16_or_al8(void *pv, int s) { #if defined(CONFIG_ATOMIC128) uintptr_t pi = (uintptr_t)pv; @@ -672,28 +691,24 @@ static inline void store_atomic8(void *pv, uint64_t val) * * Atomically store 16 aligned bytes to @pv.
*/ -static inline void store_atomic16(void *pv, Int128 val) +static inline void ATTRIBUTE_ATOMIC128_OPT +store_atomic16(void *pv, Int128Alias val) { #if defined(CONFIG_ATOMIC128) __uint128_t *pu = __builtin_assume_aligned(pv, 16); - Int128Alias new; - - new.s = val; - qatomic_set__nocheck(pu, new.u); + qatomic_set__nocheck(pu, val.u); #elif defined(CONFIG_CMPXCHG128) __uint128_t *pu = __builtin_assume_aligned(pv, 16); __uint128_t o; - Int128Alias n; /* * Without CONFIG_ATOMIC128, __atomic_compare_exchange_n will always * defer to libatomic, so we must use __sync_val_compare_and_swap_16 * and accept the sequential consistency that comes with it. */ - n.s = val; do { o = *pu; - } while (!__sync_bool_compare_and_swap_16(pu, o, n.u)); + } while (!__sync_bool_compare_and_swap_16(pu, o, val.u)); #else qemu_build_not_reached(); #endif @@ -777,7 +792,8 @@ static void store_atom_insert_al8(uint64_t *p, uint64_t val, uint64_t msk) * * Atomically store @val to @p masked by @msk. */ -static void store_atom_insert_al16(Int128 *ps, Int128Alias val, Int128Alias msk) +static void ATTRIBUTE_ATOMIC128_OPT +store_atom_insert_al16(Int128 *ps, Int128Alias val, Int128Alias msk) { #if defined(CONFIG_ATOMIC128) __uint128_t *pu, old, new; diff --git a/meson.build b/meson.build index 4984e80e71..503eeabd79 100644 --- a/meson.build +++ b/meson.build @@ -2215,23 +2215,21 @@ config_host_data.set('HAVE_BROKEN_SIZE_MAX', not cc.compiles(''' return printf("%zu", SIZE_MAX); }''', args: ['-Werror'])) -atomic_test = ''' +# See if 64-bit atomic operations are supported. +# Note that without __atomic builtins, we can only +# assume atomic loads/stores max at pointer size. +config_host_data.set('CONFIG_ATOMIC64', cc.links(''' #include int main(void) { - @0@ x = 0, y = 0; + uint64_t x = 0, y = 0; y = __atomic_load_n(&x, __ATOMIC_RELAXED); __atomic_store_n(&x, y, __ATOMIC_RELAXED); __atomic_compare_exchange_n(&x, &y, x, 0, __ATOMIC_RELAXED, __ATOMIC_RELAXED); __atomic_exchange_n(&x, y, __ATOMIC_RELAXED); __atomic_fetch_add(&x, y, __ATOMIC_RELAXED); return 0; - }''' - -# See if 64-bit atomic operations are supported. -# Note that without __atomic builtins, we can only -# assume atomic loads/stores max at pointer size. -config_host_data.set('CONFIG_ATOMIC64', cc.links(atomic_test.format('uint64_t'))) + }''')) has_int128 = cc.links(''' __int128_t a; @@ -2249,21 +2247,39 @@ if has_int128 # "do we have 128-bit atomics which are handled inline and specifically not # via libatomic". The reason we can't use libatomic is documented in the # comment starting "GCC is a house divided" in include/qemu/atomic128.h. - has_atomic128 = cc.links(atomic_test.format('unsigned __int128')) + # We only care about these operations on 16-byte aligned pointers, so + # force 16-byte alignment of the pointer, which may be greater than + # __alignof(unsigned __int128) for the host. 
+ atomic_test_128 = ''' + int main(int ac, char **av) { + unsigned __int128 *p = __builtin_assume_aligned(av[ac - 1], 16); + p[1] = __atomic_load_n(&p[0], __ATOMIC_RELAXED); + __atomic_store_n(&p[2], p[3], __ATOMIC_RELAXED); + __atomic_compare_exchange_n(&p[4], &p[5], p[6], 0, __ATOMIC_RELAXED, __ATOMIC_RELAXED); + return 0; + }''' + has_atomic128 = cc.links(atomic_test_128) config_host_data.set('CONFIG_ATOMIC128', has_atomic128) if not has_atomic128 - has_cmpxchg128 = cc.links(''' - int main(void) - { - unsigned __int128 x = 0, y = 0; - __sync_val_compare_and_swap_16(&x, y, x); - return 0; - } - ''') + # Even with __builtin_assume_aligned, the above test may have failed + # without optimization enabled. Try again with optimizations locally + # enabled for the function. See + # https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107389 + has_atomic128_opt = cc.links('__attribute__((optimize("O1")))' + atomic_test_128) + config_host_data.set('CONFIG_ATOMIC128_OPT', has_atomic128_opt) - config_host_data.set('CONFIG_CMPXCHG128', has_cmpxchg128) + if not has_atomic128_opt + config_host_data.set('CONFIG_CMPXCHG128', cc.links(''' + int main(void) + { + unsigned __int128 x = 0, y = 0; + __sync_val_compare_and_swap_16(&x, y, x); + return 0; + } + ''')) + endif endif endif
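The workaround the build system probes for can also be exercised directly; a minimal stand-alone sketch, assuming GCC on a host where __alignof(unsigned __int128) < 16, per the gcc PR 107389 discussion above:

    /* Locally enable optimization so pre-GCC-13 compilers will inline
     * the 16-byte atomic instead of deferring to libatomic. */
    #if defined(__GNUC__) && !defined(__OPTIMIZE__)
    # define ATOMIC128_OPT __attribute__((optimize("O1")))
    #else
    # define ATOMIC128_OPT
    #endif

    static inline unsigned __int128 ATOMIC128_OPT
    load16_atomic(void *pv)
    {
        /* Promise the compiler 16-byte alignment of the pointer. */
        unsigned __int128 *p = __builtin_assume_aligned(pv, 16);
        return __atomic_load_n(p, __ATOMIC_RELAXED);
    }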
From patchwork Fri Nov 18 09:47:39 2022
X-Patchwork-Submitter: Richard Henderson <richard.henderson@linaro.org>
X-Patchwork-Id: 626114
From: Richard Henderson <richard.henderson@linaro.org>
To: qemu-devel@nongnu.org
Subject: [PATCH for-8.0 14/29] tcg/i386: Add have_atomic16
Date: Fri, 18 Nov 2022 01:47:39 -0800
Message-Id: <20221118094754.242910-15-richard.henderson@linaro.org>
In-Reply-To: <20221118094754.242910-1-richard.henderson@linaro.org>

Notice when Intel has guaranteed that vmovdqa is atomic.
The new variable will also be used in generated code.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 include/qemu/cpuid.h      | 18 ++++++++++++++++++
 tcg/i386/tcg-target.h     |  1 +
 tcg/i386/tcg-target.c.inc | 27 +++++++++++++++++++++++++++
 3 files changed, 46 insertions(+)

diff --git a/include/qemu/cpuid.h b/include/qemu/cpuid.h
index 1451e8ef2f..35325f1995 100644
--- a/include/qemu/cpuid.h
+++ b/include/qemu/cpuid.h
@@ -71,6 +71,24 @@
 #define bit_LZCNT       (1 << 5)
 #endif
 
+/*
+ * Signatures for different CPU implementations as returned from Leaf 0.
+ */
+
+#ifndef signature_INTEL_ecx
+/* "Genu" "ineI" "ntel" */
+#define signature_INTEL_ebx  0x756e6547
+#define signature_INTEL_edx  0x49656e69
+#define signature_INTEL_ecx  0x6c65746e
+#endif
+
+#ifndef signature_AMD_ecx
+/* "Auth" "enti" "cAMD" */
+#define signature_AMD_ebx  0x68747541
+#define signature_AMD_edx  0x69746e65
+#define signature_AMD_ecx  0x444d4163
+#endif
+
 static inline unsigned xgetbv_low(unsigned c)
 {
     unsigned a, d;
diff --git a/tcg/i386/tcg-target.h b/tcg/i386/tcg-target.h
index 9e0e82d80a..5b037b1d2b 100644
--- a/tcg/i386/tcg-target.h
+++ b/tcg/i386/tcg-target.h
@@ -120,6 +120,7 @@ extern bool have_avx512dq;
 extern bool have_avx512vbmi2;
 extern bool have_avx512vl;
 extern bool have_movbe;
+extern bool have_atomic16;
 
 /* optional instructions */
 #define TCG_TARGET_HAS_div2_i32         1
diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
index 24e9efe631..f4c0c7b8a2 100644
--- a/tcg/i386/tcg-target.c.inc
+++ b/tcg/i386/tcg-target.c.inc
@@ -185,6 +185,7 @@ bool have_avx512dq;
 bool have_avx512vbmi2;
 bool have_avx512vl;
 bool have_movbe;
+bool have_atomic16;
 
 #ifdef CONFIG_CPUID_H
 static bool have_bmi2;
@@ -4165,6 +4166,32 @@ static void tcg_target_init(TCGContext *s)
             have_avx512dq = (b7 & bit_AVX512DQ) != 0;
             have_avx512vbmi2 = (c7 & bit_AVX512VBMI2) != 0;
         }
+
+        /*
+         * The Intel SDM has added:
+         *   Processors that enumerate support for Intel® AVX
+         *   (by setting the feature flag CPUID.01H:ECX.AVX[bit 28])
+         *   guarantee that the 16-byte memory operations performed
+         *   by the following instructions will always be carried
+         *   out atomically:
+         *   - MOVAPD, MOVAPS, and MOVDQA.
+         *   - VMOVAPD, VMOVAPS, and VMOVDQA when encoded with VEX.128.
+         *   - VMOVAPD, VMOVAPS, VMOVDQA32, and VMOVDQA64 when encoded
+         *     with EVEX.128 and k0 (masking disabled).
+         * Note that these instructions require the linear addresses
+         * of their memory operands to be 16-byte aligned.
+         *
+         * AMD has provided an even stronger guarantee that processors
+         * with AVX provide 16-byte atomicity for all cachable,
+         * naturally aligned single loads and stores, e.g. MOVDQU.
+         *
+         * See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104688
+         */
+        if (have_avx1) {
+            __cpuid(0, a, b, c, d);
+            have_atomic16 = (c == signature_INTEL_ecx ||
+                             c == signature_AMD_ecx);
+        }
         }
     }
 }
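For readers who want to poke at the leaf-0 signature check outside QEMU, a standalone sketch (assumptions: GCC or Clang on x86-64 with <cpuid.h>; the real tcg_target_init additionally gates have_avx1 on OSXSAVE/XGETBV state, which is omitted here):

#include <cpuid.h>
#include <stdbool.h>
#include <stdio.h>

int main(void)
{
    unsigned a, b, c, d;
    bool have_avx = false, have_atomic16 = false;

    /* CPUID.01H:ECX.AVX[bit 28] -- <cpuid.h> provides bit_AVX. */
    if (__get_cpuid(1, &a, &b, &c, &d)) {
        have_avx = (c & bit_AVX) != 0;
    }
    /* Leaf 0: vendor string in EBX:EDX:ECX; ECX alone distinguishes
       "GenuineIntel" ("ntel") from "AuthenticAMD" ("cAMD"). */
    if (have_avx && __get_cpuid(0, &a, &b, &c, &d)) {
        have_atomic16 = c == 0x6c65746e || c == 0x444d4163;
    }
    printf("16-byte aligned vmovdqa atomic: %s\n",
           have_atomic16 ? "yes" : "no");
    return 0;
}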
From patchwork Fri Nov 18 09:47:40 2022
X-Patchwork-Submitter: Richard Henderson <richard.henderson@linaro.org>
X-Patchwork-Id: 626100
From: Richard Henderson <richard.henderson@linaro.org>
To: qemu-devel@nongnu.org
Subject: [PATCH for-8.0 15/29] include/qemu/int128: Add vector type to Int128Alias
Date: Fri, 18 Nov 2022 01:47:40 -0800
Message-Id: <20221118094754.242910-16-richard.henderson@linaro.org>
In-Reply-To: <20221118094754.242910-1-richard.henderson@linaro.org>

Adding a vector type will make it easier to handle
i386 have_atomic16 via AVX.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé
---
 include/qemu/int128.h | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/include/qemu/int128.h b/include/qemu/int128.h
index f62a46b48c..f29f90e6f4 100644
--- a/include/qemu/int128.h
+++ b/include/qemu/int128.h
@@ -479,16 +479,16 @@ static inline void bswap128s(Int128 *s)
 /*
  * When compiler supports a 128-bit type, define a combination of
  * a possible structure and the native types.  Ease parameter passing
- * via use of the transparent union extension.
+ * via use of the transparent union extension.  Provide a vector type
+ * for use in atomicity on some hosts.
  */
-#ifdef CONFIG_INT128
 typedef union {
     Int128 s;
+    uint64_t v __attribute__((vector_size(16)));
+#ifdef CONFIG_INT128
     __int128_t i;
     __uint128_t u;
-} Int128Alias __attribute__((transparent_union));
-#else
-typedef Int128 Int128Alias;
 #endif /* CONFIG_INT128 */
+} Int128Alias __attribute__((transparent_union));
 
 #endif /* INT128_H */
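A hypothetical illustration (not QEMU code) of what the union buys: a function declared to take Int128Alias can be handed a plain Int128 with no explicit conversion, and the callee can reinterpret the same bits through whichever member fits -- for instance the vector member, which the "x" (SSE/AVX register) asm constraint accepts where a bare integer may not be:

/* Sketch, assuming GCC extensions and an AVX-capable x86-64 host.
 * A caller holding an Int128 can invoke this directly thanks to
 * __attribute__((transparent_union)). */
static inline void store16_sketch(void *pv, Int128Alias val)
{
    /* val.v is the uint64_t __attribute__((vector_size(16))) view. */
    asm("vmovdqa %1, %0"
        : "=m" (*(Int128 *)pv)   /* pv must be 16-byte aligned */
        : "x" (val.v));
}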
From patchwork Fri Nov 18 09:47:41 2022
X-Patchwork-Submitter: Richard Henderson <richard.henderson@linaro.org>
X-Patchwork-Id: 626099
From: Richard Henderson <richard.henderson@linaro.org>
To: qemu-devel@nongnu.org
Subject: [PATCH for-8.0 16/29] accel/tcg: Use have_atomic16 in ldst_atomicity.c.inc
Date: Fri, 18 Nov 2022 01:47:41 -0800
Message-Id: <20221118094754.242910-17-richard.henderson@linaro.org>
In-Reply-To: <20221118094754.242910-1-richard.henderson@linaro.org>

Hosts using Intel and AMD AVX CPUs are quite common.
Add fast paths through ldst_atomicity using this.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 accel/tcg/ldst_atomicity.c.inc | 76 +++++++++++++++++++++++++++-------
 1 file changed, 60 insertions(+), 16 deletions(-)

diff --git a/accel/tcg/ldst_atomicity.c.inc b/accel/tcg/ldst_atomicity.c.inc
index 68edab4398..d2a3783193 100644
--- a/accel/tcg/ldst_atomicity.c.inc
+++ b/accel/tcg/ldst_atomicity.c.inc
@@ -35,6 +35,14 @@
 #if defined(CONFIG_ATOMIC128)
 # define HAVE_al16_fast    true
+#elif defined(CONFIG_TCG_INTERPRETER)
+/*
+ * FIXME: host specific detection for this is in tcg/$host/,
+ * but we're using tcg/tci/ instead.
+ */
+# define HAVE_al16_fast    false
+#elif defined(__x86_64__)
+# define HAVE_al16_fast    likely(have_atomic16)
 #else
 # define HAVE_al16_fast    false
 #endif
@@ -160,6 +168,12 @@ load_atomic16(void *pv)
 
     r.u = qatomic_read__nocheck(p);
     return r.s;
+#elif defined(__x86_64__)
+    Int128Alias r;
+
+    /* Via HAVE_al16_fast, have_atomic16 is true. */
+    asm("vmovdqa %1, %0" : "=x" (r.u) : "m" (*(Int128 *)pv));
+    return r.s;
 #else
     qemu_build_not_reached();
 #endif
@@ -379,6 +393,24 @@ load_atom_extract_al16_or_al8(void *pv, int s)
         r = qatomic_read__nocheck(p16);
     }
     return r >> shr;
+#elif defined(__x86_64__)
+    uintptr_t pi = (uintptr_t)pv;
+    int shr = (pi & 7) * 8;
+    uint64_t a, b;
+
+    /* Via HAVE_al16_fast, have_atomic16 is true. */
+    pv = (void *)(pi & ~7);
+    if (pi & 8) {
+        uint64_t *p8 = __builtin_assume_aligned(pv, 16, 8);
+        a = qatomic_read__nocheck(p8);
+        b = qatomic_read__nocheck(p8 + 1);
+    } else {
+        asm("vmovdqa %2, %0\n\tvpextrq $1, %0, %1"
+            : "=x"(a), "=r"(b) : "m" (*(__uint128_t *)pv));
+    }
+    asm("shrd %b2, %1, %0" : "+r"(a) : "r"(b), "c"(shr));
+
+    return a;
 #else
     qemu_build_not_reached();
 #endif
@@ -695,23 +727,35 @@ static inline void ATTRIBUTE_ATOMIC128_OPT
 store_atomic16(void *pv, Int128Alias val)
 {
 #if defined(CONFIG_ATOMIC128)
-    __uint128_t *pu = __builtin_assume_aligned(pv, 16);
-    qatomic_set__nocheck(pu, val.u);
-#elif defined(CONFIG_CMPXCHG128)
-    __uint128_t *pu = __builtin_assume_aligned(pv, 16);
-    __uint128_t o;
-
-    /*
-     * Without CONFIG_ATOMIC128, __atomic_compare_exchange_n will always
-     * defer to libatomic, so we must use __sync_val_compare_and_swap_16
-     * and accept the sequential consistency that comes with it.
-     */
-    do {
-        o = *pu;
-    } while (!__sync_bool_compare_and_swap_16(pu, o, val.u));
-#else
-    qemu_build_not_reached();
+    {
+        __uint128_t *pu = __builtin_assume_aligned(pv, 16);
+        qatomic_set__nocheck(pu, val.u);
+        return;
+    }
 #endif
+#if defined(__x86_64__)
+    if (HAVE_al16_fast) {
+        asm("vmovdqa %1, %0" : "=m"(*(__uint128_t *)pv) : "x" (val.u));
+        return;
+    }
+#endif
+#if defined(CONFIG_CMPXCHG128)
+    {
+        __uint128_t *pu = __builtin_assume_aligned(pv, 16);
+        __uint128_t o;
+
+        /*
+         * Without CONFIG_ATOMIC128, __atomic_compare_exchange_n will always
+         * defer to libatomic, so we must use __sync_val_compare_and_swap_16
+         * and accept the sequential consistency that comes with it.
+         */
+        do {
+            o = *pu;
+        } while (!__sync_bool_compare_and_swap_16(pu, o, val.u));
+        return;
+    }
+#endif
+    qemu_build_not_reached();
 }
 
 /**
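Outside of QEMU, the AVX fast path boils down to the following standalone sketch (assumes an x86-64 host where have_atomic16 above would be true, i.e. an Intel or AMD CPU with AVX; compile with -mavx):

#include <stdint.h>

typedef uint64_t v2du __attribute__((vector_size(16)));

/* Sketch: 16-byte single-copy atomic load/store via vmovdqa, relying on
 * the vendor guarantees quoted in patch 14.  pv must be 16-byte aligned,
 * or the instruction faults. */
static inline v2du atomic16_read(const void *pv)
{
    v2du r;
    asm("vmovdqa %1, %0" : "=x"(r) : "m"(*(const v2du *)pv));
    return r;
}

static inline void atomic16_set(void *pv, v2du val)
{
    asm("vmovdqa %1, %0" : "=m"(*(v2du *)pv) : "x"(val));
}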
From patchwork Fri Nov 18 09:47:42 2022
X-Patchwork-Submitter: Richard Henderson <richard.henderson@linaro.org>
X-Patchwork-Id: 626106
From: Richard Henderson <richard.henderson@linaro.org>
To: qemu-devel@nongnu.org
Subject: [PATCH for-8.0 17/29] tcg/aarch64: Add have_lse, have_lse2
Date: Fri, 18 Nov 2022 01:47:42 -0800
Message-Id: <20221118094754.242910-18-richard.henderson@linaro.org>
In-Reply-To: <20221118094754.242910-1-richard.henderson@linaro.org>

Notice when the host has additional atomic instructions.
The new variables will also be used in generated code.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/aarch64/tcg-target.h     |  3 +++
 tcg/aarch64/tcg-target.c.inc | 10 ++++++++++
 2 files changed, 13 insertions(+)

diff --git a/tcg/aarch64/tcg-target.h b/tcg/aarch64/tcg-target.h
index 0dff5807f6..b8f734f371 100644
--- a/tcg/aarch64/tcg-target.h
+++ b/tcg/aarch64/tcg-target.h
@@ -57,6 +57,9 @@ typedef enum {
 #define TCG_TARGET_CALL_ARG_I128        TCG_CALL_ARG_NORMAL
 #define TCG_TARGET_CALL_RET_I128        TCG_CALL_RET_NORMAL
 
+extern bool have_lse;
+extern bool have_lse2;
+
 /* optional instructions */
 #define TCG_TARGET_HAS_div_i32          1
 #define TCG_TARGET_HAS_rem_i32          1
diff --git a/tcg/aarch64/tcg-target.c.inc b/tcg/aarch64/tcg-target.c.inc
index 001a71bbc0..cf5ee6f742 100644
--- a/tcg/aarch64/tcg-target.c.inc
+++ b/tcg/aarch64/tcg-target.c.inc
@@ -13,6 +13,8 @@
 #include "../tcg-ldst.c.inc"
 #include "../tcg-pool.c.inc"
 #include "qemu/bitops.h"
+#include <asm/hwcap.h>
+
 
 /* We're going to re-use TCGType in setting of the SF bit, which controls
    the size of the operation performed.  If we know the values match, it
@@ -71,6 +73,9 @@ static TCGReg tcg_target_call_oarg_reg(TCGCallReturnKind kind, int slot)
     return TCG_REG_X0 + slot;
 }
 
+bool have_lse;
+bool have_lse2;
+
 #define TCG_REG_TMP TCG_REG_X30
 #define TCG_VEC_TMP TCG_REG_V31
 
@@ -2918,6 +2923,11 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
 
 static void tcg_target_init(TCGContext *s)
 {
+    unsigned long hwcap = qemu_getauxval(AT_HWCAP);
+
+    have_lse = hwcap & HWCAP_ATOMICS;
+    have_lse2 = hwcap & HWCAP_USCAT;
+
     tcg_target_available_regs[TCG_TYPE_I32] = 0xffffffffu;
     tcg_target_available_regs[TCG_TYPE_I64] = 0xffffffffu;
     tcg_target_available_regs[TCG_TYPE_V64] = 0xffffffff00000000ull;
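The same probe works in a plain Linux/AArch64 program; a sketch, with glibc's getauxval standing in for qemu_getauxval (HWCAP_ATOMICS advertises FEAT_LSE, HWCAP_USCAT advertises FEAT_LSE2, the "unaligned single-copy atomicity" feature):

#include <stdio.h>
#include <sys/auxv.h>
#include <asm/hwcap.h>

int main(void)
{
    unsigned long hwcap = getauxval(AT_HWCAP);

    /* FEAT_LSE: LDADD, SWP, CAS/CASP and friends. */
    printf("have_lse  = %d\n", !!(hwcap & HWCAP_ATOMICS));
    /* FEAT_LSE2: aligned 16-byte LDP/STP become single-copy atomic. */
    printf("have_lse2 = %d\n", !!(hwcap & HWCAP_USCAT));
    return 0;
}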
From patchwork Fri Nov 18 09:47:43 2022
X-Patchwork-Submitter: Richard Henderson <richard.henderson@linaro.org>
X-Patchwork-Id: 626123
From: Richard Henderson <richard.henderson@linaro.org>
To: qemu-devel@nongnu.org
Subject: [PATCH for-8.0 18/29] accel/tcg: Add aarch64 specific support in ldst_atomicity
Date: Fri, 18 Nov 2022 01:47:43 -0800
Message-Id: <20221118094754.242910-19-richard.henderson@linaro.org>
In-Reply-To: <20221118094754.242910-1-richard.henderson@linaro.org>

We have code in atomic128.h noting that through GCC 8, there
was no support for atomic operations on __uint128.  This has
been fixed in GCC 10.  But we can still improve over any
basic compare-and-swap loop using the ldxp/stxp instructions.

Add fast paths for FEAT_LSE2, using the detection in tcg.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 accel/tcg/ldst_atomicity.c.inc | 75 ++++++++++++++++++++++++++++++++--
 1 file changed, 72 insertions(+), 3 deletions(-)

diff --git a/accel/tcg/ldst_atomicity.c.inc b/accel/tcg/ldst_atomicity.c.inc
index d2a3783193..186862b5ec 100644
--- a/accel/tcg/ldst_atomicity.c.inc
+++ b/accel/tcg/ldst_atomicity.c.inc
@@ -41,6 +41,8 @@
  * but we're using tcg/tci/ instead.
  */
 # define HAVE_al16_fast    false
+#elif defined(__aarch64__)
+# define HAVE_al16_fast    likely(have_lse2)
 #elif defined(__x86_64__)
 # define HAVE_al16_fast    likely(have_atomic16)
 #else
@@ -48,6 +50,8 @@
 #endif
 #if defined(CONFIG_ATOMIC128) || defined(CONFIG_CMPXCHG128)
 # define HAVE_al16         true
+#elif defined(__aarch64__)
+# define HAVE_al16         true
 #else
 # define HAVE_al16         false
 #endif
@@ -168,6 +172,12 @@ load_atomic16(void *pv)
 
     r.u = qatomic_read__nocheck(p);
     return r.s;
+#elif defined(__aarch64__)
+    /* Via HAVE_al16_fast, FEAT_LSE2 is present: LDP becomes atomic. */
+    Int128Alias r;
+
+    asm("ldp %0, %R0, %1" : "=r"(r.u) : "m"(*(__uint128_t *)pv));
+    return r.s;
 #elif defined(__x86_64__)
     Int128Alias r;
 
@@ -246,7 +256,20 @@ static Int128 load_atomic16_or_exit(CPUArchState *env, uintptr_t ra, void *pv)
      * In system mode all guest pages are writable, and for user-only
      * we have just checked writability.  Try cmpxchg.
      */
-#if defined(CONFIG_CMPXCHG128)
+#if defined(__aarch64__)
+    /* We can do better than cmpxchg for AArch64. */
+    {
+        Int128Alias r;
+        uint32_t fail;
+
+        /* The load must be paired with the store to guarantee not tearing. */
+        asm("0: ldxp %0, %R0, %2\n\t"
+            "stxp %w1, %0, %R0, %2\n\t"
+            "cbnz %w1, 0b"
+            : "=&r"(r.u), "=&r"(fail) : "Q"(*p));
+        return r.s;
+    }
+#elif defined(CONFIG_CMPXCHG128)
     /* Swap 0 with 0, with the side-effect of returning the old value.
 */
     {
         Int128Alias r;
@@ -393,6 +416,18 @@ load_atom_extract_al16_or_al8(void *pv, int s)
         r = qatomic_read__nocheck(p16);
     }
     return r >> shr;
+#elif defined(__aarch64__)
+    /*
+     * Via HAVE_al16_fast, FEAT_LSE2 is present.
+     * LDP becomes single-copy atomic if 16-byte aligned, and
+     * single-copy atomic on the parts if 8-byte aligned.
+     */
+    uintptr_t pi = (uintptr_t)pv;
+    int shr = (pi & 7) * 8;
+    uint64_t l, h;
+
+    asm("ldp %0, %1, %2" : "=r"(l), "=r"(h) : "m"(*(__uint128_t *)(pi & ~7)));
+    return (l >> shr) | (h << (-shr & 63));
 #elif defined(__x86_64__)
     uintptr_t pi = (uintptr_t)pv;
     int shr = (pi & 7) * 8;
@@ -739,7 +774,23 @@ store_atomic16(void *pv, Int128Alias val)
         return;
     }
 #endif
-#if defined(CONFIG_CMPXCHG128)
+#if defined(__aarch64__)
+    /* We can do better than cmpxchg for AArch64. */
+    __uint128_t *pu = __builtin_assume_aligned(pv, 16);
+    __uint128_t old;
+    uint32_t fail;
+
+    if (HAVE_al16_fast) {
+        /* Via HAVE_al16_fast, FEAT_LSE2 is present: STP becomes atomic. */
+        asm("stp %1, %R1, %0" : "=Q"(*pu) : "r"(val.u));
+    } else {
+        asm("0: ldxp %0, %R0, %1\n\t"
+            "stxp %w2, %3, %R3, %1\n\t"
+            "cbnz %w2, 0b"
+            : "=&r"(old), "=Q"(*pu), "=&r"(fail) : "r"(val.u));
+    }
+    return;
+#elif defined(CONFIG_CMPXCHG128)
     {
         __uint128_t *pu = __builtin_assume_aligned(pv, 16);
         __uint128_t o;
@@ -839,7 +890,25 @@ static void store_atom_insert_al8(uint64_t *p, uint64_t val, uint64_t msk)
 static void ATTRIBUTE_ATOMIC128_OPT
 store_atom_insert_al16(Int128 *ps, Int128Alias val, Int128Alias msk)
 {
-#if defined(CONFIG_ATOMIC128)
+#if defined(__aarch64__)
+    /*
+     * GCC only implements __sync* primitives for int128 on aarch64.
+     * We can do better without the barriers, and integrating the
+     * arithmetic into the load-exclusive/store-conditional pair.
+     */
+    __uint128_t tmp, *pu = __builtin_assume_aligned(ps, 16);
+    uint32_t fail;
+
+    asm("0: ldxp %[t], %R[t], %[mem]\n\t"
+        "bic %[t], %[t], %[m]\n\t"
+        "bic %R[t], %R[t], %R[m]\n\t"
+        "orr %[t], %[t], %[v]\n\t"
+        "orr %R[t], %R[t], %R[v]\n\t"
+        "stxp %w[f], %[t], %R[t], %[mem]\n\t"
+        "cbnz %w[f], 0b\n"
+        : [mem] "+Q"(*pu), [f] "=&r"(fail), [t] "=&r"(tmp)
+        : [v] "r"(val.u), [m] "r"(msk.u));
+#elif defined(CONFIG_ATOMIC128)
     __uint128_t *pu, old, new;
 
     /* With CONFIG_ATOMIC128, we can avoid the memory barriers.
 */
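As a standalone illustration of the LDXP/STXP pairing used above -- a sketch for Linux/AArch64 with GCC or Clang; storing back the value just loaded is what upgrades the pair to a tear-free 16-byte read, even on CPUs without FEAT_LSE2:

#include <stdint.h>

/* Sketch: tear-free 16-byte read via a load-exclusive/store-exclusive
 * pair.  p must be 16-byte aligned; the loop retries whenever another
 * CPU wrote the monitored region between the LDXP and the STXP. */
static inline __uint128_t atomic16_read_xp(__uint128_t *p)
{
    __uint128_t r;
    uint32_t fail;

    asm("0: ldxp %0, %R0, %2\n\t"
        "stxp %w1, %0, %R0, %2\n\t"
        "cbnz %w1, 0b"
        : "=&r"(r), "=&r"(fail) : "Q"(*p));
    return r;
}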
From patchwork Fri Nov 18 09:47:44 2022
X-Patchwork-Submitter: Richard Henderson <richard.henderson@linaro.org>
X-Patchwork-Id: 626101
From: Richard Henderson <richard.henderson@linaro.org>
To: qemu-devel@nongnu.org
Subject: [PATCH for-8.0 19/29] tcg: Introduce TCG_OPF_TYPE_MASK
Date: Fri, 18 Nov 2022 01:47:44 -0800
Message-Id: <20221118094754.242910-20-richard.henderson@linaro.org>
In-Reply-To: <20221118094754.242910-1-richard.henderson@linaro.org>

Reorg TCG_OPF_64BIT and TCG_OPF_VECTOR into a two-bit field so
that we can add TCG_OPF_128BIT without requiring another bit.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé
---
 include/tcg/tcg.h            | 22 ++++++++++++----------
 tcg/optimize.c               | 15 ++++++++++++---
 tcg/tcg.c                    |  4 ++--
 tcg/aarch64/tcg-target.c.inc |  8 +++++---
 tcg/tci/tcg-target.c.inc     |  3 ++-
 5 files changed, 33 insertions(+), 19 deletions(-)

diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h
index a996da60b5..5874f1e30b 100644
--- a/include/tcg/tcg.h
+++ b/include/tcg/tcg.h
@@ -994,24 +994,26 @@ typedef struct TCGArgConstraint {
 
 /* Bits for TCGOpDef->flags, 8 bits available, all used. */
 enum {
+    /* Two bits describing the output type. */
+    TCG_OPF_TYPE_MASK    = 0x03,
+    TCG_OPF_32BIT        = 0x00,
+    TCG_OPF_64BIT        = 0x01,
+    TCG_OPF_VECTOR       = 0x02,
+    TCG_OPF_128BIT       = 0x03,
     /* Instruction exits the translation block. */
-    TCG_OPF_BB_EXIT      = 0x01,
+    TCG_OPF_BB_EXIT      = 0x04,
     /* Instruction defines the end of a basic block. */
-    TCG_OPF_BB_END       = 0x02,
+    TCG_OPF_BB_END       = 0x08,
     /* Instruction clobbers call registers and potentially update globals. */
-    TCG_OPF_CALL_CLOBBER = 0x04,
+    TCG_OPF_CALL_CLOBBER = 0x10,
     /* Instruction has side effects: it cannot be removed if its outputs
        are not used, and might trigger exceptions. */
-    TCG_OPF_SIDE_EFFECTS = 0x08,
-    /* Instruction operands are 64-bits (otherwise 32-bits). */
-    TCG_OPF_64BIT        = 0x10,
+    TCG_OPF_SIDE_EFFECTS = 0x20,
     /* Instruction is optional and not implemented by the host, or insn
        is generic and should not be implemened by the host. */
-    TCG_OPF_NOT_PRESENT  = 0x20,
-    /* Instruction operands are vectors. */
-    TCG_OPF_VECTOR       = 0x40,
+    TCG_OPF_NOT_PRESENT  = 0x40,
     /* Instruction is a conditional branch. */
-    TCG_OPF_COND_BRANCH  = 0x80
+    TCG_OPF_COND_BRANCH  = 0x80,
 };
 
 typedef struct TCGOpDef {
diff --git a/tcg/optimize.c b/tcg/optimize.c
index 763bca9ea6..5c0bd6b6e6 100644
--- a/tcg/optimize.c
+++ b/tcg/optimize.c
@@ -2053,12 +2053,21 @@ void tcg_optimize(TCGContext *s)
         copy_propagate(&ctx, op, def->nb_oargs, def->nb_iargs);
 
         /* Pre-compute the type of the operation.
 */
-        if (def->flags & TCG_OPF_VECTOR) {
+        switch (def->flags & TCG_OPF_TYPE_MASK) {
+        case TCG_OPF_VECTOR:
             ctx.type = TCG_TYPE_V64 + TCGOP_VECL(op);
-        } else if (def->flags & TCG_OPF_64BIT) {
+            break;
+        case TCG_OPF_128BIT:
+            ctx.type = TCG_TYPE_I128;
+            break;
+        case TCG_OPF_64BIT:
             ctx.type = TCG_TYPE_I64;
-        } else {
+            break;
+        case TCG_OPF_32BIT:
             ctx.type = TCG_TYPE_I32;
+            break;
+        default:
+            qemu_build_not_reached();
         }
 
         /* Assume all bits affected, no bits known zero, no sign reps. */
diff --git a/tcg/tcg.c b/tcg/tcg.c
index 50db393594..d221f76366 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -2075,7 +2075,7 @@ static void tcg_dump_ops(TCGContext *s, FILE *f, bool have_prefs)
             nb_iargs = def->nb_iargs;
             nb_cargs = def->nb_cargs;
 
-            if (def->flags & TCG_OPF_VECTOR) {
+            if ((def->flags & TCG_OPF_TYPE_MASK) == TCG_OPF_VECTOR) {
                 col += ne_fprintf(f, "v%d,e%d,",
                                   64 << TCGOP_VECL(op), 8 << TCGOP_VECE(op));
             }
@@ -4362,7 +4362,7 @@ static void tcg_reg_alloc_op(TCGContext *s, const TCGOp *op)
     }
 
     /* emit instruction */
-    if (def->flags & TCG_OPF_VECTOR) {
+    if ((def->flags & TCG_OPF_TYPE_MASK) == TCG_OPF_VECTOR) {
         tcg_out_vec_op(s, op->opc, TCGOP_VECL(op), TCGOP_VECE(op),
                        new_args, const_args);
     } else {
diff --git a/tcg/aarch64/tcg-target.c.inc b/tcg/aarch64/tcg-target.c.inc
index cf5ee6f742..9ea1608015 100644
--- a/tcg/aarch64/tcg-target.c.inc
+++ b/tcg/aarch64/tcg-target.c.inc
@@ -1896,9 +1896,11 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
                        const TCGArg args[TCG_MAX_OP_ARGS],
                        const int const_args[TCG_MAX_OP_ARGS])
 {
-    /* 99% of the time, we can signal the use of extension registers
-       by looking to see if the opcode handles 64-bit data.  */
-    TCGType ext = (tcg_op_defs[opc].flags & TCG_OPF_64BIT) != 0;
+    /*
+     * 99% of the time, we can signal the use of extension registers
+     * by looking to see if the opcode handles 32-bit data or not.
+     */
+    TCGType ext = (tcg_op_defs[opc].flags & TCG_OPF_TYPE_MASK) != TCG_OPF_32BIT;
 
     /* Hoist the loads of the most common arguments. */
     TCGArg a0 = args[0];
diff --git a/tcg/tci/tcg-target.c.inc b/tcg/tci/tcg-target.c.inc
index 357888a532..f8ec07839c 100644
--- a/tcg/tci/tcg-target.c.inc
+++ b/tcg/tci/tcg-target.c.inc
@@ -690,7 +690,8 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
     CASE_32_64(sextract) /* Optional (TCG_TARGET_HAS_sextract_*). */
     {
         TCGArg pos = args[2], len = args[3];
-        TCGArg max = tcg_op_defs[opc].flags & TCG_OPF_64BIT ? 64 : 32;
+        TCGArg max = ((tcg_op_defs[opc].flags & TCG_OPF_TYPE_MASK)
+                      == TCG_OPF_32BIT ? 32 : 64);
 
         tcg_debug_assert(pos < max);
         tcg_debug_assert(pos + len <= max);
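A self-contained sketch of the decode pattern this enables (it mirrors the switch added to tcg/optimize.c; the op_type_name helper is hypothetical): four operation types now fit in the low two bits, so classification is a single mask-and-compare instead of testing independent flag bits:

enum {
    TCG_OPF_TYPE_MASK = 0x03,
    TCG_OPF_32BIT     = 0x00,
    TCG_OPF_64BIT     = 0x01,
    TCG_OPF_VECTOR    = 0x02,
    TCG_OPF_128BIT    = 0x03,
};

static const char *op_type_name(unsigned flags)
{
    switch (flags & TCG_OPF_TYPE_MASK) {
    case TCG_OPF_32BIT:  return "i32";
    case TCG_OPF_64BIT:  return "i64";
    case TCG_OPF_VECTOR: return "vec";
    case TCG_OPF_128BIT: return "i128";
    }
    return "?";  /* unreachable: the mask yields only 0..3 */
}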
From: Richard Henderson <richard.henderson@linaro.org>
To: qemu-devel@nongnu.org
Subject: [PATCH for-8.0 20/29] tcg: Add INDEX_op_qemu_{ld,st}_i128
Date: Fri, 18 Nov 2022 01:47:45 -0800
Message-Id: <20221118094754.242910-21-richard.henderson@linaro.org>
In-Reply-To: <20221118094754.242910-1-richard.henderson@linaro.org>
References: <20221118094754.242910-1-richard.henderson@linaro.org>

Add opcodes for backend support for 128-bit memory operations.

Signed-off-by: Richard Henderson
Reviewed-by: Philippe Mathieu-Daudé
---
 include/tcg/tcg-opc.h        |  8 +++++
 tcg/aarch64/tcg-target.h     |  2 ++
 tcg/arm/tcg-target.h         |  2 ++
 tcg/i386/tcg-target.h        |  2 ++
 tcg/loongarch64/tcg-target.h |  2 ++
 tcg/mips/tcg-target.h        |  2 ++
 tcg/ppc/tcg-target.h         |  2 ++
 tcg/riscv/tcg-target.h       |  2 ++
 tcg/s390x/tcg-target.h       |  2 ++
 tcg/sparc64/tcg-target.h     |  2 ++
 tcg/tci/tcg-target.h         |  2 ++
 tcg/tcg-op.c                 | 67 ++++++++++++++++++++++++++++++++----
 tcg/tcg.c                    |  4 +++
 tcg/README                   | 10 ++++--
 14 files changed, 100 insertions(+), 9 deletions(-)

diff --git a/include/tcg/tcg-opc.h b/include/tcg/tcg-opc.h
index dd444734d9..94cf7c5d6a 100644
--- a/include/tcg/tcg-opc.h
+++ b/include/tcg/tcg-opc.h
@@ -213,6 +213,14 @@ DEF(qemu_st8_i32, 0, TLADDR_ARGS + 1, 1,
     TCG_OPF_CALL_CLOBBER | TCG_OPF_SIDE_EFFECTS |
     IMPL(TCG_TARGET_HAS_qemu_st8_i32))
 
+/* Only for 64-bit hosts at the moment. */
+DEF(qemu_ld_i128, 2, 1, 1,
+    TCG_OPF_CALL_CLOBBER | TCG_OPF_SIDE_EFFECTS | TCG_OPF_64BIT |
+    IMPL(TCG_TARGET_HAS_qemu_ldst_i128))
+DEF(qemu_st_i128, 0, 3, 1,
+    TCG_OPF_CALL_CLOBBER | TCG_OPF_SIDE_EFFECTS | TCG_OPF_64BIT |
+    IMPL(TCG_TARGET_HAS_qemu_ldst_i128))
+
 /* Host vector support. */
 #define IMPLVEC  TCG_OPF_VECTOR | IMPL(TCG_TARGET_MAYBE_vec)
diff --git a/tcg/aarch64/tcg-target.h b/tcg/aarch64/tcg-target.h
index b8f734f371..b0fbf5b699 100644
--- a/tcg/aarch64/tcg-target.h
+++ b/tcg/aarch64/tcg-target.h
@@ -130,6 +130,8 @@ extern bool have_lse2;
 #define TCG_TARGET_HAS_mulsh_i64        1
 #define TCG_TARGET_HAS_direct_jump      1
 
+#define TCG_TARGET_HAS_qemu_ldst_i128   0
+
 #define TCG_TARGET_HAS_v64              1
 #define TCG_TARGET_HAS_v128             1
 #define TCG_TARGET_HAS_v256             0
diff --git a/tcg/arm/tcg-target.h b/tcg/arm/tcg-target.h
index 6613d3d791..8bcab0ac9b 100644
--- a/tcg/arm/tcg-target.h
+++ b/tcg/arm/tcg-target.h
@@ -126,6 +126,8 @@ extern bool use_neon_instructions;
 #define TCG_TARGET_HAS_direct_jump      0
 #define TCG_TARGET_HAS_qemu_st8_i32     0
 
+#define TCG_TARGET_HAS_qemu_ldst_i128   0
+
 #define TCG_TARGET_HAS_v64              use_neon_instructions
 #define TCG_TARGET_HAS_v128             use_neon_instructions
 #define TCG_TARGET_HAS_v256             0
diff --git a/tcg/i386/tcg-target.h b/tcg/i386/tcg-target.h
index 5b037b1d2b..53d2cb3412 100644
--- a/tcg/i386/tcg-target.h
+++ b/tcg/i386/tcg-target.h
@@ -195,6 +195,8 @@ extern bool have_atomic16;
 #define TCG_TARGET_HAS_qemu_st8_i32     1
 #endif
 
+#define TCG_TARGET_HAS_qemu_ldst_i128   0
+
 /* We do not support older SSE systems, only beginning with AVX1. */
 #define TCG_TARGET_HAS_v64              have_avx1
 #define TCG_TARGET_HAS_v128             have_avx1
diff --git a/tcg/loongarch64/tcg-target.h b/tcg/loongarch64/tcg-target.h
index 9d0db8fdfe..6cb702a108 100644
--- a/tcg/loongarch64/tcg-target.h
+++ b/tcg/loongarch64/tcg-target.h
@@ -173,6 +173,8 @@ typedef enum {
 #define TCG_TARGET_HAS_muluh_i64        1
 #define TCG_TARGET_HAS_mulsh_i64        1
 
+#define TCG_TARGET_HAS_qemu_ldst_i128   0
+
 void tb_target_set_jmp_target(uintptr_t, uintptr_t, uintptr_t, uintptr_t);
 
 #define TCG_TARGET_DEFAULT_MO (0)
diff --git a/tcg/mips/tcg-target.h b/tcg/mips/tcg-target.h
index b235cba8ba..0897cfd8d5 100644
--- a/tcg/mips/tcg-target.h
+++ b/tcg/mips/tcg-target.h
@@ -204,6 +204,8 @@ extern bool use_mips32r2_instructions;
 #define TCG_TARGET_HAS_ext16u_i64       0 /* andi rt, rs, 0xffff */
 #endif
 
+#define TCG_TARGET_HAS_qemu_ldst_i128   0
+
 #define TCG_TARGET_DEFAULT_MO (0)
 #define TCG_TARGET_HAS_MEMORY_BSWAP     1
diff --git a/tcg/ppc/tcg-target.h b/tcg/ppc/tcg-target.h
index b5cd225cfa..920a746482 100644
--- a/tcg/ppc/tcg-target.h
+++ b/tcg/ppc/tcg-target.h
@@ -151,6 +151,8 @@ extern bool have_vsx;
 #define TCG_TARGET_HAS_mulsh_i64        1
 #endif
 
+#define TCG_TARGET_HAS_qemu_ldst_i128   0
+
 /*
  * While technically Altivec could support V64, it has no 64-bit store
  * instruction and substituting two 32-bit stores makes the generated
diff --git a/tcg/riscv/tcg-target.h b/tcg/riscv/tcg-target.h
index d61ca902d3..205d513d08 100644
--- a/tcg/riscv/tcg-target.h
+++ b/tcg/riscv/tcg-target.h
@@ -168,6 +168,8 @@ typedef enum {
 #define TCG_TARGET_HAS_mulsh_i64        1
 #endif
 
+#define TCG_TARGET_HAS_qemu_ldst_i128   0
+
 /* not defined -- call should be eliminated at compile time */
 void tb_target_set_jmp_target(uintptr_t, uintptr_t, uintptr_t, uintptr_t);
diff --git a/tcg/s390x/tcg-target.h b/tcg/s390x/tcg-target.h
index 9a3856f0b3..f87905d1e4 100644
--- a/tcg/s390x/tcg-target.h
+++ b/tcg/s390x/tcg-target.h
@@ -139,6 +139,8 @@ extern uint64_t s390_facilities[3];
 #define TCG_TARGET_HAS_muluh_i64      0
 #define TCG_TARGET_HAS_mulsh_i64      0
 
+#define TCG_TARGET_HAS_qemu_ldst_i128 0
+
 #define TCG_TARGET_HAS_v64            HAVE_FACILITY(VECTOR)
 #define TCG_TARGET_HAS_v128           HAVE_FACILITY(VECTOR)
 #define TCG_TARGET_HAS_v256           0
diff --git a/tcg/sparc64/tcg-target.h b/tcg/sparc64/tcg-target.h
index 53cfa843da..bfbfb51319 100644
--- a/tcg/sparc64/tcg-target.h
+++ b/tcg/sparc64/tcg-target.h
@@ -152,6 +152,8 @@ extern bool use_vis3_instructions;
 #define TCG_TARGET_HAS_muluh_i64        use_vis3_instructions
 #define TCG_TARGET_HAS_mulsh_i64        0
 
+#define TCG_TARGET_HAS_qemu_ldst_i128   0
+
 #define TCG_AREG0 TCG_REG_I0
 
 #define TCG_TARGET_DEFAULT_MO (0)
diff --git a/tcg/tci/tcg-target.h b/tcg/tci/tcg-target.h
index 9d569c9e04..e4899c7d02 100644
--- a/tcg/tci/tcg-target.h
+++ b/tcg/tci/tcg-target.h
@@ -128,6 +128,8 @@
 #define TCG_TARGET_HAS_mulu2_i32        1
 #endif /* TCG_TARGET_REG_BITS == 64 */
 
+#define TCG_TARGET_HAS_qemu_ldst_i128   0
+
 /* Number of registers available. */
 #define TCG_TARGET_NB_REGS 16
diff --git a/tcg/tcg-op.c b/tcg/tcg-op.c
index bbb29bed2b..6210577b85 100644
--- a/tcg/tcg-op.c
+++ b/tcg/tcg-op.c
@@ -3201,7 +3201,7 @@ static void canonicalize_memop_i128_as_i64(MemOp ret[2], MemOp orig)
 
 void tcg_gen_qemu_ld_i128(TCGv_i128 val, TCGv addr, TCGArg idx, MemOp memop)
 {
-    MemOpIdx oi = make_memop_idx(memop, idx);
+    const MemOpIdx oi = make_memop_idx(memop, idx);
 
     tcg_debug_assert((memop & MO_SIZE) == MO_128);
     tcg_debug_assert((memop & MO_SIGN) == 0);
@@ -3209,9 +3209,35 @@ void tcg_gen_qemu_ld_i128(TCGv_i128 val, TCGv addr, TCGArg idx, MemOp memop)
     tcg_gen_req_mo(TCG_MO_LD_LD | TCG_MO_ST_LD);
     addr = plugin_prep_mem_callbacks(addr);
 
-    /* TODO: allow the tcg backend to see the whole operation. */
+    /* TODO: For now, force 32-bit hosts to use the helper. */
+    if (TCG_TARGET_HAS_qemu_ldst_i128 && TCG_TARGET_REG_BITS == 64) {
+        TCGv_i64 lo, hi;
+        TCGArg addr_arg;
+        MemOpIdx adj_oi;
 
-    if (use_two_i64_for_i128(memop)) {
+        /* TODO: Make TCG_TARGET_HAS_MEMORY_BSWAP fine grained. */
+        if (!TCG_TARGET_HAS_MEMORY_BSWAP && (memop & MO_BSWAP)) {
+            lo = TCGV128_HIGH(val);
+            hi = TCGV128_LOW(val);
+            adj_oi = make_memop_idx(memop & ~MO_BSWAP, idx);
+        } else {
+            lo = TCGV128_LOW(val);
+            hi = TCGV128_HIGH(val);
+            adj_oi = oi;
+        }
+
+#if TARGET_LONG_BITS == 32
+        addr_arg = tcgv_i32_arg(addr);
+#else
+        addr_arg = tcgv_i64_arg(addr);
+#endif
+        tcg_gen_op4ii_i64(INDEX_op_qemu_ld_i128, lo, hi, addr_arg, adj_oi);
+
+        if (!TCG_TARGET_HAS_MEMORY_BSWAP && (memop & MO_BSWAP)) {
+            tcg_gen_bswap64_i64(lo, lo);
+            tcg_gen_bswap64_i64(hi, hi);
+        }
+    } else if (use_two_i64_for_i128(memop)) {
         MemOp mop[2];
         TCGv addr_p8;
         TCGv_i64 x, y;
@@ -3254,7 +3280,7 @@ void tcg_gen_qemu_ld_i128(TCGv_i128 val, TCGv addr, TCGArg idx, MemOp memop)
 
 void tcg_gen_qemu_st_i128(TCGv_i128 val, TCGv addr, TCGArg idx, MemOp memop)
 {
-    MemOpIdx oi = make_memop_idx(memop, idx);
+    const MemOpIdx oi = make_memop_idx(memop, idx);
 
     tcg_debug_assert((memop & MO_SIZE) == MO_128);
     tcg_debug_assert((memop & MO_SIGN) == 0);
@@ -3262,9 +3288,38 @@ void tcg_gen_qemu_st_i128(TCGv_i128 val, TCGv addr, TCGArg idx, MemOp memop)
     tcg_gen_req_mo(TCG_MO_ST_LD | TCG_MO_ST_ST);
     addr = plugin_prep_mem_callbacks(addr);
 
-    /* TODO: allow the tcg backend to see the whole operation. */
+    /* TODO: For now, force 32-bit hosts to use the helper. */
 
-    if (use_two_i64_for_i128(memop)) {
+    if (TCG_TARGET_HAS_qemu_ldst_i128 && TCG_TARGET_REG_BITS == 64) {
+        TCGv_i64 lo, hi;
+        TCGArg addr_arg;
+        MemOpIdx adj_oi;
+
+        /* TODO: Make TCG_TARGET_HAS_MEMORY_BSWAP fine grained. */
+        if (!TCG_TARGET_HAS_MEMORY_BSWAP && (memop & MO_BSWAP)) {
+            lo = tcg_temp_new_i64();
+            hi = tcg_temp_new_i64();
+            tcg_gen_bswap64_i64(lo, TCGV128_HIGH(val));
+            tcg_gen_bswap64_i64(hi, TCGV128_LOW(val));
+            adj_oi = make_memop_idx(memop & ~MO_BSWAP, idx);
+        } else {
+            lo = TCGV128_LOW(val);
+            hi = TCGV128_HIGH(val);
+            adj_oi = oi;
+        }
+
+#if TARGET_LONG_BITS == 32
+        addr_arg = tcgv_i32_arg(addr);
+#else
+        addr_arg = tcgv_i64_arg(addr);
+#endif
+        tcg_gen_op4ii_i64(INDEX_op_qemu_st_i128, lo, hi, addr_arg, adj_oi);
+
+        if (!TCG_TARGET_HAS_MEMORY_BSWAP && (memop & MO_BSWAP)) {
+            tcg_temp_free_i64(lo);
+            tcg_temp_free_i64(hi);
+        }
+    } else if (use_two_i64_for_i128(memop)) {
         MemOp mop[2];
         TCGv addr_p8;
         TCGv_i64 x, y;
diff --git a/tcg/tcg.c b/tcg/tcg.c
index d221f76366..9a000c55ed 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -1497,6 +1497,10 @@ bool tcg_op_supported(TCGOpcode op)
     case INDEX_op_qemu_st8_i32:
         return TCG_TARGET_HAS_qemu_st8_i32;
 
+    case INDEX_op_qemu_ld_i128:
+    case INDEX_op_qemu_st_i128:
+        return TCG_TARGET_HAS_qemu_ldst_i128;
+
     case INDEX_op_mov_i32:
     case INDEX_op_setcond_i32:
     case INDEX_op_brcond_i32:
diff --git a/tcg/README b/tcg/README
index bc15cc3b32..b3f8578955 100644
--- a/tcg/README
+++ b/tcg/README
@@ -512,8 +512,8 @@ jump to the TCG epilogue to go back to the exec loop.
 This operation is optional. If the TCG backend does not implement the
 goto_ptr opcode, emitting this op is equivalent to emitting exit_tb(0).
 
-* qemu_ld_i32/i64 t0, t1, flags, memidx
-* qemu_st_i32/i64 t0, t1, flags, memidx
+* qemu_ld_i32/i64/i128 t0, t1, flags, memidx
+* qemu_st_i32/i64/i128 t0, t1, flags, memidx
 * qemu_st8_i32 t0, t1, flags, memidx
 
 Load data at the guest address t1 into t0, or store data in t0 at guest
@@ -522,7 +522,8 @@ register t0 only.  The address t1 is always sized according to the guest,
 and the width of the memory operation is controlled by flags.
 
 Both t0 and t1 may be split into little-endian ordered pairs of registers
-if dealing with 64-bit quantities on a 32-bit host.
+if dealing with 64-bit quantities on a 32-bit host, or 128-bit quantities
+on a 64-bit host.
 
 The memidx selects the qemu tlb index to use (e.g. user or kernel access).
 The flags are the MemOp bits, selecting the sign, width, and endianness
@@ -531,6 +532,9 @@ of the memory access.
 For a 32-bit host, qemu_ld/st_i64 is guaranteed to only be used with a
 64-bit memory access specified in flags.
 
+For qemu_ld/st_i128, these are only supported for a 64-bit host, and are
+guaranteed to be used with the host memory ordering.
+
 For i386, qemu_st8_i32 is exactly like qemu_st_i32, except the size of
 the memory operation is known to be 8-bit.  This allows the backend to
 provide a different set of register constraints.
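To make the MO_BSWAP fallback above concrete: when the backend cannot
byte-swap during the memory operation itself, the generic code strips
MO_BSWAP from the MemOpIdx, exchanges the roles of the two 64-bit halves,
and byte-swaps each half separately. A minimal sketch in plain C, assuming
a little-endian host and the GCC/Clang __builtin_bswap64 intrinsic
(illustrative only, not QEMU code):

    #include <stdint.h>
    #include <string.h>

    typedef struct { uint64_t lo, hi; } I128Parts;

    /* Host-order load of two 64-bit halves, standing in for the new
     * qemu_ld_i128 opcode (assume a little-endian host). */
    static I128Parts host_ld_i128(const void *p)
    {
        I128Parts t;
        memcpy(&t.lo, p, 8);
        memcpy(&t.hi, (const char *)p + 8, 8);
        return t;
    }

    /* Big-endian guest load built on the host-order opcode: swap the
     * destination roles of lo/hi, then byte-swap each half, mirroring
     * the MO_BSWAP adjustment in tcg_gen_qemu_ld_i128. */
    static I128Parts guest_ld_i128_be(const void *p)
    {
        I128Parts t = host_ld_i128(p);
        I128Parts r = { .lo = __builtin_bswap64(t.hi),
                        .hi = __builtin_bswap64(t.lo) };
        return r;
    }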
From patchwork Fri Nov 18 09:47:46 2022
X-Patchwork-Id: 626115
From: Richard Henderson <richard.henderson@linaro.org>
To: qemu-devel@nongnu.org
Subject: [PATCH for-8.0 21/29] tcg/i386: Introduce tcg_out_mov2
Date: Fri, 18 Nov 2022 01:47:46 -0800
Message-Id: <20221118094754.242910-22-richard.henderson@linaro.org>
In-Reply-To: <20221118094754.242910-1-richard.henderson@linaro.org>
References: <20221118094754.242910-1-richard.henderson@linaro.org>

Create a helper for data movement minding register overlap.
Use the more general xchg instruction, which consumes one extra byte
but simplifies the general case.

Signed-off-by: Richard Henderson
Reviewed-by: Philippe Mathieu-Daudé
---
 tcg/i386/tcg-target.c.inc | 27 +++++++++++++++++++++------
 1 file changed, 21 insertions(+), 6 deletions(-)

diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
index f4c0c7b8a2..79568a3981 100644
--- a/tcg/i386/tcg-target.c.inc
+++ b/tcg/i386/tcg-target.c.inc
@@ -461,6 +461,7 @@ static bool tcg_target_const_match(int64_t val, TCGType type, int ct)
 #define OPC_VPTERNLOGQ  (0x25 | P_EXT3A | P_DATA16 | P_VEXW | P_EVEX)
 #define OPC_VZEROUPPER  (0x77 | P_EXT)
 #define OPC_XCHG_ax_r32 (0x90)
+#define OPC_XCHG_EvGv   (0x87)
 
 #define OPC_GRP3_Eb     (0xf6)
 #define OPC_GRP3_Ev     (0xf7)
@@ -1880,6 +1881,24 @@ static void add_qemu_ldst_label(TCGContext *s, bool is_ld, bool is_64,
     }
 }
 
+/* Move src1 to dst1 and src2 to dst2, minding possible overlap. */
+static void tcg_out_mov2(TCGContext *s,
+                         TCGType type1, TCGReg dst1, TCGReg src1,
+                         TCGType type2, TCGReg dst2, TCGReg src2)
+{
+    if (dst1 != src2) {
+        tcg_out_mov(s, type1, dst1, src1);
+        tcg_out_mov(s, type2, dst2, src2);
+    } else if (dst2 != src1) {
+        tcg_out_mov(s, type2, dst2, src2);
+        tcg_out_mov(s, type1, dst1, src1);
+    } else {
+        /* dst1 == src2 && dst2 == src1 -> xchg. */
+        int w = (type1 == TCG_TYPE_I32 && type2 == TCG_TYPE_I32 ? 0 : P_REXW);
+        tcg_out_modrm(s, OPC_XCHG_EvGv + w, dst1, dst2);
+    }
+}
+
 /*
  * Generate code for the slow path for a load at the end of block
  */
@@ -1947,13 +1966,9 @@ static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
     case MO_UQ:
         if (TCG_TARGET_REG_BITS == 64) {
             tcg_out_mov(s, TCG_TYPE_I64, data_reg, TCG_REG_RAX);
-        } else if (data_reg == TCG_REG_EDX) {
-            /* xchg %edx, %eax */
-            tcg_out_opc(s, OPC_XCHG_ax_r32 + TCG_REG_EDX, 0, 0, 0);
-            tcg_out_mov(s, TCG_TYPE_I32, l->datahi_reg, TCG_REG_EAX);
         } else {
-            tcg_out_mov(s, TCG_TYPE_I32, data_reg, TCG_REG_EAX);
-            tcg_out_mov(s, TCG_TYPE_I32, l->datahi_reg, TCG_REG_EDX);
+            tcg_out_mov2(s, TCG_TYPE_I32, data_reg, TCG_REG_EAX,
+                         TCG_TYPE_I32, l->datahi_reg, TCG_REG_EDX);
         }
         break;
     default:
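The case analysis in tcg_out_mov2 generalizes to any pair of moves whose
sources and destinations may overlap: emit the moves in whichever order
avoids clobbering a still-needed source, and fall back to a single exchange
when the two registers are exactly swapped. A minimal sketch in plain C
over an integer array standing in for the register file (illustrative
only; mov2 and the array are invented stand-ins, not QEMU code):

    #include <stdio.h>

    /* Assign r[dst1] = old r[src1] and r[dst2] = old r[src2], choosing
     * an order that never clobbers a still-needed source -- the same
     * three cases as tcg_out_mov2(). */
    static void mov2(int *r, int dst1, int src1, int dst2, int src2)
    {
        if (dst1 != src2) {
            r[dst1] = r[src1];      /* safe: src2 not yet overwritten */
            r[dst2] = r[src2];
        } else if (dst2 != src1) {
            r[dst2] = r[src2];      /* safe: src1 not yet overwritten */
            r[dst1] = r[src1];
        } else {
            /* dst1 == src2 && dst2 == src1: a true swap (xchg). */
            int t = r[dst1];
            r[dst1] = r[dst2];
            r[dst2] = t;
        }
    }

    int main(void)
    {
        int r[4] = { 10, 20, 30, 40 };
        mov2(r, 0, 1, 1, 0);            /* swap case: r0 <-> r1 */
        printf("%d %d\n", r[0], r[1]);  /* prints: 20 10 */
        return 0;
    }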
From patchwork Fri Nov 18 09:47:47 2022
X-Patchwork-Id: 626110
From: Richard Henderson <richard.henderson@linaro.org>
To: qemu-devel@nongnu.org
Subject: [PATCH for-8.0 22/29] tcg/i386: Introduce tcg_out_testi
Date: Fri, 18 Nov 2022 01:47:47 -0800
Message-Id: <20221118094754.242910-23-richard.henderson@linaro.org>
In-Reply-To: <20221118094754.242910-1-richard.henderson@linaro.org>
References: <20221118094754.242910-1-richard.henderson@linaro.org>

Split out a helper for choosing testb vs testl.

Signed-off-by: Richard Henderson
Reviewed-by: Philippe Mathieu-Daudé
---
 tcg/i386/tcg-target.c.inc | 30 ++++++++++++++++++------------
 1 file changed, 18 insertions(+), 12 deletions(-)

diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
index 79568a3981..5ddbbbaf18 100644
--- a/tcg/i386/tcg-target.c.inc
+++ b/tcg/i386/tcg-target.c.inc
@@ -1729,6 +1729,23 @@ static void tcg_out_nopn(TCGContext *s, int n)
     tcg_out8(s, 0x90);
 }
 
+/* Test register R vs immediate bits I, setting Z flag for EQ/NE. */
+static void __attribute__((unused))
+tcg_out_testi(TCGContext *s, TCGReg r, uint32_t i)
+{
+    /*
+     * This is used for testing alignment, so we can usually use testb.
+     * For i686, we have to use testl for %esi/%edi.
+     */
+    if (i <= 0xff && (TCG_TARGET_REG_BITS == 64 || r < 4)) {
+        tcg_out_modrm(s, OPC_GRP3_Eb | P_REXB_RM, EXT3_TESTi, r);
+        tcg_out8(s, i);
+    } else {
+        tcg_out_modrm(s, OPC_GRP3_Ev, EXT3_TESTi, r);
+        tcg_out32(s, i);
+    }
+}
+
 #if defined(CONFIG_SOFTMMU)
 /*
  * helper signature: helper_ld*_mmu(CPUState *env, target_ulong addr,
@@ -2056,18 +2073,7 @@ static void tcg_out_test_alignment(TCGContext *s, bool is_ld, TCGReg addrlo,
     unsigned a_mask = (1 << a_bits) - 1;
     TCGLabelQemuLdst *label;
 
-    /*
-     * We are expecting a_bits to max out at 7, so we can usually use testb.
-     * For i686, we have to use testl for %esi/%edi.
-     */
-    if (a_mask <= 0xff && (TCG_TARGET_REG_BITS == 64 || addrlo < 4)) {
-        tcg_out_modrm(s, OPC_GRP3_Eb | P_REXB_RM, EXT3_TESTi, addrlo);
-        tcg_out8(s, a_mask);
-    } else {
-        tcg_out_modrm(s, OPC_GRP3_Ev, EXT3_TESTi, addrlo);
-        tcg_out32(s, a_mask);
-    }
-
+    tcg_out_testi(s, addrlo, a_mask);
     /* jne slow_path */
     tcg_out_opc(s, OPC_JCC_long + JCC_JNE, 0, 0, 0);
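The encoding choice in tcg_out_testi hinges on two facts: testb needs only
a 1-byte immediate, and on 32-bit x86 only the four low registers
(EAX, ECX, EDX, EBX -- register numbers 0-3) have byte-addressable forms,
so %esi/%edi must fall back to testl. A minimal sketch of the same
decision in plain C (illustrative only; pick_test_insn and its parameters
are invented stand-ins, not QEMU code):

    #include <stdint.h>
    #include <stdio.h>

    /* Choose the shorter "testb" form when the immediate fits in one
     * byte and the register has a byte encoding; host_is_64 and reg
     * stand in for TCG_TARGET_REG_BITS and the TCGReg index. */
    static const char *pick_test_insn(int host_is_64, int reg, uint32_t imm)
    {
        if (imm <= 0xff && (host_is_64 || reg < 4)) {
            return "testb $imm8, %reg8";   /* 1-byte immediate */
        }
        return "testl $imm32, %reg32";     /* 4-byte immediate */
    }

    int main(void)
    {
        /* Alignment masks are tiny, so %eax gets the byte form... */
        printf("%s\n", pick_test_insn(0, 0 /* eax */, 0x7));
        /* ...but %esi (reg 6) on i686 has no byte encoding. */
        printf("%s\n", pick_test_insn(0, 6 /* esi */, 0x7));
        return 0;
    }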
From patchwork Fri Nov 18 09:47:48 2022
X-Patchwork-Id: 626124
From: Richard Henderson <richard.henderson@linaro.org>
To: qemu-devel@nongnu.org
Subject: [PATCH for-8.0 23/29] tcg/i386: Use full load/store helpers in user-only mode
Date: Fri, 18 Nov 2022 01:47:48 -0800
Message-Id: <20221118094754.242910-24-richard.henderson@linaro.org>
In-Reply-To: <20221118094754.242910-1-richard.henderson@linaro.org>
References: <20221118094754.242910-1-richard.henderson@linaro.org>

Instead of using helper_unaligned_{ld,st}, use the full load/store
helpers. This will allow the fast path to increase alignment to
implement atomicity while not immediately raising an alignment
exception.

Signed-off-by: Richard Henderson
---
 tcg/i386/tcg-target.c.inc | 332 ++++++++++++++++----------------------
 1 file changed, 142 insertions(+), 190 deletions(-)

diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
index 5ddbbbaf18..eb93807b5f 100644
--- a/tcg/i386/tcg-target.c.inc
+++ b/tcg/i386/tcg-target.c.inc
@@ -1746,7 +1746,6 @@ tcg_out_testi(TCGContext *s, TCGReg r, uint32_t i)
     }
 }
 
-#if defined(CONFIG_SOFTMMU)
 /*
  * helper signature: helper_ld*_mmu(CPUState *env, target_ulong addr,
  *                                  int mmu_idx, uintptr_t ra)
@@ -1769,108 +1768,6 @@ static void * const qemu_st_helpers[MO_SIZE + 1] = {
     [MO_UQ] = helper_stq_mmu,
 };
 
-/* Perform the TLB load and compare.
-
-   Inputs:
-   ADDRLO and ADDRHI contain the low and high part of the address.
-
-   MEM_INDEX and S_BITS are the memory context and log2 size of the load.
-
-   WHICH is the offset into the CPUTLBEntry structure of the slot to read.
-   This should be offsetof addr_read or addr_write.
-
-   Outputs:
-   LABEL_PTRS is filled with 1 (32-bit addresses) or 2 (64-bit addresses)
-   positions of the displacements of forward jumps to the TLB miss case.
-
-   Second argument register is loaded with the low part of the address.
-   In the TLB hit case, it has been adjusted as indicated by the TLB
-   and so is a host address.  In the TLB miss case, it continues to
-   hold a guest address.
-
-   First argument register is clobbered.  */
-
-static inline void tcg_out_tlb_load(TCGContext *s, TCGReg addrlo, TCGReg addrhi,
-                                    int mem_index, MemOp opc,
-                                    tcg_insn_unit **label_ptr, int which)
-{
-    const TCGReg r0 = TCG_REG_L0;
-    const TCGReg r1 = TCG_REG_L1;
-    TCGType ttype = TCG_TYPE_I32;
-    TCGType tlbtype = TCG_TYPE_I32;
-    int trexw = 0, hrexw = 0, tlbrexw = 0;
-    unsigned a_bits = get_alignment_bits(opc);
-    unsigned s_bits = opc & MO_SIZE;
-    unsigned a_mask = (1 << a_bits) - 1;
-    unsigned s_mask = (1 << s_bits) - 1;
-    target_ulong tlb_mask;
-
-    if (TCG_TARGET_REG_BITS == 64) {
-        if (TARGET_LONG_BITS == 64) {
-            ttype = TCG_TYPE_I64;
-            trexw = P_REXW;
-        }
-        if (TCG_TYPE_PTR == TCG_TYPE_I64) {
-            hrexw = P_REXW;
-            if (TARGET_PAGE_BITS + CPU_TLB_DYN_MAX_BITS > 32) {
-                tlbtype = TCG_TYPE_I64;
-                tlbrexw = P_REXW;
-            }
-        }
-    }
-
-    tcg_out_mov(s, tlbtype, r0, addrlo);
-    tcg_out_shifti(s, SHIFT_SHR + tlbrexw, r0,
-                   TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
-
-    tcg_out_modrm_offset(s, OPC_AND_GvEv + trexw, r0, TCG_AREG0,
-                         TLB_MASK_TABLE_OFS(mem_index) +
-                         offsetof(CPUTLBDescFast, mask));
-
-    tcg_out_modrm_offset(s, OPC_ADD_GvEv + hrexw, r0, TCG_AREG0,
-                         TLB_MASK_TABLE_OFS(mem_index) +
-                         offsetof(CPUTLBDescFast, table));
-
-    /* If the required alignment is at least as large as the access, simply
-       copy the address and mask.  For lesser alignments, check that we don't
-       cross pages for the complete access. */
-    if (a_bits >= s_bits) {
-        tcg_out_mov(s, ttype, r1, addrlo);
-    } else {
-        tcg_out_modrm_offset(s, OPC_LEA + trexw, r1, addrlo, s_mask - a_mask);
-    }
-    tlb_mask = (target_ulong)TARGET_PAGE_MASK | a_mask;
-    tgen_arithi(s, ARITH_AND + trexw, r1, tlb_mask, 0);
-
-    /* cmp 0(r0), r1 */
-    tcg_out_modrm_offset(s, OPC_CMP_GvEv + trexw, r1, r0, which);
-
-    /* Prepare for both the fast path add of the tlb addend, and the slow
-       path function argument setup. */
-    tcg_out_mov(s, ttype, r1, addrlo);
-
-    /* jne slow_path */
-    tcg_out_opc(s, OPC_JCC_long + JCC_JNE, 0, 0, 0);
-    label_ptr[0] = s->code_ptr;
-    s->code_ptr += 4;
-
-    if (TARGET_LONG_BITS > TCG_TARGET_REG_BITS) {
-        /* cmp 4(r0), addrhi */
-        tcg_out_modrm_offset(s, OPC_CMP_GvEv, addrhi, r0, which + 4);
-
-        /* jne slow_path */
-        tcg_out_opc(s, OPC_JCC_long + JCC_JNE, 0, 0, 0);
-        label_ptr[1] = s->code_ptr;
-        s->code_ptr += 4;
-    }
-
-    /* TLB Hit.  */
-
-    /* add addend(r0), r1 */
-    tcg_out_modrm_offset(s, OPC_ADD_GvEv + hrexw, r1, r0,
-                         offsetof(CPUTLBEntry, addend));
-}
-
 /*
  * Record the context of a call to the out of line helper code for the slow path
  * for a load or store, so that we can later generate the correct helper code
@@ -1893,9 +1790,7 @@ static void add_qemu_ldst_label(TCGContext *s, bool is_ld, bool is_64,
     label->addrhi_reg = addrhi;
     label->raddr = tcg_splitwx_to_rx(raddr);
     label->label_ptr[0] = label_ptr[0];
-    if (TARGET_LONG_BITS > TCG_TARGET_REG_BITS) {
-        label->label_ptr[1] = label_ptr[1];
-    }
+    label->label_ptr[1] = label_ptr[1];
 }
 
 /* Move src1 to dst1 and src2 to dst2, minding possible overlap. */
@@ -1929,7 +1824,7 @@ static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
 
     /* resolve label address */
     tcg_patch32(label_ptr[0], s->code_ptr - label_ptr[0] - 4);
-    if (TARGET_LONG_BITS > TCG_TARGET_REG_BITS) {
+    if (label_ptr[1]) {
         tcg_patch32(label_ptr[1], s->code_ptr - label_ptr[1] - 4);
     }
 
@@ -1952,8 +1847,9 @@ static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
         tcg_out_sti(s, TCG_TYPE_PTR, (uintptr_t)l->raddr, TCG_REG_ESP, ofs);
     } else {
+        tcg_out_mov(s, TCG_TYPE_TL, tcg_target_call_iarg_regs[1],
+                    l->addrlo_reg);
         tcg_out_mov(s, TCG_TYPE_PTR, tcg_target_call_iarg_regs[0], TCG_AREG0);
-        /* The second argument is already loaded with addrlo. */
         tcg_out_movi(s, TCG_TYPE_I32, tcg_target_call_iarg_regs[2], oi);
         tcg_out_movi(s, TCG_TYPE_PTR, tcg_target_call_iarg_regs[3],
                      (uintptr_t)l->raddr);
@@ -2010,7 +1906,7 @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
 
     /* resolve label address */
     tcg_patch32(label_ptr[0], s->code_ptr - label_ptr[0] - 4);
-    if (TARGET_LONG_BITS > TCG_TARGET_REG_BITS) {
+    if (label_ptr[1]) {
        tcg_patch32(label_ptr[1], s->code_ptr - label_ptr[1] - 4);
     }
 
@@ -2043,10 +1939,11 @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
         tcg_out_movi(s, TCG_TYPE_PTR, retaddr, (uintptr_t)l->raddr);
         tcg_out_st(s, TCG_TYPE_PTR, retaddr, TCG_REG_ESP, ofs);
     } else {
+        tcg_out_mov2(s, TCG_TYPE_TL,
+                     tcg_target_call_iarg_regs[1], l->addrlo_reg,
+                     s_bits == MO_64 ? TCG_TYPE_I64 : TCG_TYPE_I32,
+                     tcg_target_call_iarg_regs[2], l->datalo_reg);
         tcg_out_mov(s, TCG_TYPE_PTR, tcg_target_call_iarg_regs[0], TCG_AREG0);
-        /* The second argument is already loaded with addrlo. */
-        tcg_out_mov(s, (s_bits == MO_64 ? TCG_TYPE_I64 : TCG_TYPE_I32),
-                    tcg_target_call_iarg_regs[2], l->datalo_reg);
 
         tcg_out_movi(s, TCG_TYPE_I32, tcg_target_call_iarg_regs[3], oi);
 
         if (ARRAY_SIZE(tcg_target_call_iarg_regs) > 4) {
@@ -2065,72 +1962,129 @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
     tcg_out_jmp(s, qemu_st_helpers[opc & MO_SIZE]);
     return true;
 }
+
+#if defined(CONFIG_SOFTMMU)
+/*
+ * Perform the TLB load and compare.
+ *
+ * Inputs:
+ * ADDRLO and ADDRHI contain the low and high part of the address.
+ *
+ * MEM_INDEX and S_BITS are the memory context and log2 size of the load.
+ *
+ * WHICH is the offset into the CPUTLBEntry structure of the slot to read.
+ * This should be offsetof addr_read or addr_write.
+ *
+ * Outputs:
+ * LABEL_PTRS is filled with 1 (32-bit addresses) or 2 (64-bit addresses)
+ * positions of the displacements of forward jumps to the TLB miss case.
+ *
+ * Second argument register is loaded with the low part of the address.
+ * In the TLB hit case, it has been adjusted as indicated by the TLB
+ * and so is a host address.  In the TLB miss case, it continues to
+ * hold a guest address.
+ *
+ * First argument register is clobbered.
+ */
+static void tcg_out_tlb_load(TCGContext *s, TCGReg addrlo, TCGReg addrhi,
+                             int mem_index, MemOp opc,
+                             tcg_insn_unit **label_ptr, int which)
+{
+    const TCGReg r0 = TCG_REG_L0;
+    const TCGReg r1 = TCG_REG_L1;
+    TCGType ttype = TCG_TYPE_I32;
+    TCGType tlbtype = TCG_TYPE_I32;
+    int trexw = 0, hrexw = 0, tlbrexw = 0;
+    unsigned a_bits = get_alignment_bits(opc);
+    unsigned s_bits = opc & MO_SIZE;
+    unsigned a_mask = (1 << a_bits) - 1;
+    unsigned s_mask = (1 << s_bits) - 1;
+    target_ulong tlb_mask;
+
+    if (TCG_TARGET_REG_BITS == 64) {
+        if (TARGET_LONG_BITS == 64) {
+            ttype = TCG_TYPE_I64;
+            trexw = P_REXW;
+        }
+        if (TCG_TYPE_PTR == TCG_TYPE_I64) {
+            hrexw = P_REXW;
+            if (TARGET_PAGE_BITS + CPU_TLB_DYN_MAX_BITS > 32) {
+                tlbtype = TCG_TYPE_I64;
+                tlbrexw = P_REXW;
+            }
+        }
+    }
+
+    tcg_out_mov(s, tlbtype, r0, addrlo);
+    tcg_out_shifti(s, SHIFT_SHR + tlbrexw, r0,
+                   TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
+
+    tcg_out_modrm_offset(s, OPC_AND_GvEv + trexw, r0, TCG_AREG0,
+                         TLB_MASK_TABLE_OFS(mem_index) +
+                         offsetof(CPUTLBDescFast, mask));
+
+    tcg_out_modrm_offset(s, OPC_ADD_GvEv + hrexw, r0, TCG_AREG0,
+                         TLB_MASK_TABLE_OFS(mem_index) +
+                         offsetof(CPUTLBDescFast, table));
+
+    /*
+     * If the required alignment is at least as large as the access, simply
+     * copy the address and mask.  For lesser alignments, check that we don't
+     * cross pages for the complete access.
+     */
+    if (a_bits >= s_bits) {
+        tcg_out_mov(s, ttype, r1, addrlo);
+    } else {
+        tcg_out_modrm_offset(s, OPC_LEA + trexw, r1, addrlo, s_mask - a_mask);
+    }
+    tlb_mask = (target_ulong)TARGET_PAGE_MASK | a_mask;
+    tgen_arithi(s, ARITH_AND + trexw, r1, tlb_mask, 0);
+
+    /* cmp 0(r0), r1 */
+    tcg_out_modrm_offset(s, OPC_CMP_GvEv + trexw, r1, r0, which);
+
+    /*
+     * Prepare for both the fast path add of the tlb addend, and the slow
+     * path function argument setup.
+     */
+    tcg_out_mov(s, ttype, r1, addrlo);
+
+    /* jne slow_path */
+    tcg_out_opc(s, OPC_JCC_long + JCC_JNE, 0, 0, 0);
+    label_ptr[0] = s->code_ptr;
+    s->code_ptr += 4;
+
+    if (TARGET_LONG_BITS > TCG_TARGET_REG_BITS) {
+        /* cmp 4(r0), addrhi */
+        tcg_out_modrm_offset(s, OPC_CMP_GvEv, addrhi, r0, which + 4);
+
+        /* jne slow_path */
+        tcg_out_opc(s, OPC_JCC_long + JCC_JNE, 0, 0, 0);
+        label_ptr[1] = s->code_ptr;
+        s->code_ptr += 4;
+    }
+
+    /* TLB Hit.  */
+
+    /* add addend(r0), r1 */
+    tcg_out_modrm_offset(s, OPC_ADD_GvEv + hrexw, r1, r0,
+                         offsetof(CPUTLBEntry, addend));
+}
+
 #else
-static void tcg_out_test_alignment(TCGContext *s, bool is_ld, TCGReg addrlo,
-                                   TCGReg addrhi, unsigned a_bits)
+static void tcg_out_test_alignment(TCGContext *s, TCGReg addrlo,
                                    unsigned a_bits, tcg_insn_unit **label_ptr)
 {
     unsigned a_mask = (1 << a_bits) - 1;
-    TCGLabelQemuLdst *label;
 
     tcg_out_testi(s, addrlo, a_mask);
     /* jne slow_path */
     tcg_out_opc(s, OPC_JCC_long + JCC_JNE, 0, 0, 0);
-
-    label = new_ldst_label(s);
-    label->is_ld = is_ld;
-    label->addrlo_reg = addrlo;
-    label->addrhi_reg = addrhi;
-    label->raddr = tcg_splitwx_to_rx(s->code_ptr + 4);
-    label->label_ptr[0] = s->code_ptr;
-
+    *label_ptr = s->code_ptr;
     s->code_ptr += 4;
 }
 
-static bool tcg_out_fail_alignment(TCGContext *s, TCGLabelQemuLdst *l)
-{
-    /* resolve label address */
-    tcg_patch32(l->label_ptr[0], s->code_ptr - l->label_ptr[0] - 4);
-
-    if (TCG_TARGET_REG_BITS == 32) {
-        int ofs = 0;
-
-        tcg_out_st(s, TCG_TYPE_PTR, TCG_AREG0, TCG_REG_ESP, ofs);
-        ofs += 4;
-
-        tcg_out_st(s, TCG_TYPE_I32, l->addrlo_reg, TCG_REG_ESP, ofs);
-        ofs += 4;
-        if (TARGET_LONG_BITS == 64) {
-            tcg_out_st(s, TCG_TYPE_I32, l->addrhi_reg, TCG_REG_ESP, ofs);
-            ofs += 4;
-        }
-
-        tcg_out_pushi(s, (uintptr_t)l->raddr);
-    } else {
-        tcg_out_mov(s, TCG_TYPE_TL, tcg_target_call_iarg_regs[1],
-                    l->addrlo_reg);
-        tcg_out_mov(s, TCG_TYPE_PTR, tcg_target_call_iarg_regs[0], TCG_AREG0);
-
-        tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_RAX, (uintptr_t)l->raddr);
-        tcg_out_push(s, TCG_REG_RAX);
-    }
-
-    /* "Tail call" to the helper, with the return address back inline. */
-    tcg_out_jmp(s, (const void *)(l->is_ld ? helper_unaligned_ld
-                                  : helper_unaligned_st));
-    return true;
-}
-
-static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
-{
-    return tcg_out_fail_alignment(s, l);
-}
-
-static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
-{
-    return tcg_out_fail_alignment(s, l);
-}
-
 #if TCG_TARGET_REG_BITS == 32
 # define x86_guest_base_seg     0
 # define x86_guest_base_index   -1
@@ -2165,7 +2119,7 @@ static inline int setup_guest_base_seg(void)
     return 0;
 }
 # endif
-#endif
+#endif /* TCG_TARGET_REG_BITS == 32 */
 #endif /* SOFTMMU */
 
 static void tcg_out_qemu_ld_direct(TCGContext *s, TCGReg datalo, TCGReg datahi,
@@ -2272,10 +2226,8 @@ static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, bool is64)
     TCGReg addrhi __attribute__((unused));
     MemOpIdx oi;
     MemOp opc;
-#if defined(CONFIG_SOFTMMU)
-    int mem_index;
-    tcg_insn_unit *label_ptr[2];
-#else
+    tcg_insn_unit *label_ptr[2] = { };
+#ifndef CONFIG_SOFTMMU
     unsigned a_bits;
 #endif
 
@@ -2287,26 +2239,27 @@ static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, bool is64)
     opc = get_memop(oi);
 
 #if defined(CONFIG_SOFTMMU)
-    mem_index = get_mmuidx(oi);
-
-    tcg_out_tlb_load(s, addrlo, addrhi, mem_index, opc,
+    tcg_out_tlb_load(s, addrlo, addrhi, get_mmuidx(oi), opc,
                      label_ptr, offsetof(CPUTLBEntry, addr_read));
 
     /* TLB Hit.  */
     tcg_out_qemu_ld_direct(s, datalo, datahi, TCG_REG_L1, -1, 0, 0, is64, opc);
 
     /* Record the current context of a load into ldst label */
-    add_qemu_ldst_label(s, true, is64, oi, datalo, datahi, addrlo, addrhi,
-                        s->code_ptr, label_ptr);
+    add_qemu_ldst_label(s, true, is64, oi, datalo, datahi,
+                        TCG_REG_L1, addrhi, s->code_ptr, label_ptr);
 #else
     a_bits = get_alignment_bits(opc);
     if (a_bits) {
-        tcg_out_test_alignment(s, true, addrlo, addrhi, a_bits);
+        tcg_out_test_alignment(s, addrlo, a_bits, label_ptr);
     }
-
     tcg_out_qemu_ld_direct(s, datalo, datahi, addrlo, x86_guest_base_index,
                            x86_guest_base_offset, x86_guest_base_seg,
                            is64, opc);
+    if (a_bits) {
+        add_qemu_ldst_label(s, true, is64, oi, datalo, datahi,
+                            addrlo, addrhi, s->code_ptr, label_ptr);
+    }
 #endif
 }
 
@@ -2368,10 +2321,8 @@ static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, bool is64)
     TCGReg addrhi __attribute__((unused));
     MemOpIdx oi;
     MemOp opc;
-#if defined(CONFIG_SOFTMMU)
-    int mem_index;
-    tcg_insn_unit *label_ptr[2];
-#else
+    tcg_insn_unit *label_ptr[2] = { };
+#ifndef CONFIG_SOFTMMU
     unsigned a_bits;
 #endif
 
@@ -2383,25 +2334,26 @@ static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, bool is64)
     opc = get_memop(oi);
 
 #if defined(CONFIG_SOFTMMU)
-    mem_index = get_mmuidx(oi);
-
-    tcg_out_tlb_load(s, addrlo, addrhi, mem_index, opc,
+    tcg_out_tlb_load(s, addrlo, addrhi, get_mmuidx(oi), opc,
                      label_ptr, offsetof(CPUTLBEntry, addr_write));
 
     /* TLB Hit.  */
     tcg_out_qemu_st_direct(s, datalo, datahi, TCG_REG_L1, -1, 0, 0, opc);
 
     /* Record the current context of a store into ldst label */
-    add_qemu_ldst_label(s, false, is64, oi, datalo, datahi, addrlo, addrhi,
-                        s->code_ptr, label_ptr);
+    add_qemu_ldst_label(s, false, is64, oi, datalo, datahi,
+                        TCG_REG_L1, addrhi, s->code_ptr, label_ptr);
 #else
     a_bits = get_alignment_bits(opc);
     if (a_bits) {
-        tcg_out_test_alignment(s, false, addrlo, addrhi, a_bits);
+        tcg_out_test_alignment(s, addrlo, a_bits, label_ptr);
     }
-
     tcg_out_qemu_st_direct(s, datalo, datahi, addrlo, x86_guest_base_index,
                            x86_guest_base_offset, x86_guest_base_seg, opc);
+    if (a_bits) {
+        add_qemu_ldst_label(s, false, is64, oi, datalo, datahi,
+                            addrlo, addrhi, s->code_ptr, label_ptr);
+    }
 #endif
 }
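The shape of the user-only fast path after this change can be pictured as:
test alignment, and on failure branch to a slow path that completes the
access instead of immediately raising a fault. A minimal sketch of that
structure in plain C (illustrative only, not QEMU code; the real slow
path tail-calls the MMU load/store helpers):

    #include <stdint.h>
    #include <string.h>

    /* Slow path: completes the access rather than raising a fault;
     * memcpy performs byte accesses and needs no alignment. */
    static uint32_t slow_ldl(const void *p)
    {
        uint32_t v;
        memcpy(&v, p, sizeof(v));
        return v;
    }

    static uint32_t fast_ldl(const void *p)
    {
        if (((uintptr_t)p & 3) != 0) {
            return slow_ldl(p);       /* out-of-line fallback */
        }
        return *(const uint32_t *)p;  /* aligned fast path */
    }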
From patchwork Fri Nov 18 09:47:49 2022
X-Patchwork-Id: 626107
fMtabeznV9OOWmjvL2Sme3Mbmq/RaSdFtzN2gVcWlll1qrCtDy9Y7k88jrIfAcZBM5ab GIneqsD5bjJA+9ToTcM2GLt/Kguju5GpQ4HcLYw2bmyWUrwHmiq0t+5D0RnRbZyuW2vV eYqw== X-Gm-Message-State: ANoB5pkLXyxy4dvAL7Fm15CZAdxsCPBdz3OgHo/Y1REwYUSDBycaEKUU 60hdjK2OJB3qjLT+ldEtQntShuLtSuRcZg== X-Received: by 2002:a17:90a:fd85:b0:218:4953:58a4 with SMTP id cx5-20020a17090afd8500b00218495358a4mr13130686pjb.57.1668764921326; Fri, 18 Nov 2022 01:48:41 -0800 (PST) Received: from stoup.. ([2602:47:d48a:1201:90b2:345f:bf0a:c412]) by smtp.gmail.com with ESMTPSA id n12-20020a170902e54c00b0018862bb3976sm3115421plf.308.2022.11.18.01.48.39 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 18 Nov 2022 01:48:40 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Subject: [PATCH for-8.0 24/29] tcg/i386: Replace is64 with type in qemu_ld/st routines Date: Fri, 18 Nov 2022 01:47:49 -0800 Message-Id: <20221118094754.242910-25-richard.henderson@linaro.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20221118094754.242910-1-richard.henderson@linaro.org> References: <20221118094754.242910-1-richard.henderson@linaro.org> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::1030; envelope-from=richard.henderson@linaro.org; helo=mail-pj1-x1030.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: qemu-devel-bounces+patch=linaro.org@nongnu.org Prepare for TCG_TYPE_I128 by not using a boolean. Signed-off-by: Richard Henderson Reviewed-by: Philippe Mathieu-Daudé --- tcg/i386/tcg-target.c.inc | 54 ++++++++++++++++++++++++++------------- 1 file changed, 36 insertions(+), 18 deletions(-) diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc index eb93807b5f..e38f08bd12 100644 --- a/tcg/i386/tcg-target.c.inc +++ b/tcg/i386/tcg-target.c.inc @@ -1772,7 +1772,7 @@ static void * const qemu_st_helpers[MO_SIZE + 1] = { * Record the context of a call to the out of line helper code for the slow path * for a load or store, so that we can later generate the correct helper code */ -static void add_qemu_ldst_label(TCGContext *s, bool is_ld, bool is_64, +static void add_qemu_ldst_label(TCGContext *s, bool is_ld, TCGType type, MemOpIdx oi, TCGReg datalo, TCGReg datahi, TCGReg addrlo, TCGReg addrhi, @@ -1783,7 +1783,7 @@ static void add_qemu_ldst_label(TCGContext *s, bool is_ld, bool is_64, label->is_ld = is_ld; label->oi = oi; - label->type = is_64 ? TCG_TYPE_I64 : TCG_TYPE_I32; + label->type = type; label->datalo_reg = datalo; label->datahi_reg = datahi; label->addrlo_reg = addrlo; @@ -2124,10 +2124,10 @@ static inline int setup_guest_base_seg(void) static void tcg_out_qemu_ld_direct(TCGContext *s, TCGReg datalo, TCGReg datahi, TCGReg base, int index, intptr_t ofs, - int seg, bool is64, MemOp memop) + int seg, TCGType type, MemOp memop) { bool use_movbe = false; - int rexw = is64 * P_REXW; + int rexw = (type == TCG_TYPE_I32 ? 0 : P_REXW); int movop = OPC_MOVL_GvEv; /* Do big-endian loads with movbe. 
diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
index eb93807b5f..e38f08bd12 100644
--- a/tcg/i386/tcg-target.c.inc
+++ b/tcg/i386/tcg-target.c.inc
@@ -1772,7 +1772,7 @@ static void * const qemu_st_helpers[MO_SIZE + 1] = {
  * Record the context of a call to the out of line helper code for the slow path
  * for a load or store, so that we can later generate the correct helper code
  */
-static void add_qemu_ldst_label(TCGContext *s, bool is_ld, bool is_64,
+static void add_qemu_ldst_label(TCGContext *s, bool is_ld, TCGType type,
                                 MemOpIdx oi,
                                 TCGReg datalo, TCGReg datahi,
                                 TCGReg addrlo, TCGReg addrhi,
@@ -1783,7 +1783,7 @@ static void add_qemu_ldst_label(TCGContext *s, bool is_ld, bool is_64,
 
     label->is_ld = is_ld;
     label->oi = oi;
-    label->type = is_64 ? TCG_TYPE_I64 : TCG_TYPE_I32;
+    label->type = type;
     label->datalo_reg = datalo;
     label->datahi_reg = datahi;
     label->addrlo_reg = addrlo;
@@ -2124,10 +2124,10 @@ static inline int setup_guest_base_seg(void)
 
 static void tcg_out_qemu_ld_direct(TCGContext *s, TCGReg datalo, TCGReg datahi,
                                    TCGReg base, int index, intptr_t ofs,
-                                   int seg, bool is64, MemOp memop)
+                                   int seg, TCGType type, MemOp memop)
 {
     bool use_movbe = false;
-    int rexw = is64 * P_REXW;
+    int rexw = (type == TCG_TYPE_I32 ? 0 : P_REXW);
     int movop = OPC_MOVL_GvEv;
 
     /* Do big-endian loads with movbe.  */
@@ -2220,7 +2220,7 @@ static void tcg_out_qemu_ld_direct(TCGContext *s, TCGReg datalo, TCGReg datahi,
 
 /* XXX: qemu_ld and qemu_st could be modified to clobber only EDX and
    EAX. It will be useful once fixed registers globals are less
    common. */
-static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, bool is64)
+static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, TCGType type)
 {
     TCGReg datalo, datahi, addrlo;
     TCGReg addrhi __attribute__((unused));
@@ -2232,7 +2232,16 @@ static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, bool is64)
 #endif
 
     datalo = *args++;
-    datahi = (TCG_TARGET_REG_BITS == 32 && is64 ? *args++ : 0);
+    switch (type) {
+    case TCG_TYPE_I32:
+        datahi = 0;
+        break;
+    case TCG_TYPE_I64:
+        datahi = (TCG_TARGET_REG_BITS == 32 ? *args++ : 0);
+        break;
+    default:
+        g_assert_not_reached();
+    }
     addrlo = *args++;
     addrhi = (TARGET_LONG_BITS > TCG_TARGET_REG_BITS ? *args++ : 0);
     oi = *args++;
@@ -2243,10 +2252,10 @@ static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, bool is64)
                      label_ptr, offsetof(CPUTLBEntry, addr_read));
 
     /* TLB Hit.  */
-    tcg_out_qemu_ld_direct(s, datalo, datahi, TCG_REG_L1, -1, 0, 0, is64, opc);
+    tcg_out_qemu_ld_direct(s, datalo, datahi, TCG_REG_L1, -1, 0, 0, type, opc);
 
     /* Record the current context of a load into ldst label */
-    add_qemu_ldst_label(s, true, is64, oi, datalo, datahi,
+    add_qemu_ldst_label(s, true, type, oi, datalo, datahi,
                         TCG_REG_L1, addrhi, s->code_ptr, label_ptr);
 #else
     a_bits = get_alignment_bits(opc);
@@ -2255,9 +2264,9 @@ static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, bool is64)
     }
     tcg_out_qemu_ld_direct(s, datalo, datahi, addrlo, x86_guest_base_index,
                            x86_guest_base_offset, x86_guest_base_seg,
-                           is64, opc);
+                           type, opc);
     if (a_bits) {
-        add_qemu_ldst_label(s, true, is64, oi, datalo, datahi,
+        add_qemu_ldst_label(s, true, type, oi, datalo, datahi,
                             addrlo, addrhi, s->code_ptr, label_ptr);
     }
 #endif
@@ -2315,7 +2324,7 @@ static void tcg_out_qemu_st_direct(TCGContext *s, TCGReg datalo, TCGReg datahi,
     }
 }
 
-static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, bool is64)
+static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, TCGType type)
 {
     TCGReg datalo, datahi, addrlo;
     TCGReg addrhi __attribute__((unused));
@@ -2327,7 +2336,16 @@ static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, bool is64)
 #endif
 
     datalo = *args++;
-    datahi = (TCG_TARGET_REG_BITS == 32 && is64 ? *args++ : 0);
+    switch (type) {
+    case TCG_TYPE_I32:
+        datahi = 0;
+        break;
+    case TCG_TYPE_I64:
+        datahi = (TCG_TARGET_REG_BITS == 32 ? *args++ : 0);
+        break;
+    default:
+        g_assert_not_reached();
+    }
     addrlo = *args++;
     addrhi = (TARGET_LONG_BITS > TCG_TARGET_REG_BITS ? *args++ : 0);
     oi = *args++;
@@ -2341,7 +2359,7 @@ static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, bool is64)
     tcg_out_qemu_st_direct(s, datalo, datahi, TCG_REG_L1, -1, 0, 0, opc);
 
     /* Record the current context of a store into ldst label */
-    add_qemu_ldst_label(s, false, is64, oi, datalo, datahi,
+    add_qemu_ldst_label(s, false, type, oi, datalo, datahi,
                         TCG_REG_L1, addrhi, s->code_ptr, label_ptr);
 #else
     a_bits = get_alignment_bits(opc);
@@ -2351,7 +2369,7 @@ static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, bool is64)
     tcg_out_qemu_st_direct(s, datalo, datahi, addrlo, x86_guest_base_index,
                            x86_guest_base_offset, x86_guest_base_seg, opc);
     if (a_bits) {
-        add_qemu_ldst_label(s, false, is64, oi, datalo, datahi,
+        add_qemu_ldst_label(s, false, type, oi, datalo, datahi,
                             addrlo, addrhi, s->code_ptr, label_ptr);
     }
 #endif
@@ -2649,17 +2667,17 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         break;
 
     case INDEX_op_qemu_ld_i32:
-        tcg_out_qemu_ld(s, args, 0);
+        tcg_out_qemu_ld(s, args, TCG_TYPE_I32);
         break;
     case INDEX_op_qemu_ld_i64:
-        tcg_out_qemu_ld(s, args, 1);
+        tcg_out_qemu_ld(s, args, TCG_TYPE_I64);
         break;
     case INDEX_op_qemu_st_i32:
     case INDEX_op_qemu_st8_i32:
-        tcg_out_qemu_st(s, args, 0);
+        tcg_out_qemu_st(s, args, TCG_TYPE_I32);
         break;
     case INDEX_op_qemu_st_i64:
-        tcg_out_qemu_st(s, args, 1);
+        tcg_out_qemu_st(s, args, TCG_TYPE_I64);
         break;
 
     OP_32_64(mulu2):

From patchwork Fri Nov 18 09:47:50 2022
X-Patchwork-Submitter: Richard Henderson <richard.henderson@linaro.org>
X-Patchwork-Id: 626125
From: Richard Henderson <richard.henderson@linaro.org>
To: qemu-devel@nongnu.org
Subject: [PATCH for-8.0 25/29] tcg/i386: Mark Win64 call-saved vector regs as reserved
Date: Fri, 18 Nov 2022 01:47:50 -0800
Message-Id: <20221118094754.242910-26-richard.henderson@linaro.org>
In-Reply-To: <20221118094754.242910-1-richard.henderson@linaro.org>

While we do not include these in tcg_target_reg_alloc_order, and
therefore they ought never be allocated, it seems safer to mark
them reserved as well.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé
---
 tcg/i386/tcg-target.c.inc | 13 +++++++++++++
 1 file changed, 13 insertions(+)
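The hunk below spells the ten registers out to match the surrounding
style. An equivalent compact form, assuming the TCG_REG_XMM6 through
TCG_REG_XMM15 enumerators are consecutive (true in this backend, but an
assumption worth checking before copying elsewhere):

    #ifdef _WIN64
        /*
         * XMM6..XMM15 are call-saved in the Win64 ABI, and TCG's prologue
         * does not save them, so they must never reach the allocator.
         */
        for (int r = TCG_REG_XMM6; r <= TCG_REG_XMM15; r++) {
            tcg_regset_set_reg(s->reserved_regs, r);
        }
    #endif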
diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
index e38f08bd12..e04818eef6 100644
--- a/tcg/i386/tcg-target.c.inc
+++ b/tcg/i386/tcg-target.c.inc
@@ -4224,6 +4224,19 @@ static void tcg_target_init(TCGContext *s)
 
     s->reserved_regs = 0;
     tcg_regset_set_reg(s->reserved_regs, TCG_REG_CALL_STACK);
+#ifdef _WIN64
+    /* These are call saved, and we don't save them, so don't use them. */
+    tcg_regset_set_reg(s->reserved_regs, TCG_REG_XMM6);
+    tcg_regset_set_reg(s->reserved_regs, TCG_REG_XMM7);
+    tcg_regset_set_reg(s->reserved_regs, TCG_REG_XMM8);
+    tcg_regset_set_reg(s->reserved_regs, TCG_REG_XMM9);
+    tcg_regset_set_reg(s->reserved_regs, TCG_REG_XMM10);
+    tcg_regset_set_reg(s->reserved_regs, TCG_REG_XMM11);
+    tcg_regset_set_reg(s->reserved_regs, TCG_REG_XMM12);
+    tcg_regset_set_reg(s->reserved_regs, TCG_REG_XMM13);
+    tcg_regset_set_reg(s->reserved_regs, TCG_REG_XMM14);
+    tcg_regset_set_reg(s->reserved_regs, TCG_REG_XMM15);
+#endif
 }
 
 typedef struct {

From patchwork Fri Nov 18 09:47:51 2022
X-Patchwork-Submitter: Richard Henderson <richard.henderson@linaro.org>
X-Patchwork-Id: 626117
From: Richard Henderson <richard.henderson@linaro.org>
To: qemu-devel@nongnu.org
Subject: [PATCH for-8.0 26/29] tcg/i386: Examine MemOp for atomicity and alignment
Date: Fri, 18 Nov 2022 01:47:51 -0800
Message-Id: <20221118094754.242910-27-richard.henderson@linaro.org>
In-Reply-To: <20221118094754.242910-1-richard.henderson@linaro.org>

No change to the ultimate load/store routines yet, so some atomicity
conditions are not yet honored, but this plumbs the alignment change
through the adjacent functions.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/i386/tcg-target.c.inc | 128 ++++++++++++++++++++++++++++++--------
 1 file changed, 101 insertions(+), 27 deletions(-)
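To make the new selection concrete, here is one worked case of the
helper added below (my reading of the code, not text from the patch;
MO_ATMAX_4 is assumed to be the "at most 4-byte subobjects" encoding
introduced earlier in this series):

    MemOp align;
    MemOp atmax = atom_and_align_for_opc(
        s, MO_64 | MO_ATOM_WITHIN16 | MO_ATMAX_4, &align);
    /*
     * On an x86_64 host with a parallel TB and no MO_ALIGN bits set:
     *   align = MO_8               (nothing requested by the guest)
     *   atom  = MO_ATOM_WITHIN16
     *   atmax = MO_32              (decoded from MO_ATMAX_4)
     *   WITHIN16 arm: atsub = MO_32, atmax widened to size = MO_64
     *   final test: align < atmax, atsub != MO_8, size != MO_128,
     *               so align is forced up to MO_64
     * Result: atmax == MO_64 and align == MO_64, i.e. the inline fast
     * path insists on full 8-byte alignment and a misaligned access is
     * diagnosed in the slow-path helper, as the function comment says.
     */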
diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
index e04818eef6..7dc56040d2 100644
--- a/tcg/i386/tcg-target.c.inc
+++ b/tcg/i386/tcg-target.c.inc
@@ -1746,6 +1746,83 @@ tcg_out_testi(TCGContext *s, TCGReg r, uint32_t i)
     }
 }
 
+/*
+ * Return the alignment and atomicity to use for the inline fast path
+ * for the given memory operation. The alignment may be larger than
+ * that specified in @opc, and the correct alignment will be diagnosed
+ * by the slow path helper.
+ */
+static MemOp atom_and_align_for_opc(TCGContext *s, MemOp opc, MemOp *out_al)
+{
+    MemOp align = get_alignment_bits(opc);
+    MemOp atom, atmax, atsub, size = opc & MO_SIZE;
+
+    /* When serialized, no further atomicity required.  */
+    if (s->tb_cflags & CF_PARALLEL) {
+        atom = opc & MO_ATOM_MASK;
+    } else {
+        atom = MO_ATOM_NONE;
+    }
+
+    atmax = opc & MO_ATMAX_MASK;
+    if (atmax == MO_ATMAX_SIZE) {
+        atmax = size;
+    } else {
+        atmax = atmax >> MO_ATMAX_SHIFT;
+    }
+
+    switch (atom) {
+    case MO_ATOM_NONE:
+        /* The operation requires no specific atomicity. */
+        atmax = MO_8;
+        atsub = MO_8;
+        break;
+    case MO_ATOM_IFALIGN:
+        /* If unaligned, the subobjects are bytes. */
+        atsub = MO_8;
+        break;
+    case MO_ATOM_WITHIN16:
+        /* If unaligned, there are subobjects if atmax < size. */
+        atsub = (atmax < size ? atmax : MO_8);
+        atmax = size;
+        break;
+    case MO_ATOM_SUBALIGN:
+        /* If unaligned but not odd, there are subobjects up to atmax - 1. */
+        atsub = (atmax == MO_8 ? MO_8 : atmax - 1);
+        break;
+    default:
+        g_assert_not_reached();
+    }
+
+    /*
+     * Per Intel Architecture SDM, Volume 3 Section 8.1.1,
+     * - Pentium family guarantees atomicity of aligned <= 64-bit.
+     * - P6 family guarantees atomicity of unaligned <= 64-bit
+     *   which fit within a cache line.
+     * - AVX guarantees atomicity of aligned 128-bit VMOVDQA (et al).
+     *
+     * There is no language in the Intel manual specifying what happens
+     * with the partial memory operations when crossing a cache line.
+     * When there is required atomicity of subobjects, we must perform
+     * an additional runtime test for alignment and then perform either
+     * the full operation, or two half-sized operations.
+     *
+     * For x86_64, and MO_64, we do not have a scratch register with
+     * which to do this.  Only allow splitting for MO_64 on i386,
+     * where the data is already separated, or MO_128.
+     * Otherwise, require full alignment and fall back to the helper
+     * for the misaligned case.
+     */
+    if (align < atmax
+        && atsub != MO_8
+        && size != (TCG_TARGET_REG_BITS == 64 ? MO_128 : MO_64)) {
+        align = size;
+    }
+
+    *out_al = align;
+    return atmax;
+}
+
 /*
  * helper signature: helper_ld*_mmu(CPUState *env, target_ulong addr,
  *                                  int mmu_idx, uintptr_t ra)
@@ -1987,7 +2064,7 @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
  * First argument register is clobbered.
  */
 static void tcg_out_tlb_load(TCGContext *s, TCGReg addrlo, TCGReg addrhi,
-                             int mem_index, MemOp opc,
+                             int mem_index, MemOp a_bits, MemOp s_bits,
                              tcg_insn_unit **label_ptr, int which)
 {
     const TCGReg r0 = TCG_REG_L0;
@@ -1995,8 +2072,6 @@ static void tcg_out_tlb_load(TCGContext *s, TCGReg addrlo, TCGReg addrhi,
     TCGType ttype = TCG_TYPE_I32;
     TCGType tlbtype = TCG_TYPE_I32;
     int trexw = 0, hrexw = 0, tlbrexw = 0;
-    unsigned a_bits = get_alignment_bits(opc);
-    unsigned s_bits = opc & MO_SIZE;
     unsigned a_mask = (1 << a_bits) - 1;
     unsigned s_mask = (1 << s_bits) - 1;
     target_ulong tlb_mask;
@@ -2124,7 +2199,8 @@ static inline int setup_guest_base_seg(void)
 
 static void tcg_out_qemu_ld_direct(TCGContext *s, TCGReg datalo, TCGReg datahi,
                                    TCGReg base, int index, intptr_t ofs,
-                                   int seg, TCGType type, MemOp memop)
+                                   int seg, TCGType type, MemOp memop,
+                                   MemOp atom, MemOp align)
 {
     bool use_movbe = false;
     int rexw = (type == TCG_TYPE_I32 ? 0 : P_REXW);
@@ -2225,11 +2301,8 @@ static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, TCGType type)
     TCGReg datalo, datahi, addrlo;
     TCGReg addrhi __attribute__((unused));
     MemOpIdx oi;
-    MemOp opc;
+    MemOp opc, atom, align;
     tcg_insn_unit *label_ptr[2] = { };
-#ifndef CONFIG_SOFTMMU
-    unsigned a_bits;
-#endif
 
     datalo = *args++;
     switch (type) {
@@ -2246,26 +2319,27 @@ static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, TCGType type)
     addrhi = (TARGET_LONG_BITS > TCG_TARGET_REG_BITS ? *args++ : 0);
     oi = *args++;
     opc = get_memop(oi);
+    atom = atom_and_align_for_opc(s, opc, &align);
 
 #if defined(CONFIG_SOFTMMU)
-    tcg_out_tlb_load(s, addrlo, addrhi, get_mmuidx(oi), opc,
+    tcg_out_tlb_load(s, addrlo, addrhi, get_mmuidx(oi), align, opc & MO_SIZE,
                      label_ptr, offsetof(CPUTLBEntry, addr_read));
 
     /* TLB Hit.  */
-    tcg_out_qemu_ld_direct(s, datalo, datahi, TCG_REG_L1, -1, 0, 0, type, opc);
+    tcg_out_qemu_ld_direct(s, datalo, datahi, TCG_REG_L1, -1, 0, 0, type,
+                           opc, atom, align);
 
     /* Record the current context of a load into ldst label */
     add_qemu_ldst_label(s, true, type, oi, datalo, datahi,
                         TCG_REG_L1, addrhi, s->code_ptr, label_ptr);
 #else
-    a_bits = get_alignment_bits(opc);
-    if (a_bits) {
-        tcg_out_test_alignment(s, addrlo, a_bits, label_ptr);
+    if (align) {
+        tcg_out_test_alignment(s, addrlo, align, label_ptr);
     }
     tcg_out_qemu_ld_direct(s, datalo, datahi, addrlo, x86_guest_base_index,
                            x86_guest_base_offset, x86_guest_base_seg,
-                           type, opc);
-    if (a_bits) {
+                           type, opc, atom, align);
+    if (align) {
         add_qemu_ldst_label(s, true, type, oi, datalo, datahi,
                             addrlo, addrhi, s->code_ptr, label_ptr);
     }
@@ -2274,7 +2348,8 @@ static void tcg_out_qemu_st_direct(TCGContext *s, TCGReg datalo, TCGReg datahi,
                                    TCGReg base, int index, intptr_t ofs,
-                                   int seg, MemOp memop)
+                                   int seg, MemOp memop,
+                                   MemOp atom, MemOp align)
 {
     bool use_movbe = false;
     int movop = OPC_MOVL_EvGv;
@@ -2329,11 +2404,8 @@ static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, TCGType type)
     TCGReg datalo, datahi, addrlo;
     TCGReg addrhi __attribute__((unused));
     MemOpIdx oi;
-    MemOp opc;
+    MemOp opc, atom, align;
     tcg_insn_unit *label_ptr[2] = { };
-#ifndef CONFIG_SOFTMMU
-    unsigned a_bits;
-#endif
 
     datalo = *args++;
     switch (type) {
@@ -2350,25 +2422,27 @@ static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, TCGType type)
     addrhi = (TARGET_LONG_BITS > TCG_TARGET_REG_BITS ? *args++ : 0);
     oi = *args++;
     opc = get_memop(oi);
+    atom = atom_and_align_for_opc(s, opc, &align);
 
 #if defined(CONFIG_SOFTMMU)
-    tcg_out_tlb_load(s, addrlo, addrhi, get_mmuidx(oi), opc,
+    tcg_out_tlb_load(s, addrlo, addrhi, get_mmuidx(oi), align, opc & MO_SIZE,
                      label_ptr, offsetof(CPUTLBEntry, addr_write));
 
     /* TLB Hit.  */
-    tcg_out_qemu_st_direct(s, datalo, datahi, TCG_REG_L1, -1, 0, 0, opc);
+    tcg_out_qemu_st_direct(s, datalo, datahi, TCG_REG_L1, -1, 0, 0,
+                           opc, atom, align);
 
     /* Record the current context of a store into ldst label */
     add_qemu_ldst_label(s, false, type, oi, datalo, datahi,
                         TCG_REG_L1, addrhi, s->code_ptr, label_ptr);
 #else
-    a_bits = get_alignment_bits(opc);
-    if (a_bits) {
-        tcg_out_test_alignment(s, addrlo, a_bits, label_ptr);
+    if (align) {
+        tcg_out_test_alignment(s, addrlo, align, label_ptr);
     }
     tcg_out_qemu_st_direct(s, datalo, datahi, addrlo, x86_guest_base_index,
-                           x86_guest_base_offset, x86_guest_base_seg, opc);
-    if (a_bits) {
+                           x86_guest_base_offset, x86_guest_base_seg,
+                           opc, atom, align);
+    if (align) {
         add_qemu_ldst_label(s, false, type, oi, datalo, datahi,
                             addrlo, addrhi, s->code_ptr, label_ptr);
     }
 #endif

From patchwork Fri Nov 18 09:47:52 2022
X-Patchwork-Submitter: Richard Henderson <richard.henderson@linaro.org>
X-Patchwork-Id: 626127
From: Richard Henderson <richard.henderson@linaro.org>
To: qemu-devel@nongnu.org
Subject: [PATCH for-8.0 27/29] tcg/i386: Support 128-bit load/store with have_atomic16
Date: Fri, 18 Nov 2022 01:47:52 -0800
Message-Id: <20221118094754.242910-28-richard.henderson@linaro.org>
In-Reply-To: <20221118094754.242910-1-richard.henderson@linaro.org>

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/i386/tcg-target.h     |   3 +-
 tcg/i386/tcg-target.c.inc | 325 +++++++++++++++++++++++++++++++++++---
 2 files changed, 304 insertions(+), 24 deletions(-)
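The fast path emitted below has a simple shape: if 16-byte atomicity is
required and the address may be unaligned, test the low address bits at
run time and do either one VMOVDQA or two 8-byte accesses. A host-C
illustration of the same split (my sketch, not QEMU code; assumes
SSE4.1, and that an aligned 16-byte vector load is atomic, which is
what have_atomic16 asserts):

    #include <stdint.h>
    #include <string.h>
    #include <immintrin.h>

    static void load_i128(const void *p, uint64_t *lo, uint64_t *hi)
    {
        if (((uintptr_t)p & 15) == 0) {
            /* One aligned 16-byte access: vmovdqa + vmovq/vpextrq. */
            __m128i v = _mm_load_si128((const __m128i *)p);
            *lo = (uint64_t)_mm_cvtsi128_si64(v);
            *hi = (uint64_t)_mm_extract_epi64(v, 1);
        } else {
            /* Misaligned: two 8-byte loads.  Only legal when the
             * operation's atomicity contract permits 8-byte subobjects. */
            memcpy(lo, p, 8);
            memcpy(hi, (const char *)p + 8, 8);
        }
    }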
diff --git a/tcg/i386/tcg-target.h b/tcg/i386/tcg-target.h
index 53d2cb3412..7aafd60d72 100644
--- a/tcg/i386/tcg-target.h
+++ b/tcg/i386/tcg-target.h
@@ -195,7 +195,8 @@ extern bool have_atomic16;
 #define TCG_TARGET_HAS_qemu_st8_i32     1
 #endif
 
-#define TCG_TARGET_HAS_qemu_ldst_i128   0
+#define TCG_TARGET_HAS_qemu_ldst_i128 \
+    (TCG_TARGET_REG_BITS == 64 && have_atomic16)
 
 /* We do not support older SSE systems, only beginning with AVX1.  */
 #define TCG_TARGET_HAS_v64              have_avx1
diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
index 7dc56040d2..f277085321 100644
--- a/tcg/i386/tcg-target.c.inc
+++ b/tcg/i386/tcg-target.c.inc
@@ -91,6 +91,8 @@ static const int tcg_target_reg_alloc_order[] = {
 #endif
 };
 
+#define TCG_TMP_VEC  TCG_REG_XMM5
+
 static const int tcg_target_call_iarg_regs[] = {
 #if TCG_TARGET_REG_BITS == 64
 #if defined(_WIN64)
@@ -347,6 +349,8 @@ static bool tcg_target_const_match(int64_t val, TCGType type, int ct)
 #define OPC_PCMPGTW     (0x65 | P_EXT | P_DATA16)
 #define OPC_PCMPGTD     (0x66 | P_EXT | P_DATA16)
 #define OPC_PCMPGTQ     (0x37 | P_EXT38 | P_DATA16)
+#define OPC_PEXTRD      (0x16 | P_EXT3A | P_DATA16)
+#define OPC_PINSRD      (0x22 | P_EXT3A | P_DATA16)
 #define OPC_PMAXSB      (0x3c | P_EXT38 | P_DATA16)
 #define OPC_PMAXSW      (0xee | P_EXT | P_DATA16)
 #define OPC_PMAXSD      (0x3d | P_EXT38 | P_DATA16)
@@ -1730,8 +1734,7 @@ static void tcg_out_nopn(TCGContext *s, int n)
 }
 
 /* Test register R vs immediate bits I, setting Z flag for EQ/NE. */
-static void __attribute__((unused))
-tcg_out_testi(TCGContext *s, TCGReg r, uint32_t i)
+static void tcg_out_testi(TCGContext *s, TCGReg r, uint32_t i)
 {
     /*
      * This is used for testing alignment, so we can usually use testb.
@@ -1828,10 +1831,11 @@ static MemOp atom_and_align_for_opc(TCGContext *s, MemOp opc, MemOp *out_al)
  *                                  int mmu_idx, uintptr_t ra)
  */
 static void * const qemu_ld_helpers[MO_SIZE + 1] = {
-    [MO_UB] = helper_ldub_mmu,
-    [MO_UW] = helper_lduw_mmu,
-    [MO_UL] = helper_ldul_mmu,
-    [MO_UQ] = helper_ldq_mmu,
+    [MO_8]   = helper_ldub_mmu,
+    [MO_16]  = helper_lduw_mmu,
+    [MO_32]  = helper_ldul_mmu,
+    [MO_64]  = helper_ldq_mmu,
+    [MO_128] = helper_ld16_mmu,
 };
 
 /*
@@ -1839,10 +1843,11 @@ static void * const qemu_ld_helpers[MO_SIZE + 1] = {
  *                     uintxx_t val, int mmu_idx, uintptr_t ra)
  */
 static void * const qemu_st_helpers[MO_SIZE + 1] = {
-    [MO_UB] = helper_stb_mmu,
-    [MO_UW] = helper_stw_mmu,
-    [MO_UL] = helper_stl_mmu,
-    [MO_UQ] = helper_stq_mmu,
+    [MO_8]   = helper_stb_mmu,
+    [MO_16]  = helper_stw_mmu,
+    [MO_32]  = helper_stl_mmu,
+    [MO_64]  = helper_stq_mmu,
+    [MO_128] = helper_st16_mmu,
 };
 
 /*
@@ -1870,6 +1875,13 @@ static void add_qemu_ldst_label(TCGContext *s, bool is_ld, TCGType type,
     label->label_ptr[1] = label_ptr[1];
 }
 
+static void tcg_out_mov2_xchg(TCGContext *s, TCGType type1, TCGType type2,
+                              TCGReg dst1, TCGReg dst2)
+{
+    int w = (type1 == TCG_TYPE_I32 && type2 == TCG_TYPE_I32 ? 0 : P_REXW);
+    tcg_out_modrm(s, OPC_XCHG_EvGv + w, dst1, dst2);
+}
+
 /* Move src1 to dst1 and src2 to dst2, minding possible overlap. */
 static void tcg_out_mov2(TCGContext *s,
                          TCGType type1, TCGReg dst1, TCGReg src1,
@@ -1883,11 +1895,69 @@ static void tcg_out_mov2(TCGContext *s,
         tcg_out_mov(s, type1, dst1, src1);
     } else {
         /* dst1 == src2 && dst2 == src1 -> xchg. */
-        int w = (type1 == TCG_TYPE_I32 && type2 == TCG_TYPE_I32 ? 0 : P_REXW);
-        tcg_out_modrm(s, OPC_XCHG_EvGv + w, dst1, dst2);
+        tcg_out_mov2_xchg(s, type1, type2, dst1, dst2);
     }
 }
 
+/* Similarly for 3 pairs. */
+static void tcg_out_mov3(TCGContext *s,
+                         TCGType type1, TCGReg dst1, TCGReg src1,
+                         TCGType type2, TCGReg dst2, TCGReg src2,
+                         TCGType type3, TCGReg dst3, TCGReg src3)
+{
+    if (dst1 != src2 && dst1 != src3) {
+        tcg_out_mov(s, type1, dst1, src1);
+        tcg_out_mov2(s, type2, dst2, src2, type3, dst3, src3);
+        return;
+    }
+    if (dst2 != src2 && dst2 != src3) {
+        tcg_out_mov(s, type2, dst2, src2);
+        tcg_out_mov2(s, type1, dst1, src1, type3, dst3, src3);
+        return;
+    }
+    if (dst3 != src1 && dst3 != src2) {
+        tcg_out_mov(s, type3, dst3, src3);
+        tcg_out_mov2(s, type1, dst1, src1, type2, dst2, src2);
+        return;
+    }
+    /* Three-way overlap present, at least one xchg needed. */
+    if (dst1 == src2) {
+        tcg_out_mov2_xchg(s, type1, type2, src1, src2);
+        tcg_out_mov2(s, type2, dst2, src1, type3, dst3, src3);
+        return;
+    }
+    if (dst1 == src3) {
+        tcg_out_mov2_xchg(s, type1, type3, src1, src3);
+        tcg_out_mov2(s, type2, dst2, src2, type3, dst3, src1);
+        return;
+    }
+    g_assert_not_reached();
+}
+
+static void tcg_out_vec_to_pair(TCGContext *s, TCGType type,
+                                TCGReg l, TCGReg h, TCGReg v)
+{
+    int rexw = type == TCG_TYPE_I32 ? 0 : P_REXW;
+
+    /* vpmov{d,q} %v, %l */
+    tcg_out_vex_modrm(s, OPC_MOVD_EyVy + rexw, v, 0, l);
+    /* vpextr{d,q} $1, %v, %h */
+    tcg_out_vex_modrm(s, OPC_PEXTRD + rexw, v, 0, h);
+    tcg_out8(s, 1);
+}
+
+static void tcg_out_pair_to_vec(TCGContext *s, TCGType type,
+                                TCGReg v, TCGReg l, TCGReg h)
+{
+    int rexw = type == TCG_TYPE_I32 ? 0 : P_REXW;
+
+    /* vmov{d,q} %l, %v */
+    tcg_out_vex_modrm(s, OPC_MOVD_VyEy + rexw, v, 0, l);
+    /* vpinsr{d,q} $1, %h, %v, %v */
+    tcg_out_vex_modrm(s, OPC_PINSRD + rexw, v, v, h);
+    tcg_out8(s, 1);
+}
+
 /*
  * Generate code for the slow path for a load at the end of block
  */
@@ -1897,7 +1967,7 @@ static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
     MemOp opc = get_memop(oi);
     TCGReg data_reg;
     tcg_insn_unit **label_ptr = &l->label_ptr[0];
-    int rexw = (l->type == TCG_TYPE_I64 ? P_REXW : 0);
+    int rexw = (l->type == TCG_TYPE_I32 ? 0 : P_REXW);
 
     /* resolve label address */
     tcg_patch32(label_ptr[0], s->code_ptr - label_ptr[0] - 4);
@@ -1961,6 +2031,22 @@ static bool tcg_out_qemu_ld_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
                          TCG_TYPE_I32, l->datahi_reg, TCG_REG_EDX);
         }
         break;
+    case MO_128:
+        tcg_debug_assert(TCG_TARGET_REG_BITS == 64);
+        switch (TCG_TARGET_CALL_RET_I128) {
+        case TCG_CALL_RET_NORMAL:
+            tcg_out_mov2(s, TCG_TYPE_I64, data_reg, TCG_REG_RAX,
+                         TCG_TYPE_I64, l->datahi_reg, TCG_REG_RDX);
+            break;
+        case TCG_CALL_RET_BY_VEC:
+            tcg_out_vec_to_pair(s, TCG_TYPE_I64,
+                                data_reg, l->datahi_reg, TCG_REG_XMM0);
+            break;
+        default:
+            qemu_build_not_reached();
+        }
+        break;
+
     default:
         tcg_abort();
     }
@@ -1977,7 +2063,6 @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
 {
     MemOpIdx oi = l->oi;
     MemOp opc = get_memop(oi);
-    MemOp s_bits = opc & MO_SIZE;
     tcg_insn_unit **label_ptr = &l->label_ptr[0];
     TCGReg retaddr;
 
@@ -2004,9 +2089,15 @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
         tcg_out_st(s, TCG_TYPE_I32, l->datalo_reg, TCG_REG_ESP, ofs);
         ofs += 4;
 
-        if (s_bits == MO_64) {
+        switch (l->type) {
+        case TCG_TYPE_I32:
+            break;
+        case TCG_TYPE_I64:
             tcg_out_st(s, TCG_TYPE_I32, l->datahi_reg, TCG_REG_ESP, ofs);
             ofs += 4;
+            break;
+        default:
+            g_assert_not_reached();
         }
 
         tcg_out_sti(s, TCG_TYPE_I32, oi, TCG_REG_ESP, ofs);
@@ -2016,15 +2107,54 @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, TCGLabelQemuLdst *l)
         tcg_out_movi(s, TCG_TYPE_PTR, retaddr, (uintptr_t)l->raddr);
         tcg_out_st(s, TCG_TYPE_PTR, retaddr, TCG_REG_ESP, ofs);
     } else {
-        tcg_out_mov2(s, TCG_TYPE_TL,
-                     tcg_target_call_iarg_regs[1], l->addrlo_reg,
-                     s_bits == MO_64 ? TCG_TYPE_I64 : TCG_TYPE_I32,
-                     tcg_target_call_iarg_regs[2], l->datalo_reg);
-        tcg_out_mov(s, TCG_TYPE_PTR, tcg_target_call_iarg_regs[0], TCG_AREG0);
-        tcg_out_movi(s, TCG_TYPE_I32, tcg_target_call_iarg_regs[3], oi);
+        int slot;
 
-        if (ARRAY_SIZE(tcg_target_call_iarg_regs) > 4) {
-            retaddr = tcg_target_call_iarg_regs[4];
+        switch (l->type) {
+        case TCG_TYPE_I32:
+        case TCG_TYPE_I64:
+            tcg_out_mov2(s, TCG_TYPE_TL,
+                         tcg_target_call_iarg_regs[1], l->addrlo_reg,
+                         l->type, tcg_target_call_iarg_regs[2], l->datalo_reg);
+            slot = 3;
+            break;
+        case TCG_TYPE_I128:
+            switch (TCG_TARGET_CALL_ARG_I128) {
+            case TCG_CALL_ARG_NORMAL:
+                tcg_out_mov3(s, TCG_TYPE_TL,
+                             tcg_target_call_iarg_regs[1], l->addrlo_reg,
+                             TCG_TYPE_I64,
+                             tcg_target_call_iarg_regs[2], l->datalo_reg,
+                             TCG_TYPE_I64,
+                             tcg_target_call_iarg_regs[3], l->datahi_reg);
+                slot = 4;
+                break;
+            case TCG_CALL_ARG_BY_REF:
+                /* Leave room for retaddr below, take next 16 aligned bytes. */
+                tcg_out_st(s, TCG_TYPE_I64, l->datalo_reg,
+                           TCG_REG_ESP, TCG_TARGET_CALL_STACK_OFFSET + 16);
+                tcg_out_st(s, TCG_TYPE_I64, l->datahi_reg,
+                           TCG_REG_ESP, TCG_TARGET_CALL_STACK_OFFSET + 24);
+                tcg_out_mov(s, TCG_TYPE_TL,
+                            tcg_target_call_iarg_regs[1], l->addrlo_reg);
+                tcg_out_modrm_offset(s, OPC_LEA + P_REXW,
+                                     tcg_target_call_iarg_regs[2], TCG_REG_ESP,
+                                     TCG_TARGET_CALL_STACK_OFFSET + 16);
+                slot = 3;
+                break;
+            default:
+                qemu_build_not_reached();
+            }
+            break;
+        default:
+            g_assert_not_reached();
+        }
+
+        tcg_debug_assert(slot < (int)ARRAY_SIZE(tcg_target_call_iarg_regs) - 1);
+        tcg_out_mov(s, TCG_TYPE_PTR, tcg_target_call_iarg_regs[0], TCG_AREG0);
+        tcg_out_movi(s, TCG_TYPE_I32, tcg_target_call_iarg_regs[slot++], oi);
+
+        if (slot < (int)ARRAY_SIZE(tcg_target_call_iarg_regs)) {
+            retaddr = tcg_target_call_iarg_regs[slot];
             tcg_out_movi(s, TCG_TYPE_PTR, retaddr, (uintptr_t)l->raddr);
         } else {
             retaddr = TCG_REG_RAX;
@@ -2288,6 +2418,71 @@ static void tcg_out_qemu_ld_direct(TCGContext *s, TCGReg datalo, TCGReg datahi,
             }
         }
         break;
+
+    case MO_128:
+        {
+            TCGLabel *l1 = NULL, *l2 = NULL;
+            bool use_pair = atom < MO_128;
+
+            tcg_debug_assert(TCG_TARGET_REG_BITS == 64);
+
+            if (use_movbe) {
+                TCGReg t = datalo;
+                datalo = datahi;
+                datahi = t;
+            }
+            if (!use_pair) {
+                /*
+                 * Atomicity requires that we use VMOVDQA.
+                 * If we've already checked for 16-byte alignment, that's all
+                 * we need.  If we arrive here with lesser alignment, then we
+                 * have determined that less than 16-byte alignment can be
+                 * satisfied with two 8-byte loads.
+                 */
+                if (align < MO_128) {
+                    use_pair = true;
+                    l1 = gen_new_label();
+                    l2 = gen_new_label();
+
+                    tcg_out_testi(s, base, align == MO_64 ? 8 : 15);
+                    tcg_out_jxx(s, JCC_JNE, l2, true);
+                }
+
+                tcg_out_vex_modrm_sib_offset(s, OPC_MOVDQA_VxWx + seg,
+                                             TCG_TMP_VEC, 0,
+                                             base, index, 0, ofs);
+                tcg_out_vec_to_pair(s, TCG_TYPE_I64,
+                                    datalo, datahi, TCG_TMP_VEC);
+
+                if (use_movbe) {
+                    tcg_out_bswap64(s, datalo);
+                    tcg_out_bswap64(s, datahi);
+                }
+
+                if (use_pair) {
+                    tcg_out_jxx(s, JCC_JMP, l1, true);
+                    tcg_out_label(s, l2);
+                }
+            }
+            if (use_pair) {
+                if (base != datalo) {
+                    tcg_out_modrm_sib_offset(s, movop + P_REXW + seg, datalo,
+                                             base, index, 0, ofs);
+                    tcg_out_modrm_sib_offset(s, movop + P_REXW + seg, datahi,
+                                             base, index, 0, ofs + 8);
+                } else {
+                    tcg_out_modrm_sib_offset(s, movop + P_REXW + seg, datahi,
+                                             base, index, 0, ofs + 8);
+                    tcg_out_modrm_sib_offset(s, movop + P_REXW + seg, datalo,
+                                             base, index, 0, ofs);
+                }
+            }
+            if (l1) {
+                tcg_out_label(s, l1);
+            }
+        }
+        break;
+
     default:
         g_assert_not_reached();
     }
@@ -2312,6 +2507,10 @@ static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, TCGType type)
     case TCG_TYPE_I64:
         datahi = (TCG_TARGET_REG_BITS == 32 ? *args++ : 0);
         break;
+    case TCG_TYPE_I128:
+        tcg_debug_assert(TCG_TARGET_REG_BITS == 64);
+        datahi = *args++;
+        break;
     default:
         g_assert_not_reached();
     }
@@ -2394,6 +2593,68 @@ static void tcg_out_qemu_st_direct(TCGContext *s, TCGReg datalo, TCGReg datahi,
                                      base, index, 0, ofs + 4);
         }
         break;
+
+    case MO_128:
+        {
+            TCGLabel *l1 = NULL, *l2 = NULL;
+            bool use_pair = atom < MO_128;
+
+            tcg_debug_assert(TCG_TARGET_REG_BITS == 64);
+
+            if (use_movbe) {
+                TCGReg t = datalo;
+                datalo = datahi;
+                datahi = t;
+            }
+            if (!use_pair) {
+                /*
+                 * Atomicity requires that we use VMOVDQA.
+                 * If we've already checked for 16-byte alignment, that's all
+                 * we need.  If we arrive here with lesser alignment, then we
+                 * have determined that less than 16-byte alignment can be
+                 * satisfied with two 8-byte stores.
+                 */
+                if (align < MO_128) {
+                    use_pair = true;
+                    l1 = gen_new_label();
+                    l2 = gen_new_label();
+
+                    tcg_out_testi(s, base, align == MO_64 ? 8 : 15);
+                    tcg_out_jxx(s, JCC_JNE, l2, true);
+                }
+
+                if (use_movbe) {
+                    /* Byte swap while storing to the stack. */
+                    tcg_out_modrm_offset(s, movop + P_REXW + seg, datalo,
+                                         TCG_REG_ESP, 0);
+                    tcg_out_modrm_offset(s, movop + P_REXW + seg, datahi,
+                                         TCG_REG_ESP, 8);
+                    tcg_out_ld(s, TCG_TYPE_V128, TCG_TMP_VEC, TCG_REG_ESP, 0);
+                } else {
+                    tcg_out_pair_to_vec(s, TCG_TYPE_I64,
+                                        TCG_TMP_VEC, datalo, datahi);
+                }
+                tcg_out_vex_modrm_sib_offset(s, OPC_MOVDQA_WxVx + seg,
+                                             TCG_TMP_VEC, 0,
+                                             base, index, 0, ofs);
+
+                if (use_pair) {
+                    tcg_out_jxx(s, JCC_JMP, l1, true);
+                    tcg_out_label(s, l2);
+                }
+            }
+            if (use_pair) {
+                tcg_out_modrm_sib_offset(s, movop + P_REXW + seg, datalo,
+                                         base, index, 0, ofs);
+                tcg_out_modrm_sib_offset(s, movop + P_REXW + seg, datahi,
+                                         base, index, 0, ofs + 8);
+            }
+            if (l1) {
+                tcg_out_label(s, l1);
+            }
+        }
+        break;
+
     default:
         g_assert_not_reached();
     }
@@ -2415,6 +2676,10 @@ static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, TCGType type)
     case TCG_TYPE_I64:
         datahi = (TCG_TARGET_REG_BITS == 32 ? *args++ : 0);
         break;
+    case TCG_TYPE_I128:
+        tcg_debug_assert(TCG_TARGET_REG_BITS == 64);
+        datahi = *args++;
+        break;
     default:
         g_assert_not_reached();
     }
@@ -2746,6 +3011,9 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
     case INDEX_op_qemu_ld_i64:
         tcg_out_qemu_ld(s, args, TCG_TYPE_I64);
         break;
+    case INDEX_op_qemu_ld_i128:
+        tcg_out_qemu_ld(s, args, TCG_TYPE_I128);
+        break;
     case INDEX_op_qemu_st_i32:
     case INDEX_op_qemu_st8_i32:
         tcg_out_qemu_st(s, args, TCG_TYPE_I32);
@@ -2753,6 +3021,9 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
     case INDEX_op_qemu_st_i64:
         tcg_out_qemu_st(s, args, TCG_TYPE_I64);
         break;
+    case INDEX_op_qemu_st_i128:
+        tcg_out_qemu_st(s, args, TCG_TYPE_I128);
+        break;
 
     OP_32_64(mulu2):
         tcg_out_modrm(s, OPC_GRP3_Ev + rexw, EXT3_MUL, args[3]);
@@ -3441,6 +3712,13 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
                 : TARGET_LONG_BITS <= TCG_TARGET_REG_BITS ? C_O0_I3(L, L, L)
                 : C_O0_I4(L, L, L, L));
 
+    case INDEX_op_qemu_ld_i128:
+        tcg_debug_assert(TCG_TARGET_REG_BITS == 64);
+        return C_O2_I1(r, r, L);
+    case INDEX_op_qemu_st_i128:
+        tcg_debug_assert(TCG_TARGET_REG_BITS == 64);
+        return C_O0_I3(L, L, L);
+
     case INDEX_op_brcond2_i32:
         return C_O0_I4(r, r, ri, ri);
 
@@ -4298,6 +4576,7 @@ static void tcg_target_init(TCGContext *s)
 
     s->reserved_regs = 0;
     tcg_regset_set_reg(s->reserved_regs, TCG_REG_CALL_STACK);
+    tcg_regset_set_reg(s->reserved_regs, TCG_TMP_VEC);
 #ifdef _WIN64
     /* These are call saved, and we don't save them, so don't use them. */
     tcg_regset_set_reg(s->reserved_regs, TCG_REG_XMM6);

From patchwork Fri Nov 18 09:47:53 2022
X-Patchwork-Submitter: Richard Henderson <richard.henderson@linaro.org>
X-Patchwork-Id: 626116
From: Richard Henderson <richard.henderson@linaro.org>
To: qemu-devel@nongnu.org
Subject: [PATCH for-8.0 28/29] tcg/i386: Add vex_v argument to tcg_out_vex_modrm_pool
Date: Fri, 18 Nov 2022 01:47:53 -0800
Message-Id: <20221118094754.242910-29-richard.henderson@linaro.org>
In-Reply-To: <20221118094754.242910-1-richard.henderson@linaro.org>

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/i386/tcg-target.c.inc | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)
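VEX.vvvv names a second, non-destructive source register; none of the
constant-pool VEX insns emitted so far use it, which is why every
existing call site below simply gains a trailing 0. A hedged note on
the mechanics (my gloss, not from the patch; assumes tcg_out_vex_opc
stores the field inverted, as the VEX format requires): passing 0
yields the all-ones "no register" encoding, so behaviour is unchanged,
while a future caller can pass a real register number instead.

    /* Behaviour unchanged at existing call sites: vvvv unused. */
    tcg_out_vex_modrm_pool(s, OPC_MOVD_VyEy + rexw, ret, 0);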
diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
index f277085321..3f0cb4bc66 100644
--- a/tcg/i386/tcg-target.c.inc
+++ b/tcg/i386/tcg-target.c.inc
@@ -841,9 +841,9 @@ static inline void tcg_out_modrm_pool(TCGContext *s, int opc, int r)
 }
 
 /* Output an opcode with an expected reference to the constant pool. */
-static inline void tcg_out_vex_modrm_pool(TCGContext *s, int opc, int r)
+static inline void tcg_out_vex_modrm_pool(TCGContext *s, int opc, int r, int v)
 {
-    tcg_out_vex_opc(s, opc, r, 0, 0, 0);
+    tcg_out_vex_opc(s, opc, r, v, 0, 0);
     /* Absolute for 32-bit, pc-relative for 64-bit. */
     tcg_out8(s, LOWREGMASK(r) << 3 | 5);
     tcg_out32(s, 0);
@@ -990,18 +990,18 @@ static void tcg_out_dupi_vec(TCGContext *s, TCGType type, unsigned vece,
 
     if (TCG_TARGET_REG_BITS == 32 && vece < MO_64) {
         if (have_avx2) {
-            tcg_out_vex_modrm_pool(s, OPC_VPBROADCASTD + vex_l, ret);
+            tcg_out_vex_modrm_pool(s, OPC_VPBROADCASTD + vex_l, ret, 0);
         } else {
-            tcg_out_vex_modrm_pool(s, OPC_VBROADCASTSS, ret);
+            tcg_out_vex_modrm_pool(s, OPC_VBROADCASTSS, ret, 0);
         }
         new_pool_label(s, arg, R_386_32, s->code_ptr - 4, 0);
     } else {
         if (type == TCG_TYPE_V64) {
-            tcg_out_vex_modrm_pool(s, OPC_MOVQ_VqWq, ret);
+            tcg_out_vex_modrm_pool(s, OPC_MOVQ_VqWq, ret, 0);
         } else if (have_avx2) {
-            tcg_out_vex_modrm_pool(s, OPC_VPBROADCASTQ + vex_l, ret);
+            tcg_out_vex_modrm_pool(s, OPC_VPBROADCASTQ + vex_l, ret, 0);
         } else {
-            tcg_out_vex_modrm_pool(s, OPC_MOVDDUP, ret);
+            tcg_out_vex_modrm_pool(s, OPC_MOVDDUP, ret, 0);
         }
         if (TCG_TARGET_REG_BITS == 64) {
             new_pool_label(s, arg, R_386_PC32, s->code_ptr - 4, -4);
@@ -1024,7 +1024,7 @@ static void tcg_out_movi_vec(TCGContext *s, TCGType type,
     }
 
     int rexw = (type == TCG_TYPE_I32 ? 0 : P_REXW);
-    tcg_out_vex_modrm_pool(s, OPC_MOVD_VyEy + rexw, ret);
+    tcg_out_vex_modrm_pool(s, OPC_MOVD_VyEy + rexw, ret, 0);
     if (TCG_TARGET_REG_BITS == 64) {
         new_pool_label(s, arg, R_386_PC32, s->code_ptr - 4, -4);
     } else {

From patchwork Fri Nov 18 09:47:54 2022
X-Patchwork-Submitter: Richard Henderson <richard.henderson@linaro.org>
X-Patchwork-Id: 626108
[209.51.188.17]) by mx.google.com with ESMTPS id m5-20020a05620a290500b006ebcaee80a0si1596426qkp.385.2022.11.18.01.51.21 for (version=TLS1_2 cipher=ECDHE-ECDSA-CHACHA20-POLY1305 bits=256/256); Fri, 18 Nov 2022 01:51:21 -0800 (PST) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; Authentication-Results: mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=ihOdz1hn; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1ovxzk-000800-PT; Fri, 18 Nov 2022 04:49:12 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ovxzS-0007kK-NJ for qemu-devel@nongnu.org; Fri, 18 Nov 2022 04:48:57 -0500 Received: from mail-pl1-x634.google.com ([2607:f8b0:4864:20::634]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1ovxzP-000293-BM for qemu-devel@nongnu.org; Fri, 18 Nov 2022 04:48:53 -0500 Received: by mail-pl1-x634.google.com with SMTP id k7so4104132pll.6 for ; Fri, 18 Nov 2022 01:48:50 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=zrlVrOpeCBe1Y6fxW7Q9Kia6e+buM0YiUwHFQiodiXk=; b=ihOdz1hnNemjhufe722R1BTKPGUIFb+31U6eAEeIpiqLS1lkQDBOLUsEr4OrtbkwXH ftw5+pcoSfx5yxY8wYByR9NYnDF+WA3Vtlyrus1b10mGA5xFqVHEjAwmfSpasfVT2c+E WAfFpnrXsXjUEpCljp17gATQEZTivwtOZUQ3W1aPhRZqyAGlpWqAzZacLh4u9Z9DyvBb /ZSbxD6BzbrvLH1wvFLZXP2hnq0Vihqbv7LqIwpYTqVTSCDx/Ujh0yGoWTlkzxSeZq/j iSMoR64QJafjRLUCZJ58yBUhgE0eXnfcOTl3kXFKdMj6h6YxJu0aAtaDHqwdq3FViwqD zhDg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=zrlVrOpeCBe1Y6fxW7Q9Kia6e+buM0YiUwHFQiodiXk=; b=swDXGwbZiibd5XF6+Pe/c1kraWjbZ+Xj0BFNCoCECbk30zpQ9IVhjW6kNOm+AXi4J/ wecUxuJUeEhg2zbuEZc2AbMAIyNfv4iNrCWfgxaWD6JAt6Sql6K6oJIC1HZv0vnqOhUR SZGTiFEAMx/j3+Zr2tyGSw5OSqXTpnSV+1thQXY3hs6MhOjRkW8Dk57z79NhNM5wKoRL YUbJhPHGhjjmWls16HtszQV6yuxuEPrNJC9sdNBvQeN5agUidbqMT3RYA3scOoI5m1Kd 3716gd4NA/EkM1X+nehRwZfjTFmQK9xIlEffEWczxlsOZ2p8hc2Rjag9yMhL/nv0putV DyXg== X-Gm-Message-State: ANoB5pnydKebuE2LCS6bYlyw/sJ07NlCmbGw7HEaK9opdA6Pkbh4rUyh JXKeOBx2pDGmVPgEczWuxiHTIVq1FezIbw== X-Received: by 2002:a17:903:1c4:b0:187:12cc:d6f1 with SMTP id e4-20020a17090301c400b0018712ccd6f1mr6715884plh.63.1668764929477; Fri, 18 Nov 2022 01:48:49 -0800 (PST) Received: from stoup.. 
From: Richard Henderson <richard.henderson@linaro.org>
To: qemu-devel@nongnu.org
Subject: [PATCH for-8.0 29/29] tcg/i386: Honor 64-bit atomicity in 32-bit mode
Date: Fri, 18 Nov 2022 01:47:54 -0800
Message-Id: <20221118094754.242910-30-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20221118094754.242910-1-richard.henderson@linaro.org>
References: <20221118094754.242910-1-richard.henderson@linaro.org>
MIME-Version: 1.0

Use one of the coprocessors (the x87 FPU) to perform single 64-bit
loads and stores when 8-byte atomicity is required.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/i386/tcg-target.c.inc | 119 +++++++++++++++++++++++++++++++++-----
 1 file changed, 106 insertions(+), 13 deletions(-)

diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
index 3f0cb4bc66..3d3ee4b20a 100644
--- a/tcg/i386/tcg-target.c.inc
+++ b/tcg/i386/tcg-target.c.inc
@@ -472,6 +472,10 @@ static bool tcg_target_const_match(int64_t val, TCGType type, int ct)
 #define OPC_GRP5        (0xff)
 #define OPC_GRP14       (0x73 | P_EXT | P_DATA16)
 
+#define OPC_ESCDF       (0xdf)
+#define ESCDF_FILD_m64  5
+#define ESCDF_FISTP_m64 7
+
 /* Group 1 opcode extensions for 0x80-0x83.
    These are also used as modifiers for OPC_ARITH.  */
 #define ARITH_ADD 0
@@ -2400,21 +2404,65 @@ static void tcg_out_qemu_ld_direct(TCGContext *s, TCGReg datalo, TCGReg datahi,
             tcg_out_modrm_sib_offset(s, movop + P_REXW + seg, datalo,
                                      base, index, 0, ofs);
         } else {
+            TCGLabel *l1 = NULL, *l2 = NULL;
+            bool use_pair = atom < MO_64;
+
             if (use_movbe) {
                 TCGReg t = datalo;
                 datalo = datahi;
                 datahi = t;
             }
-            if (base != datalo) {
-                tcg_out_modrm_sib_offset(s, movop + seg, datalo,
-                                         base, index, 0, ofs);
-                tcg_out_modrm_sib_offset(s, movop + seg, datahi,
-                                         base, index, 0, ofs + 4);
-            } else {
-                tcg_out_modrm_sib_offset(s, movop + seg, datahi,
-                                         base, index, 0, ofs + 4);
-                tcg_out_modrm_sib_offset(s, movop + seg, datalo,
-                                         base, index, 0, ofs);
-            }
+
+            if (!use_pair) {
+                /*
+                 * Atomicity requires that we use a single 8-byte load.
+                 * For simplicity, and code size, always use the FPU for this.
+                 * Similar insns using SSE/AVX are merely larger.
+                 * Load from memory in one go, then store back to the stack,
+                 * from whence we can load into the correct integer regs.
+                 *
+                 * If we've already checked for 8-byte alignment, or not
+                 * checked for alignment at all, that's all we need.
+                 * If we arrive here with lesser but non-zero alignment,
+                 * then we have determined that subalignment can be
+                 * satisfied with two 4-byte loads.
+                 */
+                if (align > MO_8 && align < MO_64) {
+                    use_pair = true;
+                    l1 = gen_new_label();
+                    l2 = gen_new_label();
+
+                    tcg_out_testi(s, base, align == MO_32 ? 4 : 7);
+                    tcg_out_jxx(s, JCC_JNE, l2, true);
+                }
+
+                tcg_out_modrm_sib_offset(s, OPC_ESCDF + seg, ESCDF_FILD_m64,
+                                         base, index, 0, ofs);
+                tcg_out_modrm_offset(s, OPC_ESCDF, ESCDF_FISTP_m64,
+                                     TCG_REG_ESP, 0);
+                tcg_out_modrm_offset(s, movop, datalo, TCG_REG_ESP, 0);
+                tcg_out_modrm_offset(s, movop, datahi, TCG_REG_ESP, 4);
+
+                if (use_pair) {
+                    tcg_out_jxx(s, JCC_JMP, l1, true);
+                    tcg_out_label(s, l2);
+                }
+            }
+            if (use_pair) {
+                if (base != datalo) {
+                    tcg_out_modrm_sib_offset(s, movop + seg, datalo,
+                                             base, index, 0, ofs);
+                    tcg_out_modrm_sib_offset(s, movop + seg, datahi,
+                                             base, index, 0, ofs + 4);
+                } else {
+                    tcg_out_modrm_sib_offset(s, movop + seg, datahi,
+                                             base, index, 0, ofs + 4);
+                    tcg_out_modrm_sib_offset(s, movop + seg, datalo,
+                                             base, index, 0, ofs);
+                }
+            }
+            if (l1) {
+                tcg_out_label(s, l1);
+            }
         }
         break;
@@ -2577,20 +2625,65 @@ static void tcg_out_qemu_st_direct(TCGContext *s, TCGReg datalo, TCGReg datahi,
     case MO_32:
         tcg_out_modrm_sib_offset(s, movop + seg, datalo, base, index, 0, ofs);
         break;
+
     case MO_64:
         if (TCG_TARGET_REG_BITS == 64) {
             tcg_out_modrm_sib_offset(s, movop + P_REXW + seg, datalo,
                                      base, index, 0, ofs);
         } else {
+            TCGLabel *l1 = NULL, *l2 = NULL;
+            bool use_pair = atom < MO_64;
+
             if (use_movbe) {
                 TCGReg t = datalo;
                 datalo = datahi;
                 datahi = t;
             }
-            tcg_out_modrm_sib_offset(s, movop + seg, datalo,
-                                     base, index, 0, ofs);
-            tcg_out_modrm_sib_offset(s, movop + seg, datahi,
-                                     base, index, 0, ofs + 4);
+
+            if (!use_pair) {
+                /*
+                 * Atomicity requires that we use one 8-byte store.
+                 * For simplicity, and code size, always use the FPU for this.
+                 * Similar insns using SSE/AVX are merely larger.
+                 * Assemble the 8-byte quantity in required endianness
+                 * on the stack, load to coproc unit, and store.
+                 *
+                 * If we've already checked for 8-byte alignment, or not
+                 * checked for alignment at all, that's all we need.
+                 * If we arrive here with lesser but non-zero alignment,
+                 * then we have determined that subalignment can be
+                 * satisfied with two 4-byte stores.
+                 */
+                if (align > MO_8 && align < MO_64) {
+                    use_pair = true;
+                    l1 = gen_new_label();
+                    l2 = gen_new_label();
+
+                    tcg_out_testi(s, base, align == MO_32 ? 4 : 7);
+                    tcg_out_jxx(s, JCC_JNE, l2, true);
+                }
+
+                tcg_out_modrm_offset(s, movop, datalo, TCG_REG_ESP, 0);
+                tcg_out_modrm_offset(s, movop, datahi, TCG_REG_ESP, 4);
+                tcg_out_modrm_offset(s, OPC_ESCDF, ESCDF_FILD_m64,
+                                     TCG_REG_ESP, 0);
+                tcg_out_modrm_sib_offset(s, OPC_ESCDF + seg, ESCDF_FISTP_m64,
+                                         base, index, 0, ofs);
+
+                if (use_pair) {
+                    tcg_out_jxx(s, JCC_JMP, l1, true);
+                    tcg_out_label(s, l2);
+                }
+            }
+            if (use_pair) {
+                tcg_out_modrm_sib_offset(s, movop + seg, datalo,
+                                         base, index, 0, ofs);
+                tcg_out_modrm_sib_offset(s, movop + seg, datahi,
+                                         base, index, 0, ofs + 4);
+            }
+            if (l1) {
+                tcg_out_label(s, l1);
+            }
         }
         break;
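
Two notes on the sequences above. The new escape defines mirror the one-byte
x87 encoding: opcode 0xDF with ModRM reg field 5 is fild m64int and reg field
7 is fistp m64int, so OPC_ESCDF plus a /digit constant emits exactly one
8-byte memory access. And when the access is only known to be 4-aligned, the
emitted test/jne chooses between the FPU path and the two-word fallback at
run time. A hedged C model of the store side's runtime behavior (the function
name is invented and it assumes the little-endian, non-movbe case; the real
artifact is the generated i386 code, not C):

    #include <stdint.h>

    static void st64_model(uint64_t *addr, uint32_t lo, uint32_t hi)
    {
        if (((uintptr_t)addr & 7) == 0) {      /* test $7, %base; jne .L2 */
            /* fildll from the stack, fistpll to memory: one 8-byte store */
            *(volatile uint64_t *)addr = ((uint64_t)hi << 32) | lo;
        } else {                               /* .L2: subaligned fallback */
            ((volatile uint32_t *)addr)[0] = lo;
            ((volatile uint32_t *)addr)[1] = hi;
        }
    }                                          /* .L1: both paths rejoin */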
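
The round trip through the FPU is safe for arbitrary bit patterns because
every 64-bit integer is exactly representable in the x87 80-bit format (it
has a 64-bit significand), so fild/fistp reproduce the value bit for bit and
no rounding can occur; and the architecture guarantees that aligned quadword
x87 accesses are atomic (Intel SDM vol. 3, "Guaranteed Atomic Operations",
since the Pentium). A small self-contained check of the lossless round trip,
built with gcc -m32 on an x86 host; the helper name is invented:

    #include <stdint.h>
    #include <stdio.h>

    /* One 8-byte FPU load from *p, one 8-byte FPU store to out:
     * the same fildll/fistpll pair the backend emits. */
    static int64_t fpu_copy64(const int64_t *p)
    {
        int64_t out;
        __asm__ volatile("fildll %1\n\t"
                         "fistpll %0"
                         : "=m"(out) : "m"(*p) : "st");
        return out;
    }

    int main(void)
    {
        int64_t v = (int64_t)0x0123456789abcdefLL;
        printf("%llx\n", (unsigned long long)fpu_copy64(&v));
        return 0;   /* prints 123456789abcdef */
    }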