From patchwork Thu Aug 11 15:24:24 2016
X-Patchwork-Submitter: Alex Bennée
X-Patchwork-Id: 73782
From: Alex Bennée
To: mttcg@listserver.greensocs.com, qemu-devel@nongnu.org,
	fred.konrad@greensocs.com, a.rigo@virtualopensystems.com, cota@braap.org,
	bobby.prani@gmail.com, nikunj@linux.vnet.ibm.com
Date: Thu, 11 Aug 2016 16:24:24 +0100
Message-Id: <1470929064-4092-29-git-send-email-alex.bennee@linaro.org>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1470929064-4092-1-git-send-email-alex.bennee@linaro.org>
References: <1470929064-4092-1-git-send-email-alex.bennee@linaro.org>
Subject: [Qemu-devel] [RFC v4 28/28] cputlb: make tlb_flush_by_mmuidx safe for MTTCG
Cc: peter.maydell@linaro.org, claudio.fontana@huawei.com, Peter Crosthwaite,
	jan.kiszka@siemens.com, mark.burton@greensocs.com, serge.fdrv@gmail.com,
	pbonzini@redhat.com, Alex Bennée, rth@twiddle.net

These flushes allow TLB flushing at a per-mmuidx granularity and are
currently only used by the ARM model. As it is possible to hammer the
other vCPU threads with flushes (and build up long queues of identical
flushes), we extend the mechanism used for the global tlb_flush and set
a bitmap describing all the pending flushes. The updates are done
atomically to avoid corruption of the bitmap, but repeating a flush is
certainly not a problem.

Signed-off-by: Alex Bennée
---
 cputlb.c          | 138 ++++++++++++++++++++++++++++++++++++++++--------------
 include/qom/cpu.h |  12 ++---
 2 files changed, 108 insertions(+), 42 deletions(-)

-- 
2.7.4

diff --git a/cputlb.c b/cputlb.c
index faeb195..3b741d4 100644
--- a/cputlb.c
+++ b/cputlb.c
@@ -68,6 +68,9 @@
 /* We need a solution for stuffing 64 bit pointers in 32 bit ones if
  * we care about this combination */
 QEMU_BUILD_BUG_ON(sizeof(target_ulong) > sizeof(void *));
+QEMU_BUILD_BUG_ON(NB_MMU_MODES > 16);
+
+#define ALL_MMUIDX_BITS ((1 << NB_MMU_MODES) - 1)

 /* statistics */
 int tlb_flush_count;
@@ -88,7 +91,7 @@ static void tlb_flush_nocheck(CPUState *cpu, int flush_global)
     env->tlb_flush_mask = 0;
     tlb_flush_count++;

-    atomic_mb_set(&cpu->pending_tlb_flush, false);
+    atomic_mb_set(&cpu->pending_tlb_flush, 0);
 }

 static void tlb_flush_global_async_work(CPUState *cpu, void *opaque)
@@ -111,7 +114,8 @@ static void tlb_flush_global_async_work(CPUState *cpu, void *opaque)
 void tlb_flush(CPUState *cpu, int flush_global)
 {
     if (cpu->created && !qemu_cpu_is_self(cpu)) {
-        if (atomic_bool_cmpxchg(&cpu->pending_tlb_flush, false, true)) {
+        if (atomic_mb_read(&cpu->pending_tlb_flush) != ALL_MMUIDX_BITS) {
+            atomic_mb_set(&cpu->pending_tlb_flush, ALL_MMUIDX_BITS);
             async_run_on_cpu(cpu, tlb_flush_global_async_work,
                              GINT_TO_POINTER(flush_global));
         }
@@ -120,35 +124,67 @@ void tlb_flush(CPUState *cpu, int flush_global)
     }
 }

-static inline void v_tlb_flush_by_mmuidx(CPUState *cpu, va_list argp)
+static void tlb_flush_by_mmuidx_async_work(CPUState *cpu, void *mmu_bitmask)
 {
     CPUArchState *env = cpu->env_ptr;
+    unsigned long mmu_idx_bitmask = GPOINTER_TO_UINT(mmu_bitmask);
+    int mmu_idx;

     assert_cpu_is_self(cpu);

-    tlb_debug("start\n");
-    for (;;) {
-        int mmu_idx = va_arg(argp, int);
+    tlb_debug("start\n");

-        if (mmu_idx < 0) {
-            break;
-        }
+    for (mmu_idx = 0; mmu_idx < NB_MMU_MODES; mmu_idx++) {

-        tlb_debug("%d\n", mmu_idx);
+        if (test_bit(mmu_idx, &mmu_idx_bitmask)) {
+            tlb_debug("%d\n", mmu_idx);

-        memset(env->tlb_table[mmu_idx], -1, sizeof(env->tlb_table[0]));
-        memset(env->tlb_v_table[mmu_idx], -1, sizeof(env->tlb_v_table[0]));
+            memset(env->tlb_table[mmu_idx], -1, sizeof(env->tlb_table[0]));
+            memset(env->tlb_v_table[mmu_idx], -1, sizeof(env->tlb_v_table[0]));
+        }
     }

     memset(cpu->tb_jmp_cache, 0, sizeof(cpu->tb_jmp_cache));
 }

+/* Helper function to slurp va_args list into a bitmap
+ */
+static inline unsigned long make_mmu_index_bitmap(va_list args)
+{
+    unsigned long bitmap = 0;
+    int mmu_index = va_arg(args, int);
+
+    /* An empty va_list would be a bad call */
+    g_assert(mmu_index > 0);
+
+    do {
+        set_bit(mmu_index, &bitmap);
+        mmu_index = va_arg(args, int);
+    } while (mmu_index >= 0);
+
+    return bitmap;
+}
+
 void tlb_flush_by_mmuidx(CPUState *cpu, ...)
 {
     va_list argp;
+    unsigned long mmu_idx_bitmap;
+
     va_start(argp, cpu);
-    v_tlb_flush_by_mmuidx(cpu, argp);
+    mmu_idx_bitmap = make_mmu_index_bitmap(argp);
     va_end(argp);
+
+    if (!qemu_cpu_is_self(cpu)) {
+        uint16_t pending_flushes =
+            mmu_idx_bitmap & ~atomic_mb_read(&cpu->pending_tlb_flush);
+        if (pending_flushes) {
+            atomic_or(&cpu->pending_tlb_flush, pending_flushes);
+            async_run_on_cpu(cpu, tlb_flush_by_mmuidx_async_work,
+                             GUINT_TO_POINTER(pending_flushes));
+        }
+    } else {
+        tlb_flush_by_mmuidx_async_work(cpu, GUINT_TO_POINTER(mmu_idx_bitmap));
+    }
 }

 static inline void tlb_flush_entry(CPUTLBEntry *tlb_entry, target_ulong addr)
@@ -199,15 +235,46 @@ void tlb_flush_page(CPUState *cpu, target_ulong addr)
     tb_flush_jmp_cache(cpu, addr);
 }

+/* As we are going to hijack the bottom bits of the page address for a
+ * mmuidx bit mask we need to fail to build if we can't do that
+ */
+QEMU_BUILD_BUG_ON(NB_MMU_MODES > TARGET_PAGE_BITS);
+
+static void tlb_flush_page_by_mmuidx_async_work(CPUState *cpu, void *data)
+{
+    CPUArchState *env = cpu->env_ptr;
+    uintptr_t page_and_mmuidx = GPOINTER_TO_UINT(data);
+    target_ulong addr = page_and_mmuidx & TARGET_PAGE_MASK;
+    int page = (addr >> TARGET_PAGE_BITS) & (CPU_TLB_SIZE - 1);
+    int mmu_idx;
+    int i;
+
+    assert_cpu_is_self(cpu);
+
+    for (mmu_idx = 0; mmu_idx < NB_MMU_MODES; mmu_idx++) {
+        if (test_bit(mmu_idx, &page_and_mmuidx)) {
+            tlb_flush_entry(&env->tlb_table[mmu_idx][page], addr);
+
+            /* check whether there are vltb entries that need to be flushed */
+            for (i = 0; i < CPU_VTLB_SIZE; i++) {
+                tlb_flush_entry(&env->tlb_v_table[mmu_idx][i], addr);
+            }
+        }
+    }
+
+    tb_flush_jmp_cache(cpu, addr);
+}
+
 void tlb_flush_page_by_mmuidx(CPUState *cpu, target_ulong addr, ...)
 {
     CPUArchState *env = cpu->env_ptr;
-    int i, k;
+    unsigned long mmu_idx_bitmap;
     va_list argp;

     va_start(argp, addr);
+    mmu_idx_bitmap = make_mmu_index_bitmap(argp);
+    va_end(argp);

-    assert_cpu_is_self(cpu);
     tlb_debug("addr "TARGET_FMT_lx"\n", addr);

     /* Check if we need to flush due to large pages.  */
@@ -216,33 +283,32 @@ void tlb_flush_page_by_mmuidx(CPUState *cpu, target_ulong addr, ...)
                  TARGET_FMT_lx "/" TARGET_FMT_lx ")\n",
                  env->tlb_flush_addr, env->tlb_flush_mask);
-        v_tlb_flush_by_mmuidx(cpu, argp);
-        va_end(argp);
+
+        if (!qemu_cpu_is_self(cpu)) {
+            uint16_t pending_flushes =
+                mmu_idx_bitmap & ~atomic_mb_read(&cpu->pending_tlb_flush);
+            if (pending_flushes) {
+                atomic_or(&cpu->pending_tlb_flush, pending_flushes);
+                async_run_on_cpu(cpu, tlb_flush_by_mmuidx_async_work,
+                                 GUINT_TO_POINTER(pending_flushes));
+            }
+        } else {
+            tlb_flush_by_mmuidx_async_work(cpu,
+                                           GUINT_TO_POINTER(mmu_idx_bitmap));
+        }
         return;
     }

+    /* This should already be page aligned */
     addr &= TARGET_PAGE_MASK;
-    i = (addr >> TARGET_PAGE_BITS) & (CPU_TLB_SIZE - 1);
+    addr |= mmu_idx_bitmap;

-    for (;;) {
-        int mmu_idx = va_arg(argp, int);
-
-        if (mmu_idx < 0) {
-            break;
-        }
-
-        tlb_debug("idx %d\n", mmu_idx);
-
-        tlb_flush_entry(&env->tlb_table[mmu_idx][i], addr);
-
-        /* check whether there are vltb entries that need to be flushed */
-        for (k = 0; k < CPU_VTLB_SIZE; k++) {
-            tlb_flush_entry(&env->tlb_v_table[mmu_idx][k], addr);
-        }
+    if (!qemu_cpu_is_self(cpu)) {
+        async_run_on_cpu(cpu, tlb_flush_page_by_mmuidx_async_work,
+                         GUINT_TO_POINTER(addr));
+    } else {
+        tlb_flush_page_by_mmuidx_async_work(cpu, GUINT_TO_POINTER(addr));
     }
-    va_end(argp);
-
-    tb_flush_jmp_cache(cpu, addr);
 }

 static void tlb_flush_page_async_work(CPUState *cpu, void *opaque)
diff --git a/include/qom/cpu.h b/include/qom/cpu.h
index 6b35f37..fcec99d 100644
--- a/include/qom/cpu.h
+++ b/include/qom/cpu.h
@@ -371,17 +371,17 @@ struct CPUState {
      */
     bool throttle_thread_scheduled;

+    /* The pending_tlb_flush flag is set and cleared atomically to
+     * avoid potential races. The aim of the flag is to avoid
+     * unnecessary flushes.
+     */
+    uint16_t pending_tlb_flush;
+
     /* Note that this is accessed at the start of every TB via a negative
        offset from AREG0.  Leave this field at the end so as to make the
        (absolute value) offset as small as possible.  This reduces code
        size, especially for hosts without large memory offsets.  */
     uint32_t tcg_exit_req;
-
-    /* The pending_tlb_flush flag is set and cleared atomically to
-     * avoid potential races. The aim of the flag is to avoid
-     * unnecessary flushes.
-     */
-    bool pending_tlb_flush;
 };

 QTAILQ_HEAD(CPUTailQ, CPUState);
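
To see the deduplication scheme of the patch in isolation, here is a minimal
standalone sketch. It uses plain C11 atomics rather than QEMU's atomic_mb_read/
atomic_or helpers, and the names pending_tlb_flush and queue_async_flush merely
stand in for the CPUState field and async_run_on_cpu(); it is an illustration,
not the patch's code. Each requested mmu index sets a bit in a pending mask,
and only bits that were not already pending generate new queued work, so a
storm of identical flush requests collapses into one queued flush per index.
As the commit message notes, the read-then-or is not a single atomic step, so
a racing requester may occasionally queue a redundant flush, which is harmless.

/* Standalone illustration of the pending-flush bitmap (not QEMU code). */
#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>

static _Atomic uint16_t pending_tlb_flush;

/* Stand-in for async_run_on_cpu(): just log the work we would queue. */
static void queue_async_flush(uint16_t idx_bitmap)
{
    printf("queue flush for mmuidx bitmap 0x%x\n", (unsigned)idx_bitmap);
}

/* Request a TLB flush of the given mmu indexes on a remote vCPU.  Only
 * the indexes that are not already pending generate new work.
 */
static void request_flush_by_mmuidx(uint16_t idx_bitmap)
{
    uint16_t not_yet_pending =
        idx_bitmap & (uint16_t)~atomic_load(&pending_tlb_flush);

    if (not_yet_pending) {
        atomic_fetch_or(&pending_tlb_flush, not_yet_pending);
        queue_async_flush(not_yet_pending);
    }
}

int main(void)
{
    request_flush_by_mmuidx(0x3); /* queues work for mmuidx 0 and 1 */
    request_flush_by_mmuidx(0x3); /* already pending: nothing queued */
    request_flush_by_mmuidx(0x6); /* only mmuidx 2 is new work */
    return 0;
}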
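
The page-flush path additionally packs two values into the single pointer-sized
argument async_run_on_cpu() can carry: the address is page aligned, so its low
TARGET_PAGE_BITS are free and the mmuidx bitmap rides in them, which is what the
QEMU_BUILD_BUG_ON(NB_MMU_MODES > TARGET_PAGE_BITS) guards. Below is a small
standalone sketch of that encoding with illustrative constants (a 4 KiB page and
4 MMU modes), again not QEMU's own definitions.

/* Standalone illustration of packing a page address and an mmuidx bitmap
 * into one value (not QEMU code; the constants are examples).
 */
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define TARGET_PAGE_BITS 12
#define TARGET_PAGE_MASK (~(uint64_t)((1u << TARGET_PAGE_BITS) - 1))
#define NB_MMU_MODES 4

/* Mirrors the patch's build-time check: the bitmap must fit below the
 * page offset or the encoding would corrupt the address.
 */
_Static_assert(NB_MMU_MODES <= TARGET_PAGE_BITS,
               "mmuidx bits must fit in the page offset");

static uint64_t pack(uint64_t addr, uint16_t mmuidx_bitmap)
{
    return (addr & TARGET_PAGE_MASK) | mmuidx_bitmap;
}

int main(void)
{
    uint64_t packed = pack(0x7f001234, 0x5);    /* mmuidx 0 and 2 */
    uint64_t addr = packed & TARGET_PAGE_MASK;  /* recover the page address */
    unsigned idxs = packed & ~TARGET_PAGE_MASK; /* recover the mmuidx bits */

    printf("page=%#llx mmuidx bitmap=%#x\n", (unsigned long long)addr, idxs);
    assert(addr == 0x7f001000 && idxs == 0x5);
    return 0;
}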