From patchwork Thu Feb 23 18:29:18 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?Alex_Benn=C3=A9e?= <alex.bennee@linaro.org>
X-Patchwork-Id: 94411
From: =?utf-8?q?Alex_Benn=C3=A9e?= <alex.bennee@linaro.org>
To: rth@twiddle.net,
	peter.maydell@linaro.org
Date: Thu, 23 Feb 2017 18:29:18 +0000
Message-Id: <20170223182927.7166-16-alex.bennee@linaro.org>
X-Mailer: git-send-email 2.11.0
In-Reply-To: <20170223182927.7166-1-alex.bennee@linaro.org>
References: <20170223182927.7166-1-alex.bennee@linaro.org>
MIME-Version: 1.0
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic]
X-Received-From: 2a00:1450:400c:c0c::235
Subject: [Qemu-devel] [PATCH v14 15/24] cputlb: introduce tlb_flush_* async
 work.
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.21
Precedence: list
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
 <mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel/>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
 <mailto:qemu-devel-request@nongnu.org?subject=subscribe>
Cc: mttcg@listserver.greensocs.com, nikunj@linux.vnet.ibm.com, Peter
 Crosthwaite <crosthwaite.peter@gmail.com>, jan.kiszka@siemens.com,
 mark.burton@greensocs.com, a.rigo@virtualopensystems.com,
 qemu-devel@nongnu.org, cota@braap.org, serge.fdrv@gmail.com,
 pbonzini@redhat.com, 	bobby.prani@gmail.com, =?utf-8?q?Alex_Benn=C3=A9e?=
 <alex.bennee@linaro.org>, bamvor.zhangjian@linaro.org,
 fred.konrad@greensocs.com
Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org
Sender: "Qemu-devel" <qemu-devel-bounces+patch=linaro.org@nongnu.org>

From: KONRAD Frederic <fred.konrad@greensocs.com>

Some architectures allow a vCPU to flush the TLB of other vCPUs. This is not a
problem when a single thread runs all vCPUs, but in true multithreaded mode the
flush must be deferred to the target vCPU as asynchronous work.
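
As a rough illustration of the deferral described above, the following
standalone C sketch models a per-vCPU pending flag with C11 atomics. It is not
the QEMU code: ToyCPU, toy_init, toy_request_flush and toy_drain are
hypothetical names, and a plain counter stands in for the async_run_on_cpu()
work queue.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <string.h>

#define TOY_TLB_SIZE 8

/* Hypothetical stand-in for a vCPU; not QEMU's CPUState. */
typedef struct ToyCPU {
    atomic_bool pending_flush;        /* true while a flush is queued */
    int queued_work;                  /* stand-in for the async work queue */
    int flushes_done;                 /* number of flushes that actually ran */
    unsigned char tlb[TOY_TLB_SIZE];  /* 0xff marks an invalid entry */
} ToyCPU;

static void toy_init(ToyCPU *cpu)
{
    atomic_init(&cpu->pending_flush, false);
    cpu->queued_work = 0;
    cpu->flushes_done = 0;
    memset(cpu->tlb, 0, sizeof(cpu->tlb));
}

/* Runs on the owning vCPU: invalidate everything, then clear the flag. */
static void toy_flush_now(ToyCPU *cpu)
{
    memset(cpu->tlb, 0xff, sizeof(cpu->tlb));
    cpu->flushes_done++;
    atomic_store(&cpu->pending_flush, false);
}

/* Request a flush. A remote caller queues work only if it is the one
 * that flips the flag from false to true; later requests coalesce. */
static void toy_request_flush(ToyCPU *cpu, bool caller_is_self)
{
    bool expected = false;

    if (caller_is_self) {
        toy_flush_now(cpu);
        return;
    }
    if (atomic_compare_exchange_strong(&cpu->pending_flush, &expected, true)) {
        cpu->queued_work++;
    }
}

/* The owning vCPU drains its queue at a safe point. */
static void toy_drain(ToyCPU *cpu)
{
    assert(cpu->queued_work >= 0);
    while (cpu->queued_work > 0) {
        cpu->queued_work--;
        toy_flush_now(cpu);
    }
}
```

In this model two back-to-back remote requests queue a single flush, which is
the saving the pending flag is meant to provide.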

We take the tb_lock() when doing this to avoid racing with other threads
which may be invalidating TBs at the same time. The alternative would
be to use proper atomic primitives to clear the TLB entries en masse.
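
The lock-free alternative mentioned above might look roughly like the
following standalone C11 sketch. It is only an illustration under assumptions:
toy_tlb, toy_tlb_set, toy_tlb_clear_all and toy_tlb_hit are hypothetical
names, each entry is reduced to a single 64-bit tag, and the release/acquire
orderings are a choice made for the example rather than anything this patch
specifies.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_TLB_ENTRIES 16
#define TOY_INVALID UINT64_MAX   /* all-ones, like memset(..., -1, ...) */

/* Hypothetical TLB: one atomic tag per entry. */
static _Atomic uint64_t toy_tlb[TOY_TLB_ENTRIES];

static void toy_tlb_set(size_t idx, uint64_t tag)
{
    assert(idx < TOY_TLB_ENTRIES);
    atomic_store_explicit(&toy_tlb[idx], tag, memory_order_release);
}

/* Invalidate every entry with per-entry atomic stores, no lock held. */
static void toy_tlb_clear_all(void)
{
    for (size_t i = 0; i < TOY_TLB_ENTRIES; i++) {
        atomic_store_explicit(&toy_tlb[i], TOY_INVALID,
                              memory_order_release);
    }
}

/* Readers pair the stores above with an atomic load. */
static bool toy_tlb_hit(size_t idx, uint64_t tag)
{
    assert(idx < TOY_TLB_ENTRIES);
    return atomic_load_explicit(&toy_tlb[idx], memory_order_acquire) == tag;
}
```

Each store is individually atomic, but the sweep as a whole is not, so a
concurrent reader may observe a mix of old and invalidated entries while it
runs.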

This patch doesn't do anything to protect other cputlb functions that make
cross-vCPU changes when called in MTTCG mode.

Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
[AJB: remove need for g_malloc on defer, make check fixes, tb_lock]
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>

---
v8
  - fix merge failure mentioning global flush
v6 (base patches)
  - don't use cmpxchg_bool (we drop it later anyway)
  - use RUN_ON_CPU macros instead of inlines
  - bug out of tlb_flush if !tcg_enabled() (MacOSX make check failure)
v5 (base patches)
  - take tb_lock() for memset
  - ensure tlb_flush_page properly asyncs work for other vCPUs
  - use run_on_cpu_data
v4 (base patches)
  - brought forward from arm enabling series
  - restore pending_tlb_flush flag
v1
  - Remove tlb_flush_all just do the check in tlb_flush.
  - remove the need to g_malloc
  - tlb_flush calls direct if !cpu->created
---
 cputlb.c                | 66 +++++++++++++++++++++++++++++++++++++++++++++++--
 include/exec/exec-all.h |  1 +
 include/qom/cpu.h       |  6 +++++
 3 files changed, 71 insertions(+), 2 deletions(-)

-- 
2.11.0

diff --git a/cputlb.c b/cputlb.c
index 94fa9977c5..5dfd3c3ba9 100644
--- a/cputlb.c
+++ b/cputlb.c
@@ -64,6 +64,10 @@
         }                                                         \
     } while (0)
 
+/* run_on_cpu_data.target_ptr should always be big enough for a
+ * target_ulong even on 32 bit builds */
+QEMU_BUILD_BUG_ON(sizeof(target_ulong) > sizeof(run_on_cpu_data));
+
 /* statistics */
 int tlb_flush_count;
 
@@ -72,13 +76,22 @@ int tlb_flush_count;
  * flushing more entries than required is only an efficiency issue,
  * not a correctness issue.
  */
-void tlb_flush(CPUState *cpu)
+static void tlb_flush_nocheck(CPUState *cpu)
 {
     CPUArchState *env = cpu->env_ptr;
 
+    /* The QOM tests will trigger tlb_flushes without setting up TCG
+     * so we bug out here in that case.
+     */
+    if (!tcg_enabled()) {
+        return;
+    }
+
     assert_cpu_is_self(cpu);
     tlb_debug("(count: %d)\n", tlb_flush_count++);
 
+    tb_lock();
+
     memset(env->tlb_table, -1, sizeof(env->tlb_table));
     memset(env->tlb_v_table, -1, sizeof(env->tlb_v_table));
     memset(cpu->tb_jmp_cache, 0, sizeof(cpu->tb_jmp_cache));
@@ -86,6 +99,27 @@ void tlb_flush(CPUState *cpu)
     env->vtlb_index = 0;
     env->tlb_flush_addr = -1;
     env->tlb_flush_mask = 0;
+
+    tb_unlock();
+
+    atomic_mb_set(&cpu->pending_tlb_flush, false);
+}
+
+static void tlb_flush_global_async_work(CPUState *cpu, run_on_cpu_data data)
+{
+    tlb_flush_nocheck(cpu);
+}
+
+void tlb_flush(CPUState *cpu)
+{
+    if (cpu->created && !qemu_cpu_is_self(cpu)) {
+        if (atomic_cmpxchg(&cpu->pending_tlb_flush, false, true) == false) {
+            async_run_on_cpu(cpu, tlb_flush_global_async_work,
+                             RUN_ON_CPU_NULL);
+        }
+    } else {
+        tlb_flush_nocheck(cpu);
+    }
 }
 
 static inline void v_tlb_flush_by_mmuidx(CPUState *cpu, va_list argp)
@@ -95,6 +129,8 @@ static inline void v_tlb_flush_by_mmuidx(CPUState *cpu, va_list argp)
     assert_cpu_is_self(cpu);
     tlb_debug("start\n");
 
+    tb_lock();
+
     for (;;) {
         int mmu_idx = va_arg(argp, int);
 
@@ -109,6 +145,8 @@ static inline void v_tlb_flush_by_mmuidx(CPUState *cpu, va_list argp)
     }
 
     memset(cpu->tb_jmp_cache, 0, sizeof(cpu->tb_jmp_cache));
+
+    tb_unlock();
 }
 
 void tlb_flush_by_mmuidx(CPUState *cpu, ...)
@@ -131,13 +169,15 @@ static inline void tlb_flush_entry(CPUTLBEntry *tlb_entry, target_ulong addr)
     }
 }
 
-void tlb_flush_page(CPUState *cpu, target_ulong addr)
+static void tlb_flush_page_async_work(CPUState *cpu, run_on_cpu_data data)
 {
     CPUArchState *env = cpu->env_ptr;
+    target_ulong addr = (target_ulong) data.target_ptr;
     int i;
     int mmu_idx;
 
     assert_cpu_is_self(cpu);
+
     tlb_debug("page :" TARGET_FMT_lx "\n", addr);
 
     /* Check if we need to flush due to large pages.  */
@@ -167,6 +207,18 @@ void tlb_flush_page(CPUState *cpu, target_ulong addr)
     tb_flush_jmp_cache(cpu, addr);
 }
 
+void tlb_flush_page(CPUState *cpu, target_ulong addr)
+{
+    tlb_debug("page :" TARGET_FMT_lx "\n", addr);
+
+    if (!qemu_cpu_is_self(cpu)) {
+        async_run_on_cpu(cpu, tlb_flush_page_async_work,
+                         RUN_ON_CPU_TARGET_PTR(addr));
+    } else {
+        tlb_flush_page_async_work(cpu, RUN_ON_CPU_TARGET_PTR(addr));
+    }
+}
+
 void tlb_flush_page_by_mmuidx(CPUState *cpu, target_ulong addr, ...)
 {
     CPUArchState *env = cpu->env_ptr;
@@ -213,6 +265,16 @@ void tlb_flush_page_by_mmuidx(CPUState *cpu, target_ulong addr, ...)
     tb_flush_jmp_cache(cpu, addr);
 }
 
+void tlb_flush_page_all(target_ulong addr)
+{
+    CPUState *cpu;
+
+    CPU_FOREACH(cpu) {
+        async_run_on_cpu(cpu, tlb_flush_page_async_work,
+                         RUN_ON_CPU_TARGET_PTR(addr));
+    }
+}
+
 /* update the TLBs so that writes to code in the virtual page 'addr'
    can be detected */
 void tlb_protect_code(ram_addr_t ram_addr)
diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index 82f0e12327..c694e3482b 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -158,6 +158,7 @@ void tlb_set_page(CPUState *cpu, target_ulong vaddr,
 void tb_invalidate_phys_addr(AddressSpace *as, hwaddr addr);
 void probe_write(CPUArchState *env, target_ulong addr, int mmu_idx,
                  uintptr_t retaddr);
+void tlb_flush_page_all(target_ulong addr);
 #else
 static inline void tlb_flush_page(CPUState *cpu, target_ulong addr)
 {
diff --git a/include/qom/cpu.h b/include/qom/cpu.h
index 10db89b16a..e80bf7a64a 100644
--- a/include/qom/cpu.h
+++ b/include/qom/cpu.h
@@ -402,6 +402,12 @@ struct CPUState {
 
     bool hax_vcpu_dirty;
     struct hax_vcpu_state *hax_vcpu;
+
+    /* The pending_tlb_flush flag is set and cleared atomically to
+     * avoid potential races. The aim of the flag is to avoid
+     * unnecessary flushes.
+     */
+    bool pending_tlb_flush;
 };
 
 QTAILQ_HEAD(CPUTailQ, CPUState);