From patchwork Fri Jan 27 10:39:16 2017
X-Patchwork-Submitter: Alex Bennée
X-Patchwork-Id: 92647
From: Alex Bennée
To: mttcg@listserver.greensocs.com, qemu-devel@nongnu.org,
 fred.konrad@greensocs.com, a.rigo@virtualopensystems.com, cota@braap.org,
 bobby.prani@gmail.com, nikunj@linux.vnet.ibm.com
Cc: peter.maydell@linaro.org, claudio.fontana@huawei.com, Peter Crosthwaite,
 jan.kiszka@siemens.com, mark.burton@greensocs.com, serge.fdrv@gmail.com,
 pbonzini@redhat.com, Alex Bennée, bamvor.zhangjian@linaro.org, rth@twiddle.net
Date: Fri, 27 Jan 2017 10:39:16 +0000
Message-Id: <20170127103922.19658-20-alex.bennee@linaro.org>
In-Reply-To: <20170127103922.19658-1-alex.bennee@linaro.org>
References: <20170127103922.19658-1-alex.bennee@linaro.org>
Subject: [Qemu-devel] [PATCH v8 19/25] cputlb: introduce tlb_flush_*_all_cpus[_synced]

This introduces support to the cputlb API for flushing all CPUs' TLBs
with one call. This avoids the need for target helpers to iterate
through the vCPUs themselves.

An additional variant of the API (_synced) does not return to the
caller and causes the work to be scheduled as "safe work". The result
is that all flush operations will be complete by the time the
originating vCPU starts executing again. It is up to the caller to
ensure enough state has been saved so execution can be restarted at
the next appropriate instruction.

Some guest architectures can defer completion of flush operations
until later. If they later schedule work using the async_safe_work
mechanism they can be sure other vCPUs will have flushed their TLBs by
the time execution returns from the safe work.

Signed-off-by: Alex Bennée

---
v7
  - some checkpatch long line fixes
v8
  - change from varg to bitmap calling convention
  - add _synced variants, re-factored helper
---
 cputlb.c                | 110 +++++++++++++++++++++++++++++++++++++++++++---
 include/exec/exec-all.h | 114 ++++++++++++++++++++++++++++++++++++++++++++++--
 2 files changed, 215 insertions(+), 9 deletions(-)

--
2.11.0

Reviewed-by: Richard Henderson
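As an illustration of the intended calling convention (this sketch is not
part of the patch; the function name and MMU index choices are invented
for the example), a target front end that previously looped over every
vCPU could now issue a cross-vCPU flush in one call:

    /* Hypothetical target helper: flush one page from every vCPU's TLB
     * for MMU indexes 0 and 1, then force a synchronisation point.  The
     * _synced call queues the flush on the other vCPUs as "safe work"
     * and does not return; the originating vCPU resumes at the next
     * instruction once all vCPUs have executed the flush, so the caller
     * must already have saved enough state to restart there.
     */
    static void example_tlbi_page_sync(CPUState *src_cpu, target_ulong addr)
    {
        uint16_t idxmap = (1 << 0) | (1 << 1);   /* bitmap of MMU indexes */

        tlb_flush_page_by_mmuidx_all_cpus_synced(src_cpu,
                                                 addr & TARGET_PAGE_MASK,
                                                 idxmap);
        /* not reached */
    }

The non-_synced variants (e.g. tlb_flush_page_all_cpus) behave the same
way but return normally, leaving any synchronisation to the caller.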
diff --git a/cputlb.c b/cputlb.c
index 65003350e3..7f9a54f253 100644
--- a/cputlb.c
+++ b/cputlb.c
@@ -73,6 +73,25 @@ QEMU_BUILD_BUG_ON(sizeof(target_ulong) > sizeof(run_on_cpu_data));
 QEMU_BUILD_BUG_ON(NB_MMU_MODES > 16);
 #define ALL_MMUIDX_BITS ((1 << NB_MMU_MODES) - 1)
 
+/* flush_all_helper: run fn across all cpus
+ *
+ * If the wait flag is set then the src cpu's helper will be queued as
+ * "safe" work and the loop exited creating a synchronisation point
+ * where all queued work will be finished before execution starts
+ * again.
+ */
+static void flush_all_helper(CPUState *src, run_on_cpu_func fn,
+                             run_on_cpu_data d)
+{
+    CPUState *cpu;
+
+    CPU_FOREACH(cpu) {
+        if (cpu != src) {
+            async_run_on_cpu(cpu, fn, d);
+        }
+    }
+}
+
 /* statistics */
 int tlb_flush_count;
 
@@ -128,6 +147,19 @@ void tlb_flush(CPUState *cpu)
     }
 }
 
+void tlb_flush_all_cpus(CPUState *src_cpu)
+{
+    flush_all_helper(src_cpu, tlb_flush_global_async_work, RUN_ON_CPU_NULL);
+    tlb_flush_global_async_work(src_cpu, RUN_ON_CPU_NULL);
+}
+
+void QEMU_NORETURN tlb_flush_all_cpus_synced(CPUState *src_cpu)
+{
+    flush_all_helper(src_cpu, tlb_flush_global_async_work, RUN_ON_CPU_NULL);
+    tlb_flush_global_async_work(src_cpu, RUN_ON_CPU_NULL);
+    cpu_loop_exit(src_cpu);
+}
+
 static void tlb_flush_by_mmuidx_async_work(CPUState *cpu, run_on_cpu_data data)
 {
     CPUArchState *env = cpu->env_ptr;
@@ -178,6 +210,30 @@ void tlb_flush_by_mmuidx(CPUState *cpu, uint16_t idxmap)
     }
 }
 
+void tlb_flush_by_mmuidx_all_cpus(CPUState *src_cpu, uint16_t idxmap)
+{
+    const run_on_cpu_func fn = tlb_flush_by_mmuidx_async_work;
+
+    tlb_debug("mmu_idx: 0x%"PRIx16"\n", idxmap);
+
+    flush_all_helper(src_cpu, fn, RUN_ON_CPU_HOST_INT(idxmap));
+    fn(src_cpu, RUN_ON_CPU_HOST_INT(idxmap));
+}
+
+void QEMU_NORETURN tlb_flush_by_mmuidx_all_cpus_synced(CPUState *src_cpu,
+                                                       uint16_t idxmap)
+{
+    const run_on_cpu_func fn = tlb_flush_by_mmuidx_async_work;
+
+    tlb_debug("mmu_idx: 0x%"PRIx16"\n", idxmap);
+
+    flush_all_helper(src_cpu, fn, RUN_ON_CPU_HOST_INT(idxmap));
+    async_safe_run_on_cpu(src_cpu, fn, RUN_ON_CPU_HOST_INT(idxmap));
+    cpu_loop_exit(src_cpu);
+}
+
+
+
 static inline void tlb_flush_entry(CPUTLBEntry *tlb_entry, target_ulong addr)
 {
     if (addr == (tlb_entry->addr_read &
@@ -317,14 +373,56 @@ void tlb_flush_page_by_mmuidx(CPUState *cpu, target_ulong addr, uint16_t idxmap)
     }
 }
 
-void tlb_flush_page_all(target_ulong addr)
+void tlb_flush_page_by_mmuidx_all_cpus(CPUState *src_cpu, target_ulong addr,
+                                       uint16_t idxmap)
 {
-    CPUState *cpu;
+    const run_on_cpu_func fn = tlb_check_page_and_flush_by_mmuidx_async_work;
+    target_ulong addr_and_mmu_idx;
 
-    CPU_FOREACH(cpu) {
-        async_run_on_cpu(cpu, tlb_flush_page_async_work,
-                         RUN_ON_CPU_TARGET_PTR(addr));
-    }
+    tlb_debug("addr: "TARGET_FMT_lx" mmu_idx:%"PRIx16"\n", addr, idxmap);
+
+    /* This should already be page aligned */
+    addr_and_mmu_idx = addr & TARGET_PAGE_MASK;
+    addr_and_mmu_idx |= idxmap;
+
+    flush_all_helper(src_cpu, fn, RUN_ON_CPU_TARGET_PTR(addr_and_mmu_idx));
+    fn(src_cpu, RUN_ON_CPU_TARGET_PTR(addr_and_mmu_idx));
+}
+
+void QEMU_NORETURN tlb_flush_page_by_mmuidx_all_cpus_synced(CPUState *src_cpu,
+                                                            target_ulong addr,
+                                                            uint16_t idxmap)
+{
+    const run_on_cpu_func fn = tlb_check_page_and_flush_by_mmuidx_async_work;
+    target_ulong addr_and_mmu_idx;
+
+    tlb_debug("addr: "TARGET_FMT_lx" mmu_idx:%"PRIx16"\n", addr, idxmap);
+
+    /* This should already be page aligned */
+    addr_and_mmu_idx = addr & TARGET_PAGE_MASK;
+    addr_and_mmu_idx |= idxmap;
+
+    flush_all_helper(src_cpu, fn, RUN_ON_CPU_TARGET_PTR(addr_and_mmu_idx));
+    async_safe_run_on_cpu(src_cpu, fn, RUN_ON_CPU_TARGET_PTR(addr_and_mmu_idx));
+    cpu_loop_exit(src_cpu);
+}
+
+void tlb_flush_page_all_cpus(CPUState *src, target_ulong addr)
+{
+    const run_on_cpu_func fn = tlb_flush_page_async_work;
+
+    flush_all_helper(src, fn, RUN_ON_CPU_TARGET_PTR(addr));
+    fn(src, RUN_ON_CPU_TARGET_PTR(addr));
+}
+
+void QEMU_NORETURN tlb_flush_page_all_cpus_synced(CPUState *src,
+                                                  target_ulong addr)
+{
+    const run_on_cpu_func fn = tlb_flush_page_async_work;
+
+    flush_all_helper(src, fn, RUN_ON_CPU_TARGET_PTR(addr));
+    async_safe_run_on_cpu(src, fn, RUN_ON_CPU_TARGET_PTR(addr));
+    cpu_loop_exit(src);
 }
 
 /* update the TLBs so that writes to code in the virtual page 'addr'
diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index a6c17ed74a..cffed39fe9 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -93,6 +93,27 @@ void cpu_address_space_init(CPUState *cpu, AddressSpace *as, int asidx);
  */
 void tlb_flush_page(CPUState *cpu, target_ulong addr);
 /**
+ * tlb_flush_page_all_cpus:
+ * @cpu: src CPU of the flush
+ * @addr: virtual address of page to be flushed
+ *
+ * Flush one page from the TLB of the specified CPU, for all
+ * MMU indexes.
+ */
+void tlb_flush_page_all_cpus(CPUState *src, target_ulong addr);
+/**
+ * tlb_flush_page_all_cpus_synced:
+ * @cpu: src CPU of the flush
+ * @addr: virtual address of page to be flushed
+ *
+ * Flush one page from the TLB of the specified CPU, for all
+ * MMU indexes like tlb_flush_page_all_cpus except this function does
+ * not return and will restart the run-loop once all CPUs have
+ * executed the flush.
+ */
+void QEMU_NORETURN tlb_flush_page_all_cpus_synced(CPUState *src,
+                                                  target_ulong addr);
+/**
  * tlb_flush:
  * @cpu: CPU whose TLB should be flushed
  *
@@ -103,6 +124,19 @@ void tlb_flush_page(CPUState *cpu, target_ulong addr);
  */
 void tlb_flush(CPUState *cpu);
 /**
+ * tlb_flush_all_cpus:
+ * @cpu: src CPU of the flush
+ */
+void tlb_flush_all_cpus(CPUState *src_cpu);
+/**
+ * tlb_flush_all_cpus_synced:
+ * @cpu: src CPU of the flush
+ *
+ * Like tlb_flush_all_cpus except this function does not return and
+ * will restart the run-loop once all CPUs have executed the flush.
+ */
+void QEMU_NORETURN tlb_flush_all_cpus_synced(CPUState *src_cpu);
+/**
  * tlb_flush_page_by_mmuidx:
  * @cpu: CPU whose TLB should be flushed
  * @addr: virtual address of page to be flushed
@@ -114,8 +148,34 @@ void tlb_flush(CPUState *cpu);
 void tlb_flush_page_by_mmuidx(CPUState *cpu, target_ulong addr,
                               uint16_t idxmap);
 /**
+ * tlb_flush_page_by_mmuidx_all_cpus:
+ * @cpu: Originating CPU of the flush
+ * @addr: virtual address of page to be flushed
+ * @idxmap: bitmap of MMU indexes to flush
+ *
+ * Flush one page from the TLB of all CPUs, for the specified
+ * MMU indexes.
+ */
+void tlb_flush_page_by_mmuidx_all_cpus(CPUState *cpu, target_ulong addr,
+                                       uint16_t idxmap);
+/**
+ * tlb_flush_page_by_mmuidx_all_cpus_synced:
+ * @cpu: Originating CPU of the flush
+ * @addr: virtual address of page to be flushed
+ * @idxmap: bitmap of MMU indexes to flush
+ *
+ * Flush one page from the TLB of all CPUs, for the specified MMU
+ * indexes like tlb_flush_page_by_mmuidx_all_cpus except this function
+ * does not return and will restart the run-loop once all CPUs have
+ * executed the flush.
+ */
+void QEMU_NORETURN tlb_flush_page_by_mmuidx_all_cpus_synced(CPUState *cpu,
+                                                            target_ulong addr,
+                                                            uint16_t idxmap);
+/**
  * tlb_flush_by_mmuidx:
  * @cpu: CPU whose TLB should be flushed
+ * @wait: If true ensure synchronisation by exiting the cpu_loop
  * @idxmap: bitmap of MMU indexes to flush
  *
  * Flush all entries from the TLB of the specified CPU, for the specified
@@ -123,6 +183,27 @@ void tlb_flush_page_by_mmuidx(CPUState *cpu, target_ulong addr,
  */
 void tlb_flush_by_mmuidx(CPUState *cpu, uint16_t idxmap);
 /**
+ * tlb_flush_by_mmuidx_all_cpus:
+ * @cpu: Originating CPU of the flush
+ * @idxmap: bitmap of MMU indexes to flush
+ *
+ * Flush all entries from all TLBs of all CPUs, for the specified
+ * MMU indexes.
+ */
+void tlb_flush_by_mmuidx_all_cpus(CPUState *cpu, uint16_t idxmap);
+/**
+ * tlb_flush_by_mmuidx_all_cpus_synced:
+ * @cpu: Originating CPU of the flush
+ * @idxmap: bitmap of MMU indexes to flush
+ *
+ * Flush all entries from all TLBs of all CPUs, for the specified
+ * MMU indexes like tlb_flush_by_mmuidx_all_cpus except this function
+ * does not return and will restart the run-loop once all CPUs have
+ * executed the flush.
+ */
+void QEMU_NORETURN tlb_flush_by_mmuidx_all_cpus_synced(CPUState *cpu,
+                                                       uint16_t idxmap);
+/**
  * tlb_set_page_with_attrs:
  * @cpu: CPU to add this TLB entry for
  * @vaddr: virtual address of page to add entry for
@@ -159,16 +240,26 @@ void tlb_set_page(CPUState *cpu, target_ulong vaddr,
 void tb_invalidate_phys_addr(AddressSpace *as, hwaddr addr);
 void probe_write(CPUArchState *env, target_ulong addr, int mmu_idx,
                  uintptr_t retaddr);
-void tlb_flush_page_all(target_ulong addr);
 #else
 static inline void tlb_flush_page(CPUState *cpu, target_ulong addr)
 {
 }
-
+static inline void tlb_flush_page_all_cpus(CPUState *src, target_ulong addr)
+{
+}
+static inline void tlb_flush_page_all_cpus_synced(CPUState *src,
+                                                  target_ulong addr)
+{
+}
 static inline void tlb_flush(CPUState *cpu)
 {
 }
-
+static inline void tlb_flush_all_cpus(CPUState *src_cpu)
+{
+}
+static inline void tlb_flush_all_cpus_synced(CPUState *src_cpu)
+{
+}
 static inline void tlb_flush_page_by_mmuidx(CPUState *cpu,
                                             target_ulong addr, uint16_t idxmap)
 {
@@ -177,6 +268,23 @@ static inline void tlb_flush_page_by_mmuidx(CPUState *cpu,
 static inline void tlb_flush_by_mmuidx(CPUState *cpu, uint16_t idxmap)
 {
 }
+static inline void tlb_flush_page_by_mmuidx_all_cpus(CPUState *cpu,
+                                                     target_ulong addr,
+                                                     uint16_t idxmap)
+{
+}
+static inline void tlb_flush_page_by_mmuidx_all_cpus_synced(CPUState *cpu,
+                                                            target_ulong addr,
+                                                            uint16_t idxmap)
+{
+}
+static inline void tlb_flush_by_mmuidx_all_cpus(CPUState *cpu, uint16_t idxmap)
+{
+}
+static inline void tlb_flush_by_mmuidx_all_cpus_synced(CPUState *cpu,
+                                                       uint16_t idxmap)
+{
+}
 #endif
 
 #define CODE_GEN_ALIGN 16 /* must be >= of the size of a icache line */
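The deferred-completion pattern described in the commit message could
look roughly like this from a target's point of view (all names below
are invented for the sketch and are not part of the patch):

    /* On the guest's invalidate operation the flush is only queued on
     * every vCPU; nothing waits for it here.
     */
    static void example_tlbi_all(CPUState *src_cpu)
    {
        tlb_flush_all_cpus(src_cpu);
    }

    /* Runs as "safe work" once all vCPUs are out of their execution
     * loops; per the commit message, the earlier queued flushes are
     * guaranteed to have been executed by the time execution returns
     * from this safe work.
     */
    static void example_barrier_work(CPUState *cpu, run_on_cpu_data data)
    {
    }

    /* On the guest's completion barrier, schedule the safe work and exit
     * the cpu loop (as the _synced helpers do), having saved enough
     * state to restart at the next instruction.
     */
    static void example_completion_barrier(CPUState *src_cpu)
    {
        async_safe_run_on_cpu(src_cpu, example_barrier_work, RUN_ON_CPU_NULL);
        cpu_loop_exit(src_cpu);
    }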