From patchwork Fri Jun 15 19:43:39 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 138755 Delivered-To: patch@linaro.org Received: by 2002:a2e:970d:0:0:0:0:0 with SMTP id r13-v6csp1247810lji; Fri, 15 Jun 2018 12:48:07 -0700 (PDT) X-Google-Smtp-Source: ADUXVKJlpgEZ/4MrceX/3b6eeJpwSP8+ulwnUbfRRAKieNHq1TRP9lhjOVYIVQj3DBHJMOH8ILej X-Received: by 2002:a0c:cc91:: with SMTP id f17-v6mr2635533qvl.239.1529092087522; Fri, 15 Jun 2018 12:48:07 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1529092087; cv=none; d=google.com; s=arc-20160816; b=Qc559S10kMxIHTnotE78hbQSemEytqvluv43fOfOkgWhM8i7GuG4Ni7iSc9ITRc/S5 /nk90ZnNP7PykzxBKDJvMZ2K6TdIkA/JwnWt8gex3St6CQTYD6HdhcmjmHvQK7xmxmFk v7vtpZFAz9HRbk1eleocFz2TLJKuBRuAEIlSPKsnfDb89XKJfx5Wxm2AnH3SXIi6bTEU GSXqyR7stcK48wd21/QbQcVTfMjV8K95npdNp8Ms/veOTQRvFtTDYPQEl06MVtUsVtln Ymjw9rhqkSIXwLp3OXyeHvcLxoau3+FEDdsSnirRpecLqxrx7NCReGvvTXGo9/nwJKS3 ZS2Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject :content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:to:from:dkim-signature:arc-authentication-results; bh=2eXp/0v31+I85MHKNIFJ+BSApDB0NAmwq4blTyk6igY=; b=JiwaHhKJQkWQpLxQ290AYdtpjJWSdo8fll/1owfomYbLeEAv6kP7BunOx6DJF9+U6S sz0zepK2Esg8hhakRJqEFVwDJGRsuKkxx9dYJVoqJn8YWykXvPJ0iryiGQhpVZns3DZk gnoB5paHBAvss2C2zzidkW5BxrnaKtKl39gcLoRPVO6aHYrx2jqDKaS2vh2FzB419ovW vzTP28WI+tFP3/r99WmMK/+WIiggs1YtPE2WoQfyKJE6+rXJAioUzMxpy/yMV3f5+RC5 l+g4e2YZfaV/NPf7FogvnnG4l6wEeoYPWVKWfKdcYgz1t3N9AK2VkH1P/cWv/9lzxsbX wt8g== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=kLBt+GaJ; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [2001:4830:134:3::11]) by mx.google.com with ESMTPS id p5-v6si7769024qkc.230.2018.06.15.12.48.07 for (version=TLS1 cipher=AES128-SHA bits=128/128); Fri, 15 Jun 2018 12:48:07 -0700 (PDT) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) client-ip=2001:4830:134:3::11; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=kLBt+GaJ; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom="qemu-devel-bounces+patch=linaro.org@nongnu.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:48971 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fTuhb-0002o7-0E for patch@linaro.org; Fri, 15 Jun 2018 15:48:07 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:51521) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1fTudn-0000GL-6S for qemu-devel@nongnu.org; Fri, 15 Jun 2018 15:44:13 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1fTudk-0002RH-LJ for qemu-devel@nongnu.org; Fri, 15 Jun 2018 15:44:11 -0400 Received: from mail-pg0-x22e.google.com ([2607:f8b0:400e:c05::22e]:35222) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1fTudk-0002QY-AS for qemu-devel@nongnu.org; Fri, 15 Jun 2018 15:44:08 -0400 Received: by mail-pg0-x22e.google.com with SMTP id 15-v6so4858774pge.2 for ; Fri, 15 Jun 2018 12:44:08 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=2eXp/0v31+I85MHKNIFJ+BSApDB0NAmwq4blTyk6igY=; b=kLBt+GaJwQz5NC7TWXZdEPcYeCYnYUzZg/1A/lGg54chdV+uhrbSb9Ntmt95Q/M+Tf aFNTj3eB1qdMs/qj0SvBDt8FueciNHVuNWKregXghtg/weAWGGbP+3+yb3wLvxYWqtrw 35+UHXfuGsPn5poH+SEo9vyHqjEtvPNEd3scc= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=2eXp/0v31+I85MHKNIFJ+BSApDB0NAmwq4blTyk6igY=; b=FY8XZLaaYMlY7ZxKWyYL7gUlRwKlBolipOCW1ojSDDckFaxpUDH4At2eit+QD6/7fy hHou0kf2O4gB1rIVfUvBmw29hmVX/P+jpAN5YV3ONNIIhEIRLdFXh6ETVbTt1TkHVfYO /ms0usPwUbfmX/alypx2h2TPh8h0NLIaaiFNAt4joV5L2EPsZd7VKs9N8ehkJ5Kj0EcR uM7ffEtYGZW8Zq9UEGoYNryMHY41gWppR6eTP2vOzIVwpFuXmJnr9mOkUAs7smU2oXzs e83BqfvUVchnseSbLk250+H2IZk4wOMi7ZEn0kqbd7sEdHN4v4DRGvqvlzwCsYhG7VXv lsAg== X-Gm-Message-State: APt69E2iiqVzuRL4MPsTT4CebR1zMdUEvctEIbPQVDTW6uO+KsbrelFp EbFyXzL4pqaE+DfcQ0JwUU5buxWg5wg= X-Received: by 2002:a62:3101:: with SMTP id x1-v6mr3331117pfx.246.1529091846867; Fri, 15 Jun 2018 12:44:06 -0700 (PDT) Received: from cloudburst.twiddle.net (rrcs-173-198-77-219.west.biz.rr.com. [173.198.77.219]) by smtp.gmail.com with ESMTPSA id 29-v6sm14038360pfj.14.2018.06.15.12.44.05 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Fri, 15 Jun 2018 12:44:05 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Fri, 15 Jun 2018 09:43:39 -1000 Message-Id: <20180615194354.12489-5-richard.henderson@linaro.org> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180615194354.12489-1-richard.henderson@linaro.org> References: <20180615194354.12489-1-richard.henderson@linaro.org> MIME-Version: 1.0 X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c05::22e Subject: [Qemu-devel] [PULL v2 04/19] tcg: track TBs with per-region BST's X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org, "Emilio G. Cota" Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" From: "Emilio G. Cota" This paves the way for enabling scalable parallel generation of TCG code. Instead of tracking TBs with a single binary search tree (BST), use a BST for each TCG region, protecting it with a lock. This is as scalable as it gets, since each TCG thread operates on a separate region. The core of this change is the introduction of struct tcg_region_tree, which contains a pointer to a GTree and an associated lock to serialize accesses to it. We then allocate an array of tcg_region_tree's, adding the appropriate padding to avoid false sharing based on qemu_dcache_linesize. Given a tc_ptr, we first find the corresponding region_tree. This is done by special-casing the first and last regions first, since they might be of size != region.size; otherwise we just divide the offset by region.stride. I was worried about this division (several dozen cycles of latency), but profiling shows that this is not a fast path. Note that region.stride is not required to be a power of two; it is only required to be a multiple of the host's page size. Note that with this design we can also provide consistent snapshots about all region trees at once; for instance, tcg_tb_foreach acquires/releases all region_tree locks before/after iterating over them. For this reason we now drop tb_lock in dump_exec_info(). As an alternative I considered implementing a concurrent BST, but this can be tricky to get right, offers no consistent snapshots of the BST, and performance and scalability-wise I don't think it could ever beat having separate GTrees, given that our workload is insert-mostly (all concurrent BST designs I've seen focus, understandably, on making lookups fast, which comes at the expense of convoluted, non-wait-free insertions/removals). Reviewed-by: Richard Henderson Reviewed-by: Alex BennĂ©e Signed-off-by: Emilio G. Cota Signed-off-by: Richard Henderson --- include/exec/exec-all.h | 1 - include/exec/tb-context.h | 1 - tcg/tcg.h | 6 ++ accel/tcg/cpu-exec.c | 2 +- accel/tcg/translate-all.c | 101 +++----------------- tcg/tcg.c | 191 ++++++++++++++++++++++++++++++++++++++ 6 files changed, 213 insertions(+), 89 deletions(-) -- 2.17.1 diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h index 8bbea787a9..8d4306ac25 100644 --- a/include/exec/exec-all.h +++ b/include/exec/exec-all.h @@ -405,7 +405,6 @@ static inline uint32_t curr_cflags(void) | (use_icount ? CF_USE_ICOUNT : 0); } -void tb_remove(TranslationBlock *tb); void tb_flush(CPUState *cpu); void tb_phys_invalidate(TranslationBlock *tb, tb_page_addr_t page_addr); TranslationBlock *tb_htable_lookup(CPUState *cpu, target_ulong pc, diff --git a/include/exec/tb-context.h b/include/exec/tb-context.h index 1d41202485..d8472c88fb 100644 --- a/include/exec/tb-context.h +++ b/include/exec/tb-context.h @@ -31,7 +31,6 @@ typedef struct TBContext TBContext; struct TBContext { - GTree *tb_tree; struct qht htable; /* any access to the tbs or the page table must use this lock */ QemuMutex tb_lock; diff --git a/tcg/tcg.h b/tcg/tcg.h index 509f4d65d2..1e6df1906f 100644 --- a/tcg/tcg.h +++ b/tcg/tcg.h @@ -866,6 +866,12 @@ void tcg_region_reset_all(void); size_t tcg_code_size(void); size_t tcg_code_capacity(void); +void tcg_tb_insert(TranslationBlock *tb); +void tcg_tb_remove(TranslationBlock *tb); +TranslationBlock *tcg_tb_lookup(uintptr_t tc_ptr); +void tcg_tb_foreach(GTraverseFunc func, gpointer user_data); +size_t tcg_nb_tbs(void); + /* user-mode: Called with tb_lock held. */ static inline void *tcg_malloc(int size) { diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c index 6d6c51b686..7570c59f09 100644 --- a/accel/tcg/cpu-exec.c +++ b/accel/tcg/cpu-exec.c @@ -224,7 +224,7 @@ static void cpu_exec_nocache(CPUState *cpu, int max_cycles, tb_lock(); tb_phys_invalidate(tb, -1); - tb_remove(tb); + tcg_tb_remove(tb); tb_unlock(); } #endif diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c index 1695f8c352..ef841c82cc 100644 --- a/accel/tcg/translate-all.c +++ b/accel/tcg/translate-all.c @@ -205,8 +205,6 @@ void tb_lock_reset(void) } } -static TranslationBlock *tb_find_pc(uintptr_t tc_ptr); - void cpu_gen_init(void) { tcg_context_init(&tcg_init_ctx); @@ -375,13 +373,13 @@ bool cpu_restore_state(CPUState *cpu, uintptr_t host_pc, bool will_exit) if (check_offset < tcg_init_ctx.code_gen_buffer_size) { tb_lock(); - tb = tb_find_pc(host_pc); + tb = tcg_tb_lookup(host_pc); if (tb) { cpu_restore_state_from_tb(cpu, tb, host_pc, will_exit); if (tb->cflags & CF_NOCACHE) { /* one-shot translation, invalidate it immediately */ tb_phys_invalidate(tb, -1); - tb_remove(tb); + tcg_tb_remove(tb); } r = true; } @@ -728,48 +726,6 @@ static inline void *alloc_code_gen_buffer(void) } #endif /* USE_STATIC_CODE_GEN_BUFFER, WIN32, POSIX */ -/* compare a pointer @ptr and a tb_tc @s */ -static int ptr_cmp_tb_tc(const void *ptr, const struct tb_tc *s) -{ - if (ptr >= s->ptr + s->size) { - return 1; - } else if (ptr < s->ptr) { - return -1; - } - return 0; -} - -static gint tb_tc_cmp(gconstpointer ap, gconstpointer bp) -{ - const struct tb_tc *a = ap; - const struct tb_tc *b = bp; - - /* - * When both sizes are set, we know this isn't a lookup. - * This is the most likely case: every TB must be inserted; lookups - * are a lot less frequent. - */ - if (likely(a->size && b->size)) { - if (a->ptr > b->ptr) { - return 1; - } else if (a->ptr < b->ptr) { - return -1; - } - /* a->ptr == b->ptr should happen only on deletions */ - g_assert(a->size == b->size); - return 0; - } - /* - * All lookups have either .size field set to 0. - * From the glib sources we see that @ap is always the lookup key. However - * the docs provide no guarantee, so we just mark this case as likely. - */ - if (likely(a->size == 0)) { - return ptr_cmp_tb_tc(a->ptr, b); - } - return ptr_cmp_tb_tc(b->ptr, a); -} - static inline void code_gen_alloc(size_t tb_size) { tcg_ctx->code_gen_buffer_size = size_code_gen_buffer(tb_size); @@ -778,7 +734,6 @@ static inline void code_gen_alloc(size_t tb_size) fprintf(stderr, "Could not allocate dynamic translator buffer\n"); exit(1); } - tb_ctx.tb_tree = g_tree_new(tb_tc_cmp); qemu_mutex_init(&tb_ctx.tb_lock); } @@ -839,14 +794,6 @@ static TranslationBlock *tb_alloc(target_ulong pc) return tb; } -/* Called with tb_lock held. */ -void tb_remove(TranslationBlock *tb) -{ - assert_tb_locked(); - - g_tree_remove(tb_ctx.tb_tree, &tb->tc); -} - static inline void invalidate_page_bitmap(PageDesc *p) { #ifdef CONFIG_SOFTMMU @@ -911,10 +858,10 @@ static void do_tb_flush(CPUState *cpu, run_on_cpu_data tb_flush_count) } if (DEBUG_TB_FLUSH_GATE) { - size_t nb_tbs = g_tree_nnodes(tb_ctx.tb_tree); + size_t nb_tbs = tcg_nb_tbs(); size_t host_size = 0; - g_tree_foreach(tb_ctx.tb_tree, tb_host_size_iter, &host_size); + tcg_tb_foreach(tb_host_size_iter, &host_size); printf("qemu: flush code_size=%zu nb_tbs=%zu avg_tb_size=%zu\n", tcg_code_size(), nb_tbs, nb_tbs > 0 ? host_size / nb_tbs : 0); } @@ -923,10 +870,6 @@ static void do_tb_flush(CPUState *cpu, run_on_cpu_data tb_flush_count) cpu_tb_jmp_cache_clear(cpu); } - /* Increment the refcount first so that destroy acts as a reset */ - g_tree_ref(tb_ctx.tb_tree); - g_tree_destroy(tb_ctx.tb_tree); - qht_reset_size(&tb_ctx.htable, CODE_GEN_HTABLE_SIZE); page_flush_tb(); @@ -1406,7 +1349,7 @@ TranslationBlock *tb_gen_code(CPUState *cpu, * through the physical hash table and physical page list. */ tb_link_page(tb, phys_pc, phys_page2); - g_tree_insert(tb_ctx.tb_tree, &tb->tc, tb); + tcg_tb_insert(tb); return tb; } @@ -1510,7 +1453,7 @@ void tb_invalidate_phys_page_range(tb_page_addr_t start, tb_page_addr_t end, current_tb = NULL; if (cpu->mem_io_pc) { /* now we have a real cpu fault */ - current_tb = tb_find_pc(cpu->mem_io_pc); + current_tb = tcg_tb_lookup(cpu->mem_io_pc); } } if (current_tb == tb && @@ -1627,7 +1570,7 @@ static bool tb_invalidate_phys_page(tb_page_addr_t addr, uintptr_t pc) tb = p->first_tb; #ifdef TARGET_HAS_PRECISE_SMC if (tb && pc != 0) { - current_tb = tb_find_pc(pc); + current_tb = tcg_tb_lookup(pc); } if (cpu != NULL) { env = cpu->env_ptr; @@ -1670,18 +1613,6 @@ static bool tb_invalidate_phys_page(tb_page_addr_t addr, uintptr_t pc) } #endif -/* - * Find the TB 'tb' such that - * tb->tc.ptr <= tc_ptr < tb->tc.ptr + tb->tc.size - * Return NULL if not found. - */ -static TranslationBlock *tb_find_pc(uintptr_t tc_ptr) -{ - struct tb_tc s = { .ptr = (void *)tc_ptr }; - - return g_tree_lookup(tb_ctx.tb_tree, &s); -} - #if !defined(CONFIG_USER_ONLY) void tb_invalidate_phys_addr(AddressSpace *as, hwaddr addr, MemTxAttrs attrs) { @@ -1709,7 +1640,7 @@ void tb_check_watchpoint(CPUState *cpu) { TranslationBlock *tb; - tb = tb_find_pc(cpu->mem_io_pc); + tb = tcg_tb_lookup(cpu->mem_io_pc); if (tb) { /* We can use retranslation to find the PC. */ cpu_restore_state_from_tb(cpu, tb, cpu->mem_io_pc, true); @@ -1743,7 +1674,7 @@ void cpu_io_recompile(CPUState *cpu, uintptr_t retaddr) uint32_t n; tb_lock(); - tb = tb_find_pc(retaddr); + tb = tcg_tb_lookup(retaddr); if (!tb) { cpu_abort(cpu, "cpu_io_recompile: could not find TB for pc=%p", (void *)retaddr); @@ -1782,7 +1713,7 @@ void cpu_io_recompile(CPUState *cpu, uintptr_t retaddr) * cpu_exec_nocache() */ tb_phys_invalidate(tb->orig_tb, -1); } - tb_remove(tb); + tcg_tb_remove(tb); } /* TODO: If env->pc != tb->pc (i.e. the faulting instruction was not @@ -1853,6 +1784,7 @@ static void print_qht_statistics(FILE *f, fprintf_function cpu_fprintf, } struct tb_tree_stats { + size_t nb_tbs; size_t host_size; size_t target_size; size_t max_target_size; @@ -1866,6 +1798,7 @@ static gboolean tb_tree_stats_iter(gpointer key, gpointer value, gpointer data) const TranslationBlock *tb = value; struct tb_tree_stats *tst = data; + tst->nb_tbs++; tst->host_size += tb->tc.size; tst->target_size += tb->size; if (tb->size > tst->max_target_size) { @@ -1889,10 +1822,8 @@ void dump_exec_info(FILE *f, fprintf_function cpu_fprintf) struct qht_stats hst; size_t nb_tbs; - tb_lock(); - - nb_tbs = g_tree_nnodes(tb_ctx.tb_tree); - g_tree_foreach(tb_ctx.tb_tree, tb_tree_stats_iter, &tst); + tcg_tb_foreach(tb_tree_stats_iter, &tst); + nb_tbs = tst.nb_tbs; /* XXX: avoid using doubles ? */ cpu_fprintf(f, "Translation buffer state:\n"); /* @@ -1927,8 +1858,6 @@ void dump_exec_info(FILE *f, fprintf_function cpu_fprintf) cpu_fprintf(f, "TB invalidate count %d\n", tb_ctx.tb_phys_invalidate_count); cpu_fprintf(f, "TLB flush count %zu\n", tlb_flush_count()); tcg_dump_info(f, cpu_fprintf); - - tb_unlock(); } void dump_opcount_info(FILE *f, fprintf_function cpu_fprintf) @@ -2196,7 +2125,7 @@ int page_unprotect(target_ulong address, uintptr_t pc) * set the page to PAGE_WRITE and did the TB invalidate for us. */ #ifdef TARGET_HAS_PRECISE_SMC - TranslationBlock *current_tb = tb_find_pc(pc); + TranslationBlock *current_tb = tcg_tb_lookup(pc); if (current_tb) { current_tb_invalidated = tb_cflags(current_tb) & CF_INVALID; } diff --git a/tcg/tcg.c b/tcg/tcg.c index 6eeebe0624..62e3391020 100644 --- a/tcg/tcg.c +++ b/tcg/tcg.c @@ -135,6 +135,12 @@ static TCGContext **tcg_ctxs; static unsigned int n_tcg_ctxs; TCGv_env cpu_env = 0; +struct tcg_region_tree { + QemuMutex lock; + GTree *tree; + /* padding to avoid false sharing is computed at run-time */ +}; + /* * We divide code_gen_buffer into equally-sized "regions" that TCG threads * dynamically allocate from as demand dictates. Given appropriate region @@ -158,6 +164,13 @@ struct tcg_region_state { }; static struct tcg_region_state region; +/* + * This is an array of struct tcg_region_tree's, with padding. + * We use void * to simplify the computation of region_trees[i]; each + * struct is found every tree_size bytes. + */ +static void *region_trees; +static size_t tree_size; static TCGRegSet tcg_target_available_regs[TCG_TYPE_COUNT]; static TCGRegSet tcg_target_call_clobber_regs; @@ -295,6 +308,180 @@ TCGLabel *gen_new_label(void) #include "tcg-target.inc.c" +/* compare a pointer @ptr and a tb_tc @s */ +static int ptr_cmp_tb_tc(const void *ptr, const struct tb_tc *s) +{ + if (ptr >= s->ptr + s->size) { + return 1; + } else if (ptr < s->ptr) { + return -1; + } + return 0; +} + +static gint tb_tc_cmp(gconstpointer ap, gconstpointer bp) +{ + const struct tb_tc *a = ap; + const struct tb_tc *b = bp; + + /* + * When both sizes are set, we know this isn't a lookup. + * This is the most likely case: every TB must be inserted; lookups + * are a lot less frequent. + */ + if (likely(a->size && b->size)) { + if (a->ptr > b->ptr) { + return 1; + } else if (a->ptr < b->ptr) { + return -1; + } + /* a->ptr == b->ptr should happen only on deletions */ + g_assert(a->size == b->size); + return 0; + } + /* + * All lookups have either .size field set to 0. + * From the glib sources we see that @ap is always the lookup key. However + * the docs provide no guarantee, so we just mark this case as likely. + */ + if (likely(a->size == 0)) { + return ptr_cmp_tb_tc(a->ptr, b); + } + return ptr_cmp_tb_tc(b->ptr, a); +} + +static void tcg_region_trees_init(void) +{ + size_t i; + + tree_size = ROUND_UP(sizeof(struct tcg_region_tree), qemu_dcache_linesize); + region_trees = qemu_memalign(qemu_dcache_linesize, region.n * tree_size); + for (i = 0; i < region.n; i++) { + struct tcg_region_tree *rt = region_trees + i * tree_size; + + qemu_mutex_init(&rt->lock); + rt->tree = g_tree_new(tb_tc_cmp); + } +} + +static struct tcg_region_tree *tc_ptr_to_region_tree(void *p) +{ + size_t region_idx; + + if (p < region.start_aligned) { + region_idx = 0; + } else { + ptrdiff_t offset = p - region.start_aligned; + + if (offset > region.stride * (region.n - 1)) { + region_idx = region.n - 1; + } else { + region_idx = offset / region.stride; + } + } + return region_trees + region_idx * tree_size; +} + +void tcg_tb_insert(TranslationBlock *tb) +{ + struct tcg_region_tree *rt = tc_ptr_to_region_tree(tb->tc.ptr); + + qemu_mutex_lock(&rt->lock); + g_tree_insert(rt->tree, &tb->tc, tb); + qemu_mutex_unlock(&rt->lock); +} + +void tcg_tb_remove(TranslationBlock *tb) +{ + struct tcg_region_tree *rt = tc_ptr_to_region_tree(tb->tc.ptr); + + qemu_mutex_lock(&rt->lock); + g_tree_remove(rt->tree, &tb->tc); + qemu_mutex_unlock(&rt->lock); +} + +/* + * Find the TB 'tb' such that + * tb->tc.ptr <= tc_ptr < tb->tc.ptr + tb->tc.size + * Return NULL if not found. + */ +TranslationBlock *tcg_tb_lookup(uintptr_t tc_ptr) +{ + struct tcg_region_tree *rt = tc_ptr_to_region_tree((void *)tc_ptr); + TranslationBlock *tb; + struct tb_tc s = { .ptr = (void *)tc_ptr }; + + qemu_mutex_lock(&rt->lock); + tb = g_tree_lookup(rt->tree, &s); + qemu_mutex_unlock(&rt->lock); + return tb; +} + +static void tcg_region_tree_lock_all(void) +{ + size_t i; + + for (i = 0; i < region.n; i++) { + struct tcg_region_tree *rt = region_trees + i * tree_size; + + qemu_mutex_lock(&rt->lock); + } +} + +static void tcg_region_tree_unlock_all(void) +{ + size_t i; + + for (i = 0; i < region.n; i++) { + struct tcg_region_tree *rt = region_trees + i * tree_size; + + qemu_mutex_unlock(&rt->lock); + } +} + +void tcg_tb_foreach(GTraverseFunc func, gpointer user_data) +{ + size_t i; + + tcg_region_tree_lock_all(); + for (i = 0; i < region.n; i++) { + struct tcg_region_tree *rt = region_trees + i * tree_size; + + g_tree_foreach(rt->tree, func, user_data); + } + tcg_region_tree_unlock_all(); +} + +size_t tcg_nb_tbs(void) +{ + size_t nb_tbs = 0; + size_t i; + + tcg_region_tree_lock_all(); + for (i = 0; i < region.n; i++) { + struct tcg_region_tree *rt = region_trees + i * tree_size; + + nb_tbs += g_tree_nnodes(rt->tree); + } + tcg_region_tree_unlock_all(); + return nb_tbs; +} + +static void tcg_region_tree_reset_all(void) +{ + size_t i; + + tcg_region_tree_lock_all(); + for (i = 0; i < region.n; i++) { + struct tcg_region_tree *rt = region_trees + i * tree_size; + + /* Increment the refcount first so that destroy acts as a reset */ + g_tree_ref(rt->tree); + g_tree_destroy(rt->tree); + } + tcg_region_tree_unlock_all(); +} + static void tcg_region_bounds(size_t curr_region, void **pstart, void **pend) { void *start, *end; @@ -380,6 +567,8 @@ void tcg_region_reset_all(void) g_assert(!err); } qemu_mutex_unlock(®ion.lock); + + tcg_region_tree_reset_all(); } #ifdef CONFIG_USER_ONLY @@ -496,6 +685,8 @@ void tcg_region_init(void) g_assert(!rc); } + tcg_region_trees_init(); + /* In user-mode we support only one ctx, so do the initial allocation now */ #ifdef CONFIG_USER_ONLY {