From patchwork Fri Feb 26 13:15:33 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?Alex_Benn=C3=A9e?=
X-Patchwork-Id: 63050
Delivered-To: patch@linaro.org
From: =?UTF-8?q?Alex=20Benn=C3=A9e?=
To: mttcg@listserver.greensocs.com, mark.burton@greensocs.com, fred.konrad@greensocs.com, a.rigo@virtualopensystems.com
Date: Fri, 26 Feb 2016 13:15:33 +0000
Message-Id: <1456492533-17171-12-git-send-email-alex.bennee@linaro.org>
X-Mailer: git-send-email 2.7.1
In-Reply-To: <1456492533-17171-1-git-send-email-alex.bennee@linaro.org>
References:
 <1456492533-17171-1-git-send-email-alex.bennee@linaro.org>
MIME-Version: 1.0
Cc: peter.maydell@linaro.org, drjones@redhat.com, a.spyridakis@virtualopensystems.com, claudio.fontana@huawei.com, qemu-devel@nongnu.org, will.deacon@arm.com, crosthwaitepeter@gmail.com, pbonzini@redhat.com, =?UTF-8?q?Alex=20Benn=C3=A9e?=, aurelien@aurel32.net, rth@twiddle.net
Subject: [Qemu-devel] [RFC 11/11] arm/tcg-test: some basic TCG exercising tests
Sender: qemu-devel-bounces+patch=linaro.org@nongnu.org

These tests are not really aimed at KVM at all but exist to stretch
QEMU's TCG code generator. In particular these exercise the ability of
the TCG to:

  * Chain TranslationBlocks together (tight)
  * Handle heavy usage of the tb_jump_cache (paged)
  * Handle the pathological case of computed local jumps (computed)

In addition the tests can be varied by adding IPI IRQs or SMC sequences
into the mix to stress the tcg_exit and invalidation mechanisms.

Signed-off-by: Alex Bennée
---
 arm/tcg-test-asm.S           | 170 +++++++++++++++++++++++++++++
 arm/tcg-test.c               | 248 +++++++++++++++++++++++++++++++++++++++++++
 arm/unittests.cfg            |  90 ++++++++++++++++
 config/config-arm-common.mak |   2 +
 4 files changed, 510 insertions(+)
 create mode 100644 arm/tcg-test-asm.S
 create mode 100644 arm/tcg-test.c

--
2.7.1

diff --git a/arm/tcg-test-asm.S b/arm/tcg-test-asm.S
new file mode 100644
index 0000000..6e823b7
--- /dev/null
+++ b/arm/tcg-test-asm.S
@@ -0,0 +1,170 @@
+/*
+ * TCG Test assembler functions for armv7 tests.
+ *
+ * Copyright (C) 2016, Linaro Ltd, Alex Bennée
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2.
+ *
+ * These helper functions are written in pure asm to control the size
+ * of the basic blocks and ensure they fit neatly into page
+ * aligned chunks. The pattern of branches they follow is determined by
+ * the 32 bit seed they are passed. It should be the same for each set.
+ *
+ * Calling convention
+ *  - r0, iterations
+ *  - r1, jump pattern
+ *  - r2-r3, scratch
+ *
+ * Returns r0
+ */
+
+.arm
+
+.section .text
+
+/* Tight - all blocks should quickly be patched and should run
+ * very fast unless irqs or smc gets in the way
+ */
+
+.global tight_start
+tight_start:
+        subs    r0, r0, #1
+        beq     tight_end
+
+        ror     r1, r1, #1
+        tst     r1, #1
+        beq     tightA
+        b       tight_start
+
+tightA:
+        subs    r0, r0, #1
+        beq     tight_end
+
+        ror     r1, r1, #1
+        tst     r1, #1
+        beq     tightB
+        b       tight_start
+
+tightB:
+        subs    r0, r0, #1
+        beq     tight_end
+
+        ror     r1, r1, #1
+        tst     r1, #1
+        beq     tight_start
+        b       tightA
+
+.global tight_end
+tight_end:
+        mov     pc, lr
+
+/*
+ * Computed jumps cannot be hardwired into the basic blocks so each one
+ * will cause an exit for the main execution loop to look up the next block.
+ *
+ * There is some caching which should ameliorate the cost a little.
+ */
+
+        /* Align to 1 << 13 == 8192 bytes (a multiple of the 4096 byte page size) */
+        .align 13
+        .global computed_start
+computed_start:
+        subs    r0, r0, #1
+        beq     computed_end
+
+        /* Jump table */
+        ror     r1, r1, #1
+        and     r2, r1, #1
+        adr     r3, computed_jump_table
+        ldr     r2, [r3, r2, lsl #2]
+        mov     pc, r2
+
+        b       computed_err
+
+computed_jump_table:
+        .word   computed_start
+        .word   computedA
+
+computedA:
+        subs    r0, r0, #1
+        beq     computed_end
+
+        /* Jump into code */
+        ror     r1, r1, #1
+        and     r2, r1, #1
+        adr     r3, 1f
+        add     r3, r2, lsl #2
+        mov     pc, r3
+1:      b       computed_start
+        b       computedB
+
+        b       computed_err
+
+
+computedB:
+        subs    r0, r0, #1
+        beq     computed_end
+        ror     r1, r1, #1
+
+        /* Conditional register load */
+        adr     r3, computedA
+        tst     r1, #1
+        adreq   r3, computed_start
+        mov     pc, r3
+
+        b       computed_err
+
+computed_err:
+        mov     r0, #1
+        .global computed_end
+computed_end:
+        mov     pc, lr
+
+
+/*
+ * Page hopping
+ *
+ * Each block is in a different page, hence the blocks never get joined
+ */
+        /* Align to 1 << 13 == 8192 bytes (a multiple of the 4096 byte page size) */
+        .align 13
+        .global paged_start
+paged_start:
+        subs    r0, r0, #1
+        beq     paged_end
+
+        ror     r1, r1, #1
+        tst     r1, #1
+        beq     pagedA
+        b       paged_start
+
+        /* Align to 1 << 13 == 8192 bytes (a multiple of the 4096 byte page size) */
+        .align 13
+pagedA:
+        subs    r0, r0, #1
+        beq     paged_end
+
+        ror     r1, r1, #1
+        tst     r1, #1
+        beq     pagedB
+        b       paged_start
+
+        /* Align to 1 << 13 == 8192 bytes (a multiple of the 4096 byte page size) */
+        .align 13
+pagedB:
+        subs    r0, r0, #1
+        beq     paged_end
+
+        ror     r1, r1, #1
+        tst     r1, #1
+        beq     paged_start
+        b       pagedA
+
+        /* Align to 1 << 13 == 8192 bytes (a multiple of the 4096 byte page size) */
+        .align 13
+.global paged_end
+paged_end:
+        mov     pc, lr
+
+.global test_code_end
+test_code_end:
diff --git a/arm/tcg-test.c b/arm/tcg-test.c
new file mode 100644
index 0000000..6fa61ba
--- /dev/null
+++ b/arm/tcg-test.c
@@ -0,0 +1,248 @@
+/*
+ * ARM TCG Tests
+ *
+ * These tests are explicitly aimed at stretching the QEMU TCG engine.
+ */
+
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include
+
+#define MAX_CPUS 8
+
+/* These entry points are in the assembly code */
+extern int tight_start(uint32_t count, uint32_t pattern);
+extern int computed_start(uint32_t count, uint32_t pattern);
+extern int paged_start(uint32_t count, uint32_t pattern);
+extern uint32_t tight_end;
+extern uint32_t computed_end;
+extern uint32_t paged_end;
+extern unsigned long test_code_end;
+
+typedef int (*test_fn)(uint32_t count, uint32_t pattern);
+
+typedef struct {
+    const char *test_name;
+    bool should_pass;
+    test_fn start_fn;
+    uint32_t *code_end;
+} test_descr_t;
+
+/* Test array */
+static test_descr_t tests[] = {
+    /*
+     * Tight chain.
+     *
+     * These are a bunch of basic blocks that have fixed branches in
+     * a page aligned space. The branches taken are decided by a
+     * pseudo-random bitmap for each CPU.
+     *
+     * Once the basic blocks have been chained together by the TCG they
+     * should run until they reach their block count. This will be the
+     * most efficient mode in which generated code is run. The only other
+     * exits will be caused by interrupts or TB invalidation.
+     */
+    { "tight", true, tight_start, &tight_end },
+    /*
+     * Computed jumps.
+     *
+     * A bunch of basic blocks which just do computed jumps so the basic
+     * block is never chained but they are all within a page (maybe not
+     * required). This will exercise the cache lookup but not new code
+     * generation.
+     */
+    { "computed", true, computed_start, &computed_end },
+    /*
+     * Page ping pong.
+     *
+     * The blocks are separated by PAGE_SIZE so they can never
+     * be chained together.
+     *
+     */
+    { "paged", true, paged_start, &paged_end}
+};
+
+static test_descr_t *test = NULL;
+
+static int iterations = 100000;
+static int rounds = 1000;
+static int mod_freq = 5;
+static uint32_t pattern[MAX_CPUS];
+
+static int smc = 0;
+static int irq = 0;
+static int irq_cnt[MAX_CPUS];
+static int errors[MAX_CPUS];
+
+static cpumask_t smp_test_complete;
+
+
+/* This triggers TCG's SMC detection by writing values to the executing
+ * code pages. We are not actually modifying the instructions and the
+ * underlying code will remain unchanged. However this should trigger
+ * invalidation of the Translation Blocks
+ */
+
+void trigger_smc_detection(uint32_t *start, uint32_t *end)
+{
+    volatile uint32_t *ptr = start;
+    while (ptr < end) {
+        uint32_t inst = *ptr;
+        *ptr++ = inst;
+    }
+}
+
+/* Handler for receiving IRQs */
+
+static void irq_handler(struct pt_regs *regs __unused)
+{
+    int cpu = smp_processor_id();
+    irq_cnt[cpu]++;
+    gic_irq_ack();
+}
+
+/* This triggers cross-CPU IRQs. Each IRQ should cause the basic block
+ * execution to finish and the main run-loop to be entered again.
+ */
+int send_cross_cpu_irqs(int this_cpu)
+{
+    int cpu, sent = 0;
+
+    for_each_present_cpu(cpu) {
+        if (cpu != this_cpu) {
+            gic_send_sgi(cpu, 1);
+            sent++;
+        }
+    }
+
+    return sent;
+}
+
+
+void do_test(void)
+{
+    int cpu = smp_processor_id();
+    int i;
+    int sent_irqs = 0;
+
+    printf("CPU%d: online and setting up with pattern 0x%x\n", cpu, pattern[cpu]);
+
+    if (irq) {
+        gic_enable();
+#ifdef __arm__
+        install_exception_handler(EXCPTN_IRQ, irq_handler);
+#else
+        install_irq_handler(EL1H_IRQ, irq_handler);
+#endif
+        local_irq_enable();
+    }
+
+    for (i=0; i<rounds; i++) {
+        errors[cpu] += test->start_fn(iterations, pattern[cpu]);
+
+        if ((i + cpu) % mod_freq == 0)
+        {
+            if (smc) {
+                trigger_smc_detection((uint32_t *) test->start_fn,
+                                      test->code_end);
+            }
+            if (irq) {
+                sent_irqs += send_cross_cpu_irqs(cpu);
+            }
+        }
+    }
+
+    if (irq) {
+        printf("CPU%d: Done with %d irqs sent and %d received\n", cpu, sent_irqs, irq_cnt[cpu]);
+    } else {
+        printf("CPU%d: Done with %d errors\n", cpu, errors[cpu]);
+    }
+
+    cpumask_set_cpu(cpu, &smp_test_complete);
+    if (cpu != 0)
+        halt();
+}
+
+
+void setup_and_run_tcg_test(void)
+{
+    static const unsigned char seed[] = "tcg-test";
+    struct isaac_ctx prng_context;
+    int cpu;
+
+    isaac_init(&prng_context, &seed[0], sizeof(seed));
+
+    if (irq) {
+        gic_enable();
+    }
+
+    /* boot other CPUs */
+    for_each_present_cpu(cpu) {
+        pattern[cpu] = isaac_next_uint32(&prng_context);
+
+        if (cpu == 0)
+            continue;
+
+        smp_boot_secondary(cpu, do_test);
+    }
+
+    do_test();
+
+    while (!cpumask_full(&smp_test_complete))
+        cpu_relax();
+
+    /* how do we detect errors other than not crashing? */
+    report("passed", true);
+}
+
+int main(int argc, char **argv)
+{
+    int i;
+    unsigned int j;
+
+    for (i=0; i