From patchwork Mon Aug 3 16:06:26 2015
X-Patchwork-Submitter: Alex Bennée
X-Patchwork-Id: 51870
References: <1438358041-18021-1-git-send-email-alex.bennee@linaro.org>
 <1438358041-18021-12-git-send-email-alex.bennee@linaro.org>
 <87oaioll54.fsf@linaro.org>
From: Alex Bennée
To: alvise rigo
Date: Mon, 03 Aug 2015 17:06:26 +0100
Message-ID: <87mvy8l5kt.fsf@linaro.org>
Cc: mttcg@listserver.greensocs.com, Peter Maydell, Andrew Jones,
 Claudio Fontana, kvm@vger.kernel.org, Alexander Spyridakis, Mark Burton,
 QEMU Developers, KONRAD Frédéric
Subject: Re: [Qemu-devel] [kvm-unit-tests PATCH v5 11/11] new: arm/barrier-test for memory barriers

alvise rigo writes:

> On Mon, Aug 3, 2015 at 12:30 PM, Alex Bennée wrote:
>>
>> alvise rigo writes:
>>
>>> Hi Alex,
>>>
>>> Nice set of tests, they are proving to be helpful.
>>> One question below.
>>>
>>> Why are we calling these last two instructions with the 'eq' suffix?
>>> Shouldn't we just strex r1, r0, [sptr] and then cmp r1, #0?
>>
>> Possibly, my armv7 is a little rusty.
>> I'm just looking at tweaking this
>> test now so I'll try and clean that up.

Please find the updated test attached. I've also included some new test
modes. In theory the barrier test by itself should still fail, but it
passes on real ARMv7 as well as TCG. I'm trying to run the test on an
ARMv7 machine with more cores to check; I suspect we get away with it on
ARMv7-on-x86_64 due to the strong ordering of the x86.

The "excl" and "acqrel" tests now run without issue (although, again,
plain acqrel semantics shouldn't stop a race corrupting shared_value).

I'll tweak the v8 versions of the test tomorrow.
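As a quick user-space illustration of the race these tests probe, the
plain-vs-atomic increment can be reproduced with C11 atomics and
pthreads. This sketch is illustrative only and not part of the patch;
the thread and iteration counts are made up:

#include <stdatomic.h>
#include <stdio.h>
#include <pthread.h>

#define NTHREADS 4
#define NINCS    1000000

static unsigned int plain_value;  /* racy: load/add/store can lose updates */
static atomic_uint atomic_value;  /* safe: a ldrex/strex loop on ARMv7 */

static void *worker(void *arg)
{
	int i;
	(void)arg;
	for (i = 0; i < NINCS; i++) {
		plain_value++;
		atomic_fetch_add_explicit(&atomic_value, 1,
					  memory_order_relaxed);
	}
	return NULL;
}

int main(void)
{
	pthread_t tid[NTHREADS];
	int i;

	for (i = 0; i < NTHREADS; i++)
		pthread_create(&tid[i], NULL, worker, NULL);
	for (i = 0; i < NTHREADS; i++)
		pthread_join(tid[i], NULL);

	/* the plain counter usually falls short of the expected total */
	printf("plain=%u atomic=%u expected=%u\n",
	       plain_value, atomic_load(&atomic_value),
	       (unsigned)NTHREADS * NINCS);
	return 0;
}

Built with something like cc -pthread -O2, the plain counter typically
comes up short on any SMP host while the atomic one always matches.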
From 0953549985134268bf9079a7a01b2631d8a4fdee Mon Sep 17 00:00:00 2001
From: Alex Bennée <alex.bennee@linaro.org>
Date: Thu, 30 Jul 2015 15:13:33 +0000
Subject: [kvm-unit-tests PATCH] arm/barrier-test: add memory barrier tests

This test has been written mainly to stress multi-threaded TCG behaviour
but will demonstrate failure by default on real hardware. The test takes
the following parameters:

  - "lock" use GCC's locking semantics
  - "atomic" use GCC's __atomic primitives
  - "barrier" use plain dmb() barriers
  - "wfelock" use WaitForEvent sleep
  - "excl" use load/store exclusive semantics
  - "acqrel" use acquire/release semantics

Two further options allow the test to be tweaked:

  - "noshuffle" disables the memory shuffling
  - "count=%ld" set your own per-CPU increment count

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>

---
v2
  - don't use thumb-style strexeq stuff
  - add atomic, barrier and wfelock tests
  - add count/noshuffle test controls
---
 arm/barrier-test.c           | 284 +++++++++++++++++++++++++++++++++++++++++++
 config/config-arm-common.mak |   2 +
 2 files changed, 286 insertions(+)
 create mode 100644 arm/barrier-test.c

diff --git a/arm/barrier-test.c b/arm/barrier-test.c
new file mode 100644
index 0000000..765f8f6
--- /dev/null
+++ b/arm/barrier-test.c
@@ -0,0 +1,284 @@
+#include <libcflat.h>
+#include <asm/smp.h>
+#include <asm/cpumask.h>
+#include <asm/barrier.h>
+#include <asm/mmu.h>
+
+#include <prng.h>
+
+#define MAX_CPUS 8
+
+/* How many increments to do */
+static int increment_count = 10000000;
+static int do_shuffle = 1;
+
+/* shared value all the tests attempt to safely increment */
+static unsigned int shared_value;
+
+/* PAGE_SIZE * uint32_t means we span several pages */
+static uint32_t memory_array[PAGE_SIZE];
+
+/* We use the alignment of the following to ensure accesses to locking
+ * and synchronisation primitives don't interfere with the page of the
+ * shared value
+ */
+__attribute__((aligned(PAGE_SIZE))) static unsigned int per_cpu_value[MAX_CPUS];
+__attribute__((aligned(PAGE_SIZE))) static cpumask_t smp_test_complete;
+__attribute__((aligned(PAGE_SIZE))) static int global_lock;
+
+struct isaac_ctx prng_context[MAX_CPUS];
+
+void (*inc_fn)(void);
+
+/* In any SMP setting this *should* fail due to cores stepping on
+ * each other updating the shared variable
+ */
+static void increment_shared(void)
+{
+	shared_value++;
+}
+
+/* GCC __sync primitives are deprecated in favour of __atomic */
+static void increment_shared_with_lock(void)
+{
+	while (__sync_lock_test_and_set(&global_lock, 1));
+	shared_value++;
+	__sync_lock_release(&global_lock);
+}
+
+/* In practice even __ATOMIC_RELAXED uses ARM's ldrex/strex exclusive
+ * semantics */
+static void increment_shared_with_atomic(void)
+{
+	__atomic_add_fetch(&shared_value, 1, __ATOMIC_SEQ_CST);
+}
+
+/* By themselves barriers do not guarantee atomicity */
+static void increment_shared_with_barrier(void)
+{
+#if defined (__LP64__) || defined (_LP64)
+#else
+	asm volatile(
+	"	ldr r0, [%[sptr]]\n"
+	"	dmb\n"
+	"	add r0, r0, #0x1\n"
+	"	str r0, [%[sptr]]\n"
+	"	dmb\n"
+	: /* out */
+	: [sptr] "r" (&shared_value) /* in */
+	: "r0", "cc");
+#endif
+}
+
+/*
+ * Load/store exclusive with WFE (wait-for-event)
+ */
+static void increment_shared_with_wfelock(void)
+{
+#if defined (__LP64__) || defined (_LP64)
+#else
+	asm volatile(
+	"	mov r1, #1\n"
+	"1:	ldrex r0, [%[lock]]\n"
+	"	cmp r0, #0\n"
+	"	wfene\n"
+	"	strexeq r0, r1, [%[lock]]\n"
+	"	cmpeq r0, #0\n"
+	"	bne 1b\n"
+	"	dmb\n"
+	/* lock held */
+	"	ldr r0, [%[sptr]]\n"
+	"	add r0, r0, #0x1\n"
+	"	str r0, [%[sptr]]\n"
+	/* now release */
+	"	mov r0, #0\n"
+	"	dmb\n"
+	"	str r0, [%[lock]]\n"
+	"	dsb\n"
+	"	sev\n"
+	: /* out */
+	: [lock] "r" (&global_lock), [sptr] "r" (&shared_value) /* in */
+	: "r0", "r1", "cc");
+#endif
+}
+
+/*
+ * Hand-written version of the load/store exclusive
+ */
+static void increment_shared_with_excl(void)
+{
+#if defined (__LP64__) || defined (_LP64)
+	asm volatile(
+	"1:	ldxr w0, [%[sptr]]\n"
+	"	add w0, w0, #0x1\n"
+	"	stxr w1, w0, [%[sptr]]\n"
+	"	cbnz w1, 1b\n"
+	: /* out */
+	: [sptr] "r" (&shared_value) /* in */
+	: "w0", "w1", "cc");
+#else
+	asm volatile(
+	"1:	ldrex r0, [%[sptr]]\n"
+	"	add r0, r0, #0x1\n"
+	"	strex r1, r0, [%[sptr]]\n"
+	"	cmp r1, #0\n"
+	"	bne 1b\n"
+	: /* out */
+	: [sptr] "r" (&shared_value) /* in */
+	: "r0", "r1", "cc");
+#endif
+}
+
+static void increment_shared_with_acqrel(void)
+{
+#if defined (__LP64__) || defined (_LP64)
+	asm volatile(
+	"	ldar w0, [%[sptr]]\n"
+	"	add w0, w0, #0x1\n"
+	"	str w0, [%[sptr]]\n"
+	: /* out */
+	: [sptr] "r" (&shared_value) /* in */
+	: "w0");
+#else
+	/* ARMv7 has no acquire/release semantics but we
+	 * can ensure the results of the write are propagated
+	 * with the use of barriers.
+	 */
+	asm volatile(
+	"1:	ldrex r0, [%[sptr]]\n"
+	"	add r0, r0, #0x1\n"
+	"	strex r1, r0, [%[sptr]]\n"
+	"	cmp r1, #0\n"
+	"	bne 1b\n"
+	"	dmb\n"
+	: /* out */
+	: [sptr] "r" (&shared_value) /* in */
+	: "r0", "r1", "cc");
+#endif
+}
+
+/* The idea of this is just to generate some random load/store
+ * activity which may or may not race with an un-barriered increment
+ * of the shared counter
+ */
+static void shuffle_memory(int cpu)
+{
+	int i;
+	uint32_t lspat = isaac_next_uint32(&prng_context[cpu]);
+	uint32_t seq = isaac_next_uint32(&prng_context[cpu]);
+	int count = seq & 0x1f;
+	uint32_t val = 0;
+
+	seq >>= 5;
+
+	for (i = 0; i < count; i++) {
+		int index = seq & ~PAGE_MASK;
+		if (lspat & 1)
+			val = memory_array[index];
+		else
+			memory_array[index] = val;
+		seq >>= PAGE_SHIFT;
+		seq ^= lspat;
+		lspat >>= 1;
+	}
+}
+
+static void do_increment(void)
+{
+	int i;
+	int cpu = smp_processor_id();
+
+	printf("CPU%d online\n", cpu);
+
+	for (i = 0; i < increment_count; i++) {
+		per_cpu_value[cpu]++;
+		inc_fn();
+
+		if (do_shuffle)
+			shuffle_memory(cpu);
+	}
+
+	printf("CPU%d: Done, %d incs\n", cpu, per_cpu_value[cpu]);
+
+	cpumask_set_cpu(cpu, &smp_test_complete);
+	if (cpu != 0)
+		halt();
+}
+
+int main(int argc, char **argv)
+{
+	int cpu;
+	unsigned int i, sum = 0;
+	static const unsigned char seed[] = "myseed";
+
+	inc_fn = &increment_shared;
+
+	isaac_init(&prng_context[0], &seed[0], sizeof(seed));
+
+	for (i = 0; i
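Note the v8 path of increment_shared_with_acqrel above uses a plain
ldar/str pair: it orders the accesses but leaves the increment itself
racy, which is exactly why plain acqrel semantics shouldn't stop a race
corrupting shared_value. A v8 variant that keeps the ordering and the
atomicity would pair the acquire load with a release store-exclusive;
a minimal sketch of such a function (my illustration, not part of the
patch above):

static void increment_shared_with_acqrel_excl(void)
{
	asm volatile(
	"1:	ldaxr w0, [%[sptr]]\n"		/* load-acquire exclusive */
	"	add w0, w0, #0x1\n"
	"	stlxr w1, w0, [%[sptr]]\n"	/* store-release exclusive */
	"	cbnz w1, 1b\n"			/* retry if we lost the monitor */
	: /* out */
	: [sptr] "r" (&shared_value) /* in */
	: "w0", "w1", "cc");
}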