From patchwork Wed Jun 24 16:43:33 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ard Biesheuvel X-Patchwork-Id: 50291 Return-Path: X-Original-To: linaro@patches.linaro.org Delivered-To: linaro@patches.linaro.org Received: from mail-wg0-f71.google.com (mail-wg0-f71.google.com [74.125.82.71]) by ip-10-151-82-157.ec2.internal (Postfix) with ESMTPS id 8681221575 for ; Wed, 24 Jun 2015 16:45:48 +0000 (UTC) Received: by wgbbj7 with SMTP id bj7sf12552215wgb.2 for ; Wed, 24 Jun 2015 09:45:47 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:delivered-to:from:to:subject:date:message-id:cc :precedence:list-id:list-unsubscribe:list-archive:list-post :list-help:list-subscribe:mime-version:content-type :content-transfer-encoding:sender:errors-to:x-original-sender :x-original-authentication-results:mailing-list; bh=VToSkiQeaZeg0FCOCnQ5csN4GfHyFOV0MrRTq40fDnc=; b=PorUtMP6x1BQehbrzWsVNyjePfJDihusoVCSU6uhNL2HuTPqx4MDERyAI7Wxu/hp0g 636WuLAoy8AKoy+H9p95w0rD/PszrTkI8i6xo8nPkPAS+5AI/VoTHnrdgTK3t6vdxiMJ fglJCc51GH4gQ0pciicikoH1s4a9FeX4enp3vQPqja9u3ZrG9VE5knX53reU1Gs5myUS tEy/+ZLAxxT1/OCyTaT0izPbkNQaUDa2YuEVFGadFLiUSGWVgiRQghsQD45dlZgifbA2 Zq0ojKKVHAvYMuPIU6+A8ykS+FdtE6aj2MYCHJvG0VgwlyvYvfGpPki/wvQ+lpQ1ql5n NgTg== X-Gm-Message-State: ALoCoQlIOoOu+7H8oOdSVC8HTAKIywDT0475qtKcaoVYrPERVdlNZs2s5MsYxWv1XGveuJGHMr/2 X-Received: by 10.152.27.130 with SMTP id t2mr36626459lag.2.1435164347560; Wed, 24 Jun 2015 09:45:47 -0700 (PDT) X-BeenThere: patchwork-forward@linaro.org Received: by 10.152.27.102 with SMTP id s6ls218028lag.102.gmail; Wed, 24 Jun 2015 09:45:47 -0700 (PDT) X-Received: by 10.112.13.97 with SMTP id g1mr40860391lbc.52.1435164347415; Wed, 24 Jun 2015 09:45:47 -0700 (PDT) Received: from mail-lb0-f170.google.com (mail-lb0-f170.google.com. [209.85.217.170]) by mx.google.com with ESMTPS id oq3si22437612lbb.125.2015.06.24.09.45.47 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 24 Jun 2015 09:45:47 -0700 (PDT) Received-SPF: pass (google.com: domain of patch+caf_=patchwork-forward=linaro.org@linaro.org designates 209.85.217.170 as permitted sender) client-ip=209.85.217.170; Received: by lbnk3 with SMTP id k3so30022988lbn.1 for ; Wed, 24 Jun 2015 09:45:47 -0700 (PDT) X-Received: by 10.153.5.38 with SMTP id cj6mr10761486lad.36.1435164346983; Wed, 24 Jun 2015 09:45:46 -0700 (PDT) X-Forwarded-To: patchwork-forward@linaro.org X-Forwarded-For: patch@linaro.org patchwork-forward@linaro.org Delivered-To: patch@linaro.org Received: by 10.112.108.230 with SMTP id hn6csp31335lbb; Wed, 24 Jun 2015 09:45:45 -0700 (PDT) X-Received: by 10.68.205.2 with SMTP id lc2mr82567874pbc.147.1435164345133; Wed, 24 Jun 2015 09:45:45 -0700 (PDT) Received: from bombadil.infradead.org (bombadil.infradead.org. [2001:1868:205::9]) by mx.google.com with ESMTPS id q9si40716837pds.196.2015.06.24.09.45.44 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 24 Jun 2015 09:45:45 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-arm-kernel-bounces+patch=linaro.org@lists.infradead.org designates 2001:1868:205::9 as permitted sender) client-ip=2001:1868:205::9; Received: from localhost ([127.0.0.1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.80.1 #2 (Red Hat Linux)) id 1Z7nmU-0001eL-GZ; Wed, 24 Jun 2015 16:44:10 +0000 Received: from mail-wi0-f179.google.com ([209.85.212.179]) by bombadil.infradead.org with esmtps (Exim 4.80.1 #2 (Red Hat Linux)) id 1Z7nmM-0001Fk-UL for linux-arm-kernel@lists.infradead.org; Wed, 24 Jun 2015 16:44:04 +0000 Received: by wiga1 with SMTP id a1so141082438wig.0 for ; Wed, 24 Jun 2015 09:43:40 -0700 (PDT) X-Received: by 10.180.149.243 with SMTP id ud19mr6654260wib.11.1435164220757; Wed, 24 Jun 2015 09:43:40 -0700 (PDT) Received: from localhost.localdomain ([185.13.106.93]) by mx.google.com with ESMTPSA id m2sm3397897wiy.7.2015.06.24.09.43.38 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-SHA bits=128/128); Wed, 24 Jun 2015 09:43:39 -0700 (PDT) From: Ard Biesheuvel To: linux-arm-kernel@lists.infradead.org, hpa@zytor.com Subject: [PATCH] md/raid6: delta syndrome for ARM NEON Date: Wed, 24 Jun 2015 18:43:33 +0200 Message-Id: <1435164213-25410-1-git-send-email-ard.biesheuvel@linaro.org> X-Mailer: git-send-email 1.9.1 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20150624_094403_192329_E9691144 X-CRM114-Status: GOOD ( 14.27 ) X-Spam-Score: -2.6 (--) X-Spam-Report: SpamAssassin version 3.4.0 on bombadil.infradead.org summary: Content analysis details: (-2.6 points) pts rule name description ---- ---------------------- -------------------------------------------------- -0.7 RCVD_IN_DNSWL_LOW RBL: Sender listed at http://www.dnswl.org/, low trust [209.85.212.179 listed in list.dnswl.org] -0.0 RCVD_IN_MSPIKE_H3 RBL: Good reputation (+3) [209.85.212.179 listed in wl.mailspike.net] -0.0 SPF_PASS SPF: sender matches SPF record -1.9 BAYES_00 BODY: Bayes spam probability is 0 to 1% [score: 0.0000] -0.0 RCVD_IN_MSPIKE_WL Mailspike good senders Cc: Neil Brown , Markus Stockhausen , Ard Biesheuvel X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: , List-Help: , List-Subscribe: , MIME-Version: 1.0 Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+patch=linaro.org@lists.infradead.org X-Removed-Original-Auth: Dkim didn't pass. X-Original-Sender: ard.biesheuvel@linaro.org X-Original-Authentication-Results: mx.google.com; spf=pass (google.com: domain of patch+caf_=patchwork-forward=linaro.org@linaro.org designates 209.85.217.170 as permitted sender) smtp.mail=patch+caf_=patchwork-forward=linaro.org@linaro.org Mailing-list: list patchwork-forward@linaro.org; contact patchwork-forward+owners@linaro.org X-Google-Group-Id: 836684582541 This implements XOR syndrome calculation using NEON intrinsics. As before, the module can be built for ARM and arm64 from the same source. Relative performance on a Cortex-A57 based system: raid6: int64x1 gen() 905 MB/s raid6: int64x1 xor() 881 MB/s raid6: int64x2 gen() 1343 MB/s raid6: int64x2 xor() 1286 MB/s raid6: int64x4 gen() 1896 MB/s raid6: int64x4 xor() 1321 MB/s raid6: int64x8 gen() 1773 MB/s raid6: int64x8 xor() 1165 MB/s raid6: neonx1 gen() 1834 MB/s raid6: neonx1 xor() 1278 MB/s raid6: neonx2 gen() 2528 MB/s raid6: neonx2 xor() 1942 MB/s raid6: neonx4 gen() 2888 MB/s raid6: neonx4 xor() 2334 MB/s raid6: neonx8 gen() 2957 MB/s raid6: neonx8 xor() 2232 MB/s raid6: using algorithm neonx8 gen() 2957 MB/s raid6: .... xor() 2232 MB/s, rmw enabled Cc: Markus Stockhausen Cc: Neil Brown Signed-off-by: Ard Biesheuvel --- lib/raid6/neon.c | 13 ++++++++++++- lib/raid6/neon.uc | 46 ++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 58 insertions(+), 1 deletion(-) diff --git a/lib/raid6/neon.c b/lib/raid6/neon.c index d9ad6ee284f4..7076ef1ba3dd 100644 --- a/lib/raid6/neon.c +++ b/lib/raid6/neon.c @@ -40,9 +40,20 @@ (unsigned long)bytes, ptrs); \ kernel_neon_end(); \ } \ + static void raid6_neon ## _n ## _xor_syndrome(int disks, \ + int start, int stop, \ + size_t bytes, void **ptrs) \ + { \ + void raid6_neon ## _n ## _xor_syndrome_real(int, \ + int, int, unsigned long, void**); \ + kernel_neon_begin(); \ + raid6_neon ## _n ## _xor_syndrome_real(disks, \ + start, stop, (unsigned long)bytes, ptrs); \ + kernel_neon_end(); \ + } \ struct raid6_calls const raid6_neonx ## _n = { \ raid6_neon ## _n ## _gen_syndrome, \ - NULL, /* XOR not yet implemented */ \ + raid6_neon ## _n ## _xor_syndrome, \ raid6_have_neon, \ "neonx" #_n, \ 0 \ diff --git a/lib/raid6/neon.uc b/lib/raid6/neon.uc index 1b9ed793342d..4fa51b761dd0 100644 --- a/lib/raid6/neon.uc +++ b/lib/raid6/neon.uc @@ -3,6 +3,7 @@ * neon.uc - RAID-6 syndrome calculation using ARM NEON instructions * * Copyright (C) 2012 Rob Herring + * Copyright (C) 2015 Linaro Ltd. * * Based on altivec.uc: * Copyright 2002-2004 H. Peter Anvin - All Rights Reserved @@ -78,3 +79,48 @@ void raid6_neon$#_gen_syndrome_real(int disks, unsigned long bytes, void **ptrs) vst1q_u8(&q[d+NSIZE*$$], wq$$); } } + +void raid6_neon$#_xor_syndrome_real(int disks, int start, int stop, + unsigned long bytes, void **ptrs) +{ + uint8_t **dptr = (uint8_t **)ptrs; + uint8_t *p, *q; + int d, z, z0; + + register unative_t wd$$, wq$$, wp$$, w1$$, w2$$; + const unative_t x1d = NBYTES(0x1d); + + z0 = stop; /* P/Q right side optimization */ + p = dptr[disks-2]; /* XOR parity */ + q = dptr[disks-1]; /* RS syndrome */ + + for ( d = 0 ; d < bytes ; d += NSIZE*$# ) { + wq$$ = vld1q_u8(&dptr[z0][d+$$*NSIZE]); + wp$$ = veorq_u8(vld1q_u8(&p[d+$$*NSIZE]), wq$$); + + /* P/Q data pages */ + for ( z = z0-1 ; z >= start ; z-- ) { + wd$$ = vld1q_u8(&dptr[z][d+$$*NSIZE]); + wp$$ = veorq_u8(wp$$, wd$$); + w2$$ = MASK(wq$$); + w1$$ = SHLBYTE(wq$$); + + w2$$ = vandq_u8(w2$$, x1d); + w1$$ = veorq_u8(w1$$, w2$$); + wq$$ = veorq_u8(w1$$, wd$$); + } + /* P/Q left side optimization */ + for ( z = start-1 ; z >= 0 ; z-- ) { + w2$$ = MASK(wq$$); + w1$$ = SHLBYTE(wq$$); + + w2$$ = vandq_u8(w2$$, x1d); + wq$$ = veorq_u8(w1$$, w2$$); + } + w1$$ = vld1q_u8(&q[d+NSIZE*$$]); + wq$$ = veorq_u8(wq$$, w1$$); + + vst1q_u8(&p[d+NSIZE*$$], wp$$); + vst1q_u8(&q[d+NSIZE*$$], wq$$); + } +}