From patchwork Wed Jan 13 07:08:18 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andrew Pinski
X-Patchwork-Id: 59651
From: Andrew Pinski
To: pinskia@gmail.com, linux-kernel@vger.kernel.org,
 linux-arm-kernel@lists.infradead.org, Will Deacon
Cc: Andrew Pinski
Subject: [PATCH 4/5] ARM64: Patch in prefetching for copy_page is requested
Date: Tue, 12 Jan 2016 23:08:18 -0800
Message-Id: <1452668899-3553-5-git-send-email-apinski@cavium.com>
X-Mailer: git-send-email 1.7.2.5
In-Reply-To: <1452668899-3553-1-git-send-email-apinski@cavium.com>
References:
 <1452668899-3553-1-git-send-email-apinski@cavium.com>
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On ThunderX T88 pass 1 and pass 2, there is no hardware prefetching, so
we need to patch in software prefetching. Prefetching improves this code
by 60% over the original code and 2x over the code without prefetching.
This was measured using the benchmark code at
https://github.com/apinski-cavium/copy_page_benchmark

Signed-off-by: Andrew Pinski
---
 arch/arm64/lib/copy_page.S | 17 +++++++++++++++++
 1 files changed, 17 insertions(+), 0 deletions(-)

--
1.7.2.5

diff --git a/arch/arm64/lib/copy_page.S b/arch/arm64/lib/copy_page.S
index dfb0316..876e882 100644
--- a/arch/arm64/lib/copy_page.S
+++ b/arch/arm64/lib/copy_page.S
@@ -18,6 +18,8 @@
 #include
 #include
 #include
+#include
+#include

 /*
@@ -28,6 +30,15 @@
  * x1 - src
  */
 ENTRY(copy_page)
+alternative_if_not ARM64_NEEDS_PREFETCH_128
+	nop
+	nop
+alternative_else
+	# Prefetch two cache lines ahead.
+	prfm    pldl1strm, [x1, #128]
+	prfm    pldl1strm, [x1, #256]
+alternative_endif
+
 	ldp	x2, x3, [x1]
 	ldp	x4, x5, [x1, #16]
 	ldp	x6, x7, [x1, #32]
@@ -42,6 +53,12 @@ ENTRY(copy_page)
 1:
 	subs	x18, x18, #128

+alternative_if_not ARM64_NEEDS_PREFETCH_128
+	nop
+alternative_else
+	prfm    pldl1strm, [x1, #384]
+alternative_endif
+
 	stnp	x2, x3, [x0]
 	ldp	x2, x3, [x1]
 	stnp	x4, x5, [x0, #16]
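
For readers less familiar with the assembly, the prefetch-ahead idea in the
hunks above can be sketched in portable C. This is only an illustration, not
the kernel code: the function name copy_page_sketch, the SKETCH_* constants,
and the 4096-byte page size are assumptions made for the example, and the
bounds check before the in-loop prefetch is an addition the assembly does not
have. __builtin_prefetch is the GCC/Clang builtin; with rw=0 and locality=0 a
compiler targeting arm64 would typically be expected to emit a streaming
prefetch, but that mapping is up to the compiler.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size for this sketch */
#define SKETCH_BLOCK	 128u	/* bytes copied per iteration, as in the patch */
#define SKETCH_AHEAD	 384u	/* in-loop prefetch distance, as in the patch */

/* Hypothetical helper: copy one page, prefetching the source ahead of use. */
static void copy_page_sketch(void *dst, const void *src)
{
	unsigned char *d = dst;
	const unsigned char *s = src;
	size_t off;

	/* Prologue: request the next two 128-byte blocks before the loop. */
	__builtin_prefetch(s + 128, 0, 0);
	__builtin_prefetch(s + 256, 0, 0);

	for (off = 0; off < SKETCH_PAGE_SIZE; off += SKETCH_BLOCK) {
		/* Stay inside the page; this guard is specific to the sketch. */
		if (off + SKETCH_AHEAD < SKETCH_PAGE_SIZE)
			__builtin_prefetch(s + off + SKETCH_AHEAD, 0, 0);

		/* Stands in for the ldp/stnp sequence that moves one block. */
		memcpy(d + off, s + off, SKETCH_BLOCK);
	}
}
```

The prefetches are hints only, so the copy result is the same with or without
them; any speedup depends on the hardware, which is why the patch enables them
through the alternatives mechanism rather than unconditionally.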