From patchwork Fri Nov 7 13:32:22 2014
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Will Deacon
X-Patchwork-Id: 40406
From: Will Deacon
To: linux-arm-kernel@lists.infradead.org
Cc: catalin.marinas@arm.com, Will Deacon
Subject: [PATCH 2/2] arm64: entry: use ldp/stp instead of push/pop when saving/restoring regs
Date: Fri, 7 Nov 2014 13:32:22 +0000
Message-Id: <1415367142-5005-2-git-send-email-will.deacon@arm.com>
X-Mailer: git-send-email 2.1.1
In-Reply-To: <1415367142-5005-1-git-send-email-will.deacon@arm.com>
References: <1415367142-5005-1-git-send-email-will.deacon@arm.com>

The push/pop instructions can be suboptimal when saving/restoring large
amounts of data to/from the stack, for example on entry/exit from the
kernel. This is because:

  (1) They act on descending addresses (i.e. the newly decremented sp),
      which may defeat some hardware prefetchers

  (2) They introduce an implicit dependency between each instruction, as
      the sp has to be updated in order to resolve the address of the
      next access.

This patch removes the push/pop instructions from our kernel entry/exit
macros in favour of ldp/stp plus offset.

Signed-off-by: Will Deacon
---
 arch/arm64/kernel/entry.S | 75 +++++++++++++++++++++++------------------------
 1 file changed, 37 insertions(+), 38 deletions(-)

diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index 2cebe56d650c..622a409916f3 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -64,25 +64,26 @@
 #define BAD_ERROR	3
 
 	.macro	kernel_entry, el, regsize = 64
-	sub	sp, sp, #S_FRAME_SIZE - S_LR	// room for LR, SP, SPSR, ELR
+	sub	sp, sp, #S_FRAME_SIZE
 	.if	\regsize == 32
 	mov	w0, w0				// zero upper 32 bits of x0
 	.endif
-	push	x28, x29
-	push	x26, x27
-	push	x24, x25
-	push	x22, x23
-	push	x20, x21
-	push	x18, x19
-	push	x16, x17
-	push	x14, x15
-	push	x12, x13
-	push	x10, x11
-	push	x8, x9
-	push	x6, x7
-	push	x4, x5
-	push	x2, x3
-	push	x0, x1
+	stp	x0, x1, [sp, #16 * 0]
+	stp	x2, x3, [sp, #16 * 1]
+	stp	x4, x5, [sp, #16 * 2]
+	stp	x6, x7, [sp, #16 * 3]
+	stp	x8, x9, [sp, #16 * 4]
+	stp	x10, x11, [sp, #16 * 5]
+	stp	x12, x13, [sp, #16 * 6]
+	stp	x14, x15, [sp, #16 * 7]
+	stp	x16, x17, [sp, #16 * 8]
+	stp	x18, x19, [sp, #16 * 9]
+	stp	x20, x21, [sp, #16 * 10]
+	stp	x22, x23, [sp, #16 * 11]
+	stp	x24, x25, [sp, #16 * 12]
+	stp	x26, x27, [sp, #16 * 13]
+	stp	x28, x29, [sp, #16 * 14]
+
 	.if	\el == 0
 	mrs	x21, sp_el0
 	get_thread_info tsk			// Ensure MDSCR_EL1.SS is clear,
@@ -118,33 +119,31 @@
 	.if	\el == 0
 	ct_user_enter
 	ldr	x23, [sp, #S_SP]		// load return stack pointer
+	msr	sp_el0, x23
 	.endif
+	msr	elr_el1, x21			// set up the return data
+	msr	spsr_el1, x22
 	.if	\ret
 	ldr	x1, [sp, #S_X1]			// preserve x0 (syscall return)
-	add	sp, sp, S_X2
 	.else
-	pop	x0, x1
-	.endif
-	pop	x2, x3				// load the rest of the registers
-	pop	x4, x5
-	pop	x6, x7
-	pop	x8, x9
-	msr	elr_el1, x21			// set up the return data
-	msr	spsr_el1, x22
-	.if	\el == 0
-	msr	sp_el0, x23
+	ldp	x0, x1, [sp, #16 * 0]
 	.endif
-	pop	x10, x11
-	pop	x12, x13
-	pop	x14, x15
-	pop	x16, x17
-	pop	x18, x19
-	pop	x20, x21
-	pop	x22, x23
-	pop	x24, x25
-	pop	x26, x27
-	pop	x28, x29
-	ldr	lr, [sp], #S_FRAME_SIZE - S_LR	// load LR and restore SP
+	ldp	x2, x3, [sp, #16 * 1]
+	ldp	x4, x5, [sp, #16 * 2]
+	ldp	x6, x7, [sp, #16 * 3]
+	ldp	x8, x9, [sp, #16 * 4]
+	ldp	x10, x11, [sp, #16 * 5]
+	ldp	x12, x13, [sp, #16 * 6]
+	ldp	x14, x15, [sp, #16 * 7]
+	ldp	x16, x17, [sp, #16 * 8]
+	ldp	x18, x19, [sp, #16 * 9]
+	ldp	x20, x21, [sp, #16 * 10]
+	ldp	x22, x23, [sp, #16 * 11]
+	ldp	x24, x25, [sp, #16 * 12]
+	ldp	x26, x27, [sp, #16 * 13]
+	ldp	x28, x29, [sp, #16 * 14]
+	ldr	lr, [sp, #S_LR]
+	add	sp, sp, #S_FRAME_SIZE		// restore sp
 
 	eret					// return to kernel
 	.endm
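
Illustration only, not part of the patch: a small C model of the store
addresses issued by the old and new kernel_entry sequences, to make points
(1) and (2) of the commit message concrete. It assumes S_LR is 240 (15
register pairs of 16 bytes, which matches the #16 * 14 offset used for
x28/x29 in the diff) and uses 288 as a stand-in for S_FRAME_SIZE. The names
old_entry, new_entry, NPAIRS, PAIR_BYTES, FRAME_SIZE and LR_OFFSET are
hypothetical and do not exist in the kernel.

```c
#include <stdint.h>

#define NPAIRS      15   /* x0/x1 ... x28/x29 */
#define PAIR_BYTES  16   /* one stp/ldp moves two 64-bit registers */
#define FRAME_SIZE  288  /* assumed stand-in for S_FRAME_SIZE */
#define LR_OFFSET   240  /* assumed stand-in for S_LR (NPAIRS * PAIR_BYTES) */

/*
 * Old sequence: reserve room above the register pairs, then issue 15
 * pushes. Each push is modelled as a pre-decrement of sp followed by a
 * store at the new sp, so every address depends on the previous update.
 * addr[i] records the address of the i-th store in issue order.
 */
uint64_t old_entry(uint64_t sp, uint64_t addr[NPAIRS])
{
	sp -= FRAME_SIZE - LR_OFFSET;
	for (int i = 0; i < NPAIRS; i++) {
		sp -= PAIR_BYTES;	/* serialising sp update */
		addr[i] = sp;		/* addresses descend */
	}
	return sp;
}

/*
 * New sequence: a single sp adjustment, then 15 stores at fixed offsets
 * from the same base. No store needs the result of an earlier one to
 * form its address, and the addresses ascend.
 */
uint64_t new_entry(uint64_t sp, uint64_t addr[NPAIRS])
{
	sp -= FRAME_SIZE;
	for (int i = 0; i < NPAIRS; i++)
		addr[i] = sp + (uint64_t)PAIR_BYTES * i;
	return sp;
}
```

Under these assumptions both functions leave sp at the same value and touch
the same 15 slots; only the issue order and the dependency on sp differ,
which is the behaviour the commit message describes.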