From patchwork Wed Feb  5 17:13:37 2014
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ard Biesheuvel
X-Patchwork-Id: 24215
From: Ard Biesheuvel <ard.biesheuvel@linaro.org>
To: will.deacon@arm.com, catalin.marinas@arm.com,
	linux-arm-kernel@lists.infradead.org
Cc: patches@linaro.org, Ard Biesheuvel <ard.biesheuvel@linaro.org>
Subject: [PATCH v2 3/4] arm64: add support for kernel mode NEON in interrupt
	context
Date: Wed,  5 Feb 2014 18:13:37 +0100
Message-Id: <1391620418-3999-4-git-send-email-ard.biesheuvel@linaro.org>
X-Mailer: git-send-email 1.8.3.2
In-Reply-To: <1391620418-3999-1-git-send-email-ard.biesheuvel@linaro.org>
References: <1391620418-3999-1-git-send-email-ard.biesheuvel@linaro.org>

This patch modifies kernel_neon_begin() and kernel_neon_end() so that they
may be called from any context. For the case where only a couple of registers
are needed, kernel_neon_begin_partial(u32) is introduced; it takes the number
of bottom 'n' NEON q-registers required as its parameter. To mark the end of
such a partial section, the regular kernel_neon_end() should be used.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
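As a usage illustration (not part of the patch): a caller that clobbers only
the bottom four q-registers would bracket its NEON code as sketched below;
the function name is hypothetical.

	#include <asm/neon.h>

	static void hypothetical_neon_step(void)
	{
		kernel_neon_begin_partial(4);	/* only q0-q3 used below */
		/* ... NEON / v8 Crypto Extensions code using q0-q3 ... */
		kernel_neon_end();	/* ends full and partial sections alike */
	}

With this patch applied, such a section may also run in hardirq or softirq
context; the stacked registers then go to a per-CPU fpsimd_partial_state
buffer rather than to the task's FPSIMD state.
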
 arch/arm64/include/asm/fpsimd.h       | 17 ++++++++++++++
 arch/arm64/include/asm/fpsimdmacros.h | 37 ++++++++++++++++++++++++++++++
 arch/arm64/include/asm/neon.h         |  6 ++++-
 arch/arm64/kernel/entry-fpsimd.S      | 24 +++++++++++++++++++
 arch/arm64/kernel/fpsimd.c            | 43 +++++++++++++++++++++++------------
 5 files changed, 112 insertions(+), 15 deletions(-)

diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
index f7e70f3f1eb7..8c0f8d332528 100644
--- a/arch/arm64/include/asm/fpsimd.h
+++ b/arch/arm64/include/asm/fpsimd.h
@@ -41,6 +41,19 @@ struct fpsimd_state {
 	unsigned int last_cpu;
 };
 
+/*
+ * Struct for stacking the bottom 'n' FP/SIMD registers.
+ * Mainly intended for kernel use of v8 Crypto Extensions which only
+ * needs a few registers and may need to execute in atomic context.
+ */
+struct fpsimd_partial_state {
+	u32		num_regs;
+	u32		fpsr;
+	u32		fpcr;
+	__uint128_t	vregs[32] __aligned(16);
+} __aligned(16);
+
+
 #if defined(__KERNEL__) && defined(CONFIG_COMPAT)
 /* Masks for extracting the FPSR and FPCR from the FPSCR */
 #define VFP_FPSCR_STAT_MASK	0xf800009f
@@ -65,6 +78,10 @@ void fpsimd_set_user_state(struct task_struct *t,
 			      struct user_fpsimd_state *st);
 void fpsimd_reload_fpstate(void);
 
+extern void fpsimd_save_partial_state(struct fpsimd_partial_state *state,
+				      u32 num_regs);
+extern void fpsimd_load_partial_state(struct fpsimd_partial_state *state);
+
 #endif
 
 #endif
diff --git a/arch/arm64/include/asm/fpsimdmacros.h b/arch/arm64/include/asm/fpsimdmacros.h
index bbec599c96bd..42990a82c671 100644
--- a/arch/arm64/include/asm/fpsimdmacros.h
+++ b/arch/arm64/include/asm/fpsimdmacros.h
@@ -62,3 +62,40 @@
 	ldr	w\tmpnr, [\state, #16 * 2 + 4]
 	msr	fpcr, x\tmpnr
 .endm
+
+.altmacro
+.macro q2op, op, q1, q2, state
+	\op	q\q1, q\q2, [\state, # -16 * \q1 - 16]
+.endm
+
+.macro fpsimd_save_partial state, numnr, tmpnr1, tmpnr2
+	mrs	x\tmpnr1, fpsr
+	str	w\numnr, [\state]
+	mrs	x\tmpnr2, fpcr
+	stp	w\tmpnr1, w\tmpnr2, [\state, #4]
+	adr	x\tmpnr1, 0f
+	add	\state, \state, x\numnr, lsl #4
+	sub	x\tmpnr1, x\tmpnr1, x\numnr, lsl #1
+	br	x\tmpnr1
+	.irp	qa, 30, 28, 26, 24, 22, 20, 18, 16, 14, 12, 10, 8, 6, 4, 2, 0
+	qb = \qa + 1
+	q2op	stp, \qa, %qb, \state
+	.endr
+0:
+.endm
+
+.macro fpsimd_restore_partial state, tmpnr1, tmpnr2
+	ldp	w\tmpnr1, w\tmpnr2, [\state, #4]
+	msr	fpsr, x\tmpnr1
+	msr	fpcr, x\tmpnr2
+	adr	x\tmpnr1, 0f
+	ldr	w\tmpnr2, [\state]
+	add	\state, \state, x\tmpnr2, lsl #4
+	sub	x\tmpnr1, x\tmpnr1, x\tmpnr2, lsl #1
+	br	x\tmpnr1
+	.irp	qa, 30, 28, 26, 24, 22, 20, 18, 16, 14, 12, 10, 8, 6, 4, 2, 0
+	qb = \qa + 1
+	q2op	ldp, \qa, %qb, \state
+	.endr
+0:
+.endm
diff --git a/arch/arm64/include/asm/neon.h b/arch/arm64/include/asm/neon.h
index b0cc58a97780..13ce4cc18e26 100644
--- a/arch/arm64/include/asm/neon.h
+++ b/arch/arm64/include/asm/neon.h
@@ -8,7 +8,11 @@
  * published by the Free Software Foundation.
  */
 
+#include <linux/types.h>
+
 #define cpu_has_neon()		(1)
 
-void kernel_neon_begin(void);
+#define kernel_neon_begin()	kernel_neon_begin_partial(32)
+
+void kernel_neon_begin_partial(u32 num_regs);
 void kernel_neon_end(void);
diff --git a/arch/arm64/kernel/entry-fpsimd.S b/arch/arm64/kernel/entry-fpsimd.S
index 6a27cd6dbfa6..d358ccacfc00 100644
--- a/arch/arm64/kernel/entry-fpsimd.S
+++ b/arch/arm64/kernel/entry-fpsimd.S
@@ -41,3 +41,27 @@ ENTRY(fpsimd_load_state)
 	fpsimd_restore x0, 8
 	ret
 ENDPROC(fpsimd_load_state)
+
+#ifdef CONFIG_KERNEL_MODE_NEON
+
+/*
+ * Save the bottom n FP registers.
+ *
+ * x0 - pointer to struct fpsimd_partial_state
+ */
+ENTRY(fpsimd_save_partial_state)
+	fpsimd_save_partial x0, 1, 8, 9
+	ret
+ENDPROC(fpsimd_save_partial_state)
+
+/*
+ * Load the bottom n FP registers.
+ *
+ * x0 - pointer to struct fpsimd_partial_state
+ */
+ENTRY(fpsimd_load_partial_state)
+	fpsimd_restore_partial x0, 8, 9
+	ret
+ENDPROC(fpsimd_load_partial_state)
+
+#endif
diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index 239c8162473f..2ed92976ac16 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -188,29 +188,44 @@ void fpsimd_set_user_state(struct task_struct *t, struct user_fpsimd_state *st)
 
 #ifdef CONFIG_KERNEL_MODE_NEON
 
+static DEFINE_PER_CPU(struct fpsimd_partial_state, hardirq_fpsimdstate);
+static DEFINE_PER_CPU(struct fpsimd_partial_state, softirq_fpsimdstate);
+
 /*
  * Kernel-side NEON support functions
  */
-void kernel_neon_begin(void)
+void kernel_neon_begin_partial(u32 num_regs)
 {
-	/* Avoid using the NEON in interrupt context */
-	BUG_ON(in_interrupt());
-	preempt_disable();
+	if (in_interrupt()) {
+		struct fpsimd_partial_state *s = this_cpu_ptr(
+			in_irq() ? &hardirq_fpsimdstate : &softirq_fpsimdstate);
 
-	/*
-	 * Save the userland FPSIMD state if we have one and if we haven't done
-	 * so already. Clear fpsimd_last_task to indicate that there is no
-	 * longer userland context in the registers.
-	 */
-	if (current->mm && !test_and_set_thread_flag(TIF_FOREIGN_FPSTATE))
-		fpsimd_save_state(&current->thread.fpsimd_state);
-	__get_cpu_var(fpsimd_last_task) = NULL;
+		BUG_ON(num_regs > 32);
+		fpsimd_save_partial_state(s, roundup(num_regs, 2));
+	} else {
+		/*
+		 * Save the userland FPSIMD state if we have one and if we
+		 * haven't done so already. Clear fpsimd_last_task to indicate
+		 * that there is no longer userland context in the registers.
+		 */
+		preempt_disable();
+		if (current->mm &&
+		    !test_and_set_thread_flag(TIF_FOREIGN_FPSTATE))
+			fpsimd_save_state(&current->thread.fpsimd_state);
+		__get_cpu_var(fpsimd_last_task) = NULL;
+	}
 }
-EXPORT_SYMBOL(kernel_neon_begin);
+EXPORT_SYMBOL(kernel_neon_begin_partial);
 
 void kernel_neon_end(void)
 {
-	preempt_enable();
+	if (in_interrupt()) {
+		struct fpsimd_partial_state *s = this_cpu_ptr(
+			in_irq() ? &hardirq_fpsimdstate : &softirq_fpsimdstate);
+		fpsimd_load_partial_state(s);
+	} else {
+		preempt_enable();
+	}
 }
 EXPORT_SYMBOL(kernel_neon_end);
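
A note on the save/restore macros (illustrative, not part of the patch):
fpsimd_save_partial avoids stacking all 32 registers by branching into the
.irp-generated table of stp instructions. Each stp is one 4-byte instruction
storing a pair of q-registers, so "adr x, 0f" followed by
"sub x, x, num_regs, lsl #1" yields the address "0f - num_regs * 2", and the
br executes only the final num_regs / 2 pair stores (num_regs having been
rounded up to an even value by kernel_neon_begin_partial()). A rough C model
of the stores that actually execute, with store_pair() as a hypothetical
stand-in for a single "stp qN, q(N+1)" instruction:

	/* illustrative stand-in for one stp of the pair q<qa>/q<qa+1> */
	static void store_pair(struct fpsimd_partial_state *s, int qa);

	static void fpsimd_save_partial_model(struct fpsimd_partial_state *s,
					      u32 num_regs)
	{
		int qa;

		/*
		 * The table emits pairs for qa = 30, 28, ..., 2, 0; the
		 * computed branch skips all but the last num_regs / 2.
		 */
		for (qa = num_regs - 2; qa >= 0; qa -= 2)
			store_pair(s, qa);
	}

fpsimd_restore_partial dispatches into its ldp table the same way, using the
num_regs value that the save path stored at the start of the buffer.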