From patchwork Fri Dec 8 11:32:20 2023
X-Patchwork-Submitter: Ard Biesheuvel
X-Patchwork-Id: 751720
X-Mailing-List: linux-crypto@vger.kernel.org
Date: Fri, 8 Dec 2023 12:32:20 +0100
In-Reply-To: <20231208113218.3001940-6-ardb@google.com>
References: <20231208113218.3001940-6-ardb@google.com>
Message-ID: <20231208113218.3001940-7-ardb@google.com>
Subject: [PATCH v4 1/4] arm64: fpsimd: Drop unneeded 'busy' flag
From: Ard Biesheuvel
To: linux-arm-kernel@lists.infradead.org
Cc: linux-crypto@vger.kernel.org, Ard Biesheuvel, Marc Zyngier,
    Will Deacon, Mark Rutland, Kees Cook, Catalin Marinas, Mark Brown,
    Eric Biggers, Sebastian Andrzej Siewior
From: Ard Biesheuvel

Kernel mode NEON will preserve the user mode FPSIMD state by saving it
into the task struct before clobbering the registers. In order to avoid
the need for preserving kernel mode state too, we disallow nested use
of kernel mode NEON, i.e., use in softirq context while the interrupted
task context was using kernel mode NEON too.

Originally, this policy was implemented using a per-CPU flag which was
exposed via may_use_simd(), requiring users of kernel mode NEON to deal
with the possibility that it might return false, and to provide both
NEON and non-NEON code paths. This policy was changed by commit
13150149aa6ded1 ("arm64: fpsimd: run kernel mode NEON with softirqs
disabled"): softirq processing is now disabled entirely instead, and so
may_use_simd() can never fail when called from task or softirq context.

This means we can drop the fpsimd_context_busy flag entirely, and
instead ensure that we disable softirq processing in places where we
formerly relied on the flag for preventing races in the FPSIMD preserve
routines.

Signed-off-by: Ard Biesheuvel
Reviewed-by: Mark Brown
Tested-by: Geert Uytterhoeven
---
 arch/arm64/include/asm/simd.h | 11 +---
 arch/arm64/kernel/fpsimd.c    | 53 +++++---------------
 2 files changed, 13 insertions(+), 51 deletions(-)

diff --git a/arch/arm64/include/asm/simd.h b/arch/arm64/include/asm/simd.h
index 6a75d7ecdcaa..8e86c9e70e48 100644
--- a/arch/arm64/include/asm/simd.h
+++ b/arch/arm64/include/asm/simd.h
@@ -12,8 +12,6 @@
 #include
 #include
 
-DECLARE_PER_CPU(bool, fpsimd_context_busy);
-
 #ifdef CONFIG_KERNEL_MODE_NEON
 
 /*
@@ -28,17 +26,10 @@ static __must_check inline bool may_use_simd(void)
 	/*
 	 * We must make sure that the SVE has been initialized properly
 	 * before using the SIMD in kernel.
-	 * fpsimd_context_busy is only set while preemption is disabled,
-	 * and is clear whenever preemption is enabled. Since
-	 * this_cpu_read() is atomic w.r.t. preemption, fpsimd_context_busy
-	 * cannot change under our feet -- if it's set we cannot be
-	 * migrated, and if it's clear we cannot be migrated to a CPU
-	 * where it is set.
 	 */
 	return !WARN_ON(!system_capabilities_finalized()) &&
 	       system_supports_fpsimd() &&
-	       !in_hardirq() && !irqs_disabled() && !in_nmi() &&
-	       !this_cpu_read(fpsimd_context_busy);
+	       !in_hardirq() && !irqs_disabled() && !in_nmi();
 }
 
 #else /* ! CONFIG_KERNEL_MODE_NEON */
diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index 1559c706d32d..ccc4a78a70e4 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -85,13 +85,13 @@
  * softirq kicks in. Upon vcpu_put(), KVM will save the vcpu FP state and
  * flag the register state as invalid.
  *
- * In order to allow softirq handlers to use FPSIMD, kernel_neon_begin() may
- * save the task's FPSIMD context back to task_struct from softirq context.
- * To prevent this from racing with the manipulation of the task's FPSIMD state
- * from task context and thereby corrupting the state, it is necessary to
- * protect any manipulation of a task's fpsimd_state or TIF_FOREIGN_FPSTATE
- * flag with {, __}get_cpu_fpsimd_context(). This will still allow softirqs to
- * run but prevent them to use FPSIMD.
+ * In order to allow softirq handlers to use FPSIMD, kernel_neon_begin() may be
+ * called from softirq context, which will save the task's FPSIMD context back
+ * to task_struct. To prevent this from racing with the manipulation of the
+ * task's FPSIMD state from task context and thereby corrupting the state, it
+ * is necessary to protect any manipulation of a task's fpsimd_state or
+ * TIF_FOREIGN_FPSTATE flag with get_cpu_fpsimd_context(), which will suspend
+ * softirq servicing entirely until put_cpu_fpsimd_context() is called.
  *
  * For a certain task, the sequence may look something like this:
  * - the task gets scheduled in; if both the task's fpsimd_cpu field
@@ -209,27 +209,14 @@ static inline void sme_free(struct task_struct *t) { }
 
 #endif
 
-DEFINE_PER_CPU(bool, fpsimd_context_busy);
-EXPORT_PER_CPU_SYMBOL(fpsimd_context_busy);
-
 static void fpsimd_bind_task_to_cpu(void);
 
-static void __get_cpu_fpsimd_context(void)
-{
-	bool busy = __this_cpu_xchg(fpsimd_context_busy, true);
-
-	WARN_ON(busy);
-}
-
 /*
  * Claim ownership of the CPU FPSIMD context for use by the calling context.
  *
  * The caller may freely manipulate the FPSIMD context metadata until
  * put_cpu_fpsimd_context() is called.
  *
- * The double-underscore version must only be called if you know the task
- * can't be preempted.
- *
  * On RT kernels local_bh_disable() is not sufficient because it only
  * serializes soft interrupt related sections via a local lock, but stays
  * preemptible. Disabling preemption is the right choice here as bottom
@@ -242,14 +229,6 @@ static void get_cpu_fpsimd_context(void)
 		local_bh_disable();
 	else
 		preempt_disable();
-	__get_cpu_fpsimd_context();
-}
-
-static void __put_cpu_fpsimd_context(void)
-{
-	bool busy = __this_cpu_xchg(fpsimd_context_busy, false);
-
-	WARN_ON(!busy); /* No matching get_cpu_fpsimd_context()? */
 }
 
 /*
@@ -261,18 +240,12 @@ static void __put_cpu_fpsimd_context(void)
  */
 static void put_cpu_fpsimd_context(void)
 {
-	__put_cpu_fpsimd_context();
 	if (!IS_ENABLED(CONFIG_PREEMPT_RT))
 		local_bh_enable();
 	else
 		preempt_enable();
 }
 
-static bool have_cpu_fpsimd_context(void)
-{
-	return !preemptible() && __this_cpu_read(fpsimd_context_busy);
-}
-
 unsigned int task_get_vl(const struct task_struct *task, enum vec_type type)
 {
 	return task->thread.vl[type];
@@ -383,7 +356,7 @@ static void task_fpsimd_load(void)
 	bool restore_ffr;
 
 	WARN_ON(!system_supports_fpsimd());
-	WARN_ON(!have_cpu_fpsimd_context());
+	WARN_ON(preemptible());
 
 	if (system_supports_sve() || system_supports_sme()) {
 		switch (current->thread.fp_type) {
@@ -467,7 +440,7 @@ static void fpsimd_save(void)
 	unsigned int vl;
 
 	WARN_ON(!system_supports_fpsimd());
-	WARN_ON(!have_cpu_fpsimd_context());
+	WARN_ON(preemptible());
 
 	if (test_thread_flag(TIF_FOREIGN_FPSTATE))
 		return;
@@ -1507,7 +1480,7 @@ void fpsimd_thread_switch(struct task_struct *next)
 	if (!system_supports_fpsimd())
 		return;
 
-	__get_cpu_fpsimd_context();
+	WARN_ON_ONCE(!irqs_disabled());
 
 	/* Save unsaved fpsimd state, if any: */
 	fpsimd_save();
@@ -1523,8 +1496,6 @@ void fpsimd_thread_switch(struct task_struct *next)
 
 	update_tsk_thread_flag(next, TIF_FOREIGN_FPSTATE,
 			       wrong_task || wrong_cpu);
-
-	__put_cpu_fpsimd_context();
 }
 
 static void fpsimd_flush_thread_vl(enum vec_type type)
@@ -1829,10 +1800,10 @@ void fpsimd_save_and_flush_cpu_state(void)
	if (!system_supports_fpsimd())
 		return;
 	WARN_ON(preemptible());
-	__get_cpu_fpsimd_context();
+	get_cpu_fpsimd_context();
 	fpsimd_save();
 	fpsimd_flush_cpu_state();
-	__put_cpu_fpsimd_context();
+	put_cpu_fpsimd_context();
 }
 
 #ifdef CONFIG_KERNEL_MODE_NEON
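
For context, the caller-side contract this patch preserves looks roughly
like the sketch below: users of kernel mode NEON check may_use_simd()
and fall back to scalar code when it returns false. This is a minimal
sketch, not code from the patch; do_thing_neon() and do_thing_scalar()
are hypothetical stand-ins for a real NEON worker and its fallback.

#include <linux/string.h>
#include <linux/types.h>

#include <asm/neon.h>
#include <asm/simd.h>

/* Hypothetical stand-ins for a real NEON worker and its scalar fallback. */
static void do_thing_neon(u8 *dst, const u8 *src, unsigned int len)
{
	memcpy(dst, src, len);		/* placeholder for NEON code */
}

static void do_thing_scalar(u8 *dst, const u8 *src, unsigned int len)
{
	memcpy(dst, src, len);		/* placeholder for scalar code */
}

static void do_thing(u8 *dst, const u8 *src, unsigned int len)
{
	if (may_use_simd()) {
		/* Task or softirq context: cannot fail there after this series. */
		kernel_neon_begin();
		do_thing_neon(dst, src, len);
		kernel_neon_end();
	} else {
		/* Hardirq/NMI context (or SIMD unusable): scalar fallback. */
		do_thing_scalar(dst, src, len);
	}
}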
From patchwork Fri Dec 8 11:32:21 2023
X-Patchwork-Submitter: Ard Biesheuvel
X-Patchwork-Id: 752046
X-Mailing-List: linux-crypto@vger.kernel.org
Date: Fri, 8 Dec 2023 12:32:21 +0100
In-Reply-To: <20231208113218.3001940-6-ardb@google.com>
References: <20231208113218.3001940-6-ardb@google.com>
Message-ID: <20231208113218.3001940-8-ardb@google.com>
Subject: [PATCH v4 2/4] arm64: fpsimd: Preserve/restore kernel mode NEON at context switch
From: Ard Biesheuvel
To: linux-arm-kernel@lists.infradead.org
Cc: linux-crypto@vger.kernel.org, Ard Biesheuvel, Marc Zyngier,
    Will Deacon, Mark Rutland, Kees Cook, Catalin Marinas, Mark Brown,
    Eric Biggers, Sebastian Andrzej Siewior

From: Ard Biesheuvel

Currently, the FPSIMD register file is not preserved and restored along with the general
registers on exception entry/exit or context switch. For this reason,
we disable preemption when enabling FPSIMD for kernel mode use in task
context, and suspend the processing of softirqs so that there are no
concurrent uses in the kernel. (Kernel mode FPSIMD may not be used at
all in other contexts.)

Disabling preemption while doing CPU intensive work on inputs of
potentially unbounded size is bad for real-time performance, which is
why we try to ensure that SIMD crypto code does not operate on more
than ~4k at a time, which is an arbitrary limit and requires assembler
code to implement efficiently.

We can avoid the need for disabling preemption if we can ensure that
any in-kernel users of the NEON will not lose the FPSIMD register state
across a context switch. And given that disabling softirqs implicitly
disables preemption as well, we will also have to ensure that a softirq
that runs code using FPSIMD can safely interrupt an in-kernel user.

So introduce a thread_info flag TIF_KERNEL_FPSTATE, and modify the
context switch hook for FPSIMD to preserve and restore the kernel mode
FPSIMD state to/from struct thread_struct when it is set. This avoids
any scheduling blackouts due to prolonged use of FPSIMD in kernel mode,
without the need for manual yielding.

In order to support softirq processing while FPSIMD is being used in
kernel task context, use the same flag to decide whether the kernel
mode FPSIMD state needs to be preserved and restored before allowing
FPSIMD to be used in softirq context.

Signed-off-by: Ard Biesheuvel
Reviewed-by: Mark Brown
Reviewed-by: Mark Rutland
---
 arch/arm64/include/asm/processor.h   |  2 +
 arch/arm64/include/asm/thread_info.h |  1 +
 arch/arm64/kernel/fpsimd.c           | 92 ++++++++++++++++----
 3 files changed, 77 insertions(+), 18 deletions(-)

diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
index e5bc54522e71..ce6eebd6c08b 100644
--- a/arch/arm64/include/asm/processor.h
+++ b/arch/arm64/include/asm/processor.h
@@ -167,6 +167,8 @@ struct thread_struct {
 	unsigned long		fault_address;	/* fault info */
 	unsigned long		fault_code;	/* ESR_EL1 value */
 	struct debug_info	debug;		/* debugging */
+
+	struct user_fpsimd_state	kernel_fpsimd_state;
 #ifdef CONFIG_ARM64_PTR_AUTH
 	struct ptrauth_keys_user	keys_user;
 #ifdef CONFIG_ARM64_PTR_AUTH_KERNEL
diff --git a/arch/arm64/include/asm/thread_info.h b/arch/arm64/include/asm/thread_info.h
index 553d1bc559c6..e72a3bf9e563 100644
--- a/arch/arm64/include/asm/thread_info.h
+++ b/arch/arm64/include/asm/thread_info.h
@@ -80,6 +80,7 @@ void arch_setup_new_exec(void);
 #define TIF_TAGGED_ADDR		26	/* Allow tagged user addresses */
 #define TIF_SME			27	/* SME in use */
 #define TIF_SME_VL_INHERIT	28	/* Inherit SME vl_onexec across exec */
+#define TIF_KERNEL_FPSTATE	29	/* Task is in a kernel mode FPSIMD section */
 
 #define _TIF_SIGPENDING		(1 << TIF_SIGPENDING)
 #define _TIF_NEED_RESCHED	(1 << TIF_NEED_RESCHED)
diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index ccc4a78a70e4..c2d05de677d1 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -357,6 +357,7 @@ static void task_fpsimd_load(void)
 
 	WARN_ON(!system_supports_fpsimd());
 	WARN_ON(preemptible());
+	WARN_ON(test_thread_flag(TIF_KERNEL_FPSTATE));
 
 	if (system_supports_sve() || system_supports_sme()) {
 		switch (current->thread.fp_type) {
@@ -379,7 +380,7 @@
 		default:
 			/*
 			 * This indicates either a bug in
-			 * fpsimd_save() or memory corruption, we
+			 * fpsimd_save_user_state() or memory corruption, we
 			 * should always record an explicit format
 			 * when we save. We always at least have the
 			 * memory allocated for FPSMID registers so
@@ -430,7 +431,7 @@ static void task_fpsimd_load(void)
 * than via current, if we are saving KVM state then it will have
 * ensured that the type of registers to save is set in last->to_save.
 */
-static void fpsimd_save(void)
+static void fpsimd_save_user_state(void)
 {
 	struct cpu_fp_state const *last = this_cpu_ptr(&fpsimd_last_state);
@@ -861,7 +862,7 @@ int vec_set_vector_length(struct task_struct *task, enum vec_type type,
 	if (task == current) {
 		get_cpu_fpsimd_context();
 
-		fpsimd_save();
+		fpsimd_save_user_state();
 	}
 
 	fpsimd_flush_task_state(task);
@@ -1473,6 +1474,16 @@ void do_fpsimd_exc(unsigned long esr, struct pt_regs *regs)
 		       current);
 }
 
+static void fpsimd_load_kernel_state(struct task_struct *task)
+{
+	fpsimd_load_state(&task->thread.kernel_fpsimd_state);
+}
+
+static void fpsimd_save_kernel_state(struct task_struct *task)
+{
+	fpsimd_save_state(&task->thread.kernel_fpsimd_state);
+}
+
 void fpsimd_thread_switch(struct task_struct *next)
 {
 	bool wrong_task, wrong_cpu;
@@ -1483,19 +1494,28 @@ void fpsimd_thread_switch(struct task_struct *next)
 	WARN_ON_ONCE(!irqs_disabled());
 
 	/* Save unsaved fpsimd state, if any: */
-	fpsimd_save();
+	if (test_thread_flag(TIF_KERNEL_FPSTATE))
+		fpsimd_save_kernel_state(current);
+	else
+		fpsimd_save_user_state();
 
-	/*
-	 * Fix up TIF_FOREIGN_FPSTATE to correctly describe next's
-	 * state. For kernel threads, FPSIMD registers are never loaded
-	 * and wrong_task and wrong_cpu will always be true.
-	 */
-	wrong_task = __this_cpu_read(fpsimd_last_state.st) !=
-					&next->thread.uw.fpsimd_state;
-	wrong_cpu = next->thread.fpsimd_cpu != smp_processor_id();
+	if (test_tsk_thread_flag(next, TIF_KERNEL_FPSTATE)) {
+		fpsimd_load_kernel_state(next);
+		set_tsk_thread_flag(next, TIF_FOREIGN_FPSTATE);
+	} else {
+		/*
+		 * Fix up TIF_FOREIGN_FPSTATE to correctly describe next's
+		 * state. For kernel threads, FPSIMD registers are never
+		 * loaded with user mode FPSIMD state and so wrong_task and
+		 * wrong_cpu will always be true.
+		 */
+		wrong_task = __this_cpu_read(fpsimd_last_state.st) !=
+					&next->thread.uw.fpsimd_state;
+		wrong_cpu = next->thread.fpsimd_cpu != smp_processor_id();
 
-	update_tsk_thread_flag(next, TIF_FOREIGN_FPSTATE,
-			       wrong_task || wrong_cpu);
+		update_tsk_thread_flag(next, TIF_FOREIGN_FPSTATE,
+				       wrong_task || wrong_cpu);
+	}
 }
 
 static void fpsimd_flush_thread_vl(enum vec_type type)
@@ -1585,7 +1605,7 @@ void fpsimd_preserve_current_state(void)
 		return;
 
 	get_cpu_fpsimd_context();
-	fpsimd_save();
+	fpsimd_save_user_state();
 	put_cpu_fpsimd_context();
 }
 
@@ -1801,7 +1821,7 @@ void fpsimd_save_and_flush_cpu_state(void)
 		return;
 	WARN_ON(preemptible());
 	get_cpu_fpsimd_context();
-	fpsimd_save();
+	fpsimd_save_user_state();
 	fpsimd_flush_cpu_state();
 	put_cpu_fpsimd_context();
 }
@@ -1835,10 +1855,37 @@ void kernel_neon_begin(void)
 	get_cpu_fpsimd_context();
 
 	/* Save unsaved fpsimd state, if any: */
-	fpsimd_save();
+	if (test_thread_flag(TIF_KERNEL_FPSTATE)) {
+		BUG_ON(IS_ENABLED(CONFIG_PREEMPT_RT) || !in_serving_softirq());
+		fpsimd_save_kernel_state(current);
+	} else {
+		fpsimd_save_user_state();
+
+		/*
+		 * Set the thread flag so that the kernel mode FPSIMD state
+		 * will be context switched along with the rest of the task
+		 * state.
+		 *
+		 * On non-PREEMPT_RT, softirqs may interrupt task level kernel
+		 * mode FPSIMD, but the task will not be preemptible so setting
+		 * TIF_KERNEL_FPSTATE for those would be both wrong (as it
+		 * would mark the task context FPSIMD state as requiring a
+		 * context switch) and unnecessary.
+		 *
+		 * On PREEMPT_RT, softirqs are serviced from a separate thread,
+		 * which is scheduled as usual, and this guarantees that these
+		 * softirqs are not interrupting use of the FPSIMD in kernel
+		 * mode in task context. So in this case, setting the flag here
+		 * is always appropriate.
+		 */
+		if (IS_ENABLED(CONFIG_PREEMPT_RT) || !in_serving_softirq())
+			set_thread_flag(TIF_KERNEL_FPSTATE);
+	}
 
 	/* Invalidate any task state remaining in the fpsimd regs: */
 	fpsimd_flush_cpu_state();
+
+	put_cpu_fpsimd_context();
 }
 EXPORT_SYMBOL_GPL(kernel_neon_begin);
 
@@ -1856,7 +1903,16 @@ void kernel_neon_end(void)
 	if (!system_supports_fpsimd())
 		return;
 
-	put_cpu_fpsimd_context();
+	/*
+	 * If we are returning from a nested use of kernel mode FPSIMD, restore
+	 * the task context kernel mode FPSIMD state. This can only happen when
+	 * running in softirq context on non-PREEMPT_RT.
+	 */
+	if (!IS_ENABLED(CONFIG_PREEMPT_RT) && in_serving_softirq() &&
+	    test_thread_flag(TIF_KERNEL_FPSTATE))
+		fpsimd_load_kernel_state(current);
+	else
+		clear_thread_flag(TIF_KERNEL_FPSTATE);
 }
 EXPORT_SYMBOL_GPL(kernel_neon_end);
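
To illustrate what the new flag buys (a sketch under the assumptions of
this patch, not code taken from it): a task-context user can keep NEON
enabled across a large input and simply be preempted part way through,
since fpsimd_thread_switch() now saves and restores the kernel mode
state for tasks with TIF_KERNEL_FPSTATE set. crunch_buffer() and
do_neon_block() below are hypothetical helpers.

#include <linux/minmax.h>
#include <linux/types.h>

#include <asm/neon.h>

/* Hypothetical placeholder for real NEON code operating on one block. */
static void do_neon_block(u8 *buf, size_t n)
{
}

static void crunch_buffer(u8 *buf, size_t len)
{
	kernel_neon_begin();
	while (len) {
		size_t n = min_t(size_t, len, 64);

		/*
		 * Preemption remains enabled in task context here, so no
		 * manual kernel_neon_end()/kernel_neon_begin() yield points
		 * are needed even for very large inputs.
		 */
		do_neon_block(buf, n);
		buf += n;
		len -= n;
	}
	kernel_neon_end();
}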
From patchwork Fri Dec 8 11:32:22 2023
X-Patchwork-Submitter: Ard Biesheuvel
X-Patchwork-Id: 751719
X-Mailing-List: linux-crypto@vger.kernel.org
Date: Fri, 8 Dec 2023 12:32:22 +0100
In-Reply-To: <20231208113218.3001940-6-ardb@google.com>
References: <20231208113218.3001940-6-ardb@google.com>
Message-ID: <20231208113218.3001940-9-ardb@google.com>
Subject: [PATCH v4 3/4] arm64: fpsimd: Implement lazy restore for kernel mode FPSIMD
From: Ard Biesheuvel
To: linux-arm-kernel@lists.infradead.org
Cc: linux-crypto@vger.kernel.org, Ard Biesheuvel, Marc Zyngier,
    Will Deacon, Mark Rutland, Kees Cook, Catalin Marinas, Mark Brown,
    Eric Biggers, Sebastian Andrzej Siewior

From: Ard Biesheuvel

Now that kernel mode FPSIMD state is context switched along with other
task state, we can enable the existing logic that keeps track of which
task's FPSIMD state the CPU is holding in its registers. If it is the
context of the task that we are switching to, we can elide the reload
of the FPSIMD state from memory.

Note that we also need to check whether the FPSIMD state on this CPU is
the most recent: if a task gets migrated away and back again, the state
in memory may be more recent than the state in the CPU. So add another
CPU id field to task_struct to keep track of this. (We could reuse the
existing CPU id field used for user mode context, but that might result
in user state being discarded unnecessarily, given that two distinct
CPUs could be holding the most recent user mode state and the most
recent kernel mode state.)

Signed-off-by: Ard Biesheuvel
Reviewed-by: Mark Brown
Acked-by: Mark Rutland
---
 arch/arm64/include/asm/processor.h |  1 +
 arch/arm64/kernel/fpsimd.c         | 18 ++++++++++++++++++
 2 files changed, 19 insertions(+)

diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
index ce6eebd6c08b..5b0a04810b23 100644
--- a/arch/arm64/include/asm/processor.h
+++ b/arch/arm64/include/asm/processor.h
@@ -169,6 +169,7 @@ struct thread_struct {
 	struct debug_info	debug;		/* debugging */
 
 	struct user_fpsimd_state	kernel_fpsimd_state;
+	unsigned int		kernel_fpsimd_cpu;
 #ifdef CONFIG_ARM64_PTR_AUTH
 	struct ptrauth_keys_user	keys_user;
 #ifdef CONFIG_ARM64_PTR_AUTH_KERNEL
diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index c2d05de677d1..50ae93d9baec 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -1476,12 +1476,30 @@ void do_fpsimd_exc(unsigned long esr, struct pt_regs *regs)
 
 static void fpsimd_load_kernel_state(struct task_struct *task)
 {
+	struct cpu_fp_state *last = this_cpu_ptr(&fpsimd_last_state);
+
+	/*
+	 * Elide the load if this CPU holds the most recent kernel mode
+	 * FPSIMD context of the current task.
+	 */
+	if (last->st == &task->thread.kernel_fpsimd_state &&
+	    task->thread.kernel_fpsimd_cpu == smp_processor_id())
+		return;
+
 	fpsimd_load_state(&task->thread.kernel_fpsimd_state);
 }
 
 static void fpsimd_save_kernel_state(struct task_struct *task)
 {
+	struct cpu_fp_state cpu_fp_state = {
+		.st		= &task->thread.kernel_fpsimd_state,
+		.to_save	= FP_STATE_FPSIMD,
+	};
+
 	fpsimd_save_state(&task->thread.kernel_fpsimd_state);
+	fpsimd_bind_state_to_cpu(&cpu_fp_state);
+
+	task->thread.kernel_fpsimd_cpu = smp_processor_id();
 }
 
 void fpsimd_thread_switch(struct task_struct *next)
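
Reading the ownership test as a standalone predicate may help review.
This is an illustrative restatement of the check added above, with the
migration scenario spelled out; cpu_holds_latest_kernel_fpsimd() is not
a helper this patch introduces.

static bool cpu_holds_latest_kernel_fpsimd(struct task_struct *task)
{
	struct cpu_fp_state *last = this_cpu_ptr(&fpsimd_last_state);

	/* This CPU's registers were last loaded from this task's kernel buffer... */
	if (last->st != &task->thread.kernel_fpsimd_state)
		return false;

	/*
	 * ...and the task last saved that buffer on this CPU. If the task was
	 * migrated away and back, the in-memory copy (saved on another CPU)
	 * is newer than the registers here, so the reload must not be elided.
	 */
	return task->thread.kernel_fpsimd_cpu == smp_processor_id();
}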
From patchwork Fri Dec 8 11:32:23 2023
X-Patchwork-Submitter: Ard Biesheuvel
X-Patchwork-Id: 752045
X-Mailing-List: linux-crypto@vger.kernel.org
Date: Fri, 8 Dec 2023 12:32:23 +0100
In-Reply-To: <20231208113218.3001940-6-ardb@google.com>
References: <20231208113218.3001940-6-ardb@google.com>
Message-ID: <20231208113218.3001940-10-ardb@google.com>
Subject: [PATCH v4 4/4] arm64: crypto: Disable yielding logic unless CONFIG_PREEMPT_VOLUNTARY=y
From: Ard Biesheuvel
To: linux-arm-kernel@lists.infradead.org
Cc: linux-crypto@vger.kernel.org, Ard Biesheuvel, Marc Zyngier,
    Will Deacon, Mark Rutland, Kees Cook, Catalin Marinas, Mark Brown,
    Eric Biggers, Sebastian Andrzej Siewior

From: Ard Biesheuvel

Now that kernel mode use of SIMD runs with preemption enabled, the
explicit yield logic is redundant for preemptible builds. It should not
be used at all on non-preemptible builds either, where kernel work is
supposed to run to completion and not give up its time slice
prematurely. So make the yield logic depend on CONFIG_PREEMPT_VOLUNTARY.
Once CONFIG_PREEMPT_VOLUNTARY is removed, all the logic it guards can be
removed as well.

Signed-off-by: Ard Biesheuvel
---
 arch/arm64/crypto/aes-ce-ccm-glue.c      | 8 ++++++--
 arch/arm64/crypto/chacha-neon-glue.c     | 5 ++++-
 arch/arm64/crypto/crct10dif-ce-glue.c    | 6 ++++--
 arch/arm64/crypto/nhpoly1305-neon-glue.c | 5 ++++-
 arch/arm64/crypto/poly1305-glue.c        | 5 ++++-
 arch/arm64/crypto/polyval-ce-glue.c      | 9 +++++++--
 arch/arm64/include/asm/assembler.h       | 4 ++--
 7 files changed, 31 insertions(+), 11 deletions(-)

diff --git a/arch/arm64/crypto/aes-ce-ccm-glue.c b/arch/arm64/crypto/aes-ce-ccm-glue.c
index 25cd3808ecbe..82e293a698ff 100644
--- a/arch/arm64/crypto/aes-ce-ccm-glue.c
+++ b/arch/arm64/crypto/aes-ce-ccm-glue.c
@@ -125,13 +125,17 @@ static void ccm_calculate_auth_mac(struct aead_request *req, u8 mac[])
 			scatterwalk_start(&walk, sg_next(walk.sg));
 			n = scatterwalk_clamp(&walk, len);
 		}
-		n = min_t(u32, n, SZ_4K); /* yield NEON at least every 4k */
+
+		if (IS_ENABLED(CONFIG_PREEMPT_VOLUNTARY))
+			n = min_t(u32, n, SZ_4K); /* yield NEON at least every 4k */
+
 		p = scatterwalk_map(&walk);
 
 		macp = ce_aes_ccm_auth_data(mac, p, n, macp, ctx->key_enc,
 					    num_rounds(ctx));
 
-		if (len / SZ_4K > (len - n) / SZ_4K) {
+		if (IS_ENABLED(CONFIG_PREEMPT_VOLUNTARY) &&
+		    len / SZ_4K > (len - n) / SZ_4K) {
 			kernel_neon_end();
 			kernel_neon_begin();
 		}
diff --git a/arch/arm64/crypto/chacha-neon-glue.c b/arch/arm64/crypto/chacha-neon-glue.c
index af2bbca38e70..655b250cef4a 100644
--- a/arch/arm64/crypto/chacha-neon-glue.c
+++ b/arch/arm64/crypto/chacha-neon-glue.c
@@ -88,7 +88,10 @@ void chacha_crypt_arch(u32 *state, u8 *dst, const u8 *src, unsigned int bytes,
 		return chacha_crypt_generic(state, dst, src, bytes, nrounds);
 
 	do {
-		unsigned int todo = min_t(unsigned int, bytes, SZ_4K);
+		unsigned int todo = bytes;
+
+		if (IS_ENABLED(CONFIG_PREEMPT_VOLUNTARY))
+			todo = min_t(unsigned int, todo, SZ_4K);
 
 		kernel_neon_begin();
 		chacha_doneon(state, dst, src, todo, nrounds);
diff --git a/arch/arm64/crypto/crct10dif-ce-glue.c b/arch/arm64/crypto/crct10dif-ce-glue.c
index 09eb1456aed4..c6e8cf4f56da 100644
--- a/arch/arm64/crypto/crct10dif-ce-glue.c
+++ b/arch/arm64/crypto/crct10dif-ce-glue.c
@@ -40,7 +40,8 @@ static int crct10dif_update_pmull_p8(struct shash_desc *desc, const u8 *data,
 		do {
 			unsigned int chunk = length;
 
-			if (chunk > SZ_4K + CRC_T10DIF_PMULL_CHUNK_SIZE)
+			if (IS_ENABLED(CONFIG_PREEMPT_VOLUNTARY) &&
+			    chunk > SZ_4K + CRC_T10DIF_PMULL_CHUNK_SIZE)
 				chunk = SZ_4K;
 
 			kernel_neon_begin();
@@ -65,7 +66,8 @@ static int crct10dif_update_pmull_p64(struct shash_desc *desc, const u8 *data,
 		do {
 			unsigned int chunk = length;
 
-			if (chunk > SZ_4K + CRC_T10DIF_PMULL_CHUNK_SIZE)
+			if (IS_ENABLED(CONFIG_PREEMPT_VOLUNTARY) &&
+			    chunk > SZ_4K + CRC_T10DIF_PMULL_CHUNK_SIZE)
 				chunk = SZ_4K;
 
 			kernel_neon_begin();
diff --git a/arch/arm64/crypto/nhpoly1305-neon-glue.c b/arch/arm64/crypto/nhpoly1305-neon-glue.c
index e4a0b463f080..cbbc51b27d93 100644
--- a/arch/arm64/crypto/nhpoly1305-neon-glue.c
+++ b/arch/arm64/crypto/nhpoly1305-neon-glue.c
@@ -23,7 +23,10 @@ static int nhpoly1305_neon_update(struct shash_desc *desc,
 		return crypto_nhpoly1305_update(desc, src, srclen);
 
 	do {
-		unsigned int n = min_t(unsigned int, srclen, SZ_4K);
+		unsigned int n = srclen;
+
+		if (IS_ENABLED(CONFIG_PREEMPT_VOLUNTARY))
+			n = min_t(unsigned int, n, SZ_4K);
 
 		kernel_neon_begin();
 		crypto_nhpoly1305_update_helper(desc, src, n, nh_neon);
diff --git a/arch/arm64/crypto/poly1305-glue.c b/arch/arm64/crypto/poly1305-glue.c
index 1fae18ba11ed..27f84f5bfc98 100644
--- a/arch/arm64/crypto/poly1305-glue.c
+++ b/arch/arm64/crypto/poly1305-glue.c
@@ -144,7 +144,10 @@ void poly1305_update_arch(struct poly1305_desc_ctx *dctx, const u8 *src,
 	if (static_branch_likely(&have_neon) && crypto_simd_usable()) {
 		do {
-			unsigned int todo = min_t(unsigned int, len, SZ_4K);
+			unsigned int todo = len;
+
+			if (IS_ENABLED(CONFIG_PREEMPT_VOLUNTARY))
+				todo = min_t(unsigned int, todo, SZ_4K);
 
 			kernel_neon_begin();
 			poly1305_blocks_neon(&dctx->h, src, todo, 1);
diff --git a/arch/arm64/crypto/polyval-ce-glue.c b/arch/arm64/crypto/polyval-ce-glue.c
index 0a3b5718df85..c4c0fb3fcaf4 100644
--- a/arch/arm64/crypto/polyval-ce-glue.c
+++ b/arch/arm64/crypto/polyval-ce-glue.c
@@ -123,8 +123,13 @@ static int polyval_arm64_update(struct shash_desc *desc,
 	}
 
 	while (srclen >= POLYVAL_BLOCK_SIZE) {
-		/* allow rescheduling every 4K bytes */
-		nblocks = min(srclen, 4096U) / POLYVAL_BLOCK_SIZE;
+		unsigned int len = srclen;
+
+		if (IS_ENABLED(CONFIG_PREEMPT_VOLUNTARY))
+			/* allow rescheduling every 4K bytes */
+			len = min(len, 4096U);
+
+		nblocks = len / POLYVAL_BLOCK_SIZE;
 		internal_polyval_update(tctx, src, nblocks, dctx->buffer);
 		srclen -= nblocks * POLYVAL_BLOCK_SIZE;
 		src += nblocks * POLYVAL_BLOCK_SIZE;
diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index 376a980f2bad..0180ac1f9b8b 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -769,6 +769,7 @@ alternative_endif
 * field)
 */
	.macro		cond_yield, lbl:req, tmp:req, tmp2:req
+#ifdef CONFIG_PREEMPT_VOLUNTARY
	get_current_task \tmp
	ldr		\tmp, [\tmp, #TSK_TI_PREEMPT]
	/*
@@ -777,15 +778,14 @@ alternative_endif
	 * run to completion as quickly as we can.
	 */
	tbnz		\tmp, #SOFTIRQ_SHIFT, .Lnoyield_\@
-#ifdef CONFIG_PREEMPTION
	sub		\tmp, \tmp, #PREEMPT_DISABLE_OFFSET
	cbz		\tmp, \lbl
-#endif
	adr_l		\tmp, irq_stat + IRQ_CPUSTAT_SOFTIRQ_PENDING
	get_this_cpu_offset \tmp2
	ldr		w\tmp, [\tmp, \tmp2]
	cbnz		w\tmp, \lbl	// yield on pending softirq in task context
.Lnoyield_\@:
+#endif
	.endm

 /*
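
The glue changes above all converge on one pattern. A condensed sketch
for illustration only (simd_chunk() is not a helper added by this
series):

static unsigned int simd_chunk(unsigned int remaining)
{
	/*
	 * Only CONFIG_PREEMPT_VOLUNTARY builds cap the amount of data
	 * processed per kernel_neon_begin()/kernel_neon_end() section, so
	 * that cond_yield or re-entering kernel_neon_begin() can offer a
	 * reschedule point. Preemptible builds can be preempted inside the
	 * section, and non-preemptible builds run to completion anyway.
	 */
	if (IS_ENABLED(CONFIG_PREEMPT_VOLUNTARY))
		return min_t(unsigned int, remaining, SZ_4K);

	return remaining;
}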