Message ID | 1392774729-3235-4-git-send-email-sebastian.capella@linaro.org |
---|---|
State | New |
Headers | show |
On Wed, Feb 19, 2014 at 01:52:09AM +0000, Sebastian Capella wrote: [...] > diff --git a/arch/arm/kernel/hibernate.c b/arch/arm/kernel/hibernate.c > new file mode 100644 > index 0000000..16f406f > --- /dev/null > +++ b/arch/arm/kernel/hibernate.c > @@ -0,0 +1,106 @@ > +/* > + * Hibernation support specific for ARM > + * > + * Derived from work on ARM hibernation support by: > + * > + * Ubuntu project, hibernation support for mach-dove > + * Copyright (C) 2010 Nokia Corporation (Hiroshi Doyu) > + * Copyright (C) 2010 Texas Instruments, Inc. (Teerth Reddy et al.) > + * https://lkml.org/lkml/2010/6/18/4 > + * https://lists.linux-foundation.org/pipermail/linux-pm/2010-June/027422.html > + * https://patchwork.kernel.org/patch/96442/ > + * > + * Copyright (C) 2006 Rafael J. Wysocki <rjw@sisk.pl> > + * > + * License terms: GNU General Public License (GPL) version 2 > + */ > + > +#include <linux/mm.h> > +#include <linux/suspend.h> > +#include <asm/tlbflush.h> > +#include <asm/cacheflush.h> > +#include <asm/system_misc.h> > +#include <asm/idmap.h> > +#include <asm/suspend.h> > + > +extern const void __nosave_begin, __nosave_end; > + > +int pfn_is_nosave(unsigned long pfn) > +{ > + unsigned long nosave_begin_pfn = > + __pa_symbol(&__nosave_begin) >> PAGE_SHIFT; > + unsigned long nosave_end_pfn = > + PAGE_ALIGN(__pa_symbol(&__nosave_end)) >> PAGE_SHIFT; > + > + return (pfn >= nosave_begin_pfn) && (pfn < nosave_end_pfn); > +} > + > +void notrace save_processor_state(void) > +{ > + WARN_ON(num_online_cpus() != 1); > + flush_thread(); Can you explain to me please why we need to call flush_thread() here ? At this point in time syscore_suspend() was already called and CPU peripheral state saved through CPU PM notifiers. > + local_fiq_disable(); To me it looks like we are using this hook to disable fiqs, since it is not done in generic code. > +} > + > +void notrace restore_processor_state(void) > +{ > + local_fiq_enable(); > +} > + > +/* > + * Snapshot kernel memory and reset the system. > + * After resume, the hibernation snapshot is written out. > + */ > +static int notrace __swsusp_arch_save_image(unsigned long unused) > +{ > + int ret; > + > + ret = swsusp_save(); > + if (ret == 0) > + soft_restart(virt_to_phys(cpu_resume)); By the time the suspend finisher (ie this function) is run, the processor state has been saved and I think that's all you have to do, function can just return after calling swsusp_save(), unless I am missing something. I do not understand why a soft_restart is required here. On a side note, finisher is called with irqs disabled so, since you added a function for soft restart noirq, it should be used, if needed, but I have to understand why in the first place. > + return ret; > +} > + > +/* > + * Save the current CPU state before suspend / poweroff. > + */ > +int notrace swsusp_arch_suspend(void) > +{ > + return cpu_suspend(0, __swsusp_arch_save_image); If the goal of soft_restart is to return 0 on success from this call, you can still do that without requiring a soft_restart in the first place. IIUC all you want to achieve is to save processor context registers so that when you resume from image you will actually return from here. > +} > + > +/* > + * The framework loads the hibernation image into a linked list anchored > + * at restore_pblist, for swsusp_arch_resume() to copy back to the proper > + * destinations. > + * > + * To make this work if resume is triggered from initramfs, the > + * pagetables need to be switched to allow writes to kernel mem. Can you elaborate a bit more on this please ? > + */ > +static void notrace __swsusp_arch_restore_image(void *unused) > +{ > + struct pbe *pbe; > + > + cpu_switch_mm(idmap_pgd, &init_mm); Same here, thanks. > + for (pbe = restore_pblist; pbe; pbe = pbe->next) > + copy_page(pbe->orig_address, pbe->address); > + > + soft_restart_noirq(virt_to_phys(cpu_resume)); This soft_restart is justified so that you resume from the context saved when creating the image. > +} > + > +static u8 __swsusp_resume_stk[PAGE_SIZE/2] __nosavedata; > + > +/* > + * Resume from the hibernation image. > + * Due to the kernel heap / data restore, stack contents change underneath > + * and that would make function calls impossible; switch to a temporary > + * stack within the nosave region to avoid that problem. > + */ > +int __naked swsusp_arch_resume(void) > +{ > + extern void call_with_stack(void (*fn)(void *), void *arg, void *sp); Ok, a function with attribute __naked that still calls C functions, is attr __naked really needed here ? > + cpu_init(); /* get a clean PSR */ cpu_init is called in the cpu_resume path, why is this call needed here ? Thanks, Lorenzo -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Feb 19, 2014 at 07:10:31PM +0000, Russ Dill wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > On 02/19/2014 08:12 AM, Lorenzo Pieralisi wrote: > > + * https://patchwork.kernel.org/patch/96442/ > I am guessing the snippets of code your comments refer to. > I think the idea here is to get the CPU into a state so that later > when we resume from the resume kernel, the actual CPU state matches > the state we have in kernel. The main thing flush_thread does is clear > out any and all FP state. Which has already been saved through syscore_suspend().... > The may be part of the patchset that is OBE. It has to be updated then. > cpu_resume makes many assumptions about the state of the state of the > CPU, the primary being that the MMU is disabled, but also that all > caches and IRQs are disabled. soft_restart does all this for us. > > > > ah, you are saying just return from __swsusp_arch_save_image and allow > cpu_suspend_abort to be called, placing the result of swsusp_save > somewhere else. This may work and would reduce the complexity of the > code slightly. Yes. Basically you are doing a soft reboot just to return 0. > This is taken from the previous iteration of the patchset, I think the > comment is OBE. Updated it please then. > But this is still required to select the right mapping for our copying. /me confused. Please describe what switching to idmap is meant to achieve. In the patch above the copied swapper pgdir is not even used, I would like to understand why this is done. > I don't remember why I needed to prevent gcc from manipulating the > stack here. That's not a good reason to mark a function with attr __naked. If it is needed we leave it there, if it is not it has to go. > This is another holdover from previous patch versions that may be OBE. See above. Lorenzo -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi Sebastian, On Wed, Feb 19, 2014 at 07:33:15PM +0000, Sebastian Capella wrote: > Quoting Lorenzo Pieralisi (2014-02-19 08:12:54) > > On Wed, Feb 19, 2014 at 01:52:09AM +0000, Sebastian Capella wrote: > > [...] > > > diff --git a/arch/arm/kernel/hibernate.c b/arch/arm/kernel/hibernate.c > > > new file mode 100644 > > > index 0000000..16f406f > > > --- /dev/null > > > +++ b/arch/arm/kernel/hibernate.c > > > +void notrace save_processor_state(void) > > > +{ > > > + WARN_ON(num_online_cpus() != 1); > > > + flush_thread(); > > > > Can you explain to me please why we need to call flush_thread() here ? > > At this point in time syscore_suspend() was already called and CPU > > peripheral state saved through CPU PM notifiers. > > Copying Russ' response here: > > "I think the idea here is to get the CPU into a state so that later > when we resume from the resume kernel, the actual CPU state matches > the state we have in kernel. The main thing flush_thread does is clear > out any and all FP state." - Russ Dill See my reply to Russ. [...] > > > +static void notrace __swsusp_arch_restore_image(void *unused) > > > +{ > > > + struct pbe *pbe; > > > + > > > + cpu_switch_mm(idmap_pgd, &init_mm); > > > > Same here, thanks. > > At restore time, we take the save buffer data and restore it to the same > physical locations used in the previous execution. This will require having > write access to all of memory, which may not be generally granted by the > current mm. So we switch to 1-1 init_mm to restore memory. I still do not understand why switching to idmap, which is a clone of init_mm + 1:1 kernel mappings is required here. Why idmap ? And while at it, can't the idmap be overwritten _while_ copying back the resume kernel ? Is it safe to use idmap page tables while copying ? I had a look at x86 and there idmap page tables used to resume are created on the fly using safe pages, on ARM idmap is created at boot. I am grokking the code to understand what is really needed here, will get back to you asap but I would like things to be clarified in the interim. Thanks, Lorenzo -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Sat, Feb 22, 2014 at 10:38:40AM +0000, Russell King - ARM Linux wrote: > On Wed, Feb 19, 2014 at 04:12:54PM +0000, Lorenzo Pieralisi wrote: > > On Wed, Feb 19, 2014 at 01:52:09AM +0000, Sebastian Capella wrote: > > > +/* > > > + * Snapshot kernel memory and reset the system. > > > + * After resume, the hibernation snapshot is written out. > > > + */ > > > +static int notrace __swsusp_arch_save_image(unsigned long unused) > > > +{ > > > + int ret; > > > + > > > + ret = swsusp_save(); > > > + if (ret == 0) > > > + soft_restart(virt_to_phys(cpu_resume)); > > > > By the time the suspend finisher (ie this function) is run, the > > processor state has been saved and I think that's all you have to do, > > function can just return after calling swsusp_save(), unless I am missing > > something. > > > > I do not understand why a soft_restart is required here. On a side note, > > finisher is called with irqs disabled so, since you added a function for > > soft restart noirq, it should be used, if needed, but I have to understand > > why in the first place. > > It's required because you can't just return from the finisher. A normal > return from the finisher will always be interpreted as an abort rather > than success (because the state has to be unwound.) > > This is the only way to get a zero return from cpu_suspend(). Yes, that's the only reason why this code is jumping to cpu_resume, since all it is needed is to snapshot the CPU context and by the time the finisher is called that's done. Wanted to say that soft reboot is not useful (cache flushing and resume with MMU off), but what you are saying is correct. We might be saving swsusp_save return value in a global variable and just return from the finisher, but that's horrible and given the amount of time it takes to snapshot the image to disk the cost of this soft reboot will be dwarfed by that. I wanted to ask and clarify why the code was written like this though, given its complexity. > > > +/* > > > + * The framework loads the hibernation image into a linked list anchored > > > + * at restore_pblist, for swsusp_arch_resume() to copy back to the proper > > > + * destinations. > > > + * > > > + * To make this work if resume is triggered from initramfs, the > > > + * pagetables need to be switched to allow writes to kernel mem. > > > > Can you elaborate a bit more on this please ? > > > > > + */ > > > +static void notrace __swsusp_arch_restore_image(void *unused) > > > +{ > > > + struct pbe *pbe; > > > + > > > + cpu_switch_mm(idmap_pgd, &init_mm); > > > > Same here, thanks. > > > > > + for (pbe = restore_pblist; pbe; pbe = pbe->next) > > > + copy_page(pbe->orig_address, pbe->address); > > > + > > > + soft_restart_noirq(virt_to_phys(cpu_resume)); > > > > This soft_restart is justified so that you resume from the context saved > > when creating the image. > > You need the idmap_pgd in place to call cpu_resume at it's physical > address. Other page tables just won't do here. It's well established > that this page table must be in place for the resume paths to work. Well, we do not need idmap page tables for copying the restore_pblist, but we do need a set of tables that won't be corrupted by the copy and idmap does the trick (I was confused because 1:1 mappings are not needed for the copy itself). The switch to idmap is done for us in soft_reboot anyway before jumping to cpu_resume and that's required, as you said. Lorenzo -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Sat, Feb 22, 2014 at 10:16:55AM +0000, Russell King - ARM Linux wrote: > On Thu, Feb 20, 2014 at 04:27:55PM +0000, Lorenzo Pieralisi wrote: > > I still do not understand why switching to idmap, which is a clone of > > init_mm + 1:1 kernel mappings is required here. Why idmap ? > > > > And while at it, can't the idmap be overwritten _while_ copying back the > > resume kernel ? Is it safe to use idmap page tables while copying ? > > > > I had a look at x86 and there idmap page tables used to resume are created > > on the fly using safe pages, on ARM idmap is created at boot. > > That's fine. > > Remember, you're required to boot exactly the same kernel image when > resuming as the kernel which created the suspend image. Unless you > have random allocations going on, you should get the same layout for > the idmap stuff at each boot. Thanks Russell, now that's clear. We do need a copy of page tables that are not tampered with while copying, and idmap works well for that. Lorenzo -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Quoting Lorenzo Pieralisi (2014-02-22 04:09:10) > On Sat, Feb 22, 2014 at 10:38:40AM +0000, Russell King - ARM Linux wrote: > > On Wed, Feb 19, 2014 at 04:12:54PM +0000, Lorenzo Pieralisi wrote: > > > > + cpu_switch_mm(idmap_pgd, &init_mm); > > > > You need the idmap_pgd in place to call cpu_resume at it's physical > > address. Other page tables just won't do here. It's well established > > that this page table must be in place for the resume paths to work. > > Well, we do not need idmap page tables for copying the restore_pblist, > but we do need a set of tables that won't be corrupted by the copy and > idmap does the trick (I was confused because 1:1 mappings are not needed > for the copy itself). > > The switch to idmap is done for us in soft_reboot anyway before jumping to > cpu_resume and that's required, as you said. Ok, so I'll leave the cpu_switch_mm as is for the next patchset. Thanks Lorenzo, Russ and Russell! Sebastian -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Quoting Lorenzo Pieralisi (2014-02-22 04:09:10) > On Sat, Feb 22, 2014 at 10:38:40AM +0000, Russell King - ARM Linux wrote: > > On Wed, Feb 19, 2014 at 04:12:54PM +0000, Lorenzo Pieralisi wrote: > > > On Wed, Feb 19, 2014 at 01:52:09AM +0000, Sebastian Capella wrote: > > > > +/* > > > > + * Snapshot kernel memory and reset the system. > > > > + * After resume, the hibernation snapshot is written out. > > > > + */ > > > > +static int notrace __swsusp_arch_save_image(unsigned long unused) > > > > +{ > > > > + int ret; > > > > + > > > > + ret = swsusp_save(); > > > > + if (ret == 0) > > > > + soft_restart(virt_to_phys(cpu_resume)); > > > > > > By the time the suspend finisher (ie this function) is run, the > > > processor state has been saved and I think that's all you have to do, > > > function can just return after calling swsusp_save(), unless I am missing > > > something. > > > > > > I do not understand why a soft_restart is required here. On a side note, > > > finisher is called with irqs disabled so, since you added a function for > > > soft restart noirq, it should be used, if needed, but I have to understand > > > why in the first place. > > > > It's required because you can't just return from the finisher. A normal > > return from the finisher will always be interpreted as an abort rather > > than success (because the state has to be unwound.) > > > > This is the only way to get a zero return from cpu_suspend(). > > Yes, that's the only reason why this code is jumping to cpu_resume, since > all it is needed is to snapshot the CPU context and by the time the > finisher is called that's done. Wanted to say that soft reboot is not > useful (cache flushing and resume with MMU off), but what you are saying > is correct. We might be saving swsusp_save return value in a global > variable and just return from the finisher, but that's horrible and > given the amount of time it takes to snapshot the image to disk the > cost of this soft reboot will be dwarfed by that. > > I wanted to ask and clarify why the code was written like this though, given > its complexity. We could also return a constant > 1. __cpu_suspend code will replace a 0 return with 1 for paths exiting suspend, but will not change return values != 0. cpu_suspend_abort: ldmia sp!, {r1 - r3} @ pop phys pgd, virt SP, phys resume fn teq r0, #0 moveq r0, #1 @ force non-zero value mov sp, r2 ldmfd sp!, {r4 - r11, pc} We could take advantage of that if we wanted, but Lorenzo pointed out also that the relative benefit is very low since the cost of resuming is >> soft_restart. I'll go with leaving the soft_restart as is unless someone feels strongly against. Thanks! Sebastian -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h index 8756e4b..1079ea8 100644 --- a/arch/arm/include/asm/memory.h +++ b/arch/arm/include/asm/memory.h @@ -291,6 +291,7 @@ static inline void *phys_to_virt(phys_addr_t x) */ #define __pa(x) __virt_to_phys((unsigned long)(x)) #define __va(x) ((void *)__phys_to_virt((phys_addr_t)(x))) +#define __pa_symbol(x) __pa(RELOC_HIDE((unsigned long)(x), 0)) #define pfn_to_kaddr(pfn) __va((pfn) << PAGE_SHIFT) extern phys_addr_t (*arch_virt_to_idmap)(unsigned long x); diff --git a/arch/arm/kernel/Makefile b/arch/arm/kernel/Makefile index a30fc9b..8afa848 100644 --- a/arch/arm/kernel/Makefile +++ b/arch/arm/kernel/Makefile @@ -39,6 +39,7 @@ obj-$(CONFIG_ARTHUR) += arthur.o obj-$(CONFIG_ISA_DMA) += dma-isa.o obj-$(CONFIG_PCI) += bios32.o isa.o obj-$(CONFIG_ARM_CPU_SUSPEND) += sleep.o suspend.o +obj-$(CONFIG_HIBERNATION) += hibernate.o obj-$(CONFIG_SMP) += smp.o ifdef CONFIG_MMU obj-$(CONFIG_SMP) += smp_tlb.o diff --git a/arch/arm/kernel/hibernate.c b/arch/arm/kernel/hibernate.c new file mode 100644 index 0000000..16f406f --- /dev/null +++ b/arch/arm/kernel/hibernate.c @@ -0,0 +1,106 @@ +/* + * Hibernation support specific for ARM + * + * Derived from work on ARM hibernation support by: + * + * Ubuntu project, hibernation support for mach-dove + * Copyright (C) 2010 Nokia Corporation (Hiroshi Doyu) + * Copyright (C) 2010 Texas Instruments, Inc. (Teerth Reddy et al.) + * https://lkml.org/lkml/2010/6/18/4 + * https://lists.linux-foundation.org/pipermail/linux-pm/2010-June/027422.html + * https://patchwork.kernel.org/patch/96442/ + * + * Copyright (C) 2006 Rafael J. Wysocki <rjw@sisk.pl> + * + * License terms: GNU General Public License (GPL) version 2 + */ + +#include <linux/mm.h> +#include <linux/suspend.h> +#include <asm/tlbflush.h> +#include <asm/cacheflush.h> +#include <asm/system_misc.h> +#include <asm/idmap.h> +#include <asm/suspend.h> + +extern const void __nosave_begin, __nosave_end; + +int pfn_is_nosave(unsigned long pfn) +{ + unsigned long nosave_begin_pfn = + __pa_symbol(&__nosave_begin) >> PAGE_SHIFT; + unsigned long nosave_end_pfn = + PAGE_ALIGN(__pa_symbol(&__nosave_end)) >> PAGE_SHIFT; + + return (pfn >= nosave_begin_pfn) && (pfn < nosave_end_pfn); +} + +void notrace save_processor_state(void) +{ + WARN_ON(num_online_cpus() != 1); + flush_thread(); + local_fiq_disable(); +} + +void notrace restore_processor_state(void) +{ + local_fiq_enable(); +} + +/* + * Snapshot kernel memory and reset the system. + * After resume, the hibernation snapshot is written out. + */ +static int notrace __swsusp_arch_save_image(unsigned long unused) +{ + int ret; + + ret = swsusp_save(); + if (ret == 0) + soft_restart(virt_to_phys(cpu_resume)); + return ret; +} + +/* + * Save the current CPU state before suspend / poweroff. + */ +int notrace swsusp_arch_suspend(void) +{ + return cpu_suspend(0, __swsusp_arch_save_image); +} + +/* + * The framework loads the hibernation image into a linked list anchored + * at restore_pblist, for swsusp_arch_resume() to copy back to the proper + * destinations. + * + * To make this work if resume is triggered from initramfs, the + * pagetables need to be switched to allow writes to kernel mem. + */ +static void notrace __swsusp_arch_restore_image(void *unused) +{ + struct pbe *pbe; + + cpu_switch_mm(idmap_pgd, &init_mm); + for (pbe = restore_pblist; pbe; pbe = pbe->next) + copy_page(pbe->orig_address, pbe->address); + + soft_restart_noirq(virt_to_phys(cpu_resume)); +} + +static u8 __swsusp_resume_stk[PAGE_SIZE/2] __nosavedata; + +/* + * Resume from the hibernation image. + * Due to the kernel heap / data restore, stack contents change underneath + * and that would make function calls impossible; switch to a temporary + * stack within the nosave region to avoid that problem. + */ +int __naked swsusp_arch_resume(void) +{ + extern void call_with_stack(void (*fn)(void *), void *arg, void *sp); + cpu_init(); /* get a clean PSR */ + call_with_stack(__swsusp_arch_restore_image, 0, + __swsusp_resume_stk + sizeof(__swsusp_resume_stk)); + return 0; +} diff --git a/arch/arm/mm/Kconfig b/arch/arm/mm/Kconfig index 1f8fed9..83707702 100644 --- a/arch/arm/mm/Kconfig +++ b/arch/arm/mm/Kconfig @@ -611,6 +611,11 @@ config CPU_USE_DOMAINS config IO_36 bool +config ARCH_HIBERNATION_POSSIBLE + bool + depends on MMU + default y if CPU_ARM920T || CPU_ARM926T || CPU_SA1100 || CPU_XSCALE || CPU_XSC3 || CPU_V6 || CPU_V6K || CPU_V7 + comment "Processor Features" config ARM_LPAE diff --git a/include/linux/suspend.h b/include/linux/suspend.h index f73cabf..38bbf95 100644 --- a/include/linux/suspend.h +++ b/include/linux/suspend.h @@ -320,6 +320,8 @@ extern unsigned long get_safe_page(gfp_t gfp_mask); extern void hibernation_set_ops(const struct platform_hibernation_ops *ops); extern int hibernate(void); extern bool system_entering_hibernation(void); +asmlinkage int swsusp_save(void); +extern struct pbe *restore_pblist; #else /* CONFIG_HIBERNATION */ static inline void register_nosave_region(unsigned long b, unsigned long e) {} static inline void register_nosave_region_late(unsigned long b, unsigned long e) {}