Message ID | 20221031142017.26750-1-luoxueqin66@gmail.com |
---|---|
State | New |
Series | [v2] kernel/power : add pr_err() for debugging "Error -14 resuming" error |
On Mon, Oct 31, 2022 at 3:20 PM Luo Xueqin <luoxueqin66@gmail.com> wrote:
>
> From: Xueqin Luo <luoxueqin@kylinos.cn>
>
> The system memory map can change over a hibernation-restore cycle due
> to a defect in the platform firmware, and some of the page frames used
> by the kernel before hibernation may not be available any more during
> the subsequent restore, which leads to the error below.
>
> [ T357] PM: Image loading progress: 0%
> [ T357] PM: Read 2681596 kbytes in 0.03 seconds (89386.53 MB/s)
> [ T357] PM: Error -14 resuming
> [ T357] PM: Failed to load hibernation image, recovering.
> [ T357] PM: Basic memory bitmaps freed
> [ T357] OOM killer enabled.
> [ T357] Restarting tasks ... done.
> [ T357] PM: resume from hibernation failed (-14)
> [ T357] PM: Hibernation image not present or could not be loaded.
>
> So, by adding an error message to the unpack_orig_pfns() function, you can
> quickly locate the offending page frame number and analyze the cause when
> an "Error -14 resuming" error occurs in S4. This can save developers
> debugging time.
>
> Signed-off-by: Xueqin Luo <luoxueqin@kylinos.cn>
> ---
>
> v2: Modify the commit message and pr_err() function output
>
>  kernel/power/snapshot.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/power/snapshot.c b/kernel/power/snapshot.c
> index 2a406753af90..2be2e9f5a060 100644
> --- a/kernel/power/snapshot.c
> +++ b/kernel/power/snapshot.c
> @@ -2259,10 +2259,13 @@ static int unpack_orig_pfns(unsigned long *buf, struct memory_bitmap *bm)
>  		if (unlikely(buf[j] == BM_END_OF_MAP))
>  			break;
>
> -		if (pfn_valid(buf[j]) && memory_bm_pfn_present(bm, buf[j]))
> +		if (pfn_valid(buf[j]) && memory_bm_pfn_present(bm, buf[j])) {
>  			memory_bm_set_bit(bm, buf[j]);
> -		else
> +		} else {
> +			if (!pfn_valid(buf[j]))
> +				pr_err("The page frame number: %lx is not valid.\n", buf[j]);

What about printing this message instead:

pr_err(FW_BUG "Memory map mismatch at 0x%llx after hibernation\n",
       page_address(pfn_to_page(buf[j])));

>  			return -EFAULT;
> +		}
>  	}
>
>  	return 0;
> --
> 2.25.1
>
On Thu, Nov 10, 2022 at 3:13 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
>
> On Mon, Oct 31, 2022 at 3:20 PM Luo Xueqin <luoxueqin66@gmail.com> wrote:
> >
> > From: Xueqin Luo <luoxueqin@kylinos.cn>
> >
> > The system memory map can change over a hibernation-restore cycle due
> > to a defect in the platform firmware, and some of the page frames used
> > by the kernel before hibernation may not be available any more during
> > the subsequent restore, which leads to the error below.
> >
> > [ T357] PM: Image loading progress: 0%
> > [ T357] PM: Read 2681596 kbytes in 0.03 seconds (89386.53 MB/s)
> > [ T357] PM: Error -14 resuming
> > [ T357] PM: Failed to load hibernation image, recovering.
> > [ T357] PM: Basic memory bitmaps freed
> > [ T357] OOM killer enabled.
> > [ T357] Restarting tasks ... done.
> > [ T357] PM: resume from hibernation failed (-14)
> > [ T357] PM: Hibernation image not present or could not be loaded.
> >
> > So, by adding an error message to the unpack_orig_pfns() function, you can
> > quickly locate the offending page frame number and analyze the cause when
> > an "Error -14 resuming" error occurs in S4. This can save developers
> > debugging time.
> >
> > Signed-off-by: Xueqin Luo <luoxueqin@kylinos.cn>
> > ---
> >
> > v2: Modify the commit message and pr_err() function output
> >
> >  kernel/power/snapshot.c | 7 +++++--
> >  1 file changed, 5 insertions(+), 2 deletions(-)
> >
> > diff --git a/kernel/power/snapshot.c b/kernel/power/snapshot.c
> > index 2a406753af90..2be2e9f5a060 100644
> > --- a/kernel/power/snapshot.c
> > +++ b/kernel/power/snapshot.c
> > @@ -2259,10 +2259,13 @@ static int unpack_orig_pfns(unsigned long *buf, struct memory_bitmap *bm)
> >  		if (unlikely(buf[j] == BM_END_OF_MAP))
> >  			break;
> >
> > -		if (pfn_valid(buf[j]) && memory_bm_pfn_present(bm, buf[j]))
> > +		if (pfn_valid(buf[j]) && memory_bm_pfn_present(bm, buf[j])) {
> >  			memory_bm_set_bit(bm, buf[j]);
> > -		else
> > +		} else {
> > +			if (!pfn_valid(buf[j]))
> > +				pr_err("The page frame number: %lx is not valid.\n", buf[j]);
>
> What about printing this message instead:
>
> pr_err(FW_BUG "Memory map mismatch at 0x%llx after hibernation\n",
>        page_address(pfn_to_page(buf[j])));

Actually, this should be

pr_err(FW_BUG "Memory map mismatch at 0x%llx after hibernation\n",
       PFN_PHYS(buf[j]));

> >  			return -EFAULT;
> > +		}
> >  	}
> >
> >  	return 0;
> > --
> > 2.25.1
> >
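To illustrate the correction above: PFN_PHYS() converts a page frame number to a physical address with a plain left shift by PAGE_SHIFT, so it works even for a PFN that fails pfn_valid(), whereas page_address(pfn_to_page()) goes through struct page and yields a kernel virtual address. Below is a minimal user-space sketch of that arithmetic, not kernel code. The `sketch_` names are hypothetical stand-ins, the 4 KiB page size is an assumption, and the cast to unsigned long long is added here so that the argument matches the %llx format.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Assumption: 4 KiB pages; the kernel's PAGE_SHIFT is architecture-specific. */
#define SKETCH_PAGE_SHIFT 12

/* Hypothetical user-space stand-in for the kernel's PFN_PHYS() macro. */
static uint64_t sketch_pfn_phys(uint64_t pfn)
{
	return pfn << SKETCH_PAGE_SHIFT;
}

/*
 * Format the proposed diagnostic into buf. The explicit cast keeps the
 * argument type in line with the %llx conversion. Returns what snprintf()
 * returns.
 */
static int sketch_format_mismatch(char *buf, size_t len, uint64_t pfn)
{
	return snprintf(buf, len,
			"Memory map mismatch at 0x%llx after hibernation\n",
			(unsigned long long)sketch_pfn_phys(pfn));
}
```

Under these assumptions, a PFN of 0x1234 would be reported as physical address 0x1234000.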
diff --git a/kernel/power/snapshot.c b/kernel/power/snapshot.c
index 2a406753af90..2be2e9f5a060 100644
--- a/kernel/power/snapshot.c
+++ b/kernel/power/snapshot.c
@@ -2259,10 +2259,13 @@ static int unpack_orig_pfns(unsigned long *buf, struct memory_bitmap *bm)
 		if (unlikely(buf[j] == BM_END_OF_MAP))
 			break;
 
-		if (pfn_valid(buf[j]) && memory_bm_pfn_present(bm, buf[j]))
+		if (pfn_valid(buf[j]) && memory_bm_pfn_present(bm, buf[j])) {
 			memory_bm_set_bit(bm, buf[j]);
-		else
+		} else {
+			if (!pfn_valid(buf[j]))
+				pr_err("The page frame number: %lx is not valid.\n", buf[j]);
 			return -EFAULT;
+		}
 	}
 
 	return 0;
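
The control flow of the patched loop can be tried outside the kernel. The following is a rough user-space sketch, not the kernel implementation: every `sketch_` name, the fixed-size bitmap, and the "PFN below a limit is valid" rule are hypothetical stand-ins for BM_END_OF_MAP, struct memory_bitmap, pfn_valid() and memory_bm_pfn_present(). It shows that only an invalid PFN triggers the extra message, while both failure cases still return -EFAULT (14 on Linux, matching "Error -14 resuming").

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define SKETCH_END_OF_MAP (~0UL)  /* hypothetical stand-in for BM_END_OF_MAP */
#define SKETCH_MAX_PFN    1024UL  /* hypothetical: PFNs at or above this are invalid */

/* Hypothetical, much simplified replacement for struct memory_bitmap. */
struct sketch_bitmap {
	bool present[SKETCH_MAX_PFN]; /* PFN is covered by the bitmap */
	bool set[SKETCH_MAX_PFN];     /* PFN has been marked */
};

/* Stand-in for pfn_valid(): here a PFN is valid if it is below the limit. */
static bool sketch_pfn_valid(unsigned long pfn)
{
	return pfn < SKETCH_MAX_PFN;
}

/* Mirrors the patched loop: mark known PFNs, report invalid ones, fail otherwise. */
static int sketch_unpack_pfns(const unsigned long *buf, size_t n,
			      struct sketch_bitmap *bm)
{
	for (size_t j = 0; j < n; j++) {
		if (buf[j] == SKETCH_END_OF_MAP)
			break;

		if (sketch_pfn_valid(buf[j]) && bm->present[buf[j]]) {
			bm->set[buf[j]] = true;
		} else {
			/* Only the invalid-PFN case gets a diagnostic. */
			if (!sketch_pfn_valid(buf[j]))
				fprintf(stderr, "invalid PFN 0x%lx\n", buf[j]);
			return -EFAULT;
		}
	}

	return 0;
}
```

Note that a PFN which is valid but absent from the bitmap still fails silently in this flow, as it does in the patch.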