Message ID | 20220325161428.5068d97e@imladris.surriel.com |
---|---|
State | Accepted |
Commit | 3149c79f3cb0e2e3bafb7cfadacec090cbd250d3 |
Series | mm,hwpoison: unmap poisoned page before invalidation |
On Sat, 2022-03-26 at 15:48 +0800, Miaohe Lin wrote:
> On 2022/3/26 4:14, Rik van Riel wrote:
> >
> > +++ b/mm/memory.c
> > @@ -3918,14 +3918,18 @@ static vm_fault_t __do_fault(struct vm_fault *vmf)
> >  		return ret;
> >
> >  	if (unlikely(PageHWPoison(vmf->page))) {
> > +		struct page *page = vmf->page;
> >  		vm_fault_t poisonret = VM_FAULT_HWPOISON;
> >  		if (ret & VM_FAULT_LOCKED) {
> > +			if (page_mapped(page))
> > +				unmap_mapping_pages(page_mapping(page),
> > +						    page->index, 1, false);
>
> It seems this unmap_mapping_pages also helps the success rate of the
> below invalidate_inode_page.
>

That is indeed what it is supposed to do.

It isn't foolproof, since you can still end up with dirty pages that
don't get cleaned immediately, but it seems to turn infinite loops of
a program being killed every time it's started into a more manageable
situation where the task succeeds again pretty quickly.

> >  			/* Retry if a clean page was removed from the cache. */
> > -			if (invalidate_inode_page(vmf->page))
> > -				poisonret = 0;
> > -			unlock_page(vmf->page);
> > +			if (invalidate_inode_page(page))
> > +				poisonret = VM_FAULT_NOPAGE;
> > +			unlock_page(page);
> >  		}
> > -		put_page(vmf->page);
> > +		put_page(page);
>
> Do we use page instead of vmf->page just for simplicity? Or is there
> some other concern?
>

Just a simplification, and not dereferencing the same thing 6 times.

> >  		vmf->page = NULL;
>
> We return either VM_FAULT_NOPAGE or VM_FAULT_HWPOISON with
> vmf->page = NULL. In any case, finish_fault won't be called later.
> So I think your fix is right.

Want to send in a Reviewed-by or Acked-by? :)
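To make the point about the unmap step concrete, here is a minimal userspace sketch of the ordering discussed above. It is not kernel code: `struct model_page`, `model_unmap`, `model_invalidate`, and `model_unmap_then_invalidate` are hypothetical names, and the model assumes only what the thread states, namely that invalidation fails for a page that is still mapped or still dirty.

```c
#include <stdbool.h>

/* Hypothetical stand-in for page state; not the kernel's struct page. */
struct model_page {
	bool dirty;      /* page has unwritten changes */
	int  mapcount;   /* number of page tables still mapping the page */
	bool in_cache;   /* page is still present in the page cache */
};

/* Model of the unmap step: drop every mapping of the page. */
static void model_unmap(struct model_page *p)
{
	p->mapcount = 0;
}

/*
 * Model of invalidation. Assumed to refuse pages that are dirty or
 * still mapped. Returns true if the page was removed from the cache.
 */
static bool model_invalidate(struct model_page *p)
{
	if (!p->in_cache || p->dirty || p->mapcount > 0)
		return false;
	p->in_cache = false;
	return true;
}

/* Patched ordering: unmap first, then try to invalidate. */
static bool model_unmap_then_invalidate(struct model_page *p)
{
	if (p->mapcount > 0)
		model_unmap(p);
	return model_invalidate(p);
}
```

In this model a clean page mapped by two processes cannot be invalidated directly, but can be once it is unmapped first. A dirty page still fails either way, which matches the "not foolproof" caveat above.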
On Mon, 2022-03-28 at 10:14 +0800, Miaohe Lin wrote:
> On 2022/3/27 4:14, Rik van Riel wrote:
> > > >  			/* Retry if a clean page was removed from the cache. */
> > > > -			if (invalidate_inode_page(vmf->page))
> > > > -				poisonret = 0;
> > > > -			unlock_page(vmf->page);
> > > > +			if (invalidate_inode_page(page))
> > > > +				poisonret = VM_FAULT_NOPAGE;
> > > > +			unlock_page(page);
>
> Sure, but when I think more about this, it seems this fix isn't ideal:
> If VM_FAULT_NOPAGE is returned with page table unset, the process will
> re-trigger page fault again and again until invalidate_inode_page
> succeeds to evict the inode page. This might hang the process a really
> long time. Or am I missing something?
>

If invalidate_inode_page fails, we will return VM_FAULT_HWPOISON,
and kill the task, instead of looping indefinitely.
On 2022/3/28 10:24, Rik van Riel wrote:
> On Mon, 2022-03-28 at 10:14 +0800, Miaohe Lin wrote:
>> On 2022/3/27 4:14, Rik van Riel wrote:
>>>>>  			/* Retry if a clean page was removed from the cache. */
>>>>> -			if (invalidate_inode_page(vmf->page))
>>>>> -				poisonret = 0;
>>>>> -			unlock_page(vmf->page);
>>>>> +			if (invalidate_inode_page(page))
>>>>> +				poisonret = VM_FAULT_NOPAGE;
>>>>> +			unlock_page(page);
>>
>> Sure, but when I think more about this, it seems this fix isn't ideal:
>> If VM_FAULT_NOPAGE is returned with page table unset, the process will
>> re-trigger page fault again and again until invalidate_inode_page
>> succeeds to evict the inode page. This might hang the process a really
>> long time. Or am I missing something?
>>
> If invalidate_inode_page fails, we will return
> VM_FAULT_HWPOISON, and kill the task, instead
> of looping indefinitely.

Oh, really sorry! It's a drowsy Monday morning. :)

This patch looks good to me. Thanks!

Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
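The exchange above turns on which value the poison branch returns. The following userspace sketch models that decision and shows why the retry cannot spin forever. All names (`sim_poison_result`, `sim_access`, the `SIM_FAULT_*` values) are hypothetical and the flag values are illustrative, not the kernel's. The model also assumes that once the poisoned page is evicted, the next fault reads in a fresh page and succeeds.

```c
#include <stdbool.h>

/* Illustrative outcome codes only; the kernel defines its own flags. */
enum sim_fault {
	SIM_FAULT_HWPOISON = 1 << 0,  /* task is killed */
	SIM_FAULT_NOPAGE   = 1 << 1,  /* nothing mapped; the access faults again */
};

/* Model of the patched poison branch: retry only if eviction succeeded. */
static int sim_poison_result(bool page_locked, bool invalidate_ok)
{
	int poisonret = SIM_FAULT_HWPOISON;

	if (page_locked && invalidate_ok)
		poisonret = SIM_FAULT_NOPAGE;
	return poisonret;
}

/*
 * Replay one memory access against a poisoned page-cache page.
 * Returns the number of faults taken; *killed reports the outcome.
 */
static int sim_access(bool invalidate_ok, bool *killed)
{
	bool poisoned_in_cache = true;
	int faults = 0;

	for (;;) {
		faults++;
		if (!poisoned_in_cache) {
			/* Assumed: a fresh page is read in and the access succeeds. */
			*killed = false;
			return faults;
		}
		if (sim_poison_result(true, invalidate_ok) == SIM_FAULT_HWPOISON) {
			*killed = true;
			return faults;
		}
		/* Invalidation succeeded, so the poisoned page left the cache. */
		poisoned_in_cache = false;
	}
}
```

In this model every fault ends in one of two ways: the page is evicted, so the next fault makes progress, or the task is killed. There is no path that returns the retry code while leaving the poisoned page in the cache.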
On Fri, Mar 25, 2022 at 04:14:28PM -0400, Rik van Riel wrote:
> In some cases it appears the invalidation of a hwpoisoned page
> fails because the page is still mapped in another process. This
> can cause a program to be continuously restarted and die when
> it page faults on the page that was not invalidated. Avoid that
> problem by unmapping the hwpoisoned page when we find it.
>
> Another issue is that sometimes we end up oopsing in finish_fault,
> if the code tries to do something with the now-NULL vmf->page.
> I did not hit this error when submitting the previous patch because
> there are several opportunities for alloc_set_pte to bail out before
> accessing vmf->page, and that apparently happened on those systems,
> and most of the time on other systems, too.
>
> However, across several million systems that error does occur a
> handful of times a day. It can be avoided by returning VM_FAULT_NOPAGE
> which will cause do_read_fault to return before calling finish_fault.
>
> Fixes: e53ac7374e64 ("mm: invalidate hwpoison page cache page in fault path")
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: Miaohe Lin <linmiaohe@huawei.com>
> Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
> Cc: Mel Gorman <mgorman@suse.de>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: stable@vger.kernel.org

Reviewed-by: Oscar Salvador <osalvador@suse.de>

> ---
>  mm/memory.c | 12 ++++++++----
>  1 file changed, 8 insertions(+), 4 deletions(-)
>
> diff --git a/mm/memory.c b/mm/memory.c
> index be44d0b36b18..76e3af9639d9 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -3918,14 +3918,18 @@ static vm_fault_t __do_fault(struct vm_fault *vmf)
>  		return ret;
>
>  	if (unlikely(PageHWPoison(vmf->page))) {
> +		struct page *page = vmf->page;
>  		vm_fault_t poisonret = VM_FAULT_HWPOISON;
>  		if (ret & VM_FAULT_LOCKED) {
> +			if (page_mapped(page))
> +				unmap_mapping_pages(page_mapping(page),
> +						    page->index, 1, false);
>  			/* Retry if a clean page was removed from the cache. */
> -			if (invalidate_inode_page(vmf->page))
> -				poisonret = 0;
> -			unlock_page(vmf->page);
> +			if (invalidate_inode_page(page))
> +				poisonret = VM_FAULT_NOPAGE;
> +			unlock_page(page);
>  		}
> -		put_page(vmf->page);
> +		put_page(page);
>  		vmf->page = NULL;
>  		return poisonret;
>  	}
> --
> 2.35.1
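The commit message says that returning VM_FAULT_NOPAGE makes do_read_fault return before it calls finish_fault. The sketch below is a userspace toy of that control flow, not the kernel's code. `toy_do_read_fault`, `toy_finish_fault`, `struct toy_vmf`, and the `TOY_FAULT_*` values are hypothetical, and the oops is reduced to an error flag. Unlike the real path, which often bails out before touching vmf->page, the toy always trips on a NULL page.

```c
#include <stddef.h>

/* Illustrative flag values only; not the kernel's definitions. */
enum {
	TOY_FAULT_HWPOISON = 1 << 0,  /* error: task is killed */
	TOY_FAULT_NOPAGE   = 1 << 1,  /* no page to install; access retried */
	TOY_FAULT_OOPS     = 1 << 2,  /* stands in for a NULL dereference */
};

/* Hypothetical stand-in for the fault descriptor. */
struct toy_vmf {
	const void *page;
};

/* Toy finish step: touching a NULL page is reported as an oops. */
static int toy_finish_fault(const struct toy_vmf *vmf)
{
	return vmf->page ? 0 : TOY_FAULT_OOPS;
}

/*
 * Toy read-fault path: an error or NOPAGE result from the lower level
 * returns immediately, so the finish step never sees the NULL page.
 */
static int toy_do_read_fault(const struct toy_vmf *vmf, int do_fault_ret)
{
	if (do_fault_ret & (TOY_FAULT_HWPOISON | TOY_FAULT_NOPAGE))
		return do_fault_ret;
	return do_fault_ret | toy_finish_fault(vmf);
}
```

Passing a result of 0 with a NULL page, which is what the earlier patch returned after a successful invalidation, reaches the finish step and sets the oops flag in this toy. Passing the NOPAGE value instead returns early.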
diff --git a/mm/memory.c b/mm/memory.c
index be44d0b36b18..76e3af9639d9 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3918,14 +3918,18 @@ static vm_fault_t __do_fault(struct vm_fault *vmf)
 		return ret;
 
 	if (unlikely(PageHWPoison(vmf->page))) {
+		struct page *page = vmf->page;
 		vm_fault_t poisonret = VM_FAULT_HWPOISON;
 		if (ret & VM_FAULT_LOCKED) {
+			if (page_mapped(page))
+				unmap_mapping_pages(page_mapping(page),
+						    page->index, 1, false);
 			/* Retry if a clean page was removed from the cache. */
-			if (invalidate_inode_page(vmf->page))
-				poisonret = 0;
-			unlock_page(vmf->page);
+			if (invalidate_inode_page(page))
+				poisonret = VM_FAULT_NOPAGE;
+			unlock_page(page);
 		}
-		put_page(vmf->page);
+		put_page(page);
 		vmf->page = NULL;
 		return poisonret;
 	}