diff mbox series

[1/2] hugetlb: fix update_and_free_page contig page struct assumption

Message ID 20210217184926.33567-1-mike.kravetz@oracle.com
State Accepted
Commit dbfee5aee7e54f83d96ceb8e3e80717fac62ad63
Headers show
Series [1/2] hugetlb: fix update_and_free_page contig page struct assumption | expand

Commit Message

Mike Kravetz Feb. 17, 2021, 6:49 p.m. UTC
page structs are not guaranteed to be contiguous for gigantic pages.  The
routine update_and_free_page can encounter a gigantic page, yet it assumes
page structs are contiguous when setting page flags in subpages.

If update_and_free_page encounters non-contiguous page structs, we can
see “BUG: Bad page state in process …” errors.

Non-contiguous page structs are generally not an issue.  However, they can
exist with a specific kernel configuration and hotplug operations.  For
example: Configure the kernel with CONFIG_SPARSEMEM and
!CONFIG_SPARSEMEM_VMEMMAP.  Then, hotplug add memory for the area where the
gigantic page will be allocated.
Zi Yan outlined steps to reproduce here [1].

[1] https://lore.kernel.org/linux-mm/16F7C58B-4D79-41C5-9B64-A1A1628F4AF2@nvidia.com/

Fixes: 944d9fec8d7a ("hugetlb: add support for gigantic page allocation at runtime")
Signed-off-by: Zi Yan <ziy@nvidia.com>
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: <stable@vger.kernel.org>
---
 mm/hugetlb.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

Comments

Zi Yan Feb. 18, 2021, 5:27 p.m. UTC | #1
On 18 Feb 2021, at 12:25, Jason Gunthorpe wrote:

> On Thu, Feb 18, 2021 at 02:45:54PM +0000, Matthew Wilcox wrote:
>> On Wed, Feb 17, 2021 at 11:02:52AM -0800, Andrew Morton wrote:
>>> On Wed, 17 Feb 2021 10:49:25 -0800 Mike Kravetz <mike.kravetz@oracle.com> wrote:
>>>> page structs are not guaranteed to be contiguous for gigantic pages.  The
>>>
>>> June 2014.  That's a long lurk time for a bug.  I wonder if some later
>>> commit revealed it.
>>
>> I would suggest that gigantic pages have not seen much use.  Certainly
>> performance with Intel CPUs on benchmarks that I've been involved with
>> showed lower performance with 1GB pages than with 2MB pages until quite
>> recently.
>
> I suggested in another thread that maybe it is time to consider
> dropping this "feature"

You mean dropping gigantic page support in hugetlb?

>
> If it has been slightly broken for 7 years it seems a good bet it
> isn't actually being used.
>
> The cost to fix GUP to be compatible with this will hurt normal
> GUP performance - and again, that nobody has hit this bug in GUP
> further suggests the feature isn't used..

A easy fix might be to make gigantic hugetlb page depends on
CONFIG_SPARSEMEM_VMEMMAP, which guarantee all struct pages are contiguous.


—
Best Regards,
Yan Zi
Mike Kravetz Feb. 18, 2021, 5:34 p.m. UTC | #2
On 2/18/21 9:25 AM, Jason Gunthorpe wrote:
> On Thu, Feb 18, 2021 at 02:45:54PM +0000, Matthew Wilcox wrote:
>> On Wed, Feb 17, 2021 at 11:02:52AM -0800, Andrew Morton wrote:
>>> On Wed, 17 Feb 2021 10:49:25 -0800 Mike Kravetz <mike.kravetz@oracle.com> wrote:
>>>> page structs are not guaranteed to be contiguous for gigantic pages.  The
>>>
>>> June 2014.  That's a long lurk time for a bug.  I wonder if some later
>>> commit revealed it.
>>
>> I would suggest that gigantic pages have not seen much use.  Certainly
>> performance with Intel CPUs on benchmarks that I've been involved with
>> showed lower performance with 1GB pages than with 2MB pages until quite
>> recently.
> 
> I suggested in another thread that maybe it is time to consider
> dropping this "feature"
> 
> If it has been slightly broken for 7 years it seems a good bet it
> isn't actually being used.
> 
> The cost to fix GUP to be compatible with this will hurt normal
> GUP performance - and again, that nobody has hit this bug in GUP
> further suggests the feature isn't used..

I was thinking that we could detect these 'unusual' configurations and only
do the slower page struct walking in those cases.  However, we would need to
do some research to make sure we have taken into account all possible config
options which can produce non-contiguous page structs.  That should have zero
performance impact in the 'normal' cases.

I suppose we could prohibit gigantic pages in these 'unusual' configurations.
It would require some research to see if this 'may' impact someone.
diff mbox series

Patch

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 4bdb58ab14cb..94e9fa803294 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1312,14 +1312,16 @@  static inline void destroy_compound_gigantic_page(struct page *page,
 static void update_and_free_page(struct hstate *h, struct page *page)
 {
 	int i;
+	struct page *subpage = page;
 
 	if (hstate_is_gigantic(h) && !gigantic_page_runtime_supported())
 		return;
 
 	h->nr_huge_pages--;
 	h->nr_huge_pages_node[page_to_nid(page)]--;
-	for (i = 0; i < pages_per_huge_page(h); i++) {
-		page[i].flags &= ~(1 << PG_locked | 1 << PG_error |
+	for (i = 0; i < pages_per_huge_page(h);
+	     i++, subpage = mem_map_next(subpage, page, i)) {
+		subpage->flags &= ~(1 << PG_locked | 1 << PG_error |
 				1 << PG_referenced | 1 << PG_dirty |
 				1 << PG_active | 1 << PG_private |
 				1 << PG_writeback);