Message ID | 20180314192937.12888-1-ard.biesheuvel@linaro.org |
---|---|
State | Accepted |
Commit | 3e04040df6d4613a8af5a80882d5f7f298f49810 |
Headers | show |
Series | [v2] Revert "mm/page_alloc: fix memmap_init_zone pageblock alignment" | expand |
On Wed, Mar 14, 2018 at 07:29:37PM +0000, Ard Biesheuvel wrote: > This reverts commit 864b75f9d6b0100bb24fdd9a20d156e7cda9b5ae. FWIW, the revert fixes the boot hang I'm seeing on ThunderX1. --Jan > Commit 864b75f9d6b0 ("mm/page_alloc: fix memmap_init_zone pageblock > alignment") modified the logic in memmap_init_zone() to initialize > struct pages associated with invalid PFNs, to appease a VM_BUG_ON() > in move_freepages(), which is redundant by its own admission, and > dereferences struct page fields to obtain the zone without checking > whether the struct pages in question are valid to begin with. > > Commit 864b75f9d6b0 only makes it worse, since the rounding it does > may cause pfn assume the same value it had in a prior iteration of > the loop, resulting in an infinite loop and a hang very early in the > boot. Also, since it doesn't perform the same rounding on start_pfn > itself but only on intermediate values following an invalid PFN, we > may still hit the same VM_BUG_ON() as before. > > So instead, let's fix this at the core, and ensure that the BUG > check doesn't dereference struct page fields of invalid pages. > > Fixes: 864b75f9d6b0 ("mm/page_alloc: fix memmap_init_zone pageblock alignment") > Cc: Daniel Vacek <neelx@redhat.com> > Cc: Mel Gorman <mgorman@techsingularity.net> > Cc: Michal Hocko <mhocko@suse.com> > Cc: Paul Burton <paul.burton@imgtec.com> > Cc: Pavel Tatashin <pasha.tatashin@oracle.com> > Cc: Vlastimil Babka <vbabka@suse.cz> > Cc: Andrew Morton <akpm@linux-foundation.org> > Cc: Linus Torvalds <torvalds@linux-foundation.org> > Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> > --- > mm/page_alloc.c | 13 +++++-------- > 1 file changed, 5 insertions(+), 8 deletions(-) > > diff --git a/mm/page_alloc.c b/mm/page_alloc.c > index 3d974cb2a1a1..635d7dd29d7f 100644 > --- a/mm/page_alloc.c > +++ b/mm/page_alloc.c > @@ -1910,7 +1910,9 @@ static int move_freepages(struct zone *zone, > * Remove at a later date when no bug reports exist related to > * grouping pages by mobility > */ > - VM_BUG_ON(page_zone(start_page) != page_zone(end_page)); > + VM_BUG_ON(pfn_valid(page_to_pfn(start_page)) && > + pfn_valid(page_to_pfn(end_page)) && > + page_zone(start_page) != page_zone(end_page)); > #endif > > if (num_movable) > @@ -5359,14 +5361,9 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone, > /* > * Skip to the pfn preceding the next valid one (or > * end_pfn), such that we hit a valid pfn (or end_pfn) > - * on our next iteration of the loop. Note that it needs > - * to be pageblock aligned even when the region itself > - * is not. move_freepages_block() can shift ahead of > - * the valid region but still depends on correct page > - * metadata. > + * on our next iteration of the loop. > */ > - pfn = (memblock_next_valid_pfn(pfn, end_pfn) & > - ~(pageblock_nr_pages-1)) - 1; > + pfn = memblock_next_valid_pfn(pfn, end_pfn) - 1; > #endif > continue; > } > -- > 2.15.1 > > > _______________________________________________ > linux-arm-kernel mailing list > linux-arm-kernel@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
Hi Ard, On 03/14/2018 05:25 PM, Jan Glauber wrote: > On Wed, Mar 14, 2018 at 07:29:37PM +0000, Ard Biesheuvel wrote: >> This reverts commit 864b75f9d6b0100bb24fdd9a20d156e7cda9b5ae. > > FWIW, the revert fixes the boot hang I'm seeing on ThunderX1. > > --Jan > Thanks for this patch, it fixes the boot hang on QDF2400 platform. >> Commit 864b75f9d6b0 ("mm/page_alloc: fix memmap_init_zone pageblock >> alignment") modified the logic in memmap_init_zone() to initialize >> struct pages associated with invalid PFNs, to appease a VM_BUG_ON() >> in move_freepages(), which is redundant by its own admission, and >> dereferences struct page fields to obtain the zone without checking >> whether the struct pages in question are valid to begin with. >> >> Commit 864b75f9d6b0 only makes it worse, since the rounding it does >> may cause pfn assume the same value it had in a prior iteration of >> the loop, resulting in an infinite loop and a hang very early in the >> boot. Also, since it doesn't perform the same rounding on start_pfn >> itself but only on intermediate values following an invalid PFN, we >> may still hit the same VM_BUG_ON() as before. >> >> So instead, let's fix this at the core, and ensure that the BUG >> check doesn't dereference struct page fields of invalid pages. >> >> Fixes: 864b75f9d6b0 ("mm/page_alloc: fix memmap_init_zone pageblock alignment") >> Cc: Daniel Vacek <neelx@redhat.com> >> Cc: Mel Gorman <mgorman@techsingularity.net> >> Cc: Michal Hocko <mhocko@suse.com> >> Cc: Paul Burton <paul.burton@imgtec.com> >> Cc: Pavel Tatashin <pasha.tatashin@oracle.com> >> Cc: Vlastimil Babka <vbabka@suse.cz> >> Cc: Andrew Morton <akpm@linux-foundation.org> >> Cc: Linus Torvalds <torvalds@linux-foundation.org> >> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> >> --- >> mm/page_alloc.c | 13 +++++-------- >> 1 file changed, 5 insertions(+), 8 deletions(-) >> >> diff --git a/mm/page_alloc.c b/mm/page_alloc.c >> index 3d974cb2a1a1..635d7dd29d7f 100644 >> --- a/mm/page_alloc.c >> +++ b/mm/page_alloc.c >> @@ -1910,7 +1910,9 @@ static int move_freepages(struct zone *zone, >> * Remove at a later date when no bug reports exist related to >> * grouping pages by mobility >> */ >> - VM_BUG_ON(page_zone(start_page) != page_zone(end_page)); >> + VM_BUG_ON(pfn_valid(page_to_pfn(start_page)) && >> + pfn_valid(page_to_pfn(end_page)) && >> + page_zone(start_page) != page_zone(end_page)); >> #endif >> >> if (num_movable) >> @@ -5359,14 +5361,9 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone, >> /* >> * Skip to the pfn preceding the next valid one (or >> * end_pfn), such that we hit a valid pfn (or end_pfn) >> - * on our next iteration of the loop. Note that it needs >> - * to be pageblock aligned even when the region itself >> - * is not. move_freepages_block() can shift ahead of >> - * the valid region but still depends on correct page >> - * metadata. >> + * on our next iteration of the loop. >> */ >> - pfn = (memblock_next_valid_pfn(pfn, end_pfn) & >> - ~(pageblock_nr_pages-1)) - 1; >> + pfn = memblock_next_valid_pfn(pfn, end_pfn) - 1; >> #endif >> continue; >> } >> -- >> 2.15.1 >> >> >> _______________________________________________ >> linux-arm-kernel mailing list >> linux-arm-kernel@lists.infradead.org >> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel > > _______________________________________________ > linux-arm-kernel mailing list > linux-arm-kernel@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel > -- Shanker Donthineni Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc. Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
On Wed, Mar 14, 2018 at 12:29 PM, Ard Biesheuvel
<ard.biesheuvel@linaro.org> wrote:
> This reverts commit 864b75f9d6b0100bb24fdd9a20d156e7cda9b5ae.
Applied directly, since we already had two confirmations of it fixing
things for people.
Thanks,
Linus
On Wed, Mar 14, 2018 at 8:29 PM, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote: > This reverts commit 864b75f9d6b0100bb24fdd9a20d156e7cda9b5ae. > > Commit 864b75f9d6b0 ("mm/page_alloc: fix memmap_init_zone pageblock > alignment") modified the logic in memmap_init_zone() to initialize > struct pages associated with invalid PFNs, to appease a VM_BUG_ON() > in move_freepages(), which is redundant by its own admission, and > dereferences struct page fields to obtain the zone without checking > whether the struct pages in question are valid to begin with. > > Commit 864b75f9d6b0 only makes it worse, since the rounding it does > may cause pfn assume the same value it had in a prior iteration of > the loop, resulting in an infinite loop and a hang very early in the > boot. Also, since it doesn't perform the same rounding on start_pfn > itself but only on intermediate values following an invalid PFN, we > may still hit the same VM_BUG_ON() as before. > > So instead, let's fix this at the core, and ensure that the BUG > check doesn't dereference struct page fields of invalid pages. > > Fixes: 864b75f9d6b0 ("mm/page_alloc: fix memmap_init_zone pageblock alignment") > Cc: Daniel Vacek <neelx@redhat.com> > Cc: Mel Gorman <mgorman@techsingularity.net> > Cc: Michal Hocko <mhocko@suse.com> > Cc: Paul Burton <paul.burton@imgtec.com> > Cc: Pavel Tatashin <pasha.tatashin@oracle.com> > Cc: Vlastimil Babka <vbabka@suse.cz> > Cc: Andrew Morton <akpm@linux-foundation.org> > Cc: Linus Torvalds <torvalds@linux-foundation.org> > Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> > --- > mm/page_alloc.c | 13 +++++-------- > 1 file changed, 5 insertions(+), 8 deletions(-) > > diff --git a/mm/page_alloc.c b/mm/page_alloc.c > index 3d974cb2a1a1..635d7dd29d7f 100644 > --- a/mm/page_alloc.c > +++ b/mm/page_alloc.c > @@ -1910,7 +1910,9 @@ static int move_freepages(struct zone *zone, > * Remove at a later date when no bug reports exist related to > * grouping pages by mobility > */ > - VM_BUG_ON(page_zone(start_page) != page_zone(end_page)); > + VM_BUG_ON(pfn_valid(page_to_pfn(start_page)) && > + pfn_valid(page_to_pfn(end_page)) && > + page_zone(start_page) != page_zone(end_page)); Hi, I am on vacation this week and I didn't have a chance to test this yet but I am not sure this is correct. Generic pfn_valid() unlike the arm{,64} arch specific versions returns true for all pfns in a section if there is at least some memory mapped in that section. So I doubt this prevents the crash I was targeting. I believe pfn_valid() does not change a thing here :( ------------------------ include/linux/mmzone.h: pfn_valid(pfn) valid_section(__nr_to_section(pfn_to_section_nr(pfn))) return (section && (section->section_mem_map & SECTION_HAS_MEM_MAP)) arch/arm64/mm/init.c: #ifdef CONFIG_HAVE_ARCH_PFN_VALID int pfn_valid(unsigned long pfn) { return memblock_is_map_memory(pfn << PAGE_SHIFT); } EXPORT_SYMBOL(pfn_valid); #endif ------------------------ Also I already sent a fix to Andrew yesterday which was reported to fix the loop. Moreover, you also reported this: > Early memory node ranges > node 0: [mem 0x0000000080000000-0x00000000febeffff] > node 0: [mem 0x00000000febf0000-0x00000000fefcffff] > node 0: [mem 0x00000000fefd0000-0x00000000ff43ffff] > node 0: [mem 0x00000000ff440000-0x00000000ff7affff] > node 0: [mem 0x00000000ff7b0000-0x00000000ffffffff] > node 0: [mem 0x0000000880000000-0x0000000fffffffff] > Initmem setup node 0 [mem 0x0000000080000000-0x0000000fffffffff] > pfn:febf0 oldnext:febf0 newnext:fe9ff > pfn:febf0 oldnext:febf0 newnext:fe9ff > pfn:febf0 oldnext:febf0 newnext:fe9ff > etc etc I am wondering how come pfn_valid(0xfebf0) returns false here. Should it be true or do I miss something? --nX
On 15 March 2018 at 02:23, Daniel Vacek <neelx@redhat.com> wrote: > On Wed, Mar 14, 2018 at 8:29 PM, Ard Biesheuvel > <ard.biesheuvel@linaro.org> wrote: >> This reverts commit 864b75f9d6b0100bb24fdd9a20d156e7cda9b5ae. >> >> Commit 864b75f9d6b0 ("mm/page_alloc: fix memmap_init_zone pageblock >> alignment") modified the logic in memmap_init_zone() to initialize >> struct pages associated with invalid PFNs, to appease a VM_BUG_ON() >> in move_freepages(), which is redundant by its own admission, and >> dereferences struct page fields to obtain the zone without checking >> whether the struct pages in question are valid to begin with. >> >> Commit 864b75f9d6b0 only makes it worse, since the rounding it does >> may cause pfn assume the same value it had in a prior iteration of >> the loop, resulting in an infinite loop and a hang very early in the >> boot. Also, since it doesn't perform the same rounding on start_pfn >> itself but only on intermediate values following an invalid PFN, we >> may still hit the same VM_BUG_ON() as before. >> >> So instead, let's fix this at the core, and ensure that the BUG >> check doesn't dereference struct page fields of invalid pages. >> >> Fixes: 864b75f9d6b0 ("mm/page_alloc: fix memmap_init_zone pageblock alignment") >> Cc: Daniel Vacek <neelx@redhat.com> >> Cc: Mel Gorman <mgorman@techsingularity.net> >> Cc: Michal Hocko <mhocko@suse.com> >> Cc: Paul Burton <paul.burton@imgtec.com> >> Cc: Pavel Tatashin <pasha.tatashin@oracle.com> >> Cc: Vlastimil Babka <vbabka@suse.cz> >> Cc: Andrew Morton <akpm@linux-foundation.org> >> Cc: Linus Torvalds <torvalds@linux-foundation.org> >> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> >> --- >> mm/page_alloc.c | 13 +++++-------- >> 1 file changed, 5 insertions(+), 8 deletions(-) >> >> diff --git a/mm/page_alloc.c b/mm/page_alloc.c >> index 3d974cb2a1a1..635d7dd29d7f 100644 >> --- a/mm/page_alloc.c >> +++ b/mm/page_alloc.c >> @@ -1910,7 +1910,9 @@ static int move_freepages(struct zone *zone, >> * Remove at a later date when no bug reports exist related to >> * grouping pages by mobility >> */ >> - VM_BUG_ON(page_zone(start_page) != page_zone(end_page)); >> + VM_BUG_ON(pfn_valid(page_to_pfn(start_page)) && >> + pfn_valid(page_to_pfn(end_page)) && >> + page_zone(start_page) != page_zone(end_page)); > > Hi, I am on vacation this week and I didn't have a chance to test this > yet but I am not sure this is correct. Generic pfn_valid() unlike the > arm{,64} arch specific versions returns true for all pfns in a section > if there is at least some memory mapped in that section. So I doubt > this prevents the crash I was targeting. I believe pfn_valid() does > not change a thing here :( > If this is the case, memblock_next_valid_pfn() is broken since it skips valid PFNs, and we should be fixing that instead. > ------------------------ > include/linux/mmzone.h: > pfn_valid(pfn) > valid_section(__nr_to_section(pfn_to_section_nr(pfn))) > return (section && (section->section_mem_map & SECTION_HAS_MEM_MAP)) > > arch/arm64/mm/init.c: > #ifdef CONFIG_HAVE_ARCH_PFN_VALID > int pfn_valid(unsigned long pfn) > { > return memblock_is_map_memory(pfn << PAGE_SHIFT); > } > EXPORT_SYMBOL(pfn_valid); > #endif > ------------------------ > > Also I already sent a fix to Andrew yesterday which was reported to > fix the loop. > > Moreover, you also reported this: > >> Early memory node ranges >> node 0: [mem 0x0000000080000000-0x00000000febeffff] >> node 0: [mem 0x00000000febf0000-0x00000000fefcffff] >> node 0: [mem 0x00000000fefd0000-0x00000000ff43ffff] >> node 0: [mem 0x00000000ff440000-0x00000000ff7affff] >> node 0: [mem 0x00000000ff7b0000-0x00000000ffffffff] >> node 0: [mem 0x0000000880000000-0x0000000fffffffff] >> Initmem setup node 0 [mem 0x0000000080000000-0x0000000fffffffff] >> pfn:febf0 oldnext:febf0 newnext:fe9ff >> pfn:febf0 oldnext:febf0 newnext:fe9ff >> pfn:febf0 oldnext:febf0 newnext:fe9ff >> etc etc > > I am wondering how come pfn_valid(0xfebf0) returns false here. Should > it be true or do I miss something? > > --nX
On Thu, Mar 15, 2018 at 8:36 AM, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote: > On 15 March 2018 at 02:23, Daniel Vacek <neelx@redhat.com> wrote: >> On Wed, Mar 14, 2018 at 8:29 PM, Ard Biesheuvel >> <ard.biesheuvel@linaro.org> wrote: >>> This reverts commit 864b75f9d6b0100bb24fdd9a20d156e7cda9b5ae. >>> >>> Commit 864b75f9d6b0 ("mm/page_alloc: fix memmap_init_zone pageblock >>> alignment") modified the logic in memmap_init_zone() to initialize >>> struct pages associated with invalid PFNs, to appease a VM_BUG_ON() >>> in move_freepages(), which is redundant by its own admission, and >>> dereferences struct page fields to obtain the zone without checking >>> whether the struct pages in question are valid to begin with. >>> >>> Commit 864b75f9d6b0 only makes it worse, since the rounding it does >>> may cause pfn assume the same value it had in a prior iteration of >>> the loop, resulting in an infinite loop and a hang very early in the >>> boot. Also, since it doesn't perform the same rounding on start_pfn >>> itself but only on intermediate values following an invalid PFN, we >>> may still hit the same VM_BUG_ON() as before. >>> >>> So instead, let's fix this at the core, and ensure that the BUG >>> check doesn't dereference struct page fields of invalid pages. >>> >>> Fixes: 864b75f9d6b0 ("mm/page_alloc: fix memmap_init_zone pageblock alignment") >>> Cc: Daniel Vacek <neelx@redhat.com> >>> Cc: Mel Gorman <mgorman@techsingularity.net> >>> Cc: Michal Hocko <mhocko@suse.com> >>> Cc: Paul Burton <paul.burton@imgtec.com> >>> Cc: Pavel Tatashin <pasha.tatashin@oracle.com> >>> Cc: Vlastimil Babka <vbabka@suse.cz> >>> Cc: Andrew Morton <akpm@linux-foundation.org> >>> Cc: Linus Torvalds <torvalds@linux-foundation.org> >>> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> >>> --- >>> mm/page_alloc.c | 13 +++++-------- >>> 1 file changed, 5 insertions(+), 8 deletions(-) >>> >>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c >>> index 3d974cb2a1a1..635d7dd29d7f 100644 >>> --- a/mm/page_alloc.c >>> +++ b/mm/page_alloc.c >>> @@ -1910,7 +1910,9 @@ static int move_freepages(struct zone *zone, >>> * Remove at a later date when no bug reports exist related to >>> * grouping pages by mobility >>> */ >>> - VM_BUG_ON(page_zone(start_page) != page_zone(end_page)); >>> + VM_BUG_ON(pfn_valid(page_to_pfn(start_page)) && >>> + pfn_valid(page_to_pfn(end_page)) && >>> + page_zone(start_page) != page_zone(end_page)); >> >> Hi, I am on vacation this week and I didn't have a chance to test this >> yet but I am not sure this is correct. Generic pfn_valid() unlike the >> arm{,64} arch specific versions returns true for all pfns in a section >> if there is at least some memory mapped in that section. So I doubt >> this prevents the crash I was targeting. I believe pfn_valid() does >> not change a thing here :( >> > > If this is the case, memblock_next_valid_pfn() is broken since it > skips valid PFNs, and we should be fixing that instead. How do you define valid pfn? Maybe the generic version of pfn_valid() should be fixed??? -nX >> ------------------------ >> include/linux/mmzone.h: >> pfn_valid(pfn) >> valid_section(__nr_to_section(pfn_to_section_nr(pfn))) >> return (section && (section->section_mem_map & SECTION_HAS_MEM_MAP)) >> >> arch/arm64/mm/init.c: >> #ifdef CONFIG_HAVE_ARCH_PFN_VALID >> int pfn_valid(unsigned long pfn) >> { >> return memblock_is_map_memory(pfn << PAGE_SHIFT); >> } >> EXPORT_SYMBOL(pfn_valid); >> #endif >> ------------------------ >> >> Also I already sent a fix to Andrew yesterday which was reported to >> fix the loop. >> >> Moreover, you also reported this: >> >>> Early memory node ranges >>> node 0: [mem 0x0000000080000000-0x00000000febeffff] >>> node 0: [mem 0x00000000febf0000-0x00000000fefcffff] >>> node 0: [mem 0x00000000fefd0000-0x00000000ff43ffff] >>> node 0: [mem 0x00000000ff440000-0x00000000ff7affff] >>> node 0: [mem 0x00000000ff7b0000-0x00000000ffffffff] >>> node 0: [mem 0x0000000880000000-0x0000000fffffffff] >>> Initmem setup node 0 [mem 0x0000000080000000-0x0000000fffffffff] >>> pfn:febf0 oldnext:febf0 newnext:fe9ff >>> pfn:febf0 oldnext:febf0 newnext:fe9ff >>> pfn:febf0 oldnext:febf0 newnext:fe9ff >>> etc etc >> >> I am wondering how come pfn_valid(0xfebf0) returns false here. Should >> it be true or do I miss something? >> >> --nX
On 15 March 2018 at 07:44, Daniel Vacek <neelx@redhat.com> wrote: > On Thu, Mar 15, 2018 at 8:36 AM, Ard Biesheuvel > <ard.biesheuvel@linaro.org> wrote: >> On 15 March 2018 at 02:23, Daniel Vacek <neelx@redhat.com> wrote: >>> On Wed, Mar 14, 2018 at 8:29 PM, Ard Biesheuvel >>> <ard.biesheuvel@linaro.org> wrote: >>>> This reverts commit 864b75f9d6b0100bb24fdd9a20d156e7cda9b5ae. >>>> >>>> Commit 864b75f9d6b0 ("mm/page_alloc: fix memmap_init_zone pageblock >>>> alignment") modified the logic in memmap_init_zone() to initialize >>>> struct pages associated with invalid PFNs, to appease a VM_BUG_ON() >>>> in move_freepages(), which is redundant by its own admission, and >>>> dereferences struct page fields to obtain the zone without checking >>>> whether the struct pages in question are valid to begin with. >>>> >>>> Commit 864b75f9d6b0 only makes it worse, since the rounding it does >>>> may cause pfn assume the same value it had in a prior iteration of >>>> the loop, resulting in an infinite loop and a hang very early in the >>>> boot. Also, since it doesn't perform the same rounding on start_pfn >>>> itself but only on intermediate values following an invalid PFN, we >>>> may still hit the same VM_BUG_ON() as before. >>>> >>>> So instead, let's fix this at the core, and ensure that the BUG >>>> check doesn't dereference struct page fields of invalid pages. >>>> >>>> Fixes: 864b75f9d6b0 ("mm/page_alloc: fix memmap_init_zone pageblock alignment") >>>> Cc: Daniel Vacek <neelx@redhat.com> >>>> Cc: Mel Gorman <mgorman@techsingularity.net> >>>> Cc: Michal Hocko <mhocko@suse.com> >>>> Cc: Paul Burton <paul.burton@imgtec.com> >>>> Cc: Pavel Tatashin <pasha.tatashin@oracle.com> >>>> Cc: Vlastimil Babka <vbabka@suse.cz> >>>> Cc: Andrew Morton <akpm@linux-foundation.org> >>>> Cc: Linus Torvalds <torvalds@linux-foundation.org> >>>> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> >>>> --- >>>> mm/page_alloc.c | 13 +++++-------- >>>> 1 file changed, 5 insertions(+), 8 deletions(-) >>>> >>>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c >>>> index 3d974cb2a1a1..635d7dd29d7f 100644 >>>> --- a/mm/page_alloc.c >>>> +++ b/mm/page_alloc.c >>>> @@ -1910,7 +1910,9 @@ static int move_freepages(struct zone *zone, >>>> * Remove at a later date when no bug reports exist related to >>>> * grouping pages by mobility >>>> */ >>>> - VM_BUG_ON(page_zone(start_page) != page_zone(end_page)); >>>> + VM_BUG_ON(pfn_valid(page_to_pfn(start_page)) && >>>> + pfn_valid(page_to_pfn(end_page)) && >>>> + page_zone(start_page) != page_zone(end_page)); >>> >>> Hi, I am on vacation this week and I didn't have a chance to test this >>> yet but I am not sure this is correct. Generic pfn_valid() unlike the >>> arm{,64} arch specific versions returns true for all pfns in a section >>> if there is at least some memory mapped in that section. So I doubt >>> this prevents the crash I was targeting. I believe pfn_valid() does >>> not change a thing here :( >>> >> >> If this is the case, memblock_next_valid_pfn() is broken since it >> skips valid PFNs, and we should be fixing that instead. > > How do you define valid pfn? Maybe the generic version of pfn_valid() > should be fixed??? > memblock_next_valid_pfn() skips PFNs for which pfn_valid() returns true. That is clearly a bug.
On Thu, Mar 15, 2018 at 8:45 AM, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote: > On 15 March 2018 at 07:44, Daniel Vacek <neelx@redhat.com> wrote: >> On Thu, Mar 15, 2018 at 8:36 AM, Ard Biesheuvel >> <ard.biesheuvel@linaro.org> wrote: >>> On 15 March 2018 at 02:23, Daniel Vacek <neelx@redhat.com> wrote: >>>> On Wed, Mar 14, 2018 at 8:29 PM, Ard Biesheuvel >>>> <ard.biesheuvel@linaro.org> wrote: >>>>> This reverts commit 864b75f9d6b0100bb24fdd9a20d156e7cda9b5ae. >>>>> >>>>> Commit 864b75f9d6b0 ("mm/page_alloc: fix memmap_init_zone pageblock >>>>> alignment") modified the logic in memmap_init_zone() to initialize >>>>> struct pages associated with invalid PFNs, to appease a VM_BUG_ON() >>>>> in move_freepages(), which is redundant by its own admission, and >>>>> dereferences struct page fields to obtain the zone without checking >>>>> whether the struct pages in question are valid to begin with. >>>>> >>>>> Commit 864b75f9d6b0 only makes it worse, since the rounding it does >>>>> may cause pfn assume the same value it had in a prior iteration of >>>>> the loop, resulting in an infinite loop and a hang very early in the >>>>> boot. Also, since it doesn't perform the same rounding on start_pfn >>>>> itself but only on intermediate values following an invalid PFN, we >>>>> may still hit the same VM_BUG_ON() as before. >>>>> >>>>> So instead, let's fix this at the core, and ensure that the BUG >>>>> check doesn't dereference struct page fields of invalid pages. >>>>> >>>>> Fixes: 864b75f9d6b0 ("mm/page_alloc: fix memmap_init_zone pageblock alignment") >>>>> Cc: Daniel Vacek <neelx@redhat.com> >>>>> Cc: Mel Gorman <mgorman@techsingularity.net> >>>>> Cc: Michal Hocko <mhocko@suse.com> >>>>> Cc: Paul Burton <paul.burton@imgtec.com> >>>>> Cc: Pavel Tatashin <pasha.tatashin@oracle.com> >>>>> Cc: Vlastimil Babka <vbabka@suse.cz> >>>>> Cc: Andrew Morton <akpm@linux-foundation.org> >>>>> Cc: Linus Torvalds <torvalds@linux-foundation.org> >>>>> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> >>>>> --- >>>>> mm/page_alloc.c | 13 +++++-------- >>>>> 1 file changed, 5 insertions(+), 8 deletions(-) >>>>> >>>>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c >>>>> index 3d974cb2a1a1..635d7dd29d7f 100644 >>>>> --- a/mm/page_alloc.c >>>>> +++ b/mm/page_alloc.c >>>>> @@ -1910,7 +1910,9 @@ static int move_freepages(struct zone *zone, >>>>> * Remove at a later date when no bug reports exist related to >>>>> * grouping pages by mobility >>>>> */ >>>>> - VM_BUG_ON(page_zone(start_page) != page_zone(end_page)); >>>>> + VM_BUG_ON(pfn_valid(page_to_pfn(start_page)) && >>>>> + pfn_valid(page_to_pfn(end_page)) && >>>>> + page_zone(start_page) != page_zone(end_page)); >>>> >>>> Hi, I am on vacation this week and I didn't have a chance to test this >>>> yet but I am not sure this is correct. Generic pfn_valid() unlike the >>>> arm{,64} arch specific versions returns true for all pfns in a section >>>> if there is at least some memory mapped in that section. So I doubt >>>> this prevents the crash I was targeting. I believe pfn_valid() does >>>> not change a thing here :( >>>> >>> >>> If this is the case, memblock_next_valid_pfn() is broken since it >>> skips valid PFNs, and we should be fixing that instead. >> >> How do you define valid pfn? Maybe the generic version of pfn_valid() >> should be fixed??? >> > > memblock_next_valid_pfn() skips PFNs for which pfn_valid() returns > true. That is clearly a bug. So pfn_valid() does not mean this frame is usable memory? OK, in that case we need to fix or revert memblock_next_valid_pfn(). That works for me. --nX
On 15 March 2018 at 15:12, Daniel Vacek <neelx@redhat.com> wrote: > On Thu, Mar 15, 2018 at 8:45 AM, Ard Biesheuvel > <ard.biesheuvel@linaro.org> wrote: >> On 15 March 2018 at 07:44, Daniel Vacek <neelx@redhat.com> wrote: >>> On Thu, Mar 15, 2018 at 8:36 AM, Ard Biesheuvel >>> <ard.biesheuvel@linaro.org> wrote: >>>> On 15 March 2018 at 02:23, Daniel Vacek <neelx@redhat.com> wrote: >>>>> On Wed, Mar 14, 2018 at 8:29 PM, Ard Biesheuvel >>>>> <ard.biesheuvel@linaro.org> wrote: >>>>>> This reverts commit 864b75f9d6b0100bb24fdd9a20d156e7cda9b5ae. >>>>>> >>>>>> Commit 864b75f9d6b0 ("mm/page_alloc: fix memmap_init_zone pageblock >>>>>> alignment") modified the logic in memmap_init_zone() to initialize >>>>>> struct pages associated with invalid PFNs, to appease a VM_BUG_ON() >>>>>> in move_freepages(), which is redundant by its own admission, and >>>>>> dereferences struct page fields to obtain the zone without checking >>>>>> whether the struct pages in question are valid to begin with. >>>>>> >>>>>> Commit 864b75f9d6b0 only makes it worse, since the rounding it does >>>>>> may cause pfn assume the same value it had in a prior iteration of >>>>>> the loop, resulting in an infinite loop and a hang very early in the >>>>>> boot. Also, since it doesn't perform the same rounding on start_pfn >>>>>> itself but only on intermediate values following an invalid PFN, we >>>>>> may still hit the same VM_BUG_ON() as before. >>>>>> >>>>>> So instead, let's fix this at the core, and ensure that the BUG >>>>>> check doesn't dereference struct page fields of invalid pages. >>>>>> >>>>>> Fixes: 864b75f9d6b0 ("mm/page_alloc: fix memmap_init_zone pageblock alignment") >>>>>> Cc: Daniel Vacek <neelx@redhat.com> >>>>>> Cc: Mel Gorman <mgorman@techsingularity.net> >>>>>> Cc: Michal Hocko <mhocko@suse.com> >>>>>> Cc: Paul Burton <paul.burton@imgtec.com> >>>>>> Cc: Pavel Tatashin <pasha.tatashin@oracle.com> >>>>>> Cc: Vlastimil Babka <vbabka@suse.cz> >>>>>> Cc: Andrew Morton <akpm@linux-foundation.org> >>>>>> Cc: Linus Torvalds <torvalds@linux-foundation.org> >>>>>> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> >>>>>> --- >>>>>> mm/page_alloc.c | 13 +++++-------- >>>>>> 1 file changed, 5 insertions(+), 8 deletions(-) >>>>>> >>>>>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c >>>>>> index 3d974cb2a1a1..635d7dd29d7f 100644 >>>>>> --- a/mm/page_alloc.c >>>>>> +++ b/mm/page_alloc.c >>>>>> @@ -1910,7 +1910,9 @@ static int move_freepages(struct zone *zone, >>>>>> * Remove at a later date when no bug reports exist related to >>>>>> * grouping pages by mobility >>>>>> */ >>>>>> - VM_BUG_ON(page_zone(start_page) != page_zone(end_page)); >>>>>> + VM_BUG_ON(pfn_valid(page_to_pfn(start_page)) && >>>>>> + pfn_valid(page_to_pfn(end_page)) && >>>>>> + page_zone(start_page) != page_zone(end_page)); >>>>> >>>>> Hi, I am on vacation this week and I didn't have a chance to test this >>>>> yet but I am not sure this is correct. Generic pfn_valid() unlike the >>>>> arm{,64} arch specific versions returns true for all pfns in a section >>>>> if there is at least some memory mapped in that section. So I doubt >>>>> this prevents the crash I was targeting. I believe pfn_valid() does >>>>> not change a thing here :( >>>>> >>>> >>>> If this is the case, memblock_next_valid_pfn() is broken since it >>>> skips valid PFNs, and we should be fixing that instead. >>> >>> How do you define valid pfn? Maybe the generic version of pfn_valid() >>> should be fixed??? >>> >> >> memblock_next_valid_pfn() skips PFNs for which pfn_valid() returns >> true. That is clearly a bug. > > So pfn_valid() does not mean this frame is usable memory? > Who cares what it *means*? memblock_next_valid_pfn() has 'valid_pfn' in its name, so if passing pfn A returns B, and there exists a C such that A < C < B and pfn_valid(C) returns true, memblock_next_valid_pfn doesn't do what it says on the tin and should be fixed. You keep going on about how pfn_valid() does or does not do what you think, but that is really irrelevant. > OK, in that case we need to fix or revert memblock_next_valid_pfn(). > That works for me. > OK. You can add my ack to a patch that reverts it, and we can revisit it for the next cycle.
On Thu, Mar 15, 2018 at 4:17 PM, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote: > On 15 March 2018 at 15:12, Daniel Vacek <neelx@redhat.com> wrote: >> On Thu, Mar 15, 2018 at 8:45 AM, Ard Biesheuvel >> <ard.biesheuvel@linaro.org> wrote: >>> On 15 March 2018 at 07:44, Daniel Vacek <neelx@redhat.com> wrote: >>>> On Thu, Mar 15, 2018 at 8:36 AM, Ard Biesheuvel >>>> <ard.biesheuvel@linaro.org> wrote: >>>>> On 15 March 2018 at 02:23, Daniel Vacek <neelx@redhat.com> wrote: >>>>>> On Wed, Mar 14, 2018 at 8:29 PM, Ard Biesheuvel >>>>>> <ard.biesheuvel@linaro.org> wrote: >>>>>>> This reverts commit 864b75f9d6b0100bb24fdd9a20d156e7cda9b5ae. >>>>>>> >>>>>>> Commit 864b75f9d6b0 ("mm/page_alloc: fix memmap_init_zone pageblock >>>>>>> alignment") modified the logic in memmap_init_zone() to initialize >>>>>>> struct pages associated with invalid PFNs, to appease a VM_BUG_ON() >>>>>>> in move_freepages(), which is redundant by its own admission, and >>>>>>> dereferences struct page fields to obtain the zone without checking >>>>>>> whether the struct pages in question are valid to begin with. >>>>>>> >>>>>>> Commit 864b75f9d6b0 only makes it worse, since the rounding it does >>>>>>> may cause pfn assume the same value it had in a prior iteration of >>>>>>> the loop, resulting in an infinite loop and a hang very early in the >>>>>>> boot. Also, since it doesn't perform the same rounding on start_pfn >>>>>>> itself but only on intermediate values following an invalid PFN, we >>>>>>> may still hit the same VM_BUG_ON() as before. >>>>>>> >>>>>>> So instead, let's fix this at the core, and ensure that the BUG >>>>>>> check doesn't dereference struct page fields of invalid pages. >>>>>>> >>>>>>> Fixes: 864b75f9d6b0 ("mm/page_alloc: fix memmap_init_zone pageblock alignment") >>>>>>> Cc: Daniel Vacek <neelx@redhat.com> >>>>>>> Cc: Mel Gorman <mgorman@techsingularity.net> >>>>>>> Cc: Michal Hocko <mhocko@suse.com> >>>>>>> Cc: Paul Burton <paul.burton@imgtec.com> >>>>>>> Cc: Pavel Tatashin <pasha.tatashin@oracle.com> >>>>>>> Cc: Vlastimil Babka <vbabka@suse.cz> >>>>>>> Cc: Andrew Morton <akpm@linux-foundation.org> >>>>>>> Cc: Linus Torvalds <torvalds@linux-foundation.org> >>>>>>> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> >>>>>>> --- >>>>>>> mm/page_alloc.c | 13 +++++-------- >>>>>>> 1 file changed, 5 insertions(+), 8 deletions(-) >>>>>>> >>>>>>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c >>>>>>> index 3d974cb2a1a1..635d7dd29d7f 100644 >>>>>>> --- a/mm/page_alloc.c >>>>>>> +++ b/mm/page_alloc.c >>>>>>> @@ -1910,7 +1910,9 @@ static int move_freepages(struct zone *zone, >>>>>>> * Remove at a later date when no bug reports exist related to >>>>>>> * grouping pages by mobility >>>>>>> */ >>>>>>> - VM_BUG_ON(page_zone(start_page) != page_zone(end_page)); >>>>>>> + VM_BUG_ON(pfn_valid(page_to_pfn(start_page)) && >>>>>>> + pfn_valid(page_to_pfn(end_page)) && >>>>>>> + page_zone(start_page) != page_zone(end_page)); >>>>>> >>>>>> Hi, I am on vacation this week and I didn't have a chance to test this >>>>>> yet but I am not sure this is correct. Generic pfn_valid() unlike the >>>>>> arm{,64} arch specific versions returns true for all pfns in a section >>>>>> if there is at least some memory mapped in that section. So I doubt >>>>>> this prevents the crash I was targeting. I believe pfn_valid() does >>>>>> not change a thing here :( >>>>>> >>>>> >>>>> If this is the case, memblock_next_valid_pfn() is broken since it >>>>> skips valid PFNs, and we should be fixing that instead. >>>> >>>> How do you define valid pfn? Maybe the generic version of pfn_valid() >>>> should be fixed??? >>>> >>> >>> memblock_next_valid_pfn() skips PFNs for which pfn_valid() returns >>> true. That is clearly a bug. >> >> So pfn_valid() does not mean this frame is usable memory? >> > > Who cares what it *means*? abstractions? > memblock_next_valid_pfn() has 'valid_pfn' in its name, so if passing > pfn A returns B, and there exists a C such that A < C < B and > pfn_valid(C) returns true, memblock_next_valid_pfn doesn't do what it > says on the tin and should be fixed. If you don't like the name I would argue it should be changed to something like memblock_pfn_of_next_memory(). Still the name is not next_valid_pfn() but memblock_next... kind of distinguishing something different. > You keep going on about how pfn_valid() does or does not do what you > think, but that is really irrelevant. I'm trying to learn. Hence I was asking what is the abstract meaning of it. As I see two *way_different* implementations so I am not sure how I should understand that. >> OK, in that case we need to fix or revert memblock_next_valid_pfn(). >> That works for me. >> > > OK. You can add my ack to a patch that reverts it, and we can revisit > it for the next cycle. Cool. I'll do that next week. Thank you, sir. --nX
On 15 March 2018 at 15:34, Daniel Vacek <neelx@redhat.com> wrote: > On Thu, Mar 15, 2018 at 4:17 PM, Ard Biesheuvel > <ard.biesheuvel@linaro.org> wrote: >> On 15 March 2018 at 15:12, Daniel Vacek <neelx@redhat.com> wrote: >>> On Thu, Mar 15, 2018 at 8:45 AM, Ard Biesheuvel >>> <ard.biesheuvel@linaro.org> wrote: >>>> On 15 March 2018 at 07:44, Daniel Vacek <neelx@redhat.com> wrote: >>>>> On Thu, Mar 15, 2018 at 8:36 AM, Ard Biesheuvel >>>>> <ard.biesheuvel@linaro.org> wrote: >>>>>> On 15 March 2018 at 02:23, Daniel Vacek <neelx@redhat.com> wrote: >>>>>>> On Wed, Mar 14, 2018 at 8:29 PM, Ard Biesheuvel >>>>>>> <ard.biesheuvel@linaro.org> wrote: >>>>>>>> This reverts commit 864b75f9d6b0100bb24fdd9a20d156e7cda9b5ae. >>>>>>>> >>>>>>>> Commit 864b75f9d6b0 ("mm/page_alloc: fix memmap_init_zone pageblock >>>>>>>> alignment") modified the logic in memmap_init_zone() to initialize >>>>>>>> struct pages associated with invalid PFNs, to appease a VM_BUG_ON() >>>>>>>> in move_freepages(), which is redundant by its own admission, and >>>>>>>> dereferences struct page fields to obtain the zone without checking >>>>>>>> whether the struct pages in question are valid to begin with. >>>>>>>> >>>>>>>> Commit 864b75f9d6b0 only makes it worse, since the rounding it does >>>>>>>> may cause pfn assume the same value it had in a prior iteration of >>>>>>>> the loop, resulting in an infinite loop and a hang very early in the >>>>>>>> boot. Also, since it doesn't perform the same rounding on start_pfn >>>>>>>> itself but only on intermediate values following an invalid PFN, we >>>>>>>> may still hit the same VM_BUG_ON() as before. >>>>>>>> >>>>>>>> So instead, let's fix this at the core, and ensure that the BUG >>>>>>>> check doesn't dereference struct page fields of invalid pages. >>>>>>>> >>>>>>>> Fixes: 864b75f9d6b0 ("mm/page_alloc: fix memmap_init_zone pageblock alignment") >>>>>>>> Cc: Daniel Vacek <neelx@redhat.com> >>>>>>>> Cc: Mel Gorman <mgorman@techsingularity.net> >>>>>>>> Cc: Michal Hocko <mhocko@suse.com> >>>>>>>> Cc: Paul Burton <paul.burton@imgtec.com> >>>>>>>> Cc: Pavel Tatashin <pasha.tatashin@oracle.com> >>>>>>>> Cc: Vlastimil Babka <vbabka@suse.cz> >>>>>>>> Cc: Andrew Morton <akpm@linux-foundation.org> >>>>>>>> Cc: Linus Torvalds <torvalds@linux-foundation.org> >>>>>>>> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> >>>>>>>> --- >>>>>>>> mm/page_alloc.c | 13 +++++-------- >>>>>>>> 1 file changed, 5 insertions(+), 8 deletions(-) >>>>>>>> >>>>>>>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c >>>>>>>> index 3d974cb2a1a1..635d7dd29d7f 100644 >>>>>>>> --- a/mm/page_alloc.c >>>>>>>> +++ b/mm/page_alloc.c >>>>>>>> @@ -1910,7 +1910,9 @@ static int move_freepages(struct zone *zone, >>>>>>>> * Remove at a later date when no bug reports exist related to >>>>>>>> * grouping pages by mobility >>>>>>>> */ >>>>>>>> - VM_BUG_ON(page_zone(start_page) != page_zone(end_page)); >>>>>>>> + VM_BUG_ON(pfn_valid(page_to_pfn(start_page)) && >>>>>>>> + pfn_valid(page_to_pfn(end_page)) && >>>>>>>> + page_zone(start_page) != page_zone(end_page)); >>>>>>> >>>>>>> Hi, I am on vacation this week and I didn't have a chance to test this >>>>>>> yet but I am not sure this is correct. Generic pfn_valid() unlike the >>>>>>> arm{,64} arch specific versions returns true for all pfns in a section >>>>>>> if there is at least some memory mapped in that section. So I doubt >>>>>>> this prevents the crash I was targeting. I believe pfn_valid() does >>>>>>> not change a thing here :( >>>>>>> >>>>>> >>>>>> If this is the case, memblock_next_valid_pfn() is broken since it >>>>>> skips valid PFNs, and we should be fixing that instead. >>>>> >>>>> How do you define valid pfn? Maybe the generic version of pfn_valid() >>>>> should be fixed??? >>>>> >>>> >>>> memblock_next_valid_pfn() skips PFNs for which pfn_valid() returns >>>> true. That is clearly a bug. >>> >>> So pfn_valid() does not mean this frame is usable memory? >>> >> >> Who cares what it *means*? > > abstractions? > >> memblock_next_valid_pfn() has 'valid_pfn' in its name, so if passing >> pfn A returns B, and there exists a C such that A < C < B and >> pfn_valid(C) returns true, memblock_next_valid_pfn doesn't do what it >> says on the tin and should be fixed. > > If you don't like the name I would argue it should be changed to > something like memblock_pfn_of_next_memory(). Still the name is not > next_valid_pfn() but memblock_next... kind of distinguishing something > different. > Naming is important. If the name would have reflected what it actually does, perhaps it would have taken us less time to figure out that what it's doing is wrong. >> You keep going on about how pfn_valid() does or does not do what you >> think, but that is really irrelevant. > > I'm trying to learn. So am I :-) > Hence I was asking what is the abstract meaning > of it. As I see two *way_different* implementations so I am not sure > how I should understand that. > My interpretation is that it has a struct page associated with it, but it seems the semantics of pfn_valid() aren't well defined. IIRC there are places in the kernel that assume that valid PFNs are covered by the kernel direct mapping (on non-highmem kernels), and this is why we have a separate definition for arm64, which needs to remove some regions from the kernel direct mapping because the architecture does not permit mappings with mismatched attributes, and these regions may be mapped by the firmware as well. Dereferencing the struct page associated with a PFN for which pfn_valid() returns false is wrong in any case, which is why the VM_BUG_ON() you identified was buggy as well. But on the other hand, that does mean we need to guarantee that all valid PFNs have their struct page initialized. >>> OK, in that case we need to fix or revert memblock_next_valid_pfn(). >>> That works for me. >>> >> >> OK. You can add my ack to a patch that reverts it, and we can revisit >> it for the next cycle. > > Cool. I'll do that next week. Thank you, sir. > Likewise.
On Thu 15-03-18 15:48:47, Ard Biesheuvel wrote: > On 15 March 2018 at 15:34, Daniel Vacek <neelx@redhat.com> wrote: [...] > > Hence I was asking what is the abstract meaning > > of it. As I see two *way_different* implementations so I am not sure > > how I should understand that. > > > > My interpretation is that it has a struct page associated with it, but > it seems the semantics of pfn_valid() aren't well defined. Well, pfn_valid says that a given pfn is backed by a real memory and it has a valid struct page backing it. -- Michal Hocko SUSE Labs
On Thu, Mar 15, 2018 at 11:21 AM, Michal Hocko <mhocko@kernel.org> wrote: > > Well, pfn_valid says that a given pfn is backed by a real memory and it > has a valid struct page backing it. No, it really just says the latter. There should be a "struct page" for it. It may not be "real memory". The struct page might have it marked reserved or something. But at least you should be able to get to the "struct page" and look at it. Linus
On Thu 15-03-18 11:28:27, Linus Torvalds wrote: > On Thu, Mar 15, 2018 at 11:21 AM, Michal Hocko <mhocko@kernel.org> wrote: > > > > Well, pfn_valid says that a given pfn is backed by a real memory and it > > has a valid struct page backing it. > > No, it really just says the latter. There should be a "struct page" for it. You are right! A simple example could be non volatile memory. That is certainly not RAM but it has pfn_valid memory. Sorry about the confusion. -- Michal Hocko SUSE Labs
On 14/03/2018 23:53, Shanker Donthineni wrote: > > Hi Ard, > > On 03/14/2018 05:25 PM, Jan Glauber wrote: >> On Wed, Mar 14, 2018 at 07:29:37PM +0000, Ard Biesheuvel wrote: >>> This reverts commit 864b75f9d6b0100bb24fdd9a20d156e7cda9b5ae. >> >> FWIW, the revert fixes the boot hang I'm seeing on ThunderX1. >> >> --Jan >> > > Thanks for this patch, it fixes the boot hang on QDF2400 platform. For the record, this also fixes boot on Amlogic S905X (and for the whole GX family I presume). Thanks, Neil > > >>> Commit 864b75f9d6b0 ("mm/page_alloc: fix memmap_init_zone pageblock >>> alignment") modified the logic in memmap_init_zone() to initialize >>> struct pages associated with invalid PFNs, to appease a VM_BUG_ON() >>> in move_freepages(), which is redundant by its own admission, and >>> dereferences struct page fields to obtain the zone without checking >>> whether the struct pages in question are valid to begin with. >>> >>> Commit 864b75f9d6b0 only makes it worse, since the rounding it does >>> may cause pfn assume the same value it had in a prior iteration of >>> the loop, resulting in an infinite loop and a hang very early in the >>> boot. Also, since it doesn't perform the same rounding on start_pfn >>> itself but only on intermediate values following an invalid PFN, we >>> may still hit the same VM_BUG_ON() as before. >>> >>> So instead, let's fix this at the core, and ensure that the BUG >>> check doesn't dereference struct page fields of invalid pages. >>> >>> Fixes: 864b75f9d6b0 ("mm/page_alloc: fix memmap_init_zone pageblock alignment") >>> Cc: Daniel Vacek <neelx@redhat.com> >>> Cc: Mel Gorman <mgorman@techsingularity.net> >>> Cc: Michal Hocko <mhocko@suse.com> >>> Cc: Paul Burton <paul.burton@imgtec.com> >>> Cc: Pavel Tatashin <pasha.tatashin@oracle.com> >>> Cc: Vlastimil Babka <vbabka@suse.cz> >>> Cc: Andrew Morton <akpm@linux-foundation.org> >>> Cc: Linus Torvalds <torvalds@linux-foundation.org> >>> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> >>> --- >>> mm/page_alloc.c | 13 +++++-------- >>> 1 file changed, 5 insertions(+), 8 deletions(-) >>> >>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c >>> index 3d974cb2a1a1..635d7dd29d7f 100644 >>> --- a/mm/page_alloc.c >>> +++ b/mm/page_alloc.c >>> @@ -1910,7 +1910,9 @@ static int move_freepages(struct zone *zone, >>> * Remove at a later date when no bug reports exist related to >>> * grouping pages by mobility >>> */ >>> - VM_BUG_ON(page_zone(start_page) != page_zone(end_page)); >>> + VM_BUG_ON(pfn_valid(page_to_pfn(start_page)) && >>> + pfn_valid(page_to_pfn(end_page)) && >>> + page_zone(start_page) != page_zone(end_page)); >>> #endif >>> >>> if (num_movable) >>> @@ -5359,14 +5361,9 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone, >>> /* >>> * Skip to the pfn preceding the next valid one (or >>> * end_pfn), such that we hit a valid pfn (or end_pfn) >>> - * on our next iteration of the loop. Note that it needs >>> - * to be pageblock aligned even when the region itself >>> - * is not. move_freepages_block() can shift ahead of >>> - * the valid region but still depends on correct page >>> - * metadata. >>> + * on our next iteration of the loop. >>> */ >>> - pfn = (memblock_next_valid_pfn(pfn, end_pfn) & >>> - ~(pageblock_nr_pages-1)) - 1; >>> + pfn = memblock_next_valid_pfn(pfn, end_pfn) - 1; >>> #endif >>> continue; >>> } >>> -- >>> 2.15.1 >>> >>> >>> _______________________________________________ >>> linux-arm-kernel mailing list >>> linux-arm-kernel@lists.infradead.org >>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel >> >> _______________________________________________ >> linux-arm-kernel mailing list >> linux-arm-kernel@lists.infradead.org >> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel >> >
diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 3d974cb2a1a1..635d7dd29d7f 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -1910,7 +1910,9 @@ static int move_freepages(struct zone *zone, * Remove at a later date when no bug reports exist related to * grouping pages by mobility */ - VM_BUG_ON(page_zone(start_page) != page_zone(end_page)); + VM_BUG_ON(pfn_valid(page_to_pfn(start_page)) && + pfn_valid(page_to_pfn(end_page)) && + page_zone(start_page) != page_zone(end_page)); #endif if (num_movable) @@ -5359,14 +5361,9 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone, /* * Skip to the pfn preceding the next valid one (or * end_pfn), such that we hit a valid pfn (or end_pfn) - * on our next iteration of the loop. Note that it needs - * to be pageblock aligned even when the region itself - * is not. move_freepages_block() can shift ahead of - * the valid region but still depends on correct page - * metadata. + * on our next iteration of the loop. */ - pfn = (memblock_next_valid_pfn(pfn, end_pfn) & - ~(pageblock_nr_pages-1)) - 1; + pfn = memblock_next_valid_pfn(pfn, end_pfn) - 1; #endif continue; }
This reverts commit 864b75f9d6b0100bb24fdd9a20d156e7cda9b5ae. Commit 864b75f9d6b0 ("mm/page_alloc: fix memmap_init_zone pageblock alignment") modified the logic in memmap_init_zone() to initialize struct pages associated with invalid PFNs, to appease a VM_BUG_ON() in move_freepages(), which is redundant by its own admission, and dereferences struct page fields to obtain the zone without checking whether the struct pages in question are valid to begin with. Commit 864b75f9d6b0 only makes it worse, since the rounding it does may cause pfn assume the same value it had in a prior iteration of the loop, resulting in an infinite loop and a hang very early in the boot. Also, since it doesn't perform the same rounding on start_pfn itself but only on intermediate values following an invalid PFN, we may still hit the same VM_BUG_ON() as before. So instead, let's fix this at the core, and ensure that the BUG check doesn't dereference struct page fields of invalid pages. Fixes: 864b75f9d6b0 ("mm/page_alloc: fix memmap_init_zone pageblock alignment") Cc: Daniel Vacek <neelx@redhat.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michal Hocko <mhocko@suse.com> Cc: Paul Burton <paul.burton@imgtec.com> Cc: Pavel Tatashin <pasha.tatashin@oracle.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> --- mm/page_alloc.c | 13 +++++-------- 1 file changed, 5 insertions(+), 8 deletions(-) -- 2.15.1