diff mbox series

[4.19,1/7] mm: drop meminit_pfn_in_nid as it is redundant

Message ID 20200619132425.425063-1-pasha.tatashin@soleen.com
State New
Headers show
Series [4.19,1/7] mm: drop meminit_pfn_in_nid as it is redundant | expand

Commit Message

Pasha Tatashin June 19, 2020, 1:24 p.m. UTC
From: Alexander Duyck <alexander.h.duyck@linux.intel.com>

From: Alexander Duyck <alexander.h.duyck@linux.intel.com>

commit 56ec43d8b02719402c9fcf984feb52ec2300f8a5 upstream.

As best as I can tell the meminit_pfn_in_nid call is completely redundant.
The deferred memory initialization is already making use of
for_each_free_mem_range which in turn will call into __next_mem_range
which will only return a memory range if it matches the node ID provided
assuming it is not NUMA_NO_NODE.

I am operating on the assumption that there are no zones or pgdata_t
structures that have a NUMA node of NUMA_NO_NODE associated with them.  If
that is the case then __next_mem_range will never return a memory range
that doesn't match the zone's node ID and as such the check is redundant.

So one piece I would like to verify on this is if this works for ia64.
Technically it was using a different approach to get the node ID, but it
seems to have the node ID also encoded into the memblock.  So I am
assuming this is okay, but would like to get confirmation on that.

On my x86_64 test system with 384GB of memory per node I saw a reduction
in initialization time from 2.80s to 1.85s as a result of this patch.

Link: http://lkml.kernel.org/r/20190405221219.12227.93957.stgit@localhost.localdomain
Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Reviewed-by: Pavel Tatashin <pavel.tatashin@microsoft.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Khalid Aziz <khalid.aziz@oracle.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: <yi.z.zhang@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Pavel Tatashin <pasha.tatashin@soleen.com>
---
 mm/page_alloc.c | 51 ++++++++++++++-----------------------------------
 1 file changed, 14 insertions(+), 37 deletions(-)

Comments

Greg KH Jan. 4, 2021, 12:42 p.m. UTC | #1
On Fri, Jun 19, 2020 at 09:24:19AM -0400, Pavel Tatashin wrote:
> From: Alexander Duyck <alexander.h.duyck@linux.intel.com>

> 

> From: Alexander Duyck <alexander.h.duyck@linux.intel.com>

> 

> commit 56ec43d8b02719402c9fcf984feb52ec2300f8a5 upstream.

> 

> As best as I can tell the meminit_pfn_in_nid call is completely redundant.

> The deferred memory initialization is already making use of

> for_each_free_mem_range which in turn will call into __next_mem_range

> which will only return a memory range if it matches the node ID provided

> assuming it is not NUMA_NO_NODE.

> 

> I am operating on the assumption that there are no zones or pgdata_t

> structures that have a NUMA node of NUMA_NO_NODE associated with them.  If

> that is the case then __next_mem_range will never return a memory range

> that doesn't match the zone's node ID and as such the check is redundant.

> 

> So one piece I would like to verify on this is if this works for ia64.

> Technically it was using a different approach to get the node ID, but it

> seems to have the node ID also encoded into the memblock.  So I am

> assuming this is okay, but would like to get confirmation on that.

> 

> On my x86_64 test system with 384GB of memory per node I saw a reduction

> in initialization time from 2.80s to 1.85s as a result of this patch.

> 

> Link: http://lkml.kernel.org/r/20190405221219.12227.93957.stgit@localhost.localdomain

> Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>

> Reviewed-by: Pavel Tatashin <pavel.tatashin@microsoft.com>

> Acked-by: Michal Hocko <mhocko@suse.com>

> Cc: Mike Rapoport <rppt@linux.ibm.com>

> Cc: Dan Williams <dan.j.williams@intel.com>

> Cc: Dave Jiang <dave.jiang@intel.com>

> Cc: David S. Miller <davem@davemloft.net>

> Cc: Ingo Molnar <mingo@kernel.org>

> Cc: Khalid Aziz <khalid.aziz@oracle.com>

> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>

> Cc: Laurent Dufour <ldufour@linux.vnet.ibm.com>

> Cc: Matthew Wilcox <willy@infradead.org>

> Cc: Mel Gorman <mgorman@techsingularity.net>

> Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>

> Cc: Pavel Tatashin <pasha.tatashin@soleen.com>

> Cc: Vlastimil Babka <vbabka@suse.cz>

> Cc: <yi.z.zhang@linux.intel.com>

> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

> Signed-off-by: Pavel Tatashin <pasha.tatashin@soleen.com>

> ---

>  mm/page_alloc.c | 51 ++++++++++++++-----------------------------------

>  1 file changed, 14 insertions(+), 37 deletions(-)


Given the recent changes backported to 4.19.y, is this series still
needed?

If so, can you please regenerate it and resend as it does not apply to
the current 4.19.y tree.

thanks,

greg k-h
diff mbox series

Patch

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index d8c3051387d1..c86a117acb5b 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1321,36 +1321,22 @@  int __meminit early_pfn_to_nid(unsigned long pfn)
 #endif
 
 #ifdef CONFIG_NODES_SPAN_OTHER_NODES
-static inline bool __meminit __maybe_unused
-meminit_pfn_in_nid(unsigned long pfn, int node,
-		   struct mminit_pfnnid_cache *state)
+/* Only safe to use early in boot when initialisation is single-threaded */
+static inline bool __meminit early_pfn_in_nid(unsigned long pfn, int node)
 {
 	int nid;
 
-	nid = __early_pfn_to_nid(pfn, state);
+	nid = __early_pfn_to_nid(pfn, &early_pfnnid_cache);
 	if (nid >= 0 && nid != node)
 		return false;
 	return true;
 }
 
-/* Only safe to use early in boot when initialisation is single-threaded */
-static inline bool __meminit early_pfn_in_nid(unsigned long pfn, int node)
-{
-	return meminit_pfn_in_nid(pfn, node, &early_pfnnid_cache);
-}
-
 #else
-
 static inline bool __meminit early_pfn_in_nid(unsigned long pfn, int node)
 {
 	return true;
 }
-static inline bool __meminit  __maybe_unused
-meminit_pfn_in_nid(unsigned long pfn, int node,
-		   struct mminit_pfnnid_cache *state)
-{
-	return true;
-}
 #endif
 
 
@@ -1480,21 +1466,13 @@  static inline void __init pgdat_init_report_one_done(void)
  *
  * Then, we check if a current large page is valid by only checking the validity
  * of the head pfn.
- *
- * Finally, meminit_pfn_in_nid is checked on systems where pfns can interleave
- * within a node: a pfn is between start and end of a node, but does not belong
- * to this memory node.
  */
-static inline bool __init
-deferred_pfn_valid(int nid, unsigned long pfn,
-		   struct mminit_pfnnid_cache *nid_init_state)
+static inline bool __init deferred_pfn_valid(unsigned long pfn)
 {
 	if (!pfn_valid_within(pfn))
 		return false;
 	if (!(pfn & (pageblock_nr_pages - 1)) && !pfn_valid(pfn))
 		return false;
-	if (!meminit_pfn_in_nid(pfn, nid, nid_init_state))
-		return false;
 	return true;
 }
 
@@ -1502,15 +1480,14 @@  deferred_pfn_valid(int nid, unsigned long pfn,
  * Free pages to buddy allocator. Try to free aligned pages in
  * pageblock_nr_pages sizes.
  */
-static void __init deferred_free_pages(int nid, int zid, unsigned long pfn,
+static void __init deferred_free_pages(unsigned long pfn,
 				       unsigned long end_pfn)
 {
-	struct mminit_pfnnid_cache nid_init_state = { };
 	unsigned long nr_pgmask = pageblock_nr_pages - 1;
 	unsigned long nr_free = 0;
 
 	for (; pfn < end_pfn; pfn++) {
-		if (!deferred_pfn_valid(nid, pfn, &nid_init_state)) {
+		if (!deferred_pfn_valid(pfn)) {
 			deferred_free_range(pfn - nr_free, nr_free);
 			nr_free = 0;
 		} else if (!(pfn & nr_pgmask)) {
@@ -1530,17 +1507,18 @@  static void __init deferred_free_pages(int nid, int zid, unsigned long pfn,
  * by performing it only once every pageblock_nr_pages.
  * Return number of pages initialized.
  */
-static unsigned long  __init deferred_init_pages(int nid, int zid,
+static unsigned long  __init deferred_init_pages(struct zone *zone,
 						 unsigned long pfn,
 						 unsigned long end_pfn)
 {
-	struct mminit_pfnnid_cache nid_init_state = { };
 	unsigned long nr_pgmask = pageblock_nr_pages - 1;
+	int nid = zone_to_nid(zone);
 	unsigned long nr_pages = 0;
+	int zid = zone_idx(zone);
 	struct page *page = NULL;
 
 	for (; pfn < end_pfn; pfn++) {
-		if (!deferred_pfn_valid(nid, pfn, &nid_init_state)) {
+		if (!deferred_pfn_valid(pfn)) {
 			page = NULL;
 			continue;
 		} else if (!page || !(pfn & nr_pgmask)) {
@@ -1603,12 +1581,12 @@  static int __init deferred_init_memmap(void *data)
 	for_each_free_mem_range(i, nid, MEMBLOCK_NONE, &spa, &epa, NULL) {
 		spfn = max_t(unsigned long, first_init_pfn, PFN_UP(spa));
 		epfn = min_t(unsigned long, zone_end_pfn(zone), PFN_DOWN(epa));
-		nr_pages += deferred_init_pages(nid, zid, spfn, epfn);
+		nr_pages += deferred_init_pages(zone, spfn, epfn);
 	}
 	for_each_free_mem_range(i, nid, MEMBLOCK_NONE, &spa, &epa, NULL) {
 		spfn = max_t(unsigned long, first_init_pfn, PFN_UP(spa));
 		epfn = min_t(unsigned long, zone_end_pfn(zone), PFN_DOWN(epa));
-		deferred_free_pages(nid, zid, spfn, epfn);
+		deferred_free_pages(spfn, epfn);
 	}
 	pgdat_resize_unlock(pgdat, &flags);
 
@@ -1640,7 +1618,6 @@  static int __init deferred_init_memmap(void *data)
 static noinline bool __init
 deferred_grow_zone(struct zone *zone, unsigned int order)
 {
-	int zid = zone_idx(zone);
 	int nid = zone_to_nid(zone);
 	pg_data_t *pgdat = NODE_DATA(nid);
 	unsigned long nr_pages_needed = ALIGN(1 << order, PAGES_PER_SECTION);
@@ -1690,7 +1667,7 @@  deferred_grow_zone(struct zone *zone, unsigned int order)
 		while (spfn < epfn && nr_pages < nr_pages_needed) {
 			t = ALIGN(spfn + PAGES_PER_SECTION, PAGES_PER_SECTION);
 			first_deferred_pfn = min(t, epfn);
-			nr_pages += deferred_init_pages(nid, zid, spfn,
+			nr_pages += deferred_init_pages(zone, spfn,
 							first_deferred_pfn);
 			spfn = first_deferred_pfn;
 		}
@@ -1702,7 +1679,7 @@  deferred_grow_zone(struct zone *zone, unsigned int order)
 	for_each_free_mem_range(i, nid, MEMBLOCK_NONE, &spa, &epa, NULL) {
 		spfn = max_t(unsigned long, first_init_pfn, PFN_UP(spa));
 		epfn = min_t(unsigned long, first_deferred_pfn, PFN_DOWN(epa));
-		deferred_free_pages(nid, zid, spfn, epfn);
+		deferred_free_pages(spfn, epfn);
 
 		if (first_deferred_pfn == epfn)
 			break;