From patchwork Wed Jan 15 15:33:15 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 233870 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=3.0 tests=DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 23359C33CB3 for ; Wed, 15 Jan 2020 15:33:55 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id D94C024656 for ; Wed, 15 Jan 2020 15:33:54 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="TxK+Ro66" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726248AbgAOPdy (ORCPT ); Wed, 15 Jan 2020 10:33:54 -0500 Received: from us-smtp-delivery-1.mimecast.com ([207.211.31.120]:53111 "EHLO us-smtp-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1726132AbgAOPdy (ORCPT ); Wed, 15 Jan 2020 10:33:54 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1579102432; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=ABwpddKmjy8UbwKZ+RCfg08PsetPJXQeyFlvqbnXpXU=; b=TxK+Ro663POsN4HdkLLU81LmqPg9UrIgfbMaB3Oed6WiAUQQqq4CaNpc9yupXNKzbkC7h/ lzm5VsogXQotaWtePjWzvGeMGRwOXdjv1jBvmQ5RgEzlILFdejYUdEWLG18iBtWJM7rpb0 PmdBGWznnVeO0bzmPQEBpRANHjdAnvk= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-177-VudsHqgoMYiFtVSn5ONPWg-1; Wed, 15 Jan 2020 10:33:48 -0500 X-MC-Unique: VudsHqgoMYiFtVSn5ONPWg-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id DA23A926E28; Wed, 15 Jan 2020 15:33:46 +0000 (UTC) Received: from t480s.redhat.com (unknown [10.36.118.7]) by smtp.corp.redhat.com (Postfix) with ESMTP id 607D286CB2; Wed, 15 Jan 2020 15:33:42 +0000 (UTC) From: David Hildenbrand To: stable@vger.kernel.org Cc: linux-mm@kvack.org, Oscar Salvador , Michal Hocko , "Aneesh Kumar K . V" , Greg Kroah-Hartman , Dan Williams , Andrew Morton , Laurent Vivier , Baoquan He , David Hildenbrand Subject: [PATCH for 4.19-stable 01/25] mm/memory_hotplug: make remove_memory() take the device_hotplug_lock Date: Wed, 15 Jan 2020 16:33:15 +0100 Message-Id: <20200115153339.36409-2-david@redhat.com> In-Reply-To: <20200115153339.36409-1-david@redhat.com> References: <20200115153339.36409-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13 Sender: stable-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org commit cdeddc022a893858102a362147dc29be4b578abc upstream. Patch series "mm: online/offline_pages called w.o. mem_hotplug_lock", v3. Reading through the code and studying how mem_hotplug_lock is to be used, I noticed that there are two places where we can end up calling device_online()/device_offline() - online_pages()/offline_pages() without the mem_hotplug_lock. And there are other places where we call device_online()/device_offline() without the device_hotplug_lock. While e.g. echo "online" > /sys/devices/system/memory/memory9/state is fine, e.g. echo 1 > /sys/devices/system/memory/memory9/online Will not take the mem_hotplug_lock. However the device_lock() and device_hotplug_lock. E.g. via memory_probe_store(), we can end up calling add_memory()->online_pages() without the device_hotplug_lock. So we can have concurrent callers in online_pages(). We e.g. touch in online_pages() basically unprotected zone->present_pages then. Looks like there is a longer history to that (see Patch #2 for details), and fixing it to work the way it was intended is not really possible. We would e.g. have to take the mem_hotplug_lock in device/base/core.c, which sounds wrong. Summary: We had a lock inversion on mem_hotplug_lock and device_lock(). More details can be found in patch 3 and patch 6. I propose the general rules (documentation added in patch 6): 1. add_memory/add_memory_resource() must only be called with device_hotplug_lock. 2. remove_memory() must only be called with device_hotplug_lock. This is already documented and holds for all callers. 3. device_online()/device_offline() must only be called with device_hotplug_lock. This is already documented and true for now in core code. Other callers (related to memory hotplug) have to be fixed up. 4. mem_hotplug_lock is taken inside of add_memory/remove_memory/ online_pages/offline_pages. To me, this looks way cleaner than what we have right now (and easier to verify). And looking at the documentation of remove_memory, using lock_device_hotplug also for add_memory() feels natural. This patch (of 6): remove_memory() is exported right now but requires the device_hotplug_lock, which is not exported. So let's provide a variant that takes the lock and only export that one. The lock is already held in arch/powerpc/platforms/pseries/hotplug-memory.c drivers/acpi/acpi_memhotplug.c arch/powerpc/platforms/powernv/memtrace.c Apart from that, there are not other users in the tree. Link: http://lkml.kernel.org/r/20180925091457.28651-2-david@redhat.com Signed-off-by: David Hildenbrand Reviewed-by: Pavel Tatashin Reviewed-by: Rafael J. Wysocki Reviewed-by: Rashmica Gupta Reviewed-by: Oscar Salvador Cc: Benjamin Herrenschmidt Cc: Paul Mackerras Cc: Michael Ellerman Cc: "Rafael J. Wysocki" Cc: Len Brown Cc: Rashmica Gupta Cc: Michael Neuling Cc: Balbir Singh Cc: Nathan Fontenot Cc: John Allen Cc: Michal Hocko Cc: Dan Williams Cc: Joonsoo Kim Cc: Vlastimil Babka Cc: Greg Kroah-Hartman Cc: YASUAKI ISHIMATSU Cc: Mathieu Malaterre Cc: Boris Ostrovsky Cc: Haiyang Zhang Cc: Heiko Carstens Cc: Jonathan Corbet Cc: Juergen Gross Cc: Kate Stewart Cc: "K. Y. Srinivasan" Cc: Martin Schwidefsky Cc: Philippe Ombredanne Cc: Stephen Hemminger Cc: Thomas Gleixner Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: David Hildenbrand --- arch/powerpc/platforms/powernv/memtrace.c | 2 +- arch/powerpc/platforms/pseries/hotplug-memory.c | 6 +++--- drivers/acpi/acpi_memhotplug.c | 2 +- include/linux/memory_hotplug.h | 3 ++- mm/memory_hotplug.c | 9 ++++++++- 5 files changed, 15 insertions(+), 7 deletions(-) diff --git a/arch/powerpc/platforms/powernv/memtrace.c b/arch/powerpc/platforms/powernv/memtrace.c index dd3cc4632b9a..84d038ed3882 100644 --- a/arch/powerpc/platforms/powernv/memtrace.c +++ b/arch/powerpc/platforms/powernv/memtrace.c @@ -122,7 +122,7 @@ static u64 memtrace_alloc_node(u32 nid, u64 size) */ end_pfn = base_pfn + nr_pages; for (pfn = base_pfn; pfn < end_pfn; pfn += bytes>> PAGE_SHIFT) { - remove_memory(nid, pfn << PAGE_SHIFT, bytes); + __remove_memory(nid, pfn << PAGE_SHIFT, bytes); } unlock_device_hotplug(); return base_pfn << PAGE_SHIFT; diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c index 7f86bc3eaade..ad46eabe9f5f 100644 --- a/arch/powerpc/platforms/pseries/hotplug-memory.c +++ b/arch/powerpc/platforms/pseries/hotplug-memory.c @@ -306,7 +306,7 @@ static int pseries_remove_memblock(unsigned long base, unsigned int memblock_siz nid = memory_add_physaddr_to_nid(base); for (i = 0; i < sections_per_block; i++) { - remove_memory(nid, base, MIN_MEMORY_BLOCK_SIZE); + __remove_memory(nid, base, MIN_MEMORY_BLOCK_SIZE); base += MIN_MEMORY_BLOCK_SIZE; } @@ -398,7 +398,7 @@ static int dlpar_remove_lmb(struct drmem_lmb *lmb) block_sz = pseries_memory_block_size(); nid = memory_add_physaddr_to_nid(lmb->base_addr); - remove_memory(nid, lmb->base_addr, block_sz); + __remove_memory(nid, lmb->base_addr, block_sz); /* Update memory regions for memory remove */ memblock_remove(lmb->base_addr, block_sz); @@ -685,7 +685,7 @@ static int dlpar_add_lmb(struct drmem_lmb *lmb) rc = dlpar_online_lmb(lmb); if (rc) { - remove_memory(nid, lmb->base_addr, block_sz); + __remove_memory(nid, lmb->base_addr, block_sz); invalidate_lmb_associativity_index(lmb); } else { lmb->flags |= DRCONF_MEM_ASSIGNED; diff --git a/drivers/acpi/acpi_memhotplug.c b/drivers/acpi/acpi_memhotplug.c index 2ccfbb61ca89..8fe0960ea572 100644 --- a/drivers/acpi/acpi_memhotplug.c +++ b/drivers/acpi/acpi_memhotplug.c @@ -282,7 +282,7 @@ static void acpi_memory_remove_memory(struct acpi_memory_device *mem_device) nid = memory_add_physaddr_to_nid(info->start_addr); acpi_unbind_memory_blocks(info); - remove_memory(nid, info->start_addr, info->length); + __remove_memory(nid, info->start_addr, info->length); list_del(&info->list); kfree(info); } diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h index 4915e6cd7fd5..6f13a5a33b51 100644 --- a/include/linux/memory_hotplug.h +++ b/include/linux/memory_hotplug.h @@ -303,6 +303,7 @@ extern bool is_mem_section_removable(unsigned long pfn, unsigned long nr_pages); extern void try_offline_node(int nid); extern int offline_pages(unsigned long start_pfn, unsigned long nr_pages); extern void remove_memory(int nid, u64 start, u64 size); +extern void __remove_memory(int nid, u64 start, u64 size); #else static inline bool is_mem_section_removable(unsigned long pfn, @@ -319,6 +320,7 @@ static inline int offline_pages(unsigned long start_pfn, unsigned long nr_pages) } static inline void remove_memory(int nid, u64 start, u64 size) {} +static inline void __remove_memory(int nid, u64 start, u64 size) {} #endif /* CONFIG_MEMORY_HOTREMOVE */ extern void __ref free_area_init_core_hotplug(int nid); @@ -333,7 +335,6 @@ extern void move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn, unsigned long nr_pages, struct vmem_altmap *altmap); extern int offline_pages(unsigned long start_pfn, unsigned long nr_pages); extern bool is_memblock_offlined(struct memory_block *mem); -extern void remove_memory(int nid, u64 start, u64 size); extern int sparse_add_one_section(struct pglist_data *pgdat, unsigned long start_pfn, struct vmem_altmap *altmap); extern void sparse_remove_one_section(struct zone *zone, struct mem_section *ms, diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c index 413f6709039a..e2e2cf7014ee 100644 --- a/mm/memory_hotplug.c +++ b/mm/memory_hotplug.c @@ -1893,7 +1893,7 @@ EXPORT_SYMBOL(try_offline_node); * and online/offline operations before this call, as required by * try_offline_node(). */ -void __ref remove_memory(int nid, u64 start, u64 size) +void __ref __remove_memory(int nid, u64 start, u64 size) { int ret; @@ -1922,5 +1922,12 @@ void __ref remove_memory(int nid, u64 start, u64 size) mem_hotplug_done(); } + +void remove_memory(int nid, u64 start, u64 size) +{ + lock_device_hotplug(); + __remove_memory(nid, start, size); + unlock_device_hotplug(); +} EXPORT_SYMBOL_GPL(remove_memory); #endif /* CONFIG_MEMORY_HOTREMOVE */ From patchwork Wed Jan 15 15:33:17 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 233869 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=3.0 tests=DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4F332C33CB2 for ; Wed, 15 Jan 2020 15:33:59 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 2784F2053B for ; Wed, 15 Jan 2020 15:33:59 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="JS5Jl4dm" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726132AbgAOPd6 (ORCPT ); Wed, 15 Jan 2020 10:33:58 -0500 Received: from us-smtp-delivery-1.mimecast.com ([207.211.31.120]:48766 "EHLO us-smtp-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1726506AbgAOPd6 (ORCPT ); Wed, 15 Jan 2020 10:33:58 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1579102436; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Xouwvrgb9om+SdRVos7b4oALgOeF1wG6WhY9MhFAPfM=; b=JS5Jl4dmar62BHwzfwuFBl4bY5Yx0qIAcZrlxGML6UxQji7rae+O1fv6t71k96IoaTr722 NePa3GHz/cqVvWG7owAkSDy9V7w88mvfUs/ETQH/SsmiDi+W2ePOAleJFCbg8DQ2U6lJYQ WYfzDN/vEqh79zM4O0BQjbs8uwNELok= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-233-dSiSfgeDO4iSSN-MK1IFiw-1; Wed, 15 Jan 2020 10:33:53 -0500 X-MC-Unique: dSiSfgeDO4iSSN-MK1IFiw-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id D3CFC8DB0FF; Wed, 15 Jan 2020 15:33:51 +0000 (UTC) Received: from t480s.redhat.com (unknown [10.36.118.7]) by smtp.corp.redhat.com (Postfix) with ESMTP id A5FFC8888F; Wed, 15 Jan 2020 15:33:49 +0000 (UTC) From: David Hildenbrand To: stable@vger.kernel.org Cc: linux-mm@kvack.org, Oscar Salvador , Michal Hocko , "Aneesh Kumar K . V" , Greg Kroah-Hartman , Dan Williams , Andrew Morton , Laurent Vivier , Baoquan He , David Hildenbrand Subject: [PATCH for 4.19-stable 03/25] mm, sparse: pass nid instead of pgdat to sparse_add_one_section() Date: Wed, 15 Jan 2020 16:33:17 +0100 Message-Id: <20200115153339.36409-4-david@redhat.com> In-Reply-To: <20200115153339.36409-1-david@redhat.com> References: <20200115153339.36409-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13 Sender: stable-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org commit 4e0d2e7ef14d9e1c900dac909db45263822b824f upstream. Since the information needed in sparse_add_one_section() is node id to allocate proper memory, it is not necessary to pass its pgdat. This patch changes the prototype of sparse_add_one_section() to pass node id directly. This is intended to reduce misleading that sparse_add_one_section() would touch pgdat. Link: http://lkml.kernel.org/r/20181204085657.20472-2-richard.weiyang@gmail.com Signed-off-by: Wei Yang Reviewed-by: David Hildenbrand Acked-by: Michal Hocko Cc: Dave Hansen Cc: Oscar Salvador Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: David Hildenbrand --- include/linux/memory_hotplug.h | 4 ++-- mm/memory_hotplug.c | 2 +- mm/sparse.c | 8 ++++---- 3 files changed, 7 insertions(+), 7 deletions(-) diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h index 6f13a5a33b51..008e5281e7d7 100644 --- a/include/linux/memory_hotplug.h +++ b/include/linux/memory_hotplug.h @@ -335,8 +335,8 @@ extern void move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn, unsigned long nr_pages, struct vmem_altmap *altmap); extern int offline_pages(unsigned long start_pfn, unsigned long nr_pages); extern bool is_memblock_offlined(struct memory_block *mem); -extern int sparse_add_one_section(struct pglist_data *pgdat, - unsigned long start_pfn, struct vmem_altmap *altmap); +extern int sparse_add_one_section(int nid, unsigned long start_pfn, + struct vmem_altmap *altmap); extern void sparse_remove_one_section(struct zone *zone, struct mem_section *ms, unsigned long map_offset, struct vmem_altmap *altmap); extern struct page *sparse_decode_mem_map(unsigned long coded_mem_map, diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c index e2e2cf7014ee..c109e3f0bc16 100644 --- a/mm/memory_hotplug.c +++ b/mm/memory_hotplug.c @@ -255,7 +255,7 @@ static int __meminit __add_section(int nid, unsigned long phys_start_pfn, if (pfn_valid(phys_start_pfn)) return -EEXIST; - ret = sparse_add_one_section(NODE_DATA(nid), phys_start_pfn, altmap); + ret = sparse_add_one_section(nid, phys_start_pfn, altmap); if (ret < 0) return ret; diff --git a/mm/sparse.c b/mm/sparse.c index 9aca9f24bdc5..f52e8c328761 100644 --- a/mm/sparse.c +++ b/mm/sparse.c @@ -661,8 +661,8 @@ static void free_map_bootmem(struct page *memmap) * set. If this is <=0, then that means that the passed-in * map was not consumed and must be freed. */ -int __meminit sparse_add_one_section(struct pglist_data *pgdat, - unsigned long start_pfn, struct vmem_altmap *altmap) +int __meminit sparse_add_one_section(int nid, unsigned long start_pfn, + struct vmem_altmap *altmap) { unsigned long section_nr = pfn_to_section_nr(start_pfn); struct mem_section *ms; @@ -674,11 +674,11 @@ int __meminit sparse_add_one_section(struct pglist_data *pgdat, * no locking for this, because it does its own * plus, it does a kmalloc */ - ret = sparse_index_init(section_nr, pgdat->node_id); + ret = sparse_index_init(section_nr, nid); if (ret < 0 && ret != -EEXIST) return ret; ret = 0; - memmap = kmalloc_section_memmap(section_nr, pgdat->node_id, altmap); + memmap = kmalloc_section_memmap(section_nr, nid, altmap); if (!memmap) return -ENOMEM; usemap = __kmalloc_section_usemap(); From patchwork Wed Jan 15 15:33:19 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 233868 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=3.0 tests=DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9F337C33CB1 for ; Wed, 15 Jan 2020 15:34:04 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 6AD602467C for ; Wed, 15 Jan 2020 15:34:04 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="GWW66/fB" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728896AbgAOPeE (ORCPT ); Wed, 15 Jan 2020 10:34:04 -0500 Received: from us-smtp-2.mimecast.com ([207.211.31.81]:47337 "EHLO us-smtp-delivery-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1726506AbgAOPeD (ORCPT ); Wed, 15 Jan 2020 10:34:03 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1579102442; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=wIA+F40TJeius/Ve+PoUrhJl3BcrWMkiZ//CCAuYEtY=; b=GWW66/fBS8nQt6jgmZKjcJCmX+KPdBsSk+E1wIQmy24vc47cQYMJcWENe9LfxWfVDmUKFS SQemAVRE9rgoURXb6iViuPXC5JlAwF5S9DHeFVbEDEkwP17RK4JLGO1xwyOjTZEimOclbJ f8sOragjxJ9TGp15pq+QSBxu9RpdnGA= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-237-Med2IXC8PemnrEDT--fkSw-1; Wed, 15 Jan 2020 10:33:58 -0500 X-MC-Unique: Med2IXC8PemnrEDT--fkSw-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id D09179011F6; Wed, 15 Jan 2020 15:33:56 +0000 (UTC) Received: from t480s.redhat.com (unknown [10.36.118.7]) by smtp.corp.redhat.com (Postfix) with ESMTP id A5FCC86CB2; Wed, 15 Jan 2020 15:33:54 +0000 (UTC) From: David Hildenbrand To: stable@vger.kernel.org Cc: linux-mm@kvack.org, Oscar Salvador , Michal Hocko , "Aneesh Kumar K . V" , Greg Kroah-Hartman , Dan Williams , Andrew Morton , Laurent Vivier , Baoquan He , David Hildenbrand Subject: [PATCH for 4.19-stable 05/25] mm, memory_hotplug: add nid parameter to arch_remove_memory Date: Wed, 15 Jan 2020 16:33:19 +0100 Message-Id: <20200115153339.36409-6-david@redhat.com> In-Reply-To: <20200115153339.36409-1-david@redhat.com> References: <20200115153339.36409-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13 Sender: stable-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org commit 2c2a5af6fed20cf74401c9d64319c76c5ff81309 upstream. -- snip -- Missing unification of mm/hmm.c and kernel/memremap.c -- snip -- Patch series "Do not touch pages in hot-remove path", v2. This patchset aims for two things: 1) A better definition about offline and hot-remove stage 2) Solving bugs where we can access non-initialized pages during hot-remove operations [2] [3]. This is achieved by moving all page/zone handling to the offline stage, so we do not need to access pages when hot-removing memory. [1] https://patchwork.kernel.org/cover/10691415/ [2] https://patchwork.kernel.org/patch/10547445/ [3] https://www.spinics.net/lists/linux-mm/msg161316.html This patch (of 5): This is a preparation for the following-up patches. The idea of passing the nid is that it will allow us to get rid of the zone parameter afterwards. Link: http://lkml.kernel.org/r/20181127162005.15833-2-osalvador@suse.de Signed-off-by: Oscar Salvador Reviewed-by: David Hildenbrand Reviewed-by: Pavel Tatashin Cc: Michal Hocko Cc: Dan Williams Cc: Jerome Glisse Cc: Jonathan Cameron Cc: "Rafael J. Wysocki" Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: David Hildenbrand --- arch/ia64/mm/init.c | 2 +- arch/powerpc/mm/mem.c | 3 ++- arch/s390/mm/init.c | 2 +- arch/sh/mm/init.c | 2 +- arch/x86/mm/init_32.c | 2 +- arch/x86/mm/init_64.c | 3 ++- include/linux/memory_hotplug.h | 4 ++-- kernel/memremap.c | 5 ++++- mm/hmm.c | 4 +++- mm/memory_hotplug.c | 2 +- 10 files changed, 18 insertions(+), 11 deletions(-) diff --git a/arch/ia64/mm/init.c b/arch/ia64/mm/init.c index 3b85c3ecac38..b54d0ee74b53 100644 --- a/arch/ia64/mm/init.c +++ b/arch/ia64/mm/init.c @@ -662,7 +662,7 @@ int arch_add_memory(int nid, u64 start, u64 size, struct vmem_altmap *altmap, } #ifdef CONFIG_MEMORY_HOTREMOVE -int arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap) +int arch_remove_memory(int nid, u64 start, u64 size, struct vmem_altmap *altmap) { unsigned long start_pfn = start >> PAGE_SHIFT; unsigned long nr_pages = size >> PAGE_SHIFT; diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c index 9a6afd9f3f9b..1b6e0ef5d14d 100644 --- a/arch/powerpc/mm/mem.c +++ b/arch/powerpc/mm/mem.c @@ -140,7 +140,8 @@ int __meminit arch_add_memory(int nid, u64 start, u64 size, struct vmem_altmap * } #ifdef CONFIG_MEMORY_HOTREMOVE -int __meminit arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap) +int __meminit arch_remove_memory(int nid, u64 start, u64 size, + struct vmem_altmap *altmap) { unsigned long start_pfn = start >> PAGE_SHIFT; unsigned long nr_pages = size >> PAGE_SHIFT; diff --git a/arch/s390/mm/init.c b/arch/s390/mm/init.c index 3fa3e5323612..bc49b560625e 100644 --- a/arch/s390/mm/init.c +++ b/arch/s390/mm/init.c @@ -240,7 +240,7 @@ int arch_add_memory(int nid, u64 start, u64 size, struct vmem_altmap *altmap, } #ifdef CONFIG_MEMORY_HOTREMOVE -int arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap) +int arch_remove_memory(int nid, u64 start, u64 size, struct vmem_altmap *altmap) { /* * There is no hardware or firmware interface which could trigger a diff --git a/arch/sh/mm/init.c b/arch/sh/mm/init.c index 7713c084d040..5c91bb6abc35 100644 --- a/arch/sh/mm/init.c +++ b/arch/sh/mm/init.c @@ -444,7 +444,7 @@ EXPORT_SYMBOL_GPL(memory_add_physaddr_to_nid); #endif #ifdef CONFIG_MEMORY_HOTREMOVE -int arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap) +int arch_remove_memory(int nid, u64 start, u64 size, struct vmem_altmap *altmap) { unsigned long start_pfn = PFN_DOWN(start); unsigned long nr_pages = size >> PAGE_SHIFT; diff --git a/arch/x86/mm/init_32.c b/arch/x86/mm/init_32.c index 979e0a02cbe1..9fa503f2f56c 100644 --- a/arch/x86/mm/init_32.c +++ b/arch/x86/mm/init_32.c @@ -861,7 +861,7 @@ int arch_add_memory(int nid, u64 start, u64 size, struct vmem_altmap *altmap, } #ifdef CONFIG_MEMORY_HOTREMOVE -int arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap) +int arch_remove_memory(int nid, u64 start, u64 size, struct vmem_altmap *altmap) { unsigned long start_pfn = start >> PAGE_SHIFT; unsigned long nr_pages = size >> PAGE_SHIFT; diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c index a3e9c6ee3cf2..32066d5dc9af 100644 --- a/arch/x86/mm/init_64.c +++ b/arch/x86/mm/init_64.c @@ -1142,7 +1142,8 @@ kernel_physical_mapping_remove(unsigned long start, unsigned long end) remove_pagetable(start, end, true, NULL); } -int __ref arch_remove_memory(u64 start, u64 size, struct vmem_altmap *altmap) +int __ref arch_remove_memory(int nid, u64 start, u64 size, + struct vmem_altmap *altmap) { unsigned long start_pfn = start >> PAGE_SHIFT; unsigned long nr_pages = size >> PAGE_SHIFT; diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h index 008e5281e7d7..df77a7597aba 100644 --- a/include/linux/memory_hotplug.h +++ b/include/linux/memory_hotplug.h @@ -109,8 +109,8 @@ static inline bool movable_node_is_enabled(void) } #ifdef CONFIG_MEMORY_HOTREMOVE -extern int arch_remove_memory(u64 start, u64 size, - struct vmem_altmap *altmap); +extern int arch_remove_memory(int nid, u64 start, u64 size, + struct vmem_altmap *altmap); extern int __remove_pages(struct zone *zone, unsigned long start_pfn, unsigned long nr_pages, struct vmem_altmap *altmap); #endif /* CONFIG_MEMORY_HOTREMOVE */ diff --git a/kernel/memremap.c b/kernel/memremap.c index 7c5fb8a208ac..2ee2e672d5fc 100644 --- a/kernel/memremap.c +++ b/kernel/memremap.c @@ -121,6 +121,7 @@ static void devm_memremap_pages_release(void *data) struct resource *res = &pgmap->res; resource_size_t align_start, align_size; unsigned long pfn; + int nid; pgmap->kill(pgmap->ref); for_each_device_pfn(pfn, pgmap) @@ -131,13 +132,15 @@ static void devm_memremap_pages_release(void *data) align_size = ALIGN(res->start + resource_size(res), SECTION_SIZE) - align_start; + nid = page_to_nid(pfn_to_page(align_start >> PAGE_SHIFT)); + mem_hotplug_begin(); if (pgmap->type == MEMORY_DEVICE_PRIVATE) { pfn = align_start >> PAGE_SHIFT; __remove_pages(page_zone(pfn_to_page(pfn)), pfn, align_size >> PAGE_SHIFT, NULL); } else { - arch_remove_memory(align_start, align_size, + arch_remove_memory(nid, align_start, align_size, pgmap->altmap_valid ? &pgmap->altmap : NULL); kasan_remove_zero_shadow(__va(align_start), align_size); } diff --git a/mm/hmm.c b/mm/hmm.c index 57f0d2a4ff34..ae1f6ad46d30 100644 --- a/mm/hmm.c +++ b/mm/hmm.c @@ -999,6 +999,7 @@ static void hmm_devmem_release(void *data) unsigned long start_pfn, npages; struct zone *zone; struct page *page; + int nid; /* pages are dead and unused, undo the arch mapping */ start_pfn = (resource->start & ~(PA_SECTION_SIZE - 1)) >> PAGE_SHIFT; @@ -1006,12 +1007,13 @@ static void hmm_devmem_release(void *data) page = pfn_to_page(start_pfn); zone = page_zone(page); + nid = page_to_nid(page); mem_hotplug_begin(); if (resource->desc == IORES_DESC_DEVICE_PRIVATE_MEMORY) __remove_pages(zone, start_pfn, npages, NULL); else - arch_remove_memory(start_pfn << PAGE_SHIFT, + arch_remove_memory(nid, start_pfn << PAGE_SHIFT, npages << PAGE_SHIFT, NULL); mem_hotplug_done(); diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c index c109e3f0bc16..a51e5ffdaa04 100644 --- a/mm/memory_hotplug.c +++ b/mm/memory_hotplug.c @@ -1916,7 +1916,7 @@ void __ref __remove_memory(int nid, u64 start, u64 size) memblock_free(start, size); memblock_remove(start, size); - arch_remove_memory(start, size, NULL); + arch_remove_memory(nid, start, size, NULL); try_offline_node(nid); From patchwork Wed Jan 15 15:33:21 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 233867 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=3.0 tests=DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 137A5C33CB3 for ; Wed, 15 Jan 2020 15:34:09 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id DEB132053B for ; Wed, 15 Jan 2020 15:34:08 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="c1qn9TBE" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726506AbgAOPeI (ORCPT ); Wed, 15 Jan 2020 10:34:08 -0500 Received: from us-smtp-delivery-1.mimecast.com ([207.211.31.120]:31369 "EHLO us-smtp-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1728912AbgAOPeI (ORCPT ); Wed, 15 Jan 2020 10:34:08 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1579102446; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=iqgCaZviHLl0j25ZTLpmn0A6giTT4QsPeJlLoPbLi5c=; b=c1qn9TBE4Ny8PJ+UQErpmXW0+nK4ZFumfgmNx9K801ANThCxgGb3AvkwEJuHZ8HOJtf+OK a13o7NuK4xM0pynb9mug9qlncMOh77oDYom7+Aq605UrPSvxZ9LwqPfJOTRnGxB2fbQf9f EDW54tf6GFLIsXPVm5BLO/1/QbjaAVA= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-321-VmF-FHJAMCaT26V_ZHofcw-1; Wed, 15 Jan 2020 10:34:03 -0500 X-MC-Unique: VmF-FHJAMCaT26V_ZHofcw-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id CE53C1109881; Wed, 15 Jan 2020 15:34:01 +0000 (UTC) Received: from t480s.redhat.com (unknown [10.36.118.7]) by smtp.corp.redhat.com (Postfix) with ESMTP id 9936786CB2; Wed, 15 Jan 2020 15:33:59 +0000 (UTC) From: David Hildenbrand To: stable@vger.kernel.org Cc: linux-mm@kvack.org, Oscar Salvador , Michal Hocko , "Aneesh Kumar K . V" , Greg Kroah-Hartman , Dan Williams , Andrew Morton , Laurent Vivier , Baoquan He , David Hildenbrand Subject: [PATCH for 4.19-stable 07/25] drivers/base/memory.c: clean up relics in function parameters Date: Wed, 15 Jan 2020 16:33:21 +0100 Message-Id: <20200115153339.36409-8-david@redhat.com> In-Reply-To: <20200115153339.36409-1-david@redhat.com> References: <20200115153339.36409-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13 Sender: stable-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org commit 063b8a4cee8088224bcdb79bcd08db98df16178e upstream. The input parameter 'phys_index' of memory_block_action() is actually the section number, but not the phys_index of memory_block. This is a relic from the past when one memory block could only contain one section. Rename it to start_section_nr. And also in remove_memory_section(), the 'node_id' and 'phys_device' arguments are not used by anyone. Remove them. Link: http://lkml.kernel.org/r/20190329144250.14315-2-bhe@redhat.com Signed-off-by: Baoquan He Acked-by: Michal Hocko Reviewed-by: Rafael J. Wysocki Reviewed-by: Mukesh Ojha Reviewed-by: Oscar Salvador Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: David Hildenbrand --- drivers/base/memory.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/drivers/base/memory.c b/drivers/base/memory.c index 5fca7225f3fe..b384f01ad29d 100644 --- a/drivers/base/memory.c +++ b/drivers/base/memory.c @@ -230,13 +230,14 @@ static bool pages_correctly_probed(unsigned long start_pfn) * OK to have direct references to sparsemem variables in here. */ static int -memory_block_action(unsigned long phys_index, unsigned long action, int online_type) +memory_block_action(unsigned long start_section_nr, unsigned long action, + int online_type) { unsigned long start_pfn; unsigned long nr_pages = PAGES_PER_SECTION * sections_per_block; int ret; - start_pfn = section_nr_to_pfn(phys_index); + start_pfn = section_nr_to_pfn(start_section_nr); switch (action) { case MEM_ONLINE: @@ -250,7 +251,7 @@ memory_block_action(unsigned long phys_index, unsigned long action, int online_t break; default: WARN(1, KERN_WARNING "%s(%ld, %ld) unknown action: " - "%ld\n", __func__, phys_index, action, action); + "%ld\n", __func__, start_section_nr, action, action); ret = -EINVAL; } @@ -747,8 +748,7 @@ unregister_memory(struct memory_block *memory) device_unregister(&memory->dev); } -static int remove_memory_section(unsigned long node_id, - struct mem_section *section, int phys_device) +static int remove_memory_section(struct mem_section *section) { struct memory_block *mem; @@ -780,7 +780,7 @@ int unregister_memory_section(struct mem_section *section) if (!present_section(section)) return -EINVAL; - return remove_memory_section(0, section, 0); + return remove_memory_section(section); } #endif /* CONFIG_MEMORY_HOTREMOVE */ From patchwork Wed Jan 15 15:33:24 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 233866 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=3.0 tests=DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6EBC6C33CB1 for ; Wed, 15 Jan 2020 15:34:19 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 42C2924656 for ; Wed, 15 Jan 2020 15:34:19 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="YWm9fHGH" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726550AbgAOPeS (ORCPT ); Wed, 15 Jan 2020 10:34:18 -0500 Received: from us-smtp-delivery-1.mimecast.com ([205.139.110.120]:37413 "EHLO us-smtp-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1728912AbgAOPeS (ORCPT ); Wed, 15 Jan 2020 10:34:18 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1579102456; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=JDa/6MD2QZaRtp/DNJJPsnCvCCrAi5HUTK78bJMcAgs=; b=YWm9fHGHL4T9r8jVUPvM6IQYWLoiVzFn57+9DInmLIPbSjU+Lc2akdlBmOBFnpcgxlGaac /dsb4UfB+scM1fpaXXPOEz7AeuTDlSv7Jx70rbounVU1OTla45mPtwbq28D9M2B7iWXrDi vB2WkC9Vgxa+tmrWdLvDLO/bg9BJtwM= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-72-5j_0HzXPORifJSvzdSUNOg-1; Wed, 15 Jan 2020 10:34:13 -0500 X-MC-Unique: 5j_0HzXPORifJSvzdSUNOg-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 9B51A8C4329; Wed, 15 Jan 2020 15:34:11 +0000 (UTC) Received: from t480s.redhat.com (unknown [10.36.118.7]) by smtp.corp.redhat.com (Postfix) with ESMTP id 6A82B86CB2; Wed, 15 Jan 2020 15:34:09 +0000 (UTC) From: David Hildenbrand To: stable@vger.kernel.org Cc: linux-mm@kvack.org, Oscar Salvador , Michal Hocko , "Aneesh Kumar K . V" , Greg Kroah-Hartman , Dan Williams , Andrew Morton , Laurent Vivier , Baoquan He , David Hildenbrand Subject: [PATCH for 4.19-stable 10/25] mm/memory_hotplug: make __remove_section() never fail Date: Wed, 15 Jan 2020 16:33:24 +0100 Message-Id: <20200115153339.36409-11-david@redhat.com> In-Reply-To: <20200115153339.36409-1-david@redhat.com> References: <20200115153339.36409-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13 Sender: stable-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org commit 9d1d887d785b4fe0590bd3c5e71acaa3908044e2 upstream. Let's just warn in case a section is not valid instead of failing to remove somewhere in the middle of the process, returning an error that will be mostly ignored by callers. Link: http://lkml.kernel.org/r/20190409100148.24703-4-david@redhat.com Signed-off-by: David Hildenbrand Reviewed-by: Oscar Salvador Cc: Michal Hocko Cc: David Hildenbrand Cc: Pavel Tatashin Cc: Qian Cai Cc: Wei Yang Cc: Arun KS Cc: Mathieu Malaterre Cc: Andrew Banman Cc: Andy Lutomirski Cc: Benjamin Herrenschmidt Cc: Borislav Petkov Cc: Christophe Leroy Cc: Dave Hansen Cc: Fenghua Yu Cc: Geert Uytterhoeven Cc: Greg Kroah-Hartman Cc: Heiko Carstens Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: Ingo Molnar Cc: Joonsoo Kim Cc: "Kirill A. Shutemov" Cc: Martin Schwidefsky Cc: Masahiro Yamada Cc: Michael Ellerman Cc: Mike Rapoport Cc: Mike Travis Cc: Nicholas Piggin Cc: Oscar Salvador Cc: Paul Mackerras Cc: Peter Zijlstra Cc: "Rafael J. Wysocki" Cc: Rich Felker Cc: Rob Herring Cc: Stefan Agner Cc: Thomas Gleixner Cc: Tony Luck Cc: Vasily Gorbik Cc: Yoshinori Sato Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: David Hildenbrand --- mm/memory_hotplug.c | 22 +++++++++------------- 1 file changed, 9 insertions(+), 13 deletions(-) diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c index 42c5bc371ffe..a9fcae50a33a 100644 --- a/mm/memory_hotplug.c +++ b/mm/memory_hotplug.c @@ -478,15 +478,15 @@ static void __remove_zone(struct zone *zone, unsigned long start_pfn) pgdat_resize_unlock(zone->zone_pgdat, &flags); } -static int __remove_section(struct zone *zone, struct mem_section *ms, - unsigned long map_offset, struct vmem_altmap *altmap) +static void __remove_section(struct zone *zone, struct mem_section *ms, + unsigned long map_offset, + struct vmem_altmap *altmap) { unsigned long start_pfn; int scn_nr; - int ret = -EINVAL; - if (!valid_section(ms)) - return ret; + if (WARN_ON_ONCE(!valid_section(ms))) + return; unregister_memory_section(ms); @@ -495,7 +495,6 @@ static int __remove_section(struct zone *zone, struct mem_section *ms, __remove_zone(zone, start_pfn); sparse_remove_one_section(zone, ms, map_offset, altmap); - return 0; } /** @@ -515,7 +514,7 @@ int __remove_pages(struct zone *zone, unsigned long phys_start_pfn, { unsigned long i; unsigned long map_offset = 0; - int sections_to_remove, ret = 0; + int sections_to_remove; /* In the ZONE_DEVICE case device driver owns the memory region */ if (is_dev_zone(zone)) { @@ -536,16 +535,13 @@ int __remove_pages(struct zone *zone, unsigned long phys_start_pfn, unsigned long pfn = phys_start_pfn + i*PAGES_PER_SECTION; cond_resched(); - ret = __remove_section(zone, __pfn_to_section(pfn), map_offset, - altmap); + __remove_section(zone, __pfn_to_section(pfn), map_offset, + altmap); map_offset = 0; - if (ret) - break; } set_zone_contiguous(zone); - - return ret; + return 0; } #endif /* CONFIG_MEMORY_HOTREMOVE */ From patchwork Wed Jan 15 15:33:25 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 233865 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=3.0 tests=DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 73C70C3F68F for ; Wed, 15 Jan 2020 15:34:20 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 499772053B for ; Wed, 15 Jan 2020 15:34:20 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="DHNc9VtH" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729010AbgAOPeT (ORCPT ); Wed, 15 Jan 2020 10:34:19 -0500 Received: from us-smtp-2.mimecast.com ([207.211.31.81]:41358 "EHLO us-smtp-delivery-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1729005AbgAOPeT (ORCPT ); Wed, 15 Jan 2020 10:34:19 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1579102459; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=6I7vjcTE4M6IA5t0ZZKiqiD7EyciUEcVUegkom+LSR8=; b=DHNc9VtHDd3sRyZh8OcBSBtil0A019N1Qgj5lEeboPFGmp2WNg8nh0yzAJgeA0qv1kwV+j /NOoSMO9xFAmTjxe40DRSwukhgZxUK+N4bVEnQ9TzepLoNjM3KJ4CxalJa8b35zin5/z9p Wki04r5jlvxvXCHYKGlhVuP0yYGoHrY= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-265-cLEUws5WObObcmxRDmKVHw-1; Wed, 15 Jan 2020 10:34:15 -0500 X-MC-Unique: cLEUws5WObObcmxRDmKVHw-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 2275C1A2C4F; Wed, 15 Jan 2020 15:34:14 +0000 (UTC) Received: from t480s.redhat.com (unknown [10.36.118.7]) by smtp.corp.redhat.com (Postfix) with ESMTP id E557486CB2; Wed, 15 Jan 2020 15:34:11 +0000 (UTC) From: David Hildenbrand To: stable@vger.kernel.org Cc: linux-mm@kvack.org, Oscar Salvador , Michal Hocko , "Aneesh Kumar K . V" , Greg Kroah-Hartman , Dan Williams , Andrew Morton , Laurent Vivier , Baoquan He , David Hildenbrand Subject: [PATCH for 4.19-stable 11/25] powerpc/mm: Fix section mismatch warning Date: Wed, 15 Jan 2020 16:33:25 +0100 Message-Id: <20200115153339.36409-12-david@redhat.com> In-Reply-To: <20200115153339.36409-1-david@redhat.com> References: <20200115153339.36409-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13 Sender: stable-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org commit 26ad26718dfaa7cf49d106d212ebf2370076c253 upstream. This patch fix the below section mismatch warnings. WARNING: vmlinux.o(.text+0x2d1f44): Section mismatch in reference from the function devm_memremap_pages_release() to the function .meminit.text:arch_remove_memory() WARNING: vmlinux.o(.text+0x2d265c): Section mismatch in reference from the function devm_memremap_pages() to the function .meminit.text:arch_add_memory() Signed-off-by: Aneesh Kumar K.V Signed-off-by: Michael Ellerman Signed-off-by: David Hildenbrand --- arch/powerpc/mm/mem.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c index 1b6e0ef5d14d..625d78547fe7 100644 --- a/arch/powerpc/mm/mem.c +++ b/arch/powerpc/mm/mem.c @@ -118,8 +118,8 @@ int __weak remove_section_mapping(unsigned long start, unsigned long end) return -ENODEV; } -int __meminit arch_add_memory(int nid, u64 start, u64 size, struct vmem_altmap *altmap, - bool want_memblock) +int __ref arch_add_memory(int nid, u64 start, u64 size, struct vmem_altmap *altmap, + bool want_memblock) { unsigned long start_pfn = start >> PAGE_SHIFT; unsigned long nr_pages = size >> PAGE_SHIFT; @@ -140,8 +140,8 @@ int __meminit arch_add_memory(int nid, u64 start, u64 size, struct vmem_altmap * } #ifdef CONFIG_MEMORY_HOTREMOVE -int __meminit arch_remove_memory(int nid, u64 start, u64 size, - struct vmem_altmap *altmap) +int __ref arch_remove_memory(int nid, u64 start, u64 size, + struct vmem_altmap *altmap) { unsigned long start_pfn = start >> PAGE_SHIFT; unsigned long nr_pages = size >> PAGE_SHIFT; From patchwork Wed Jan 15 15:33:28 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 233864 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=3.0 tests=DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A503DC33CB1 for ; Wed, 15 Jan 2020 15:34:28 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 7C99C2053B for ; Wed, 15 Jan 2020 15:34:28 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="Mut3x9Vs" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729011AbgAOPe2 (ORCPT ); Wed, 15 Jan 2020 10:34:28 -0500 Received: from us-smtp-delivery-1.mimecast.com ([207.211.31.120]:25246 "EHLO us-smtp-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1728978AbgAOPe2 (ORCPT ); Wed, 15 Jan 2020 10:34:28 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1579102466; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=6y509MhmNd/KLAcJws0N6bsuBxMXUDWgudqlhF4LAYQ=; b=Mut3x9Vsygfihvi7E3U/9+GWDAsGyIOHqECLPuibG+6fGA4uiWt1+/yI1S8vjLy1gNH0BF ntshS2P1uCLXsiw0b82tret9DNW23yNQE/0z847HfShWXsoSkFUxb+Pgj1rYsLKiPGIi8l PrRt1N3A7Puy6okt6ENjL7pZSTGl0tE= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-409-IoJdR7yXMqKZmPpBYlgcsA-1; Wed, 15 Jan 2020 10:34:23 -0500 X-MC-Unique: IoJdR7yXMqKZmPpBYlgcsA-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id A37EB109AAEC; Wed, 15 Jan 2020 15:34:21 +0000 (UTC) Received: from t480s.redhat.com (unknown [10.36.118.7]) by smtp.corp.redhat.com (Postfix) with ESMTP id 73FAE86CB2; Wed, 15 Jan 2020 15:34:19 +0000 (UTC) From: David Hildenbrand To: stable@vger.kernel.org Cc: linux-mm@kvack.org, Oscar Salvador , Michal Hocko , "Aneesh Kumar K . V" , Greg Kroah-Hartman , Dan Williams , Andrew Morton , Laurent Vivier , Baoquan He , David Hildenbrand Subject: [PATCH for 4.19-stable 14/25] s390x/mm: implement arch_remove_memory() Date: Wed, 15 Jan 2020 16:33:28 +0100 Message-Id: <20200115153339.36409-15-david@redhat.com> In-Reply-To: <20200115153339.36409-1-david@redhat.com> References: <20200115153339.36409-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13 Sender: stable-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org commit 18c86506c80f6b6b5e67d95bf0d6f7e665de5239 upstream. Will come in handy when wanting to handle errors after arch_add_memory(). Link: http://lkml.kernel.org/r/20190527111152.16324-4-david@redhat.com Signed-off-by: David Hildenbrand Cc: Heiko Carstens Cc: Michal Hocko Cc: Mike Rapoport Cc: David Hildenbrand Cc: Vasily Gorbik Cc: Oscar Salvador Cc: Alex Deucher Cc: Andrew Banman Cc: Andy Lutomirski Cc: Anshuman Khandual Cc: Ard Biesheuvel Cc: Arun KS Cc: Baoquan He Cc: Benjamin Herrenschmidt Cc: Borislav Petkov Cc: Catalin Marinas Cc: Chintan Pandya Cc: Christophe Leroy Cc: Chris Wilson Cc: Dan Williams Cc: Dave Hansen Cc: "David S. Miller" Cc: Fenghua Yu Cc: Greg Kroah-Hartman Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: Jonathan Cameron Cc: Joonsoo Kim Cc: Jun Yao Cc: "Kirill A. Shutemov" Cc: Logan Gunthorpe Cc: Mark Brown Cc: Mark Rutland Cc: Masahiro Yamada Cc: Mathieu Malaterre Cc: Michael Ellerman Cc: Mike Rapoport Cc: "mike.travis@hpe.com" Cc: Nicholas Piggin Cc: Oscar Salvador Cc: Paul Mackerras Cc: Pavel Tatashin Cc: Peter Zijlstra Cc: Qian Cai Cc: "Rafael J. Wysocki" Cc: Rich Felker Cc: Rob Herring Cc: Robin Murphy Cc: Thomas Gleixner Cc: Tony Luck Cc: Wei Yang Cc: Will Deacon Cc: Yoshinori Sato Cc: Yu Zhao Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: David Hildenbrand --- arch/s390/mm/init.c | 13 +++++++------ 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/arch/s390/mm/init.c b/arch/s390/mm/init.c index 0da486d914e4..fc761001fa94 100644 --- a/arch/s390/mm/init.c +++ b/arch/s390/mm/init.c @@ -243,12 +243,13 @@ int arch_add_memory(int nid, u64 start, u64 size, struct vmem_altmap *altmap, void arch_remove_memory(int nid, u64 start, u64 size, struct vmem_altmap *altmap) { - /* - * There is no hardware or firmware interface which could trigger a - * hot memory remove on s390. So there is nothing that needs to be - * implemented. - */ - BUG(); + unsigned long start_pfn = start >> PAGE_SHIFT; + unsigned long nr_pages = size >> PAGE_SHIFT; + struct zone *zone; + + zone = page_zone(pfn_to_page(start_pfn)); + __remove_pages(zone, start_pfn, nr_pages, altmap); + vmem_remove_mapping(start, size); } #endif #endif /* CONFIG_MEMORY_HOTPLUG */ From patchwork Wed Jan 15 15:33:29 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 233863 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=3.0 tests=DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 061FFC33CB3 for ; Wed, 15 Jan 2020 15:34:34 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id C121A2053B for ; Wed, 15 Jan 2020 15:34:33 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="eXfzMlBg" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729016AbgAOPed (ORCPT ); Wed, 15 Jan 2020 10:34:33 -0500 Received: from us-smtp-1.mimecast.com ([207.211.31.81]:40332 "EHLO us-smtp-delivery-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1729014AbgAOPed (ORCPT ); Wed, 15 Jan 2020 10:34:33 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1579102471; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=VSt3G4WopOAKbZpLpjpo+iNW5nosm3evV6t5IdJzQNo=; b=eXfzMlBgWtw6l2lDOY0257oPFuCETW1pl8o6KzmCwLAKmNxe1hyQWapCjmexHKz6mbQq9l V7kpGSdo26OWMsk948VS6oeLgRC4JiLRqM5NZyWMIKCzzE7e9EBeE0h+AfBNk0Phv989T4 /qpmsKxm1zAwTHr3BnvOhpsqgRTs2Z8= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-417-nxtVJsbPMYS8tCrd45AozQ-1; Wed, 15 Jan 2020 10:34:28 -0500 X-MC-Unique: nxtVJsbPMYS8tCrd45AozQ-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 89F7B1A3CCA; Wed, 15 Jan 2020 15:34:26 +0000 (UTC) Received: from t480s.redhat.com (unknown [10.36.118.7]) by smtp.corp.redhat.com (Postfix) with ESMTP id F1A408886D; Wed, 15 Jan 2020 15:34:21 +0000 (UTC) From: David Hildenbrand To: stable@vger.kernel.org Cc: linux-mm@kvack.org, Oscar Salvador , Michal Hocko , "Aneesh Kumar K . V" , Greg Kroah-Hartman , Dan Williams , Andrew Morton , Laurent Vivier , Baoquan He , David Hildenbrand Subject: [PATCH for 4.19-stable 15/25] mm/memory_hotplug: allow arch_remove_memory() without CONFIG_MEMORY_HOTREMOVE Date: Wed, 15 Jan 2020 16:33:29 +0100 Message-Id: <20200115153339.36409-16-david@redhat.com> In-Reply-To: <20200115153339.36409-1-david@redhat.com> References: <20200115153339.36409-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13 Sender: stable-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org commit 80ec922dbd87fd38d15719c86a94457204648aeb upstream. -- snip -- Missing arm64 memory hot(un)plug support. -- snip -- We want to improve error handling while adding memory by allowing to use arch_remove_memory() and __remove_pages() even if CONFIG_MEMORY_HOTREMOVE is not set to e.g., implement something like: arch_add_memory() rc = do_something(); if (rc) { arch_remove_memory(); } We won't get rid of CONFIG_MEMORY_HOTREMOVE for now, as it will require quite some dependencies for memory offlining. Link: http://lkml.kernel.org/r/20190527111152.16324-7-david@redhat.com Signed-off-by: David Hildenbrand Reviewed-by: Pavel Tatashin Cc: Tony Luck Cc: Fenghua Yu Cc: Benjamin Herrenschmidt Cc: Paul Mackerras Cc: Michael Ellerman Cc: Heiko Carstens Cc: Yoshinori Sato Cc: Rich Felker Cc: Dave Hansen Cc: Andy Lutomirski Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: Borislav Petkov Cc: "H. Peter Anvin" Cc: Greg Kroah-Hartman Cc: "Rafael J. Wysocki" Cc: Michal Hocko Cc: David Hildenbrand Cc: Oscar Salvador Cc: "Kirill A. Shutemov" Cc: Alex Deucher Cc: "David S. Miller" Cc: Mark Brown Cc: Chris Wilson Cc: Christophe Leroy Cc: Nicholas Piggin Cc: Vasily Gorbik Cc: Rob Herring Cc: Masahiro Yamada Cc: "mike.travis@hpe.com" Cc: Andrew Banman Cc: Arun KS Cc: Qian Cai Cc: Mathieu Malaterre Cc: Baoquan He Cc: Logan Gunthorpe Cc: Anshuman Khandual Cc: Ard Biesheuvel Cc: Catalin Marinas Cc: Chintan Pandya Cc: Dan Williams Cc: Ingo Molnar Cc: Jonathan Cameron Cc: Joonsoo Kim Cc: Jun Yao Cc: Mark Rutland Cc: Mike Rapoport Cc: Oscar Salvador Cc: Robin Murphy Cc: Wei Yang Cc: Will Deacon Cc: Yu Zhao Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: David Hildenbrand --- arch/ia64/mm/init.c | 2 -- arch/powerpc/mm/mem.c | 2 -- arch/s390/mm/init.c | 2 -- arch/sh/mm/init.c | 2 -- arch/x86/mm/init_32.c | 2 -- arch/x86/mm/init_64.c | 2 -- drivers/base/memory.c | 2 -- include/linux/memory.h | 2 -- include/linux/memory_hotplug.h | 2 -- mm/memory_hotplug.c | 2 -- mm/sparse.c | 6 ------ 11 files changed, 26 deletions(-) diff --git a/arch/ia64/mm/init.c b/arch/ia64/mm/init.c index 950a9e0a4548..778781e5f22e 100644 --- a/arch/ia64/mm/init.c +++ b/arch/ia64/mm/init.c @@ -661,7 +661,6 @@ int arch_add_memory(int nid, u64 start, u64 size, struct vmem_altmap *altmap, return ret; } -#ifdef CONFIG_MEMORY_HOTREMOVE void arch_remove_memory(int nid, u64 start, u64 size, struct vmem_altmap *altmap) { @@ -673,4 +672,3 @@ void arch_remove_memory(int nid, u64 start, u64 size, __remove_pages(zone, start_pfn, nr_pages, altmap); } #endif -#endif diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c index d3fd0095aae6..a9c1aaa0c133 100644 --- a/arch/powerpc/mm/mem.c +++ b/arch/powerpc/mm/mem.c @@ -139,7 +139,6 @@ int __ref arch_add_memory(int nid, u64 start, u64 size, struct vmem_altmap *altm return __add_pages(nid, start_pfn, nr_pages, altmap, want_memblock); } -#ifdef CONFIG_MEMORY_HOTREMOVE void __ref arch_remove_memory(int nid, u64 start, u64 size, struct vmem_altmap *altmap) { @@ -173,7 +172,6 @@ void __ref arch_remove_memory(int nid, u64 start, u64 size, pr_warn("Hash collision while resizing HPT\n"); } #endif -#endif /* CONFIG_MEMORY_HOTPLUG */ /* * walk_memory_resource() needs to make sure there is no holes in a given diff --git a/arch/s390/mm/init.c b/arch/s390/mm/init.c index fc761001fa94..e0b0788cfecb 100644 --- a/arch/s390/mm/init.c +++ b/arch/s390/mm/init.c @@ -239,7 +239,6 @@ int arch_add_memory(int nid, u64 start, u64 size, struct vmem_altmap *altmap, return rc; } -#ifdef CONFIG_MEMORY_HOTREMOVE void arch_remove_memory(int nid, u64 start, u64 size, struct vmem_altmap *altmap) { @@ -251,5 +250,4 @@ void arch_remove_memory(int nid, u64 start, u64 size, __remove_pages(zone, start_pfn, nr_pages, altmap); vmem_remove_mapping(start, size); } -#endif #endif /* CONFIG_MEMORY_HOTPLUG */ diff --git a/arch/sh/mm/init.c b/arch/sh/mm/init.c index 59ae5d7dc081..0da784ac0e34 100644 --- a/arch/sh/mm/init.c +++ b/arch/sh/mm/init.c @@ -443,7 +443,6 @@ int memory_add_physaddr_to_nid(u64 addr) EXPORT_SYMBOL_GPL(memory_add_physaddr_to_nid); #endif -#ifdef CONFIG_MEMORY_HOTREMOVE void arch_remove_memory(int nid, u64 start, u64 size, struct vmem_altmap *altmap) { @@ -454,5 +453,4 @@ void arch_remove_memory(int nid, u64 start, u64 size, zone = page_zone(pfn_to_page(start_pfn)); __remove_pages(zone, start_pfn, nr_pages, altmap); } -#endif #endif /* CONFIG_MEMORY_HOTPLUG */ diff --git a/arch/x86/mm/init_32.c b/arch/x86/mm/init_32.c index c6a50a0f240b..64f54f77052b 100644 --- a/arch/x86/mm/init_32.c +++ b/arch/x86/mm/init_32.c @@ -860,7 +860,6 @@ int arch_add_memory(int nid, u64 start, u64 size, struct vmem_altmap *altmap, return __add_pages(nid, start_pfn, nr_pages, altmap, want_memblock); } -#ifdef CONFIG_MEMORY_HOTREMOVE void arch_remove_memory(int nid, u64 start, u64 size, struct vmem_altmap *altmap) { @@ -872,7 +871,6 @@ void arch_remove_memory(int nid, u64 start, u64 size, __remove_pages(zone, start_pfn, nr_pages, altmap); } #endif -#endif int kernel_set_to_readonly __read_mostly; diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c index b9e15f25b921..50df7ca142f9 100644 --- a/arch/x86/mm/init_64.c +++ b/arch/x86/mm/init_64.c @@ -1132,7 +1132,6 @@ void __ref vmemmap_free(unsigned long start, unsigned long end, remove_pagetable(start, end, false, altmap); } -#ifdef CONFIG_MEMORY_HOTREMOVE static void __meminit kernel_physical_mapping_remove(unsigned long start, unsigned long end) { @@ -1157,7 +1156,6 @@ void __ref arch_remove_memory(int nid, u64 start, u64 size, __remove_pages(zone, start_pfn, nr_pages, altmap); kernel_physical_mapping_remove(start, start + size); } -#endif #endif /* CONFIG_MEMORY_HOTPLUG */ static struct kcore_list kcore_vsyscall; diff --git a/drivers/base/memory.c b/drivers/base/memory.c index 5b9950d982dc..9a5f81dc32c5 100644 --- a/drivers/base/memory.c +++ b/drivers/base/memory.c @@ -737,7 +737,6 @@ int hotplug_memory_register(int nid, struct mem_section *section) return ret; } -#ifdef CONFIG_MEMORY_HOTREMOVE static void unregister_memory(struct memory_block *memory) { @@ -776,7 +775,6 @@ void unregister_memory_section(struct mem_section *section) out_unlock: mutex_unlock(&mem_sysfs_mutex); } -#endif /* CONFIG_MEMORY_HOTREMOVE */ /* return true if the memory block is offlined, otherwise, return false */ bool is_memblock_offlined(struct memory_block *mem) diff --git a/include/linux/memory.h b/include/linux/memory.h index e1dc1bb2b787..474c7c60c8f2 100644 --- a/include/linux/memory.h +++ b/include/linux/memory.h @@ -112,9 +112,7 @@ extern void unregister_memory_notifier(struct notifier_block *nb); extern int register_memory_isolate_notifier(struct notifier_block *nb); extern void unregister_memory_isolate_notifier(struct notifier_block *nb); int hotplug_memory_register(int nid, struct mem_section *section); -#ifdef CONFIG_MEMORY_HOTREMOVE extern void unregister_memory_section(struct mem_section *); -#endif extern int memory_dev_init(void); extern int memory_notify(unsigned long val, void *v); extern int memory_isolate_notify(unsigned long val, void *v); diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h index 04c40da0228a..5ac58325e8fc 100644 --- a/include/linux/memory_hotplug.h +++ b/include/linux/memory_hotplug.h @@ -108,12 +108,10 @@ static inline bool movable_node_is_enabled(void) return movable_node_enabled; } -#ifdef CONFIG_MEMORY_HOTREMOVE extern void arch_remove_memory(int nid, u64 start, u64 size, struct vmem_altmap *altmap); extern void __remove_pages(struct zone *zone, unsigned long start_pfn, unsigned long nr_pages, struct vmem_altmap *altmap); -#endif /* CONFIG_MEMORY_HOTREMOVE */ /* reasonably generic interface to expand the physical pages */ extern int __add_pages(int nid, unsigned long start_pfn, unsigned long nr_pages, diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c index 7905e3275289..6b5ce0bd907f 100644 --- a/mm/memory_hotplug.c +++ b/mm/memory_hotplug.c @@ -315,7 +315,6 @@ int __ref __add_pages(int nid, unsigned long phys_start_pfn, return err; } -#ifdef CONFIG_MEMORY_HOTREMOVE /* find the smallest valid pfn in the range [start_pfn, end_pfn) */ static unsigned long find_smallest_section_pfn(int nid, struct zone *zone, unsigned long start_pfn, @@ -542,7 +541,6 @@ void __remove_pages(struct zone *zone, unsigned long phys_start_pfn, set_zone_contiguous(zone); } -#endif /* CONFIG_MEMORY_HOTREMOVE */ int set_online_page_callback(online_page_callback_t callback) { diff --git a/mm/sparse.c b/mm/sparse.c index f52e8c328761..6f624ce9252e 100644 --- a/mm/sparse.c +++ b/mm/sparse.c @@ -576,7 +576,6 @@ static void __kfree_section_memmap(struct page *memmap, vmemmap_free(start, end, altmap); } -#ifdef CONFIG_MEMORY_HOTREMOVE static void free_map_bootmem(struct page *memmap) { unsigned long start = (unsigned long)memmap; @@ -584,7 +583,6 @@ static void free_map_bootmem(struct page *memmap) vmemmap_free(start, end, NULL); } -#endif /* CONFIG_MEMORY_HOTREMOVE */ #else static struct page *__kmalloc_section_memmap(void) { @@ -623,7 +621,6 @@ static void __kfree_section_memmap(struct page *memmap, get_order(sizeof(struct page) * PAGES_PER_SECTION)); } -#ifdef CONFIG_MEMORY_HOTREMOVE static void free_map_bootmem(struct page *memmap) { unsigned long maps_section_nr, removing_section_nr, i; @@ -653,7 +650,6 @@ static void free_map_bootmem(struct page *memmap) put_page_bootmem(page); } } -#endif /* CONFIG_MEMORY_HOTREMOVE */ #endif /* CONFIG_SPARSEMEM_VMEMMAP */ /* @@ -712,7 +708,6 @@ int __meminit sparse_add_one_section(int nid, unsigned long start_pfn, return ret; } -#ifdef CONFIG_MEMORY_HOTREMOVE #ifdef CONFIG_MEMORY_FAILURE static void clear_hwpoisoned_pages(struct page *memmap, int nr_pages) { @@ -780,5 +775,4 @@ void sparse_remove_one_section(struct zone *zone, struct mem_section *ms, PAGES_PER_SECTION - map_offset); free_section_usemap(memmap, usemap, altmap); } -#endif /* CONFIG_MEMORY_HOTREMOVE */ #endif /* CONFIG_MEMORY_HOTPLUG */ From patchwork Wed Jan 15 15:33:31 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 233862 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=3.0 tests=DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 91694C33CB1 for ; Wed, 15 Jan 2020 15:34:41 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 56F652053B for ; Wed, 15 Jan 2020 15:34:41 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="JGrQkA+n" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728978AbgAOPel (ORCPT ); Wed, 15 Jan 2020 10:34:41 -0500 Received: from us-smtp-1.mimecast.com ([207.211.31.81]:26621 "EHLO us-smtp-delivery-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1728931AbgAOPek (ORCPT ); Wed, 15 Jan 2020 10:34:40 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1579102479; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=72SdbaJ4/IJRXbpxtNMbuMiXfQLcQVRYkn12Z1ngsTA=; b=JGrQkA+nWW7y1ahFlBDAvSsQDfjVwqGzD/MAOecnMYDOEBNto+/Myzf2FaPPxvO7vsu6Au idisiTrT0srHYDGj9zsAqM3spIL4WWW0+nbBhHyb02xbf0WntRyjCvF8167TgcDdXAodlX awS+xWBpq6uteGOLqZNElVDw8RWV+as= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-241-jKc1ZsUMO3WtFv7zgu8H0A-1; Wed, 15 Jan 2020 10:34:33 -0500 X-MC-Unique: jKc1ZsUMO3WtFv7zgu8H0A-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 8925B9027F5; Wed, 15 Jan 2020 15:34:31 +0000 (UTC) Received: from t480s.redhat.com (unknown [10.36.118.7]) by smtp.corp.redhat.com (Postfix) with ESMTP id 5DFEB86CB2; Wed, 15 Jan 2020 15:34:29 +0000 (UTC) From: David Hildenbrand To: stable@vger.kernel.org Cc: linux-mm@kvack.org, Oscar Salvador , Michal Hocko , "Aneesh Kumar K . V" , Greg Kroah-Hartman , Dan Williams , Andrew Morton , Laurent Vivier , Baoquan He , David Hildenbrand Subject: [PATCH for 4.19-stable 17/25] mm/memory_hotplug: create memory block devices after arch_add_memory() Date: Wed, 15 Jan 2020 16:33:31 +0100 Message-Id: <20200115153339.36409-18-david@redhat.com> In-Reply-To: <20200115153339.36409-1-david@redhat.com> References: <20200115153339.36409-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13 Sender: stable-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org commit db051a0dac13db24d58470d75cee0ce7c6b031a1 upstream. Only memory to be added to the buddy and to be onlined/offlined by user space using /sys/devices/system/memory/... needs (and should have!) memory block devices. Factor out creation of memory block devices. Create all devices after arch_add_memory() succeeded. We can later drop the want_memblock parameter, because it is now effectively stale. Only after memory block devices have been added, memory can be onlined by user space. This implies, that memory is not visible to user space at all before arch_add_memory() succeeded. While at it - use WARN_ON_ONCE instead of BUG_ON in moved unregister_memory() - introduce find_memory_block_by_id() to search via block id - Use find_memory_block_by_id() in init_memory_block() to catch duplicates Link: http://lkml.kernel.org/r/20190527111152.16324-8-david@redhat.com Signed-off-by: David Hildenbrand Reviewed-by: Pavel Tatashin Acked-by: Michal Hocko Cc: Greg Kroah-Hartman Cc: "Rafael J. Wysocki" Cc: David Hildenbrand Cc: "mike.travis@hpe.com" Cc: Ingo Molnar Cc: Andrew Banman Cc: Oscar Salvador Cc: Qian Cai Cc: Wei Yang Cc: Arun KS Cc: Mathieu Malaterre Cc: Alex Deucher Cc: Andy Lutomirski Cc: Anshuman Khandual Cc: Ard Biesheuvel Cc: Baoquan He Cc: Benjamin Herrenschmidt Cc: Borislav Petkov Cc: Catalin Marinas Cc: Chintan Pandya Cc: Christophe Leroy Cc: Chris Wilson Cc: Dan Williams Cc: Dave Hansen Cc: "David S. Miller" Cc: Fenghua Yu Cc: Heiko Carstens Cc: "H. Peter Anvin" Cc: Jonathan Cameron Cc: Joonsoo Kim Cc: Jun Yao Cc: "Kirill A. Shutemov" Cc: Logan Gunthorpe Cc: Mark Brown Cc: Mark Rutland Cc: Masahiro Yamada Cc: Michael Ellerman Cc: Mike Rapoport Cc: Nicholas Piggin Cc: Oscar Salvador Cc: Paul Mackerras Cc: Peter Zijlstra Cc: Rich Felker Cc: Rob Herring Cc: Robin Murphy Cc: Thomas Gleixner Cc: Tony Luck Cc: Vasily Gorbik Cc: Will Deacon Cc: Yoshinori Sato Cc: Yu Zhao Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: David Hildenbrand --- drivers/base/memory.c | 82 +++++++++++++++++++++++++++--------------- include/linux/memory.h | 2 +- mm/memory_hotplug.c | 15 ++++---- 3 files changed, 63 insertions(+), 36 deletions(-) diff --git a/drivers/base/memory.c b/drivers/base/memory.c index f9818d75ac43..b89b9c3efa59 100644 --- a/drivers/base/memory.c +++ b/drivers/base/memory.c @@ -39,6 +39,11 @@ static inline int base_memory_block_id(int section_nr) return section_nr / sections_per_block; } +static inline int pfn_to_block_id(unsigned long pfn) +{ + return base_memory_block_id(pfn_to_section_nr(pfn)); +} + static int memory_subsys_online(struct device *dev); static int memory_subsys_offline(struct device *dev); @@ -591,10 +596,9 @@ int __weak arch_get_memory_phys_device(unsigned long start_pfn) * A reference for the returned object is held and the reference for the * hinted object is released. */ -struct memory_block *find_memory_block_hinted(struct mem_section *section, - struct memory_block *hint) +static struct memory_block *find_memory_block_by_id(int block_id, + struct memory_block *hint) { - int block_id = base_memory_block_id(__section_nr(section)); struct device *hintdev = hint ? &hint->dev : NULL; struct device *dev; @@ -606,6 +610,14 @@ struct memory_block *find_memory_block_hinted(struct mem_section *section, return to_memory_block(dev); } +struct memory_block *find_memory_block_hinted(struct mem_section *section, + struct memory_block *hint) +{ + int block_id = base_memory_block_id(__section_nr(section)); + + return find_memory_block_by_id(block_id, hint); +} + /* * For now, we have a linear search to go find the appropriate * memory_block corresponding to a particular phys_index. If @@ -667,6 +679,11 @@ static int init_memory_block(struct memory_block **memory, int block_id, unsigned long start_pfn; int ret = 0; + mem = find_memory_block_by_id(block_id, NULL); + if (mem) { + put_device(&mem->dev); + return -EEXIST; + } mem = kzalloc(sizeof(*mem), GFP_KERNEL); if (!mem) return -ENOMEM; @@ -704,44 +721,53 @@ static int add_memory_block(int base_section_nr) return 0; } +static void unregister_memory(struct memory_block *memory) +{ + if (WARN_ON_ONCE(memory->dev.bus != &memory_subsys)) + return; + + /* drop the ref. we got via find_memory_block() */ + put_device(&memory->dev); + device_unregister(&memory->dev); +} + /* - * need an interface for the VM to add new memory regions, - * but without onlining it. + * Create memory block devices for the given memory area. Start and size + * have to be aligned to memory block granularity. Memory block devices + * will be initialized as offline. */ -int hotplug_memory_register(int nid, struct mem_section *section) +int create_memory_block_devices(unsigned long start, unsigned long size) { - int block_id = base_memory_block_id(__section_nr(section)); - int ret = 0; + const int start_block_id = pfn_to_block_id(PFN_DOWN(start)); + int end_block_id = pfn_to_block_id(PFN_DOWN(start + size)); struct memory_block *mem; + unsigned long block_id; + int ret = 0; - mutex_lock(&mem_sysfs_mutex); + if (WARN_ON_ONCE(!IS_ALIGNED(start, memory_block_size_bytes()) || + !IS_ALIGNED(size, memory_block_size_bytes()))) + return -EINVAL; - mem = find_memory_block(section); - if (mem) { - mem->section_count++; - put_device(&mem->dev); - } else { + mutex_lock(&mem_sysfs_mutex); + for (block_id = start_block_id; block_id != end_block_id; block_id++) { ret = init_memory_block(&mem, block_id, MEM_OFFLINE); if (ret) - goto out; - mem->section_count++; + break; + mem->section_count = sections_per_block; + } + if (ret) { + end_block_id = block_id; + for (block_id = start_block_id; block_id != end_block_id; + block_id++) { + mem = find_memory_block_by_id(block_id, NULL); + mem->section_count = 0; + unregister_memory(mem); + } } - -out: mutex_unlock(&mem_sysfs_mutex); return ret; } -static void -unregister_memory(struct memory_block *memory) -{ - BUG_ON(memory->dev.bus != &memory_subsys); - - /* drop the ref. we got via find_memory_block() */ - put_device(&memory->dev); - device_unregister(&memory->dev); -} - void unregister_memory_section(struct mem_section *section) { struct memory_block *mem; diff --git a/include/linux/memory.h b/include/linux/memory.h index 474c7c60c8f2..db3e8567f900 100644 --- a/include/linux/memory.h +++ b/include/linux/memory.h @@ -111,7 +111,7 @@ extern int register_memory_notifier(struct notifier_block *nb); extern void unregister_memory_notifier(struct notifier_block *nb); extern int register_memory_isolate_notifier(struct notifier_block *nb); extern void unregister_memory_isolate_notifier(struct notifier_block *nb); -int hotplug_memory_register(int nid, struct mem_section *section); +int create_memory_block_devices(unsigned long start, unsigned long size); extern void unregister_memory_section(struct mem_section *); extern int memory_dev_init(void); extern int memory_notify(unsigned long val, void *v); diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c index 6b5ce0bd907f..414771114685 100644 --- a/mm/memory_hotplug.c +++ b/mm/memory_hotplug.c @@ -256,13 +256,7 @@ static int __meminit __add_section(int nid, unsigned long phys_start_pfn, return -EEXIST; ret = sparse_add_one_section(nid, phys_start_pfn, altmap); - if (ret < 0) - return ret; - - if (!want_memblock) - return 0; - - return hotplug_memory_register(nid, __pfn_to_section(phys_start_pfn)); + return ret < 0 ? ret : 0; } /* @@ -1096,6 +1090,13 @@ int __ref add_memory_resource(int nid, struct resource *res, bool online) if (ret < 0) goto error; + /* create memory block devices after memory was added */ + ret = create_memory_block_devices(start, size); + if (ret) { + arch_remove_memory(nid, start, size, NULL); + goto error; + } + if (new_node) { /* If sysfs file of new node can't be created, cpu on the node * can't be hot-added. There is no rollback way now. From patchwork Wed Jan 15 15:33:33 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 233861 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=3.0 tests=DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 35A29C33CB2 for ; Wed, 15 Jan 2020 15:34:45 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 0DBC42467A for ; Wed, 15 Jan 2020 15:34:45 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="LyY92bwZ" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729021AbgAOPeo (ORCPT ); Wed, 15 Jan 2020 10:34:44 -0500 Received: from us-smtp-delivery-1.mimecast.com ([205.139.110.120]:32079 "EHLO us-smtp-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1728931AbgAOPeo (ORCPT ); Wed, 15 Jan 2020 10:34:44 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1579102482; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=uTtwNHdS2IIpvaTdfYSk+ONIaqPDsbPlM+ndeM14IWo=; b=LyY92bwZwj/TaJrJfFYjz/lBp0RoluLqbXzgZsCExYl4uk7Z0eQMr6O+A2/PGmeGcnTG1d XeI0UULKdqiLLoZX/0Bof+7M8x7qEAl6fUDR3kfNJ0W9+ypZ1NT+MlflHRujzBQMLi4xqA xDcam5rGeCMBKRWf3qUHf5PcLKuHqms= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-54-OI6L9s_0PV-rsLLqQ6hYEw-1; Wed, 15 Jan 2020 10:34:39 -0500 X-MC-Unique: OI6L9s_0PV-rsLLqQ6hYEw-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id F05F811331C8; Wed, 15 Jan 2020 15:34:36 +0000 (UTC) Received: from t480s.redhat.com (unknown [10.36.118.7]) by smtp.corp.redhat.com (Postfix) with ESMTP id 6293486CB2; Wed, 15 Jan 2020 15:34:34 +0000 (UTC) From: David Hildenbrand To: stable@vger.kernel.org Cc: linux-mm@kvack.org, Oscar Salvador , Michal Hocko , "Aneesh Kumar K . V" , Greg Kroah-Hartman , Dan Williams , Andrew Morton , Laurent Vivier , Baoquan He , David Hildenbrand Subject: [PATCH for 4.19-stable 19/25] mm/memory_hotplug: make unregister_memory_block_under_nodes() never fail Date: Wed, 15 Jan 2020 16:33:33 +0100 Message-Id: <20200115153339.36409-20-david@redhat.com> In-Reply-To: <20200115153339.36409-1-david@redhat.com> References: <20200115153339.36409-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13 Sender: stable-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org commit a31b264c2b415b29660da0bc2ba291a98629ce51 upstream. We really don't want anything during memory hotunplug to fail. We always pass a valid memory block device, that check can go. Avoid allocating memory and eventually failing. As we are always called under lock, we can use a static piece of memory. This avoids having to put the structure onto the stack, having to guess about the stack size of callers. Patch inspired by a patch from Oscar Salvador. In the future, there might be no need to iterate over nodes at all. mem->nid should tell us exactly what to remove. Memory block devices with mixed nodes (added during boot) should properly fenced off and never removed. Link: http://lkml.kernel.org/r/20190527111152.16324-11-david@redhat.com Signed-off-by: David Hildenbrand Reviewed-by: Wei Yang Reviewed-by: Oscar Salvador Acked-by: Michal Hocko Cc: Greg Kroah-Hartman Cc: "Rafael J. Wysocki" Cc: Alex Deucher Cc: "David S. Miller" Cc: Mark Brown Cc: Chris Wilson Cc: David Hildenbrand Cc: Jonathan Cameron Cc: Andrew Banman Cc: Andy Lutomirski Cc: Anshuman Khandual Cc: Ard Biesheuvel Cc: Arun KS Cc: Baoquan He Cc: Benjamin Herrenschmidt Cc: Borislav Petkov Cc: Catalin Marinas Cc: Chintan Pandya Cc: Christophe Leroy Cc: Dan Williams Cc: Dave Hansen Cc: Fenghua Yu Cc: Heiko Carstens Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: Joonsoo Kim Cc: Jun Yao Cc: "Kirill A. Shutemov" Cc: Logan Gunthorpe Cc: Mark Rutland Cc: Masahiro Yamada Cc: Mathieu Malaterre Cc: Michael Ellerman Cc: Mike Rapoport Cc: "mike.travis@hpe.com" Cc: Nicholas Piggin Cc: Paul Mackerras Cc: Pavel Tatashin Cc: Peter Zijlstra Cc: Qian Cai Cc: Rich Felker Cc: Rob Herring Cc: Robin Murphy Cc: Thomas Gleixner Cc: Tony Luck Cc: Vasily Gorbik Cc: Will Deacon Cc: Yoshinori Sato Cc: Yu Zhao Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: David Hildenbrand --- drivers/base/node.c | 18 +++++------------- include/linux/node.h | 5 ++--- 2 files changed, 7 insertions(+), 16 deletions(-) diff --git a/drivers/base/node.c b/drivers/base/node.c index 1211794d658c..bdff237f4167 100644 --- a/drivers/base/node.c +++ b/drivers/base/node.c @@ -455,20 +455,14 @@ int register_mem_sect_under_node(struct memory_block *mem_blk, void *arg) /* * Unregister memory block device under all nodes that it spans. + * Has to be called with mem_sysfs_mutex held (due to unlinked_nodes). */ -int unregister_memory_block_under_nodes(struct memory_block *mem_blk) +void unregister_memory_block_under_nodes(struct memory_block *mem_blk) { - NODEMASK_ALLOC(nodemask_t, unlinked_nodes, GFP_KERNEL); unsigned long pfn, sect_start_pfn, sect_end_pfn; + static nodemask_t unlinked_nodes; - if (!mem_blk) { - NODEMASK_FREE(unlinked_nodes); - return -EFAULT; - } - if (!unlinked_nodes) - return -ENOMEM; - nodes_clear(*unlinked_nodes); - + nodes_clear(unlinked_nodes); sect_start_pfn = section_nr_to_pfn(mem_blk->start_section_nr); sect_end_pfn = section_nr_to_pfn(mem_blk->end_section_nr); for (pfn = sect_start_pfn; pfn <= sect_end_pfn; pfn++) { @@ -479,15 +473,13 @@ int unregister_memory_block_under_nodes(struct memory_block *mem_blk) continue; if (!node_online(nid)) continue; - if (node_test_and_set(nid, *unlinked_nodes)) + if (node_test_and_set(nid, unlinked_nodes)) continue; sysfs_remove_link(&node_devices[nid]->dev.kobj, kobject_name(&mem_blk->dev.kobj)); sysfs_remove_link(&mem_blk->dev.kobj, kobject_name(&node_devices[nid]->dev.kobj)); } - NODEMASK_FREE(unlinked_nodes); - return 0; } int link_mem_sections(int nid, unsigned long start_pfn, unsigned long end_pfn) diff --git a/include/linux/node.h b/include/linux/node.h index 9a6db437f2d1..708939bae9aa 100644 --- a/include/linux/node.h +++ b/include/linux/node.h @@ -72,7 +72,7 @@ extern int register_cpu_under_node(unsigned int cpu, unsigned int nid); extern int unregister_cpu_under_node(unsigned int cpu, unsigned int nid); extern int register_mem_sect_under_node(struct memory_block *mem_blk, void *arg); -extern int unregister_memory_block_under_nodes(struct memory_block *mem_blk); +extern void unregister_memory_block_under_nodes(struct memory_block *mem_blk); #ifdef CONFIG_HUGETLBFS extern void register_hugetlbfs_with_node(node_registration_func_t doregister, @@ -104,9 +104,8 @@ static inline int register_mem_sect_under_node(struct memory_block *mem_blk, { return 0; } -static inline int unregister_memory_block_under_nodes(struct memory_block *mem_blk) +static inline void unregister_memory_block_under_nodes(struct memory_block *mem_blk) { - return 0; } static inline void register_hugetlbfs_with_node(node_registration_func_t reg, From patchwork Wed Jan 15 15:33:35 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 233860 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=3.0 tests=DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A718EC33CB2 for ; Wed, 15 Jan 2020 15:34:48 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 7EFD72467A for ; Wed, 15 Jan 2020 15:34:48 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="XwuPz0jn" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729025AbgAOPes (ORCPT ); Wed, 15 Jan 2020 10:34:48 -0500 Received: from us-smtp-2.mimecast.com ([205.139.110.61]:53432 "EHLO us-smtp-delivery-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1728901AbgAOPes (ORCPT ); Wed, 15 Jan 2020 10:34:48 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1579102486; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=A1Dc3gs8JZdhCjkc7OqCqrUxdiJH6iZG5nSEy7h1nY4=; b=XwuPz0jnn7C3SHeUR9Jx6Cqp9bdI9BCW7aUU8AO2kW93Tg+vf7MpqqD26innFtutlEvQQv dxf/nUUz03SWneerdsOiKuzscZ1Scz7LnJ0ptI2QLmhqTUmpo3Ii7WqCL+/t1UpmG0KdBK aLxA//HGEhL6InTdQ3wQM7ef5Bk0sjg= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-384-FQWVYX7pNEaOnXsfG3rlBw-1; Wed, 15 Jan 2020 10:34:43 -0500 X-MC-Unique: FQWVYX7pNEaOnXsfG3rlBw-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id EED271945ECD; Wed, 15 Jan 2020 15:34:41 +0000 (UTC) Received: from t480s.redhat.com (unknown [10.36.118.7]) by smtp.corp.redhat.com (Postfix) with ESMTP id C20CF86CB2; Wed, 15 Jan 2020 15:34:39 +0000 (UTC) From: David Hildenbrand To: stable@vger.kernel.org Cc: linux-mm@kvack.org, Oscar Salvador , Michal Hocko , "Aneesh Kumar K . V" , Greg Kroah-Hartman , Dan Williams , Andrew Morton , Laurent Vivier , Baoquan He , David Hildenbrand Subject: [PATCH for 4.19-stable 21/25] mm/hotplug: kill is_dev_zone() usage in __remove_pages() Date: Wed, 15 Jan 2020 16:33:35 +0100 Message-Id: <20200115153339.36409-22-david@redhat.com> In-Reply-To: <20200115153339.36409-1-david@redhat.com> References: <20200115153339.36409-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13 Sender: stable-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org commit 96da4350000973ef9310a10d077d65bbc017f093 upstream. -- snip -- Minor conflict, keep the altmap check. -- snip -- The zone type check was a leftover from the cleanup that plumbed altmap through the memory hotplug path, i.e. commit da024512a1fa "mm: pass the vmem_altmap to arch_remove_memory and __remove_pages". Link: http://lkml.kernel.org/r/156092352642.979959.6664333788149363039.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams Reviewed-by: David Hildenbrand Reviewed-by: Oscar Salvador Tested-by: Aneesh Kumar K.V [ppc64] Cc: Michal Hocko Cc: Logan Gunthorpe Cc: Pavel Tatashin Cc: Jane Chu Cc: Jeff Moyer Cc: Jérôme Glisse Cc: Jonathan Corbet Cc: Mike Rapoport Cc: Toshi Kani Cc: Vlastimil Babka Cc: Wei Yang Cc: Jason Gunthorpe Cc: Christoph Hellwig Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: David Hildenbrand --- mm/memory_hotplug.c | 7 ++----- 1 file changed, 2 insertions(+), 5 deletions(-) diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c index 74cfa2fbe88e..888fcbdf97cf 100644 --- a/mm/memory_hotplug.c +++ b/mm/memory_hotplug.c @@ -507,11 +507,8 @@ void __remove_pages(struct zone *zone, unsigned long phys_start_pfn, unsigned long map_offset = 0; int sections_to_remove; - /* In the ZONE_DEVICE case device driver owns the memory region */ - if (is_dev_zone(zone)) { - if (altmap) - map_offset = vmem_altmap_offset(altmap); - } + if (altmap) + map_offset = vmem_altmap_offset(altmap); clear_zone_contiguous(zone); From patchwork Wed Jan 15 15:33:37 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 233859 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=3.0 tests=DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 134B8C33CB1 for ; Wed, 15 Jan 2020 15:34:57 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id D20C32053B for ; Wed, 15 Jan 2020 15:34:56 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="h4Riu7c0" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729043AbgAOPe4 (ORCPT ); Wed, 15 Jan 2020 10:34:56 -0500 Received: from us-smtp-2.mimecast.com ([205.139.110.61]:41020 "EHLO us-smtp-delivery-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1728901AbgAOPe4 (ORCPT ); Wed, 15 Jan 2020 10:34:56 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1579102494; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=AaiWNguwKGyKS8brjKreHJ8pKDJrjMQlGhGKUvICRYE=; b=h4Riu7c05783jPdExdfTxjRlQ8d1pZvL4Go5QYkEBYfqb6/ZPPxEjmjtbUmyJHe6vYhqTQ yZPMe/LBSJDn6srbu11VKghhRsBOFkvthzvHzTqDLV/EGVKok59a9kYCZU9TSTwlIeCUvn Yb3H0mMZ9KUsrv/0I5wI1kgQlE+DGnM= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-274-GZJXD9tnOV2oT77X7dwsng-1; Wed, 15 Jan 2020 10:34:50 -0500 X-MC-Unique: GZJXD9tnOV2oT77X7dwsng-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 6783C1971824; Wed, 15 Jan 2020 15:34:49 +0000 (UTC) Received: from t480s.redhat.com (unknown [10.36.118.7]) by smtp.corp.redhat.com (Postfix) with ESMTP id 3A60C86CB2; Wed, 15 Jan 2020 15:34:47 +0000 (UTC) From: David Hildenbrand To: stable@vger.kernel.org Cc: linux-mm@kvack.org, Oscar Salvador , Michal Hocko , "Aneesh Kumar K . V" , Greg Kroah-Hartman , Dan Williams , Andrew Morton , Laurent Vivier , Baoquan He , David Hildenbrand Subject: [PATCH for 4.19-stable 23/25] mm/memunmap: don't access uninitialized memmap in memunmap_pages() Date: Wed, 15 Jan 2020 16:33:37 +0100 Message-Id: <20200115153339.36409-24-david@redhat.com> In-Reply-To: <20200115153339.36409-1-david@redhat.com> References: <20200115153339.36409-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13 Sender: stable-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org commit 77e080e7680e1e615587352f70c87b9e98126d03 upstream. -- snip -- - Missing mm/hmm.c and kernel/memremap.c unification. -- hmm code does not need fixes (no altmap) - Missing 7cc7867fb061 ("mm/devm_memremap_pages: enable sub-section remap") -- snip -- Patch series "mm/memory_hotplug: Shrink zones before removing memory", v6. This series fixes the access of uninitialized memmaps when shrinking zones/nodes and when removing memory. Also, it contains all fixes for crashes that can be triggered when removing certain namespace using memunmap_pages() - ZONE_DEVICE, reported by Aneesh. We stop trying to shrink ZONE_DEVICE, as it's buggy, fixing it would be more involved (we don't have SECTION_IS_ONLINE as an indicator), and shrinking is only of limited use (set_zone_contiguous() cannot detect the ZONE_DEVICE as contiguous). We continue shrinking !ZONE_DEVICE zones, however, I reduced the amount of code to a minimum. Shrinking is especially necessary to keep zone->contiguous set where possible, especially, on memory unplug of DIMMs at zone boundaries. -------------------------------------------------------------------------- Zones are now properly shrunk when offlining memory blocks or when onlining failed. This allows to properly shrink zones on memory unplug even if the separate memory blocks of a DIMM were onlined to different zones or re-onlined to a different zone after offlining. Example: :/# cat /proc/zoneinfo Node 1, zone Movable spanned 0 present 0 managed 0 :/# echo "online_movable" > /sys/devices/system/memory/memory41/state :/# echo "online_movable" > /sys/devices/system/memory/memory43/state :/# cat /proc/zoneinfo Node 1, zone Movable spanned 98304 present 65536 managed 65536 :/# echo 0 > /sys/devices/system/memory/memory43/online :/# cat /proc/zoneinfo Node 1, zone Movable spanned 32768 present 32768 managed 32768 :/# echo 0 > /sys/devices/system/memory/memory41/online :/# cat /proc/zoneinfo Node 1, zone Movable spanned 0 present 0 managed 0 This patch (of 10): With an altmap, the memmap falling into the reserved altmap space are not initialized and, therefore, contain a garbage NID and a garbage zone. Make sure to read the NID/zone from a memmap that was initialized. This fixes a kernel crash that is observed when destroying a namespace: kernel BUG at include/linux/mm.h:1107! cpu 0x1: Vector: 700 (Program Check) at [c000000274087890] pc: c0000000004b9728: memunmap_pages+0x238/0x340 lr: c0000000004b9724: memunmap_pages+0x234/0x340 ... pid = 3669, comm = ndctl kernel BUG at include/linux/mm.h:1107! devm_action_release+0x30/0x50 release_nodes+0x268/0x2d0 device_release_driver_internal+0x174/0x240 unbind_store+0x13c/0x190 drv_attr_store+0x44/0x60 sysfs_kf_write+0x70/0xa0 kernfs_fop_write+0x1ac/0x290 __vfs_write+0x3c/0x70 vfs_write+0xe4/0x200 ksys_write+0x7c/0x140 system_call+0x5c/0x68 The "page_zone(pfn_to_page(pfn)" was introduced by 69324b8f4833 ("mm, devm_memremap_pages: add MEMORY_DEVICE_PRIVATE support"), however, I think we will never have driver reserved memory with MEMORY_DEVICE_PRIVATE (no altmap AFAIKS). [david@redhat.com: minimze code changes, rephrase description] Link: http://lkml.kernel.org/r/20191006085646.5768-2-david@redhat.com Fixes: 2c2a5af6fed2 ("mm, memory_hotplug: add nid parameter to arch_remove_memory") Signed-off-by: Aneesh Kumar K.V Signed-off-by: David Hildenbrand Cc: Dan Williams Cc: Jason Gunthorpe Cc: Logan Gunthorpe Cc: Ira Weiny Cc: Damian Tometzki Cc: Alexander Duyck Cc: Alexander Potapenko Cc: Andy Lutomirski Cc: Anshuman Khandual Cc: Benjamin Herrenschmidt Cc: Borislav Petkov Cc: Catalin Marinas Cc: Christian Borntraeger Cc: Christophe Leroy Cc: Dave Hansen Cc: Fenghua Yu Cc: Gerald Schaefer Cc: Greg Kroah-Hartman Cc: Halil Pasic Cc: Heiko Carstens Cc: "H. Peter Anvin" Cc: Ingo Molnar Cc: Jun Yao Cc: Mark Rutland Cc: Masahiro Yamada Cc: "Matthew Wilcox (Oracle)" Cc: Mel Gorman Cc: Michael Ellerman Cc: Michal Hocko Cc: Mike Rapoport Cc: Oscar Salvador Cc: Pankaj Gupta Cc: Paul Mackerras Cc: Pavel Tatashin Cc: Pavel Tatashin Cc: Peter Zijlstra Cc: Qian Cai Cc: Rich Felker Cc: Robin Murphy Cc: Steve Capper Cc: Thomas Gleixner Cc: Tom Lendacky Cc: Tony Luck Cc: Vasily Gorbik Cc: Vlastimil Babka Cc: Wei Yang Cc: Wei Yang Cc: Will Deacon Cc: Yoshinori Sato Cc: Yu Zhao Cc: [5.0+] Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: David Hildenbrand --- kernel/memremap.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/kernel/memremap.c b/kernel/memremap.c index 2ee2e672d5fc..ba652672a546 100644 --- a/kernel/memremap.c +++ b/kernel/memremap.c @@ -120,6 +120,7 @@ static void devm_memremap_pages_release(void *data) struct device *dev = pgmap->dev; struct resource *res = &pgmap->res; resource_size_t align_start, align_size; + struct page *first_page; unsigned long pfn; int nid; @@ -132,12 +133,14 @@ static void devm_memremap_pages_release(void *data) align_size = ALIGN(res->start + resource_size(res), SECTION_SIZE) - align_start; - nid = page_to_nid(pfn_to_page(align_start >> PAGE_SHIFT)); + /* make sure to access a memmap that was actually initialized */ + first_page = pfn_to_page(pfn_first(pgmap)); + + nid = page_to_nid(first_page); mem_hotplug_begin(); if (pgmap->type == MEMORY_DEVICE_PRIVATE) { - pfn = align_start >> PAGE_SHIFT; - __remove_pages(page_zone(pfn_to_page(pfn)), pfn, + __remove_pages(page_zone(first_page), pfn, align_size >> PAGE_SHIFT, NULL); } else { arch_remove_memory(nid, align_start, align_size, From patchwork Wed Jan 15 15:33:39 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 233858 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=3.0 tests=DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 25682C33CB1 for ; Wed, 15 Jan 2020 15:35:02 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id D7EBD222C3 for ; Wed, 15 Jan 2020 15:35:01 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="fVvbhnFc" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728927AbgAOPfB (ORCPT ); Wed, 15 Jan 2020 10:35:01 -0500 Received: from us-smtp-1.mimecast.com ([207.211.31.81]:51586 "EHLO us-smtp-delivery-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1728901AbgAOPfB (ORCPT ); Wed, 15 Jan 2020 10:35:01 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1579102499; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=PpG5vViRCZYZbKWrAJTCWEh3OmPh5tGhT+HsdicYf+0=; b=fVvbhnFcMLQ3dLkRTmZeM718gbtsolyUvQiyZZyTd8IpU9P0rjPeRd3+bW1/M0t/+MQ0rU vh5UCfTxJhmHtBd0Zj3Zluv339+0brXb4eQYsTUhFDPM/AcNuLg5/BUq3o3/aOxhHAyeFl vdkAf3XKpVO3nQR1r+HBn6MyH/TepCM= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-339-yDs55aXpOkWX00hSEJArnA-1; Wed, 15 Jan 2020 10:34:56 -0500 X-MC-Unique: yDs55aXpOkWX00hSEJArnA-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 6446B197183E; Wed, 15 Jan 2020 15:34:54 +0000 (UTC) Received: from t480s.redhat.com (unknown [10.36.118.7]) by smtp.corp.redhat.com (Postfix) with ESMTP id 3B5F486CB2; Wed, 15 Jan 2020 15:34:52 +0000 (UTC) From: David Hildenbrand To: stable@vger.kernel.org Cc: linux-mm@kvack.org, Oscar Salvador , Michal Hocko , "Aneesh Kumar K . V" , Greg Kroah-Hartman , Dan Williams , Andrew Morton , Laurent Vivier , Baoquan He , David Hildenbrand Subject: [PATCH for 4.19-stable 25/25] mm/memory_hotplug: shrink zones when offlining memory Date: Wed, 15 Jan 2020 16:33:39 +0100 Message-Id: <20200115153339.36409-26-david@redhat.com> In-Reply-To: <20200115153339.36409-1-david@redhat.com> References: <20200115153339.36409-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13 Sender: stable-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org commit feee6b2989165631b17ac6d4ccdbf6759254e85a upstream. -- snip -- - Missing arm64 hot(un)plug support - Missing some vmem_altmap_offset() cleanups - Missing sub-section hotadd support - Missing unification of mm/hmm.c and kernel/memremap.c -- snip -- We currently try to shrink a single zone when removing memory. We use the zone of the first page of the memory we are removing. If that memmap was never initialized (e.g., memory was never onlined), we will read garbage and can trigger kernel BUGs (due to a stale pointer): BUG: unable to handle page fault for address: 000000000000353d #PF: supervisor write access in kernel mode #PF: error_code(0x0002) - not-present page PGD 0 P4D 0 Oops: 0002 [#1] SMP PTI CPU: 1 PID: 7 Comm: kworker/u8:0 Not tainted 5.3.0-rc5-next-20190820+ #317 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.4 Workqueue: kacpi_hotplug acpi_hotplug_work_fn RIP: 0010:clear_zone_contiguous+0x5/0x10 Code: 48 89 c6 48 89 c3 e8 2a fe ff ff 48 85 c0 75 cf 5b 5d c3 c6 85 fd 05 00 00 01 5b 5d c3 0f 1f 840 RSP: 0018:ffffad2400043c98 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 0000000200000000 RCX: 0000000000000000 RDX: 0000000000200000 RSI: 0000000000140000 RDI: 0000000000002f40 RBP: 0000000140000000 R08: 0000000000000000 R09: 0000000000000001 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000140000 R13: 0000000000140000 R14: 0000000000002f40 R15: ffff9e3e7aff3680 FS: 0000000000000000(0000) GS:ffff9e3e7bb00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000000000353d CR3: 0000000058610000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: __remove_pages+0x4b/0x640 arch_remove_memory+0x63/0x8d try_remove_memory+0xdb/0x130 __remove_memory+0xa/0x11 acpi_memory_device_remove+0x70/0x100 acpi_bus_trim+0x55/0x90 acpi_device_hotplug+0x227/0x3a0 acpi_hotplug_work_fn+0x1a/0x30 process_one_work+0x221/0x550 worker_thread+0x50/0x3b0 kthread+0x105/0x140 ret_from_fork+0x3a/0x50 Modules linked in: CR2: 000000000000353d Instead, shrink the zones when offlining memory or when onlining failed. Introduce and use remove_pfn_range_from_zone(() for that. We now properly shrink the zones, even if we have DIMMs whereby - Some memory blocks fall into no zone (never onlined) - Some memory blocks fall into multiple zones (offlined+re-onlined) - Multiple memory blocks that fall into different zones Drop the zone parameter (with a potential dubious value) from __remove_pages() and __remove_section(). Link: http://lkml.kernel.org/r/20191006085646.5768-6-david@redhat.com Fixes: f1dd2cd13c4b ("mm, memory_hotplug: do not associate hotadded memory to zones until online") [visible after d0dc12e86b319] Signed-off-by: David Hildenbrand Reviewed-by: Oscar Salvador Cc: Michal Hocko Cc: "Matthew Wilcox (Oracle)" Cc: "Aneesh Kumar K.V" Cc: Pavel Tatashin Cc: Greg Kroah-Hartman Cc: Dan Williams Cc: Logan Gunthorpe Cc: [5.0+] Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: David Hildenbrand --- arch/ia64/mm/init.c | 4 +--- arch/powerpc/mm/mem.c | 11 +---------- arch/sh/mm/init.c | 4 +--- arch/x86/mm/init_32.c | 4 +--- arch/x86/mm/init_64.c | 8 +------- include/linux/memory_hotplug.h | 7 +++++-- kernel/memremap.c | 3 +-- mm/hmm.c | 4 +--- mm/memory_hotplug.c | 29 ++++++++++++++--------------- 9 files changed, 26 insertions(+), 48 deletions(-) diff --git a/arch/ia64/mm/init.c b/arch/ia64/mm/init.c index 778781e5f22e..79e5cc70f1fd 100644 --- a/arch/ia64/mm/init.c +++ b/arch/ia64/mm/init.c @@ -666,9 +666,7 @@ void arch_remove_memory(int nid, u64 start, u64 size, { unsigned long start_pfn = start >> PAGE_SHIFT; unsigned long nr_pages = size >> PAGE_SHIFT; - struct zone *zone; - zone = page_zone(pfn_to_page(start_pfn)); - __remove_pages(zone, start_pfn, nr_pages, altmap); + __remove_pages(start_pfn, nr_pages, altmap); } #endif diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c index a9c1aaa0c133..4d03925fdfd2 100644 --- a/arch/powerpc/mm/mem.c +++ b/arch/powerpc/mm/mem.c @@ -144,18 +144,9 @@ void __ref arch_remove_memory(int nid, u64 start, u64 size, { unsigned long start_pfn = start >> PAGE_SHIFT; unsigned long nr_pages = size >> PAGE_SHIFT; - struct page *page; int ret; - /* - * If we have an altmap then we need to skip over any reserved PFNs - * when querying the zone. - */ - page = pfn_to_page(start_pfn); - if (altmap) - page += vmem_altmap_offset(altmap); - - __remove_pages(page_zone(page), start_pfn, nr_pages, altmap); + __remove_pages(start_pfn, nr_pages, altmap); /* Remove htab bolted mappings for this section of memory */ start = (unsigned long)__va(start); diff --git a/arch/sh/mm/init.c b/arch/sh/mm/init.c index 0da784ac0e34..47882be91121 100644 --- a/arch/sh/mm/init.c +++ b/arch/sh/mm/init.c @@ -448,9 +448,7 @@ void arch_remove_memory(int nid, u64 start, u64 size, { unsigned long start_pfn = PFN_DOWN(start); unsigned long nr_pages = size >> PAGE_SHIFT; - struct zone *zone; - zone = page_zone(pfn_to_page(start_pfn)); - __remove_pages(zone, start_pfn, nr_pages, altmap); + __remove_pages(start_pfn, nr_pages, altmap); } #endif /* CONFIG_MEMORY_HOTPLUG */ diff --git a/arch/x86/mm/init_32.c b/arch/x86/mm/init_32.c index 64f54f77052b..79b95910fd9f 100644 --- a/arch/x86/mm/init_32.c +++ b/arch/x86/mm/init_32.c @@ -865,10 +865,8 @@ void arch_remove_memory(int nid, u64 start, u64 size, { unsigned long start_pfn = start >> PAGE_SHIFT; unsigned long nr_pages = size >> PAGE_SHIFT; - struct zone *zone; - zone = page_zone(pfn_to_page(start_pfn)); - __remove_pages(zone, start_pfn, nr_pages, altmap); + __remove_pages(start_pfn, nr_pages, altmap); } #endif diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c index 50df7ca142f9..81e85a8dd300 100644 --- a/arch/x86/mm/init_64.c +++ b/arch/x86/mm/init_64.c @@ -1146,14 +1146,8 @@ void __ref arch_remove_memory(int nid, u64 start, u64 size, { unsigned long start_pfn = start >> PAGE_SHIFT; unsigned long nr_pages = size >> PAGE_SHIFT; - struct page *page = pfn_to_page(start_pfn); - struct zone *zone; - /* With altmap the first mapped page is offset from @start */ - if (altmap) - page += vmem_altmap_offset(altmap); - zone = page_zone(page); - __remove_pages(zone, start_pfn, nr_pages, altmap); + __remove_pages(start_pfn, nr_pages, altmap); kernel_physical_mapping_remove(start, start + size); } #endif /* CONFIG_MEMORY_HOTPLUG */ diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h index 26bda048f8a7..d17d45c41a0b 100644 --- a/include/linux/memory_hotplug.h +++ b/include/linux/memory_hotplug.h @@ -110,8 +110,8 @@ static inline bool movable_node_is_enabled(void) extern void arch_remove_memory(int nid, u64 start, u64 size, struct vmem_altmap *altmap); -extern void __remove_pages(struct zone *zone, unsigned long start_pfn, - unsigned long nr_pages, struct vmem_altmap *altmap); +extern void __remove_pages(unsigned long start_pfn, unsigned long nr_pages, + struct vmem_altmap *altmap); /* reasonably generic interface to expand the physical pages */ extern int __add_pages(int nid, unsigned long start_pfn, unsigned long nr_pages, @@ -331,6 +331,9 @@ extern int arch_add_memory(int nid, u64 start, u64 size, struct vmem_altmap *altmap, bool want_memblock); extern void move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn, unsigned long nr_pages, struct vmem_altmap *altmap); +extern void remove_pfn_range_from_zone(struct zone *zone, + unsigned long start_pfn, + unsigned long nr_pages); extern int offline_pages(unsigned long start_pfn, unsigned long nr_pages); extern bool is_memblock_offlined(struct memory_block *mem); extern int sparse_add_one_section(int nid, unsigned long start_pfn, diff --git a/kernel/memremap.c b/kernel/memremap.c index ba652672a546..5a3640a7e629 100644 --- a/kernel/memremap.c +++ b/kernel/memremap.c @@ -140,8 +140,7 @@ static void devm_memremap_pages_release(void *data) mem_hotplug_begin(); if (pgmap->type == MEMORY_DEVICE_PRIVATE) { - __remove_pages(page_zone(first_page), pfn, - align_size >> PAGE_SHIFT, NULL); + __remove_pages(pfn, align_size >> PAGE_SHIFT, NULL); } else { arch_remove_memory(nid, align_start, align_size, pgmap->altmap_valid ? &pgmap->altmap : NULL); diff --git a/mm/hmm.c b/mm/hmm.c index ae1f6ad46d30..c482c07bbab7 100644 --- a/mm/hmm.c +++ b/mm/hmm.c @@ -997,7 +997,6 @@ static void hmm_devmem_release(void *data) struct hmm_devmem *devmem = data; struct resource *resource = devmem->resource; unsigned long start_pfn, npages; - struct zone *zone; struct page *page; int nid; @@ -1006,12 +1005,11 @@ static void hmm_devmem_release(void *data) npages = ALIGN(resource_size(resource), PA_SECTION_SIZE) >> PAGE_SHIFT; page = pfn_to_page(start_pfn); - zone = page_zone(page); nid = page_to_nid(page); mem_hotplug_begin(); if (resource->desc == IORES_DESC_DEVICE_PRIVATE_MEMORY) - __remove_pages(zone, start_pfn, npages, NULL); + __remove_pages(start_pfn, npages, NULL); else arch_remove_memory(nid, start_pfn << PAGE_SHIFT, npages << PAGE_SHIFT, NULL); diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c index 7ab34360920b..abc10dcbc9d5 100644 --- a/mm/memory_hotplug.c +++ b/mm/memory_hotplug.c @@ -449,10 +449,11 @@ static void update_pgdat_span(struct pglist_data *pgdat) pgdat->node_spanned_pages = node_end_pfn - node_start_pfn; } -static void __remove_zone(struct zone *zone, unsigned long start_pfn) +void __ref remove_pfn_range_from_zone(struct zone *zone, + unsigned long start_pfn, + unsigned long nr_pages) { struct pglist_data *pgdat = zone->zone_pgdat; - int nr_pages = PAGES_PER_SECTION; unsigned long flags; #ifdef CONFIG_ZONE_DEVICE @@ -465,14 +466,17 @@ static void __remove_zone(struct zone *zone, unsigned long start_pfn) return; #endif + clear_zone_contiguous(zone); + pgdat_resize_lock(zone->zone_pgdat, &flags); shrink_zone_span(zone, start_pfn, start_pfn + nr_pages); update_pgdat_span(pgdat); pgdat_resize_unlock(zone->zone_pgdat, &flags); + + set_zone_contiguous(zone); } -static void __remove_section(struct zone *zone, struct mem_section *ms, - unsigned long map_offset, +static void __remove_section(struct mem_section *ms, unsigned long map_offset, struct vmem_altmap *altmap) { unsigned long start_pfn; @@ -483,14 +487,12 @@ static void __remove_section(struct zone *zone, struct mem_section *ms, scn_nr = __section_nr(ms); start_pfn = section_nr_to_pfn((unsigned long)scn_nr); - __remove_zone(zone, start_pfn); sparse_remove_one_section(ms, map_offset, altmap); } /** - * __remove_pages() - remove sections of pages from a zone - * @zone: zone from which pages need to be removed + * __remove_pages() - remove sections of pages * @phys_start_pfn: starting pageframe (must be aligned to start of a section) * @nr_pages: number of pages to remove (must be multiple of section size) * @altmap: alternative device page map or %NULL if default memmap is used @@ -500,8 +502,8 @@ static void __remove_section(struct zone *zone, struct mem_section *ms, * sure that pages are marked reserved and zones are adjust properly by * calling offline_pages(). */ -void __remove_pages(struct zone *zone, unsigned long phys_start_pfn, - unsigned long nr_pages, struct vmem_altmap *altmap) +void __remove_pages(unsigned long phys_start_pfn, unsigned long nr_pages, + struct vmem_altmap *altmap) { unsigned long i; unsigned long map_offset = 0; @@ -510,8 +512,6 @@ void __remove_pages(struct zone *zone, unsigned long phys_start_pfn, if (altmap) map_offset = vmem_altmap_offset(altmap); - clear_zone_contiguous(zone); - /* * We can only remove entire sections */ @@ -523,12 +523,9 @@ void __remove_pages(struct zone *zone, unsigned long phys_start_pfn, unsigned long pfn = phys_start_pfn + i*PAGES_PER_SECTION; cond_resched(); - __remove_section(zone, __pfn_to_section(pfn), map_offset, - altmap); + __remove_section(__pfn_to_section(pfn), map_offset, altmap); map_offset = 0; } - - set_zone_contiguous(zone); } int set_online_page_callback(online_page_callback_t callback) @@ -898,6 +895,7 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages, int online_typ (unsigned long long) pfn << PAGE_SHIFT, (((unsigned long long) pfn + nr_pages) << PAGE_SHIFT) - 1); memory_notify(MEM_CANCEL_ONLINE, &arg); + remove_pfn_range_from_zone(zone, pfn, nr_pages); mem_hotplug_done(); return ret; } @@ -1682,6 +1680,7 @@ static int __ref __offline_pages(unsigned long start_pfn, writeback_set_ratelimit(); memory_notify(MEM_OFFLINE, &arg); + remove_pfn_range_from_zone(zone, start_pfn, nr_pages); mem_hotplug_done(); return 0;