From patchwork Tue Sep 8 20:10:06 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 251614 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-9.9 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id E9799C2D0A7 for ; Tue, 8 Sep 2020 20:12:01 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id B122A2145D for ; Tue, 8 Sep 2020 20:12:01 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="Ya3IKFBp" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730734AbgIHULE (ORCPT ); Tue, 8 Sep 2020 16:11:04 -0400 Received: from us-smtp-delivery-124.mimecast.com ([63.128.21.124]:39117 "EHLO us-smtp-delivery-124.mimecast.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1731087AbgIHUKr (ORCPT ); Tue, 8 Sep 2020 16:10:47 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1599595839; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=0L/VNlFzuDB/95oeV+NVSs6BIkKkE+sXG0xqekszlyk=; b=Ya3IKFBp6jBXrCzk0IjdHX+5msWtqI0/qpZadidgVoM1auwkK+0FqitIbRQhAxoB1z7zhF amdxkTOUDxHo/RDUi+MHmDfPBcz4iQNkrFuW+Lr8Wsgt27KH/5Merq4/5GkoFHP9VjF/gZ fKhpIUVJlkjwaUP8erA0IPMP0KC6qZU= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-16-yZ_soSYAPYGW_0mzx33S1Q-1; Tue, 08 Sep 2020 16:10:37 -0400 X-MC-Unique: yZ_soSYAPYGW_0mzx33S1Q-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 0FBF01007465; Tue, 8 Sep 2020 20:10:35 +0000 (UTC) Received: from t480s.redhat.com (ovpn-115-46.ams2.redhat.com [10.36.115.46]) by smtp.corp.redhat.com (Postfix) with ESMTP id 4E53D5D9EF; Tue, 8 Sep 2020 20:10:29 +0000 (UTC) From: David Hildenbrand To: linux-kernel@vger.kernel.org Cc: virtualization@lists.linux-foundation.org, linux-mm@kvack.org, linux-hyperv@vger.kernel.org, xen-devel@lists.xenproject.org, linux-acpi@vger.kernel.org, linux-nvdimm@lists.01.org, linux-s390@vger.kernel.org, Andrew Morton , David Hildenbrand , Michal Hocko , Dan Williams , Jason Gunthorpe , Kees Cook , Ard Biesheuvel , Pankaj Gupta , Baoquan He , Wei Yang Subject: [PATCH v2 1/7] kernel/resource: make release_mem_region_adjustable() never fail Date: Tue, 8 Sep 2020 22:10:06 +0200 Message-Id: <20200908201012.44168-2-david@redhat.com> In-Reply-To: <20200908201012.44168-1-david@redhat.com> References: <20200908201012.44168-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 Sender: linux-acpi-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-acpi@vger.kernel.org Let's make sure splitting a resource on memory hotunplug will never fail. This will become more relevant once we merge selected System RAM resources - then, we'll trigger that case more often on memory hotunplug. In general, this function is already unlikely to fail. When we remove memory, we free up quite a lot of metadata (memmap, page tables, memory block device, etc.). The only reason it could really fail would be when injecting allocation errors. All other error cases inside release_mem_region_adjustable() seem to be sanity checks if the function would be abused in different context - let's add WARN_ON_ONCE() in these cases so we can catch them. Cc: Andrew Morton Cc: Michal Hocko Cc: Dan Williams Cc: Jason Gunthorpe Cc: Kees Cook Cc: Ard Biesheuvel Cc: Pankaj Gupta Cc: Baoquan He Cc: Wei Yang Signed-off-by: David Hildenbrand --- include/linux/ioport.h | 4 ++-- kernel/resource.c | 49 ++++++++++++++++++++++++------------------ mm/memory_hotplug.c | 22 +------------------ 3 files changed, 31 insertions(+), 44 deletions(-) diff --git a/include/linux/ioport.h b/include/linux/ioport.h index 6c2b06fe8beb7..52a91f5fa1a36 100644 --- a/include/linux/ioport.h +++ b/include/linux/ioport.h @@ -248,8 +248,8 @@ extern struct resource * __request_region(struct resource *, extern void __release_region(struct resource *, resource_size_t, resource_size_t); #ifdef CONFIG_MEMORY_HOTREMOVE -extern int release_mem_region_adjustable(struct resource *, resource_size_t, - resource_size_t); +extern void release_mem_region_adjustable(struct resource *, resource_size_t, + resource_size_t); #endif /* Wrappers for managed devices */ diff --git a/kernel/resource.c b/kernel/resource.c index f1175ce93a1d5..36b3552210120 100644 --- a/kernel/resource.c +++ b/kernel/resource.c @@ -1258,21 +1258,28 @@ EXPORT_SYMBOL(__release_region); * assumes that all children remain in the lower address entry for * simplicity. Enhance this logic when necessary. */ -int release_mem_region_adjustable(struct resource *parent, - resource_size_t start, resource_size_t size) +void release_mem_region_adjustable(struct resource *parent, + resource_size_t start, resource_size_t size) { + struct resource *new_res = NULL; + bool alloc_nofail = false; struct resource **p; struct resource *res; - struct resource *new_res; resource_size_t end; - int ret = -EINVAL; end = start + size - 1; - if ((start < parent->start) || (end > parent->end)) - return ret; + if (WARN_ON_ONCE((start < parent->start) || (end > parent->end))) + return; - /* The alloc_resource() result gets checked later */ - new_res = alloc_resource(GFP_KERNEL); + /* + * We free up quite a lot of memory on memory hotunplug (esp., memap), + * just before releasing the region. This is highly unlikely to + * fail - let's play save and make it never fail as the caller cannot + * perform any error handling (e.g., trying to re-add memory will fail + * similarly). + */ +retry: + new_res = alloc_resource(GFP_KERNEL | alloc_nofail ? __GFP_NOFAIL : 0); p = &parent->child; write_lock(&resource_lock); @@ -1298,7 +1305,6 @@ int release_mem_region_adjustable(struct resource *parent, * so if we are dealing with them, let us just back off here. */ if (!(res->flags & IORESOURCE_SYSRAM)) { - ret = 0; break; } @@ -1315,20 +1321,23 @@ int release_mem_region_adjustable(struct resource *parent, /* free the whole entry */ *p = res->sibling; free_resource(res); - ret = 0; } else if (res->start == start && res->end != end) { /* adjust the start */ - ret = __adjust_resource(res, end + 1, - res->end - end); + WARN_ON_ONCE(__adjust_resource(res, end + 1, + res->end - end)); } else if (res->start != start && res->end == end) { /* adjust the end */ - ret = __adjust_resource(res, res->start, - start - res->start); + WARN_ON_ONCE(__adjust_resource(res, res->start, + start - res->start)); } else { - /* split into two entries */ + /* split into two entries - we need a new resource */ if (!new_res) { - ret = -ENOMEM; - break; + new_res = alloc_resource(GFP_ATOMIC); + if (!new_res) { + alloc_nofail = true; + write_unlock(&resource_lock); + goto retry; + } } new_res->name = res->name; new_res->start = end + 1; @@ -1339,9 +1348,8 @@ int release_mem_region_adjustable(struct resource *parent, new_res->sibling = res->sibling; new_res->child = NULL; - ret = __adjust_resource(res, res->start, - start - res->start); - if (ret) + if (WARN_ON_ONCE(__adjust_resource(res, res->start, + start - res->start))) break; res->sibling = new_res; new_res = NULL; @@ -1352,7 +1360,6 @@ int release_mem_region_adjustable(struct resource *parent, write_unlock(&resource_lock); free_resource(new_res); - return ret; } #endif /* CONFIG_MEMORY_HOTREMOVE */ diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c index baded53b9ff92..4c47b68a9f4b5 100644 --- a/mm/memory_hotplug.c +++ b/mm/memory_hotplug.c @@ -1724,26 +1724,6 @@ void try_offline_node(int nid) } EXPORT_SYMBOL(try_offline_node); -static void __release_memory_resource(resource_size_t start, - resource_size_t size) -{ - int ret; - - /* - * When removing memory in the same granularity as it was added, - * this function never fails. It might only fail if resources - * have to be adjusted or split. We'll ignore the error, as - * removing of memory cannot fail. - */ - ret = release_mem_region_adjustable(&iomem_resource, start, size); - if (ret) { - resource_size_t endres = start + size - 1; - - pr_warn("Unable to release resource <%pa-%pa> (%d)\n", - &start, &endres, ret); - } -} - static int __ref try_remove_memory(int nid, u64 start, u64 size) { int rc = 0; @@ -1777,7 +1757,7 @@ static int __ref try_remove_memory(int nid, u64 start, u64 size) memblock_remove(start, size); } - __release_memory_resource(start, size); + release_mem_region_adjustable(&iomem_resource, start, size); try_offline_node(nid); From patchwork Tue Sep 8 20:10:07 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 251617 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-9.9 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 011AFC433E2 for ; Tue, 8 Sep 2020 20:11:05 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id BED8321741 for ; Tue, 8 Sep 2020 20:11:04 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="bYskYPYt" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730415AbgIHULD (ORCPT ); Tue, 8 Sep 2020 16:11:03 -0400 Received: from us-smtp-delivery-1.mimecast.com ([205.139.110.120]:50195 "EHLO us-smtp-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1731733AbgIHUKu (ORCPT ); Tue, 8 Sep 2020 16:10:50 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1599595849; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=zFptVInSDSbB2nDLyh5K9NKtnygiDyhn7CU7AXmV4mo=; b=bYskYPYt9aiVUJ6gIgN0+JVAwt4I7PDsfCXfZjeLMWEZ3eEpZT2ul/+FD5mW0eCFK5xAue ++rAKM+zs6bqILol8FufMBmK3GZX0OY5fG1u46O2Cpmb3qIg8rvcNDmcPe1vh4wmzjY7x3 ROEgzp4sZ4rxFsWpypNTt7Xvx5+NjhE= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-530-LvQ5oUMCNbaDKH3Y7EmEEA-1; Tue, 08 Sep 2020 16:10:47 -0400 X-MC-Unique: LvQ5oUMCNbaDKH3Y7EmEEA-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 409D9839A48; Tue, 8 Sep 2020 20:10:42 +0000 (UTC) Received: from t480s.redhat.com (ovpn-115-46.ams2.redhat.com [10.36.115.46]) by smtp.corp.redhat.com (Postfix) with ESMTP id 6279B5D9E8; Tue, 8 Sep 2020 20:10:35 +0000 (UTC) From: David Hildenbrand To: linux-kernel@vger.kernel.org Cc: virtualization@lists.linux-foundation.org, linux-mm@kvack.org, linux-hyperv@vger.kernel.org, xen-devel@lists.xenproject.org, linux-acpi@vger.kernel.org, linux-nvdimm@lists.01.org, linux-s390@vger.kernel.org, Andrew Morton , David Hildenbrand , Michal Hocko , Dan Williams , Jason Gunthorpe , Kees Cook , Ard Biesheuvel , Pankaj Gupta , Baoquan He , Wei Yang , Eric Biederman , Thomas Gleixner , Greg Kroah-Hartman , kexec@lists.infradead.org Subject: [PATCH v2 2/7] kernel/resource: move and rename IORESOURCE_MEM_DRIVER_MANAGED Date: Tue, 8 Sep 2020 22:10:07 +0200 Message-Id: <20200908201012.44168-3-david@redhat.com> In-Reply-To: <20200908201012.44168-1-david@redhat.com> References: <20200908201012.44168-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 Sender: linux-acpi-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-acpi@vger.kernel.org IORESOURCE_MEM_DRIVER_MANAGED currently uses an unused PnP bit, which is always set to 0 by hardware. This is far from beautiful (and confusing), and the bit only applies to SYSRAM. So let's move it out of the bus-specific (PnP) defined bits. We'll add another SYSRAM specific bit soon. If we ever need more bits for other purposes, we can steal some from "desc", or reshuffle/regroup what we have. Cc: Andrew Morton Cc: Michal Hocko Cc: Dan Williams Cc: Jason Gunthorpe Cc: Kees Cook Cc: Ard Biesheuvel Cc: Pankaj Gupta Cc: Baoquan He Cc: Wei Yang Cc: Eric Biederman Cc: Thomas Gleixner Cc: Greg Kroah-Hartman Cc: kexec@lists.infradead.org Signed-off-by: David Hildenbrand --- include/linux/ioport.h | 4 +++- kernel/kexec_file.c | 2 +- mm/memory_hotplug.c | 4 ++-- 3 files changed, 6 insertions(+), 4 deletions(-) diff --git a/include/linux/ioport.h b/include/linux/ioport.h index 52a91f5fa1a36..d7620d7c941a0 100644 --- a/include/linux/ioport.h +++ b/include/linux/ioport.h @@ -58,6 +58,9 @@ struct resource { #define IORESOURCE_EXT_TYPE_BITS 0x01000000 /* Resource extended types */ #define IORESOURCE_SYSRAM 0x01000000 /* System RAM (modifier) */ +/* IORESOURCE_SYSRAM specific bits. */ +#define IORESOURCE_SYSRAM_DRIVER_MANAGED 0x02000000 /* Always detected via a driver. */ + #define IORESOURCE_EXCLUSIVE 0x08000000 /* Userland may not map this resource */ #define IORESOURCE_DISABLED 0x10000000 @@ -103,7 +106,6 @@ struct resource { #define IORESOURCE_MEM_32BIT (3<<3) #define IORESOURCE_MEM_SHADOWABLE (1<<5) /* dup: IORESOURCE_SHADOWABLE */ #define IORESOURCE_MEM_EXPANSIONROM (1<<6) -#define IORESOURCE_MEM_DRIVER_MANAGED (1<<7) /* PnP I/O specific bits (IORESOURCE_BITS) */ #define IORESOURCE_IO_16BIT_ADDR (1<<0) diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c index ca40bef75a616..dfeeed1aed084 100644 --- a/kernel/kexec_file.c +++ b/kernel/kexec_file.c @@ -520,7 +520,7 @@ static int locate_mem_hole_callback(struct resource *res, void *arg) /* Returning 0 will take to next memory range */ /* Don't use memory that will be detected and handled by a driver. */ - if (res->flags & IORESOURCE_MEM_DRIVER_MANAGED) + if (res->flags & IORESOURCE_SYSRAM_DRIVER_MANAGED) return 0; if (sz < kbuf->memsz) diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c index 4c47b68a9f4b5..8e1cd18b5cf14 100644 --- a/mm/memory_hotplug.c +++ b/mm/memory_hotplug.c @@ -105,7 +105,7 @@ static struct resource *register_memory_resource(u64 start, u64 size, unsigned long flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY; if (strcmp(resource_name, "System RAM")) - flags |= IORESOURCE_MEM_DRIVER_MANAGED; + flags |= IORESOURCE_SYSRAM_DRIVER_MANAGED; /* * Make sure value parsed from 'mem=' only restricts memory adding @@ -1160,7 +1160,7 @@ EXPORT_SYMBOL_GPL(add_memory); * * For this memory, no entries in /sys/firmware/memmap ("raw firmware-provided * memory map") are created. Also, the created memory resource is flagged - * with IORESOURCE_MEM_DRIVER_MANAGED, so in-kernel users can special-case + * with IORESOURCE_SYSRAM_DRIVER_MANAGED, so in-kernel users can special-case * this memory as well (esp., not place kexec images onto it). * * The resource_name (visible via /proc/iomem) has to have the format From patchwork Tue Sep 8 20:10:10 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 251616 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-9.9 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id E213EC2BC11 for ; Tue, 8 Sep 2020 20:11:29 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id B2FE6208FE for ; Tue, 8 Sep 2020 20:11:29 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="P4E9LQ3u" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730523AbgIHUL1 (ORCPT ); Tue, 8 Sep 2020 16:11:27 -0400 Received: from us-smtp-delivery-1.mimecast.com ([207.211.31.120]:21043 "EHLO us-smtp-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1731208AbgIHUL0 (ORCPT ); Tue, 8 Sep 2020 16:11:26 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1599595884; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=mCWjFN28DssVUtSPV502OfUrY2nTqiXbNRNkHsa+cEQ=; b=P4E9LQ3uFro85jFp6R0GvuvynaRGSebYYUDh9dgHC3Ub69SQzesnZP878iJSZFICbyGoR8 fAD8N5SRDVq1HSAbfx1+NSO9mvUrKfdrGKUMLmkKcbpxDNaeLz796nkxuw5FvH5nmVJLEb CQKzfnhhOGQM+W7Au0J0mYlRgSAf8K0= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-357-_qNC6GigMcKYNHpivSKcaw-1; Tue, 08 Sep 2020 16:11:20 -0400 X-MC-Unique: _qNC6GigMcKYNHpivSKcaw-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 3FD241007466; Tue, 8 Sep 2020 20:11:16 +0000 (UTC) Received: from t480s.redhat.com (ovpn-115-46.ams2.redhat.com [10.36.115.46]) by smtp.corp.redhat.com (Postfix) with ESMTP id A2DD55D9E8; Tue, 8 Sep 2020 20:11:04 +0000 (UTC) From: David Hildenbrand To: linux-kernel@vger.kernel.org Cc: virtualization@lists.linux-foundation.org, linux-mm@kvack.org, linux-hyperv@vger.kernel.org, xen-devel@lists.xenproject.org, linux-acpi@vger.kernel.org, linux-nvdimm@lists.01.org, linux-s390@vger.kernel.org, Andrew Morton , David Hildenbrand , Michal Hocko , Dan Williams , "Michael S . Tsirkin" , Jason Wang , Pankaj Gupta , Baoquan He , Wei Yang Subject: [PATCH v2 5/7] virtio-mem: try to merge system ram resources Date: Tue, 8 Sep 2020 22:10:10 +0200 Message-Id: <20200908201012.44168-6-david@redhat.com> In-Reply-To: <20200908201012.44168-1-david@redhat.com> References: <20200908201012.44168-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 Sender: linux-acpi-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-acpi@vger.kernel.org virtio-mem adds memory in memory block granularity, to be able to remove it in the same granularity again later, and to grow slowly on demand. This, however, results in quite a lot of resources when adding a lot of memory. Resources are effectively stored in a list-based tree. Having a lot of resources not only wastes memory, it also makes traversing that tree more expensive, and makes /proc/iomem explode in size (e.g., requiring kexec-tools to manually merge resources later when e.g., trying to create a kdump header). Before this patch, we get (/proc/iomem) when hotplugging 2G via virtio-mem on x86-64: [...] 100000000-13fffffff : System RAM 140000000-33fffffff : virtio0 140000000-147ffffff : System RAM (virtio_mem) 148000000-14fffffff : System RAM (virtio_mem) 150000000-157ffffff : System RAM (virtio_mem) 158000000-15fffffff : System RAM (virtio_mem) 160000000-167ffffff : System RAM (virtio_mem) 168000000-16fffffff : System RAM (virtio_mem) 170000000-177ffffff : System RAM (virtio_mem) 178000000-17fffffff : System RAM (virtio_mem) 180000000-187ffffff : System RAM (virtio_mem) 188000000-18fffffff : System RAM (virtio_mem) 190000000-197ffffff : System RAM (virtio_mem) 198000000-19fffffff : System RAM (virtio_mem) 1a0000000-1a7ffffff : System RAM (virtio_mem) 1a8000000-1afffffff : System RAM (virtio_mem) 1b0000000-1b7ffffff : System RAM (virtio_mem) 1b8000000-1bfffffff : System RAM (virtio_mem) 3280000000-32ffffffff : PCI Bus 0000:00 With this patch, we get (/proc/iomem): [...] fffc0000-ffffffff : Reserved 100000000-13fffffff : System RAM 140000000-33fffffff : virtio0 140000000-1bfffffff : System RAM (virtio_mem) 3280000000-32ffffffff : PCI Bus 0000:00 Of course, with more hotplugged memory, it gets worse. When unplugging memory blocks again, try_remove_memory() (via offline_and_remove_memory()) will properly split the resource up again. Cc: Andrew Morton Cc: Michal Hocko Cc: Dan Williams Cc: Michael S. Tsirkin Cc: Jason Wang Cc: Pankaj Gupta Cc: Baoquan He Cc: Wei Yang Signed-off-by: David Hildenbrand --- drivers/virtio/virtio_mem.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/virtio/virtio_mem.c b/drivers/virtio/virtio_mem.c index 314ab753139d1..ba4de598f6636 100644 --- a/drivers/virtio/virtio_mem.c +++ b/drivers/virtio/virtio_mem.c @@ -424,7 +424,8 @@ static int virtio_mem_mb_add(struct virtio_mem *vm, unsigned long mb_id) dev_dbg(&vm->vdev->dev, "adding memory block: %lu\n", mb_id); return add_memory_driver_managed(nid, addr, memory_block_size_bytes(), - vm->resource_name, 0); + vm->resource_name, + MEMHP_MERGE_RESOURCE); } /* From patchwork Tue Sep 8 20:10:11 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 251615 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-9.9 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A85DAC2D0E4 for ; Tue, 8 Sep 2020 20:11:45 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 7688C21D42 for ; Tue, 8 Sep 2020 20:11:45 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="X/VuyEf1" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1731142AbgIHULg (ORCPT ); Tue, 8 Sep 2020 16:11:36 -0400 Received: from us-smtp-2.mimecast.com ([207.211.31.81]:53438 "EHLO us-smtp-delivery-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1732127AbgIHULb (ORCPT ); Tue, 8 Sep 2020 16:11:31 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1599595890; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=8K1xYXTYUbnlqXd2ZY48Xr9MumdD7TqIS7rIjkX/laU=; b=X/VuyEf12mvS7gHzVPlZj851Rjk4FB9KC2zs7IO3wdIL6wtrEDghFRARb4m2Ja9PHrVhm+ ZvypGzbTPXvxSJw/nMoiZXVo9yLEKiGC6Upj9GAQdxChWYSVAmHmQOT9d2s2ouU1nsMtjZ OgbGnlJm6AY9jh7jxXGBHtKxLVnO34M= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-401-Y-wM1DyKPe22UQB0ydanhg-1; Tue, 08 Sep 2020 16:11:28 -0400 X-MC-Unique: Y-wM1DyKPe22UQB0ydanhg-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id CA4DC85C734; Tue, 8 Sep 2020 20:11:25 +0000 (UTC) Received: from t480s.redhat.com (ovpn-115-46.ams2.redhat.com [10.36.115.46]) by smtp.corp.redhat.com (Postfix) with ESMTP id 942655D9E8; Tue, 8 Sep 2020 20:11:16 +0000 (UTC) From: David Hildenbrand To: linux-kernel@vger.kernel.org Cc: virtualization@lists.linux-foundation.org, linux-mm@kvack.org, linux-hyperv@vger.kernel.org, xen-devel@lists.xenproject.org, linux-acpi@vger.kernel.org, linux-nvdimm@lists.01.org, linux-s390@vger.kernel.org, Andrew Morton , David Hildenbrand , Michal Hocko , Boris Ostrovsky , Juergen Gross , Stefano Stabellini , =?utf-8?q?Roger_Pau_Monn=C3=A9?= , Julien Grall , Pankaj Gupta , Baoquan He , Wei Yang Subject: [PATCH v2 6/7] xen/balloon: try to merge system ram resources Date: Tue, 8 Sep 2020 22:10:11 +0200 Message-Id: <20200908201012.44168-7-david@redhat.com> In-Reply-To: <20200908201012.44168-1-david@redhat.com> References: <20200908201012.44168-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 Sender: linux-acpi-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-acpi@vger.kernel.org Let's try to merge system ram resources we add, to minimize the number of resources in /proc/iomem. We don't care about the boundaries of individual chunks we added. Cc: Andrew Morton Cc: Michal Hocko Cc: Boris Ostrovsky Cc: Juergen Gross Cc: Stefano Stabellini Cc: Roger Pau Monné Cc: Julien Grall Cc: Pankaj Gupta Cc: Baoquan He Cc: Wei Yang Signed-off-by: David Hildenbrand Reviewed-by: Juergen Gross --- drivers/xen/balloon.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c index 7bac38764513d..b57b2067ecbfb 100644 --- a/drivers/xen/balloon.c +++ b/drivers/xen/balloon.c @@ -331,7 +331,7 @@ static enum bp_state reserve_additional_memory(void) mutex_unlock(&balloon_mutex); /* add_memory_resource() requires the device_hotplug lock */ lock_device_hotplug(); - rc = add_memory_resource(nid, resource, 0); + rc = add_memory_resource(nid, resource, MEMHP_MERGE_RESOURCE); unlock_device_hotplug(); mutex_lock(&balloon_mutex);