From patchwork Sun Mar 22 04:39:43 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andrew Morton
X-Patchwork-Id: 228910
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-3.8 required=3.0 tests=DKIMWL_WL_HIGH, DKIM_SIGNED,
	DKIM_VALID, HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,
	SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS autolearn=no autolearn_force=no
	version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 26C6FC4332E
	for ; Sun, 22 Mar 2020 04:49:17 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by mail.kernel.org (Postfix) with ESMTP id D8369206F9
	for ; Sun, 22 Mar 2020 04:49:16 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org;
	s=default; t=1584852556;
	bh=+XT11jeDv1p40IvxgewUHNGU1dbfEDAW2ShYPS7T1NM=;
	h=Date:From:To:Subject:In-Reply-To:List-ID:From;
	b=aQ/zBPHCLEGML0szL2DE7DC7ScYDDgXuW31aJpusoRE6SMyYDE/uwQ3GaCOvASw/H
	 b9HD69M+dc45lcRag0UOlgHZwNKKrrP018JJ5D27XTdneG82Q7ZU0zgP2PZuXZEMao
	 WMAO/HNTzII75FX/Dy/GnyCeEg0WCCXRVQTwQyI8=
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1725825AbgCVEtP (ORCPT ); Sun, 22 Mar 2020 00:49:15 -0400
Received: from mail.kernel.org ([198.145.29.99]:34496 "EHLO mail.kernel.org"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1725765AbgCVEtO (ORCPT ); Sun, 22 Mar 2020 00:49:14 -0400
Received: from localhost.localdomain (c-73-231-172-41.hsd1.ca.comcast.net [73.231.172.41])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by mail.kernel.org (Postfix) with ESMTPSA id C9F46206F9;
	Sun, 22 Mar 2020 04:39:43 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org;
	s=default; t=1584851984;
	bh=+XT11jeDv1p40IvxgewUHNGU1dbfEDAW2ShYPS7T1NM=;
	h=Date:From:To:Subject:In-Reply-To:From;
	b=1hAISbPPVyKb42pUBTWTbyH7exhT5RkABtL0Qe/1vrBEn8N9BmuHZWSeKKCRtAeoR
	 91IkRZXSC4VR/h5M4Aj4reOVhh9kr/+WPc5f7jofxtEtkNgsfW2DadNC6loz8rZwWL
	 aqfUUXFM7E5CbBk43WSD8nvBPTab3vgLrx9MipCY=
Date: Sat, 21 Mar 2020 21:39:43 -0700
From: Andrew Morton
To: arnd@arndb.de, ebiggers@google.com, glider@google.com,
	gregkh@linuxfoundation.org, keescook@chromium.org,
	mm-commits@vger.kernel.org, rafael@kernel.org, stable@vger.kernel.org,
	viro@zeniv.linux.org.uk
Subject: + libfs-fix-infoleak-in-simple_attr_read.patch added to -mm tree
Message-ID: <20200322043943.GCnF6HIYE%akpm@linux-foundation.org>
In-Reply-To: <20200321181954.c0564dfd5514cd742b534884@linux-foundation.org>
User-Agent: s-nail v14.8.16
Sender: stable-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: stable@vger.kernel.org

The patch titled
     Subject: libfs: fix infoleak in simple_attr_read()
has been added to the -mm tree.  Its filename is
     libfs-fix-infoleak-in-simple_attr_read.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/libfs-fix-infoleak-in-simple_attr_read.patch
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/libfs-fix-infoleak-in-simple_attr_read.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Eric Biggers
Subject: libfs: fix infoleak in simple_attr_read()

Reading from a debugfs file at a nonzero position, without first reading
at position 0, leaks uninitialized memory to userspace.

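As a rough user-space model of why this happens (a sketch, not kernel code):
the names toy_attr, toy_open, toy_write and toy_read are hypothetical, the
two adjacent 24-byte buffers are an assumption meant to mimic get_buf and
set_buf in struct simple_attr, and 0x5a stands in for slab poison.  The
"patched" flag selects between the old condition and the one in the diff
further down.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define TOY_BUF_LEN 24

/*
 * Hypothetical stand-in for struct simple_attr. get_buf is mem[0..23] and
 * set_buf is mem[24..47]; they share one array so that scanning past the
 * end of get_buf stays inside the same object.
 */
struct toy_attr {
	char mem[2 * TOY_BUF_LEN];
	unsigned long long value;
};

/* Model of open(): either poisoned (kmalloc-like) or zeroed (kzalloc-like). */
static void toy_open(struct toy_attr *attr, int zeroed)
{
	memset(attr->mem, zeroed ? 0 : 0x5a, sizeof(attr->mem));
	attr->value = 4096;
}

/* Model of a one-byte splice() write: fills set_buf, returns new position. */
static size_t toy_write(struct toy_attr *attr, char c)
{
	attr->mem[TOY_BUF_LEN] = c;
	attr->mem[TOY_BUF_LEN + 1] = '\0';
	return 1;
}

/*
 * Model of simple_attr_read(). patched == 0 treats any nonzero position as
 * a continued read; patched == 1 also requires get_buf to be non-empty.
 */
static size_t toy_read(struct toy_attr *attr, size_t pos, int patched,
		       char *out, size_t len)
{
	size_t size;

	assert(attr && out);
	if (pos && (!patched || attr->mem[0])) {
		/* continued read: trust whatever is in get_buf */
		const char *nul = memchr(attr->mem, '\0', sizeof(attr->mem));

		size = nul ? (size_t)(nul - attr->mem) : sizeof(attr->mem);
	} else {
		/* first read: format the value into get_buf */
		size = (size_t)snprintf(attr->mem, TOY_BUF_LEN, "%llu\n",
					attr->value);
	}

	if (pos >= size)
		return 0;
	if (len > size - pos)
		len = size - pos;
	memcpy(out, attr->mem + pos, len);
	return len;
}
```

In this model, a poisoned buffer read at position 1 with the old condition
yields 23 bytes of 0x5a followed by '0', while a zeroed buffer read with the
patched condition yields only the tail of the formatted value.
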
It's a bit tricky to do this, since lseek() and pread() aren't allowed on
these files, and write() doesn't update the position on them.  But writing
to them with splice() *does* update the position:

	#define _GNU_SOURCE 1
	#include <fcntl.h>
	#include <stdio.h>
	#include <unistd.h>
	int main()
	{
		int pipes[2], fd, n, i;
		char buf[32];

		pipe(pipes);
		write(pipes[1], "0", 1);
		fd = open("/sys/kernel/debug/fault_around_bytes", O_RDWR);
		splice(pipes[0], NULL, fd, NULL, 1, 0);
		n = read(fd, buf, sizeof(buf));
		for (i = 0; i < n; i++)
			printf("%02x", buf[i]);
		printf("\n");
	}

Output:
	5a5a5a5a5a5a5a5a5a5a5a5a5a5a5a5a5a5a5a5a5a5a5a30

Fix the infoleak by making simple_attr_read() always fill
simple_attr::get_buf if it hasn't been filled yet.

Link: http://lkml.kernel.org/r/20200308023849.988264-1-ebiggers@kernel.org
Fixes: acaefc25d21f ("[PATCH] libfs: add simple attribute files")
Signed-off-by: Eric Biggers
Reported-by: syzbot+fcab69d1ada3e8d6f06b@syzkaller.appspotmail.com
Reported-by: Alexander Potapenko
Cc: Al Viro
Cc: Arnd Bergmann
Cc: Greg Kroah-Hartman
Cc: "Rafael J. Wysocki"
Cc: Kees Cook
Cc:
Signed-off-by: Andrew Morton
---

 fs/libfs.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

--- a/fs/libfs.c~libfs-fix-infoleak-in-simple_attr_read
+++ a/fs/libfs.c
@@ -891,7 +891,7 @@ int simple_attr_open(struct inode *inode
 {
 	struct simple_attr *attr;
 
-	attr = kmalloc(sizeof(*attr), GFP_KERNEL);
+	attr = kzalloc(sizeof(*attr), GFP_KERNEL);
 	if (!attr)
 		return -ENOMEM;
 
@@ -931,9 +931,11 @@ ssize_t simple_attr_read(struct file *fi
 	if (ret)
 		return ret;
 
-	if (*ppos) {		/* continued read */
+	if (*ppos && attr->get_buf[0]) {
+		/* continued read */
 		size = strlen(attr->get_buf);
-	} else {		/* first read */
+	} else {
+		/* first read */
 		u64 val;
 		ret = attr->get(attr->data, &val);
 		if (ret)

From patchwork Sun Mar 22 01:22:13 2020
X-Patchwork-Submitter: Andrew Morton
X-Patchwork-Id: 228914
Date: Sat, 21 Mar 2020 18:22:13 -0700
From: Andrew Morton
To: akpm@linux-foundation.org, bhe@redhat.com, david@redhat.com,
	linux-mm@kvack.org, mhocko@suse.com, mm-commits@vger.kernel.org,
	osalvador@suse.de, pankaj.gupta.linux@gmail.com,
	richardw.yang@linux.intel.com, rppt@linux.ibm.com,
	stable@vger.kernel.org, torvalds@linux-foundation.org
Subject: [patch 02/10] mm/hotplug: fix hot remove failure in SPARSEMEM|!VMEMMAP case
Message-ID: <20200322012213.bAcH5rYaA%akpm@linux-foundation.org>
In-Reply-To: <20200321181954.c0564dfd5514cd742b534884@linux-foundation.org>

From: Baoquan He
Subject: mm/hotplug: fix hot remove failure in SPARSEMEM|!VMEMMAP case

In section_deactivate(), pfn_to_page() doesn't work any more after
ms->section_mem_map is reset to NULL in the SPARSEMEM|!VMEMMAP case.  It
caused a hot remove failure:

kernel BUG at mm/page_alloc.c:4806!
invalid opcode: 0000 [#1] SMP PTI
CPU: 3 PID: 8 Comm: kworker/u16:0 Tainted: G W 5.5.0-next-20200205+ #340
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
Workqueue: kacpi_hotplug acpi_hotplug_work_fn
RIP: 0010:free_pages+0x85/0xa0
Call Trace:
 __remove_pages+0x99/0xc0
 arch_remove_memory+0x23/0x4d
 try_remove_memory+0xc8/0x130
 ? walk_memory_blocks+0x72/0xa0
 __remove_memory+0xa/0x11
 acpi_memory_device_remove+0x72/0x100
 acpi_bus_trim+0x55/0x90
 acpi_device_hotplug+0x2eb/0x3d0
 acpi_hotplug_work_fn+0x1a/0x30
 process_one_work+0x1a7/0x370
 worker_thread+0x30/0x380
 ? flush_rcu_work+0x30/0x30
 kthread+0x112/0x130
 ? kthread_create_on_node+0x60/0x60
 ret_from_fork+0x35/0x40

Let's move the ->section_mem_map resetting after
depopulate_section_memmap() to fix it.

[akpm@linux-foundation.org: remove unneeded initialization, per David]
Link: http://lkml.kernel.org/r/20200307084229.28251-2-bhe@redhat.com
Fixes: ba72b4c8cf60 ("mm/sparsemem: support sub-section hotplug")
Signed-off-by: Baoquan He
Acked-by: Michal Hocko
Reviewed-by: Pankaj Gupta
Reviewed-by: David Hildenbrand
Cc: Wei Yang
Cc: Oscar Salvador
Cc: Mike Rapoport
Cc:
Signed-off-by: Andrew Morton
---

 mm/sparse.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

--- a/mm/sparse.c~mm-hotplug-fix-hot-remove-failure-in-sparsememvmemmap-case
+++ a/mm/sparse.c
@@ -734,6 +734,7 @@ static void section_deactivate(unsigned
 	struct mem_section *ms = __pfn_to_section(pfn);
 	bool section_is_early = early_section(ms);
 	struct page *memmap = NULL;
+	bool empty;
 	unsigned long *subsection_map = ms->usage
 		? &ms->usage->subsection_map[0] : NULL;
 
@@ -764,7 +765,8 @@ static void section_deactivate(unsigned
 	 * For 2/ and 3/ the SPARSEMEM_VMEMMAP={y,n} cases are unified
 	 */
 	bitmap_xor(subsection_map, map, subsection_map, SUBSECTIONS_PER_SECTION);
-	if (bitmap_empty(subsection_map, SUBSECTIONS_PER_SECTION)) {
+	empty = bitmap_empty(subsection_map, SUBSECTIONS_PER_SECTION);
+	if (empty) {
 		unsigned long section_nr = pfn_to_section_nr(pfn);
 
 		/*
@@ -779,13 +781,15 @@ static void section_deactivate(unsigned
 			ms->usage = NULL;
 		}
 		memmap = sparse_decode_mem_map(ms->section_mem_map, section_nr);
-		ms->section_mem_map = (unsigned long)NULL;
 	}
 
 	if (section_is_early && memmap)
 		free_map_bootmem(memmap);
 	else
 		depopulate_section_memmap(pfn, nr_pages, altmap);
+
+	if (empty)
+		ms->section_mem_map = (unsigned long)NULL;
 }
 
 static struct page * __meminit section_activate(int nid, unsigned long pfn,

From patchwork Sun Mar 22 01:22:20 2020
X-Patchwork-Submitter: Andrew Morton
X-Patchwork-Id: 228913
Date: Sat, 21 Mar 2020 18:22:20 -0700
From: Andrew Morton
To: akpm@linux-foundation.org, chris@chrisdown.name, guro@fb.com,
	hannes@cmpxchg.org, linux-mm@kvack.org, mhocko@kernel.org,
	mm-commits@vger.kernel.org, natechancellor@gmail.com,
	stable@vger.kernel.org, tj@kernel.org, torvalds@linux-foundation.org
Subject: [patch 04/10] mm, memcg: fix corruption on 64-bit divisor in memory.high throttling
Message-ID: <20200322012220.Y-EvF7Agt%akpm@linux-foundation.org>
In-Reply-To: <20200321181954.c0564dfd5514cd742b534884@linux-foundation.org>

From: Chris Down
Subject: mm, memcg: fix corruption on 64-bit divisor in memory.high throttling

0e4b01df8659 had a bunch of fixups to use the right division method.
However, it seems that after all that it still wasn't right -- div_u64
takes a 32-bit divisor.

The headroom is still large (2^32 pages), so on mundane systems you won't
hit this, but this should definitely be fixed.

Link: http://lkml.kernel.org/r/80780887060514967d414b3cd91f9a316a16ab98.1584036142.git.chris@chrisdown.name
Fixes: 0e4b01df8659 ("mm, memcg: throttle allocators when failing reclaim over memory.high")
Signed-off-by: Chris Down
Reported-by: Johannes Weiner
Acked-by: Johannes Weiner
Cc: Tejun Heo
Cc: Roman Gushchin
Cc: Michal Hocko
Cc: Nathan Chancellor
Cc: [5.4.x+]
Signed-off-by: Andrew Morton
---

 mm/memcontrol.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/mm/memcontrol.c~mm-memcg-fix-corruption-on-64-bit-divisor-in-memoryhigh-throttling
+++ a/mm/memcontrol.c
@@ -2339,7 +2339,7 @@ void mem_cgroup_handle_over_high(void)
 	 */
 	clamped_high = max(high, 1UL);
 
-	overage = div_u64((u64)(usage - high) << MEMCG_DELAY_PRECISION_SHIFT,
+	overage = div64_u64((u64)(usage - high) << MEMCG_DELAY_PRECISION_SHIFT,
 			  clamped_high);
 
 	penalty_jiffies = ((u64)overage * overage * HZ)

From patchwork Sun Mar 22 01:22:26 2020
X-Patchwork-Submitter: Andrew Morton
X-Patchwork-Id: 228912
Date: Sat, 21 Mar 2020 18:22:26 -0700
From: Andrew Morton
To: akpm@linux-foundation.org, dancol@google.com, dave.hansen@intel.com,
	jannh@google.com, joel@joelfernandes.org, linux-mm@kvack.org,
	mhocko@suse.com, minchan@kernel.org, mm-commits@vger.kernel.org,
	stable@vger.kernel.org, torvalds@linux-foundation.org, vbabka@suse.cz
Subject: [patch 06/10] mm: do not allow MADV_PAGEOUT for CoW pages
Message-ID: <20200322012226.yY5JKgG__%akpm@linux-foundation.org>
In-Reply-To: <20200321181954.c0564dfd5514cd742b534884@linux-foundation.org>

From: Michal Hocko
Subject: mm: do not allow MADV_PAGEOUT for CoW pages

Jann has brought up a very interesting point [1].  While shared pages are
excluded from MADV_PAGEOUT normally, CoW pages can be easily reclaimed
that way.  This can lead to all sorts of hard-to-debug problems, e.g. the
performance problems outlined by Daniel [2].

There are runtime environments where a substantial amount of memory is
shared among security domains via CoW memory, and an easy way to reclaim
that memory, which MADV_{COLD,PAGEOUT} offers, can lead to either
performance degradation for the parent process, which might be more
privileged, or even open side channel attacks.

The feasibility of the latter is not really clear to me TBH, but there is
no real reason for exposure at this stage.  It seems there is no real use
case that depends on reclaiming CoW memory via madvise at this stage, so
it is much easier to simply disallow it, and this is what this patch does.
Put simply, MADV_{PAGEOUT,COLD} can operate only on exclusively owned
memory, which is a straightforward semantic.

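The "exclusively owned" rule can be modeled in user space roughly as
follows.  This is only a sketch: struct toy_page and toy_madvise_pageout()
are hypothetical names, and the mapcount field merely stands in for what
page_mapcount() reports in the kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical page descriptor: only the fields this sketch needs. */
struct toy_page {
	int mapcount;	/* number of mappings of this page */
	int paged_out;	/* set once the page has been reclaimed */
};

/*
 * Walk a range and reclaim only pages that are mapped exactly once,
 * mirroring the page_mapcount(page) != 1 checks in the diff. Returns the
 * number of pages reclaimed.
 */
static size_t toy_madvise_pageout(struct toy_page *pages, size_t nr)
{
	size_t i, reclaimed = 0;

	assert(pages || nr == 0);
	for (i = 0; i < nr; i++) {
		/* Do not interfere with other mappings of this page */
		if (pages[i].mapcount != 1)
			continue;
		pages[i].paged_out = 1;
		reclaimed++;
	}
	return reclaimed;
}
```

A page shared via CoW has a mapcount above 1 in this model, so it is
skipped and stays resident for the other process.
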
[1] http://lkml.kernel.org/r/CAG48ez0G3JkMq61gUmyQAaCq=_TwHbi1XKzWRooxZkv08PQKuw@mail.gmail.com
[2] http://lkml.kernel.org/r/CAKOZueua_v8jHCpmEtTB6f3i9e2YnmX4mqdYVWhV4E=Z-n+zRQ@mail.gmail.com

Link: http://lkml.kernel.org/r/20200312082248.GS23944@dhcp22.suse.cz
Fixes: 9c276cc65a58 ("mm: introduce MADV_COLD")
Signed-off-by: Michal Hocko
Reported-by: Jann Horn
Acked-by: Vlastimil Babka
Cc: Minchan Kim
Cc: Daniel Colascione
Cc: Dave Hansen
Cc: "Joel Fernandes (Google)"
Cc:
Signed-off-by: Andrew Morton
---

 mm/madvise.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

--- a/mm/madvise.c~mm-do-not-allow-madv_pageout-for-cow-pages
+++ a/mm/madvise.c
@@ -335,12 +335,14 @@ static int madvise_cold_or_pageout_pte_r
 		}
 
 		page = pmd_page(orig_pmd);
+
+		/* Do not interfere with other mappings of this page */
+		if (page_mapcount(page) != 1)
+			goto huge_unlock;
+
 		if (next - addr != HPAGE_PMD_SIZE) {
 			int err;
 
-			if (page_mapcount(page) != 1)
-				goto huge_unlock;
-
 			get_page(page);
 			spin_unlock(ptl);
 			lock_page(page);
@@ -426,6 +428,10 @@ regular_page:
 			continue;
 		}
 
+		/* Do not interfere with other mappings of this page */
+		if (page_mapcount(page) != 1)
+			continue;
+
 		VM_BUG_ON_PAGE(PageTransCompound(page), page);
 
 		if (pte_young(ptent)) {

From patchwork Sun Mar 22 01:22:37 2020
X-Patchwork-Submitter: Andrew Morton
X-Patchwork-Id: 228911
Date: Sat, 21 Mar 2020 18:22:37 -0700
From: Andrew Morton
To: akpm@linux-foundation.org, bharata@linux.ibm.com, cl@linux.com,
	iamjoonsoo.kim@lge.com, ktkhai@virtuozzo.com, linux-mm@kvack.org,
	mgorman@techsingularity.net, mhocko@kernel.org,
	mm-commits@vger.kernel.org, mpe@ellerman.id.au, nathanl@linux.ibm.com,
	penberg@kernel.org, puvichakravarthy@in.ibm.com, rientjes@google.com,
	sachinp@linux.vnet.ibm.com, srikar@linux.vnet.ibm.com,
	stable@vger.kernel.org, torvalds@linux-foundation.org, vbabka@suse.cz
Subject: [patch 09/10] mm, slub: prevent kmalloc_node crashes and memory leaks
Message-ID: <20200322012237.-UR_Vp0tW%akpm@linux-foundation.org>
In-Reply-To: <20200321181954.c0564dfd5514cd742b534884@linux-foundation.org>

From: Vlastimil Babka
Subject: mm, slub: prevent kmalloc_node crashes and memory leaks

Sachin reports [1] a crash in SLUB __slab_alloc():

BUG: Kernel NULL pointer dereference on read at 0x000073b0
Faulting instruction address: 0xc0000000003d55f4
Oops: Kernel access of bad area, sig: 11 [#1]
LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
Modules linked in:
CPU: 19 PID: 1 Comm: systemd Not tainted 5.6.0-rc2-next-20200218-autotest #1
NIP: c0000000003d55f4 LR: c0000000003d5b94 CTR: 0000000000000000
REGS: c0000008b37836d0 TRAP: 0300 Not tainted (5.6.0-rc2-next-20200218-autotest)
MSR: 8000000000009033 CR: 24004844 XER: 00000000
CFAR: c00000000000dec4 DAR: 00000000000073b0 DSISR: 40000000 IRQMASK: 1
GPR00: c0000000003d5b94 c0000008b3783960 c00000000155d400 c0000008b301f500
GPR04: 0000000000000dc0 0000000000000002 c0000000003443d8 c0000008bb398620
GPR08: 00000008ba2f0000 0000000000000001 0000000000000000 0000000000000000
GPR12: 0000000024004844 c00000001ec52a00 0000000000000000 0000000000000000
GPR16: c0000008a1b20048 c000000001595898 c000000001750c18 0000000000000002
GPR20: c000000001750c28 c000000001624470 0000000fffffffe0 5deadbeef0000122
GPR24: 0000000000000001 0000000000000dc0 0000000000000002 c0000000003443d8
GPR28: c0000008b301f500 c0000008bb398620 0000000000000000 c00c000002287180
NIP [c0000000003d55f4] ___slab_alloc+0x1f4/0x760
LR [c0000000003d5b94] __slab_alloc+0x34/0x60
Call Trace:
[c0000008b3783960] [c0000000003d5734] ___slab_alloc+0x334/0x760 (unreliable)
[c0000008b3783a40] [c0000000003d5b94] __slab_alloc+0x34/0x60
[c0000008b3783a70] [c0000000003d6fa0] __kmalloc_node+0x110/0x490
[c0000008b3783af0] [c0000000003443d8] kvmalloc_node+0x58/0x110
[c0000008b3783b30] [c0000000003fee38] mem_cgroup_css_online+0x108/0x270
[c0000008b3783b90] [c000000000235aa8] online_css+0x48/0xd0
[c0000008b3783bc0] [c00000000023eaec] cgroup_apply_control_enable+0x2ec/0x4d0
[c0000008b3783ca0] [c000000000242318] cgroup_mkdir+0x228/0x5f0
[c0000008b3783d10] [c00000000051e170] kernfs_iop_mkdir+0x90/0xf0
[c0000008b3783d50] [c00000000043dc00] vfs_mkdir+0x110/0x230
[c0000008b3783da0] [c000000000441c90] do_mkdirat+0xb0/0x1a0
[c0000008b3783e20] [c00000000000b278] system_call+0x5c/0x68

This is a PowerPC platform with the following NUMA topology:

available: 2 nodes (0-1)
node 0 cpus:
node 0 size: 0 MB
node 0 free: 0 MB
node 1 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
node 1 size: 35247 MB
node 1 free: 30907 MB
node distances:
node   0   1
  0:  10  40
  1:  40  10

possible numa nodes: 0-31

This only happens with a mmotm patch "mm/memcontrol.c: allocate
shrinker_map on appropriate NUMA node" [2] which effectively calls
kmalloc_node for each possible node.  SLUB however only allocates
kmem_cache_node on online N_NORMAL_MEMORY nodes, and relies on
node_to_mem_node to return such a valid node for other nodes since commit
a561ce00b09e ("slub: fall back to node_to_mem_node() node if allocating on
memoryless node").  This is however not true in this configuration where
the _node_numa_mem_ array is not initialized for nodes 0 and 2-31, thus it
contains zeroes and get_partial() ends up accessing non-allocated
kmem_cache_node.

A related issue was reported by Bharata (originally by Ramachandran) [3]
where a similar PowerPC configuration, but with mainline kernel without
patch [2], ends up allocating large amounts of pages by kmalloc-1k and
kmalloc-512.  This seems to have the same underlying issue with
node_to_mem_node() not behaving as expected, and might probably also lead
to an infinite loop with CONFIG_SLUB_CPU_PARTIAL [4].

This patch should fix both issues by not relying on node_to_mem_node()
anymore and instead simply falling back to NUMA_NO_NODE, when
kmalloc_node(node) is attempted for a node that's not online, or has no
usable memory.  The "usable memory" condition is also changed from
node_present_pages() to N_NORMAL_MEMORY node state, as that is exactly the
condition that SLUB uses to allocate kmem_cache_node structures.  The
check in get_partial() is removed completely, as the checks in
___slab_alloc() are now sufficient to prevent get_partial() being reached
with an invalid node.

[1] https://lore.kernel.org/linux-next/3381CD91-AB3D-4773-BA04-E7A072A63968@linux.vnet.ibm.com/
[2] https://lore.kernel.org/linux-mm/fff0e636-4c36-ed10-281c-8cdb0687c839@virtuozzo.com/
[3] https://lore.kernel.org/linux-mm/20200317092624.GB22538@in.ibm.com/
[4] https://lore.kernel.org/linux-mm/088b5996-faae-8a56-ef9c-5b567125ae54@suse.cz/

Link: http://lkml.kernel.org/r/20200320115533.9604-1-vbabka@suse.cz
Fixes: a561ce00b09e ("slub: fall back to node_to_mem_node() node if allocating on memoryless node")
Signed-off-by: Vlastimil Babka
Reported-by: Sachin Sant
Tested-by: Sachin Sant
Reported-by: PUVICHAKRAVARTHY RAMACHANDRAN
Tested-by: Bharata B Rao
Debugged-by: Srikar Dronamraju
Reviewed-by: Srikar Dronamraju
Cc: Mel Gorman
Cc: Michael Ellerman
Cc: Michal Hocko
Cc: Christopher Lameter
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Joonsoo Kim
Cc: Pekka Enberg
Cc: David Rientjes
Cc: Kirill Tkhai
Cc: Vlastimil Babka
Cc: Nathan Lynch
Cc:
Signed-off-by: Andrew Morton
---

 mm/slub.c | 26 +++++++++++++++++---------
 1 file changed, 17 insertions(+), 9 deletions(-)

--- a/mm/slub.c~mm-slub-prevent-kmalloc_node-crashes-and-memory-leaks
+++ a/mm/slub.c
@@ -1973,8 +1973,6 @@ static void *get_partial(struct kmem_cac
 
 	if (node == NUMA_NO_NODE)
 		searchnode = numa_mem_id();
-	else if (!node_present_pages(node))
-		searchnode = node_to_mem_node(node);
 
 	object = get_partial_node(s, get_node(s, searchnode), c, flags);
 	if (object || node != NUMA_NO_NODE)
@@ -2563,17 +2561,27 @@ static void *___slab_alloc(struct kmem_c
 	struct page *page;
 
 	page = c->page;
-	if (!page)
+	if (!page) {
+		/*
+		 * if the node is not online or has no normal memory, just
+		 * ignore the node constraint
+		 */
+		if (unlikely(node != NUMA_NO_NODE &&
+			     !node_state(node, N_NORMAL_MEMORY)))
+			node = NUMA_NO_NODE;
 		goto new_slab;
+	}
 redo:
 
 	if (unlikely(!node_match(page, node))) {
-		int searchnode = node;
-
-		if (node != NUMA_NO_NODE && !node_present_pages(node))
-			searchnode = node_to_mem_node(node);
-
-		if (unlikely(!node_match(page, searchnode))) {
+		/*
+		 * same as above but node_match() being false already
+		 * implies node != NUMA_NO_NODE
+		 */
+		if (!node_state(node, N_NORMAL_MEMORY)) {
+			node = NUMA_NO_NODE;
+			goto redo;
+		} else {
 			stat(s, ALLOC_NODE_MISMATCH);
 			deactivate_slab(s, page, c->freelist, c);
 			goto new_slab;
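
The node fallback that these hunks add to ___slab_alloc() can be modeled
in user space roughly as follows.  This is only a sketch under stated
assumptions: toy_has_normal_memory() is a hypothetical stand-in for
node_state(node, N_NORMAL_MEMORY), hard-coded to the topology from the
report (only node 1 has memory), and toy_effective_node() is a made-up
helper, not a kernel function.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NUMA_NO_NODE (-1)
#define TOY_MAX_NODES 32

/*
 * Hypothetical stand-in for node_state(node, N_NORMAL_MEMORY), using the
 * topology from the report: of the 32 possible nodes, only node 1 has
 * memory.
 */
static bool toy_has_normal_memory(int node)
{
	assert(node >= 0 && node < TOY_MAX_NODES);
	return node == 1;
}

/*
 * Node constraint that the allocation ends up using: a request for a node
 * that is offline or has no normal memory is treated as "no constraint".
 */
static int toy_effective_node(int node)
{
	if (node != TOY_NUMA_NO_NODE && !toy_has_normal_memory(node))
		return TOY_NUMA_NO_NODE;
	return node;
}
```

With this rule, a kmalloc_node() request for memoryless node 0 or for a
merely possible node such as 31 degrades to an unconstrained allocation
instead of looking up a kmem_cache_node that was never allocated.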