From patchwork Mon Jun 7 12:57:34 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Liangyan X-Patchwork-Id: 456784 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, UNPARSEABLE_RELAY, URIBL_RED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 44960C47082 for ; Mon, 7 Jun 2021 12:57:46 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 23C2A60C3E for ; Mon, 7 Jun 2021 12:57:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230435AbhFGM7g (ORCPT ); Mon, 7 Jun 2021 08:59:36 -0400 Received: from out30-57.freemail.mail.aliyun.com ([115.124.30.57]:55764 "EHLO out30-57.freemail.mail.aliyun.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230194AbhFGM7g (ORCPT ); Mon, 7 Jun 2021 08:59:36 -0400 X-Alimail-AntiSpam: AC=PASS; BC=-1|-1; BR=01201311R451e4; CH=green; DM=||false|; DS=||; FP=0|-1|-1|-1|0|-1|-1|-1; HT=e01e04423; MF=liangyan.peng@linux.alibaba.com; NM=1; PH=DS; RN=10; SR=0; TI=SMTPD_---0UbdRZ0i_1623070655; Received: from localhost(mailfrom:liangyan.peng@linux.alibaba.com fp:SMTPD_---0UbdRZ0i_1623070655) by smtp.aliyun-inc.com(127.0.0.1); Mon, 07 Jun 2021 20:57:43 +0800 From: Liangyan To: linux-kernel@vger.kernel.org, Steven Rostedt , Ingo Molnar Cc: Xunlei Pang , yinbinbin@alibabacloud.com, wetp , jnwang@linux.alibaba.com, Liangyan , stable@vger.kernel.org, Greg Kroah-Hartman Subject: [PATCH] tracing: Correct the length check which causes memory corruption Date: Mon, 7 Jun 2021 20:57:34 +0800 Message-Id: <20210607125734.1770447-1-liangyan.peng@linux.alibaba.com> X-Mailer: git-send-email 2.14.4.44.g2045bb6 Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org We've suffered from severe kernel crashes due to memory corruption on our production environment, like, Call Trace: [1640542.554277] general protection fault: 0000 [#1] SMP PTI [1640542.554856] CPU: 17 PID: 26996 Comm: python Kdump: loaded Tainted:G [1640542.556629] RIP: 0010:kmem_cache_alloc+0x90/0x190 [1640542.559074] RSP: 0018:ffffb16faa597df8 EFLAGS: 00010286 [1640542.559587] RAX: 0000000000000000 RBX: 0000000000400200 RCX: 0000000006e931bf [1640542.560323] RDX: 0000000006e931be RSI: 0000000000400200 RDI: ffff9a45ff004300 [1640542.560996] RBP: 0000000000400200 R08: 0000000000023420 R09: 0000000000000000 [1640542.561670] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff9a20608d [1640542.562366] R13: ffff9a45ff004300 R14: ffff9a45ff004300 R15: 696c662f65636976 [1640542.563128] FS: 00007f45d7c6f740(0000) GS:ffff9a45ff840000(0000) knlGS:0000000000000000 [1640542.563937] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [1640542.564557] CR2: 00007f45d71311a0 CR3: 000000189d63e004 CR4: 00000000003606e0 [1640542.565279] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [1640542.566069] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [1640542.566742] Call Trace: [1640542.567009] anon_vma_clone+0x5d/0x170 [1640542.567417] __split_vma+0x91/0x1a0 [1640542.567777] do_munmap+0x2c6/0x320 [1640542.568128] vm_munmap+0x54/0x70 [1640542.569990] __x64_sys_munmap+0x22/0x30 [1640542.572005] do_syscall_64+0x5b/0x1b0 [1640542.573724] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [1640542.575642] RIP: 0033:0x7f45d6e61e27 James Wang has reproduced it stably on the latest 4.19 LTS. After some debugging, we finally proved that it's due to ftrace buffer out-of-bound access using a debug tool as follows: [ 86.775200] BUG: Out-of-bounds write at addr 0xffff88aefe8b7000 [ 86.780806] no_context+0xdf/0x3c0 [ 86.784327] __do_page_fault+0x252/0x470 [ 86.788367] do_page_fault+0x32/0x140 [ 86.792145] page_fault+0x1e/0x30 [ 86.795576] strncpy_from_unsafe+0x66/0xb0 [ 86.799789] fetch_memory_string+0x25/0x40 [ 86.804002] fetch_deref_string+0x51/0x60 [ 86.808134] kprobe_trace_func+0x32d/0x3a0 [ 86.812347] kprobe_dispatcher+0x45/0x50 [ 86.816385] kprobe_ftrace_handler+0x90/0xf0 [ 86.820779] ftrace_ops_assist_func+0xa1/0x140 [ 86.825340] 0xffffffffc00750bf [ 86.828603] do_sys_open+0x5/0x1f0 [ 86.832124] do_syscall_64+0x5b/0x1b0 [ 86.835900] entry_SYSCALL_64_after_hwframe+0x44/0xa9 commit b220c049d519 ("tracing: Check length before giving out the filter buffer") adds length check to protect trace data overflow introduced in 0fc1b09ff1ff, seems that this fix can't prevent overflow entirely, the length check should also take the sizeof entry->array[0] into account, since this array[0] is filled the length of trace data and occupy addtional space and risk overflow. Cc: stable@vger.kernel.org Fixes: b220c049d519 ("tracing: Check length before giving out the filter buffer") Signed-off-by: Liangyan Reviewed-by: Xunlei Pang Reviewed-by: yinbinbin Reviewed-by: Wetp Zhang Tested-by: James Wang Cc: Xunlei Pang Cc: Greg Kroah-Hartman --- kernel/trace/trace.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c index a21ef9cd2aae..9299057feb56 100644 --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -2736,7 +2736,7 @@ trace_event_buffer_lock_reserve(struct trace_buffer **current_rb, (entry = this_cpu_read(trace_buffered_event))) { /* Try to use the per cpu buffer first */ val = this_cpu_inc_return(trace_buffered_event_cnt); - if ((len < (PAGE_SIZE - sizeof(*entry))) && val == 1) { + if ((len < (PAGE_SIZE - sizeof(*entry) - sizeof(entry->array[0]))) && val == 1) { trace_event_setup(entry, type, trace_ctx); entry->array[0] = len; return entry;