From patchwork Mon Aug 18 15:01:50 2014
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Daniel Thompson
X-Patchwork-Id: 35514
From: Daniel Thompson
To: Jason Wessel
Cc: Daniel Thompson, linux-kernel@vger.kernel.org, patches@linaro.org,
 linaro-kernel@lists.linaro.org, Mike Travis, Randy Dunlap,
 Dimitri Sivanich, Andrew Morton, Borislav Petkov,
 kgdb-bugreport@lists.sourceforge.net, Ingo Molnar
Subject: [RESEND PATCH v2 3.17rc1] kgdb: Timeout if secondary CPUs ignore the
 roundup
Date: Mon, 18 Aug 2014 16:01:50 +0100
Message-Id: <1408374110-19656-1-git-send-email-daniel.thompson@linaro.org>
X-Mailer: git-send-email 1.9.3
In-Reply-To: <1404310379-30228-1-git-send-email-daniel.thompson@linaro.org>
References: <1404310379-30228-1-git-send-email-daniel.thompson@linaro.org>

Currently if an active CPU fails to respond to a roundup request the
CPU that requested the roundup will become stuck. This needlessly
reduces the robustness of the debugger.

This patch introduces a timeout allowing the system state to be examined
even when the system contains unresponsive processors. It also modifies
kdb's cpu command to make it censor attempts to switch to unresponsive
processors and to report their state as (D)ead.

Signed-off-by: Daniel Thompson
Cc: Jason Wessel
Cc: Mike Travis
Cc: Randy Dunlap
Cc: Dimitri Sivanich
Cc: Andrew Morton
Cc: Borislav Petkov
Cc: kgdb-bugreport@lists.sourceforge.net
Cc: Ingo Molnar
---

Notes:
    Changes since v1:

    - Set CATASTROPHIC if the system contains unresponsive processors
      (Jason Wessel)

 kernel/debug/debug_core.c       | 9 +++++++--
 kernel/debug/kdb/kdb_debugger.c | 4 ++++
 kernel/debug/kdb/kdb_main.c     | 4 +++-
 3 files changed, 14 insertions(+), 3 deletions(-)

-- 
1.9.3

diff --git a/kernel/debug/debug_core.c b/kernel/debug/debug_core.c
index 1adf62b..acd7497 100644
--- a/kernel/debug/debug_core.c
+++ b/kernel/debug/debug_core.c
@@ -471,6 +471,7 @@ static int kgdb_cpu_enter(struct kgdb_state *ks, struct pt_regs *regs,
 	int cpu;
 	int trace_on = 0;
 	int online_cpus = num_online_cpus();
+	u64 time_left;
 
 	kgdb_info[ks->cpu].enter_kgdb++;
 	kgdb_info[ks->cpu].exception_state |= exception_state;
@@ -595,9 +596,13 @@ return_normal:
 	/*
 	 * Wait for the other CPUs to be notified and be waiting for us:
 	 */
-	while (kgdb_do_roundup && (atomic_read(&masters_in_kgdb) +
-				atomic_read(&slaves_in_kgdb)) != online_cpus)
+	time_left = loops_per_jiffy * HZ;
+	while (kgdb_do_roundup && --time_left &&
+	       (atomic_read(&masters_in_kgdb) + atomic_read(&slaves_in_kgdb)) !=
+		   online_cpus)
 		cpu_relax();
+	if (!time_left)
+		pr_crit("KGDB: Timed out waiting for secondary CPUs.\n");
 
 	/*
 	 * At this point the primary processor is completely
diff --git a/kernel/debug/kdb/kdb_debugger.c b/kernel/debug/kdb/kdb_debugger.c
index 8859ca3..15e1a7a 100644
--- a/kernel/debug/kdb/kdb_debugger.c
+++ b/kernel/debug/kdb/kdb_debugger.c
@@ -129,6 +129,10 @@ int kdb_stub(struct kgdb_state *ks)
 		ks->pass_exception = 1;
 		KDB_FLAG_SET(CATASTROPHIC);
 	}
+	/* set CATASTROPHIC if the system contains unresponsive processors */
+	for_each_online_cpu(i)
+		if (!kgdb_info[i].enter_kgdb)
+			KDB_FLAG_SET(CATASTROPHIC);
 	if (KDB_STATE(SSBPT) && reason == KDB_REASON_SSTEP) {
 		KDB_STATE_CLEAR(SSBPT);
 		KDB_STATE_CLEAR(DOING_SS);
diff --git a/kernel/debug/kdb/kdb_main.c b/kernel/debug/kdb/kdb_main.c
index 379650b..0633e78 100644
--- a/kernel/debug/kdb/kdb_main.c
+++ b/kernel/debug/kdb/kdb_main.c
@@ -2157,6 +2157,8 @@ static void kdb_cpu_status(void)
 	for (start_cpu = -1, i = 0; i < NR_CPUS; i++) {
 		if (!cpu_online(i)) {
 			state = 'F';	/* cpu is offline */
+		} else if (!kgdb_info[i].enter_kgdb) {
+			state = 'D';	/* cpu is online but unresponsive */
 		} else {
 			state = ' ';	/* cpu is responding to kdb */
 			if (kdb_task_state_char(KDB_TSK(i)) == 'I')
@@ -2210,7 +2212,7 @@ static int kdb_cpu(int argc, const char **argv)
 	/*
 	 * Validate cpunum
 	 */
-	if ((cpunum > NR_CPUS) || !cpu_online(cpunum))
+	if ((cpunum > NR_CPUS) || !kgdb_info[cpunum].enter_kgdb)
 		return KDB_BADCPUNUM;
 
 	dbg_switch_cpu = cpunum;
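
For readers who want to experiment with the bounded wait outside the
kernel, the following is a minimal userspace sketch of the same idea,
not the kernel code itself. It assumes C11 atomics; the names relax(),
wait_for_roundup() and cpu_state_char() are hypothetical and exist only
for this illustration. The budget plays the role of time_left, and the
state characters mirror the F/D/blank classification in kdb_cpu_status().

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's cpu_relax(); a no-op here. */
static inline void relax(void)
{
}

/*
 * Spin until *arrived equals expected or the iteration budget runs out.
 * Returns the remaining budget; 0 means the wait timed out. As in the
 * patch, the budget is decremented before the arrival check, so the
 * final iteration is reported as a timeout even if everyone arrived.
 */
static uint64_t wait_for_roundup(atomic_int *arrived, int expected,
				 uint64_t budget)
{
	if (!budget)
		return 0;	/* avoid wrapping an already empty budget */

	while (--budget && atomic_load(arrived) != expected)
		relax();

	return budget;
}

/*
 * Classify a CPU for display: 'F' if offline, 'D' if online but it
 * never entered the debugger, ' ' if it is responding.
 */
static char cpu_state_char(bool online, int enter_count)
{
	if (!online)
		return 'F';
	if (!enter_count)
		return 'D';
	return ' ';
}
```

A caller would treat a zero return from wait_for_roundup() the way the
patch treats !time_left: log the timeout and carry on with whichever
CPUs did respond.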