From patchwork Thu Nov 24 20:50:34 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Karol Herbst
X-Patchwork-Id: 84001
References: <20160802113148.6784f2de@gandalf.local.home>
 <20160802121305.64307a8a@gandalf.local.home>
From: Karol Herbst
Date: Thu, 24 Nov 2016 21:50:34 +0100
Subject: Re: mmiotracer hangs the system
To: Andy Shevchenko
Cc: Steven Rostedt, "linux-kernel@vger.kernel.org", "Paul E. McKenney", Ingo Molnar
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

sorry for that, but I forgot the patch

2016-11-19 11:56 GMT+01:00 Karol Herbst:
> this is odd, I found a bug related to nouveau (modprobe/bind doesn't
> return), but that isn't related to your issue at all or maybe it is
> exactly this, cause the binding of the device doesn't return and
> depending on the kind of driver, it would hang the system... yeah,
> maybe it is the same issue.
>
> anyway, could you try to trace with the attached patch? Maybe the
> additional output would help me to verify it.
> Currently I am working on the bugfix I mentioned above and this may
> also fix your issue. I was still able to get a working mmiotrace file,
> even if the device binding didn't finish. Is this the same for you?
> (Try "cat /sys/kernel/debug/tracing/trace_pipe > some_file" and see if
> this contains anything useful.)
>
> This really looks like an odd issue, because the mmiotracer still
> behaves as expected.
>
> 2016-10-22 18:02 GMT+02:00 Andy Shevchenko:
>> On Fri, Oct 14, 2016 at 12:12 AM, Karol Herbst wrote:
>>> sorry for the delay fixing that bug. I got occupied with other things
>>> and didn't really get to the issue again, it is on my todo list as the
>>> next item though and I hope I will be able to get a fix ready this
>>> weekend. I think I might know where the issue is, but didn't confirm
>>> it yet.
>>
>> Thanks. I'm still using revert. Feel free to Cc me when you will have
>> some material to test.
>>
>>>
>>> Again, sorry for the delay.
>>>
>>> Karol
>>>
>>> 2016-08-19 22:46 GMT+02:00 Karol Herbst:
>>>> Hi again,
>>>>
>>>> I was able to get a crash/freeze/something while unbinding/binding my
>>>> nvidia gpu from nouveau.
>>>>
>>>> Guess that means something is odd. I will investigate this more over
>>>> the weekend.
>>>>
>>>> 2016-08-19 17:35 GMT+02:00 Andy Shevchenko:
>>>>> On Fri, Aug 19, 2016 at 6:08 PM, karol herbst wrote:
>>>>>> 2016-08-19 15:02 GMT+02:00 Andy Shevchenko:
>>>>>>> On Fri, Aug 19, 2016 at 1:35 PM, karol herbst wrote:
>>>>>>>> is there any update on that issue I missed somehow? I really don't
>>>>>>>> want to leave the mmiotracer in a state, where it breaks something
>>>>>>>> while fixing other issues.
>>>>>>>
>>>>>>> No updates. I'm busy right now with more priority tasks and revert
>>>>>>> works for me. Issue is reproducible in my case 100%.
>>>>>>>
>>>>>>
>>>>>> Is there something I could do with a "normal" haswell desktop system
>>>>>> to reproduce this issue?
>>>>>
>>>>> Try LPSS UART device(s)
>>>>>
>>>>>>
>>>>>> I'll try to play around the next days a bit and maybe I find something
>>>>>> that works out here as well. It seems to be related to
>>>>>> unmapping-mapping cycles.
>>>>>
>>>>> That is the only thing I would think of.
>>>>>
>>>>>>
>>>>>> Because if this only happens with the pwm-lpss driver,
>>>>>
>>>>> It has nothing to do with pwm-lpss since it's a HS UART and served by
>>>>> intel-lpss driver.
>>>>>
>>>>>> it may be
>>>>>> really troublesome to debug, because I don't really know the code that
>>>>>> well to be sure where the issue might be.
>>>>>>
>>>>>>> So, I would be able to attach dmesg in case it would be helpful.
>>>>>>> Otherwise tell me exact instructions how to debug the issue.
>>>>>>>
>>>>>>> Here you are:
>>>>>>> http://pastebin.com/raw/VfTZENt7
>>>>>>>
>>>>>>>> But for now, without being able to even reproduce the issue, I can't
>>>>>>>> really do much, because the code in the current state looks sane to
>>>>>>>> me. Maybe this case includes the mmiotracer cleaning things up and
>>>>>>>> arms new region for mmiotracing and that's why it fails? Besides that,
>>>>>>>> I have no idea and no way to reproduce this, so I can't help this way.
>>>>>>>
>>>>>>> Maybe. First thing happened is iounmap().
>>>>>
>>>>>
>>>>> --
>>>>> With Best Regards,
>>>>> Andy Shevchenko
>>
>>
>>
>> --
>> With Best Regards,
>> Andy Shevchenko

From 92aea447a776f10aad0a2e971b5f2b208a1161d2 Mon Sep 17 00:00:00 2001
From: Karol Herbst
Date: Thu, 24 Nov 2016 21:46:27 +0100
Subject: [PATCH] temp hack

---
 arch/x86/mm/kmmio.c | 29 +++++++++++++++++++++++------
 1 file changed, 23 insertions(+), 6 deletions(-)

diff --git a/arch/x86/mm/kmmio.c b/arch/x86/mm/kmmio.c
index afc47f5c9531..a002ee314a0c 100644
--- a/arch/x86/mm/kmmio.c
+++ b/arch/x86/mm/kmmio.c
@@ -97,11 +97,16 @@ static DEFINE_PER_CPU(struct kmmio_context, kmmio_ctx);
 static struct kmmio_probe *get_kmmio_probe(unsigned long addr)
 {
 	struct kmmio_probe *p;
+	struct kmmio_probe *result = NULL;
 	list_for_each_entry_rcu(p, &kmmio_probes, list) {
-		if (addr >= p->addr && addr < (p->addr + p->len))
-			return p;
+		if (addr >= p->addr && addr < (p->addr + p->len)) {
+			if (!result)
+				result = p;
+			else
+				printk(KERN_ERR " %s collision detected %lu", __FUNCTION__, addr);
+		}
 	}
-	return NULL;
+	return result;
 }
 
 /* You must be holding RCU read lock. */
@@ -109,6 +114,7 @@ static struct kmmio_fault_page *get_kmmio_fault_page(unsigned long addr)
 {
 	struct list_head *head;
 	struct kmmio_fault_page *f;
+	struct kmmio_fault_page *result = NULL;
 	unsigned int l;
 	pte_t *pte = lookup_address(addr, &l);
 
@@ -116,11 +122,16 @@ static struct kmmio_fault_page *get_kmmio_fault_page(unsigned long addr)
 		return NULL;
 	addr &= page_level_mask(l);
 	head = kmmio_page_list(addr);
+
 	list_for_each_entry_rcu(f, head, list) {
-		if (f->addr == addr)
-			return f;
+		if (f->addr == addr) {
+			if (!result)
+				result = f;
+			else
+				printk(KERN_ERR " %s collision detected %lu", __FUNCTION__, addr);
+		}
 	}
-	return NULL;
+	return result;
 }
 
 static void clear_pmd_presence(pmd_t *pmd, bool clear, pmdval_t *old)
@@ -375,6 +386,7 @@ static int add_kmmio_fault_page(unsigned long addr)
 {
 	struct kmmio_fault_page *f;
 
+	printk(KERN_WARNING " %s %lx", __FUNCTION__, addr);
 	f = get_kmmio_fault_page(addr);
 	if (f) {
 		if (!f->count)
@@ -406,6 +418,7 @@ static void release_kmmio_fault_page(unsigned long addr,
 {
 	struct kmmio_fault_page *f;
 
+	printk(KERN_WARNING " %s %lx", __FUNCTION__, addr);
 	f = get_kmmio_fault_page(addr);
 	if (!f)
 		return;
@@ -445,6 +458,8 @@ int register_kmmio_probe(struct kmmio_probe *p)
 	}
 
 	pte = lookup_address(p->addr, &l);
+	printk(KERN_WARNING " %s %lx %u", __FUNCTION__, p->addr, l);
+
 	if (!pte) {
 		ret = -EINVAL;
 		goto out;
@@ -537,6 +552,8 @@ void unregister_kmmio_probe(struct kmmio_probe *p)
 	if (!pte)
 		return;
 
+	printk(KERN_WARNING " %s %lx %u", __FUNCTION__, p->addr, l);
+
 	spin_lock_irqsave(&kmmio_lock, flags);
 	while (size < size_lim) {
 		release_kmmio_fault_page(p->addr + size, &release_list);
-- 
2.11.0.rc2
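
For anyone reading along without a kernel tree at hand, the idea behind the lookup change in the hack (keep scanning after the first match so that overlapping entries get reported) can be sketched in user space roughly as below. This is only an illustration under stated assumptions: struct probe and find_probe() are made-up stand-ins, a plain array replaces the RCU-protected list, and the collision counter is an addition for testing, not something the patch has.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for struct kmmio_probe: only the address range
 * matters for this sketch. */
struct probe {
	unsigned long addr;
	unsigned long len;
};

/*
 * Return the first probe whose range [addr, addr + len) contains addr,
 * or NULL if there is none. Instead of returning on the first hit, the
 * loop keeps going so that every further match is counted in
 * *collisions and reported on stderr.
 */
static const struct probe *find_probe(const struct probe *probes, size_t n,
				      unsigned long addr, size_t *collisions)
{
	const struct probe *result = NULL;
	size_t i;

	assert(probes != NULL || n == 0);
	*collisions = 0;

	for (i = 0; i < n; i++) {
		const struct probe *p = &probes[i];

		/* addr - p->addr < p->len avoids overflow in addr + len */
		if (addr >= p->addr && addr - p->addr < p->len) {
			if (!result) {
				result = p;
			} else {
				(*collisions)++;
				fprintf(stderr, "%s: collision detected at 0x%lx\n",
					__func__, addr);
			}
		}
	}
	return result;
}
```

With two overlapping ranges such as {0x1000, 0x100} and {0x1080, 0x100}, a lookup of 0x1090 would still return the first entry but report one collision, which is the kind of output the debug printk calls in the patch are meant to surface.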