From patchwork Thu Oct 8 23:05:00 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Al Stone X-Patchwork-Id: 54679 Return-Path: X-Original-To: linaro@patches.linaro.org Delivered-To: linaro@patches.linaro.org Received: from mail-lb0-f197.google.com (mail-lb0-f197.google.com [209.85.217.197]) by patches.linaro.org (Postfix) with ESMTPS id 22A5222FF4 for ; Thu, 8 Oct 2015 23:05:12 +0000 (UTC) Received: by lbbti1 with SMTP id ti1sf31040962lbb.3 for ; Thu, 08 Oct 2015 16:05:11 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:delivered-to:subject:to:references:cc:from :message-id:date:user-agent:mime-version:in-reply-to:content-type :content-transfer-encoding:sender:precedence:list-id :x-original-sender:x-original-authentication-results:mailing-list :list-post:list-help:list-archive:list-unsubscribe; bh=cPdzzSrkF/S65AGvAkMCmv9n+QlPyRtCRDqd4qxQkC0=; b=UwRjkGi+qE7vO/TUPgmeg/BKjCCBlaVjcrILW+Ymv44EyW+BoUCXttA5fkvdqAxZzU M/qMCtgttrw0zyC0AdevCOWMqoAUlr6TInBbpcvalNhNJxyEpw5W3k16/DuHb9MWDcsa qiT1YqgR605drm2KlugT8MBt58urY1GD5icG/X5Wpx6fMKt3jsp5gTaxKIYdvHSYF6OF C/HCEE9g5CmzUhzHn4yo15IAc4A/wSGvQuoWbSZWEzKUpLws20zC/PNxTpWh9X0fwMdz d3WijWMzFA7r4CTYSYEy/jWKfkma+BI/ELvxAoQukdFQrIE6iHeKuNeDrHP5la34cHLP lVBw== X-Gm-Message-State: ALoCoQmh2nswC2i4UuCg4XrMY/WdaukVqyTd9qTYjmD+irslPD+YRgnE9ld3i5Uo3o/x25G6AIwi X-Received: by 10.180.10.135 with SMTP id i7mr1238788wib.2.1444345511088; Thu, 08 Oct 2015 16:05:11 -0700 (PDT) X-BeenThere: patchwork-forward@linaro.org Received: by 10.25.163.73 with SMTP id m70ls245505lfe.39.gmail; Thu, 08 Oct 2015 16:05:10 -0700 (PDT) X-Received: by 10.25.87.9 with SMTP id l9mr2701984lfb.99.1444345510897; Thu, 08 Oct 2015 16:05:10 -0700 (PDT) Received: from mail-lb0-f182.google.com (mail-lb0-f182.google.com. [209.85.217.182]) by mx.google.com with ESMTPS id g5si31549371lbs.7.2015.10.08.16.05.10 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 08 Oct 2015 16:05:10 -0700 (PDT) Received-SPF: pass (google.com: domain of patch+caf_=patchwork-forward=linaro.org@linaro.org designates 209.85.217.182 as permitted sender) client-ip=209.85.217.182; Received: by lbcao8 with SMTP id ao8so63846258lbc.3 for ; Thu, 08 Oct 2015 16:05:10 -0700 (PDT) X-Received: by 10.112.180.230 with SMTP id dr6mr4955341lbc.72.1444345510697; Thu, 08 Oct 2015 16:05:10 -0700 (PDT) X-Forwarded-To: patchwork-forward@linaro.org X-Forwarded-For: patch@linaro.org patchwork-forward@linaro.org Delivered-To: patch@linaro.org Received: by 10.112.59.35 with SMTP id w3csp844587lbq; Thu, 8 Oct 2015 16:05:09 -0700 (PDT) X-Received: by 10.68.242.2 with SMTP id wm2mr11180880pbc.31.1444345509536; Thu, 08 Oct 2015 16:05:09 -0700 (PDT) Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id nv9si69875453pbb.37.2015.10.08.16.05.09; Thu, 08 Oct 2015 16:05:09 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S933563AbbJHXFG (ORCPT + 30 others); Thu, 8 Oct 2015 19:05:06 -0400 Received: from mail-ig0-f175.google.com ([209.85.213.175]:35799 "EHLO mail-ig0-f175.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753963AbbJHXFE (ORCPT ); Thu, 8 Oct 2015 19:05:04 -0400 Received: by igbkq10 with SMTP id kq10so24048182igb.0 for ; Thu, 08 Oct 2015 16:05:03 -0700 (PDT) X-Received: by 10.50.83.65 with SMTP id o1mr6810650igy.50.1444345503060; Thu, 08 Oct 2015 16:05:03 -0700 (PDT) Received: from fidelio.ahs3 (c-50-134-239-249.hsd1.co.comcast.net. [50.134.239.249]) by smtp.googlemail.com with ESMTPSA id e19sm243337igo.14.2015.10.08.16.05.01 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 08 Oct 2015 16:05:01 -0700 (PDT) Subject: Re: [lkp] [ACPI] 7494b07eba: Kernel panic - not syncing: Watchdog detected hard LOCKUP on cpu 0 To: "Rafael J. Wysocki" References: <871td6xcxc.fsf@yhuang-dev.intel.com> <9591316.rZitHL3YIp@vostro.rjw.lan> <5616D2CF.80200@linaro.org> <4975068.hExdhgQqJN@vostro.rjw.lan> Cc: Hanjun Guo , kernel test robot , lkp@01.org, LKML , "Rafael J. Wysocki" From: Al Stone X-Enigmail-Draft-Status: N1110 Message-ID: <5616F69C.9080900@linaro.org> Date: Thu, 8 Oct 2015 17:05:00 -0600 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.2.0 MIME-Version: 1.0 In-Reply-To: <4975068.hExdhgQqJN@vostro.rjw.lan> Sender: linux-kernel-owner@vger.kernel.org Precedence: list List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Removed-Original-Auth: Dkim didn't pass. X-Original-Sender: al.stone@linaro.org X-Original-Authentication-Results: mx.google.com; spf=pass (google.com: domain of patch+caf_=patchwork-forward=linaro.org@linaro.org designates 209.85.217.182 as permitted sender) smtp.mailfrom=patch+caf_=patchwork-forward=linaro.org@linaro.org Mailing-list: list patchwork-forward@linaro.org; contact patchwork-forward+owners@linaro.org X-Google-Group-Id: 836684582541 List-Post: , List-Help: , List-Archive: List-Unsubscribe: , On 10/08/2015 04:50 PM, Rafael J. Wysocki wrote: > On Thursday, October 08, 2015 02:32:15 PM Al Stone wrote: >> On 10/08/2015 02:41 PM, Rafael J. Wysocki wrote: >>> On Thursday, October 08, 2015 10:37:55 PM Rafael J. Wysocki wrote: >>>> On Thursday, October 08, 2015 10:36:40 AM Al Stone wrote: >>>>> On 10/08/2015 05:44 AM, Hanjun Guo wrote: >>>>>> On 10/08/2015 11:21 AM, kernel test robot wrote: >>>>>>> FYI, we noticed the below changes on >>>>>>> >>>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master >>>>>>> commit 7494b07ebaae2117629024369365f7be7adc16c3 ("ACPI: add in a >>>>>>> bad_madt_entry() function to eventually replace the macro") >>>>>>> >>>>>>> [ 0.000000] ACPI: undefined MADT subtable type for FADT 4.0: 127 (length 12) >>>>>> >>>>>> Seems that the MADT table contains reserved subtable type (0x7F), >>>>>> so this is traded as a wrong type in our patch. >>>>>> >>>>>>> [ 0.000000] ACPI: Error parsing LAPIC address override entry >>>>>> >>>>>> This was called by early_acpi_parse_madt_lapic_addr_ovr() in >>>>>> arch/x86/kernel/acpi/boot.c, which is scanning MADT for the first >>>>>> time when booting, so it will fail the boot process when finding >>>>>> the reserved MADT subtable type. >>>>>> >>>>>>> [ 0.000000] ACPI: Invalid BIOS MADT, disabling ACPI >>>>>> >>>>>> As the spec said in Table 5-46 (ACPI 6.0): >>>>>> >>>>>> 0x10-0x7F Reserved. OSPM skips structures of the reserved type. >>>>>> >>>>>> Should we just ignore those reserved type when scanning the MADT >>>>>> table? In the patch "ACPI: add in a bad_madt_entry() function to >>>>>> eventually replace the macro", we just trade it as wrong, that's >>>>>> why we failed to boot the system. >>>>>> >>>>>> Thanks >>>>>> Hanjun >>>>> >>>>> Arrgh. This is why people get frustrated with ACPI. The spec is >>>>> saying that those sub-table types are reserved -- implying they can >>>>> and probably will be used for something else in the future -- but >>>>> then vendors are shipping firmware that uses those reserved values, >>>>> and an OS *expects* them to be used, and there is *no* documentation >>>>> of it other than a kernel workaround. >>>>> >>>>> So yet again, technically this MADT subtable *is* wrong, and someone >>>>> should slap the vendor for doing this. But, the practical side of >>>>> this is that we now have to workaround what is now a known violation >>>>> of the spec. >>>>> >>>>> The more ACPI allows this kind of nonsense, the less usable it will >>>>> become. >>>> >>>> Linux Kernel Developer's First Rule: You shall not break setups that >>>> worked previously, even if they worked by accident. >>>> >>>> IOW, if something booted and your commit made it not boot any more, it counts >>>> as a regression and needs to be modified or reverted. >>> >>> Moreover, if the firmware in question shipped in a product, we have no choice >>> but to work around bugs in it. Doing otherwise would be refusing to support >>> our users and not the vendor of the systems they were unfortunate enough to >>> acquire. >>> >>> Thanks, >>> Rafael >>> >> >> Yup, understood and agreed. I have no issue at all with the First Rule. >> >> What I have an issue with is all the exceptions to the standards -- and >> primarily the unknown ones -- that exist with ACPI (or any other standard, >> mind you), independent of any OS. > > Well, one can argue that stadards are not what is written in specifications, > but what is done in practice by everybody. If a specification does not agree > with the common practice, there is a problem with it, not with the practice. True. That is the other side of it. How rampant is this particular problem, though? Does everyone and their uncle use some reserved MADT subtable ID value in their firmware? This is the first time I've personally seen this, but then I haven't been looking for it until now. >> It's just like driving a car. I will (and do) grumble at people when they >> break the rules. On the other hand, I'm not going to crash into them even >> if they are at fault. When the ACPI spec gets twisted around, I'm going >> to say something about it; just the same, I am not going to break their >> system if it already works. > > OK > > So IMO there are two things we can do. First, try to update the spec to > reflect the reality where needed. Second, having done that, add appropriate > checks to a firmware test suite and make it scream bloody murder every time > they trigger. It also may be a good idea to print warnings into the kernel > buffer for them. > > Thanks, > Rafael > Agreed. The patch below uses pr_err() for arm64, as the maintainers wish, and flags with pr_warn() any such uses on other architectures; this should fix the regression. In the meantime, I'll poke the spec folks on the use of reserved subtable IDs in the MADT and see what the consensus is there. It may just be a matter of clarifying the language in the spec. It's also on my plate to really dig into an ACPI test suite and see about building something really robust for that -- this can be added as an example. I'll see if I have time to send in a patch for FWTS, too, which is pretty good about capturing such things. diff --git a/drivers/acpi/tables.c b/drivers/acpi/tables.c index a2ed38a..a74b6fa 100644 --- a/drivers/acpi/tables.c +++ b/drivers/acpi/tables.c @@ -413,9 +413,15 @@ static int __init bad_madt_entry(struct acpi_table_header *table, } if (entry->type >= ms->num_types) { - pr_err("undefined MADT subtable type for FADT %d.%d: %d (length %d)\n", - major, minor, entry->type, entry->length); - return 1; + if (IS_ENABLED(CONFIG_ARM64)) { + pr_err("undefined MADT subtable type for FADT %d.%d: %d (length %d)\n", + major, minor, entry->type, entry->length); + return 1; + } else { + pr_warn("firmware should not be using reserved MADT subtable type for FADT %d.%d: %d (length %d)\n", + major, minor, entry->type, entry->length); + return 0; + } }