From patchwork Mon Aug 12 10:11:46 2019
X-Patchwork-Submitter: Shiju Jose
X-Patchwork-Id: 171052
From: Shiju Jose
Subject: [PATCH RFC 1/4] ACPI: APEI: Add support to notify the vendor specific HW errors
Date: Mon, 12 Aug 2019 11:11:46 +0100
Message-ID: <20190812101149.26036-2-shiju.jose@huawei.com>
In-Reply-To: <20190812101149.26036-1-shiju.jose@huawei.com>
References: <20190812101149.26036-1-shiju.jose@huawei.com>
X-Mailing-List: linux-acpi@vger.kernel.org

Presently the vendor specific HW errors, in the non-standard format, are not reported to the vendor drivers for the recovery.
This patch adds support to notify the vendor specific HW errors to the registered kernel drivers.

Signed-off-by: Shiju Jose
---
 drivers/acpi/apei/ghes.c | 118 +++++++++++++++++++++++++++++++++++++++++++++--
 include/acpi/ghes.h      |  47 +++++++++++++++++++
 2 files changed, 160 insertions(+), 5 deletions(-)

--
1.9.1

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index a66e00f..374d197 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -477,6 +477,77 @@ static void ghes_handle_aer(struct acpi_hest_generic_data *gdata)
 #endif
 }
 
+struct ghes_error_notify {
+	struct list_head list;
+	struct rcu_head	rcu_head;
+	guid_t sec_type; /* guid of the error record */
+	error_handle handle; /* error handler function */
+	void *data; /* handler driver's private data if any */
+};
+
+/* List to store the registered error handling functions */
+static DEFINE_MUTEX(ghes_error_notify_mutex);
+static LIST_HEAD(ghes_error_notify_list);
+static refcount_t ghes_ref_count;
+
+/**
+ * ghes_error_notify_register - register an error handling function
+ * for the hw errors.
+ * @sec_type: sec_type of the corresponding CPER to be notified.
+ * @handle: pointer to the error handling function.
+ * @data: handler driver's private data.
+ *
+ * return 0 : SUCCESS, non-zero : FAIL
+ */
+int ghes_error_notify_register(guid_t sec_type, error_handle handle, void *data)
+{
+	struct ghes_error_notify *err_notify;
+
+	err_notify = kzalloc(sizeof(*err_notify), GFP_KERNEL);
+	if (!err_notify)
+		return -ENOMEM;
+
+	err_notify->handle = handle;
+	guid_copy(&err_notify->sec_type, &sec_type);
+	err_notify->data = data;
+	mutex_lock(&ghes_error_notify_mutex);
+	list_add_rcu(&err_notify->list, &ghes_error_notify_list);
+	mutex_unlock(&ghes_error_notify_mutex);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(ghes_error_notify_register);
+
+/**
+ * ghes_error_notify_unregister - unregister an error handling function.
+ * @sec_type: sec_type of the corresponding CPER.
+ * @handle: pointer to the error handling function.
+ *
+ * return none.
+ */
+void ghes_error_notify_unregister(guid_t sec_type, error_handle handle)
+{
+	struct ghes_error_notify *err_notify;
+	bool found = 0;
+
+	mutex_lock(&ghes_error_notify_mutex);
+	rcu_read_lock();
+	list_for_each_entry_rcu(err_notify, &ghes_error_notify_list, list) {
+		if (guid_equal(&err_notify->sec_type, &sec_type) &&
+		    err_notify->handle == handle) {
+			list_del_rcu(&err_notify->list);
+			found = 1;
+			break;
+		}
+	}
+	rcu_read_unlock();
+	synchronize_rcu();
+	mutex_unlock(&ghes_error_notify_mutex);
+	if (found)
+		kfree(err_notify);
+}
+EXPORT_SYMBOL_GPL(ghes_error_notify_unregister);
+
 static void ghes_do_proc(struct ghes *ghes,
 			 const struct acpi_hest_generic_status *estatus)
 {
@@ -485,6 +556,8 @@ static void ghes_do_proc(struct ghes *ghes,
 	guid_t *sec_type;
 	guid_t *fru_id = &NULL_UUID_LE;
 	char *fru_text = "";
+	bool is_notify = 0;
+	struct ghes_error_notify *err_notify;
 
 	sev = ghes_severity(estatus->error_severity);
 	apei_estatus_for_each_section(estatus, gdata) {
@@ -512,11 +585,29 @@ static void ghes_do_proc(struct ghes *ghes,
 
 			log_arm_hw_error(err);
 		} else {
-			void *err = acpi_hest_get_payload(gdata);
-
-			log_non_standard_event(sec_type, fru_id, fru_text,
-					       sec_sev, err,
-					       gdata->error_data_length);
+			rcu_read_lock();
+			list_for_each_entry_rcu(err_notify,
+						&ghes_error_notify_list, list) {
+				if (guid_equal(&err_notify->sec_type,
+					       sec_type)) {
+					/* The notification is called in the
+					 * interrupt context, thus the handler
+					 * functions should take care of it.
+					 */
+					err_notify->handle(gdata, sev,
+							   err_notify->data);
+					is_notify = 1;
+				}
+			}
+			rcu_read_unlock();
+
+			if (!is_notify) {
+				void *err = acpi_hest_get_payload(gdata);
+
+				log_non_standard_event(sec_type, fru_id,
+						       fru_text, sec_sev, err,
+						       gdata->error_data_length);
+			}
 		}
 	}
 }
@@ -1217,6 +1308,11 @@ static int ghes_probe(struct platform_device *ghes_dev)
 
 	ghes_edac_register(ghes, &ghes_dev->dev);
 
+	if (!refcount_read(&ghes_ref_count))
+		refcount_set(&ghes_ref_count, 1);
+	else
+		refcount_inc(&ghes_ref_count);
+
 	/* Handle any pending errors right away */
 	spin_lock_irqsave(&ghes_notify_lock_irq, flags);
 	ghes_proc(ghes);
@@ -1237,6 +1333,7 @@ static int ghes_remove(struct platform_device *ghes_dev)
 	int rc;
 	struct ghes *ghes;
 	struct acpi_hest_generic *generic;
+	struct ghes_error_notify *err_notify, *tmp;
 
 	ghes = platform_get_drvdata(ghes_dev);
 	generic = ghes->generic;
@@ -1279,6 +1376,17 @@ static int ghes_remove(struct platform_device *ghes_dev)
 
 	ghes_fini(ghes);
 
+	if (refcount_dec_and_test(&ghes_ref_count) &&
+	    !list_empty(&ghes_error_notify_list)) {
+		mutex_lock(&ghes_error_notify_mutex);
+		list_for_each_entry_safe(err_notify, tmp,
+					 &ghes_error_notify_list, list) {
+			list_del_rcu(&err_notify->list);
+			kfree_rcu(err_notify, rcu_head);
+		}
+		mutex_unlock(&ghes_error_notify_mutex);
+	}
+
 	ghes_edac_unregister(ghes);
 
 	kfree(ghes);
diff --git a/include/acpi/ghes.h b/include/acpi/ghes.h
index e3f1cdd..d480537 100644
--- a/include/acpi/ghes.h
+++ b/include/acpi/ghes.h
@@ -50,6 +50,53 @@ enum {
 	GHES_SEV_PANIC = 0x3,
 };
 
+/**
+ * error_handle - error handling function for the hw errors.
+ * This handle function is called in the interrupt context.
+ * @gdata: acpi_hest_generic_data.
+ * @sev: error severity of the entire error event defined in the
+ * ACPI spec table generic error status block.
+ * @data: handler driver's private data.
+ *
+ * return : none.
+ */
+typedef void (*error_handle)(struct acpi_hest_generic_data *gdata, int sev,
+			     void *data);
+
+#ifdef CONFIG_ACPI_APEI_GHES
+/**
+ * ghes_error_notify_register - register an error handling function
+ * for the hw errors.
+ * @sec_type: sec_type of the corresponding CPER to be notified.
+ * @handle: pointer to the error handling function.
+ * @data: handler driver's private data.
+ *
+ * return : 0 - SUCCESS, non-zero - FAIL.
+ */
+int ghes_error_notify_register(guid_t sec_type, error_handle handle,
+			       void *data);
+
+/**
+ * ghes_error_notify_unregister - unregister an error handling function
+ * for the hw errors.
+ * @sec_type: sec_type of the corresponding CPER.
+ * @handle: pointer to the error handling function.
+ *
+ * return none.
+ */
+void ghes_error_notify_unregister(guid_t sec_type, error_handle handle);
+
+#else
+static inline int ghes_error_notify_register(guid_t sec_type, error_handle handle, void *data)
+{
+	return -ENODEV;
+}
+
+static inline void ghes_error_notify_unregister(guid_t sec_type, error_handle handle)
+{
+}
+#endif
+
 int ghes_estatus_pool_init(int num_ghes);
 
 /* From drivers/edac/ghes_edac.c */

From patchwork Mon Feb 3 16:51:22 2020
X-Patchwork-Submitter: Shiju Jose
X-Patchwork-Id: 194433
From: Shiju Jose
Subject: [PATCH v3 2/2] PCI: HIP: Add handling of HiSilicon HIP PCIe controller's errors
Date: Mon, 3 Feb 2020 16:51:22 +0000
Message-ID: <20200203165122.17748-3-shiju.jose@huawei.com>
In-Reply-To: <20200203165122.17748-1-shiju.jose@huawei.com>
References: <20200203165122.17748-1-shiju.jose@huawei.com>
X-Mailing-List: linux-acpi@vger.kernel.org

From: Yicong Yang

The HiSilicon HIP PCIe controller is capable of handling errors
on the root port and of performing a port reset separately at each
root port.

This patch adds an error handling driver for the HIP PCIe controller
to log and report recoverable errors, perform a root port reset and
restore the link status after the recovery.

Following are some of the PCIe controller's recoverable errors:
1. completion transmission timeout error.
2. CRS retry counter over the threshold error.
3. ECC 2 bit errors
4. AXI bresponse/rresponse errors etc.
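
For illustration only (not part of the patch): the sketch below models, in plain user-space C, the recovery decision described above. Only errors of recoverable severity trigger a root port reset, and the root port number is split back into a core and a per-core port before it is handed to the reset method. The severity values and the three ID macros are copied from the driver in this patch; `struct hisi_reset_args`, `hisi_needs_port_reset()` and `hisi_make_reset_args()` are hypothetical names introduced here, and the `assert` on the port range is an assumption of this sketch rather than a documented hardware limit.

```c
#include <assert.h>
#include <stdint.h>

/* Severity values as defined in the patch. */
#define HISI_ERR_SEV_RECOVERABLE	0
#define HISI_ERR_SEV_FATAL		1
#define HISI_ERR_SEV_CORRECTED		2
#define HISI_ERR_SEV_NONE		3

/* ID helpers as defined in the patch. */
#define HISI_PCIE_CORE_ID(v)		((v) >> 3)
#define HISI_PCIE_PORT_ID(core, v)	(((v) >> 1) + ((core) << 3))
#define HISI_PCIE_CORE_PORT_ID(v)	(((v) % 8) << 1)

/* Hypothetical container for the three integers passed to the reset method. */
struct hisi_reset_args {
	uint32_t chip_id;
	uint32_t core_id;
	uint32_t core_port_id;
};

/* Only recoverable errors lead to a port reset; all others are just logged. */
static int hisi_needs_port_reset(uint8_t err_severity)
{
	return err_severity == HISI_ERR_SEV_RECOVERABLE;
}

/*
 * Mirror the data flow of the driver: the error record's core and port
 * IDs are folded into one root port number, which is then decomposed
 * again into the core ID and per-core port ID used for the reset call.
 */
static struct hisi_reset_args hisi_make_reset_args(uint8_t socket_id,
						    uint8_t core_id,
						    uint8_t port_id)
{
	struct hisi_reset_args args;
	uint32_t root_port;

	/* Sketch assumption: keeps the fold/unfold round trip within one core. */
	assert(port_id < 16);

	root_port = HISI_PCIE_PORT_ID((uint32_t)core_id, (uint32_t)port_id);

	args.chip_id = socket_id;
	args.core_id = HISI_PCIE_CORE_ID(root_port);
	args.core_port_id = HISI_PCIE_CORE_PORT_ID(root_port);
	return args;
}
```

A caller would check `hisi_needs_port_reset(err->err_severity)` first and build the reset arguments only when it returns non-zero, which matches the order of checks in hisi_pcie_handle_one_error().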

Signed-off-by: Yicong Yang
Signed-off-by: Shiju Jose
---
 drivers/pci/controller/Kconfig           |   8 +
 drivers/pci/controller/Makefile          |   1 +
 drivers/pci/controller/pcie-hisi-error.c | 336 +++++++++++++++++++++++++++++++
 3 files changed, 345 insertions(+)
 create mode 100644 drivers/pci/controller/pcie-hisi-error.c

---
 drivers/pci/controller/Kconfig           |   8 +
 drivers/pci/controller/Makefile          |   1 +
 drivers/pci/controller/pcie-hisi-error.c | 334 +++++++++++++++++++++++++++++++
 3 files changed, 343 insertions(+)
 create mode 100644 drivers/pci/controller/pcie-hisi-error.c

diff --git a/drivers/pci/controller/Kconfig b/drivers/pci/controller/Kconfig
index c77069c..5dad1ca 100644
--- a/drivers/pci/controller/Kconfig
+++ b/drivers/pci/controller/Kconfig
@@ -260,6 +260,14 @@ config PCI_HYPERV_INTERFACE
 	  The Hyper-V PCI Interface is a helper driver allows other drivers to
 	  have a common interface with the Hyper-V PCI frontend driver.
 
+config PCIE_HISI_ERR
+	depends on ARM64 || COMPILE_TEST
+	depends on ACPI
+	bool "HiSilicon HIP PCIe controller error handling driver"
+	help
+	  Say Y here if you want error handling support
+	  for the PCIe controller's errors on HiSilicon HIP SoCs
+
 source "drivers/pci/controller/dwc/Kconfig"
 source "drivers/pci/controller/cadence/Kconfig"
 endmenu
diff --git a/drivers/pci/controller/Makefile b/drivers/pci/controller/Makefile
index 3d4f597..2d1565f 100644
--- a/drivers/pci/controller/Makefile
+++ b/drivers/pci/controller/Makefile
@@ -28,6 +28,7 @@ obj-$(CONFIG_PCIE_MEDIATEK) += pcie-mediatek.o
 obj-$(CONFIG_PCIE_MOBIVEIL) += pcie-mobiveil.o
 obj-$(CONFIG_PCIE_TANGO_SMP8759) += pcie-tango.o
 obj-$(CONFIG_VMD) += vmd.o
+obj-$(CONFIG_PCIE_HISI_ERR) += pcie-hisi-error.o
 # pcie-hisi.o quirks are needed even without CONFIG_PCIE_DW
 obj-y				+= dwc/
 
diff --git a/drivers/pci/controller/pcie-hisi-error.c b/drivers/pci/controller/pcie-hisi-error.c
new file mode 100644
index 0000000..5b33a63
--- /dev/null
+++ b/drivers/pci/controller/pcie-hisi-error.c
@@ -0,0 +1,334 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Driver for handling the PCIe controller's errors on
+ * HiSilicon HIP SoCs.
+ *
+ * Copyright (c) 2018-2019 HiSilicon Limited.
+ */
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include "../pci.h"
+
+#define HISI_PCIE_ERR_RECOVER_RING_SIZE	16
+#define HISI_PCIE_ERR_INFO_SIZE	1024
+
+/* HISI PCIe controller's error definitions */
+#define HISI_PCIE_ERR_MISC_REGS	33
+
+#define HISI_PCIE_SUB_MODULE_ID_AP	0
+#define HISI_PCIE_SUB_MODULE_ID_TL	1
+#define HISI_PCIE_SUB_MODULE_ID_MAC	2
+#define HISI_PCIE_SUB_MODULE_ID_DL	3
+#define HISI_PCIE_SUB_MODULE_ID_SDI	4
+
+#define HISI_PCIE_LOCAL_VALID_VERSION		BIT(0)
+#define HISI_PCIE_LOCAL_VALID_SOC_ID		BIT(1)
+#define HISI_PCIE_LOCAL_VALID_SOCKET_ID		BIT(2)
+#define HISI_PCIE_LOCAL_VALID_NIMBUS_ID		BIT(3)
+#define HISI_PCIE_LOCAL_VALID_SUB_MODULE_ID	BIT(4)
+#define HISI_PCIE_LOCAL_VALID_CORE_ID		BIT(5)
+#define HISI_PCIE_LOCAL_VALID_PORT_ID		BIT(6)
+#define HISI_PCIE_LOCAL_VALID_ERR_TYPE		BIT(7)
+#define HISI_PCIE_LOCAL_VALID_ERR_SEVERITY	BIT(8)
+#define HISI_PCIE_LOCAL_VALID_ERR_MISC		9
+
+#define HISI_ERR_SEV_RECOVERABLE	0
+#define HISI_ERR_SEV_FATAL		1
+#define HISI_ERR_SEV_CORRECTED		2
+#define HISI_ERR_SEV_NONE		3
+
+static guid_t hisi_pcie_sec_type = GUID_INIT(0xB2889FC9, 0xE7D7, 0x4F9D,
+			0xA8, 0x67, 0xAF, 0x42, 0xE9, 0x8B, 0xE7, 0x72);
+
+#define HISI_PCIE_CORE_ID(v)             ((v) >> 3)
+#define HISI_PCIE_PORT_ID(core, v)       (((v) >> 1) + ((core) << 3))
+#define HISI_PCIE_CORE_PORT_ID(v)        (((v) % 8) << 1)
+
+struct hisi_pcie_err_data {
+	u64	val_bits;
+	u8	version;
+	u8	soc_id;
+	u8	socket_id;
+	u8	nimbus_id;
+	u8	sub_module_id;
+	u8	core_id;
+	u8	port_id;
+	u8	err_severity;
+	u16	err_type;
+	u8	reserv[2];
+	u32	err_misc[HISI_PCIE_ERR_MISC_REGS];
+};
+
+struct hisi_pcie_err_info {
+	struct hisi_pcie_err_data err_data;
+	struct platform_device *pdev;
+};
+
+static char *hisi_pcie_sub_module_name(u8 id)
+{
+	switch (id) {
+	case HISI_PCIE_SUB_MODULE_ID_AP: return "AP Layer";
+	case HISI_PCIE_SUB_MODULE_ID_TL: return "TL Layer";
+	case HISI_PCIE_SUB_MODULE_ID_MAC: return "MAC Layer";
+	case HISI_PCIE_SUB_MODULE_ID_DL: return "DL Layer";
+	case HISI_PCIE_SUB_MODULE_ID_SDI: return "SDI Layer";
+	}
+
+	return "unknown";
+}
+
+static char *hisi_pcie_err_severity(u8 err_sev)
+{
+	switch (err_sev) {
+	case HISI_ERR_SEV_RECOVERABLE: return "recoverable";
+	case HISI_ERR_SEV_FATAL: return "fatal";
+	case HISI_ERR_SEV_CORRECTED: return "corrected";
+	case HISI_ERR_SEV_NONE: return "none";
+	}
+
+	return "unknown";
+}
+
+static int hisi_pcie_port_reset(struct platform_device *pdev,
+				u32 chip_id, u32 port_id)
+{
+	struct device *dev = &pdev->dev;
+	acpi_handle handle = ACPI_HANDLE(dev);
+	union acpi_object arg[3];
+	struct acpi_object_list arg_list;
+	acpi_status s;
+	unsigned long long data = 0;
+
+	arg[0].type = ACPI_TYPE_INTEGER;
+	arg[0].integer.value = chip_id;
+	arg[1].type = ACPI_TYPE_INTEGER;
+	arg[1].integer.value = HISI_PCIE_CORE_ID(port_id);
+	arg[2].type = ACPI_TYPE_INTEGER;
+	arg[2].integer.value = HISI_PCIE_CORE_PORT_ID(port_id);
+
+	arg_list.count = 3;
+	arg_list.pointer = arg;
+
+	/* Call the ACPI handle to reset root port */
+	s = acpi_evaluate_integer(handle, "RST", &arg_list, &data);
+	if (ACPI_FAILURE(s)) {
+		dev_err(dev, "No RST method\n");
+		return -EIO;
+	}
+
+	if (data) {
+		dev_err(dev, "Failed to Reset\n");
+		return -EIO;
+	}
+
+	return 0;
+}
+
+static int hisi_pcie_port_do_recovery(struct platform_device *dev,
+				      u32 chip_id, u32 port_id)
+{
+	acpi_status s;
+	struct device *device = &dev->dev;
+	acpi_handle root_handle = ACPI_HANDLE(device);
+	struct acpi_pci_root *pci_root;
+	struct pci_bus *root_bus;
+	struct pci_dev *pdev;
+	u32 domain, busnr, devfn;
+
+	s = acpi_get_parent(root_handle, &root_handle);
+	if (ACPI_FAILURE(s))
+		return -ENODEV;
+	pci_root = acpi_pci_find_root(root_handle);
+	if (!pci_root)
+		return -ENODEV;
+	root_bus = pci_root->bus;
+	domain = pci_root->segment;
+
+	busnr = root_bus->number;
+	devfn = PCI_DEVFN(port_id, 0);
+	pdev = pci_get_domain_bus_and_slot(domain, busnr, devfn);
+	if (!pdev) {
+		dev_info(device, "Fail to get root port %04x:%02x:%02x.%d device\n",
+			 domain, busnr, PCI_SLOT(devfn), PCI_FUNC(devfn));
+		return -ENODEV;
+	}
+
+	pci_stop_and_remove_bus_device_locked(pdev);
+	pci_dev_put(pdev);
+
+	if (hisi_pcie_port_reset(dev, chip_id, port_id))
+		return -EIO;
+
+	/*
+	 * The initialization time of subordinate devices after
+	 * hot reset is no more than 1s, which is required by
+	 * the PCI spec v5.0 sec 6.6.1. The time will shorten
+	 * if Readiness Notifications mechanisms are used. But
+	 * wait 1s here to adapt any conditions.
+	 */
+	ssleep(1UL);
+
+	/* add root port and downstream devices */
+	pci_lock_rescan_remove();
+	pci_rescan_bus(root_bus);
+	pci_unlock_rescan_remove();
+
+	return 0;
+}
+
+static void hisi_pcie_handle_one_error(const struct hisi_pcie_err_data *err,
+				       struct platform_device *pdev)
+{
+	char buf[HISI_PCIE_ERR_INFO_SIZE];
+	char *p = buf, *end = buf + sizeof(buf);
+	struct device *dev = &pdev->dev;
+	u32 i;
+	int rc;
+
+	if (err->val_bits == 0) {
+		dev_warn(dev, "%s: no valid error information\n", __func__);
+		return;
+	}
+
+	/* Logging */
+	p += snprintf(p, end - p, "[ Table version=%d ", err->version);
+	if (err->val_bits & HISI_PCIE_LOCAL_VALID_SOC_ID)
+		p += snprintf(p, end - p, "SOC ID=%d ", err->soc_id);
+
+	if (err->val_bits & HISI_PCIE_LOCAL_VALID_SOCKET_ID)
+		p += snprintf(p, end - p, "socket ID=%d ", err->socket_id);
+
+	if (err->val_bits & HISI_PCIE_LOCAL_VALID_NIMBUS_ID)
+		p += snprintf(p, end - p, "nimbus ID=%d ", err->nimbus_id);
+
+	if (err->val_bits & HISI_PCIE_LOCAL_VALID_SUB_MODULE_ID)
+		p += snprintf(p, end - p, "sub module=%s ",
+			      hisi_pcie_sub_module_name(err->sub_module_id));
+
+	if (err->val_bits & HISI_PCIE_LOCAL_VALID_CORE_ID)
+		p += snprintf(p, end - p, "core ID=core%d ", err->core_id);
+
+	if (err->val_bits & HISI_PCIE_LOCAL_VALID_PORT_ID)
+		p += snprintf(p, end - p, "port ID=port%d ", err->port_id);
+
+	if (err->val_bits & HISI_PCIE_LOCAL_VALID_ERR_SEVERITY)
+		p += snprintf(p, end - p, "error severity=%s ",
+			      hisi_pcie_err_severity(err->err_severity));
+
+	if (err->val_bits & HISI_PCIE_LOCAL_VALID_ERR_TYPE)
+		p += snprintf(p, end - p, "error type=0x%x ", err->err_type);
+
+	p += snprintf(p, end - p, "]\n");
+	dev_info(dev, "\nHISI : HIP : PCIe controller error\n");
+	dev_info(dev, "%s\n", buf);
+
+	dev_info(dev, "Reg Dump:\n");
+	for (i = 0; i < HISI_PCIE_ERR_MISC_REGS; i++) {
+		if (err->val_bits & BIT(HISI_PCIE_LOCAL_VALID_ERR_MISC + i))
+			dev_info(dev,
+				 "ERR_MISC_%d=0x%x\n", i, err->err_misc[i]);
+	}
+
+	/* Recovery for the PCIe controller's errors */
+	if (err->err_severity == HISI_ERR_SEV_RECOVERABLE) {
+		/* try reset PCI port for the error recovery */
+		rc = hisi_pcie_port_do_recovery(pdev, err->socket_id,
+				HISI_PCIE_PORT_ID(err->core_id, err->port_id));
+		if (rc) {
+			dev_info(dev, "fail to do hisi pcie port reset\n");
+			return;
+		}
+	}
+}
+
+static DEFINE_KFIFO(hisi_pcie_err_recover_ring, struct hisi_pcie_err_info,
+		    HISI_PCIE_ERR_RECOVER_RING_SIZE);
+static DEFINE_SPINLOCK(hisi_pcie_err_recover_ring_lock);
+
+static void hisi_pcie_err_recover_work_func(struct work_struct *work)
+{
+	struct hisi_pcie_err_info pcie_err_entry;
+
+	while (kfifo_get(&hisi_pcie_err_recover_ring, &pcie_err_entry)) {
+		hisi_pcie_handle_one_error(&pcie_err_entry.err_data,
+					   pcie_err_entry.pdev);
+	}
+}
+
+static DECLARE_WORK(hisi_pcie_err_recover_work,
+		    hisi_pcie_err_recover_work_func);
+
+static int hisi_pcie_error_handle(struct acpi_hest_generic_data *gdata,
+				  int sev, void *data)
+{
+	const struct hisi_pcie_err_data *err_data =
+				acpi_hest_get_payload(gdata);
+	struct hisi_pcie_err_info err_info;
+	struct platform_device *pdev = data;
+	struct device *dev = &pdev->dev;
+	u8 socket;
+
+	if (device_property_read_u8(dev, "socket", &socket))
+		return GHES_EVENT_NONE;
+
+	if (err_data->socket_id != socket)
+		return GHES_EVENT_NONE;
+
+	memcpy(&err_info.err_data, err_data, sizeof(*err_data));
+	err_info.pdev = pdev;
+
+	if (kfifo_in_spinlocked(&hisi_pcie_err_recover_ring, &err_info, 1,
+				&hisi_pcie_err_recover_ring_lock))
+		schedule_work(&hisi_pcie_err_recover_work);
+	else
+		dev_warn(dev, "queue full when recovering PCIe controller's error\n");
+
+	return GHES_EVENT_HANDLED;
+}
+
+static int hisi_pcie_err_handler_probe(struct platform_device *pdev)
+{
+	int ret;
+
+	ret = ghes_register_event_handler(hisi_pcie_sec_type,
+					  hisi_pcie_error_handle, pdev);
+	if (ret) {
+		dev_err(&pdev->dev, "%s : ghes_register_event_handler fail\n",
+			__func__);
+		return ret;
+	}
+
+	return 0;
+}
+
+static int hisi_pcie_err_handler_remove(struct platform_device *pdev)
+{
+	ghes_unregister_event_handler(hisi_pcie_sec_type, pdev);
+
+	return 0;
+}
+
+static const struct acpi_device_id hisi_pcie_acpi_match[] = {
+	{ "HISI0361", 0 },
+	{ }
+};
+
+static struct platform_driver hisi_pcie_err_handler_driver = {
+	.driver = {
+		.name	= "hisi-pcie-err-handler",
+		.acpi_match_table = hisi_pcie_acpi_match,
+	},
+	.probe		= hisi_pcie_err_handler_probe,
+	.remove		= hisi_pcie_err_handler_remove,
+};
+module_platform_driver(hisi_pcie_err_handler_driver);
+
+MODULE_DESCRIPTION("HiSilicon HIP PCIe controller's error handling driver");
+MODULE_LICENSE("GPL v2");

From patchwork Mon Aug 12 10:11:49 2019
X-Patchwork-Submitter: Shiju Jose
X-Patchwork-Id: 171055
From: Shiju Jose
Subject: [PATCH RFC 4/4] ACPI: APEI: Add log_arm_hw_error to the new notification method
Date: Mon, 12 Aug 2019 11:11:49 +0100
Message-ID: <20190812101149.26036-5-shiju.jose@huawei.com>
In-Reply-To: <20190812101149.26036-1-shiju.jose@huawei.com>
References: <20190812101149.26036-1-shiju.jose@huawei.com>
X-Mailing-List: linux-acpi@vger.kernel.org

This patch adds log_arm_hw_error to the new error notification method.
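
For illustration only (not part of the patch): the sketch below models, in plain user-space C, the table-driven dispatch that log_arm_hw_error is being added to. Each entry pairs a section type GUID with a handler, the table ends with a sentinel entry, and the caller falls back to generic logging when no handler matched. `guid_model_t`, `struct handler_entry`, `dispatch_section()` and `count_calls()` are hypothetical stand-ins for the kernel's guid_t, the handler table, the loop in ghes_do_proc() and a real handler; the handler signature follows the `error_handle` typedef from this series, with `gdata` kept opaque.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* User-space stand-in for the kernel's guid_t: 16 raw bytes. */
typedef struct {
	unsigned char b[16];
} guid_model_t;

/* Same shape as the series' error_handle typedef; gdata is opaque here. */
typedef void (*error_handle_model)(const void *gdata, int sev, void *data);

/* One table entry; a NULL handle marks the sentinel that ends the table. */
struct handler_entry {
	guid_model_t sec_type;
	error_handle_model handle;
};

static int guid_model_equal(const guid_model_t *a, const guid_model_t *b)
{
	return memcmp(a->b, b->b, sizeof(a->b)) == 0;
}

/*
 * Walk the sentinel-terminated table and call every handler whose
 * section type matches. Returns the number of handlers called, so a
 * return value of 0 tells the caller to fall back to generic logging.
 */
static int dispatch_section(const struct handler_entry *tab,
			    const guid_model_t *sec_type,
			    const void *gdata, int sev, void *data)
{
	int notified = 0;

	assert(tab != NULL && sec_type != NULL);

	for (; tab->handle != NULL; tab++) {
		if (guid_model_equal(&tab->sec_type, sec_type)) {
			tab->handle(gdata, sev, data);
			notified++;
		}
	}

	return notified;
}

/* Example handler: counts how often it was invoked via its private data. */
static void count_calls(const void *gdata, int sev, void *data)
{
	(void)gdata;
	(void)sev;
	++*(int *)data;
}
```

With this shape, supporting a new section type only means adding one table entry before the sentinel, which is what the hunk for CPER_SEC_PROC_ARM does.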

Signed-off-by: Shiju Jose
---
 drivers/acpi/apei/ghes.c | 47 ++++++++++++++++++++++-------------------------
 drivers/ras/ras.c        |  5 ++++-
 include/linux/ras.h      |  7 +++++--
 3 files changed, 31 insertions(+), 28 deletions(-)

--
1.9.1

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index ffc309c..013fea0 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -574,34 +574,27 @@ static void ghes_do_proc(struct ghes *ghes,
 		if (gdata->validation_bits & CPER_SEC_VALID_FRU_TEXT)
 			fru_text = gdata->fru_text;
 
-		if (guid_equal(sec_type, &CPER_SEC_PROC_ARM)) {
-			struct cper_sec_proc_arm *err = acpi_hest_get_payload(gdata);
-
-			log_arm_hw_error(err);
-		} else {
-			rcu_read_lock();
-			list_for_each_entry_rcu(err_notify,
-						&ghes_error_notify_list, list) {
-				if (guid_equal(&err_notify->sec_type,
-					       sec_type)) {
-					/* The notification is called in the
-					 * interrupt context, thus the handler
-					 * functions should take care of it.
-					 */
-					err_notify->handle(gdata, sev,
-							   err_notify->data);
-					is_notify = 1;
-				}
+		rcu_read_lock();
+		list_for_each_entry_rcu(err_notify, &ghes_error_notify_list,
+					list) {
+			if (guid_equal(&err_notify->sec_type, sec_type)) {
+				/* The notification is called in the
+				 * interrupt context, thus the handler
+				 * functions should take care of it.
+				 */
+				err_notify->handle(gdata, sev,
+						   err_notify->data);
+				is_notify = 1;
 			}
-			rcu_read_unlock();
+		}
+		rcu_read_unlock();
 
-			if (!is_notify) {
-				void *err = acpi_hest_get_payload(gdata);
+		if (!is_notify) {
+			void *err = acpi_hest_get_payload(gdata);
 
-				log_non_standard_event(sec_type, fru_id,
-						       fru_text, sec_sev, err,
-						       gdata->error_data_length);
-			}
+			log_non_standard_event(sec_type, fru_id,
+					       fru_text, sec_sev, err,
+					       gdata->error_data_length);
 		}
 	}
 }
@@ -1198,6 +1191,10 @@ struct ghes_err_handler_tab {
 		.sec_type = CPER_SEC_PCIE,
 		.handle = ghes_handle_aer,
 	},
+	{
+		.sec_type = CPER_SEC_PROC_ARM,
+		.handle = log_arm_hw_error,
+	},
 	{ /* sentinel */ }
 };
 
diff --git a/drivers/ras/ras.c b/drivers/ras/ras.c
index 95540ea..7ec3eeb 100644
--- a/drivers/ras/ras.c
+++ b/drivers/ras/ras.c
@@ -21,8 +21,11 @@ void log_non_standard_event(const guid_t *sec_type, const guid_t *fru_id,
 	trace_non_standard_event(sec_type, fru_id, fru_text, sev, err, len);
 }
 
-void log_arm_hw_error(struct cper_sec_proc_arm *err)
+void log_arm_hw_error(struct acpi_hest_generic_data *gdata,
+		      int sev, void *data)
 {
+	struct cper_sec_proc_arm *err = acpi_hest_get_payload(gdata);
+
 	trace_arm_event(err);
 }
 
diff --git a/include/linux/ras.h b/include/linux/ras.h
index 7c3debb..05b662d 100644
--- a/include/linux/ras.h
+++ b/include/linux/ras.h
@@ -5,6 +5,7 @@
 #include
 #include
 #include
+#include
 
 #ifdef CONFIG_DEBUG_FS
 int ras_userspace_consumers(void);
@@ -29,7 +30,8 @@ static inline void __init cec_init(void) { }
 void log_non_standard_event(const guid_t *sec_type, const guid_t *fru_id,
 			    const char *fru_text, const u8 sev, const u8 *err,
 			    const u32 len);
-void log_arm_hw_error(struct cper_sec_proc_arm *err);
+void log_arm_hw_error(struct acpi_hest_generic_data *gdata,
+		      int sev, void *data);
 #else
 static inline void
 log_non_standard_event(const guid_t *sec_type,
@@ -37,7 +39,8 @@ void log_non_standard_event(const guid_t *sec_type,
 		       const u8 sev, const u8 *err, const u32 len)
 { return; }
 static inline void
-log_arm_hw_error(struct cper_sec_proc_arm *err) { return; } +log_arm_hw_error(struct acpi_hest_generic_data *gdata, + int sev, void *data) { return; } #endif #endif /* __RAS_H__ */ From patchwork Tue Apr 21 13:21:36 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Shiju Jose X-Patchwork-Id: 194274 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-9.8 required=3.0 tests=HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8B722C55183 for ; Tue, 21 Apr 2020 13:24:21 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 6D3D02087E for ; Tue, 21 Apr 2020 13:24:21 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729130AbgDUNYR (ORCPT ); Tue, 21 Apr 2020 09:24:17 -0400 Received: from szxga04-in.huawei.com ([45.249.212.190]:2862 "EHLO huawei.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1729116AbgDUNYJ (ORCPT ); Tue, 21 Apr 2020 09:24:09 -0400 Received: from DGGEMS403-HUB.china.huawei.com (unknown [172.30.72.59]) by Forcepoint Email with ESMTP id 60FDE7DE4715E0B517C6; Tue, 21 Apr 2020 21:24:06 +0800 (CST) Received: from DESKTOP-6T4S3DQ.china.huawei.com (10.47.83.77) by DGGEMS403-HUB.china.huawei.com (10.3.19.203) with Microsoft SMTP Server id 14.3.487.0; Tue, 21 Apr 2020 21:23:56 +0800 From: Shiju Jose To: , , , , , , , , , , , , CC: , , , , Shiju Jose Subject: [RESEND PATCH v7 6/6] PCI: hip: Add handling of HiSilicon HIP PCIe controller errors Date: Tue, 21 Apr 2020 14:21:36 +0100 Message-ID: 
<20200421132136.1595-7-shiju.jose@huawei.com> X-Mailer: git-send-email 2.26.0.windows.1 In-Reply-To: <20200421132136.1595-1-shiju.jose@huawei.com> References: <20200421132136.1595-1-shiju.jose@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.47.83.77] X-CFilter-Loop: Reflected Sender: linux-acpi-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-acpi@vger.kernel.org From: Yicong Yang The HiSilicon HIP PCIe controller can handle errors on a root port and perform a port reset separately at each root port. Add an error handling driver for the HIP PCIe controller to log and report recoverable errors. Perform a root port reset and restore the link status after the recovery. The following are some of the PCIe controller's recoverable errors: 1. Completion transmission timeout error. 2. CRS retry counter over the threshold error. 3. ECC 2-bit errors. 4. AXI bresponse/rresponse errors, etc. The driver is placed in drivers/pci/controller/ because the HIP PCIe controller does not use the DWC IP. Signed-off-by: Yicong Yang Signed-off-by: Shiju Jose --- drivers/pci/controller/Kconfig | 8 + drivers/pci/controller/Makefile | 1 + drivers/pci/controller/pcie-hisi-error.c | 323 +++++++++++++++++++++++ 3 files changed, 332 insertions(+) create mode 100644 drivers/pci/controller/pcie-hisi-error.c diff --git a/drivers/pci/controller/Kconfig b/drivers/pci/controller/Kconfig index 20bf00f587bd..8bc6111480c8 100644 --- a/drivers/pci/controller/Kconfig +++ b/drivers/pci/controller/Kconfig @@ -268,6 +268,14 @@ config PCI_HYPERV_INTERFACE The Hyper-V PCI Interface is a helper driver allows other drivers to have a common interface with the Hyper-V PCI frontend driver.
+config PCIE_HISI_ERR + depends on ARM64 || COMPILE_TEST + depends on ACPI + bool "HiSilicon HIP PCIe controller error handling driver" + help + Say Y here if you want error handling support + for the PCIe controller's errors on HiSilicon HIP SoCs + source "drivers/pci/controller/dwc/Kconfig" source "drivers/pci/controller/cadence/Kconfig" endmenu diff --git a/drivers/pci/controller/Makefile b/drivers/pci/controller/Makefile index 01b2502a5323..94f37b3d9929 100644 --- a/drivers/pci/controller/Makefile +++ b/drivers/pci/controller/Makefile @@ -29,6 +29,7 @@ obj-$(CONFIG_PCIE_MOBIVEIL) += pcie-mobiveil.o obj-$(CONFIG_PCIE_TANGO_SMP8759) += pcie-tango.o obj-$(CONFIG_VMD) += vmd.o obj-$(CONFIG_PCIE_BRCMSTB) += pcie-brcmstb.o +obj-$(CONFIG_PCIE_HISI_ERR) += pcie-hisi-error.o # pcie-hisi.o quirks are needed even without CONFIG_PCIE_DW obj-y += dwc/ diff --git a/drivers/pci/controller/pcie-hisi-error.c b/drivers/pci/controller/pcie-hisi-error.c new file mode 100644 index 000000000000..cc721070e07b --- /dev/null +++ b/drivers/pci/controller/pcie-hisi-error.c @@ -0,0 +1,323 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Driver for handling the PCIe controller errors on + * HiSilicon HIP SoCs. + * + * Copyright (c) 2018-2019 HiSilicon Limited. 
+ */ + +#include +#include +#include +#include +#include +#include +#include + +#define HISI_PCIE_ERR_INFO_SIZE 1024 + +/* HISI PCIe controller error definitions */ +#define HISI_PCIE_ERR_MISC_REGS 33 + +#define HISI_PCIE_SUB_MODULE_ID_AP 0 +#define HISI_PCIE_SUB_MODULE_ID_TL 1 +#define HISI_PCIE_SUB_MODULE_ID_MAC 2 +#define HISI_PCIE_SUB_MODULE_ID_DL 3 +#define HISI_PCIE_SUB_MODULE_ID_SDI 4 + +#define HISI_PCIE_LOCAL_VALID_VERSION BIT(0) +#define HISI_PCIE_LOCAL_VALID_SOC_ID BIT(1) +#define HISI_PCIE_LOCAL_VALID_SOCKET_ID BIT(2) +#define HISI_PCIE_LOCAL_VALID_NIMBUS_ID BIT(3) +#define HISI_PCIE_LOCAL_VALID_SUB_MODULE_ID BIT(4) +#define HISI_PCIE_LOCAL_VALID_CORE_ID BIT(5) +#define HISI_PCIE_LOCAL_VALID_PORT_ID BIT(6) +#define HISI_PCIE_LOCAL_VALID_ERR_TYPE BIT(7) +#define HISI_PCIE_LOCAL_VALID_ERR_SEVERITY BIT(8) +#define HISI_PCIE_LOCAL_VALID_ERR_MISC 9 + +#define HISI_ERR_SEV_RECOVERABLE 0 +#define HISI_ERR_SEV_FATAL 1 +#define HISI_ERR_SEV_CORRECTED 2 +#define HISI_ERR_SEV_NONE 3 + +static guid_t hisi_pcie_sec_type = GUID_INIT(0xB2889FC9, 0xE7D7, 0x4F9D, + 0xA8, 0x67, 0xAF, 0x42, 0xE9, 0x8B, 0xE7, 0x72); + +#define HISI_PCIE_CORE_ID(v) ((v) >> 3) +#define HISI_PCIE_PORT_ID(core, v) (((v) >> 1) + ((core) << 3)) +#define HISI_PCIE_CORE_PORT_ID(v) (((v) % 8) << 1) + +struct hisi_pcie_error_data { + u64 val_bits; + u8 version; + u8 soc_id; + u8 socket_id; + u8 nimbus_id; + u8 sub_module_id; + u8 core_id; + u8 port_id; + u8 err_severity; + u16 err_type; + u8 reserv[2]; + u32 err_misc[HISI_PCIE_ERR_MISC_REGS]; +}; + +struct hisi_pcie_error_private { + struct notifier_block nb; + struct platform_device *pdev; +}; + +static char *hisi_pcie_sub_module_name(u8 id) +{ + switch (id) { + case HISI_PCIE_SUB_MODULE_ID_AP: return "AP Layer"; + case HISI_PCIE_SUB_MODULE_ID_TL: return "TL Layer"; + case HISI_PCIE_SUB_MODULE_ID_MAC: return "MAC Layer"; + case HISI_PCIE_SUB_MODULE_ID_DL: return "DL Layer"; + case HISI_PCIE_SUB_MODULE_ID_SDI: return "SDI Layer"; + } + + return 
"unknown"; +} + +static char *hisi_pcie_error_severity(u8 err_sev) +{ + switch (err_sev) { + case HISI_ERR_SEV_RECOVERABLE: return "recoverable"; + case HISI_ERR_SEV_FATAL: return "fatal"; + case HISI_ERR_SEV_CORRECTED: return "corrected"; + case HISI_ERR_SEV_NONE: return "none"; + } + + return "unknown"; +} + +static int hisi_pcie_port_reset(struct platform_device *pdev, + u32 chip_id, u32 port_id) +{ + struct device *dev = &pdev->dev; + acpi_handle handle = ACPI_HANDLE(dev); + union acpi_object arg[3]; + struct acpi_object_list arg_list; + acpi_status s; + unsigned long long data = 0; + + arg[0].type = ACPI_TYPE_INTEGER; + arg[0].integer.value = chip_id; + arg[1].type = ACPI_TYPE_INTEGER; + arg[1].integer.value = HISI_PCIE_CORE_ID(port_id); + arg[2].type = ACPI_TYPE_INTEGER; + arg[2].integer.value = HISI_PCIE_CORE_PORT_ID(port_id); + + arg_list.count = 3; + arg_list.pointer = arg; + + s = acpi_evaluate_integer(handle, "RST", &arg_list, &data); + if (ACPI_FAILURE(s)) { + dev_err(dev, "No RST method\n"); + return -EIO; + } + + if (data) { + dev_err(dev, "Failed to Reset\n"); + return -EIO; + } + + return 0; +} + +static int hisi_pcie_port_do_recovery(struct platform_device *dev, + u32 chip_id, u32 port_id) +{ + acpi_status s; + struct device *device = &dev->dev; + acpi_handle root_handle = ACPI_HANDLE(device); + struct acpi_pci_root *pci_root; + struct pci_bus *root_bus; + struct pci_dev *pdev; + u32 domain, busnr, devfn; + + s = acpi_get_parent(root_handle, &root_handle); + if (ACPI_FAILURE(s)) + return -ENODEV; + pci_root = acpi_pci_find_root(root_handle); + if (!pci_root) + return -ENODEV; + root_bus = pci_root->bus; + domain = pci_root->segment; + + busnr = root_bus->number; + devfn = PCI_DEVFN(port_id, 0); + pdev = pci_get_domain_bus_and_slot(domain, busnr, devfn); + if (!pdev) { + dev_info(device, "Fail to get root port %04x:%02x:%02x.%d device\n", + domain, busnr, PCI_SLOT(devfn), PCI_FUNC(devfn)); + return -ENODEV; + } + + 
pci_stop_and_remove_bus_device_locked(pdev); + pci_dev_put(pdev); + + if (hisi_pcie_port_reset(dev, chip_id, port_id)) + return -EIO; + + /* + * The initialization time of subordinate devices after + * hot reset is no more than 1s, as required by the + * PCI spec v5.0 sec 6.6.1. The time is shorter if the + * Readiness Notifications mechanism is used, but wait + * the full 1s here to cover all cases. + */ + ssleep(1UL); + + /* add root port and downstream devices */ + pci_lock_rescan_remove(); + pci_rescan_bus(root_bus); + pci_unlock_rescan_remove(); + + return 0; +} + +static void hisi_pcie_handle_error(const struct hisi_pcie_error_data *error, + struct platform_device *pdev) +{ + char buf[HISI_PCIE_ERR_INFO_SIZE]; + char *p = buf, *end = buf + sizeof(buf); + struct device *dev = &pdev->dev; + u32 i; + int rc; + + if (error->val_bits == 0) { + dev_warn(dev, "%s: no valid error information\n", __func__); + return; + } + + /* Logging */ + p += snprintf(p, end - p, "[ Table version=%d ", error->version); + if (error->val_bits & HISI_PCIE_LOCAL_VALID_SOC_ID) + p += snprintf(p, end - p, "SOC ID=%d ", error->soc_id); + + if (error->val_bits & HISI_PCIE_LOCAL_VALID_SOCKET_ID) + p += snprintf(p, end - p, "socket ID=%d ", error->socket_id); + + if (error->val_bits & HISI_PCIE_LOCAL_VALID_NIMBUS_ID) + p += snprintf(p, end - p, "nimbus ID=%d ", error->nimbus_id); + + if (error->val_bits & HISI_PCIE_LOCAL_VALID_SUB_MODULE_ID) + p += snprintf(p, end - p, "sub module=%s ", + hisi_pcie_sub_module_name(error->sub_module_id)); + + if (error->val_bits & HISI_PCIE_LOCAL_VALID_CORE_ID) + p += snprintf(p, end - p, "core ID=core%d ", error->core_id); + + if (error->val_bits & HISI_PCIE_LOCAL_VALID_PORT_ID) + p += snprintf(p, end - p, "port ID=port%d ", error->port_id); + + if (error->val_bits & HISI_PCIE_LOCAL_VALID_ERR_SEVERITY) + p += snprintf(p, end - p, "error severity=%s ", + hisi_pcie_error_severity(error->err_severity)); + + if (error->val_bits & 
HISI_PCIE_LOCAL_VALID_ERR_TYPE) + p += snprintf(p, end - p, "error type=0x%x ", error->err_type); + + p += snprintf(p, end - p, "]\n"); + dev_info(dev, "\nHISI : HIP : PCIe controller error\n"); + dev_info(dev, "%s\n", buf); + + dev_info(dev, "Reg Dump:\n"); + for (i = 0; i < HISI_PCIE_ERR_MISC_REGS; i++) { + if (error->val_bits & + BIT_ULL(HISI_PCIE_LOCAL_VALID_ERR_MISC + i)) + dev_info(dev, + "ERR_MISC_%d=0x%x\n", i, error->err_misc[i]); + } + + /* Recovery for the PCIe controller errors */ + if (error->err_severity == HISI_ERR_SEV_RECOVERABLE) { + /* try reset PCI port for the error recovery */ + rc = hisi_pcie_port_do_recovery(pdev, error->socket_id, + HISI_PCIE_PORT_ID(error->core_id, error->port_id)); + if (rc) { + dev_info(dev, "fail to do hisi pcie port reset\n"); + return; + } + } +} + +static int hisi_pcie_notify_error(struct notifier_block *nb, + unsigned long event, void *data) +{ + struct acpi_hest_generic_data *gdata = data; + const struct hisi_pcie_error_data *error_data = + acpi_hest_get_payload(gdata); + struct hisi_pcie_error_private *priv = + container_of(nb, struct hisi_pcie_error_private, nb); + struct platform_device *pdev = priv->pdev; + struct device *dev = &pdev->dev; + u8 socket; + + if (device_property_read_u8(dev, "socket", &socket)) + return NOTIFY_DONE; + + if (!guid_equal((guid_t *)gdata->section_type, &hisi_pcie_sec_type) || + error_data->socket_id != socket) + return NOTIFY_DONE; + + hisi_pcie_handle_error(error_data, pdev); + + return NOTIFY_OK; +} + +static int hisi_pcie_error_handler_probe(struct platform_device *pdev) +{ + struct hisi_pcie_error_private *priv; + int ret; + + priv = devm_kzalloc(&pdev->dev, sizeof(*priv), GFP_KERNEL); + if (!priv) + return -ENOMEM; + + priv->nb.notifier_call = hisi_pcie_notify_error; + priv->pdev = pdev; + ret = ghes_register_event_notifier(&priv->nb); + if (ret) { + dev_err(&pdev->dev, "%s : ghes_register_event_notifier fail\n", + __func__); + return ret; + } + + platform_set_drvdata(pdev, 
priv); + + return 0; +} + +static int hisi_pcie_error_handler_remove(struct platform_device *pdev) +{ + struct hisi_pcie_error_private *priv = platform_get_drvdata(pdev); + + if (priv) + ghes_unregister_event_notifier(&priv->nb); + + /* priv is devm-managed (devm_kzalloc() in probe); no kfree() here */ + + return 0; +} + +static const struct acpi_device_id hisi_pcie_acpi_match[] = { + { "HISI0361", 0 }, + { } +}; + +static struct platform_driver hisi_pcie_error_handler_driver = { + .driver = { + .name = "hisi-pcie-error-handler", + .acpi_match_table = hisi_pcie_acpi_match, + }, + .probe = hisi_pcie_error_handler_probe, + .remove = hisi_pcie_error_handler_remove, +}; +module_platform_driver(hisi_pcie_error_handler_driver); + +MODULE_DESCRIPTION("HiSilicon HIP PCIe controller error handling driver"); +MODULE_LICENSE("GPL v2");
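
A note on the port numbering used by the driver above, as a userspace sketch rather than kernel code. hisi_pcie_handle_error() flattens the reported (core_id, port_id) pair with HISI_PCIE_PORT_ID(), hisi_pcie_port_do_recovery() uses that flat value as the PCI slot number of the root port, and hisi_pcie_port_reset() splits it back into a core and a per-core port for the ACPI "RST" method. The three macros are copied from the patch. The struct rst_args type and the helpers flat_port_id(), make_rst_args() and check_round_trip() are hypothetical names introduced only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* The three macros below are copied verbatim from the patch. */
#define HISI_PCIE_CORE_ID(v)             ((v) >> 3)
#define HISI_PCIE_PORT_ID(core, v)       (((v) >> 1) + ((core) << 3))
#define HISI_PCIE_CORE_PORT_ID(v)        (((v) % 8) << 1)

/* Hypothetical container for the three integers passed to "RST". */
struct rst_args {
	uint32_t chip_id;
	uint32_t core_id;
	uint32_t core_port_id;
};

/* Hypothetical helper: the flattening done before recovery is started. */
static uint32_t flat_port_id(uint8_t core_id, uint8_t port_id)
{
	return HISI_PCIE_PORT_ID((uint32_t)core_id, (uint32_t)port_id);
}

/* Hypothetical helper: mirrors how the reset path fills arg[0..2]. */
static struct rst_args make_rst_args(uint32_t chip_id, uint32_t flat)
{
	struct rst_args a = {
		.chip_id = chip_id,
		.core_id = HISI_PCIE_CORE_ID(flat),
		.core_port_id = HISI_PCIE_CORE_PORT_ID(flat),
	};

	return a;
}

/* Hypothetical self-check: core 1, port 4 survives the round trip. */
static void check_round_trip(void)
{
	uint32_t flat = flat_port_id(1, 4);
	struct rst_args a = make_rst_args(0, flat);

	assert(flat == 10);
	assert(a.core_id == 1);
	assert(a.core_port_id == 4);
}
```

With these macros, core 1 / port 4 flattens to 10 and splits back into core 1 / port 4. An odd port value such as 5 also flattens to 10, because the right shift drops the low bit. Whether the firmware only ever reports even port values is not stated in the patch, so treat that as an open question rather than a guarantee.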