From patchwork Tue Aug 20 14:47:28 2019
X-Patchwork-Submitter: Jonathan Cameron
X-Patchwork-Id: 171808
From: Jonathan Cameron
To: Borislav Petkov, "Mauro Carvalho Chehab"
CC: Jonathan Cameron
Subject: [PATCH 2/6 V2] efi / ras: CCIX Cache error reporting
Date: Tue, 20 Aug 2019 22:47:28 +0800
Message-ID: <20190820144732.2370-3-Jonathan.Cameron@huawei.com>
In-Reply-To: <20190820144732.2370-1-Jonathan.Cameron@huawei.com>
References: <20190820144732.2370-1-Jonathan.Cameron@huawei.com>

As CCIX Request Agents have fully cache coherent caches, the CCIX 1.0 Base Specification defines detailed error reporting for these caches.
A CCIX cache error is reported via a CPER record as defined in the UEFI 2.8 specification. The PER log section is defined in the CCIX 1.0 Base Specification. Signed-off-by: Jonathan Cameron --- Changes since v1. Drop printing of vendor data to kernel log. drivers/acpi/apei/ghes.c | 4 + drivers/firmware/efi/cper-ccix.c | 156 +++++++++++++++++++++++++++++++ include/linux/cper.h | 57 +++++++++++ include/ras/ras_event.h | 66 +++++++++++++ 4 files changed, 283 insertions(+) -- 2.20.1 diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c index 1ae36d836a77..5bda94e48b1b 100644 --- a/drivers/acpi/apei/ghes.c +++ b/drivers/acpi/apei/ghes.c @@ -504,6 +504,10 @@ static void ghes_handle_ccix_per(struct acpi_hest_generic_data *gdata, int sev) trace_ccix_memory_error_event(payload, err_seq, sev, ccix_mem_err_ven_len_get(payload)); break; + case CCIX_CACHE_ERROR: + trace_ccix_cache_error_event(payload, err_seq, sev, + ccix_cache_err_ven_len_get(payload)); + break; default: /* Unknown error type */ pr_info("CCIX error of unknown or vendor defined type\n"); diff --git a/drivers/firmware/efi/cper-ccix.c b/drivers/firmware/efi/cper-ccix.c index 0be2630b2712..fa3fafac402b 100644 --- a/drivers/firmware/efi/cper-ccix.c +++ b/drivers/firmware/efi/cper-ccix.c @@ -273,6 +273,96 @@ static int cper_ccix_mem_err_details(const char *pfx, return 0; } +static const char * const ccix_cache_type_strs[] = { + "Instruction Cache", + "Data Cache", + "Generic / Unified Cache", + "Snoop Filter Directory", +}; + +static const char *cper_ccix_cache_type_str(__u8 type) +{ + return type < ARRAY_SIZE(ccix_cache_type_strs) ? + ccix_cache_type_strs[type] : "Reserved"; +} + +static const char * const ccix_cache_err_type_strs[] = { + "Data", + "Tag", + "Timeout", + "Hang", + "Data Lost", + "Invalid Address", +}; + +const char *cper_ccix_cache_err_type_str(__u8 error_type) +{ + return error_type < ARRAY_SIZE(ccix_cache_err_type_strs) ? 
+ ccix_cache_err_type_strs[error_type] : "Reserved"; +} + +static const char * const ccix_cache_err_op_strs[] = { + "Generic", + "Generic Read", + "Generic Write", + "Data Read", + "Data Write", + "Instruction Fetch", + "Prefetch", + "Eviction", + "Snooping", + "Snooped", + "Management / Command Error", +}; + +static const char *cper_ccix_cache_err_op_str(__u8 op) +{ + return op < ARRAY_SIZE(ccix_cache_err_op_strs) ? + ccix_cache_err_op_strs[op] : "Reserved"; +} + +static int cper_ccix_cache_err_details(const char *pfx, + struct acpi_hest_generic_data *gdata) +{ + struct cper_ccix_cache_error *full_cache_err; + struct cper_sec_ccix_cache_error *cache_err; + + if (gdata->error_data_length < sizeof(*full_cache_err)) + return -ENOSPC; + + full_cache_err = acpi_hest_get_payload(gdata); + + cache_err = &full_cache_err->cache_record; + + if (cache_err->validation_bits & CCIX_CACHE_ERR_TYPE_VALID) + printk("%s""Cache Type: %s\n", pfx, + cper_ccix_cache_type_str(cache_err->cache_type)); + + if (cache_err->validation_bits & CCIX_CACHE_ERR_OP_VALID) + printk("%s""Operation: %s\n", pfx, + cper_ccix_cache_err_op_str(cache_err->op_type)); + + if (cache_err->validation_bits & CCIX_CACHE_ERR_CACHE_ERR_TYPE_VALID) + printk("%s""Cache Error Type: %s\n", pfx, + cper_ccix_cache_err_type_str(cache_err->cache_error_type)); + + if (cache_err->validation_bits & CCIX_CACHE_ERR_LEVEL_VALID) + printk("%s""Level: %d\n", pfx, cache_err->cache_level); + + if (cache_err->validation_bits & CCIX_CACHE_ERR_SET_VALID) + printk("%s""Set: %d\n", pfx, cache_err->set); + + if (cache_err->validation_bits & CCIX_CACHE_ERR_WAY_VALID) + printk("%s""Way: %d\n", pfx, cache_err->way); + + if (cache_err->validation_bits & CCIX_CACHE_ERR_INSTANCE_ID_VALID) + printk("%s""Instance ID: %d\n", pfx, cache_err->instance); + + /* Vendor data is not printed to the kernel log */ + + return 0; +} + int cper_print_ccix_per(const char *pfx, struct acpi_hest_generic_data *gdata) { struct cper_sec_ccix_header *header = 
acpi_hest_get_payload(gdata); @@ -334,9 +424,75 @@ int cper_print_ccix_per(const char *pfx, struct acpi_hest_generic_data *gdata) switch (per_type) { case CCIX_MEMORY_ERROR: return cper_ccix_mem_err_details(pfx, gdata); + case CCIX_CACHE_ERROR: + return cper_ccix_cache_err_details(pfx, gdata); default: /* Vendor defined so no formatting be done */ break; } return 0; } + +void cper_ccix_cache_err_pack(const struct cper_sec_ccix_cache_error *cache_record, + struct cper_ccix_cache_err_compact *ccache_err, + const u16 vendor_data_len, + u8 *vendor_data) +{ + ccache_err->validation_bits = cache_record->validation_bits; + ccache_err->set = cache_record->set; + ccache_err->way = cache_record->way; + ccache_err->cache_type = cache_record->cache_type; + ccache_err->op_type = cache_record->op_type; + ccache_err->cache_error_type = cache_record->cache_error_type; + ccache_err->cache_level = cache_record->cache_level; + ccache_err->instance = cache_record->instance; + memcpy(vendor_data, &cache_record->vendor_data[1], vendor_data_len); +} + +static int cper_ccix_err_cache_location(struct cper_ccix_cache_err_compact *ccache_err, + char *msg) +{ + u32 len = CPER_REC_LEN - 1; + u32 n = 0; + + if (!msg) + return 0; + + if (ccache_err->validation_bits & CCIX_CACHE_ERR_CACHE_ERR_TYPE_VALID) + n += snprintf(msg + n, len, "Error: %s ", + cper_ccix_cache_err_type_str(ccache_err->cache_error_type)); + + if (ccache_err->validation_bits & CCIX_CACHE_ERR_TYPE_VALID) + n += snprintf(msg + n, len, "Type: %s ", + cper_ccix_cache_type_str(ccache_err->cache_type)); + + if (ccache_err->validation_bits & CCIX_CACHE_ERR_OP_VALID) + n += snprintf(msg + n, len, "Op: %s ", + cper_ccix_cache_err_op_str(ccache_err->op_type)); + + if (ccache_err->validation_bits & CCIX_CACHE_ERR_LEVEL_VALID) + n += snprintf(msg + n, len, "Level: %d ", + ccache_err->cache_level); + if (ccache_err->validation_bits & CCIX_CACHE_ERR_SET_VALID) + n += snprintf(msg + n, len, "Set: %d ", ccache_err->set); + if 
(ccache_err->validation_bits & CCIX_CACHE_ERR_WAY_VALID) + n += snprintf(msg + n, len, "Way: %d ", ccache_err->way); + if (ccache_err->validation_bits & CCIX_CACHE_ERR_INSTANCE_ID_VALID) + n += snprintf(msg + n, len, "Instance: %d ", + ccache_err->instance); + + return n; +} + +const char *cper_ccix_cache_err_unpack(struct trace_seq *p, + struct cper_ccix_cache_err_compact *ccache_err) +{ + const char *ret = trace_seq_buffer_ptr(p); + + if (cper_ccix_err_cache_location(ccache_err, rcd_decode_str)) + trace_seq_printf(p, "%s", rcd_decode_str); + + trace_seq_putc(p, '\0'); + + return ret; +} diff --git a/include/linux/cper.h b/include/linux/cper.h index df7a34c3ba4f..eef254b8b8b7 100644 --- a/include/linux/cper.h +++ b/include/linux/cper.h @@ -627,6 +627,54 @@ struct cper_ccix_mem_err_compact { __u8 fru; }; +struct cper_sec_ccix_cache_error { + __u32 validation_bits; +#define CCIX_CACHE_ERR_TYPE_VALID BIT(0) +#define CCIX_CACHE_ERR_OP_VALID BIT(1) +#define CCIX_CACHE_ERR_CACHE_ERR_TYPE_VALID BIT(2) +#define CCIX_CACHE_ERR_LEVEL_VALID BIT(3) +#define CCIX_CACHE_ERR_SET_VALID BIT(4) +#define CCIX_CACHE_ERR_WAY_VALID BIT(5) +#define CCIX_CACHE_ERR_INSTANCE_ID_VALID BIT(6) +#define CCIX_CACHE_ERR_VENDOR_DATA_VALID BIT(7) + __u16 length; /* Includes vendor specific log info */ + __u8 cache_type; + __u8 op_type; + __u8 cache_error_type; + __u8 cache_level; + __u32 set; + __u32 way; + __u8 instance; + __u8 reserved; + __u32 vendor_data[]; +}; + +struct cper_ccix_cache_error { + struct cper_sec_ccix_header header; + __u32 ccix_header[CCIX_PER_LOG_HEADER_DWS]; + struct cper_sec_ccix_cache_error cache_record; +}; + +static inline u16 ccix_cache_err_ven_len_get(struct cper_ccix_cache_error *cache_err) +{ + if (cache_err->cache_record.validation_bits & + CCIX_CACHE_ERR_VENDOR_DATA_VALID) + return cache_err->cache_record.vendor_data[0] & 0xFFFF; + else + return 0; +} + +struct cper_ccix_cache_err_compact { + __u32 validation_bits; + __u32 set; + __u32 way; + __u8 cache_type; + 
__u8 op_type; + __u8 cache_error_type; + __u8 cache_level; + __u8 instance; +}; + /* Reset to default packing */ #pragma pack() @@ -649,6 +697,15 @@ const char *cper_ccix_mem_err_unpack(struct trace_seq *p, struct cper_ccix_mem_err_compact *cmem_err); const char *cper_ccix_mem_err_type_str(unsigned int error_type); const char *cper_ccix_comp_type_str(u8 comp_type); + +void cper_ccix_cache_err_pack(const struct cper_sec_ccix_cache_error *cache_record, + struct cper_ccix_cache_err_compact *ccache_err, + const u16 vendor_data_len, + u8 *vendor_data); +const char *cper_ccix_cache_err_unpack(struct trace_seq *p, + struct cper_ccix_cache_err_compact *ccache_err); +const char *cper_ccix_cache_err_type_str(__u8 error_type); + struct acpi_hest_generic_data; int cper_print_ccix_per(const char *pfx, struct acpi_hest_generic_data *gdata); diff --git a/include/ras/ras_event.h b/include/ras/ras_event.h index 128728eaeef4..55f2c1900c54 100644 --- a/include/ras/ras_event.h +++ b/include/ras/ras_event.h @@ -415,6 +415,72 @@ TRACE_EVENT(ccix_memory_error_event, ) ); +TRACE_EVENT(ccix_cache_error_event, + TP_PROTO(struct cper_ccix_cache_error *err, + u32 err_seq, + u8 sev, + u16 ven_len), + + TP_ARGS(err, err_seq, sev, ven_len), + + TP_STRUCT__entry( + __field(u32, err_seq) + __field(u8, sev) + __field(u8, sevdetail) + __field(u8, source) + __field(u8, component) + __field(u64, pa) + __field(u8, pa_mask_lsb) + __field_struct(struct cper_ccix_cache_err_compact, data) + __field(u16, vendor_data_length) + __dynamic_array(u8, vendor_data, ven_len) + ), + + TP_fast_assign( + __entry->err_seq = err_seq; + + __entry->sev = sev; + __entry->sevdetail = FIELD_GET(CCIX_PER_LOG_DW1_SEV_UE_M | + CCIX_PER_LOG_DW1_SEV_NO_COMM_M | + CCIX_PER_LOG_DW1_SEV_DEGRADED_M | + CCIX_PER_LOG_DW1_SEV_DEFFERABLE_M, + err->ccix_header[1]); + if (err->header.validation_bits & 0x1) + __entry->source = err->header.source_id; + else + __entry->source = ~0; + __entry->component = 
FIELD_GET(CCIX_PER_LOG_DW1_COMP_TYPE_M, + err->ccix_header[1]); + if (err->ccix_header[1] & CCIX_PER_LOG_DW1_ADDR_VAL_M) { + __entry->pa = (u64)err->ccix_header[2] << 32 | + (err->ccix_header[3] & 0xfffffffc); + __entry->pa_mask_lsb = err->ccix_header[4] & 0xff; + } else { + __entry->pa = ~0ull; + __entry->pa_mask_lsb = ~0; + } + + __entry->vendor_data_length = ven_len ? ven_len - 4 : 0; + cper_ccix_cache_err_pack(&err->cache_record, &__entry->data, + __entry->vendor_data_length, + __get_dynamic_array(vendor_data)); + ), + + TP_printk("{%d} %s CCIX PER Cache Error in %s SevUE:%d SevNoComm:%d SevDegraded:%d SevDeferred:%d physical addr: %016llx (mask: %x) %s vendor:%s", + __entry->err_seq, + cper_severity_str(__entry->sev), + cper_ccix_comp_type_str(__entry->component), + __entry->sevdetail & BIT(0) ? 1 : 0, + __entry->sevdetail & BIT(1) ? 1 : 0, + __entry->sevdetail & BIT(2) ? 1 : 0, + __entry->sevdetail & BIT(3) ? 1 : 0, + __entry->pa, + __entry->pa_mask_lsb, + cper_ccix_cache_err_unpack(p, &__entry->data), + __print_hex(__get_dynamic_array(vendor_data), + __entry->vendor_data_length) + ) +); /* * memory-failure recovery action result event *

From patchwork Tue Aug 20 14:47:29 2019
X-Patchwork-Submitter: Jonathan Cameron
X-Patchwork-Id: 171809
From: Jonathan Cameron
To: Borislav Petkov, "Mauro Carvalho Chehab"
CC: Jonathan Cameron
Subject: [PATCH 3/6 V2] efi / ras: CCIX Address Translation Cache error reporting
Date: Tue, 20 Aug 2019 22:47:29 +0800
Message-ID: <20190820144732.2370-4-Jonathan.Cameron@huawei.com>
In-Reply-To: <20190820144732.2370-1-Jonathan.Cameron@huawei.com>
References: <20190820144732.2370-1-Jonathan.Cameron@huawei.com>

CCIX devices tend to make heavy use of ATCs.
The CCIX base specification defines a protocol error message (PER) that describes errors reported by such caches. The UEFI 2.8 specification includes a CCIX CPER record for firmware first handling to report these errors to the operating system. This patch is very similar to the support previously added for CCIX Memory Errors and provides both logging and RAS tracepoint for this error class. Signed-off-by: Jonathan Cameron --- Changes since v1. Drop printing of vendor data to kernel log. drivers/acpi/apei/ghes.c | 4 ++ drivers/firmware/efi/cper-ccix.c | 72 ++++++++++++++++++++++++++++++++ include/linux/cper.h | 39 +++++++++++++++++ include/ras/ras_event.h | 67 +++++++++++++++++++++++++++++ 4 files changed, 182 insertions(+) -- 2.20.1 diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c index 5bda94e48b1b..a2ae9311ffee 100644 --- a/drivers/acpi/apei/ghes.c +++ b/drivers/acpi/apei/ghes.c @@ -508,6 +508,10 @@ static void ghes_handle_ccix_per(struct acpi_hest_generic_data *gdata, int sev) trace_ccix_cache_error_event(payload, err_seq, sev, ccix_cache_err_ven_len_get(payload)); break; + case CCIX_ATC_ERROR: + trace_ccix_atc_error_event(payload, err_seq, sev, + ccix_atc_err_ven_len_get(payload)); + break; default: /* Unknown error type */ pr_info("CCIX error of unknown or vendor defined type\n"); diff --git a/drivers/firmware/efi/cper-ccix.c b/drivers/firmware/efi/cper-ccix.c index fa3fafac402b..da8b7e1bb3a9 100644 --- a/drivers/firmware/efi/cper-ccix.c +++ b/drivers/firmware/efi/cper-ccix.c @@ -363,6 +363,31 @@ static int cper_ccix_cache_err_details(const char *pfx, return 0; } +static int cper_ccix_atc_err_details(const char *pfx, + struct acpi_hest_generic_data *gdata) +{ + struct cper_ccix_atc_error *full_atc_err; + struct cper_sec_ccix_atc_error *atc_err; + + if (gdata->error_data_length < sizeof(*full_atc_err)) + return -ENOSPC; + + full_atc_err = acpi_hest_get_payload(gdata); + + atc_err = &full_atc_err->atc_record; + + if (atc_err->validation_bits & 
CCIX_ATC_ERR_OP_VALID) + printk("%s""Operation: %s\n", pfx, + cper_ccix_cache_err_op_str(atc_err->op_type)); + + if (atc_err->validation_bits & CCIX_ATC_ERR_INSTANCE_ID_VALID) + printk("%s""Instance ID: %d\n", pfx, atc_err->instance); + + /* Vendor data is not printed to the kernel log */ + + return 0; +} + int cper_print_ccix_per(const char *pfx, struct acpi_hest_generic_data *gdata) { struct cper_sec_ccix_header *header = acpi_hest_get_payload(gdata); @@ -426,6 +451,8 @@ int cper_print_ccix_per(const char *pfx, struct acpi_hest_generic_data *gdata) return cper_ccix_mem_err_details(pfx, gdata); case CCIX_CACHE_ERROR: return cper_ccix_cache_err_details(pfx, gdata); + case CCIX_ATC_ERROR: + return cper_ccix_atc_err_details(pfx, gdata); default: /* Vendor defined so no formatting be done */ break; @@ -496,3 +523,48 @@ const char *cper_ccix_cache_err_unpack(struct trace_seq *p, return ret; } + +void cper_ccix_atc_err_pack(const struct cper_sec_ccix_atc_error *atc_record, + struct cper_ccix_atc_err_compact *catc_err, + const u16 vendor_data_len, + u8 *vendor_data) +{ + catc_err->validation_bits = atc_record->validation_bits; + catc_err->op_type = atc_record->op_type; + catc_err->instance = atc_record->instance; + memcpy(vendor_data, &atc_record->vendor_data[1], vendor_data_len); +} + +static int cper_ccix_err_atc_location(struct cper_ccix_atc_err_compact *catc_err, + char *msg) +{ + u32 len = CPER_REC_LEN - 1; + u32 n = 0; + + if (!msg) + return 0; + + if (catc_err->validation_bits & CCIX_ATC_ERR_OP_VALID) + n += snprintf(msg + n, len, "Op: %s ", + cper_ccix_cache_err_op_str(catc_err->op_type)); + + if (catc_err->validation_bits & CCIX_ATC_ERR_INSTANCE_ID_VALID) + n += snprintf(msg + n, len, "Instance: %d ", + catc_err->instance); + + return n; +} + +const char *cper_ccix_atc_err_unpack(struct trace_seq *p, + struct cper_ccix_atc_err_compact *catc_err) +{ + const char *ret = trace_seq_buffer_ptr(p); + + if (cper_ccix_err_atc_location(catc_err, rcd_decode_str)) + 
trace_seq_printf(p, "%s", rcd_decode_str); + + trace_seq_putc(p, '\0'); + + return ret; +} + diff --git a/include/linux/cper.h b/include/linux/cper.h index eef254b8b8b7..6bb603e9a97a 100644 --- a/include/linux/cper.h +++ b/include/linux/cper.h @@ -675,6 +675,38 @@ struct cper_ccix_cache_err_compact { __u8 instance; }; +struct cper_sec_ccix_atc_error { + __u32 validation_bits; +#define CCIX_ATC_ERR_OP_VALID BIT(0) +#define CCIX_ATC_ERR_INSTANCE_ID_VALID BIT(1) +#define CCIX_ATC_ERR_VENDOR_DATA_VALID BIT(2) + __u16 length; /* Includes vendor specific log info */ + __u8 op_type; + __u8 instance; + __u32 reserved; + __u32 vendor_data[]; +}; + +struct cper_ccix_atc_error { + struct cper_sec_ccix_header header; + __u32 ccix_header[CCIX_PER_LOG_HEADER_DWS]; + struct cper_sec_ccix_atc_error atc_record; +}; + +static inline u16 ccix_atc_err_ven_len_get(struct cper_ccix_atc_error *atc_err) +{ + if (atc_err->atc_record.validation_bits & CCIX_ATC_ERR_VENDOR_DATA_VALID) + return atc_err->atc_record.vendor_data[0] & 0xFFFF; + else + return 0; +} + +struct cper_ccix_atc_err_compact { + __u32 validation_bits; + __u8 op_type; + __u8 instance; +}; + /* Reset to default packing */ #pragma pack() @@ -706,6 +738,13 @@ const char *cper_ccix_cache_err_unpack(struct trace_seq *p, struct cper_ccix_cache_err_compact *ccache_err); const char *cper_ccix_cache_err_type_str(__u8 error_type); +void cper_ccix_atc_err_pack(const struct cper_sec_ccix_atc_error *atc_record, + struct cper_ccix_atc_err_compact *catc_err, + const u16 vendor_data_len, + u8 *vendor_data); +const char *cper_ccix_atc_err_unpack(struct trace_seq *p, + struct cper_ccix_atc_err_compact *catc_err); + struct acpi_hest_generic_data; int cper_print_ccix_per(const char *pfx, struct acpi_hest_generic_data *gdata); diff --git a/include/ras/ras_event.h b/include/ras/ras_event.h index 55f2c1900c54..bab49e297551 100644 --- a/include/ras/ras_event.h +++ b/include/ras/ras_event.h @@ -481,6 +481,73 @@ TRACE_EVENT(ccix_cache_error_event, 
__entry->vendor_data_length) ) ); + +TRACE_EVENT(ccix_atc_error_event, + TP_PROTO(struct cper_ccix_atc_error *err, + u32 err_seq, + u8 sev, + u16 ven_len), + + TP_ARGS(err, err_seq, sev, ven_len), + + TP_STRUCT__entry( + __field(u32, err_seq) + __field(u8, sev) + __field(u8, sevdetail) + __field(u8, source) + __field(u8, component) + __field(u64, pa) + __field(u8, pa_mask_lsb) + __field_struct(struct cper_ccix_atc_err_compact, data) + __field(u16, vendor_data_length) + __dynamic_array(u8, vendor_data, ven_len) + ), + + TP_fast_assign( + __entry->err_seq = err_seq; + + __entry->sev = sev; + __entry->sevdetail = FIELD_GET(CCIX_PER_LOG_DW1_SEV_UE_M | + CCIX_PER_LOG_DW1_SEV_NO_COMM_M | + CCIX_PER_LOG_DW1_SEV_DEGRADED_M | + CCIX_PER_LOG_DW1_SEV_DEFFERABLE_M, + err->ccix_header[1]); + if (err->header.validation_bits & 0x1) + __entry->source = err->header.source_id; + else + __entry->source = ~0; + __entry->component = FIELD_GET(CCIX_PER_LOG_DW1_COMP_TYPE_M, + err->ccix_header[1]); + if (err->ccix_header[1] & CCIX_PER_LOG_DW1_ADDR_VAL_M) { + __entry->pa = (u64)err->ccix_header[2] << 32 | + (err->ccix_header[3] & 0xfffffffc); + __entry->pa_mask_lsb = err->ccix_header[4] & 0xff; + } else { + __entry->pa = ~0ull; + __entry->pa_mask_lsb = ~0; + } + + __entry->vendor_data_length = ven_len ? ven_len - 4 : 0; + cper_ccix_atc_err_pack(&err->atc_record, &__entry->data, + __entry->vendor_data_length, + __get_dynamic_array(vendor_data)); + ), + + TP_printk("{%d} %s CCIX PER ATC Error in %s SevUE:%d SevNoComm:%d SevDegraded:%d SevDeferred:%d physical addr: %016llx (mask: %x) %s vendor:%s", + __entry->err_seq, + cper_severity_str(__entry->sev), + cper_ccix_comp_type_str(__entry->component), + __entry->sevdetail & BIT(0) ? 1 : 0, + __entry->sevdetail & BIT(1) ? 1 : 0, + __entry->sevdetail & BIT(2) ? 1 : 0, + __entry->sevdetail & BIT(3) ? 
1 : 0, + __entry->pa, + __entry->pa_mask_lsb, + cper_ccix_atc_err_unpack(p, &__entry->data), + __print_hex(__get_dynamic_array(vendor_data), __entry->vendor_data_length) + ) +); + /* * memory-failure recovery action result event *

From patchwork Tue Aug 20 14:47:32 2019
X-Patchwork-Submitter: Jonathan Cameron
X-Patchwork-Id: 171812
From: Jonathan Cameron
To: Borislav Petkov, "Mauro Carvalho Chehab"
CC: Jonathan Cameron
Subject: [PATCH 6/6 V2] efi / ras: CCIX Agent internal error reporting
Date: Tue, 20 Aug 2019 22:47:32 +0800
Message-ID: <20190820144732.2370-7-Jonathan.Cameron@huawei.com>
In-Reply-To: <20190820144732.2370-1-Jonathan.Cameron@huawei.com>
References: <20190820144732.2370-1-Jonathan.Cameron@huawei.com>

The CCIX 1.0 Base specification defines
an internal agent error, for which the specific data present after the header is vendor defined. Signed-off-by: Jonathan Cameron --- Changes since v1. Drop printing of vendor data to kernel log. drivers/acpi/apei/ghes.c | 4 ++ drivers/firmware/efi/cper-ccix.c | 12 ++++++ include/linux/cper.h | 29 +++++++++++++++ include/ras/ras_event.h | 64 ++++++++++++++++++++++++++++++++ 4 files changed, 109 insertions(+) -- 2.20.1 diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c index 3cdca0c0f4ae..f7b3c4777df3 100644 --- a/drivers/acpi/apei/ghes.c +++ b/drivers/acpi/apei/ghes.c @@ -520,6 +520,10 @@ static void ghes_handle_ccix_per(struct acpi_hest_generic_data *gdata, int sev) trace_ccix_link_error_event(payload, err_seq, sev, ccix_link_err_ven_len_get(payload)); break; + case CCIX_AGENT_INTERNAL_ERROR: + trace_ccix_agent_error_event(payload, err_seq, sev, + ccix_agent_err_ven_len_get(payload)); + break; default: /* Unknown error type */ pr_info("CCIX error of unknown or vendor defined type\n"); diff --git a/drivers/firmware/efi/cper-ccix.c b/drivers/firmware/efi/cper-ccix.c index 5820a58ec382..b2ad55291dc4 100644 --- a/drivers/firmware/efi/cper-ccix.c +++ b/drivers/firmware/efi/cper-ccix.c @@ -584,6 +584,9 @@ int cper_print_ccix_per(const char *pfx, struct acpi_hest_generic_data *gdata) return cper_ccix_port_err_details(pfx, gdata); case CCIX_LINK_ERROR: return cper_ccix_link_err_details(pfx, gdata); + case CCIX_AGENT_INTERNAL_ERROR: + /* Agent errors contain only vendor data, so nothing to do here */ + return 0; default: /* Vendor defined so no formatting be done */ break; @@ -803,3 +806,12 @@ const char *cper_ccix_link_err_unpack(struct trace_seq *p, return ret; } + +void cper_ccix_agent_err_pack(const struct cper_sec_ccix_agent_err *agent_record, + struct cper_ccix_agent_err_compact *cagent_err, + const u16 vendor_data_len, + u8 *vendor_data) +{ + cagent_err->validation_bits = agent_record->validation_bits; + memcpy(vendor_data, &agent_record->vendor_data[1], 
vendor_data_len); +} diff --git a/include/linux/cper.h b/include/linux/cper.h index d35be55351e3..373c1d387a70 100644 --- a/include/linux/cper.h +++ b/include/linux/cper.h @@ -783,6 +783,30 @@ struct cper_ccix_link_err_compact { __u8 credit_type; }; +struct cper_sec_ccix_agent_err { + __u32 validation_bits; +#define CCIX_AGENT_INTERNAL_ERR_VENDOR_DATA_VALID BIT(0) + __u32 vendor_data[]; +}; + +struct cper_ccix_agent_err { + struct cper_sec_ccix_header header; + __u32 ccix_header[CCIX_PER_LOG_HEADER_DWS]; + struct cper_sec_ccix_agent_err agent_record; +}; + +static inline u16 ccix_agent_err_ven_len_get(struct cper_ccix_agent_err *agent_err) +{ + if (agent_err->agent_record.validation_bits & CCIX_AGENT_INTERNAL_ERR_VENDOR_DATA_VALID) + return agent_err->agent_record.vendor_data[0] & 0xFFFF; + else + return 0; +} + +struct cper_ccix_agent_err_compact { + __u32 validation_bits; +}; + /* Reset to default packing */ #pragma pack() @@ -835,6 +859,11 @@ void cper_ccix_link_err_pack(const struct cper_sec_ccix_link_error *link_record, const char *cper_ccix_link_err_unpack(struct trace_seq *p, struct cper_ccix_link_err_compact *clink_err); +void cper_ccix_agent_err_pack(const struct cper_sec_ccix_agent_err *agent_record, + struct cper_ccix_agent_err_compact *cagent_err, + const u16 vendor_data_len, + u8 *vendor_data); + struct acpi_hest_generic_data; int cper_print_ccix_per(const char *pfx, struct acpi_hest_generic_data *gdata); diff --git a/include/ras/ras_event.h b/include/ras/ras_event.h index bfe1c64b9db0..d09e5389a1e6 100644 --- a/include/ras/ras_event.h +++ b/include/ras/ras_event.h @@ -679,6 +679,70 @@ TRACE_EVENT(ccix_link_error_event, ) ); +TRACE_EVENT(ccix_agent_error_event, + TP_PROTO(struct cper_ccix_agent_err *err, + u32 err_seq, + u8 sev, u16 ven_len), + + TP_ARGS(err, err_seq, sev, ven_len), + + TP_STRUCT__entry( + __field(u32, err_seq) + __field(u8, sev) + __field(u8, sevdetail) + __field(u8, source) + __field(u8, component) + __field(u64, pa) + __field(u8, 
pa_mask_lsb) + __field(u16, vendor_data_length) + __field_struct(struct cper_ccix_agent_err_compact, data) + __dynamic_array(u8, vendor_data, ven_len) + ), + + TP_fast_assign( + __entry->err_seq = err_seq; + + __entry->sev = sev; + __entry->sevdetail = FIELD_GET(CCIX_PER_LOG_DW1_SEV_UE_M | + CCIX_PER_LOG_DW1_SEV_NO_COMM_M | + CCIX_PER_LOG_DW1_SEV_DEGRADED_M | + CCIX_PER_LOG_DW1_SEV_DEFFERABLE_M, + err->ccix_header[1]); + if (err->header.validation_bits & 0x1) + __entry->source = err->header.source_id; + else + __entry->source = ~0; + __entry->component = FIELD_GET(CCIX_PER_LOG_DW1_COMP_TYPE_M, + err->ccix_header[1]); + if (err->ccix_header[1] & CCIX_PER_LOG_DW1_ADDR_VAL_M) { + __entry->pa = (u64)err->ccix_header[2] << 32 | + (err->ccix_header[3] & 0xfffffffc); + __entry->pa_mask_lsb = err->ccix_header[4] & 0xff; + } else { + __entry->pa = ~0ull; + __entry->pa_mask_lsb = ~0; + } + /* Do not store the vendor data header length */ + __entry->vendor_data_length = ven_len ? ven_len - 4 : 0; + cper_ccix_agent_err_pack(&err->agent_record, &__entry->data, + __entry->vendor_data_length, + __get_dynamic_array(vendor_data)); + ), + + TP_printk("{%d} %s CCIX PER Agent Internal Error in %s SevUE:%d SevNoComm:%d SevDegraded:%d SevDeferred:%d physical addr: %016llx (mask: %x) vendor:%s", + __entry->err_seq, + cper_severity_str(__entry->sev), + cper_ccix_comp_type_str(__entry->component), + __entry->sevdetail & BIT(0) ? 1 : 0, + __entry->sevdetail & BIT(1) ? 1 : 0, + __entry->sevdetail & BIT(2) ? 1 : 0, + __entry->sevdetail & BIT(3) ? 1 : 0, + __entry->pa, + __entry->pa_mask_lsb, + __print_hex(__get_dynamic_array(vendor_data), __entry->vendor_data_length) + ) +); + /* * memory-failure recovery action result event *
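
For readers following the vendor data handling, here is a minimal userspace
sketch of the length logic, not part of the patch. It assumes, as the patch
appears to, that the low 16 bits of the first vendor dword give the block
length in bytes including that dword. All sketch_* names and SKETCH_* macros
are hypothetical stand-ins for the kernel definitions. The clamp for lengths
of 4 or less is an assumption of this sketch; the patch itself computes
"ven_len ? ven_len - 4 : 0".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the kernel definitions used in the patch. */
#define SKETCH_VENDOR_DATA_VALID (1u << 0)
#define SKETCH_VENDOR_HDR_BYTES  4u	/* first dword carries the length */

/*
 * Vendor block length in bytes (header dword included), or 0 when the
 * validation bit is clear. Modelled on ccix_agent_err_ven_len_get().
 */
static uint16_t sketch_ven_len_get(uint32_t validation_bits,
				   const uint32_t *vendor_data)
{
	assert(vendor_data != NULL);
	if (validation_bits & SKETCH_VENDOR_DATA_VALID)
		return (uint16_t)(vendor_data[0] & 0xFFFF);
	return 0;
}

/*
 * Payload length with the header dword removed. Lengths that do not
 * exceed the header size are clamped to 0 (assumption of this sketch).
 */
static uint16_t sketch_payload_len(uint16_t ven_len)
{
	if (ven_len > SKETCH_VENDOR_HDR_BYTES)
		return (uint16_t)(ven_len - SKETCH_VENDOR_HDR_BYTES);
	return 0;
}

/*
 * Copy the payload that follows the header dword into out, never writing
 * more than out_cap bytes. Returns the number of bytes copied.
 */
static size_t sketch_pack(uint32_t validation_bits,
			  const uint32_t *vendor_data,
			  uint8_t *out, size_t out_cap)
{
	size_t len = sketch_payload_len(
		sketch_ven_len_get(validation_bits, vendor_data));

	if (len > out_cap)
		len = out_cap;
	memcpy(out, &vendor_data[1], len);
	return len;
}
```

With a first dword of 12 and the valid bit set, sketch_pack() copies the 8
bytes that follow the header; with the valid bit clear it copies nothing.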