From patchwork Mon Jan 6 12:09:57 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Shiju Jose X-Patchwork-Id: 855296 Received: from frasgout.his.huawei.com (frasgout.his.huawei.com [185.176.79.56]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 74A711DBB38; Mon, 6 Jan 2025 12:11:32 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=185.176.79.56 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1736165496; cv=none; b=I04Qo9S6F384zWK090DmSG4tjNQNxEKBGIxPLjFfKWapAQIDud0BwuGcz5nSZerd7ezmuCSetkg+W69ctCZ5t8zmiXe+tyUxFLgfygrsYd8DEZfafe8yKpmoTeLbkFqk/cU/6BV+VMOGg6ie6duPkHlzbsig+IerfS/6esSKId8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1736165496; c=relaxed/simple; bh=AeJE/aFbvOs/tf9pBoWbwcBF7oA1ie224ZdA/zPupGs=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=uhRlIJrNul2a1O2Zz2JE8vVhOOEF3C3y1iUtKv5KTq74bcX1DmG4mOgkY/ddQzWCCUFTspb5++l+whEmRJ6xwoJ4qx9dzYYazYEFw7q5aEqBkz1t7uBIhMhpdWqJp1UH3/4gE47AJ6aq8qjm2j0n6zo+0TQZ/NFVj6yC15UFHAY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=huawei.com; spf=pass smtp.mailfrom=huawei.com; arc=none smtp.client-ip=185.176.79.56 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=huawei.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huawei.com Received: from mail.maildlp.com (unknown [172.18.186.216]) by frasgout.his.huawei.com (SkyGuard) with ESMTP id 4YRY011nDdz6LD7n; Mon, 6 Jan 2025 20:09:53 +0800 (CST) Received: from frapeml500007.china.huawei.com (unknown [7.182.85.172]) by mail.maildlp.com (Postfix) with ESMTPS id 093691408F9; Mon, 6 Jan 2025 20:11:25 +0800 (CST) Received: from P_UKIT01-A7bmah.china.huawei.com (10.126.170.95) by frapeml500007.china.huawei.com (7.182.85.172) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.1.2507.39; Mon, 6 Jan 2025 13:11:22 +0100 From: To: , , , , CC: , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , Subject: [PATCH v18 01/19] EDAC: Add support for EDAC device features control Date: Mon, 6 Jan 2025 12:09:57 +0000 Message-ID: <20250106121017.1620-2-shiju.jose@huawei.com> X-Mailer: git-send-email 2.43.0.windows.1 In-Reply-To: <20250106121017.1620-1-shiju.jose@huawei.com> References: <20250106121017.1620-1-shiju.jose@huawei.com> Precedence: bulk X-Mailing-List: linux-acpi@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-ClientProxiedBy: lhrpeml100011.china.huawei.com (7.191.174.247) To frapeml500007.china.huawei.com (7.182.85.172) From: Shiju Jose Add generic EDAC device feature controls supporting the registration of RAS features available in the system. The driver exposes control attributes for these features to userspace in /sys/bus/edac/devices/// Co-developed-by: Jonathan Cameron Signed-off-by: Jonathan Cameron Signed-off-by: Shiju Jose --- Documentation/edac/features.rst | 94 ++++++++++++++++++++++++++++++ Documentation/edac/index.rst | 10 ++++ drivers/edac/edac_device.c | 100 ++++++++++++++++++++++++++++++++ include/linux/edac.h | 28 +++++++++ 4 files changed, 232 insertions(+) create mode 100644 Documentation/edac/features.rst create mode 100644 Documentation/edac/index.rst diff --git a/Documentation/edac/features.rst b/Documentation/edac/features.rst new file mode 100644 index 000000000000..f32f259ce04d --- /dev/null +++ b/Documentation/edac/features.rst @@ -0,0 +1,94 @@ +.. SPDX-License-Identifier: GPL-2.0 + +============================================ +Augmenting EDAC for controlling RAS features +============================================ + +Copyright (c) 2024 HiSilicon Limited. + +:Author: Shiju Jose +:License: The GNU Free Documentation License, Version 1.2 + (dual licensed under the GPL v2) +:Original Reviewers: + +- Written for: 6.14 + +Introduction +------------ +The expansion of EDAC for controlling RAS features and exposing features +control attributes to userspace via sysfs. Some Examples: + +* Scrub control + +* Error Check Scrub (ECS) control + +* ACPI RAS2 features + +* Post Package Repair (PPR) control + +* Memory Sparing Repair control etc. + +High level design is illustrated in the following diagram:: + + _______________________________________________ + | Userspace - Rasdaemon | + | _____________ | + | | RAS CXL mem | _______________ | + | |error handler|---->| | | + | |_____________| | RAS dynamic | | + | _____________ | scrub, memory | | + | | RAS memory |---->| repair control| | + | |error handler| |_______________| | + | |_____________| | | + |__________________________|____________________| + | + | + _______________________________|______________________________ + | Kernel EDAC extension for | controlling RAS Features | + | ______________________________|____________________________ | + || EDAC Core Sysfs EDAC| Bus | | + || __________________________|_________ _____________ | | + || |/sys/bus/edac/devices//scrubX/ | | EDAC device || | + || |/sys/bus/edac/devices//ecsX/ |<->| EDAC MC || | + || |/sys/bus/edac/devices//repairX | | EDAC sysfs || | + || |____________________________________| |_____________|| | + || EDAC|Bus | | + || | | | + || __________ Get feature | Get feature | | + || | |desc _________|______ desc __________ | | + || |EDAC scrub|<-----| EDAC device | | | | | + || |__________| | driver- RAS |---->| EDAC mem | | | + || __________ | feature control| | repair | | | + || | |<-----|________________| |__________| | | + || |EDAC ECS | Register RAS|features | | + || |__________| | | | + || ______________________|_____________ | | + ||_________|_______________|__________________|______________| | + | _______|____ _______|_______ ____|__________ | + | | | | CXL mem driver| | Client driver | | + | | ACPI RAS2 | | scrub, ECS, | | memory repair | | + | | driver | | sparing, PPR | | features | | + | |____________| |_______________| |_______________| | + | | | | | + |________|_________________|____________________|______________| + | | | + ________|_________________|____________________|______________ + | ___|_________________|____________________|_______ | + | | | | + | | Platform HW and Firmware | | + | |__________________________________________________| | + |______________________________________________________________| + + +1. EDAC Features components - Create feature specific descriptors. +For example, EDAC scrub, EDAC ECS, EDAC memory repair in the above +diagram. + +2. EDAC device driver for controlling RAS Features - Get feature's attribute +descriptors from EDAC RAS feature component and registers device's RAS +features with EDAC bus and exposes the features control attributes via +the sysfs EDAC bus. For example, /sys/bus/edac/devices//X/ + +3. RAS dynamic feature controller - Userspace sample modules in rasdaemon for +dynamic scrub/repair control to issue scrubbing/repair when excess number +of corrected memory errors are reported in a short span of time. diff --git a/Documentation/edac/index.rst b/Documentation/edac/index.rst new file mode 100644 index 000000000000..b6c265a4cffb --- /dev/null +++ b/Documentation/edac/index.rst @@ -0,0 +1,10 @@ +.. SPDX-License-Identifier: GPL-2.0 + +============== +EDAC Subsystem +============== + +.. toctree:: + :maxdepth: 1 + + features diff --git a/drivers/edac/edac_device.c b/drivers/edac/edac_device.c index 621dc2a5d034..9fce46dd7405 100644 --- a/drivers/edac/edac_device.c +++ b/drivers/edac/edac_device.c @@ -570,3 +570,103 @@ void edac_device_handle_ue_count(struct edac_device_ctl_info *edac_dev, block ? block->name : "N/A", count, msg); } EXPORT_SYMBOL_GPL(edac_device_handle_ue_count); + +static void edac_dev_release(struct device *dev) +{ + struct edac_dev_feat_ctx *ctx = container_of(dev, struct edac_dev_feat_ctx, dev); + + kfree(ctx->dev.groups); + kfree(ctx); +} + +const struct device_type edac_dev_type = { + .name = "edac_dev", + .release = edac_dev_release, +}; + +static void edac_dev_unreg(void *data) +{ + device_unregister(data); +} + +/** + * edac_dev_register - register device for RAS features with EDAC + * @parent: parent device. + * @name: parent device's name. + * @private: parent driver's data to store in the context if any. + * @num_features: number of RAS features to register. + * @ras_features: list of RAS features to register. + * + * Return: + * * %0 - Success. + * * %-EINVAL - Invalid parameters passed. + * * %-ENOMEM - Dynamic memory allocation failed. + * + */ +int edac_dev_register(struct device *parent, char *name, + void *private, int num_features, + const struct edac_dev_feature *ras_features) +{ + const struct attribute_group **ras_attr_groups; + struct edac_dev_feat_ctx *ctx; + int attr_gcnt = 0; + int ret, feat; + + if (!parent || !name || !num_features || !ras_features) + return -EINVAL; + + /* Double parse to make space for attributes */ + for (feat = 0; feat < num_features; feat++) { + switch (ras_features[feat].ft_type) { + /* Add feature specific code */ + default: + return -EINVAL; + } + } + + ctx = kzalloc(sizeof(*ctx), GFP_KERNEL); + if (!ctx) + return -ENOMEM; + + ras_attr_groups = kcalloc(attr_gcnt + 1, sizeof(*ras_attr_groups), GFP_KERNEL); + if (!ras_attr_groups) { + ret = -ENOMEM; + goto ctx_free; + } + + attr_gcnt = 0; + for (feat = 0; feat < num_features; feat++, ras_features++) { + switch (ras_features->ft_type) { + /* Add feature specific code */ + default: + ret = -EINVAL; + goto groups_free; + } + } + + ctx->dev.parent = parent; + ctx->dev.bus = edac_get_sysfs_subsys(); + ctx->dev.type = &edac_dev_type; + ctx->dev.groups = ras_attr_groups; + ctx->private = private; + dev_set_drvdata(&ctx->dev, ctx); + + ret = dev_set_name(&ctx->dev, name); + if (ret) + goto groups_free; + + ret = device_register(&ctx->dev); + if (ret) { + put_device(&ctx->dev); + return ret; + } + + return devm_add_action_or_reset(parent, edac_dev_unreg, &ctx->dev); + +groups_free: + kfree(ras_attr_groups); +ctx_free: + kfree(ctx); + return ret; +} +EXPORT_SYMBOL_GPL(edac_dev_register); diff --git a/include/linux/edac.h b/include/linux/edac.h index b4ee8961e623..521b17113d4d 100644 --- a/include/linux/edac.h +++ b/include/linux/edac.h @@ -661,4 +661,32 @@ static inline struct dimm_info *edac_get_dimm(struct mem_ctl_info *mci, return mci->dimms[index]; } + +#define EDAC_FEAT_NAME_LEN 128 + +/* RAS feature type */ +enum edac_dev_feat { + RAS_FEAT_MAX +}; + +/* EDAC device feature information structure */ +struct edac_dev_data { + u8 instance; + void *private; +}; + +struct edac_dev_feat_ctx { + struct device dev; + void *private; +}; + +struct edac_dev_feature { + enum edac_dev_feat ft_type; + u8 instance; + void *ctx; +}; + +int edac_dev_register(struct device *parent, char *dev_name, + void *parent_pvt_data, int num_features, + const struct edac_dev_feature *ras_features); #endif /* _LINUX_EDAC_H_ */ From patchwork Mon Jan 6 12:10:00 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Shiju Jose X-Patchwork-Id: 855295 Received: from frasgout.his.huawei.com (frasgout.his.huawei.com [185.176.79.56]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A40CE1DDA39; Mon, 6 Jan 2025 12:11:35 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=185.176.79.56 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1736165499; cv=none; b=p6aVBBKJaGG9k+nDMYg+aTMfnb1TD223j/wHSHv0YgxzKmHRU4HjDh4QMFjRgCf8RYtw3rb8/jljFEZ82y7kaaLE2Egdv7aYlLdEZDKW6gwUJKdxfdTFiEK8gOxTTBPE/oj2+Sb9BisZK4RP64lOWejljBKYNgoYU4ZO95kvp0c= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1736165499; c=relaxed/simple; bh=14j+Vk0VAD9CkxNJMOU/bZt9DuPJ50YGEcNPPw70DYU=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=sgY1cdx75wywwARvVzxuXWUprmNFax14SiQcvJ4MxCfPqBT2RLLWhMCKpUpJ6Bucs8tdvL8gbJA/Yp2rzc+BsPJwbTmqxFeXz9zjSAIURBuytujQaRSKWDJZwM0F2Za9+/88M8C0O0ai4pH9Q3t5GVf3SCY+Q9nI+5XXnI9mR/M= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=huawei.com; spf=pass smtp.mailfrom=huawei.com; arc=none smtp.client-ip=185.176.79.56 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=huawei.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huawei.com Received: from mail.maildlp.com (unknown [172.18.186.231]) by frasgout.his.huawei.com (SkyGuard) with ESMTP id 4YRY0m1b3Pz6FGbG; Mon, 6 Jan 2025 20:10:32 +0800 (CST) Received: from frapeml500007.china.huawei.com (unknown [7.182.85.172]) by mail.maildlp.com (Postfix) with ESMTPS id 960C7140A08; Mon, 6 Jan 2025 20:11:33 +0800 (CST) Received: from P_UKIT01-A7bmah.china.huawei.com (10.126.170.95) by frapeml500007.china.huawei.com (7.182.85.172) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.1.2507.39; Mon, 6 Jan 2025 13:11:31 +0100 From: To: , , , , CC: , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , Subject: [PATCH v18 04/19] EDAC: Add memory repair control feature Date: Mon, 6 Jan 2025 12:10:00 +0000 Message-ID: <20250106121017.1620-5-shiju.jose@huawei.com> X-Mailer: git-send-email 2.43.0.windows.1 In-Reply-To: <20250106121017.1620-1-shiju.jose@huawei.com> References: <20250106121017.1620-1-shiju.jose@huawei.com> Precedence: bulk X-Mailing-List: linux-acpi@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-ClientProxiedBy: lhrpeml100011.china.huawei.com (7.191.174.247) To frapeml500007.china.huawei.com (7.182.85.172) From: Shiju Jose Add a generic EDAC memory repair control driver to manage memory repairs in the system, such as CXL Post Package Repair (PPR) and CXL memory sparing features. For example, a CXL device with DRAM components that support PPR features may implement PPR maintenance operations. DRAM components may support two types of PPR, hard PPR, for a permanent row repair, and soft PPR, for a temporary row repair. Soft PPR is much faster than hard PPR, but the repair is lost with a power cycle. Similarly a CXL memory device may support soft and hard memory sparing at cacheline, row, bank and rank granularities. Memory sparing is defined as a repair function that replaces a portion of memory with a portion of functional memory at that same granularity. When a CXL device detects an error in a memory, it may report the host of the need for a repair maintenance operation by using an event record where the "maintenance needed" flag is set. The event records contains the device physical address(DPA) and other attributes of the memory to repair (such as channel, sub-channel, bank group, bank, rank, row, column etc). The kernel will report the corresponding CXL general media or DRAM trace event to userspace, and userspace tools (e.g. rasdaemon) will initiate a repair operation in response to the device request via the sysfs repair control. Device with memory repair features registers with EDAC device driver, which retrieves memory repair descriptor from EDAC memory repair driver and exposes the sysfs repair control attributes to userspace in /sys/bus/edac/devices//mem_repairX/. The common memory repair control interface abstracts the control of arbitrary memory repair functionality into a standardized set of functions. The sysfs memory repair attribute nodes are only available if the client driver has implemented the corresponding attribute callback function and provided operations to the EDAC device driver during registration. Signed-off-by: Shiju Jose --- .../ABI/testing/sysfs-edac-memory-repair | 244 +++++++++ Documentation/edac/features.rst | 3 + Documentation/edac/index.rst | 1 + Documentation/edac/memory_repair.rst | 101 ++++ drivers/edac/Makefile | 2 +- drivers/edac/edac_device.c | 33 ++ drivers/edac/mem_repair.c | 492 ++++++++++++++++++ include/linux/edac.h | 139 +++++ 8 files changed, 1014 insertions(+), 1 deletion(-) create mode 100644 Documentation/ABI/testing/sysfs-edac-memory-repair create mode 100644 Documentation/edac/memory_repair.rst create mode 100755 drivers/edac/mem_repair.c diff --git a/Documentation/ABI/testing/sysfs-edac-memory-repair b/Documentation/ABI/testing/sysfs-edac-memory-repair new file mode 100644 index 000000000000..e9268f3780ed --- /dev/null +++ b/Documentation/ABI/testing/sysfs-edac-memory-repair @@ -0,0 +1,244 @@ +What: /sys/bus/edac/devices//mem_repairX +Date: Jan 2025 +KernelVersion: 6.14 +Contact: linux-edac@vger.kernel.org +Description: + The sysfs EDAC bus devices //mem_repairX subdirectory + pertains to the memory media repair features control, such as + PPR (Post Package Repair), memory sparing etc, where + directory corresponds to a device registered with the EDAC + device driver for the memory repair features. + + Post Package Repair is a maintenance operation requests the memory + device to perform a repair operation on its media, in detail is a + memory self-healing feature that fixes a failing memory location by + replacing it with a spare row in a DRAM device. For example, a + CXL memory device with DRAM components that support PPR features may + implement PPR maintenance operations. DRAM components may support + two types of PPR functions: hard PPR, for a permanent row repair, and + soft PPR, for a temporary row repair. soft PPR is much faster than + hard PPR, but the repair is lost with a power cycle. + + Memory sparing is a repair function that replaces a portion + of memory with a portion of functional memory at that same + sparing granularity. Memory sparing has cacheline/row/bank/rank + sparing granularities. For example, in memory-sparing mode, + one memory rank serves as a spare for other ranks on the same + channel in case they fail. The spare rank is held in reserve and + not used as active memory until a failure is indicated, with + reserved capacity subtracted from the total available memory + in the system.The DIMM installation order for memory sparing + varies based on the number of processors and memory modules + installed in the server. After an error threshold is surpassed + in a system protected by memory sparing, the content of a failing + rank of DIMMs is copied to the spare rank. The failing rank is + then taken offline and the spare rank placed online for use as + active memory in place of the failed rank. + + The sysfs attributes nodes for a repair feature are only + present if the parent driver has implemented the corresponding + attr callback function and provided the necessary operations + to the EDAC device driver during registration. + + In some states of system configuration (e.g. before address + decoders have been configured), memory devices (e.g. CXL) + may not have an active mapping in the main host address + physical address map. As such, the memory to repair must be + identified by a device specific physical addressing scheme + using a device physical address(DPA). The DPA and other control + attributes to use will be presented in related error records. + +What: /sys/bus/edac/devices//mem_repairX/repair_function +Date: Jan 2025 +KernelVersion: 6.14 +Contact: linux-edac@vger.kernel.org +Description: + (RO) Memory repair function type. For eg. post package repair, + memory sparing etc. + EDAC_SOFT_PPR - Soft post package repair + EDAC_HARD_PPR - Hard post package repair + EDAC_CACHELINE_MEM_SPARING - Cacheline memory sparing + EDAC_ROW_MEM_SPARING - Row memory sparing + EDAC_BANK_MEM_SPARING - Bank memory sparing + EDAC_RANK_MEM_SPARING - Rank memory sparing + All other values are reserved. + +What: /sys/bus/edac/devices//mem_repairX/persist_mode +Date: Jan 2025 +KernelVersion: 6.14 +Contact: linux-edac@vger.kernel.org +Description: + (RW) Read/Write the current persist repair mode set for a + repair function. Persist repair modes supported in the + device, based on the memory repair function is temporary + or permanent and is lost with a power cycle. + EDAC_MEM_REPAIR_SOFT - Soft repair function (temporary repair). + EDAC_MEM_REPAIR_HARD - Hard memory repair function (permanent repair). + All other values are reserved. + +What: /sys/bus/edac/devices//mem_repairX/dpa_support +Date: Jan 2025 +KernelVersion: 6.14 +Contact: linux-edac@vger.kernel.org +Description: + (RO) True if memory device required device physical + address (DPA) of memory to repair. + False if memory device required host specific physical + address (HPA) of memory to repair. + In some states of system configuration (e.g. before address + decoders have been configured), memory devices (e.g. CXL) + may not have an active mapping in the main host address + physical address map. As such, the memory to repair must be + identified by a device specific physical addressing scheme + using a DPA. The device physical address(DPA) to use will be + presented in related error records. + +What: /sys/bus/edac/devices//mem_repairX/repair_safe_when_in_use +Date: Jan 2025 +KernelVersion: 6.14 +Contact: linux-edac@vger.kernel.org +Description: + (RO) True if memory media is accessible and data is retained + during the memory repair operation. + The data may not be retained and memory requests may not be + correctly processed during a repair operation. In such case + the repair operation should not executed at runtime. + +What: /sys/bus/edac/devices//mem_repairX/hpa +Date: Jan 2025 +KernelVersion: 6.14 +Contact: linux-edac@vger.kernel.org +Description: + (RW) Host Physical Address (HPA) of the memory to repair. + See attribute 'dpa_support' for more details. + The HPA to use will be provided in related error records. + +What: /sys/bus/edac/devices//mem_repairX/dpa +Date: Jan 2025 +KernelVersion: 6.14 +Contact: linux-edac@vger.kernel.org +Description: + (RW) Device Physical Address (DPA) of the memory to repair. + See attribute 'dpa_support' for more details. + The specific DPA to use will be provided in related error + records. + +What: /sys/bus/edac/devices//mem_repairX/nibble_mask +Date: Jan 2025 +KernelVersion: 6.14 +Contact: linux-edac@vger.kernel.org +Description: + (RW) Read/Write Nibble mask of the memory to repair. + Nibble mask identifies one or more nibbles in error on the + memory bus that produced the error event. Nibble Mask bit 0 + shall be set if nibble 0 on the memory bus produced the + event, etc. For example, CXL PPR and sparing, a nibble mask + bit set to 1 indicates the request to perform repair + operation in the specific device. All nibble mask bits set + to 1 indicates the request to perform the operation in all + devices. For CXL memory to repiar, the specific value of + nibble mask to use will be provided in related error records. + For more details, See nibble mask field in CXL spec ver 3.1, + section 8.2.9.7.1.2 Table 8-103 soft PPR and section + 8.2.9.7.1.3 Table 8-104 hard PPR, section 8.2.9.7.1.4 + Table 8-105 memory sparing. + +What: /sys/bus/edac/devices//mem_repairX/bank_group +What: /sys/bus/edac/devices//mem_repairX/bank +What: /sys/bus/edac/devices//mem_repairX/rank +What: /sys/bus/edac/devices//mem_repairX/row +What: /sys/bus/edac/devices//mem_repairX/column +What: /sys/bus/edac/devices//mem_repairX/channel +What: /sys/bus/edac/devices//mem_repairX/sub_channel +Date: Jan 2025 +KernelVersion: 6.14 +Contact: linux-edac@vger.kernel.org +Description: + (RW) The control attributes associated with memory address + that is to be repaired. The specific value of attributes to + use depends on the portion of memory to repair and may be + reported to host in related error records and may be + available to userspace in trace events, such as in CXL + memory devices. + + channel - The channel of the memory to repair. Channel is + defined as an interface that can be independently accessed + for a transaction. + rank - The rank of the memory to repair. Rank is defined as a + set of memory devices on a channel that together execute a + transaction. + bank_group - The bank group of the memory to repair. + bank - The bank number of the memory to repair. + row - The row number of the memory to repair. + column - The column number of the memory to repair. + sub_channel - The sub-channel of the memory to repair. + + The requirement to set these attributes varies based on the + repair function. The attributes in sysfs are not present + unless required for a repair function. + For example, CXL spec ver 3.1, Section 8.2.9.7.1.2 Table 8-103 + soft PPR and Section 8.2.9.7.1.3 Table 8-104 hard PPR operations, + these attributes are not required to set. + For example, CXL spec ver 3.1, Section 8.2.9.7.1.4 Table 8-105 + memory sparing, these attributes are required to set based on + memory sparing granularity as follows. + Channel: Channel associated with the DPA that is to be spared + and applies to all subclasses of sparing (cacheline, bank, + row and rank sparing). + Rank: Rank associated with the DPA that is to be spared and + applies to all subclasses of sparing. + Bank & Bank Group: Bank & bank group are associated with + the DPA that is to be spared and applies to cacheline sparing, + row sparing and bank sparing subclasses. + Row: Row associated with the DPA that is to be spared and + applies to cacheline sparing and row sparing subclasses. + Column: column associated with the DPA that is to be spared + and applies to cacheline sparing only. + Sub-channel: sub-channel associated with the DPA that is to + be spared and applies to cacheline sparing only. + +What: /sys/bus/edac/devices//mem_repairX/min_hpa +What: /sys/bus/edac/devices//mem_repairX/min_dpa +What: /sys/bus/edac/devices//mem_repairX/min_nibble_mask +What: /sys/bus/edac/devices//mem_repairX/min_bank_group +What: /sys/bus/edac/devices//mem_repairX/min_bank +What: /sys/bus/edac/devices//mem_repairX/min_rank +What: /sys/bus/edac/devices//mem_repairX/min_row +What: /sys/bus/edac/devices//mem_repairX/min_column +What: /sys/bus/edac/devices//mem_repairX/min_channel +What: /sys/bus/edac/devices//mem_repairX/min_sub_channel +What: /sys/bus/edac/devices//mem_repairX/max_hpa +What: /sys/bus/edac/devices//mem_repairX/max_dpa +What: /sys/bus/edac/devices//mem_repairX/max_nibble_mask +What: /sys/bus/edac/devices//mem_repairX/max_bank_group +What: /sys/bus/edac/devices//mem_repairX/max_bank +What: /sys/bus/edac/devices//mem_repairX/max_rank +What: /sys/bus/edac/devices//mem_repairX/max_row +What: /sys/bus/edac/devices//mem_repairX/max_column +What: /sys/bus/edac/devices//mem_repairX/max_channel +What: /sys/bus/edac/devices//mem_repairX/max_sub_channel +Date: Jan 2025 +KernelVersion: 6.14 +Contact: linux-edac@vger.kernel.org +Description: + (RW) The supported range of control attributes (optional) + associated with a memory address that is to be repaired. + The memory device may give the supported range of + attributes to use and it will depend on the memory device + and the portion of memory to repair. + The userspace may receive the specific value of attributes + to use for a repair operation from the memory device via + related error records and trace events, such as in CXL + memory devices. + +What: /sys/bus/edac/devices//mem_repairX/repair +Date: Jan 2025 +KernelVersion: 6.14 +Contact: linux-edac@vger.kernel.org +Description: + (WO) Issue the memory repair operation for the specified + memory repair attributes. The operation may fail if resources + are insufficient based on the requirements of the memory + device and repair function. + EDAC_DO_MEM_REPAIR - issue repair operation. + All other values are reserved. diff --git a/Documentation/edac/features.rst b/Documentation/edac/features.rst index ba3ab993ee4f..bfd5533b81b7 100644 --- a/Documentation/edac/features.rst +++ b/Documentation/edac/features.rst @@ -97,3 +97,6 @@ RAS features ------------ 1. Memory Scrub Memory scrub features are documented in `Documentation/edac/scrub.rst`. + +2. Memory Repair +Memory repair features are documented in `Documentation/edac/memory_repair.rst`. diff --git a/Documentation/edac/index.rst b/Documentation/edac/index.rst index dfb0c9fb9ab1..d6778f4562dd 100644 --- a/Documentation/edac/index.rst +++ b/Documentation/edac/index.rst @@ -8,4 +8,5 @@ EDAC Subsystem :maxdepth: 1 features + memory_repair scrub diff --git a/Documentation/edac/memory_repair.rst b/Documentation/edac/memory_repair.rst new file mode 100644 index 000000000000..2787a8a2d6ba --- /dev/null +++ b/Documentation/edac/memory_repair.rst @@ -0,0 +1,101 @@ +.. SPDX-License-Identifier: GPL-2.0 + +========================== +EDAC Memory Repair Control +========================== + +Copyright (c) 2024 HiSilicon Limited. + +:Author: Shiju Jose +:License: The GNU Free Documentation License, Version 1.2 + (dual licensed under the GPL v2) +:Original Reviewers: + +- Written for: 6.14 + +Introduction +------------ +Memory devices may support repair operations to address issues in their +memory media. Post Package Repair (PPR) and memory sparing are examples +of such features. + +Post Package Repair(PPR) +~~~~~~~~~~~~~~~~~~~~~~~~ +Post Package Repair is a maintenance operation requests the memory device +to perform repair operation on its media, in detail is a memory self-healing +feature that fixes a failing memory location by replacing it with a spare +row in a DRAM device. For example, a CXL memory device with DRAM components +that support PPR features may implement PPR maintenance operations. DRAM +components may support types of PPR functions, hard PPR, for a permanent row +repair, and soft PPR, for a temporary row repair. Soft PPR is much faster +than hard PPR, but the repair is lost with a power cycle. The data may not +be retained and memory requests may not be correctly processed during a +repair operation. In such case, the repair operation should not executed +at runtime. +For example, CXL memory devices, soft PPR and hard PPR repair operations +may be supported. See CXL spec rev 3.1 sections 8.2.9.7.1.1 PPR Maintenance +Operations, 8.2.9.7.1.2 sPPR Maintenance Operation and 8.2.9.7.1.3 hPPR +Maintenance Operation for more details. + +Memory Sparing +~~~~~~~~~~~~~~ +Memory sparing is a repair function that replaces a portion of memory with +a portion of functional memory at that same sparing granularity. Memory +sparing has cacheline/row/bank/rank sparing granularities. For example, in +memory-sparing mode, one memory rank serves as a spare for other ranks on +the same channel in case they fail. The spare rank is held in reserve and +not used as active memory until a failure is indicated, with reserved +capacity subtracted from the total available memory in the system. The DIMM +installation order for memory sparing varies based on the number of processors +and memory modules installed in the server. After an error threshold is +surpassed in a system protected by memory sparing, the content of a failing +rank of DIMMs is copied to the spare rank. The failing rank is then taken +offline and the spare rank placed online for use as active memory in place +of the failed rank. + +For example, CXL memory devices may support various subclasses for sparing +operation vary in terms of the scope of the sparing being performed. +Cacheline sparing subclass refers to a sparing action that can replace a +full cacheline. Row sparing is provided as an alternative to PPR sparing +functions and its scope is that of a single DDR row. Bank sparing allows +an entire bank to be replaced. Rank sparing is defined as an operation +in which an entire DDR rank is replaced. See CXL spec 3.1 section +8.2.9.7.1.4 Memory Sparing Maintenance Operations for more details. + +Use cases of generic memory repair features control +--------------------------------------------------- + +1. The soft PPR , hard PPR and memory-sparing features share similar +control attributes. Therefore, there is a need for a standardized, generic +sysfs repair control that is exposed to userspace and used by +administrators, scripts and tools. + +2. When a CXL device detects an error in a memory component, it may inform +the host of the need for a repair maintenance operation by using an event +record where the "maintenance needed" flag is set. The event record +specifies the device physical address(DPA) and attributes of the memory that +requires repair. The kernel reports the corresponding CXL general media or +DRAM trace event to userspace, and userspace tools (e.g. rasdaemon) initiate +a repair maintenance operation in response to the device request using the +sysfs repair control. + +3. Userspace tools, such as rasdaemon, may request a PPR/sparing on a memory +region when an uncorrected memory error or an excess of corrected memory +errors is reported on that memory. + +4. Multiple PPR/sparing instances may be present per memory device. + +The File System +--------------- + +The control attributes of a registered memory repair instance could be +accessed in the + +/sys/bus/edac/devices//mem_repairX/ + +sysfs +----- + +Sysfs files are documented in + +`Documentation/ABI/testing/sysfs-edac-memory-repair`. diff --git a/drivers/edac/Makefile b/drivers/edac/Makefile index 3a49304860f0..1de9fe66ac6b 100644 --- a/drivers/edac/Makefile +++ b/drivers/edac/Makefile @@ -10,7 +10,7 @@ obj-$(CONFIG_EDAC) := edac_core.o edac_core-y := edac_mc.o edac_device.o edac_mc_sysfs.o edac_core-y += edac_module.o edac_device_sysfs.o wq.o -edac_core-y += scrub.o ecs.o +edac_core-y += scrub.o ecs.o mem_repair.o edac_core-$(CONFIG_EDAC_DEBUG) += debugfs.o diff --git a/drivers/edac/edac_device.c b/drivers/edac/edac_device.c index 1c1142a2e4e4..a401d81dad8a 100644 --- a/drivers/edac/edac_device.c +++ b/drivers/edac/edac_device.c @@ -575,6 +575,7 @@ static void edac_dev_release(struct device *dev) { struct edac_dev_feat_ctx *ctx = container_of(dev, struct edac_dev_feat_ctx, dev); + kfree(ctx->mem_repair); kfree(ctx->scrub); kfree(ctx->dev.groups); kfree(ctx); @@ -611,6 +612,7 @@ int edac_dev_register(struct device *parent, char *name, const struct attribute_group **ras_attr_groups; struct edac_dev_data *dev_data; struct edac_dev_feat_ctx *ctx; + int mem_repair_cnt = 0; int attr_gcnt = 0; int scrub_cnt = 0; int ret, feat; @@ -628,6 +630,10 @@ int edac_dev_register(struct device *parent, char *name, case RAS_FEAT_ECS: attr_gcnt += ras_features[feat].ecs_info.num_media_frus; break; + case RAS_FEAT_MEM_REPAIR: + attr_gcnt++; + mem_repair_cnt++; + break; default: return -EINVAL; } @@ -651,8 +657,17 @@ int edac_dev_register(struct device *parent, char *name, } } + if (mem_repair_cnt) { + ctx->mem_repair = kcalloc(mem_repair_cnt, sizeof(*ctx->mem_repair), GFP_KERNEL); + if (!ctx->mem_repair) { + ret = -ENOMEM; + goto data_mem_free; + } + } + attr_gcnt = 0; scrub_cnt = 0; + mem_repair_cnt = 0; for (feat = 0; feat < num_features; feat++, ras_features++) { switch (ras_features->ft_type) { case RAS_FEAT_SCRUB: @@ -686,6 +701,23 @@ int edac_dev_register(struct device *parent, char *name, attr_gcnt += ras_features->ecs_info.num_media_frus; break; + case RAS_FEAT_MEM_REPAIR: + if (!ras_features->mem_repair_ops || + mem_repair_cnt != ras_features->instance) + goto data_mem_free; + + dev_data = &ctx->mem_repair[mem_repair_cnt]; + dev_data->instance = mem_repair_cnt; + dev_data->mem_repair_ops = ras_features->mem_repair_ops; + dev_data->private = ras_features->ctx; + ret = edac_mem_repair_get_desc(parent, &ras_attr_groups[attr_gcnt], + ras_features->instance); + if (ret) + goto data_mem_free; + + mem_repair_cnt++; + attr_gcnt++; + break; default: ret = -EINVAL; goto data_mem_free; @@ -712,6 +744,7 @@ int edac_dev_register(struct device *parent, char *name, return devm_add_action_or_reset(parent, edac_dev_unreg, &ctx->dev); data_mem_free: + kfree(ctx->mem_repair); kfree(ctx->scrub); groups_free: kfree(ras_attr_groups); diff --git a/drivers/edac/mem_repair.c b/drivers/edac/mem_repair.c new file mode 100755 index 000000000000..e7439fd26c41 --- /dev/null +++ b/drivers/edac/mem_repair.c @@ -0,0 +1,492 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * The generic EDAC memory repair driver is designed to control the memory + * devices with memory repair features, such as Post Package Repair (PPR), + * memory sparing etc. The common sysfs memory repair interface abstracts + * the control of various arbitrary memory repair functionalities into a + * unified set of functions. + * + * Copyright (c) 2024 HiSilicon Limited. + */ + +#include + +enum edac_mem_repair_attributes { + MEM_REPAIR_FUNCTION, + MEM_REPAIR_PERSIST_MODE, + MEM_REPAIR_DPA_SUPPORT, + MEM_REPAIR_SAFE_IN_USE, + MEM_REPAIR_HPA, + MEM_REPAIR_MIN_HPA, + MEM_REPAIR_MAX_HPA, + MEM_REPAIR_DPA, + MEM_REPAIR_MIN_DPA, + MEM_REPAIR_MAX_DPA, + MEM_REPAIR_NIBBLE_MASK, + MEM_REPAIR_MIN_NIBBLE_MASK, + MEM_REPAIR_MAX_NIBBLE_MASK, + MEM_REPAIR_BANK_GROUP, + MEM_REPAIR_MIN_BANK_GROUP, + MEM_REPAIR_MAX_BANK_GROUP, + MEM_REPAIR_BANK, + MEM_REPAIR_MIN_BANK, + MEM_REPAIR_MAX_BANK, + MEM_REPAIR_RANK, + MEM_REPAIR_MIN_RANK, + MEM_REPAIR_MAX_RANK, + MEM_REPAIR_ROW, + MEM_REPAIR_MIN_ROW, + MEM_REPAIR_MAX_ROW, + MEM_REPAIR_COLUMN, + MEM_REPAIR_MIN_COLUMN, + MEM_REPAIR_MAX_COLUMN, + MEM_REPAIR_CHANNEL, + MEM_REPAIR_MIN_CHANNEL, + MEM_REPAIR_MAX_CHANNEL, + MEM_REPAIR_SUB_CHANNEL, + MEM_REPAIR_MIN_SUB_CHANNEL, + MEM_REPAIR_MAX_SUB_CHANNEL, + MEM_DO_REPAIR, + MEM_REPAIR_MAX_ATTRS +}; + +struct edac_mem_repair_dev_attr { + struct device_attribute dev_attr; + u8 instance; +}; + +struct edac_mem_repair_context { + char name[EDAC_FEAT_NAME_LEN]; + struct edac_mem_repair_dev_attr mem_repair_dev_attr[MEM_REPAIR_MAX_ATTRS]; + struct attribute *mem_repair_attrs[MEM_REPAIR_MAX_ATTRS + 1]; + struct attribute_group group; +}; + +#define TO_MEM_REPAIR_DEV_ATTR(_dev_attr) \ + container_of(_dev_attr, struct edac_mem_repair_dev_attr, dev_attr) + +#define EDAC_MEM_REPAIR_ATTR_SHOW(attrib, cb, type, format) \ +static ssize_t attrib##_show(struct device *ras_feat_dev, \ + struct device_attribute *attr, char *buf) \ +{ \ + u8 inst = TO_MEM_REPAIR_DEV_ATTR(attr)->instance; \ + struct edac_dev_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev); \ + const struct edac_mem_repair_ops *ops = \ + ctx->mem_repair[inst].mem_repair_ops; \ + type data; \ + int ret; \ + \ + ret = ops->cb(ras_feat_dev->parent, ctx->mem_repair[inst].private, \ + &data); \ + if (ret) \ + return ret; \ + \ + return sysfs_emit(buf, format, data); \ +} + +EDAC_MEM_REPAIR_ATTR_SHOW(repair_function, get_repair_function, u32, "%u\n") +EDAC_MEM_REPAIR_ATTR_SHOW(persist_mode, get_persist_mode, u32, "%u\n") +EDAC_MEM_REPAIR_ATTR_SHOW(dpa_support, get_dpa_support, u32, "%u\n") +EDAC_MEM_REPAIR_ATTR_SHOW(repair_safe_when_in_use, get_repair_safe_when_in_use, u32, "%u\n") +EDAC_MEM_REPAIR_ATTR_SHOW(hpa, get_hpa, u64, "0x%llx\n") +EDAC_MEM_REPAIR_ATTR_SHOW(min_hpa, get_min_hpa, u64, "0x%llx\n") +EDAC_MEM_REPAIR_ATTR_SHOW(max_hpa, get_max_hpa, u64, "0x%llx\n") +EDAC_MEM_REPAIR_ATTR_SHOW(dpa, get_dpa, u64, "0x%llx\n") +EDAC_MEM_REPAIR_ATTR_SHOW(min_dpa, get_min_dpa, u64, "0x%llx\n") +EDAC_MEM_REPAIR_ATTR_SHOW(max_dpa, get_max_dpa, u64, "0x%llx\n") +EDAC_MEM_REPAIR_ATTR_SHOW(nibble_mask, get_nibble_mask, u64, "0x%llx\n") +EDAC_MEM_REPAIR_ATTR_SHOW(min_nibble_mask, get_min_nibble_mask, u64, "0x%llx\n") +EDAC_MEM_REPAIR_ATTR_SHOW(max_nibble_mask, get_max_nibble_mask, u64, "0x%llx\n") +EDAC_MEM_REPAIR_ATTR_SHOW(bank_group, get_bank_group, u32, "%u\n") +EDAC_MEM_REPAIR_ATTR_SHOW(min_bank_group, get_min_bank_group, u32, "%u\n") +EDAC_MEM_REPAIR_ATTR_SHOW(max_bank_group, get_max_bank_group, u32, "%u\n") +EDAC_MEM_REPAIR_ATTR_SHOW(bank, get_bank, u32, "%u\n") +EDAC_MEM_REPAIR_ATTR_SHOW(min_bank, get_min_bank, u32, "%u\n") +EDAC_MEM_REPAIR_ATTR_SHOW(max_bank, get_max_bank, u32, "%u\n") +EDAC_MEM_REPAIR_ATTR_SHOW(rank, get_rank, u32, "%u\n") +EDAC_MEM_REPAIR_ATTR_SHOW(min_rank, get_min_rank, u32, "%u\n") +EDAC_MEM_REPAIR_ATTR_SHOW(max_rank, get_max_rank, u32, "%u\n") +EDAC_MEM_REPAIR_ATTR_SHOW(row, get_row, u64, "0x%llx\n") +EDAC_MEM_REPAIR_ATTR_SHOW(min_row, get_min_row, u64, "0x%llx\n") +EDAC_MEM_REPAIR_ATTR_SHOW(max_row, get_max_row, u64, "0x%llx\n") +EDAC_MEM_REPAIR_ATTR_SHOW(column, get_column, u32, "%u\n") +EDAC_MEM_REPAIR_ATTR_SHOW(min_column, get_min_column, u32, "%u\n") +EDAC_MEM_REPAIR_ATTR_SHOW(max_column, get_max_column, u32, "%u\n") +EDAC_MEM_REPAIR_ATTR_SHOW(channel, get_channel, u32, "%u\n") +EDAC_MEM_REPAIR_ATTR_SHOW(min_channel, get_min_channel, u32, "%u\n") +EDAC_MEM_REPAIR_ATTR_SHOW(max_channel, get_max_channel, u32, "%u\n") +EDAC_MEM_REPAIR_ATTR_SHOW(sub_channel, get_sub_channel, u32, "%u\n") +EDAC_MEM_REPAIR_ATTR_SHOW(min_sub_channel, get_min_sub_channel, u32, "%u\n") +EDAC_MEM_REPAIR_ATTR_SHOW(max_sub_channel, get_max_sub_channel, u32, "%u\n") + +#define EDAC_MEM_REPAIR_ATTR_STORE(attrib, cb, type, conv_func) \ +static ssize_t attrib##_store(struct device *ras_feat_dev, \ + struct device_attribute *attr, \ + const char *buf, size_t len) \ +{ \ + u8 inst = TO_MEM_REPAIR_DEV_ATTR(attr)->instance; \ + struct edac_dev_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev); \ + const struct edac_mem_repair_ops *ops = \ + ctx->mem_repair[inst].mem_repair_ops; \ + type data; \ + int ret; \ + \ + ret = conv_func(buf, 0, &data); \ + if (ret < 0) \ + return ret; \ + \ + ret = ops->cb(ras_feat_dev->parent, ctx->mem_repair[inst].private, \ + data); \ + if (ret) \ + return ret; \ + \ + return len; \ +} + +EDAC_MEM_REPAIR_ATTR_STORE(persist_mode, set_persist_mode, unsigned long, kstrtoul) +EDAC_MEM_REPAIR_ATTR_STORE(hpa, set_hpa, u64, kstrtou64) +EDAC_MEM_REPAIR_ATTR_STORE(dpa, set_dpa, u64, kstrtou64) +EDAC_MEM_REPAIR_ATTR_STORE(nibble_mask, set_nibble_mask, u64, kstrtou64) +EDAC_MEM_REPAIR_ATTR_STORE(bank_group, set_bank_group, unsigned long, kstrtoul) +EDAC_MEM_REPAIR_ATTR_STORE(bank, set_bank, unsigned long, kstrtoul) +EDAC_MEM_REPAIR_ATTR_STORE(rank, set_rank, unsigned long, kstrtoul) +EDAC_MEM_REPAIR_ATTR_STORE(row, set_row, u64, kstrtou64) +EDAC_MEM_REPAIR_ATTR_STORE(column, set_column, unsigned long, kstrtoul) +EDAC_MEM_REPAIR_ATTR_STORE(channel, set_channel, unsigned long, kstrtoul) +EDAC_MEM_REPAIR_ATTR_STORE(sub_channel, set_sub_channel, unsigned long, kstrtoul) + +#define EDAC_MEM_REPAIR_DO_OP(attrib, cb) \ +static ssize_t attrib##_store(struct device *ras_feat_dev, \ + struct device_attribute *attr, \ + const char *buf, size_t len) \ +{ \ + u8 inst = TO_MEM_REPAIR_DEV_ATTR(attr)->instance; \ + struct edac_dev_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev); \ + const struct edac_mem_repair_ops *ops = ctx->mem_repair[inst].mem_repair_ops; \ + unsigned long data; \ + int ret; \ + \ + ret = kstrtoul(buf, 0, &data); \ + if (ret < 0) \ + return ret; \ + \ + ret = ops->cb(ras_feat_dev->parent, ctx->mem_repair[inst].private, data); \ + if (ret) \ + return ret; \ + \ + return len; \ +} + +EDAC_MEM_REPAIR_DO_OP(repair, do_repair) + +static umode_t mem_repair_attr_visible(struct kobject *kobj, struct attribute *a, int attr_id) +{ + struct device *ras_feat_dev = kobj_to_dev(kobj); + struct device_attribute *dev_attr = container_of(a, struct device_attribute, attr); + struct edac_dev_feat_ctx *ctx = dev_get_drvdata(ras_feat_dev); + u8 inst = TO_MEM_REPAIR_DEV_ATTR(dev_attr)->instance; + const struct edac_mem_repair_ops *ops = ctx->mem_repair[inst].mem_repair_ops; + + switch (attr_id) { + case MEM_REPAIR_FUNCTION: + if (ops->get_repair_function) + return a->mode; + break; + case MEM_REPAIR_PERSIST_MODE: + if (ops->get_persist_mode) { + if (ops->set_persist_mode) + return a->mode; + else + return 0444; + } + break; + case MEM_REPAIR_DPA_SUPPORT: + if (ops->get_dpa_support) + return a->mode; + break; + case MEM_REPAIR_SAFE_IN_USE: + if (ops->get_repair_safe_when_in_use) + return a->mode; + break; + case MEM_REPAIR_HPA: + if (ops->get_hpa) { + if (ops->set_hpa) + return a->mode; + else + return 0444; + } + break; + case MEM_REPAIR_MIN_HPA: + if (ops->get_min_hpa) + return a->mode; + break; + case MEM_REPAIR_MAX_HPA: + if (ops->get_max_hpa) + return a->mode; + break; + case MEM_REPAIR_DPA: + if (ops->get_dpa) { + if (ops->set_dpa) + return a->mode; + else + return 0444; + } + break; + case MEM_REPAIR_MIN_DPA: + if (ops->get_min_dpa) + return a->mode; + break; + case MEM_REPAIR_MAX_DPA: + if (ops->get_max_dpa) + return a->mode; + break; + case MEM_REPAIR_NIBBLE_MASK: + if (ops->get_nibble_mask) { + if (ops->set_nibble_mask) + return a->mode; + else + return 0444; + } + break; + case MEM_REPAIR_MIN_NIBBLE_MASK: + if (ops->get_min_nibble_mask) + return a->mode; + break; + case MEM_REPAIR_MAX_NIBBLE_MASK: + if (ops->get_max_nibble_mask) + return a->mode; + break; + case MEM_REPAIR_BANK_GROUP: + if (ops->get_bank_group) { + if (ops->set_bank_group) + return a->mode; + else + return 0444; + } + break; + case MEM_REPAIR_MIN_BANK_GROUP: + if (ops->get_min_bank_group) + return a->mode; + break; + case MEM_REPAIR_MAX_BANK_GROUP: + if (ops->get_max_bank_group) + return a->mode; + break; + case MEM_REPAIR_BANK: + if (ops->get_bank) { + if (ops->set_bank) + return a->mode; + else + return 0444; + } + break; + case MEM_REPAIR_MIN_BANK: + if (ops->get_min_bank) + return a->mode; + break; + case MEM_REPAIR_MAX_BANK: + if (ops->get_max_bank) + return a->mode; + break; + case MEM_REPAIR_RANK: + if (ops->get_rank) { + if (ops->set_rank) + return a->mode; + else + return 0444; + } + break; + case MEM_REPAIR_MIN_RANK: + if (ops->get_min_rank) + return a->mode; + break; + case MEM_REPAIR_MAX_RANK: + if (ops->get_max_rank) + return a->mode; + break; + case MEM_REPAIR_ROW: + if (ops->get_row) { + if (ops->set_row) + return a->mode; + else + return 0444; + } + break; + case MEM_REPAIR_MIN_ROW: + if (ops->get_min_row) + return a->mode; + break; + case MEM_REPAIR_MAX_ROW: + if (ops->get_max_row) + return a->mode; + break; + case MEM_REPAIR_COLUMN: + if (ops->get_column) { + if (ops->set_column) + return a->mode; + else + return 0444; + } + break; + case MEM_REPAIR_MIN_COLUMN: + if (ops->get_min_column) + return a->mode; + break; + case MEM_REPAIR_MAX_COLUMN: + if (ops->get_max_column) + return a->mode; + break; + case MEM_REPAIR_CHANNEL: + if (ops->get_channel) { + if (ops->set_channel) + return a->mode; + else + return 0444; + } + break; + case MEM_REPAIR_MIN_CHANNEL: + if (ops->get_min_channel) + return a->mode; + break; + case MEM_REPAIR_MAX_CHANNEL: + if (ops->get_max_channel) + return a->mode; + break; + case MEM_REPAIR_SUB_CHANNEL: + if (ops->get_sub_channel) { + if (ops->set_sub_channel) + return a->mode; + else + return 0444; + } + break; + case MEM_REPAIR_MIN_SUB_CHANNEL: + if (ops->get_min_sub_channel) + return a->mode; + break; + case MEM_REPAIR_MAX_SUB_CHANNEL: + if (ops->get_max_sub_channel) + return a->mode; + break; + case MEM_DO_REPAIR: + if (ops->do_repair) + return a->mode; + break; + default: + break; + } + + return 0; +} + +#define EDAC_MEM_REPAIR_ATTR_RO(_name, _instance) \ + ((struct edac_mem_repair_dev_attr) { .dev_attr = __ATTR_RO(_name), \ + .instance = _instance }) + +#define EDAC_MEM_REPAIR_ATTR_WO(_name, _instance) \ + ((struct edac_mem_repair_dev_attr) { .dev_attr = __ATTR_WO(_name), \ + .instance = _instance }) + +#define EDAC_MEM_REPAIR_ATTR_RW(_name, _instance) \ + ((struct edac_mem_repair_dev_attr) { .dev_attr = __ATTR_RW(_name), \ + .instance = _instance }) + +static int mem_repair_create_desc(struct device *dev, + const struct attribute_group **attr_groups, + u8 instance) +{ + struct edac_mem_repair_context *ctx; + struct attribute_group *group; + int i; + struct edac_mem_repair_dev_attr dev_attr[] = { + [MEM_REPAIR_FUNCTION] = EDAC_MEM_REPAIR_ATTR_RO(repair_function, + instance), + [MEM_REPAIR_PERSIST_MODE] = + EDAC_MEM_REPAIR_ATTR_RW(persist_mode, instance), + [MEM_REPAIR_DPA_SUPPORT] = + EDAC_MEM_REPAIR_ATTR_RO(dpa_support, instance), + [MEM_REPAIR_SAFE_IN_USE] = + EDAC_MEM_REPAIR_ATTR_RO(repair_safe_when_in_use, + instance), + [MEM_REPAIR_HPA] = EDAC_MEM_REPAIR_ATTR_RW(hpa, instance), + [MEM_REPAIR_MIN_HPA] = EDAC_MEM_REPAIR_ATTR_RO(min_hpa, instance), + [MEM_REPAIR_MAX_HPA] = EDAC_MEM_REPAIR_ATTR_RO(max_hpa, instance), + [MEM_REPAIR_DPA] = EDAC_MEM_REPAIR_ATTR_RW(dpa, instance), + [MEM_REPAIR_MIN_DPA] = EDAC_MEM_REPAIR_ATTR_RO(min_dpa, instance), + [MEM_REPAIR_MAX_DPA] = EDAC_MEM_REPAIR_ATTR_RO(max_dpa, instance), + [MEM_REPAIR_NIBBLE_MASK] = + EDAC_MEM_REPAIR_ATTR_RW(nibble_mask, instance), + [MEM_REPAIR_MIN_NIBBLE_MASK] = + EDAC_MEM_REPAIR_ATTR_RO(min_nibble_mask, instance), + [MEM_REPAIR_MAX_NIBBLE_MASK] = + EDAC_MEM_REPAIR_ATTR_RO(max_nibble_mask, instance), + [MEM_REPAIR_BANK_GROUP] = + EDAC_MEM_REPAIR_ATTR_RW(bank_group, instance), + [MEM_REPAIR_MIN_BANK_GROUP] = + EDAC_MEM_REPAIR_ATTR_RO(min_bank_group, instance), + [MEM_REPAIR_MAX_BANK_GROUP] = + EDAC_MEM_REPAIR_ATTR_RO(max_bank_group, instance), + [MEM_REPAIR_BANK] = EDAC_MEM_REPAIR_ATTR_RW(bank, instance), + [MEM_REPAIR_MIN_BANK] = EDAC_MEM_REPAIR_ATTR_RO(min_bank, instance), + [MEM_REPAIR_MAX_BANK] = EDAC_MEM_REPAIR_ATTR_RO(max_bank, instance), + [MEM_REPAIR_RANK] = EDAC_MEM_REPAIR_ATTR_RW(rank, instance), + [MEM_REPAIR_MIN_RANK] = EDAC_MEM_REPAIR_ATTR_RO(min_rank, instance), + [MEM_REPAIR_MAX_RANK] = EDAC_MEM_REPAIR_ATTR_RO(max_rank, instance), + [MEM_REPAIR_ROW] = EDAC_MEM_REPAIR_ATTR_RW(row, instance), + [MEM_REPAIR_MIN_ROW] = EDAC_MEM_REPAIR_ATTR_RO(min_row, instance), + [MEM_REPAIR_MAX_ROW] = EDAC_MEM_REPAIR_ATTR_RO(max_row, instance), + [MEM_REPAIR_COLUMN] = EDAC_MEM_REPAIR_ATTR_RW(column, instance), + [MEM_REPAIR_MIN_COLUMN] = EDAC_MEM_REPAIR_ATTR_RO(min_column, instance), + [MEM_REPAIR_MAX_COLUMN] = EDAC_MEM_REPAIR_ATTR_RO(max_column, instance), + [MEM_REPAIR_CHANNEL] = EDAC_MEM_REPAIR_ATTR_RW(channel, instance), + [MEM_REPAIR_MIN_CHANNEL] = EDAC_MEM_REPAIR_ATTR_RO(min_channel, instance), + [MEM_REPAIR_MAX_CHANNEL] = EDAC_MEM_REPAIR_ATTR_RO(max_channel, instance), + [MEM_REPAIR_SUB_CHANNEL] = + EDAC_MEM_REPAIR_ATTR_RW(sub_channel, instance), + [MEM_REPAIR_MIN_SUB_CHANNEL] = + EDAC_MEM_REPAIR_ATTR_RO(min_sub_channel, instance), + [MEM_REPAIR_MAX_SUB_CHANNEL] = + EDAC_MEM_REPAIR_ATTR_RO(max_sub_channel, instance), + [MEM_DO_REPAIR] = EDAC_MEM_REPAIR_ATTR_WO(repair, instance) + }; + + ctx = devm_kzalloc(dev, sizeof(*ctx), GFP_KERNEL); + if (!ctx) + return -ENOMEM; + + for (i = 0; i < MEM_REPAIR_MAX_ATTRS; i++) { + memcpy(&ctx->mem_repair_dev_attr[i].dev_attr, + &dev_attr[i], sizeof(dev_attr[i])); + ctx->mem_repair_attrs[i] = + &ctx->mem_repair_dev_attr[i].dev_attr.attr; + } + + sprintf(ctx->name, "%s%d", "mem_repair", instance); + group = &ctx->group; + group->name = ctx->name; + group->attrs = ctx->mem_repair_attrs; + group->is_visible = mem_repair_attr_visible; + attr_groups[0] = group; + + return 0; +} + +/** + * edac_mem_repair_get_desc - get EDAC memory repair descriptors + * @dev: client device with memory repair feature + * @attr_groups: pointer to attribute group container + * @instance: device's memory repair instance number. + * + * Return: + * * %0 - Success. + * * %-EINVAL - Invalid parameters passed. + * * %-ENOMEM - Dynamic memory allocation failed. + */ +int edac_mem_repair_get_desc(struct device *dev, + const struct attribute_group **attr_groups, u8 instance) +{ + if (!dev || !attr_groups) + return -EINVAL; + + return mem_repair_create_desc(dev, attr_groups, instance); +} diff --git a/include/linux/edac.h b/include/linux/edac.h index 979e91426701..5d07192bf1a7 100644 --- a/include/linux/edac.h +++ b/include/linux/edac.h @@ -668,6 +668,7 @@ static inline struct dimm_info *edac_get_dimm(struct mem_ctl_info *mci, enum edac_dev_feat { RAS_FEAT_SCRUB, RAS_FEAT_ECS, + RAS_FEAT_MEM_REPAIR, RAS_FEAT_MAX }; @@ -729,11 +730,147 @@ int edac_ecs_get_desc(struct device *ecs_dev, const struct attribute_group **attr_groups, u16 num_media_frus); +enum edac_mem_repair_function { + EDAC_SOFT_PPR, + EDAC_HARD_PPR, + EDAC_CACHELINE_MEM_SPARING, + EDAC_ROW_MEM_SPARING, + EDAC_BANK_MEM_SPARING, + EDAC_RANK_MEM_SPARING, +}; + +enum edac_mem_repair_persist_mode { + EDAC_MEM_REPAIR_SOFT, /* soft memory repair */ + EDAC_MEM_REPAIR_HARD, /* hard memory repair */ +}; + +enum edac_mem_repair_cmd { + EDAC_DO_MEM_REPAIR = 1, +}; + +/** + * struct edac_mem_repair_ops - memory repair operations + * (all elements are optional except do_repair, set_hpa/set_dpa) + * @get_repair_function: get the memory repair function, listed in + * enum edac_mem_repair_function. + * @get_persist_mode: get the current persist mode. Persist repair modes supported + * in the device is based on the memory repair function which is + * temporary or permanent and is lost with a power cycle. + * EDAC_MEM_REPAIR_SOFT - Soft repair function (temporary repair). + * EDAC_MEM_REPAIR_HARD - Hard memory repair function (permanent repair). + * All other values are reserved. + * @set_persist_mode: set the persist mode of the memory repair instance. + * @get_dpa_support: get dpa support flag. In some states of system configuration + * (e.g. before address decoders have been configured), memory devices + * (e.g. CXL) may not have an active mapping in the main host address + * physical address map. As such, the memory to repair must be identified + * by a device specific physical addressing scheme using a device physical + * address(DPA). The DPA and other control attributes to use for the + * dry_run and repair operations will be presented in related error records. + * @get_repair_safe_when_in_use: get whether memory media is accessible and + * data is retained during repair operation. + * @get_hpa: get current host physical address (HPA). + * @set_hpa: set host physical address (HPA) of memory to repair. + * @get_min_hpa: get the minimum supported host physical address (HPA). + * @get_max_hpa: get the maximum supported host physical address (HPA). + * @get_dpa: get current device physical address (DPA). + * @set_dpa: set device physical address (DPA) of memory to repair. + * @get_min_dpa: get the minimum supported device physical address (DPA). + * @get_max_dpa: get the maximum supported device physical address (DPA). + * @get_nibble_mask: get current nibble mask. + * @set_nibble_mask: set nibble mask of memory to repair. + * @get_min_nibble_mask: get the minimum supported nibble mask. + * @get_max_nibble_mask: get the maximum supported nibble mask. + * @get_bank_group: get current bank group. + * @set_bank_group: set bank group of memory to repair. + * @get_min_bank_group: get the minimum supported bank group. + * @get_max_bank_group: get the maximum supported bank group. + * @get_bank: get current bank. + * @set_bank: set bank of memory to repair. + * @get_min_bank: get the minimum supported bank. + * @get_max_bank: get the maximum supported bank. + * @get_rank: get current rank. + * @set_rank: set rank of memory to repair. + * @get_min_rank: get the minimum supported rank. + * @get_max_rank: get the maximum supported rank. + * @get_row: get current row. + * @set_row: set row of memory to repair. + * @get_min_row: get the minimum supported row. + * @get_max_row: get the maximum supported row. + * @get_column: get current column. + * @set_column: set column of memory to repair. + * @get_min_column: get the minimum supported column. + * @get_max_column: get the maximum supported column. + * @get_channel: get current channel. + * @set_channel: set channel of memory to repair. + * @get_min_channel: get the minimum supported channel. + * @get_max_channel: get the maximum supported channel. + * @get_sub_channel: get current sub channel. + * @set_sub_channel: set sub channel of memory to repair. + * @get_min_sub_channel: get the minimum supported sub channel. + * @get_max_sub_channel: get the maximum supported sub channel. + * @do_repair: Issue memory repair operation for the HPA/DPA and + * other control attributes set for the memory to repair. + */ +struct edac_mem_repair_ops { + int (*get_repair_function)(struct device *dev, void *drv_data, u32 *val); + int (*get_persist_mode)(struct device *dev, void *drv_data, u32 *mode); + int (*set_persist_mode)(struct device *dev, void *drv_data, u32 mode); + int (*get_dpa_support)(struct device *dev, void *drv_data, u32 *val); + int (*get_repair_safe_when_in_use)(struct device *dev, void *drv_data, u32 *val); + int (*get_hpa)(struct device *dev, void *drv_data, u64 *hpa); + int (*set_hpa)(struct device *dev, void *drv_data, u64 hpa); + int (*get_min_hpa)(struct device *dev, void *drv_data, u64 *hpa); + int (*get_max_hpa)(struct device *dev, void *drv_data, u64 *hpa); + int (*get_dpa)(struct device *dev, void *drv_data, u64 *dpa); + int (*set_dpa)(struct device *dev, void *drv_data, u64 dpa); + int (*get_min_dpa)(struct device *dev, void *drv_data, u64 *dpa); + int (*get_max_dpa)(struct device *dev, void *drv_data, u64 *dpa); + int (*get_nibble_mask)(struct device *dev, void *drv_data, u64 *val); + int (*set_nibble_mask)(struct device *dev, void *drv_data, u64 val); + int (*get_min_nibble_mask)(struct device *dev, void *drv_data, u64 *val); + int (*get_max_nibble_mask)(struct device *dev, void *drv_data, u64 *val); + int (*get_bank_group)(struct device *dev, void *drv_data, u32 *val); + int (*set_bank_group)(struct device *dev, void *drv_data, u32 val); + int (*get_min_bank_group)(struct device *dev, void *drv_data, u32 *val); + int (*get_max_bank_group)(struct device *dev, void *drv_data, u32 *val); + int (*get_bank)(struct device *dev, void *drv_data, u32 *val); + int (*set_bank)(struct device *dev, void *drv_data, u32 val); + int (*get_min_bank)(struct device *dev, void *drv_data, u32 *val); + int (*get_max_bank)(struct device *dev, void *drv_data, u32 *val); + int (*get_rank)(struct device *dev, void *drv_data, u32 *val); + int (*set_rank)(struct device *dev, void *drv_data, u32 val); + int (*get_min_rank)(struct device *dev, void *drv_data, u32 *val); + int (*get_max_rank)(struct device *dev, void *drv_data, u32 *val); + int (*get_row)(struct device *dev, void *drv_data, u64 *val); + int (*set_row)(struct device *dev, void *drv_data, u64 val); + int (*get_min_row)(struct device *dev, void *drv_data, u64 *val); + int (*get_max_row)(struct device *dev, void *drv_data, u64 *val); + int (*get_column)(struct device *dev, void *drv_data, u32 *val); + int (*set_column)(struct device *dev, void *drv_data, u32 val); + int (*get_min_column)(struct device *dev, void *drv_data, u32 *val); + int (*get_max_column)(struct device *dev, void *drv_data, u32 *val); + int (*get_channel)(struct device *dev, void *drv_data, u32 *val); + int (*set_channel)(struct device *dev, void *drv_data, u32 val); + int (*get_min_channel)(struct device *dev, void *drv_data, u32 *val); + int (*get_max_channel)(struct device *dev, void *drv_data, u32 *val); + int (*get_sub_channel)(struct device *dev, void *drv_data, u32 *val); + int (*set_sub_channel)(struct device *dev, void *drv_data, u32 val); + int (*get_min_sub_channel)(struct device *dev, void *drv_data, u32 *val); + int (*get_max_sub_channel)(struct device *dev, void *drv_data, u32 *val); + int (*do_repair)(struct device *dev, void *drv_data, u32 val); +}; + +int edac_mem_repair_get_desc(struct device *dev, + const struct attribute_group **attr_groups, + u8 instance); + /* EDAC device feature information structure */ struct edac_dev_data { union { const struct edac_scrub_ops *scrub_ops; const struct edac_ecs_ops *ecs_ops; + const struct edac_mem_repair_ops *mem_repair_ops; }; u8 instance; void *private; @@ -744,6 +881,7 @@ struct edac_dev_feat_ctx { void *private; struct edac_dev_data *scrub; struct edac_dev_data ecs; + struct edac_dev_data *mem_repair; }; struct edac_dev_feature { @@ -752,6 +890,7 @@ struct edac_dev_feature { union { const struct edac_scrub_ops *scrub_ops; const struct edac_ecs_ops *ecs_ops; + const struct edac_mem_repair_ops *mem_repair_ops; }; void *ctx; struct edac_ecs_ex_info ecs_info; From patchwork Mon Jan 6 12:10:02 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Shiju Jose X-Patchwork-Id: 855294 Received: from frasgout.his.huawei.com (frasgout.his.huawei.com [185.176.79.56]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 2EE671DE3A9; Mon, 6 Jan 2025 12:11:40 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=185.176.79.56 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1736165503; cv=none; b=LMWjq8hMAYDI4nN6roOJbMebLKP/lfHip7acXhrUr0RamkaaIScDcQreukEKzVWirC/qdW6pjBDLxLYm7lqRWYqARDmcWAQwir/h66wbvqLtp4i+58MdBn0Q2De2RQMwxZxC0jmO++fz5T+GQDhO2XtWE9wjKBDZz2WYMxR/1N4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1736165503; c=relaxed/simple; bh=oNI1xtCHCoHwukhtok7qvchX2r9Hn+K0Cm37y+GM40E=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=mk3SWGbtQTQ0K0XFnFc2VOMiZpHpnPTPIRl3Kzj3u5ITH3Nd+yoq+Ur168vMfDDoFJect9TEaKPR0ZN6EEeEMIwYblCPUmE1e4nCunDNhIz+Ay5GXt19hBctIlfjKueFb8lXmlQwATZWC/pX/NLoXkCtSMY8RnkwLKnOlV9KJM8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=huawei.com; spf=pass smtp.mailfrom=huawei.com; arc=none smtp.client-ip=185.176.79.56 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=huawei.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huawei.com Received: from mail.maildlp.com (unknown [172.18.186.216]) by frasgout.his.huawei.com (SkyGuard) with ESMTP id 4YRY0H32DSz6LD8Z; Mon, 6 Jan 2025 20:10:07 +0800 (CST) Received: from frapeml500007.china.huawei.com (unknown [7.182.85.172]) by mail.maildlp.com (Postfix) with ESMTPS id 2D421140D26; Mon, 6 Jan 2025 20:11:39 +0800 (CST) Received: from P_UKIT01-A7bmah.china.huawei.com (10.126.170.95) by frapeml500007.china.huawei.com (7.182.85.172) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.1.2507.39; Mon, 6 Jan 2025 13:11:37 +0100 From: To: , , , , CC: , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , Subject: [PATCH v18 06/19] ras: mem: Add memory ACPI RAS2 driver Date: Mon, 6 Jan 2025 12:10:02 +0000 Message-ID: <20250106121017.1620-7-shiju.jose@huawei.com> X-Mailer: git-send-email 2.43.0.windows.1 In-Reply-To: <20250106121017.1620-1-shiju.jose@huawei.com> References: <20250106121017.1620-1-shiju.jose@huawei.com> Precedence: bulk X-Mailing-List: linux-acpi@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-ClientProxiedBy: lhrpeml100011.china.huawei.com (7.191.174.247) To frapeml500007.china.huawei.com (7.182.85.172) From: Shiju Jose Memory ACPI RAS2 auxiliary driver binds to the auxiliary device add by the ACPI RAS2 table parser. Driver uses a PCC subspace for communicating with the ACPI compliant platform. Device with ACPI RAS2 scrub feature registers with EDAC device driver, which retrieves the scrub descriptor from EDAC scrub and exposes the scrub control attributes for RAS2 scrub instance to userspace in /sys/bus/edac/devices/acpi_ras_mem0/scrubX/. Co-developed-by: Jonathan Cameron Signed-off-by: Jonathan Cameron Signed-off-by: Shiju Jose --- Documentation/edac/scrub.rst | 81 ++++++++ drivers/ras/Kconfig | 10 + drivers/ras/Makefile | 1 + drivers/ras/acpi_ras2.c | 385 +++++++++++++++++++++++++++++++++++ 4 files changed, 477 insertions(+) create mode 100644 drivers/ras/acpi_ras2.c diff --git a/Documentation/edac/scrub.rst b/Documentation/edac/scrub.rst index 5640f9aeee38..f86645c7f0af 100644 --- a/Documentation/edac/scrub.rst +++ b/Documentation/edac/scrub.rst @@ -244,3 +244,84 @@ Sysfs files are documented in `Documentation/ABI/testing/sysfs-edac-scrub`. `Documentation/ABI/testing/sysfs-edac-ecs`. + +Examples +-------- + +The usage takes the form shown in these examples: + +1. ACPI RAS2 + +1.1 On demand scrubbing for a specific memory region. + +root@localhost:~# cat /sys/bus/edac/devices/acpi_ras_mem0/scrub0/min_cycle_duration + +3600 + +root@localhost:~# cat /sys/bus/edac/devices/acpi_ras_mem0/scrub0/max_cycle_duration + +86400 + +root@localhost:~# cat /sys/bus/edac/devices/acpi_ras_mem0/scrub0/current_cycle_duration + +36000 + +# Readback 'addr', non-zero - demand scrub is in progress, zero - scrub is finished. + +root@localhost:~# cat /sys/bus/edac/devices/acpi_ras_mem0/scrub0/addr + +0 + +root@localhost:~# echo 54000 > /sys/bus/edac/devices/acpi_ras_mem0/scrub0/current_cycle_duration + +root@localhost:~# echo 0x150000 > /sys/bus/edac/devices/acpi_ras_mem0/scrub0/size + +# Write 'addr' starts demand scrubbing, please make sure other attributes are set prior to that. + +root@localhost:~# echo 0x120000 > /sys/bus/edac/devices/acpi_ras_mem0/scrub0/addr + +root@localhost:~# cat /sys/bus/edac/devices/acpi_ras_mem0/scrub0/current_cycle_duration + +54000 + +# Readback 'addr', non-zero - demand scrub is in progress, zero - scrub is finished. + +root@localhost:~# cat /sys/bus/edac/devices/acpi_ras_mem0/scrub0/addr + +0x120000 + +root@localhost:~# cat /sys/bus/edac/devices/acpi_ras_mem0/scrub0/addr + +0 + +1.2 Background scrubbing the entire memory + +root@localhost:~# cat /sys/bus/edac/devices/acpi_ras_mem0/scrub0/min_cycle_duration + +3600 + +root@localhost:~# cat /sys/bus/edac/devices/acpi_ras_mem0/scrub0/max_cycle_duration + +86400 + +root@localhost:~# cat /sys/bus/edac/devices/acpi_ras_mem0/scrub0/current_cycle_duration + +36000 + +root@localhost:~# cat /sys/bus/edac/devices/acpi_ras_mem0/scrub0/enable_background + +0 + +root@localhost:~# echo 10800 > /sys/bus/edac/devices/acpi_ras_mem0/scrub0/current_cycle_duration + +root@localhost:~# echo 1 > /sys/bus/edac/devices/acpi_ras_mem0/scrub0/enable_background + +root@localhost:~# cat /sys/bus/edac/devices/acpi_ras_mem0/scrub0/enable_background + +1 + +root@localhost:~# cat /sys/bus/edac/devices/acpi_ras_mem0/scrub0/current_cycle_duration + +10800 + +root@localhost:~# echo 0 > /sys/bus/edac/devices/acpi_ras_mem0/scrub0/enable_background diff --git a/drivers/ras/Kconfig b/drivers/ras/Kconfig index fc4f4bb94a4c..b77790bdc73a 100644 --- a/drivers/ras/Kconfig +++ b/drivers/ras/Kconfig @@ -46,4 +46,14 @@ config RAS_FMPM Memory will be retired during boot time and run time depending on platform-specific policies. +config MEM_ACPI_RAS2 + tristate "Memory ACPI RAS2 driver" + depends on ACPI_RAS2 + depends on EDAC + help + The driver binds to the platform device added by the ACPI RAS2 + table parser. Use a PCC channel subspace for communicating with + the ACPI compliant platform to provide control of memory scrub + parameters to the user via the EDAC scrub. + endif diff --git a/drivers/ras/Makefile b/drivers/ras/Makefile index 11f95d59d397..a0e6e903d6b0 100644 --- a/drivers/ras/Makefile +++ b/drivers/ras/Makefile @@ -2,6 +2,7 @@ obj-$(CONFIG_RAS) += ras.o obj-$(CONFIG_DEBUG_FS) += debugfs.o obj-$(CONFIG_RAS_CEC) += cec.o +obj-$(CONFIG_MEM_ACPI_RAS2) += acpi_ras2.o obj-$(CONFIG_RAS_FMPM) += amd/fmpm.o obj-y += amd/atl/ diff --git a/drivers/ras/acpi_ras2.c b/drivers/ras/acpi_ras2.c new file mode 100644 index 000000000000..6c5772d60f22 --- /dev/null +++ b/drivers/ras/acpi_ras2.c @@ -0,0 +1,385 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * ACPI RAS2 memory driver + * + * Copyright (c) 2024 HiSilicon Limited. + * + */ + +#define pr_fmt(fmt) "MEMORY ACPI RAS2: " fmt + +#include +#include +#include +#include + +#define RAS2_DEV_NUM_RAS_FEATURES 1 + +#define RAS2_SUPPORT_HW_PARTOL_SCRUB BIT(0) +#define RAS2_TYPE_PATROL_SCRUB 0x0000 + +#define RAS2_GET_PATROL_PARAMETERS 0x01 +#define RAS2_START_PATROL_SCRUBBER 0x02 +#define RAS2_STOP_PATROL_SCRUBBER 0x03 + +#define RAS2_PATROL_SCRUB_SCHRS_IN_MASK GENMASK(15, 8) +#define RAS2_PATROL_SCRUB_EN_BACKGROUND BIT(0) +#define RAS2_PATROL_SCRUB_SCHRS_OUT_MASK GENMASK(7, 0) +#define RAS2_PATROL_SCRUB_MIN_SCHRS_OUT_MASK GENMASK(15, 8) +#define RAS2_PATROL_SCRUB_MAX_SCHRS_OUT_MASK GENMASK(23, 16) +#define RAS2_PATROL_SCRUB_FLAG_SCRUBBER_RUNNING BIT(0) + +#define RAS2_SCRUB_NAME_LEN 128 +#define RAS2_HOUR_IN_SECS 3600 + +struct acpi_ras2_ps_shared_mem { + struct acpi_ras2_shared_memory common; + struct acpi_ras2_patrol_scrub_parameter params; +}; + +static int ras2_is_patrol_scrub_support(struct ras2_mem_ctx *ras2_ctx) +{ + struct acpi_ras2_shared_memory __iomem *common = (void *) + ras2_ctx->pcc_comm_addr; + + guard(mutex)(&ras2_ctx->lock); + common->set_capabilities[0] = 0; + + return common->features[0] & RAS2_SUPPORT_HW_PARTOL_SCRUB; +} + +static int ras2_update_patrol_scrub_params_cache(struct ras2_mem_ctx *ras2_ctx) +{ + struct acpi_ras2_ps_shared_mem __iomem *ps_sm = (void *) + ras2_ctx->pcc_comm_addr; + int ret; + + ps_sm->common.set_capabilities[0] = RAS2_SUPPORT_HW_PARTOL_SCRUB; + ps_sm->params.patrol_scrub_command = RAS2_GET_PATROL_PARAMETERS; + + ret = ras2_send_pcc_cmd(ras2_ctx, RAS2_PCC_CMD_EXEC); + if (ret) { + dev_err(ras2_ctx->dev, "failed to read parameters\n"); + return ret; + } + + ras2_ctx->min_scrub_cycle = FIELD_GET(RAS2_PATROL_SCRUB_MIN_SCHRS_OUT_MASK, + ps_sm->params.scrub_params_out); + ras2_ctx->max_scrub_cycle = FIELD_GET(RAS2_PATROL_SCRUB_MAX_SCHRS_OUT_MASK, + ps_sm->params.scrub_params_out); + if (!ras2_ctx->bg) { + ras2_ctx->base = ps_sm->params.actual_address_range[0]; + ras2_ctx->size = ps_sm->params.actual_address_range[1]; + } + ras2_ctx->scrub_cycle_hrs = FIELD_GET(RAS2_PATROL_SCRUB_SCHRS_OUT_MASK, + ps_sm->params.scrub_params_out); + + return 0; +} + +/* Context - lock must be held */ +static int ras2_get_patrol_scrub_running(struct ras2_mem_ctx *ras2_ctx, + bool *running) +{ + struct acpi_ras2_ps_shared_mem __iomem *ps_sm = (void *) + ras2_ctx->pcc_comm_addr; + int ret; + + ps_sm->common.set_capabilities[0] = RAS2_SUPPORT_HW_PARTOL_SCRUB; + ps_sm->params.patrol_scrub_command = RAS2_GET_PATROL_PARAMETERS; + + ret = ras2_send_pcc_cmd(ras2_ctx, RAS2_PCC_CMD_EXEC); + if (ret) { + dev_err(ras2_ctx->dev, "failed to read parameters\n"); + return ret; + } + + *running = ps_sm->params.flags & RAS2_PATROL_SCRUB_FLAG_SCRUBBER_RUNNING; + + return 0; +} + +static int ras2_hw_scrub_read_min_scrub_cycle(struct device *dev, void *drv_data, + u32 *min) +{ + struct ras2_mem_ctx *ras2_ctx = drv_data; + + *min = ras2_ctx->min_scrub_cycle * RAS2_HOUR_IN_SECS; + + return 0; +} + +static int ras2_hw_scrub_read_max_scrub_cycle(struct device *dev, void *drv_data, + u32 *max) +{ + struct ras2_mem_ctx *ras2_ctx = drv_data; + + *max = ras2_ctx->max_scrub_cycle * RAS2_HOUR_IN_SECS; + + return 0; +} + +static int ras2_hw_scrub_cycle_read(struct device *dev, void *drv_data, + u32 *scrub_cycle_secs) +{ + struct ras2_mem_ctx *ras2_ctx = drv_data; + + *scrub_cycle_secs = ras2_ctx->scrub_cycle_hrs * RAS2_HOUR_IN_SECS; + + return 0; +} + +static int ras2_hw_scrub_cycle_write(struct device *dev, void *drv_data, + u32 scrub_cycle_secs) +{ + u8 scrub_cycle_hrs = scrub_cycle_secs / RAS2_HOUR_IN_SECS; + struct ras2_mem_ctx *ras2_ctx = drv_data; + bool running; + int ret; + + guard(mutex)(&ras2_ctx->lock); + ret = ras2_get_patrol_scrub_running(ras2_ctx, &running); + if (ret) + return ret; + + if (running) + return -EBUSY; + + if (scrub_cycle_hrs < ras2_ctx->min_scrub_cycle || + scrub_cycle_hrs > ras2_ctx->max_scrub_cycle) + return -EINVAL; + + ras2_ctx->scrub_cycle_hrs = scrub_cycle_hrs; + + return 0; +} + +static int ras2_hw_scrub_read_addr(struct device *dev, void *drv_data, u64 *base) +{ + struct ras2_mem_ctx *ras2_ctx = drv_data; + int ret; + + /* + * When BG scrubbing is enabled the actual address range is not valid. + * Return -EBUSY now unless find out a method to retrieve actual full PA range. + */ + if (ras2_ctx->bg) + return -EBUSY; + + /* + * When demand scrubbing is finished firmware must reset actual + * address range to 0. Otherwise userspace assumes demand scrubbing + * is in progress. + */ + ret = ras2_update_patrol_scrub_params_cache(ras2_ctx); + if (ret) + return ret; + *base = ras2_ctx->base; + + return 0; +} + +static int ras2_hw_scrub_read_size(struct device *dev, void *drv_data, u64 *size) +{ + struct ras2_mem_ctx *ras2_ctx = drv_data; + int ret; + + if (ras2_ctx->bg) + return -EBUSY; + + ret = ras2_update_patrol_scrub_params_cache(ras2_ctx); + if (ret) + return ret; + *size = ras2_ctx->size; + + return 0; +} + +static int ras2_hw_scrub_write_addr(struct device *dev, void *drv_data, u64 base) +{ + struct ras2_mem_ctx *ras2_ctx = drv_data; + struct acpi_ras2_ps_shared_mem __iomem *ps_sm = (void *) + ras2_ctx->pcc_comm_addr; + bool running; + int ret; + + guard(mutex)(&ras2_ctx->lock); + ps_sm->common.set_capabilities[0] = RAS2_SUPPORT_HW_PARTOL_SCRUB; + if (ras2_ctx->bg) + return -EBUSY; + + if (!base || !ras2_ctx->size) { + dev_warn(ras2_ctx->dev, + "%s: Invalid address range, base=0x%llx " + "size=0x%llx\n", __func__, + base, ras2_ctx->size); + return -ERANGE; + } + + ret = ras2_get_patrol_scrub_running(ras2_ctx, &running); + if (ret) + return ret; + + if (running) + return -EBUSY; + + ps_sm->params.scrub_params_in &= ~RAS2_PATROL_SCRUB_SCHRS_IN_MASK; + ps_sm->params.scrub_params_in |= FIELD_PREP(RAS2_PATROL_SCRUB_SCHRS_IN_MASK, + ras2_ctx->scrub_cycle_hrs); + ps_sm->params.requested_address_range[0] = base; + ps_sm->params.requested_address_range[1] = ras2_ctx->size; + ps_sm->params.scrub_params_in &= ~RAS2_PATROL_SCRUB_EN_BACKGROUND; + ps_sm->params.patrol_scrub_command = RAS2_START_PATROL_SCRUBBER; + + ret = ras2_send_pcc_cmd(ras2_ctx, RAS2_PCC_CMD_EXEC); + if (ret) { + dev_err(ras2_ctx->dev, "Failed to start demand scrubbing\n"); + return ret; + } + + return ras2_update_patrol_scrub_params_cache(ras2_ctx); +} + +static int ras2_hw_scrub_write_size(struct device *dev, void *drv_data, u64 size) +{ + struct ras2_mem_ctx *ras2_ctx = drv_data; + bool running; + int ret; + + guard(mutex)(&ras2_ctx->lock); + ret = ras2_get_patrol_scrub_running(ras2_ctx, &running); + if (ret) + return ret; + + if (running) + return -EBUSY; + + if (!size) { + dev_warn(dev, "%s: Invalid address range size=0x%llx\n", + __func__, size); + return -EINVAL; + } + + ras2_ctx->size = size; + + return 0; +} + +static int ras2_hw_scrub_set_enabled_bg(struct device *dev, void *drv_data, bool enable) +{ + struct ras2_mem_ctx *ras2_ctx = drv_data; + struct acpi_ras2_ps_shared_mem __iomem *ps_sm = (void *) + ras2_ctx->pcc_comm_addr; + bool running; + int ret; + + guard(mutex)(&ras2_ctx->lock); + ps_sm->common.set_capabilities[0] = RAS2_SUPPORT_HW_PARTOL_SCRUB; + ret = ras2_get_patrol_scrub_running(ras2_ctx, &running); + if (ret) + return ret; + if (enable) { + if (ras2_ctx->bg || running) + return -EBUSY; + ps_sm->params.requested_address_range[0] = 0; + ps_sm->params.requested_address_range[1] = 0; + ps_sm->params.scrub_params_in &= ~RAS2_PATROL_SCRUB_SCHRS_IN_MASK; + ps_sm->params.scrub_params_in |= FIELD_PREP(RAS2_PATROL_SCRUB_SCHRS_IN_MASK, + ras2_ctx->scrub_cycle_hrs); + ps_sm->params.patrol_scrub_command = RAS2_START_PATROL_SCRUBBER; + } else { + if (!ras2_ctx->bg) + return -EPERM; + if (!ras2_ctx->bg && running) + return -EBUSY; + ps_sm->params.patrol_scrub_command = RAS2_STOP_PATROL_SCRUBBER; + } + ps_sm->params.scrub_params_in &= ~RAS2_PATROL_SCRUB_EN_BACKGROUND; + ps_sm->params.scrub_params_in |= FIELD_PREP(RAS2_PATROL_SCRUB_EN_BACKGROUND, + enable); + ret = ras2_send_pcc_cmd(ras2_ctx, RAS2_PCC_CMD_EXEC); + if (ret) { + dev_err(ras2_ctx->dev, "Failed to %s background scrubbing\n", + enable ? "enable" : "disable"); + return ret; + } + if (enable) { + ras2_ctx->bg = true; + /* Update the cache to account for rounding of supplied parameters and similar */ + ret = ras2_update_patrol_scrub_params_cache(ras2_ctx); + } else { + ret = ras2_update_patrol_scrub_params_cache(ras2_ctx); + ras2_ctx->bg = false; + } + + return ret; +} + +static int ras2_hw_scrub_get_enabled_bg(struct device *dev, void *drv_data, bool *enabled) +{ + struct ras2_mem_ctx *ras2_ctx = drv_data; + + *enabled = ras2_ctx->bg; + + return 0; +} + +static const struct edac_scrub_ops ras2_scrub_ops = { + .read_addr = ras2_hw_scrub_read_addr, + .read_size = ras2_hw_scrub_read_size, + .write_addr = ras2_hw_scrub_write_addr, + .write_size = ras2_hw_scrub_write_size, + .get_enabled_bg = ras2_hw_scrub_get_enabled_bg, + .set_enabled_bg = ras2_hw_scrub_set_enabled_bg, + .get_min_cycle = ras2_hw_scrub_read_min_scrub_cycle, + .get_max_cycle = ras2_hw_scrub_read_max_scrub_cycle, + .get_cycle_duration = ras2_hw_scrub_cycle_read, + .set_cycle_duration = ras2_hw_scrub_cycle_write, +}; + +static int ras2_probe(struct auxiliary_device *auxdev, + const struct auxiliary_device_id *id) +{ + struct ras2_mem_ctx *ras2_ctx = container_of(auxdev, struct ras2_mem_ctx, adev); + struct edac_dev_feature ras_features[RAS2_DEV_NUM_RAS_FEATURES]; + char scrub_name[RAS2_SCRUB_NAME_LEN]; + int num_ras_features = 0; + int ret; + + if (!ras2_is_patrol_scrub_support(ras2_ctx)) + return -EOPNOTSUPP; + + ret = ras2_update_patrol_scrub_params_cache(ras2_ctx); + if (ret) + return ret; + + snprintf(scrub_name, sizeof(scrub_name), "acpi_ras_mem%d", + ras2_ctx->id); + + ras_features[num_ras_features].ft_type = RAS_FEAT_SCRUB; + ras_features[num_ras_features].instance = ras2_ctx->instance; + ras_features[num_ras_features].scrub_ops = &ras2_scrub_ops; + ras_features[num_ras_features].ctx = ras2_ctx; + num_ras_features++; + + return edac_dev_register(&auxdev->dev, scrub_name, NULL, + num_ras_features, ras_features); +} + +static const struct auxiliary_device_id ras2_mem_dev_id_table[] = { + { .name = RAS2_AUX_DEV_NAME "." RAS2_MEM_DEV_ID_NAME, }, + { }, +}; + +MODULE_DEVICE_TABLE(auxiliary, ras2_mem_dev_id_table); + +static struct auxiliary_driver ras2_mem_driver = { + .name = RAS2_MEM_DEV_ID_NAME, + .probe = ras2_probe, + .id_table = ras2_mem_dev_id_table, +}; +module_auxiliary_driver(ras2_mem_driver); + +MODULE_IMPORT_NS("ACPI_RAS2"); +MODULE_DESCRIPTION("ACPI RAS2 memory driver"); +MODULE_LICENSE("GPL"); From patchwork Mon Jan 6 12:10:04 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Shiju Jose X-Patchwork-Id: 855293 Received: from frasgout.his.huawei.com (frasgout.his.huawei.com [185.176.79.56]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A693A1DE4F0; Mon, 6 Jan 2025 12:11:46 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=185.176.79.56 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1736165508; cv=none; b=RLmMNJi0GScIXyHzr/FebjRRaPI2sdzTSTYWuNEcO7SxAE8wHD+6QEda5A8KXh2w3iIgzwEQ7F3mg9hJ4WJFFrmHqh+HtsQKIcUPj9g8HYTGgJ/RH0DJQOJllwxcCQOHXGFv2yUf3OcFURTcUuK89NO7nn5sWtpw6cREYkGuhEY= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1736165508; c=relaxed/simple; bh=hr48c2aKWn2sb79JlUxbmNm/QHwdNCQN5RcEgeJmvFA=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=MVBetiwQnjt7lLfNt1VhUVach4obK5/f7dbTi1Lsy3ZpnF16m8m0r7eIoJ1ibIcEAnrsi5/XN8CbvYlfkviFg7qxCWxJ0/kcuyNK66SL/qJ3b9dXx2nI8eTaElkDfh6fKqcVuTsJ7R5etwcaIKyXzNJKYc0g8RxQnivGQDO6DlE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=huawei.com; spf=pass smtp.mailfrom=huawei.com; arc=none smtp.client-ip=185.176.79.56 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=huawei.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huawei.com Received: from mail.maildlp.com (unknown [172.18.186.31]) by frasgout.his.huawei.com (SkyGuard) with ESMTP id 4YRY0z2TW1z6FGmV; Mon, 6 Jan 2025 20:10:43 +0800 (CST) Received: from frapeml500007.china.huawei.com (unknown [7.182.85.172]) by mail.maildlp.com (Postfix) with ESMTPS id BD5F214022E; Mon, 6 Jan 2025 20:11:44 +0800 (CST) Received: from P_UKIT01-A7bmah.china.huawei.com (10.126.170.95) by frapeml500007.china.huawei.com (7.182.85.172) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.1.2507.39; Mon, 6 Jan 2025 13:11:42 +0100 From: To: , , , , CC: , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , Subject: [PATCH v18 08/19] cxl: Add skeletal features driver Date: Mon, 6 Jan 2025 12:10:04 +0000 Message-ID: <20250106121017.1620-9-shiju.jose@huawei.com> X-Mailer: git-send-email 2.43.0.windows.1 In-Reply-To: <20250106121017.1620-1-shiju.jose@huawei.com> References: <20250106121017.1620-1-shiju.jose@huawei.com> Precedence: bulk X-Mailing-List: linux-acpi@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-ClientProxiedBy: lhrpeml100011.china.huawei.com (7.191.174.247) To frapeml500007.china.huawei.com (7.182.85.172) From: Dave Jiang Add the basic bits of a features driver to handle all CXL feature related services. Signed-off-by: Dave Jiang --- drivers/cxl/Kconfig | 8 +++++ drivers/cxl/Makefile | 3 ++ drivers/cxl/core/Makefile | 1 + drivers/cxl/core/core.h | 1 + drivers/cxl/core/features.c | 71 +++++++++++++++++++++++++++++++++++++ drivers/cxl/core/port.c | 3 ++ drivers/cxl/cxl.h | 1 + drivers/cxl/features.c | 44 +++++++++++++++++++++++ drivers/cxl/pci.c | 19 ++++++++++ include/cxl/features.h | 23 ++++++++++++ include/cxl/mailbox.h | 3 ++ tools/testing/cxl/Kbuild | 1 + 12 files changed, 178 insertions(+) create mode 100644 drivers/cxl/core/features.c create mode 100644 drivers/cxl/features.c create mode 100644 include/cxl/features.h diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig index 876469e23f7a..0bc6a2cb8474 100644 --- a/drivers/cxl/Kconfig +++ b/drivers/cxl/Kconfig @@ -146,4 +146,12 @@ config CXL_REGION_INVALIDATION_TEST If unsure, or if this kernel is meant for production environments, say N. +config CXL_FEATURES + tristate "CXL: Features support" + default CXL_BUS + help + Enable CXL features support that are tied to a CXL mailbox. + + If unsure say 'y'. + endif diff --git a/drivers/cxl/Makefile b/drivers/cxl/Makefile index 2caa90fa4bf2..4696fc218df4 100644 --- a/drivers/cxl/Makefile +++ b/drivers/cxl/Makefile @@ -7,15 +7,18 @@ # - 'mem' and 'pmem' before endpoint drivers so that memdevs are # immediately enabled # - 'pci' last, also mirrors the hardware enumeration hierarchy +# - 'features' comes after pci device is enumerated obj-y += core/ obj-$(CONFIG_CXL_PORT) += cxl_port.o obj-$(CONFIG_CXL_ACPI) += cxl_acpi.o obj-$(CONFIG_CXL_PMEM) += cxl_pmem.o obj-$(CONFIG_CXL_MEM) += cxl_mem.o obj-$(CONFIG_CXL_PCI) += cxl_pci.o +obj-$(CONFIG_CXL_FEATURES) += cxl_features.o cxl_port-y := port.o cxl_acpi-y := acpi.o cxl_pmem-y := pmem.o security.o cxl_mem-y := mem.o cxl_pci-y := pci.o +cxl_features-y := features.o diff --git a/drivers/cxl/core/Makefile b/drivers/cxl/core/Makefile index 9259bcc6773c..73b6348afd67 100644 --- a/drivers/cxl/core/Makefile +++ b/drivers/cxl/core/Makefile @@ -14,5 +14,6 @@ cxl_core-y += pci.o cxl_core-y += hdm.o cxl_core-y += pmu.o cxl_core-y += cdat.o +cxl_core-y += features.o cxl_core-$(CONFIG_TRACING) += trace.o cxl_core-$(CONFIG_CXL_REGION) += region.o diff --git a/drivers/cxl/core/core.h b/drivers/cxl/core/core.h index 23761340e65c..e8a3df226643 100644 --- a/drivers/cxl/core/core.h +++ b/drivers/cxl/core/core.h @@ -9,6 +9,7 @@ extern const struct device_type cxl_nvdimm_bridge_type; extern const struct device_type cxl_nvdimm_type; extern const struct device_type cxl_pmu_type; +extern const struct device_type cxl_features_type; extern struct attribute_group cxl_base_attribute_group; diff --git a/drivers/cxl/core/features.c b/drivers/cxl/core/features.c new file mode 100644 index 000000000000..eb6eb191a32e --- /dev/null +++ b/drivers/cxl/core/features.c @@ -0,0 +1,71 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* Copyright(c) 2024-2025 Intel Corporation. All rights reserved. */ +#include +#include "cxl.h" +#include "core.h" + +#define CXL_FEATURE_MAX_DEVS 65536 +static DEFINE_IDA(cxl_features_ida); + +static void cxl_features_release(struct device *dev) +{ + struct cxl_features *features = to_cxl_features(dev); + + ida_free(&cxl_features_ida, features->id); + kfree(features); +} + +static void remove_features_dev(void *dev) +{ + device_unregister(dev); +} + +const struct device_type cxl_features_type = { + .name = "features", + .release = cxl_features_release, +}; +EXPORT_SYMBOL_NS_GPL(cxl_features_type, "CXL"); + +struct cxl_features *cxl_features_alloc(struct cxl_mailbox *cxl_mbox, + struct device *parent) +{ + struct device *dev; + int rc; + + struct cxl_features *features __free(kfree) = + kzalloc(sizeof(*features), GFP_KERNEL); + if (!features) + return ERR_PTR(-ENOMEM); + + rc = ida_alloc_max(&cxl_features_ida, CXL_FEATURE_MAX_DEVS - 1, + GFP_KERNEL); + if (rc < 0) + return ERR_PTR(rc); + + features->id = rc; + features->cxl_mbox = cxl_mbox; + dev = &features->dev; + device_initialize(dev); + device_set_pm_not_required(dev); + dev->parent = parent; + dev->bus = &cxl_bus_type; + dev->type = &cxl_features_type; + rc = dev_set_name(dev, "features%d", features->id); + if (rc) + goto err; + + rc = device_add(dev); + if (rc) + goto err; + + rc = devm_add_action_or_reset(parent, remove_features_dev, dev); + if (rc) + goto err; + + return no_free_ptr(features); + +err: + put_device(dev); + return ERR_PTR(rc); +} +EXPORT_SYMBOL_NS_GPL(cxl_features_alloc, "CXL"); diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c index 78a5c2c25982..cc53a597cae6 100644 --- a/drivers/cxl/core/port.c +++ b/drivers/cxl/core/port.c @@ -74,6 +74,9 @@ static int cxl_device_id(const struct device *dev) return CXL_DEVICE_REGION; if (dev->type == &cxl_pmu_type) return CXL_DEVICE_PMU; + if (dev->type == &cxl_features_type) + return CXL_DEVICE_FEATURES; + return 0; } diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h index f6015f24ad38..ee29d1a1c8df 100644 --- a/drivers/cxl/cxl.h +++ b/drivers/cxl/cxl.h @@ -855,6 +855,7 @@ void cxl_driver_unregister(struct cxl_driver *cxl_drv); #define CXL_DEVICE_PMEM_REGION 7 #define CXL_DEVICE_DAX_REGION 8 #define CXL_DEVICE_PMU 9 +#define CXL_DEVICE_FEATURES 10 #define MODULE_ALIAS_CXL(type) MODULE_ALIAS("cxl:t" __stringify(type) "*") #define CXL_MODALIAS_FMT "cxl:t%d" diff --git a/drivers/cxl/features.c b/drivers/cxl/features.c new file mode 100644 index 000000000000..93b16b5e2b68 --- /dev/null +++ b/drivers/cxl/features.c @@ -0,0 +1,44 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* Copyright(c) 2024,2025 Intel Corporation. All rights reserved. */ +#include +#include +#include +#include + +#include "cxl.h" + +static int cxl_features_probe(struct device *dev) +{ + struct cxl_features *features = to_cxl_features(dev); + struct cxl_features_state *cfs __free(kfree) = + kzalloc(sizeof(*cfs), GFP_KERNEL); + + if (!cfs) + return -ENOMEM; + + cfs->features = features; + dev_set_drvdata(dev, no_free_ptr(cfs)); + + return 0; +} + +static void cxl_features_remove(struct device *dev) +{ + struct cxl_features_state *cfs = dev_get_drvdata(dev); + + kfree(cfs); +} + +static struct cxl_driver cxl_features_driver = { + .name = "cxl_features", + .probe = cxl_features_probe, + .remove = cxl_features_remove, + .id = CXL_DEVICE_FEATURES, +}; + +module_cxl_driver(cxl_features_driver); + +MODULE_DESCRIPTION("CXL: Features"); +MODULE_LICENSE("GPL v2"); +MODULE_IMPORT_NS("CXL"); +MODULE_ALIAS_CXL(CXL_DEVICE_FEATURES); diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c index 6d94ff4a4f1a..eb68dd3f8b21 100644 --- a/drivers/cxl/pci.c +++ b/drivers/cxl/pci.c @@ -386,6 +386,21 @@ static int cxl_pci_mbox_send(struct cxl_mailbox *cxl_mbox, return rc; } +static int cxl_pci_setup_features(struct cxl_memdev_state *mds) +{ + struct cxl_dev_state *cxlds = &mds->cxlds; + struct cxl_mailbox *cxl_mbox = &cxlds->cxl_mbox; + struct cxl_features *features; + + features = cxl_features_alloc(cxl_mbox, cxlds->dev); + if (IS_ERR(features)) + return PTR_ERR(features); + + cxl_mbox->features = features; + + return 0; +} + static int cxl_pci_setup_mailbox(struct cxl_memdev_state *mds, bool irq_avail) { struct cxl_dev_state *cxlds = &mds->cxlds; @@ -980,6 +995,10 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id) if (rc) return rc; + rc = cxl_pci_setup_features(mds); + if (rc) + return rc; + rc = cxl_set_timestamp(mds); if (rc) return rc; diff --git a/include/cxl/features.h b/include/cxl/features.h new file mode 100644 index 000000000000..b92da1e92780 --- /dev/null +++ b/include/cxl/features.h @@ -0,0 +1,23 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* Copyright(c) 2024-2025 Intel Corporation. */ +#ifndef __CXL_FEATURES_H__ +#define __CXL_FEATURES_H__ + +struct cxl_mailbox; + +struct cxl_features { + int id; + struct device dev; + struct cxl_mailbox *cxl_mbox; +}; +#define to_cxl_features(dev) container_of(dev, struct cxl_features, dev) + +struct cxl_features_state { + struct cxl_features *features; + int num_features; +}; + +struct cxl_features *cxl_features_alloc(struct cxl_mailbox *cxl_mbox, + struct device *parent); + +#endif diff --git a/include/cxl/mailbox.h b/include/cxl/mailbox.h index cc894f07a435..6caab0d406ba 100644 --- a/include/cxl/mailbox.h +++ b/include/cxl/mailbox.h @@ -3,6 +3,7 @@ #ifndef __CXL_MBOX_H__ #define __CXL_MBOX_H__ #include +#include #include /** @@ -50,6 +51,7 @@ struct cxl_mbox_cmd { * (CXL 3.1 8.2.8.4.3 Mailbox Capabilities Register) * @mbox_mutex: mutex protects device mailbox and firmware * @mbox_wait: rcuwait for mailbox + * @features: pointer to cxl_features device * @mbox_send: @dev specific transport for transmitting mailbox commands */ struct cxl_mailbox { @@ -59,6 +61,7 @@ struct cxl_mailbox { size_t payload_size; struct mutex mbox_mutex; /* lock to protect mailbox context */ struct rcuwait mbox_wait; + struct cxl_features *features; int (*mbox_send)(struct cxl_mailbox *cxl_mbox, struct cxl_mbox_cmd *cmd); }; diff --git a/tools/testing/cxl/Kbuild b/tools/testing/cxl/Kbuild index b1256fee3567..79de943841f4 100644 --- a/tools/testing/cxl/Kbuild +++ b/tools/testing/cxl/Kbuild @@ -61,6 +61,7 @@ cxl_core-y += $(CXL_CORE_SRC)/pci.o cxl_core-y += $(CXL_CORE_SRC)/hdm.o cxl_core-y += $(CXL_CORE_SRC)/pmu.o cxl_core-y += $(CXL_CORE_SRC)/cdat.o +cxl_core-y += $(CXL_CORE_SRC)/features.o cxl_core-$(CONFIG_TRACING) += $(CXL_CORE_SRC)/trace.o cxl_core-$(CONFIG_CXL_REGION) += $(CXL_CORE_SRC)/region.o cxl_core-y += config_check.o From patchwork Mon Jan 6 12:10:06 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Shiju Jose X-Patchwork-Id: 855292 Received: from frasgout.his.huawei.com (frasgout.his.huawei.com [185.176.79.56]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 64F651DE8AF; Mon, 6 Jan 2025 12:11:52 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=185.176.79.56 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1736165514; cv=none; b=Ndlxg08t+XqLPp+lqVmOR31hb1Z18IjIv3teGaEoF02PPwSBFGi63ma25B0cvJ0peej+FIGb1DR6R75bzVtmEBoIihngHBxJSNZe1uABDkwk2HpbVXcZH2cAPsBUQoE29PSWNsRlxsry1bRAnn1bIPf2hBEoAs0r5tMQvKxW8Sk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1736165514; c=relaxed/simple; bh=dX5AmIudljEDdI0urDRi2KRycFlDmnqShdtsWf24+1g=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=prg8LwsldXWue1dMx4T5Slxc/5RrWirE3uNj6yHw4YhifEhBW5chWbbJJMLuZMY/YPLq358VAJeAAdnjD5/WB0BYxWVOBwW09HrFyr8iA3l4/DD5ZQiO1D1yGBzYUaOtbP9ji7m9gbU6N8Ehke28t6dYhbIuOk6A68IWhf+WDAk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=huawei.com; spf=pass smtp.mailfrom=huawei.com; arc=none smtp.client-ip=185.176.79.56 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=huawei.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huawei.com Received: from mail.maildlp.com (unknown [172.18.186.216]) by frasgout.his.huawei.com (SkyGuard) with ESMTP id 4YRXx45RsGz6K9Yt; Mon, 6 Jan 2025 20:07:20 +0800 (CST) Received: from frapeml500007.china.huawei.com (unknown [7.182.85.172]) by mail.maildlp.com (Postfix) with ESMTPS id 5BAF31408F9; Mon, 6 Jan 2025 20:11:50 +0800 (CST) Received: from P_UKIT01-A7bmah.china.huawei.com (10.126.170.95) by frapeml500007.china.huawei.com (7.182.85.172) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.1.2507.39; Mon, 6 Jan 2025 13:11:48 +0100 From: To: , , , , CC: , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , Subject: [PATCH v18 10/19] cxl: Add Get Supported Features command for kernel usage Date: Mon, 6 Jan 2025 12:10:06 +0000 Message-ID: <20250106121017.1620-11-shiju.jose@huawei.com> X-Mailer: git-send-email 2.43.0.windows.1 In-Reply-To: <20250106121017.1620-1-shiju.jose@huawei.com> References: <20250106121017.1620-1-shiju.jose@huawei.com> Precedence: bulk X-Mailing-List: linux-acpi@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-ClientProxiedBy: lhrpeml100011.china.huawei.com (7.191.174.247) To frapeml500007.china.huawei.com (7.182.85.172) From: Dave Jiang CXL spec r3.1 8.2.9.6.1 Get Supported Features (Opcode 0500h) The command retrieve the list of supported device-specific features (identified by UUID) and general information about each Feature. The driver will retrieve the feature entries in order to make checks and provide information for the Get Feature and Set Feature command. One of the main piece of information retrieved are the effects a Set Feature command would have for a particular feature. Co-developed-by: Shiju Jose Signed-off-by: Shiju Jose Signed-off-by: Dave Jiang --- drivers/cxl/core/features.c | 28 +++++++ drivers/cxl/core/mbox.c | 3 +- drivers/cxl/cxl.h | 2 + drivers/cxl/features.c | 146 +++++++++++++++++++++++++++++++++++- include/cxl/features.h | 32 ++++++++ 5 files changed, 209 insertions(+), 2 deletions(-) diff --git a/drivers/cxl/core/features.c b/drivers/cxl/core/features.c index eb6eb191a32e..66a4b82910e6 100644 --- a/drivers/cxl/core/features.c +++ b/drivers/cxl/core/features.c @@ -1,6 +1,7 @@ // SPDX-License-Identifier: GPL-2.0-only /* Copyright(c) 2024-2025 Intel Corporation. All rights reserved. */ #include +#include #include "cxl.h" #include "core.h" @@ -69,3 +70,30 @@ struct cxl_features *cxl_features_alloc(struct cxl_mailbox *cxl_mbox, return ERR_PTR(rc); } EXPORT_SYMBOL_NS_GPL(cxl_features_alloc, "CXL"); + +struct cxl_feat_entry * +cxl_get_supported_feature_entry(struct cxl_features *features, + const uuid_t *feat_uuid) +{ + struct cxl_feat_entry *feat_entry; + struct cxl_features_state *cfs; + int count; + + cfs = dev_get_drvdata(&features->dev); + if (!cfs) + return ERR_PTR(-EOPNOTSUPP); + + if (!cfs->num_features) + return ERR_PTR(-ENOENT); + + /* Check CXL dev supports the feature */ + feat_entry = cfs->entries; + for (count = 0; count < cfs->num_features; + count++, feat_entry++) { + if (uuid_equal(&feat_entry->uuid, feat_uuid)) + return feat_entry; + } + + return ERR_PTR(-ENOENT); +} +EXPORT_SYMBOL_NS_GPL(cxl_get_supported_feature_entry, "CXL"); diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c index 5e21ff99d70f..0b4946205910 100644 --- a/drivers/cxl/core/mbox.c +++ b/drivers/cxl/core/mbox.c @@ -234,7 +234,7 @@ static struct cxl_mem_command *cxl_mem_find_command(u16 opcode) return NULL; } -static struct cxl_mem_command *cxl_find_feature_command(u16 opcode) +struct cxl_mem_command *cxl_find_feature_command(u16 opcode) { struct cxl_mem_command *c; @@ -244,6 +244,7 @@ static struct cxl_mem_command *cxl_find_feature_command(u16 opcode) return NULL; } +EXPORT_SYMBOL_NS_GPL(cxl_find_feature_command, "CXL"); static const char *cxl_mem_opcode_to_name(u16 opcode) { diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h index ee29d1a1c8df..1284614d71d0 100644 --- a/drivers/cxl/cxl.h +++ b/drivers/cxl/cxl.h @@ -912,6 +912,8 @@ void cxl_coordinates_combine(struct access_coordinate *out, bool cxl_endpoint_decoder_reset_detected(struct cxl_port *port); +struct cxl_mem_command *cxl_find_feature_command(u16 opcode); + /* * Unit test builds overrides this to __weak, find the 'strong' version * of these symbols in tools/testing/cxl/. diff --git a/drivers/cxl/features.c b/drivers/cxl/features.c index 93b16b5e2b68..2cdf5ed0a771 100644 --- a/drivers/cxl/features.c +++ b/drivers/cxl/features.c @@ -3,20 +3,164 @@ #include #include #include +#include #include #include "cxl.h" +#include "cxlmem.h" + +static void cxl_free_feature_entries(void *entries) +{ + kvfree(entries); +} + +static int cxl_get_supported_features_count(struct cxl_mailbox *cxl_mbox) +{ + struct cxl_mbox_get_sup_feats_out mbox_out; + struct cxl_mbox_get_sup_feats_in mbox_in; + struct cxl_mbox_cmd mbox_cmd; + int rc; + + memset(&mbox_in, 0, sizeof(mbox_in)); + mbox_in.count = cpu_to_le32(sizeof(mbox_out)); + memset(&mbox_out, 0, sizeof(mbox_out)); + mbox_cmd = (struct cxl_mbox_cmd) { + .opcode = CXL_MBOX_OP_GET_SUPPORTED_FEATURES, + .size_in = sizeof(mbox_in), + .payload_in = &mbox_in, + .size_out = sizeof(mbox_out), + .payload_out = &mbox_out, + .min_out = sizeof(mbox_out), + }; + rc = cxl_internal_send_cmd(cxl_mbox, &mbox_cmd); + if (rc < 0) + return rc; + + return le16_to_cpu(mbox_out.supported_feats); +} + +static int cxl_get_supported_features(struct cxl_features_state *cfs) +{ + int remain_feats, max_size, max_feats, start, rc, hdr_size; + struct cxl_mailbox *cxl_mbox = cfs->features->cxl_mbox; + int feat_size = sizeof(struct cxl_feat_entry); + struct cxl_mbox_get_sup_feats_in mbox_in; + struct cxl_feat_entry *entry; + struct cxl_mbox_cmd mbox_cmd; + struct cxl_mem_command *cmd; + int count; + + /* Get supported features is optional, need to check */ + cmd = cxl_find_feature_command(CXL_MBOX_OP_GET_SUPPORTED_FEATURES); + if (!cmd) + return -EOPNOTSUPP; + if (!test_bit(cmd->info.id, cxl_mbox->feature_cmds)) + return -EOPNOTSUPP; + + count = cxl_get_supported_features_count(cxl_mbox); + if (count == 0) + return 0; + if (count < 0) + return -ENXIO; + + struct cxl_feat_entry *entries __free(kvfree) = + kvmalloc(count * sizeof(*entries), GFP_KERNEL); + if (!entries) + return -ENOMEM; + + struct cxl_mbox_get_sup_feats_out *mbox_out __free(kvfree) = + kvmalloc(cxl_mbox->payload_size, GFP_KERNEL); + if (!mbox_out) + return -ENOMEM; + + hdr_size = sizeof(*mbox_out); + max_size = cxl_mbox->payload_size - hdr_size; + /* max feat entries that can fit in mailbox max payload size */ + max_feats = max_size / feat_size; + entry = entries; + + start = 0; + remain_feats = count; + do { + int retrieved, alloc_size, copy_feats; + int num_entries; + + if (remain_feats > max_feats) { + alloc_size = sizeof(*mbox_out) + max_feats * feat_size; + remain_feats = remain_feats - max_feats; + copy_feats = max_feats; + } else { + alloc_size = sizeof(*mbox_out) + remain_feats * feat_size; + copy_feats = remain_feats; + remain_feats = 0; + } + + memset(&mbox_in, 0, sizeof(mbox_in)); + mbox_in.count = cpu_to_le32(alloc_size); + mbox_in.start_idx = cpu_to_le16(start); + memset(mbox_out, 0, alloc_size); + mbox_cmd = (struct cxl_mbox_cmd) { + .opcode = CXL_MBOX_OP_GET_SUPPORTED_FEATURES, + .size_in = sizeof(mbox_in), + .payload_in = &mbox_in, + .size_out = alloc_size, + .payload_out = mbox_out, + .min_out = hdr_size, + }; + rc = cxl_internal_send_cmd(cxl_mbox, &mbox_cmd); + if (rc < 0) + return rc; + + if (mbox_cmd.size_out <= hdr_size) + return -ENXIO; + + /* + * Make sure retrieved out buffer is multiple of feature + * entries. + */ + retrieved = mbox_cmd.size_out - hdr_size; + if (retrieved % feat_size) + return -ENXIO; + + num_entries = le16_to_cpu(mbox_out->num_entries); + /* + * If the reported output entries * defined entry size != + * retrieved output bytes, then the output package is incorrect. + */ + if (num_entries * feat_size != retrieved) + return -ENXIO; + + memcpy(entry, mbox_out->ents, retrieved); + entry++; + /* + * If the number of output entries is less than expected, add the + * remaining entries to the next batch. + */ + remain_feats += copy_feats - num_entries; + start += num_entries; + } while (remain_feats); + + cfs->num_features = count; + cfs->entries = no_free_ptr(entries); + return devm_add_action_or_reset(&cfs->features->dev, + cxl_free_feature_entries, cfs->entries); +} static int cxl_features_probe(struct device *dev) { struct cxl_features *features = to_cxl_features(dev); + int rc; + struct cxl_features_state *cfs __free(kfree) = kzalloc(sizeof(*cfs), GFP_KERNEL); - if (!cfs) return -ENOMEM; cfs->features = features; + rc = cxl_get_supported_features(cfs); + if (rc) + return rc; + dev_set_drvdata(dev, no_free_ptr(cfs)); return 0; diff --git a/include/cxl/features.h b/include/cxl/features.h index 7a8be3c621a1..429b9782667c 100644 --- a/include/cxl/features.h +++ b/include/cxl/features.h @@ -3,6 +3,8 @@ #ifndef __CXL_FEATURES_H__ #define __CXL_FEATURES_H__ +#include + struct cxl_mailbox; enum feature_cmds { @@ -19,12 +21,42 @@ struct cxl_features { }; #define to_cxl_features(dev) container_of(dev, struct cxl_features, dev) +/* Get Supported Features (0x500h) CXL r3.1 8.2.9.6.1 */ +struct cxl_mbox_get_sup_feats_in { + __le32 count; + __le16 start_idx; + u8 reserved[2]; +} __packed; + +struct cxl_feat_entry { + uuid_t uuid; + __le16 id; + __le16 get_feat_size; + __le16 set_feat_size; + __le32 flags; + u8 get_feat_ver; + u8 set_feat_ver; + __le16 effects; + u8 reserved[18]; +} __packed; + +struct cxl_mbox_get_sup_feats_out { + __le16 num_entries; + __le16 supported_feats; + u8 reserved[4]; + struct cxl_feat_entry ents[] __counted_by_le(num_entries); +} __packed; + struct cxl_features_state { struct cxl_features *features; int num_features; + struct cxl_feat_entry *entries; }; struct cxl_features *cxl_features_alloc(struct cxl_mailbox *cxl_mbox, struct device *parent); +struct cxl_feat_entry * +cxl_get_supported_feature_entry(struct cxl_features *features, + const uuid_t *feat_uuid); #endif From patchwork Mon Jan 6 12:10:08 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Shiju Jose X-Patchwork-Id: 855291 Received: from frasgout.his.huawei.com (frasgout.his.huawei.com [185.176.79.56]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0F2D81DD874; Mon, 6 Jan 2025 12:11:57 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=185.176.79.56 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1736165519; cv=none; b=BF7TUuwwbYYnOzRU44wmr9TKIpv6kKenWGiqHKQEzeyPe9RQgZLCdsojdaAYlWu8D5YPp2E9dR4gfs5ipJtVm3R6ZMLKelLZ2vfqSDFaW879vuS33mBaIzlV939LkBC1RhP+BuAVc+ubu1LVWK3GKTZcXlUlBZ0UK4Gdl0gWJZU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1736165519; c=relaxed/simple; bh=+yx7xCnt0ZOemBGyckjkr4z6UlqXQFF3Ep1mw3tNee4=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=Lait1fUb/7AC0d5R5WBV+D7ccR5LB5pRq4j3yRdqy8SPIPSPMEVbnTv1Kca8W/7yQY8csiVf+zAeg3NnZipE3HvTYBzaG+dYe2udTSYBwW8AwxZABqim6QKdtGr38Jr9PTysGuQG/EngIAh+VDWRyxQW1UH0j4nzuTiQND21/tI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=huawei.com; spf=pass smtp.mailfrom=huawei.com; arc=none smtp.client-ip=185.176.79.56 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=huawei.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huawei.com Received: from mail.maildlp.com (unknown [172.18.186.231]) by frasgout.his.huawei.com (SkyGuard) with ESMTP id 4YRXxB43NLz6K8qh; Mon, 6 Jan 2025 20:07:26 +0800 (CST) Received: from frapeml500007.china.huawei.com (unknown [7.182.85.172]) by mail.maildlp.com (Postfix) with ESMTPS id 26439140A08; Mon, 6 Jan 2025 20:11:56 +0800 (CST) Received: from P_UKIT01-A7bmah.china.huawei.com (10.126.170.95) by frapeml500007.china.huawei.com (7.182.85.172) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.1.2507.39; Mon, 6 Jan 2025 13:11:54 +0100 From: To: , , , , CC: , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , Subject: [PATCH v18 12/19] cxl/mbox: Add GET_FEATURE mailbox command Date: Mon, 6 Jan 2025 12:10:08 +0000 Message-ID: <20250106121017.1620-13-shiju.jose@huawei.com> X-Mailer: git-send-email 2.43.0.windows.1 In-Reply-To: <20250106121017.1620-1-shiju.jose@huawei.com> References: <20250106121017.1620-1-shiju.jose@huawei.com> Precedence: bulk X-Mailing-List: linux-acpi@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-ClientProxiedBy: lhrpeml100011.china.huawei.com (7.191.174.247) To frapeml500007.china.huawei.com (7.182.85.172) From: Shiju Jose Add support for GET_FEATURE mailbox command. CXL spec 3.1 section 8.2.9.6 describes optional device specific features. The settings of a feature can be retrieved using Get Feature command. CXL spec 3.1 section 8.2.9.6.2 describes Get Feature command. Signed-off-by: Shiju Jose Signed-off-by: Dave Jiang --- drivers/cxl/core/features.c | 74 +++++++++++++++++++++++++++++++++++++ drivers/cxl/features.c | 6 +-- include/cxl/features.h | 27 ++++++++++++++ 3 files changed, 102 insertions(+), 5 deletions(-) diff --git a/drivers/cxl/core/features.c b/drivers/cxl/core/features.c index 66a4b82910e6..ab9386b53a95 100644 --- a/drivers/cxl/core/features.c +++ b/drivers/cxl/core/features.c @@ -4,6 +4,7 @@ #include #include "cxl.h" #include "core.h" +#include "cxlmem.h" #define CXL_FEATURE_MAX_DEVS 65536 static DEFINE_IDA(cxl_features_ida); @@ -97,3 +98,76 @@ cxl_get_supported_feature_entry(struct cxl_features *features, return ERR_PTR(-ENOENT); } EXPORT_SYMBOL_NS_GPL(cxl_get_supported_feature_entry, "CXL"); + +bool cxl_feature_enabled(struct cxl_features_state *cfs, u16 opcode) +{ + struct cxl_mailbox *cxl_mbox = cfs->features->cxl_mbox; + struct cxl_mem_command *cmd; + + cmd = cxl_find_feature_command(opcode); + if (!cmd) + return false; + + return test_bit(cmd->info.id, cxl_mbox->feature_cmds); +} +EXPORT_SYMBOL_NS_GPL(cxl_feature_enabled, "CXL"); + +size_t cxl_get_feature(struct cxl_features *features, const uuid_t feat_uuid, + enum cxl_get_feat_selection selection, + void *feat_out, size_t feat_out_size, u16 offset, + u16 *return_code) +{ + size_t data_to_rd_size, size_out; + struct cxl_features_state *cfs; + struct cxl_mbox_get_feat_in pi; + struct cxl_mailbox *cxl_mbox; + struct cxl_mbox_cmd mbox_cmd; + size_t data_rcvd_size = 0; + int rc; + + if (return_code) + *return_code = CXL_MBOX_CMD_RC_INPUT; + + cfs = dev_get_drvdata(&features->dev); + if (!cfs) + return 0; + + if (!cxl_feature_enabled(cfs, CXL_MBOX_OP_GET_FEATURE)) + return 0; + + if (!feat_out || !feat_out_size) + return 0; + + cxl_mbox = features->cxl_mbox; + size_out = min(feat_out_size, cxl_mbox->payload_size); + pi.uuid = feat_uuid; + pi.selection = selection; + do { + data_to_rd_size = min(feat_out_size - data_rcvd_size, + cxl_mbox->payload_size); + pi.offset = cpu_to_le16(offset + data_rcvd_size); + pi.count = cpu_to_le16(data_to_rd_size); + + mbox_cmd = (struct cxl_mbox_cmd) { + .opcode = CXL_MBOX_OP_GET_FEATURE, + .size_in = sizeof(pi), + .payload_in = &pi, + .size_out = size_out, + .payload_out = feat_out + data_rcvd_size, + .min_out = data_to_rd_size, + }; + rc = cxl_internal_send_cmd(cxl_mbox, &mbox_cmd); + if (rc < 0 || !mbox_cmd.size_out) { + if (return_code) + *return_code = mbox_cmd.return_code; + return 0; + } + data_rcvd_size += mbox_cmd.size_out; + } while (data_rcvd_size < feat_out_size); + + if (return_code) + *return_code = CXL_MBOX_CMD_RC_SUCCESS; + + return data_rcvd_size; +} +EXPORT_SYMBOL_NS_GPL(cxl_get_feature, "CXL"); diff --git a/drivers/cxl/features.c b/drivers/cxl/features.c index 2f0fb072921e..2faa9e03a840 100644 --- a/drivers/cxl/features.c +++ b/drivers/cxl/features.c @@ -47,14 +47,10 @@ static int cxl_get_supported_features(struct cxl_features_state *cfs) struct cxl_mbox_get_sup_feats_in mbox_in; struct cxl_feat_entry *entry; struct cxl_mbox_cmd mbox_cmd; - struct cxl_mem_command *cmd; int count; /* Get supported features is optional, need to check */ - cmd = cxl_find_feature_command(CXL_MBOX_OP_GET_SUPPORTED_FEATURES); - if (!cmd) - return -EOPNOTSUPP; - if (!test_bit(cmd->info.id, cxl_mbox->feature_cmds)) + if (!cxl_feature_enabled(cfs, CXL_MBOX_OP_GET_SUPPORTED_FEATURES)) return -EOPNOTSUPP; count = cxl_get_supported_features_count(cxl_mbox); diff --git a/include/cxl/features.h b/include/cxl/features.h index 429b9782667c..582fb85da5b9 100644 --- a/include/cxl/features.h +++ b/include/cxl/features.h @@ -53,10 +53,37 @@ struct cxl_features_state { struct cxl_feat_entry *entries; }; +/* + * Get Feature CXL 3.1 Spec 8.2.9.6.2 + */ + +/* + * Get Feature input payload + * CXL rev 3.1 section 8.2.9.6.2 Table 8-99 + */ +enum cxl_get_feat_selection { + CXL_GET_FEAT_SEL_CURRENT_VALUE, + CXL_GET_FEAT_SEL_DEFAULT_VALUE, + CXL_GET_FEAT_SEL_SAVED_VALUE, + CXL_GET_FEAT_SEL_MAX +}; + +struct cxl_mbox_get_feat_in { + uuid_t uuid; + __le16 offset; + __le16 count; + u8 selection; +} __packed; + +bool cxl_feature_enabled(struct cxl_features_state *cfs, u16 opcode); struct cxl_features *cxl_features_alloc(struct cxl_mailbox *cxl_mbox, struct device *parent); struct cxl_feat_entry * cxl_get_supported_feature_entry(struct cxl_features *features, const uuid_t *feat_uuid); +size_t cxl_get_feature(struct cxl_features *features, const uuid_t feat_uuid, + enum cxl_get_feat_selection selection, + void *feat_out, size_t feat_out_size, u16 offset, + u16 *return_code); #endif From patchwork Mon Jan 6 12:10:10 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Shiju Jose X-Patchwork-Id: 855290 Received: from frasgout.his.huawei.com (frasgout.his.huawei.com [185.176.79.56]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0D55B1DF276; Mon, 6 Jan 2025 12:12:04 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=185.176.79.56 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1736165525; cv=none; b=Pg46VjN8qdaafdyT83G8RKpGI8JaodITAxV6JScgCdoJHIa/p2DmgpKkilbPksX48MWW6snVOByD5wAhh2TCLjueTk3zPg2wnXAejXyXelq2zk7WgDLAsYPlZ6p71HNp0fmRmxV4/eFBscnb2wxJbeXEwCZzhLObMmOg+OyrqTc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1736165525; c=relaxed/simple; bh=2jsJ/3Uh0qcWryNe8UCGcy+iRFjQus0nqIlSzXULFBc=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=nl3QiLmYCtat+/oRoajLQtQvlfxIIK5JPJgP0zKKWCrrJq2z08FpbLSruQ96fl/9NddtK9OJBp6r5biNF8vQU02M8jrXDdciHfORF2yoMabiJOJn224tXtEnfi+80KtbbaKVrOEEmZ//l8MHM9YT+x7skWcT5zuZJbm9DA3GThE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=huawei.com; spf=pass smtp.mailfrom=huawei.com; arc=none smtp.client-ip=185.176.79.56 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=huawei.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huawei.com Received: from mail.maildlp.com (unknown [172.18.186.216]) by frasgout.his.huawei.com (SkyGuard) with ESMTP id 4YRY0k0gFwz6LD8T; Mon, 6 Jan 2025 20:10:30 +0800 (CST) Received: from frapeml500007.china.huawei.com (unknown [7.182.85.172]) by mail.maildlp.com (Postfix) with ESMTPS id D897C1408F9; Mon, 6 Jan 2025 20:12:01 +0800 (CST) Received: from P_UKIT01-A7bmah.china.huawei.com (10.126.170.95) by frapeml500007.china.huawei.com (7.182.85.172) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.1.2507.39; Mon, 6 Jan 2025 13:11:59 +0100 From: To: , , , , CC: , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , Subject: [PATCH v18 14/19] cxl: Setup exclusive CXL features that are reserved for the kernel Date: Mon, 6 Jan 2025 12:10:10 +0000 Message-ID: <20250106121017.1620-15-shiju.jose@huawei.com> X-Mailer: git-send-email 2.43.0.windows.1 In-Reply-To: <20250106121017.1620-1-shiju.jose@huawei.com> References: <20250106121017.1620-1-shiju.jose@huawei.com> Precedence: bulk X-Mailing-List: linux-acpi@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-ClientProxiedBy: lhrpeml100011.china.huawei.com (7.191.174.247) To frapeml500007.china.huawei.com (7.182.85.172) From: Dave Jiang Certain features will be exclusively used by components such as in kernel RAS driver. Setup an exclusion list that can be later filtered out before exposing to user space. Signed-off-by: Dave Jiang --- drivers/cxl/core/features.c | 22 ++++++++++++++++++++++ drivers/cxl/features.c | 6 +++++- include/cxl/features.h | 34 ++++++++++++++++++++++++++++++++++ 3 files changed, 61 insertions(+), 1 deletion(-) diff --git a/drivers/cxl/core/features.c b/drivers/cxl/core/features.c index 932e82b52f90..c836b3a64561 100644 --- a/drivers/cxl/core/features.c +++ b/drivers/cxl/core/features.c @@ -6,6 +6,17 @@ #include "core.h" #include "cxlmem.h" +static const uuid_t cxl_exclusive_feats[] = { + CXL_FEAT_PATROL_SCRUB_UUID, + CXL_FEAT_ECS_UUID, + CXL_FEAT_SPPR_UUID, + CXL_FEAT_HPPR_UUID, + CXL_FEAT_CACHELINE_SPARING_UUID, + CXL_FEAT_ROW_SPARING_UUID, + CXL_FEAT_BANK_SPARING_UUID, + CXL_FEAT_RANK_SPARING_UUID, +}; + #define CXL_FEATURE_MAX_DEVS 65536 static DEFINE_IDA(cxl_features_ida); @@ -263,3 +274,14 @@ int cxl_set_feature(struct cxl_features *features, } while (true); } EXPORT_SYMBOL_NS_GPL(cxl_set_feature, "CXL"); + +bool is_cxl_feature_exclusive(struct cxl_feat_entry *entry) +{ + for (int i = 0; i < ARRAY_SIZE(cxl_exclusive_feats); i++) { + if (uuid_equal(&entry->uuid, &cxl_exclusive_feats[i])) + return true; + } + + return false; +} +EXPORT_SYMBOL_NS_GPL(is_cxl_feature_exclusive, "CXL"); diff --git a/drivers/cxl/features.c b/drivers/cxl/features.c index 2faa9e03a840..3388bce5943a 100644 --- a/drivers/cxl/features.c +++ b/drivers/cxl/features.c @@ -47,6 +47,7 @@ static int cxl_get_supported_features(struct cxl_features_state *cfs) struct cxl_mbox_get_sup_feats_in mbox_in; struct cxl_feat_entry *entry; struct cxl_mbox_cmd mbox_cmd; + int user_feats = 0; int count; /* Get supported features is optional, need to check */ @@ -127,6 +128,8 @@ static int cxl_get_supported_features(struct cxl_features_state *cfs) return -ENXIO; memcpy(entry, mbox_out->ents, retrieved); + if (!is_cxl_feature_exclusive(entry)) + user_feats++; entry++; /* * If the number of output entries is less than expected, add the @@ -138,6 +141,7 @@ static int cxl_get_supported_features(struct cxl_features_state *cfs) cfs->num_features = count; cfs->entries = no_free_ptr(entries); + cfs->num_user_features = user_feats; return devm_add_action_or_reset(&cfs->features->dev, cxl_free_feature_entries, cfs->entries); } @@ -177,7 +181,7 @@ static ssize_t features_show(struct device *dev, struct device_attribute *attr, if (!cfs) return -ENOENT; - return sysfs_emit(buf, "%d\n", cfs->num_features); + return sysfs_emit(buf, "%d\n", cfs->num_user_features); } static DEVICE_ATTR_RO(features); diff --git a/include/cxl/features.h b/include/cxl/features.h index 41d3602b186c..adff3496b4be 100644 --- a/include/cxl/features.h +++ b/include/cxl/features.h @@ -5,6 +5,38 @@ #include +#define CXL_FEAT_PATROL_SCRUB_UUID \ + UUID_INIT(0x96dad7d6, 0xfde8, 0x482b, 0xa7, 0x33, 0x75, 0x77, 0x4e, \ + 0x06, 0xdb, 0x8a) + +#define CXL_FEAT_ECS_UUID \ + UUID_INIT(0xe5b13f22, 0x2328, 0x4a14, 0xb8, 0xba, 0xb9, 0x69, 0x1e, \ + 0x89, 0x33, 0x86) + +#define CXL_FEAT_SPPR_UUID \ + UUID_INIT(0x892ba475, 0xfad8, 0x474e, 0x9d, 0x3e, 0x69, 0x2c, 0x91, \ + 0x75, 0x68, 0xbb) + +#define CXL_FEAT_HPPR_UUID \ + UUID_INIT(0x80ea4521, 0x786f, 0x4127, 0xaf, 0xb1, 0xec, 0x74, 0x59, \ + 0xfb, 0x0e, 0x24) + +#define CXL_FEAT_CACHELINE_SPARING_UUID \ + UUID_INIT(0x96C33386, 0x91dd, 0x44c7, 0x9e, 0xcb, 0xfd, 0xaf, 0x65, \ + 0x03, 0xba, 0xc4) + +#define CXL_FEAT_ROW_SPARING_UUID \ + UUID_INIT(0x450ebf67, 0xb135, 0x4f97, 0xa4, 0x98, 0xc2, 0xd5, 0x7f, \ + 0x27, 0x9b, 0xed) + +#define CXL_FEAT_BANK_SPARING_UUID \ + UUID_INIT(0x78b79636, 0x90ac, 0x4b64, 0xa4, 0xef, 0xfa, 0xac, 0x5d, \ + 0x18, 0xa8, 0x63) + +#define CXL_FEAT_RANK_SPARING_UUID \ + UUID_INIT(0x34dbaff5, 0x0552, 0x4281, 0x8f, 0x76, 0xda, 0x0b, 0x5e, \ + 0x7a, 0x76, 0xa7) + struct cxl_mailbox; enum feature_cmds { @@ -50,6 +82,7 @@ struct cxl_mbox_get_sup_feats_out { struct cxl_features_state { struct cxl_features *features; int num_features; + int num_user_features; struct cxl_feat_entry *entries; }; @@ -117,5 +150,6 @@ size_t cxl_get_feature(struct cxl_features *features, const uuid_t feat_uuid, int cxl_set_feature(struct cxl_features *features, const uuid_t feat_uuid, u8 feat_version, void *feat_data, size_t feat_data_size, u32 feat_flag, u16 offset, u16 *return_code); +bool is_cxl_feature_exclusive(struct cxl_feat_entry *entry); #endif From patchwork Mon Jan 6 12:10:12 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Shiju Jose X-Patchwork-Id: 855289 Received: from frasgout.his.huawei.com (frasgout.his.huawei.com [185.176.79.56]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id AB9291DF270; Mon, 6 Jan 2025 12:12:09 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=185.176.79.56 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1736165531; cv=none; b=l3yb2g3T+I5alx7wEYe7sI0wHaLpHcsUh7Aikuxazq4K4TZdii1bPCs6etkesj04SFcjIdRMzCggd8Fh5uVxDq/BdOtyE1kDSsY5Yz/6BbgzmcEjXv45zTL4bR1iIlZRtgv9CHLVPmh7n6IcZRqB3yAFMG832q2MVmMGjzPKr90= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1736165531; c=relaxed/simple; bh=6UF2qRPO9vELxW8Q/52ILV7vHdamalOKbIMBR0ZDTww=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=IzHNh9+dFdYT/Kyx37E2LTyaLi2eIQswmxXxmswWaPhImLTPm4eQm9DwYejVdNmzMp6VYQiQevrGbkcdlB4SEDIX1ljAuKG/yZxuXI4VjKfjVUcacTAmjpazDszgZBzBIAOR7Bx9naLUNTg3qtr0H0KfcISipc067mg0FC7DG24= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=huawei.com; spf=pass smtp.mailfrom=huawei.com; arc=none smtp.client-ip=185.176.79.56 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=huawei.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huawei.com Received: from mail.maildlp.com (unknown [172.18.186.31]) by frasgout.his.huawei.com (SkyGuard) with ESMTP id 4YRY0q6wjdz6LD7g; Mon, 6 Jan 2025 20:10:35 +0800 (CST) Received: from frapeml500007.china.huawei.com (unknown [7.182.85.172]) by mail.maildlp.com (Postfix) with ESMTPS id C187114022E; Mon, 6 Jan 2025 20:12:07 +0800 (CST) Received: from P_UKIT01-A7bmah.china.huawei.com (10.126.170.95) by frapeml500007.china.huawei.com (7.182.85.172) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.1.2507.39; Mon, 6 Jan 2025 13:12:05 +0100 From: To: , , , , CC: , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , Subject: [PATCH v18 16/19] cxl/memfeature: Add CXL memory device ECS control feature Date: Mon, 6 Jan 2025 12:10:12 +0000 Message-ID: <20250106121017.1620-17-shiju.jose@huawei.com> X-Mailer: git-send-email 2.43.0.windows.1 In-Reply-To: <20250106121017.1620-1-shiju.jose@huawei.com> References: <20250106121017.1620-1-shiju.jose@huawei.com> Precedence: bulk X-Mailing-List: linux-acpi@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-ClientProxiedBy: lhrpeml100011.china.huawei.com (7.191.174.247) To frapeml500007.china.huawei.com (7.182.85.172) From: Shiju Jose CXL spec 3.1 section 8.2.9.9.11.2 describes the DDR5 ECS (Error Check Scrub) control feature. The Error Check Scrub (ECS) is a feature defined in JEDEC DDR5 SDRAM Specification (JESD79-5) and allows the DRAM to internally read, correct single-bit errors, and write back corrected data bits to the DRAM array while providing transparency to error counts. The ECS control allows the requester to change the log entry type, the ECS threshold count (provided the request falls within the limits specified in DDR5 mode registers), switch between codeword mode and row count mode, and reset the ECS counter. Register with EDAC device driver, which retrieves the ECS attribute descriptors from the EDAC ECS and exposes the ECS control attributes to userspace via sysfs. For example, the ECS control for the memory media FRU0 in CXL mem0 device is located at /sys/bus/edac/devices/cxl_mem0/ecs_fru0/ Signed-off-by: Shiju Jose --- drivers/cxl/core/memfeature.c | 372 +++++++++++++++++++++++++++++++++- 1 file changed, 369 insertions(+), 3 deletions(-) diff --git a/drivers/cxl/core/memfeature.c b/drivers/cxl/core/memfeature.c index 77d1bf6ce45f..eed9d57aa691 100644 --- a/drivers/cxl/core/memfeature.c +++ b/drivers/cxl/core/memfeature.c @@ -18,7 +18,7 @@ #include #include -#define CXL_DEV_NUM_RAS_FEATURES 1 +#define CXL_DEV_NUM_RAS_FEATURES 2 #define CXL_DEV_HOUR_IN_SECS 3600 #define CXL_DEV_NAME_LEN 128 @@ -300,6 +300,311 @@ static const struct edac_scrub_ops cxl_ps_scrub_ops = { .set_cycle_duration = cxl_patrol_scrub_write_scrub_cycle, }; +/* CXL DDR5 ECS control definitions */ +struct cxl_ecs_context { + u16 num_media_frus; + u16 get_feat_size; + u16 set_feat_size; + u8 get_version; + u8 set_version; + u16 effects; + struct cxl_memdev *cxlmd; +}; + +enum { + CXL_ECS_PARAM_LOG_ENTRY_TYPE, + CXL_ECS_PARAM_THRESHOLD, + CXL_ECS_PARAM_MODE, + CXL_ECS_PARAM_RESET_COUNTER, +}; + +#define CXL_ECS_LOG_ENTRY_TYPE_MASK GENMASK(1, 0) +#define CXL_ECS_REALTIME_REPORT_CAP_MASK BIT(0) +#define CXL_ECS_THRESHOLD_COUNT_MASK GENMASK(2, 0) +#define CXL_ECS_COUNT_MODE_MASK BIT(3) +#define CXL_ECS_RESET_COUNTER_MASK BIT(4) +#define CXL_ECS_RESET_COUNTER 1 + +enum { + ECS_THRESHOLD_256 = 256, + ECS_THRESHOLD_1024 = 1024, + ECS_THRESHOLD_4096 = 4096, +}; + +enum { + ECS_THRESHOLD_IDX_256 = 3, + ECS_THRESHOLD_IDX_1024 = 4, + ECS_THRESHOLD_IDX_4096 = 5, +}; + +static const u16 ecs_supp_threshold[] = { + [ECS_THRESHOLD_IDX_256] = 256, + [ECS_THRESHOLD_IDX_1024] = 1024, + [ECS_THRESHOLD_IDX_4096] = 4096, +}; + +enum { + ECS_LOG_ENTRY_TYPE_DRAM = 0x0, + ECS_LOG_ENTRY_TYPE_MEM_MEDIA_FRU = 0x1, +}; + +enum cxl_ecs_count_mode { + ECS_MODE_COUNTS_ROWS = 0, + ECS_MODE_COUNTS_CODEWORDS = 1, +}; + +/** + * struct cxl_ecs_params - CXL memory DDR5 ECS parameter data structure. + * @log_entry_type: ECS log entry type, per DRAM or per memory media FRU. + * @threshold: ECS threshold count per GB of memory cells. + * @count_mode: codeword/row count mode + * 0 : ECS counts rows with errors + * 1 : ECS counts codeword with errors + * @reset_counter: [IN] reset ECC counter to default value. + */ +struct cxl_ecs_params { + u8 log_entry_type; + u16 threshold; + enum cxl_ecs_count_mode count_mode; + u8 reset_counter; +}; + +struct cxl_ecs_fru_rd_attrs { + u8 ecs_cap; + __le16 ecs_config; + u8 ecs_flags; +} __packed; + +struct cxl_ecs_rd_attrs { + u8 ecs_log_cap; + struct cxl_ecs_fru_rd_attrs fru_attrs[]; +} __packed; + +struct cxl_ecs_fru_wr_attrs { + __le16 ecs_config; +} __packed; + +struct cxl_ecs_wr_attrs { + u8 ecs_log_cap; + struct cxl_ecs_fru_wr_attrs fru_attrs[]; +} __packed; + +/* CXL DDR5 ECS control functions */ +static int cxl_mem_ecs_get_attrs(struct device *dev, + struct cxl_ecs_context *cxl_ecs_ctx, + int fru_id, struct cxl_ecs_params *params) +{ + struct cxl_memdev *cxlmd = cxl_ecs_ctx->cxlmd; + struct cxl_mailbox *cxl_mbox = &cxlmd->cxlds->cxl_mbox; + struct cxl_ecs_fru_rd_attrs *fru_rd_attrs; + size_t rd_data_size; + u8 threshold_index; + size_t data_size; + u16 return_code; + u16 ecs_config; + + rd_data_size = cxl_ecs_ctx->get_feat_size; + + struct cxl_ecs_rd_attrs *rd_attrs __free(kfree) = + kmalloc(rd_data_size, GFP_KERNEL); + if (!rd_attrs) + return -ENOMEM; + + params->log_entry_type = 0; + params->threshold = 0; + params->count_mode = 0; + data_size = cxl_get_feature(cxl_mbox->features, CXL_FEAT_ECS_UUID, + CXL_GET_FEAT_SEL_CURRENT_VALUE, + rd_attrs, rd_data_size, 0, &return_code); + if (!data_size || return_code != CXL_MBOX_CMD_RC_SUCCESS) + return -EIO; + + fru_rd_attrs = rd_attrs->fru_attrs; + params->log_entry_type = FIELD_GET(CXL_ECS_LOG_ENTRY_TYPE_MASK, + rd_attrs->ecs_log_cap); + ecs_config = le16_to_cpu(fru_rd_attrs[fru_id].ecs_config); + threshold_index = FIELD_GET(CXL_ECS_THRESHOLD_COUNT_MASK, + ecs_config); + params->threshold = ecs_supp_threshold[threshold_index]; + params->count_mode = FIELD_GET(CXL_ECS_COUNT_MODE_MASK, + ecs_config); + return 0; +} + +static int cxl_mem_ecs_set_attrs(struct device *dev, + struct cxl_ecs_context *cxl_ecs_ctx, + int fru_id, struct cxl_ecs_params *params, + u8 param_type) +{ + struct cxl_memdev *cxlmd = cxl_ecs_ctx->cxlmd; + struct cxl_mailbox *cxl_mbox = &cxlmd->cxlds->cxl_mbox; + struct cxl_ecs_fru_rd_attrs *fru_rd_attrs; + struct cxl_ecs_fru_wr_attrs *fru_wr_attrs; + size_t rd_data_size, wr_data_size; + u16 num_media_frus, count; + size_t data_size; + u16 return_code; + u16 ecs_config; + int ret; + + num_media_frus = cxl_ecs_ctx->num_media_frus; + rd_data_size = cxl_ecs_ctx->get_feat_size; + wr_data_size = cxl_ecs_ctx->set_feat_size; + struct cxl_ecs_rd_attrs *rd_attrs __free(kfree) = + kmalloc(rd_data_size, GFP_KERNEL); + if (!rd_attrs) + return -ENOMEM; + + data_size = cxl_get_feature(cxl_mbox->features, CXL_FEAT_ECS_UUID, + CXL_GET_FEAT_SEL_CURRENT_VALUE, + rd_attrs, rd_data_size, 0, &return_code); + if (!data_size || return_code != CXL_MBOX_CMD_RC_SUCCESS) + return -EIO; + + struct cxl_ecs_wr_attrs *wr_attrs __free(kfree) = + kmalloc(wr_data_size, GFP_KERNEL); + if (!wr_attrs) + return -ENOMEM; + + /* + * Fill writable attributes from the current attributes read + * for all the media FRUs. + */ + fru_rd_attrs = rd_attrs->fru_attrs; + fru_wr_attrs = wr_attrs->fru_attrs; + wr_attrs->ecs_log_cap = rd_attrs->ecs_log_cap; + for (count = 0; count < num_media_frus; count++) + fru_wr_attrs[count].ecs_config = fru_rd_attrs[count].ecs_config; + + /* Fill attribute to be set for the media FRU */ + ecs_config = le16_to_cpu(fru_rd_attrs[fru_id].ecs_config); + switch (param_type) { + case CXL_ECS_PARAM_LOG_ENTRY_TYPE: + if (params->log_entry_type != ECS_LOG_ENTRY_TYPE_DRAM && + params->log_entry_type != ECS_LOG_ENTRY_TYPE_MEM_MEDIA_FRU) { + dev_err(dev, + "Invalid CXL ECS scrub log entry type(%d) to set\n", + params->log_entry_type); + dev_err(dev, + "Log Entry Type 0: per DRAM 1: per Memory Media FRU\n"); + return -EINVAL; + } + wr_attrs->ecs_log_cap = FIELD_PREP(CXL_ECS_LOG_ENTRY_TYPE_MASK, + params->log_entry_type); + break; + case CXL_ECS_PARAM_THRESHOLD: + ecs_config &= ~CXL_ECS_THRESHOLD_COUNT_MASK; + switch (params->threshold) { + case ECS_THRESHOLD_256: + ecs_config |= FIELD_PREP(CXL_ECS_THRESHOLD_COUNT_MASK, + ECS_THRESHOLD_IDX_256); + break; + case ECS_THRESHOLD_1024: + ecs_config |= FIELD_PREP(CXL_ECS_THRESHOLD_COUNT_MASK, + ECS_THRESHOLD_IDX_1024); + break; + case ECS_THRESHOLD_4096: + ecs_config |= FIELD_PREP(CXL_ECS_THRESHOLD_COUNT_MASK, + ECS_THRESHOLD_IDX_4096); + break; + default: + dev_err(dev, + "Invalid CXL ECS scrub threshold count(%d) to set\n", + params->threshold); + dev_err(dev, + "Supported scrub threshold counts: %u, %u, %u\n", + ECS_THRESHOLD_256, ECS_THRESHOLD_1024, ECS_THRESHOLD_4096); + return -EINVAL; + } + break; + case CXL_ECS_PARAM_MODE: + if (params->count_mode != ECS_MODE_COUNTS_ROWS && + params->count_mode != ECS_MODE_COUNTS_CODEWORDS) { + dev_err(dev, + "Invalid CXL ECS scrub mode(%d) to set\n", + params->count_mode); + dev_err(dev, + "Supported ECS Modes: 0: ECS counts rows with errors," + " 1: ECS counts codewords with errors\n"); + return -EINVAL; + } + ecs_config &= ~CXL_ECS_COUNT_MODE_MASK; + ecs_config |= FIELD_PREP(CXL_ECS_COUNT_MODE_MASK, params->count_mode); + break; + case CXL_ECS_PARAM_RESET_COUNTER: + if (params->reset_counter != CXL_ECS_RESET_COUNTER) + return -EINVAL; + + ecs_config &= ~CXL_ECS_RESET_COUNTER_MASK; + ecs_config |= FIELD_PREP(CXL_ECS_RESET_COUNTER_MASK, params->reset_counter); + break; + default: + dev_err(dev, "Invalid CXL ECS parameter to set\n"); + return -EINVAL; + } + fru_wr_attrs[fru_id].ecs_config = cpu_to_le16(ecs_config); + + ret = cxl_set_feature(cxl_mbox->features, CXL_FEAT_ECS_UUID, + cxl_ecs_ctx->set_version, + wr_attrs, wr_data_size, + CXL_SET_FEAT_FLAG_DATA_SAVED_ACROSS_RESET, + 0, &return_code); + if (ret || return_code != CXL_MBOX_CMD_RC_SUCCESS) { + dev_err(dev, "CXL ECS set feature failed ret=%d return_code=%u\n", + ret, return_code); + return ret; + } + + return 0; +} + +#define CXL_ECS_GET_ATTR(attrib) \ +static int cxl_ecs_get_##attrib(struct device *dev, void *drv_data, \ + int fru_id, u32 *val) \ +{ \ + struct cxl_ecs_context *ctx = drv_data; \ + struct cxl_ecs_params params; \ + int ret; \ + \ + ret = cxl_mem_ecs_get_attrs(dev, ctx, fru_id, ¶ms); \ + if (ret) \ + return ret; \ + \ + *val = params.attrib; \ + \ + return 0; \ +} + +CXL_ECS_GET_ATTR(log_entry_type) +CXL_ECS_GET_ATTR(count_mode) +CXL_ECS_GET_ATTR(threshold) + +#define CXL_ECS_SET_ATTR(attrib, param_type) \ +static int cxl_ecs_set_##attrib(struct device *dev, void *drv_data, \ + int fru_id, u32 val) \ +{ \ + struct cxl_ecs_context *ctx = drv_data; \ + struct cxl_ecs_params params = { \ + .attrib = val, \ + }; \ + \ + return cxl_mem_ecs_set_attrs(dev, ctx, fru_id, ¶ms, (param_type)); \ +} +CXL_ECS_SET_ATTR(log_entry_type, CXL_ECS_PARAM_LOG_ENTRY_TYPE) +CXL_ECS_SET_ATTR(count_mode, CXL_ECS_PARAM_MODE) +CXL_ECS_SET_ATTR(reset_counter, CXL_ECS_PARAM_RESET_COUNTER) +CXL_ECS_SET_ATTR(threshold, CXL_ECS_PARAM_THRESHOLD) + +static const struct edac_ecs_ops cxl_ecs_ops = { + .get_log_entry_type = cxl_ecs_get_log_entry_type, + .set_log_entry_type = cxl_ecs_set_log_entry_type, + .get_mode = cxl_ecs_get_count_mode, + .set_mode = cxl_ecs_set_count_mode, + .reset = cxl_ecs_set_reset_counter, + .get_threshold = cxl_ecs_get_threshold, + .set_threshold = cxl_ecs_set_threshold, +}; + static int cxl_memdev_scrub_init(struct cxl_memdev *cxlmd, struct cxl_region *cxlr, struct edac_dev_feature *ras_feature, u8 scrub_inst) { @@ -363,6 +668,52 @@ static int cxl_memdev_scrub_init(struct cxl_memdev *cxlmd, struct cxl_region *cx return 0; } +static int cxl_memdev_ecs_init(struct cxl_memdev *cxlmd, + struct edac_dev_feature *ras_feature) +{ + struct cxl_dev_state *cxlds = cxlmd->cxlds; + struct cxl_mailbox *cxl_mbox = &cxlds->cxl_mbox; + struct cxl_ecs_context *cxl_ecs_ctx; + struct cxl_feat_entry *feat_entry; + int num_media_frus; + + feat_entry = cxl_get_supported_feature_entry(cxl_mbox->features, + &CXL_FEAT_ECS_UUID); + if (IS_ERR(feat_entry)) + return -EOPNOTSUPP; + + if (!(le32_to_cpu(feat_entry->flags) & CXL_FEAT_ENTRY_FLAG_CHANGABLE)) + return -EOPNOTSUPP; + + num_media_frus = (le16_to_cpu(feat_entry->get_feat_size) - + sizeof(struct cxl_ecs_rd_attrs)) / + sizeof(struct cxl_ecs_fru_rd_attrs); + if (!num_media_frus) + return -EOPNOTSUPP; + + cxl_ecs_ctx = devm_kzalloc(&cxlmd->dev, sizeof(*cxl_ecs_ctx), + GFP_KERNEL); + if (!cxl_ecs_ctx) + return -ENOMEM; + + *cxl_ecs_ctx = (struct cxl_ecs_context) { + .get_feat_size = le16_to_cpu(feat_entry->get_feat_size), + .set_feat_size = le16_to_cpu(feat_entry->set_feat_size), + .get_version = feat_entry->get_feat_ver, + .set_version = feat_entry->set_feat_ver, + .effects = le16_to_cpu(feat_entry->effects), + .num_media_frus = num_media_frus, + .cxlmd = cxlmd, + }; + + ras_feature->ft_type = RAS_FEAT_ECS; + ras_feature->ecs_ops = &cxl_ecs_ops; + ras_feature->ctx = cxl_ecs_ctx; + ras_feature->ecs_info.num_media_frus = num_media_frus; + + return 0; +} + int cxl_mem_ras_features_init(struct cxl_memdev *cxlmd, struct cxl_region *cxlr) { struct edac_dev_feature ras_features[CXL_DEV_NUM_RAS_FEATURES]; @@ -373,19 +724,34 @@ int cxl_mem_ras_features_init(struct cxl_memdev *cxlmd, struct cxl_region *cxlr) rc = cxl_memdev_scrub_init(cxlmd, cxlr, &ras_features[num_ras_features], scrub_inst); + if (rc == -EOPNOTSUPP) + goto feat_scrub_done; if (rc < 0) return rc; scrub_inst++; num_ras_features++; - if (cxlr) +feat_scrub_done: + if (cxlr) { snprintf(cxl_dev_name, sizeof(cxl_dev_name), "cxl_region%d", cxlr->id); - else + goto feat_register; + } else { snprintf(cxl_dev_name, sizeof(cxl_dev_name), "%s_%s", "cxl", dev_name(&cxlmd->dev)); + } + + rc = cxl_memdev_ecs_init(cxlmd, &ras_features[num_ras_features]); + if (rc == -EOPNOTSUPP) + goto feat_ecs_done; + if (rc < 0) + return rc; + + num_ras_features++; +feat_ecs_done: +feat_register: return edac_dev_register(&cxlmd->dev, cxl_dev_name, NULL, num_ras_features, ras_features); } From patchwork Mon Jan 6 12:10:14 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Shiju Jose X-Patchwork-Id: 855288 Received: from frasgout.his.huawei.com (frasgout.his.huawei.com [185.176.79.56]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 606741DF996; Mon, 6 Jan 2025 12:12:15 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=185.176.79.56 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1736165537; cv=none; b=Tk0gjZk6EVDOjLw0FFD5KntOwGWjZD9BpFjvaAQDiFJme4nnww+p4CedmjEQoLEJR66g+wUZRBoNqGn5vdF5dG5ZPUQIn/7rcb8Lf8TyxJJoheQbmQLdFrIzWHkfmjCA7jQk+vSJyo6AKriUMp4TaTZg0lmxpYfIOHjLn33CEeI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1736165537; c=relaxed/simple; bh=Zi3CiXH5QPMb3qtuS7Cr8hb7Fd+qVU/t+nbMaGQKFII=; h=From:To:CC:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=gjb8aM0ngJncnon/pGNpNOvkSbRxmuC2jfGOjuv88rxWbkOcfca+Zajp5UD020GVYBzzxjsErEXeph4ZCenPZT0o5kc6VOHHkhogiXbV0nW5g1iedXuGJ5TfGLCyPA1cp3LvxyLZkhLq5eP4VJwtpVrYw4L7hNwa2FPcsZrH3fU= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=huawei.com; spf=pass smtp.mailfrom=huawei.com; arc=none smtp.client-ip=185.176.79.56 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=huawei.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huawei.com Received: from mail.maildlp.com (unknown [172.18.186.31]) by frasgout.his.huawei.com (SkyGuard) with ESMTP id 4YRXxW69bXz6K9Zh; Mon, 6 Jan 2025 20:07:43 +0800 (CST) Received: from frapeml500007.china.huawei.com (unknown [7.182.85.172]) by mail.maildlp.com (Postfix) with ESMTPS id 7A89B14022E; Mon, 6 Jan 2025 20:12:13 +0800 (CST) Received: from P_UKIT01-A7bmah.china.huawei.com (10.126.170.95) by frapeml500007.china.huawei.com (7.182.85.172) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.1.2507.39; Mon, 6 Jan 2025 13:12:11 +0100 From: To: , , , , CC: , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , Subject: [PATCH v18 18/19] cxl/memfeature: Add CXL memory device soft PPR control feature Date: Mon, 6 Jan 2025 12:10:14 +0000 Message-ID: <20250106121017.1620-19-shiju.jose@huawei.com> X-Mailer: git-send-email 2.43.0.windows.1 In-Reply-To: <20250106121017.1620-1-shiju.jose@huawei.com> References: <20250106121017.1620-1-shiju.jose@huawei.com> Precedence: bulk X-Mailing-List: linux-acpi@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-ClientProxiedBy: lhrpeml100011.china.huawei.com (7.191.174.247) To frapeml500007.china.huawei.com (7.182.85.172) From: Shiju Jose Post Package Repair (PPR) maintenance operations may be supported by CXL devices that implement CXL.mem protocol. A PPR maintenance operation requests the CXL device to perform a repair operation on its media. For example, a CXL device with DRAM components that support PPR features may implement PPR Maintenance operations. DRAM components may support two types of PPR, hard PPR (hPPR), for a permanent row repair, and Soft PPR (sPPR), for a temporary row repair. Soft PPR is much faster than hPPR, but the repair is lost with a power cycle. During the execution of a PPR Maintenance operation, a CXL memory device: - May or may not retain data - May or may not be able to process CXL.mem requests correctly, including the ones that target the DPA involved in the repair. These CXL Memory Device capabilities are specified by Restriction Flags in the sPPR Feature and hPPR Feature. Soft PPR maintenance operation may be executed at runtime, if data is retained and CXL.mem requests are correctly processed. For CXL devices with DRAM components, hPPR maintenance operation may be executed only at boot because data would not be retained. When a CXL device identifies error on a memory component, the device may inform the host about the need for a PPR maintenance operation by using an Event Record, where the Maintenance Needed flag is set. The Event Record specifies the DPA that should be repaired. A CXL device may not keep track of the requests that have already been sent and the information on which DPA should be repaired may be lost upon power cycle. The userspace tool requests for maintenance operation if the number of corrected error reported on a CXL.mem media exceeds error threshold. CXL spec 3.1 section 8.2.9.7.1.2 describes the device's sPPR (soft PPR) maintenance operation and section 8.2.9.7.1.3 describes the device's hPPR (hard PPR) maintenance operation feature. CXL spec 3.1 section 8.2.9.7.2.1 describes the sPPR feature discovery and configuration. CXL spec 3.1 section 8.2.9.7.2.2 describes the hPPR feature discovery and configuration. Add support for controlling CXL memory device sPPR feature. Register with EDAC driver, which gets the memory repair attr descriptors from the EDAC memory repair driver and exposes sysfs repair control attributes for PRR to the userspace. For example CXL PPR control for the CXL mem0 device is exposed in /sys/bus/edac/devices/cxl_mem0/mem_repairX/ Tested with QEMU patch for CXL PPR feature. https://lore.kernel.org/all/20240730045722.71482-1-dave@stgolabs.net/ Reviewed-by: Dave Jiang Signed-off-by: Shiju Jose --- Documentation/edac/memory_repair.rst | 58 ++++ drivers/cxl/core/memfeature.c | 387 ++++++++++++++++++++++++++- 2 files changed, 444 insertions(+), 1 deletion(-) diff --git a/Documentation/edac/memory_repair.rst b/Documentation/edac/memory_repair.rst index 2787a8a2d6ba..f40a34de32a4 100644 --- a/Documentation/edac/memory_repair.rst +++ b/Documentation/edac/memory_repair.rst @@ -99,3 +99,61 @@ sysfs Sysfs files are documented in `Documentation/ABI/testing/sysfs-edac-memory-repair`. + +Example +------- + +The usage takes the form shown in this example: + +1. CXL memory device Soft Post Package Repair (Soft PPR) + +# read capabilities + +root@localhost:~# cat /sys/bus/edac/devices/cxl_mem0/mem_repair0/dpa_support + +1 + +root@localhost:~# cat /sys/bus/edac/devices/cxl_mem0/mem_repair0/nibble_mask + +0x0 + +root@localhost:~# cat /sys/bus/edac/devices/cxl_mem0/mem_repair0/persist_mode + +0 + +root@localhost:~# cat /sys/bus/edac/devices/cxl_mem0/mem_repair0/repair_function + +0 + +root@localhost:~# cat /sys/bus/edac/devices/cxl_mem0/mem_repair0/repair_safe_when_in_use + +1 + +# set and readback attributes + +root@localhost:~# echo 0x8a2d > /sys/bus/edac/devices/cxl_mem0/mem_repair0/nibble_mask + +root@localhost:~# cat /sys/bus/edac/devices/cxl_mem0/mem_repair0/min_dpa + +0x0 + +root@localhost:~# cat /sys/bus/edac/devices/cxl_mem0/mem_repair0/max_dpa + +0xfffffff + +root@localhost:~# echo 0x300000 > /sys/bus/edac/devices/cxl_mem0/mem_repair0/dpa + +root@localhost:~# cat /sys/bus/edac/devices/cxl_mem0/mem_repair0/dpa + +0x300000 + +root@localhost:~# cat /sys/bus/edac/devices/cxl_mem0/mem_repair0/nibble_mask + +0x8a2d + +# issue repair operations + +# reapir returns error if unsupported/resources are not available +# for the repair operation. + +root@localhost:~# echo 1 > /sys/bus/edac/devices/cxl_mem0/mem_repair0/repair diff --git a/drivers/cxl/core/memfeature.c b/drivers/cxl/core/memfeature.c index eed9d57aa691..5437df549bf1 100644 --- a/drivers/cxl/core/memfeature.c +++ b/drivers/cxl/core/memfeature.c @@ -14,11 +14,13 @@ #include #include #include +#include #include #include #include +#include "core.h" -#define CXL_DEV_NUM_RAS_FEATURES 2 +#define CXL_DEV_NUM_RAS_FEATURES 3 #define CXL_DEV_HOUR_IN_SECS 3600 #define CXL_DEV_NAME_LEN 128 @@ -605,6 +607,334 @@ static const struct edac_ecs_ops cxl_ecs_ops = { .set_threshold = cxl_ecs_set_threshold, }; +/* CXL memory soft PPR & hard PPR control definitions */ +struct cxl_ppr_context { + uuid_t repair_uuid; + u8 instance; + u16 get_feat_size; + u16 set_feat_size; + u8 get_version; + u8 set_version; + u16 effects; + struct cxl_memdev *cxlmd; + enum edac_mem_repair_function repair_function; + enum edac_mem_repair_persist_mode persist_mode; + u64 dpa; + u32 nibble_mask; +}; + +/** + * struct cxl_memdev_ppr_params - CXL memory PPR parameter data structure. + * @op_class: PPR operation class. + * @op_subclass: PPR operation subclass. + * @dpa_support: device physical address for PPR support. + * @media_accessible: memory media is accessible or not during PPR operation. + * @data_retained: data is retained or not during PPR operation. + * @dpa: device physical address. + */ +struct cxl_memdev_ppr_params { + u8 op_class; + u8 op_subclass; + bool dpa_support; + bool media_accessible; + bool data_retained; + u64 dpa; +}; + +enum cxl_ppr_param { + CXL_PPR_PARAM_DO_QUERY, + CXL_PPR_PARAM_DO_PPR, +}; + +/* See CXL rev 3.1 @8.2.9.7.2.1 Table 8-113 sPPR Feature Readable Attributes */ +/* See CXL rev 3.1 @8.2.9.7.2.2 Table 8-116 hPPR Feature Readable Attributes */ +#define CXL_MEMDEV_PPR_QUERY_RESOURCE_FLAG BIT(0) + +#define CXL_MEMDEV_PPR_DEVICE_INITIATED_MASK BIT(0) +#define CXL_MEMDEV_PPR_FLAG_DPA_SUPPORT_MASK BIT(0) +#define CXL_MEMDEV_PPR_FLAG_NIBBLE_SUPPORT_MASK BIT(1) +#define CXL_MEMDEV_PPR_FLAG_MEM_SPARING_EV_REC_SUPPORT_MASK BIT(2) + +#define CXL_MEMDEV_PPR_RESTRICTION_FLAG_MEDIA_ACCESSIBLE_MASK BIT(0) +#define CXL_MEMDEV_PPR_RESTRICTION_FLAG_DATA_RETAINED_MASK BIT(2) + +#define CXL_MEMDEV_PPR_SPARING_EV_REC_EN_MASK BIT(0) + +struct cxl_memdev_repair_rd_attrs_hdr { + u8 max_op_latency; + __le16 op_cap; + __le16 op_mode; + u8 op_class; + u8 op_subclass; + u8 rsvd[9]; +} __packed; + +struct cxl_memdev_ppr_rd_attrs { + struct cxl_memdev_repair_rd_attrs_hdr hdr; + u8 ppr_flags; + __le16 restriction_flags; + u8 ppr_op_mode; +} __packed; + +/* See CXL rev 3.1 @8.2.9.7.1.2 Table 8-103 sPPR Maintenance Input Payload */ +/* See CXL rev 3.1 @8.2.9.7.1.3 Table 8-104 hPPR Maintenance Input Payload */ +struct cxl_memdev_ppr_maintenance_attrs { + u8 flags; + __le64 dpa; + u8 nibble_mask[3]; +} __packed; + +static int cxl_mem_ppr_get_attrs(struct cxl_ppr_context *cxl_ppr_ctx, + struct cxl_memdev_ppr_params *params) +{ + size_t rd_data_size = sizeof(struct cxl_memdev_ppr_rd_attrs); + struct cxl_memdev *cxlmd = cxl_ppr_ctx->cxlmd; + struct cxl_mailbox *cxl_mbox = &cxlmd->cxlds->cxl_mbox; + u16 restriction_flags; + size_t data_size; + u16 return_code; + + struct cxl_memdev_ppr_rd_attrs *rd_attrs __free(kfree) = + kmalloc(rd_data_size, GFP_KERNEL); + if (!rd_attrs) + return -ENOMEM; + + data_size = cxl_get_feature(cxl_mbox->features, cxl_ppr_ctx->repair_uuid, + CXL_GET_FEAT_SEL_CURRENT_VALUE, + rd_attrs, rd_data_size, 0, &return_code); + if (!data_size || return_code != CXL_MBOX_CMD_RC_SUCCESS) + return -EIO; + + params->op_class = rd_attrs->hdr.op_class; + params->op_subclass = rd_attrs->hdr.op_subclass; + params->dpa_support = FIELD_GET(CXL_MEMDEV_PPR_FLAG_DPA_SUPPORT_MASK, + rd_attrs->ppr_flags); + restriction_flags = le16_to_cpu(rd_attrs->restriction_flags); + params->media_accessible = FIELD_GET(CXL_MEMDEV_PPR_RESTRICTION_FLAG_MEDIA_ACCESSIBLE_MASK, + restriction_flags) ^ 1; + params->data_retained = FIELD_GET(CXL_MEMDEV_PPR_RESTRICTION_FLAG_DATA_RETAINED_MASK, + restriction_flags) ^ 1; + + return 0; +} + +static int cxl_mem_do_ppr_op(struct device *dev, + struct cxl_ppr_context *cxl_ppr_ctx, + struct cxl_memdev_ppr_params *rd_params, + enum cxl_ppr_param param_type) +{ + struct cxl_memdev_ppr_maintenance_attrs maintenance_attrs; + struct cxl_memdev *cxlmd = cxl_ppr_ctx->cxlmd; + int ret; + + if (!rd_params->media_accessible || !rd_params->data_retained) { + /* Check if DPA is mapped */ + if (cxl_dpa_to_region(cxlmd, cxl_ppr_ctx->dpa)) { + dev_err(dev, "CXL can't do PPR as DPA is mapped\n"); + return -EBUSY; + } + } + memset(&maintenance_attrs, 0, sizeof(maintenance_attrs)); + if (param_type == CXL_PPR_PARAM_DO_QUERY) + maintenance_attrs.flags = CXL_MEMDEV_PPR_QUERY_RESOURCE_FLAG; + else + maintenance_attrs.flags = 0; + maintenance_attrs.dpa = cpu_to_le64(cxl_ppr_ctx->dpa); + put_unaligned_le24(cxl_ppr_ctx->nibble_mask, maintenance_attrs.nibble_mask); + ret = cxl_do_maintenance(&cxlmd->cxlds->cxl_mbox, rd_params->op_class, + rd_params->op_subclass, &maintenance_attrs, + sizeof(maintenance_attrs)); + if (ret) { + dev_err(dev, "CXL do PPR failed ret=%d\n", ret); + up_read(&cxl_region_rwsem); + cxl_ppr_ctx->nibble_mask = 0; + cxl_ppr_ctx->dpa = 0; + return ret; + } + + return 0; +} + +static int cxl_mem_ppr_set_attrs(struct device *dev, + struct cxl_ppr_context *cxl_ppr_ctx, + enum cxl_ppr_param param_type) +{ + struct cxl_memdev_ppr_params rd_params; + int ret; + + ret = cxl_mem_ppr_get_attrs(cxl_ppr_ctx, &rd_params); + if (ret) { + dev_err(dev, "Get cxlmemdev PPR params failed ret=%d\n", + ret); + return ret; + } + + switch (param_type) { + case CXL_PPR_PARAM_DO_QUERY: + case CXL_PPR_PARAM_DO_PPR: + ret = down_read_interruptible(&cxl_region_rwsem); + if (ret) + return ret; + ret = down_read_interruptible(&cxl_dpa_rwsem); + if (ret) { + up_read(&cxl_region_rwsem); + return ret; + } + ret = cxl_mem_do_ppr_op(dev, cxl_ppr_ctx, &rd_params, param_type); + up_read(&cxl_dpa_rwsem); + up_read(&cxl_region_rwsem); + return ret; + default: + return -EINVAL; + } +} + +static int cxl_ppr_get_repair_function(struct device *dev, void *drv_data, + u32 *repair_function) +{ + struct cxl_ppr_context *cxl_ppr_ctx = drv_data; + + *repair_function = cxl_ppr_ctx->repair_function; + + return 0; +} + +static int cxl_ppr_get_persist_mode(struct device *dev, void *drv_data, + u32 *persist_mode) +{ + struct cxl_ppr_context *cxl_ppr_ctx = drv_data; + + *persist_mode = cxl_ppr_ctx->persist_mode; + + return 0; +} + +static int cxl_ppr_get_dpa_support(struct device *dev, void *drv_data, + u32 *dpa_support) +{ + struct cxl_ppr_context *cxl_ppr_ctx = drv_data; + struct cxl_memdev_ppr_params params; + int ret; + + ret = cxl_mem_ppr_get_attrs(cxl_ppr_ctx, ¶ms); + if (ret) + return ret; + + *dpa_support = params.dpa_support; + + return 0; +} + +static int cxl_get_ppr_safe_when_in_use(struct device *dev, void *drv_data, + u32 *safe) +{ + struct cxl_ppr_context *cxl_ppr_ctx = drv_data; + struct cxl_memdev_ppr_params params; + int ret; + + ret = cxl_mem_ppr_get_attrs(cxl_ppr_ctx, ¶ms); + if (ret) + return ret; + + *safe = params.media_accessible & params.data_retained; + + return 0; +} + +static int cxl_ppr_get_min_dpa(struct device *dev, void *drv_data, + u64 *min_dpa) +{ + struct cxl_ppr_context *cxl_ppr_ctx = drv_data; + struct cxl_memdev *cxlmd = cxl_ppr_ctx->cxlmd; + struct cxl_dev_state *cxlds = cxlmd->cxlds; + + *min_dpa = cxlds->dpa_res.start; + + return 0; +} + +static int cxl_ppr_get_max_dpa(struct device *dev, void *drv_data, + u64 *max_dpa) +{ + struct cxl_ppr_context *cxl_ppr_ctx = drv_data; + struct cxl_memdev *cxlmd = cxl_ppr_ctx->cxlmd; + struct cxl_dev_state *cxlds = cxlmd->cxlds; + + *max_dpa = cxlds->dpa_res.end; + + return 0; +} + +static int cxl_ppr_get_dpa(struct device *dev, void *drv_data, + u64 *dpa) +{ + struct cxl_ppr_context *cxl_ppr_ctx = drv_data; + + *dpa = cxl_ppr_ctx->dpa; + + return 0; +} + +static int cxl_ppr_set_dpa(struct device *dev, void *drv_data, u64 dpa) +{ + struct cxl_ppr_context *cxl_ppr_ctx = drv_data; + struct cxl_memdev *cxlmd = cxl_ppr_ctx->cxlmd; + struct cxl_dev_state *cxlds = cxlmd->cxlds; + + if (!dpa || dpa < cxlds->dpa_res.start || dpa > cxlds->dpa_res.end) + return -EINVAL; + + cxl_ppr_ctx->dpa = dpa; + + return 0; +} + +static int cxl_ppr_get_nibble_mask(struct device *dev, void *drv_data, + u64 *nibble_mask) +{ + struct cxl_ppr_context *cxl_ppr_ctx = drv_data; + + *nibble_mask = cxl_ppr_ctx->nibble_mask; + + return 0; +} + +static int cxl_ppr_set_nibble_mask(struct device *dev, void *drv_data, u64 nibble_mask) +{ + struct cxl_ppr_context *cxl_ppr_ctx = drv_data; + + cxl_ppr_ctx->nibble_mask = nibble_mask; + + return 0; +} + +static int cxl_do_ppr(struct device *dev, void *drv_data, u32 val) +{ + struct cxl_ppr_context *cxl_ppr_ctx = drv_data; + int ret; + + if (!cxl_ppr_ctx->dpa || val != EDAC_DO_MEM_REPAIR) + return -EINVAL; + + ret = cxl_mem_ppr_set_attrs(dev, cxl_ppr_ctx, CXL_PPR_PARAM_DO_PPR); + + return ret; +} + +static const struct edac_mem_repair_ops cxl_sppr_ops = { + .get_repair_function = cxl_ppr_get_repair_function, + .get_persist_mode = cxl_ppr_get_persist_mode, + .get_dpa_support = cxl_ppr_get_dpa_support, + .get_repair_safe_when_in_use = cxl_get_ppr_safe_when_in_use, + .get_min_dpa = cxl_ppr_get_min_dpa, + .get_max_dpa = cxl_ppr_get_max_dpa, + .get_dpa = cxl_ppr_get_dpa, + .set_dpa = cxl_ppr_set_dpa, + .get_nibble_mask = cxl_ppr_get_nibble_mask, + .set_nibble_mask = cxl_ppr_set_nibble_mask, + .do_repair = cxl_do_ppr, +}; + static int cxl_memdev_scrub_init(struct cxl_memdev *cxlmd, struct cxl_region *cxlr, struct edac_dev_feature *ras_feature, u8 scrub_inst) { @@ -714,11 +1044,55 @@ static int cxl_memdev_ecs_init(struct cxl_memdev *cxlmd, return 0; } +static int cxl_memdev_soft_ppr_init(struct cxl_memdev *cxlmd, + struct edac_dev_feature *ras_feature, + u8 repair_inst) +{ + struct cxl_dev_state *cxlds = cxlmd->cxlds; + struct cxl_mailbox *cxl_mbox = &cxlds->cxl_mbox; + struct cxl_ppr_context *cxl_sppr_ctx; + struct cxl_feat_entry *feat_entry; + + feat_entry = cxl_get_supported_feature_entry(cxl_mbox->features, + &CXL_FEAT_SPPR_UUID); + if (IS_ERR(feat_entry)) + return -EOPNOTSUPP; + + if (!(le32_to_cpu(feat_entry->flags) & CXL_FEAT_ENTRY_FLAG_CHANGABLE)) + return -EOPNOTSUPP; + + cxl_sppr_ctx = devm_kzalloc(&cxlmd->dev, sizeof(*cxl_sppr_ctx), + GFP_KERNEL); + if (!cxl_sppr_ctx) + return -ENOMEM; + + *cxl_sppr_ctx = (struct cxl_ppr_context) { + .repair_uuid = CXL_FEAT_SPPR_UUID, + .get_feat_size = le16_to_cpu(feat_entry->get_feat_size), + .set_feat_size = le16_to_cpu(feat_entry->set_feat_size), + .get_version = feat_entry->get_feat_ver, + .set_version = feat_entry->set_feat_ver, + .effects = le16_to_cpu(feat_entry->effects), + .cxlmd = cxlmd, + .repair_function = EDAC_SOFT_PPR, + .persist_mode = EDAC_MEM_REPAIR_SOFT, + .instance = repair_inst, + }; + + ras_feature->ft_type = RAS_FEAT_MEM_REPAIR; + ras_feature->instance = cxl_sppr_ctx->instance; + ras_feature->mem_repair_ops = &cxl_sppr_ops; + ras_feature->ctx = cxl_sppr_ctx; + + return 0; +} + int cxl_mem_ras_features_init(struct cxl_memdev *cxlmd, struct cxl_region *cxlr) { struct edac_dev_feature ras_features[CXL_DEV_NUM_RAS_FEATURES]; char cxl_dev_name[CXL_DEV_NAME_LEN]; int num_ras_features = 0; + u8 repair_inst = 0; u8 scrub_inst = 0; int rc; @@ -751,6 +1125,17 @@ int cxl_mem_ras_features_init(struct cxl_memdev *cxlmd, struct cxl_region *cxlr) num_ras_features++; feat_ecs_done: + rc = cxl_memdev_soft_ppr_init(cxlmd, &ras_features[num_ras_features], + repair_inst); + if (rc == -EOPNOTSUPP) + goto feat_soft_ppr_done; + if (rc < 0) + return rc; + + repair_inst++; + num_ras_features++; + +feat_soft_ppr_done: feat_register: return edac_dev_register(&cxlmd->dev, cxl_dev_name, NULL, num_ras_features, ras_features);