Message ID | 20221025061437.17571-1-zhangzekun11@huawei.com |
---|---|
State | New |
Series | [RFC] ACPI: container: Add power domain control methods |
On Tue, Oct 25, 2022 at 8:17 AM Zhang Zekun <zhangzekun11@huawei.com> wrote:
>
> Platform devices which supports power control are often required to be
> power off/on together with the devices in the same power domain. However,
> there isn't a generic driver that support the power control logic of
> these devices.

Not true.

There is the ACPI power resources interface designed to represent
power domains that is well supported and used in the industry.

If it doesn't work for you, explain why.
Hi, Rafael J

This patch wants to put some generic control logic in the container, and this
logic can cover a batch of scenarios similar to ours. The ACPI power resources
interface does not conflict with this patch and can be used inside the
container for more complicated scenarios.

In our scenario, we need to control the power of some HBM memory devices, each
of which will be configured as a PNP0C80. HBM devices in one socket are in the
same power domain and need to be powered on/off together. Every HBM memory
device represents a NUMA node and has no CPU on it. The topology in one socket
can be simplified and represented as:

            +---------+
            |  node0  |
            |  CPUs   |
            |  DRAM   |
            +---------+
                 |
          +------+-------+
          |              |
     +---------+    +---------+
     |  node1  |    |  node2  |
     | no-cpu  |    | no-cpu  |
     |   HBM   |    |   HBM   |
     +---------+    +---------+

To use the ACPI power domain management interface, we need to develop a
specialized driver to maintain the relationship between socket id and NUMA
nodes, to tell userspace which socket a NUMA node belongs to. Note that
the NUMA nodes in the same socket will be powered on/off together. The socket
id of a memory device can be reported by the BIOS via DSDT or other ACPI
tables, but we can just skip this step by putting all of the devices that
belong to the same socket in a container. And we can call each child device's
"_PXM" function to expose the NUMA nodes of HBM devices to userspace.

Besides, to power off the devices we first need to offline these ACPI devices,
and then call the ACPI function "_EJ0" to finally remove them. This is also
generic logic that can be used to remove ejectable devices. What we really
need is a place to support this generic control logic, rather than the
interfaces to implement our requirements.

Best Regards,
Zekun, Zhang

On 2022/10/29 1:07, Rafael J. Wysocki wrote:
> On Tue, Oct 25, 2022 at 8:17 AM Zhang Zekun <zhangzekun11@huawei.com> wrote:
>> Platform devices which supports power control are often required to be
>> power off/on together with the devices in the same power domain. However,
>> there isn't a generic driver that support the power control logic of
>> these devices.
> Not true.
>
> There is the ACPI power resources interface designed to represent
> power domains that is well supported and used in the industry.
>
> If it doesn't work for you, explain why.
>
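The ordering described in this reply (offline every device in the domain first, then eject) can be sketched as a small userspace model. This is only an illustration under stated assumptions: `struct hbm_dev`, `hbm_offline()` and `power_off_domain()` are hypothetical names, the `busy` flag stands in for a real offline failure, and the all-or-nothing rollback is an added design choice that the RFC patch itself does not implement.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

enum dev_state { DEV_ONLINE, DEV_OFFLINE, DEV_EJECTED };

/* Hypothetical model of one HBM device in a socket (not a kernel type). */
struct hbm_dev {
	int node;		/* NUMA node the device would report via _PXM */
	int busy;		/* non-zero: memory still in use, offline fails */
	enum dev_state state;
};

/* Stand-in for offlining a device; fails with -EBUSY while it is in use. */
static int hbm_offline(struct hbm_dev *dev)
{
	if (dev->busy)
		return -EBUSY;
	dev->state = DEV_OFFLINE;
	return 0;
}

/*
 * Power off one domain: offline every device first, and eject only once
 * the whole group is offline. If any offline step fails, the devices that
 * were already offlined are put back online and nothing is ejected.
 */
int power_off_domain(struct hbm_dev *devs, size_t count)
{
	size_t i;
	int ret;

	assert(devs != NULL || count == 0);

	for (i = 0; i < count; i++) {
		ret = hbm_offline(&devs[i]);
		if (ret) {
			/* Roll back the devices offlined so far. */
			while (i-- > 0)
				devs[i].state = DEV_ONLINE;
			return ret;
		}
	}

	/* Whole domain is offline; this models evaluating _EJ0 per device. */
	for (i = 0; i < count; i++)
		devs[i].state = DEV_EJECTED;

	return 0;
}
```

With two idle devices the call returns 0 and both end up ejected; if either one is busy, the call returns -EBUSY and both stay online, which keeps the domain in a consistent state.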
Kindly ping.

On 2022/10/29 1:07, Rafael J. Wysocki wrote:
> On Tue, Oct 25, 2022 at 8:17 AM Zhang Zekun <zhangzekun11@huawei.com> wrote:
>> Platform devices which supports power control are often required to be
>> power off/on together with the devices in the same power domain. However,
>> there isn't a generic driver that support the power control logic of
>> these devices.
> Not true.
>
> There is the ACPI power resources interface designed to represent
> power domains that is well supported and used in the industry.
>
> If it doesn't work for you, explain why.
>
On Thu, Nov 10, 2022 at 1:13 PM zhangzekun (A) <zhangzekun11@huawei.com> wrote:
>
> Kindly ping.

I'm not going to apply this patch if that's what you're asking about.

Please have a look at LPI which is the ACPI way of doing what you want.

If you need to extend the support for it in the kernel, please do so.

If you need to extend the definition of LPI in the ACPI specification,
there is also a way to do that.

What you are trying to do would require extending the container device
definition in the specification anyway.

> On 2022/10/29 1:07, Rafael J. Wysocki wrote:
> > On Tue, Oct 25, 2022 at 8:17 AM Zhang Zekun <zhangzekun11@huawei.com> wrote:
> >> Platform devices which supports power control are often required to be
> >> power off/on together with the devices in the same power domain. However,
> >> there isn't a generic driver that support the power control logic of
> >> these devices.
> > Not true.
> >
> > There is the ACPI power resources interface designed to represent
> > power domains that is well supported and used in the industry.
> >
> > If it doesn't work for you, explain why.
> >
>
Hi, Rafael J

Thanks a lot for your advice! I will look into LPI and find a better way to do
what I want.

Best Regards,
Zekun, Zhang

On 2022/11/10 21:05, Rafael J. Wysocki wrote:
> On Thu, Nov 10, 2022 at 1:13 PM zhangzekun (A) <zhangzekun11@huawei.com> wrote:
>> Kindly ping.
> I'm not going to apply this patch if that's what you're asking about.
>
> Please have a look at LPI which is the ACPI way of doing what you want.
>
> If you need to extend the support for it in the kernel, please do so.
>
> If you need to extend the definition of LPI in the ACPI specification,
> there is also a way to do that.
>
> What you are trying to do would require extending the container device
> definition in the specification anyway.
>
>> On 2022/10/29 1:07, Rafael J. Wysocki wrote:
>>> On Tue, Oct 25, 2022 at 8:17 AM Zhang Zekun <zhangzekun11@huawei.com> wrote:
>>>> Platform devices which supports power control are often required to be
>>>> power off/on together with the devices in the same power domain. However,
>>>> there isn't a generic driver that support the power control logic of
>>>> these devices.
>>> Not true.
>>>
>>> There is the ACPI power resources interface designed to represent
>>> power domains that is well supported and used in the industry.
>>>
>>> If it doesn't work for you, explain why.
>>>
diff --git a/drivers/acpi/Kconfig b/drivers/acpi/Kconfig
index 473241b5193f..ebb26d56dba0 100644
--- a/drivers/acpi/Kconfig
+++ b/drivers/acpi/Kconfig
@@ -584,6 +584,18 @@ config ACPI_PRMT
 	  substantially increase computational overhead related to the
 	  initialization of some server systems.
 
+config ACPI_POWER_DOMAIN_CTL
+	bool "acpi container power domain control support"
+	depends on ACPI_CONTAINER
+	default n
+	help
+	  Add userspace power control interfaces in container which can be used
+	  for manipulating the power of child devices in the same power domain.
+
+	  To use this feature you need to put devices in the same power domain
+	  in a container. Enable this feature if you want to control the power
+	  of these devices together.
+
 endif	# ACPI
 
 config X86_PM_TIMER
diff --git a/drivers/acpi/container.c b/drivers/acpi/container.c
index 5b7e3b9ae370..9ed2eb5a3dcc 100644
--- a/drivers/acpi/container.c
+++ b/drivers/acpi/container.c
@@ -42,6 +42,115 @@ static void acpi_container_release(struct device *dev)
 	kfree(to_container_dev(dev));
 }
 
+#ifdef CONFIG_ACPI_POWER_DOMAIN_CTL
+
+static int get_pxm(struct acpi_device *acpi_device, void *arg)
+{
+	int nid;
+	unsigned long long sta;
+	acpi_handle handle;
+	nodemask_t *mask;
+	acpi_status status;
+
+	mask = arg;
+	handle = acpi_device->handle;
+
+	status = acpi_evaluate_integer(handle, "_STA", NULL, &sta);
+	if (ACPI_SUCCESS(status) && (sta & ACPI_STA_DEVICE_ENABLED)) {
+		nid = acpi_get_node(handle);
+		if (nid >= 0)
+			node_set(nid, *mask);
+	}
+
+	return 0;
+}
+
+static ssize_t pxms_show(struct device *dev,
+			 struct device_attribute *attr,
+			 char *buf)
+{
+	nodemask_t mask;
+	acpi_status status;
+	struct acpi_device *adev;
+
+	adev = to_acpi_device(dev);
+	nodes_clear(mask);
+
+	status = acpi_dev_for_each_child(adev, get_pxm, &mask);
+
+	return sysfs_emit(buf, "%*pbl\n", nodemask_pr_args(&mask));
+}
+DEVICE_ATTR_RO(pxms);
+
+static ssize_t on_store(struct device *d, struct device_attribute *attr,
+			const char *buf, size_t count)
+{
+	acpi_status status;
+	acpi_handle handle;
+	struct acpi_device *adev;
+
+	if (!count || buf[0] != '1')
+		return -EINVAL;
+
+	adev = to_acpi_device(d);
+	handle = adev->handle;
+	status = acpi_evaluate_object(handle, "_ON", NULL, NULL);
+	if (status == AE_NOT_FOUND)
+		acpi_handle_warn(handle, "No power on support for the container\n");
+	else if (ACPI_FAILURE(status))
+		acpi_handle_warn(handle, "Power on the device failed (0x%x)\n", status);
+
+	return count;
+}
+DEVICE_ATTR_WO(on);
+
+static int eject_device(struct acpi_device *acpi_device, void *not_used)
+{
+	acpi_object_type unused;
+	acpi_status status;
+
+	status = acpi_get_type(acpi_device->handle, &unused);
+	if (ACPI_FAILURE(status) || !acpi_device->flags.ejectable)
+		return -ENODEV;
+
+	acpi_dev_get(acpi_device);
+	status = acpi_hotplug_schedule(acpi_device, ACPI_OST_EC_OSPM_EJECT);
+	if (ACPI_SUCCESS(status))
+		return status;
+
+	acpi_dev_put(acpi_device);
+	acpi_evaluate_ost(acpi_device->handle, ACPI_OST_EC_OSPM_EJECT,
+			  ACPI_OST_SC_NON_SPECIFIC_FAILURE, NULL);
+
+	return status == AE_NO_MEMORY ? -ENOMEM : -EAGAIN;
+}
+
+static ssize_t off_store(struct device *d, struct device_attribute *attr,
+			 const char *buf, size_t count)
+{
+	struct acpi_device *adev;
+	acpi_status status;
+
+	if (!count || buf[0] != '1')
+		return -EINVAL;
+
+	adev = to_acpi_device(d);
+	status = acpi_dev_for_each_child(adev, eject_device, NULL);
+	if (ACPI_SUCCESS(status))
+		return count;
+
+	return status;
+}
+DEVICE_ATTR_WO(off);
+
+static void create_sysfs(struct device *dev)
+{
+	device_create_file(dev, &dev_attr_on);
+	device_create_file(dev, &dev_attr_off);
+	device_create_file(dev, &dev_attr_pxms);
+}
+#endif
+
 static int container_device_attach(struct acpi_device *adev,
 				   const struct acpi_device_id *not_used)
 {
@@ -68,6 +177,9 @@ static int container_device_attach(struct acpi_device *adev,
 		return ret;
 	}
 	adev->driver_data = dev;
+#ifdef CONFIG_ACPI_POWER_DOMAIN_CTL
+	create_sysfs(&adev->dev);
+#endif
 	return 1;
 }
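pxms_show() in the patch formats the node mask with "%*pbl", the kernel's bitmap range-list format, so a reader of the attribute would see text such as "1-2" followed by a newline. A minimal userspace sketch of parsing that format might look like the following; `parse_node_list()` and `MAX_NODES` are hypothetical names chosen for this illustration and are not part of the patch.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_NODES 1024	/* assumed upper bound for this sketch */

/*
 * Parse a range list such as "1-2,5\n" and set present[n] = 1 for every
 * node it names. The caller is expected to zero the array beforehand.
 * Returns the number of newly marked nodes, or -1 on malformed input.
 */
int parse_node_list(const char *s, unsigned char present[MAX_NODES])
{
	int found = 0;

	assert(s != NULL && present != NULL);

	while (*s && *s != '\n') {
		char *end;
		long lo, hi;

		lo = strtol(s, &end, 10);
		if (end == s)
			return -1;	/* no digits where a number was expected */
		hi = lo;
		if (*end == '-') {
			s = end + 1;
			hi = strtol(s, &end, 10);
			if (end == s)
				return -1;
		}
		if (lo < 0 || hi < lo || hi >= MAX_NODES)
			return -1;

		for (long n = lo; n <= hi; n++) {
			if (!present[n]) {
				present[n] = 1;
				found++;
			}
		}

		s = end;
		if (*s == ',')
			s++;
		else if (*s && *s != '\n')
			return -1;	/* unexpected separator */
	}

	return found;
}
```

For the two-HBM-node topology discussed in the thread, the input "1-2\n" would mark nodes 1 and 2 and return 2. An empty mask (a bare newline) returns 0.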
Platform devices which support power control are often required to be
powered off/on together with the devices in the same power domain. However,
there isn't a generic driver that supports the power control logic of
these devices. ACPI container seems to be a good place to hold this
control logic. By adding platform devices in the same power domain to an
ACPI container, we can easily get the locality information about these
devices and can monitor the power of these devices in the same power
domain together.

This patch provides three userspace control interfaces to control the
power of devices together in the container:
- on: power up the devices in the container; onlining these devices is
  then triggered by the BIOS.
- off: offline and eject the child devices in the container which are
  ejectable.
- pxms: show the pxms of devices which are present in the container.

In our scenario, we need to control the power of HBM memory devices, which
can be power consuming and will only be used in some specialized scenarios,
such as HPC. HBM memory devices in a socket are in the same power domain,
and should be powered off/on together. We had the idea of putting this
power control logic in a specialized driver, but ACPI container seems to be
a more generic place to hold it.

Signed-off-by: Zhang Zekun <zhangzekun11@huawei.com>
---
 drivers/acpi/Kconfig     |  12 +++++
 drivers/acpi/container.c | 112 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 124 insertions(+)
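From userspace, the "on" and "off" attributes described above would be driven by writing the character '1', which is the only value on_store() and off_store() accept. A minimal sketch of such a write follows. `trigger_attr()` is a hypothetical helper name, and the attribute path is whatever sysfs directory the container device ends up in; these attributes exist only on a kernel carrying this RFC patch, which was not applied.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Write "1" to the attribute file at @path.
 * Returns 0 on success or a negative errno value on failure.
 */
int trigger_attr(const char *path)
{
	ssize_t n;
	int fd;
	int ret = 0;

	assert(path != NULL);

	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -errno;

	n = write(fd, "1", 1);
	if (n < 0)
		ret = -errno;	/* e.g. the store callback rejected the write */
	else if (n != 1)
		ret = -EIO;	/* short write, treated as an I/O error here */

	if (close(fd) < 0 && ret == 0)
		ret = -errno;

	return ret;
}
```

A caller would pass the full path of the container's "off" or "on" file. Note that a successful write to "on" does not prove the devices powered up, because on_store() in the patch returns `count` even when evaluating _ON fails.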