mbox series

[v2,0/5] Introduce Platform Firmware Runtime Update and Telemetry drivers

Message ID cover.1631600169.git.yu.c.chen@intel.com
Headers show
Series Introduce Platform Firmware Runtime Update and Telemetry drivers | expand

Message

Chen Yu Sept. 14, 2021, 7:24 a.m. UTC
High Service Level Agreements (SLAs) requires that the system runs without
service interruptions. Generally, system firmware provides runtime services
such as RAS(Reliability, Availability and Serviceability) features, UEFI runtime
services and ACPI services. Currently if there is any firmware code changes in
these code area, the system firmware update and reboot is required. Example of
bug fix could be wrong register size or location of the register. This means
customer services are not available during the firmware upgrade, which could
approach several minutes, resulting in not able to meet SLAs.

Intel provides a mechanism named Management Mode Runtime Update to help the users
update the firmware without having to reboot[1].

This series provides the following facilities.

  1. Perform a runtime firmware driver update and activate.
  2. Ability to inject firmware code at runtime, for dynamic instrumentation.
  3. Facility to retrieve logs from runtime firmware update and activate telemetry.
     (The telemetry is based on runtime firmware update: it records the logs during
      runtime update(code injection and driver update).

The Management Mode Runtime Update OS Interface Specification[1] provides two ACPI
device objects to interface with system firmware to perform these updates. This patch
series introduces the drivers for those ACPI devices.

[1] https://uefi.org/sites/default/files/resources/Intel_MM_OS_Interface_Spec_Rev100.pdf

Chen Yu (5):
  Documentation: Introduce Platform Firmware Runtime Update
    documentation
  efi: Introduce EFI_FIRMWARE_MANAGEMENT_CAPSULE_HEADER and
    corresponding structures
  drivers/acpi: Introduce Platform Firmware Runtime Update device driver
  drivers/acpi: Introduce Platform Firmware Runtime Update Telemetry
  selftests/pfru: add test for Platform Firmware Runtime Update and
    Telemetry

 .../userspace-api/ioctl/ioctl-number.rst      |   1 +
 Documentation/x86/index.rst                   |   1 +
 Documentation/x86/pfru.rst                    | 100 +++
 drivers/acpi/Kconfig                          |   1 +
 drivers/acpi/Makefile                         |   1 +
 drivers/acpi/pfru/Kconfig                     |  29 +
 drivers/acpi/pfru/Makefile                    |   3 +
 drivers/acpi/pfru/pfru_telemetry.c            | 412 +++++++++++++
 drivers/acpi/pfru/pfru_update.c               | 567 ++++++++++++++++++
 include/linux/efi.h                           |  50 ++
 include/uapi/linux/pfru.h                     | 151 +++++
 tools/testing/selftests/Makefile              |   1 +
 tools/testing/selftests/pfru/Makefile         |   7 +
 tools/testing/selftests/pfru/config           |   2 +
 tools/testing/selftests/pfru/pfru_test.c      | 329 ++++++++++
 15 files changed, 1655 insertions(+)
 create mode 100644 Documentation/x86/pfru.rst
 create mode 100644 drivers/acpi/pfru/Kconfig
 create mode 100644 drivers/acpi/pfru/Makefile
 create mode 100644 drivers/acpi/pfru/pfru_telemetry.c
 create mode 100644 drivers/acpi/pfru/pfru_update.c
 create mode 100644 include/uapi/linux/pfru.h
 create mode 100644 tools/testing/selftests/pfru/Makefile
 create mode 100644 tools/testing/selftests/pfru/config
 create mode 100644 tools/testing/selftests/pfru/pfru_test.c

Comments

Greg Kroah-Hartman Sept. 14, 2021, 8:11 a.m. UTC | #1
On Tue, Sep 14, 2021 at 03:58:41PM +0800, Chen Yu wrote:
> +enum start_action {
> +	START_STAGE,
> +	START_ACTIVATE,
> +	START_STAGE_ACTIVATE,
> +};
> +
> +enum dsm_status {
> +	DSM_SUCCEED,
> +	DSM_FUNC_NOT_SUPPORT,
> +	DSM_INVAL_INPUT,
> +	DSM_HARDWARE_ERR,
> +	DSM_RETRY_SUGGESTED,
> +	DSM_UNKNOWN,
> +	DSM_FUNC_SPEC_ERR,
> +};
> +
> +struct update_cap_info {
> +	enum dsm_status status;
> +	int update_cap;
> +
> +	uuid_t code_type;
> +	int fw_version;
> +	int code_rt_version;
> +
> +	uuid_t drv_type;
> +	int drv_rt_version;
> +	int drv_svn;
> +
> +	uuid_t platform_id;
> +	uuid_t oem_id;
> +
> +	char oem_info[];

Please use valid types for structures that cross the user/kernel
boundry.

> +};
> +
> +struct com_buf_info {
> +	enum dsm_status status;
> +	enum dsm_status ext_status;
> +	unsigned long addr_lo;
> +	unsigned long addr_hi;
> +	int buf_size;
> +};

Same here.

> +
> +struct updated_result {
> +	enum dsm_status status;
> +	enum dsm_status ext_status;
> +	unsigned long low_auth_time;
> +	unsigned long high_auth_time;
> +	unsigned long low_exec_time;
> +	unsigned long high_exec_time;

And same here.

And these are very odd structure names that you are adding to the
"global" namespace.  Please make them have a prefix for your driver so
that people know what they belong to.  "updated_result" is way too
generic.

thanks,

greg k-h
Chen Yu Sept. 15, 2021, 9:03 a.m. UTC | #2
Hi Greg,
thank you very much for your comments,
On Tue, Sep 14, 2021 at 10:11:31AM +0200, Greg Kroah-Hartman wrote:
> On Tue, Sep 14, 2021 at 03:58:41PM +0800, Chen Yu wrote:

> > +enum start_action {

> > +	START_STAGE,

> > +	START_ACTIVATE,

> > +	START_STAGE_ACTIVATE,

> > +};

> > +

> > +enum dsm_status {

> > +	DSM_SUCCEED,

> > +	DSM_FUNC_NOT_SUPPORT,

> > +	DSM_INVAL_INPUT,

> > +	DSM_HARDWARE_ERR,

> > +	DSM_RETRY_SUGGESTED,

> > +	DSM_UNKNOWN,

> > +	DSM_FUNC_SPEC_ERR,

> > +};

> > +

> > +struct update_cap_info {

> > +	enum dsm_status status;

> > +	int update_cap;

> > +

> > +	uuid_t code_type;

> > +	int fw_version;

> > +	int code_rt_version;

> > +

> > +	uuid_t drv_type;

> > +	int drv_rt_version;

> > +	int drv_svn;

> > +

> > +	uuid_t platform_id;

> > +	uuid_t oem_id;

> > +

> > +	char oem_info[];

> 

> Please use valid types for structures that cross the user/kernel

> boundry.

> 

Okay, I'll switch them into __u prefixed one.
> > +};

> > +

> > +struct com_buf_info {

> > +	enum dsm_status status;

> > +	enum dsm_status ext_status;

> > +	unsigned long addr_lo;

> > +	unsigned long addr_hi;

> > +	int buf_size;

> > +};

> 

> Same here.

> 

Okay, I'll fix them.
> > +

> > +struct updated_result {

> > +	enum dsm_status status;

> > +	enum dsm_status ext_status;

> > +	unsigned long low_auth_time;

> > +	unsigned long high_auth_time;

> > +	unsigned long low_exec_time;

> > +	unsigned long high_exec_time;

> 

> And same here.

> 

> And these are very odd structure names that you are adding to the

> "global" namespace.  Please make them have a prefix for your driver so

> that people know what they belong to.  "updated_result" is way too

> generic.

> 

Okay, added the driver name prefix in next version.

thanks,
Chenyu