Message ID | 167571650007.587790.10040913293130712882.stgit@djiang5-mobl3.local |
---|---|
Headers | show |
Series | cxl: Add support for QTG ID retrieval for CXL subsystem | expand |
On Mon, Feb 06, 2023 at 01:50:06PM -0700, Dave Jiang wrote: > 'struct acpi_cadt_dsmas' => 'struct acpi_cdat_dsmas' > > Fixes: 51aad1a6723b ("ACPICA: Finish support for the CDAT table") > Signed-off-by: Dave Jiang <dave.jiang@intel.com> ACPICA changes need to go into upstream first (via a pull request on GitHub). Once it's merged, you can submit the same patch downstream for the kernel and reference the ACPICA commit with a Link: tag. I've already submitted a pull request for the exact same change more than a week ago: https://github.com/acpica/acpica/pull/830 The pull request has been approved but not merged. Hopefully that'll happen soon. Thanks, Lukas > --- > include/acpi/actbl1.h | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/include/acpi/actbl1.h b/include/acpi/actbl1.h > index 4175dce3967c..e8297cefde09 100644 > --- a/include/acpi/actbl1.h > +++ b/include/acpi/actbl1.h > @@ -344,7 +344,7 @@ enum acpi_cdat_type { > > /* Subtable 0: Device Scoped Memory Affinity Structure (DSMAS) */ > > -struct acpi_cadt_dsmas { > +struct acpi_cdat_dsmas { > u8 dsmad_handle; > u8 flags; > u16 reserved; > >
On Mon, 06 Feb 2023 13:49:30 -0700 Dave Jiang <dave.jiang@intel.com> wrote: > Export the QoS Throttling Group ID from the CXL Fixed Memory Window > Structure (CFMWS) under the root decoder sysfs attributes. > CXL rev3.0 9.17.1.3 CXL Fixed Memory Window Structure (CFMWS) > > Signed-off-by: Dave Jiang <dave.jiang@intel.com> Hi Dave, I've no objection to this, but would good to say why this might be of use to userspace. What tooling needs it? One comment on docs inline. With those two things tidied up Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> > --- > Documentation/ABI/testing/sysfs-bus-cxl | 7 +++++++ > drivers/cxl/acpi.c | 3 +++ > drivers/cxl/core/port.c | 14 ++++++++++++++ > drivers/cxl/cxl.h | 3 +++ > 4 files changed, 27 insertions(+) > > diff --git a/Documentation/ABI/testing/sysfs-bus-cxl b/Documentation/ABI/testing/sysfs-bus-cxl > index 8494ef27e8d2..0932c2f6fbf4 100644 > --- a/Documentation/ABI/testing/sysfs-bus-cxl > +++ b/Documentation/ABI/testing/sysfs-bus-cxl > @@ -294,6 +294,13 @@ Description: > (WO) Write a string in the form 'regionZ' to delete that region, > provided it is currently idle / not bound to a driver. > > +What: /sys/bus/cxl/devices/decoderX.Y/qtg_id > +Date: Jan, 2023 > +KernelVersion: v6.3 > +Contact: linux-cxl@vger.kernel.org > +Description: > + (RO) Shows the QoS Throttling Group ID. The QTG ID for a root > + decoder comes from the CFMWS structure of the CEDT. Document the -1 value for no ID in here. Hopefully people will write their userspace against this document and we want them to know about that corner case! > > What: /sys/bus/cxl/devices/regionZ/uuid > Date: May, 2022 > diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c > index 13cde44c6086..7a71bb5041c7 100644 > --- a/drivers/cxl/acpi.c > +++ b/drivers/cxl/acpi.c > @@ -289,6 +289,9 @@ static int cxl_parse_cfmws(union acpi_subtable_headers *header, void *arg, > } > } > } > + > + cxld->qtg_id = cfmws->qtg_id; > + > rc = cxl_decoder_add(cxld, target_map); > err_xormap: > if (rc) > diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c > index b631a0520456..fe78daf7e7c8 100644 > --- a/drivers/cxl/core/port.c > +++ b/drivers/cxl/core/port.c > @@ -284,6 +284,16 @@ static ssize_t interleave_ways_show(struct device *dev, > > static DEVICE_ATTR_RO(interleave_ways); > > +static ssize_t qtg_id_show(struct device *dev, > + struct device_attribute *attr, char *buf) > +{ > + struct cxl_decoder *cxld = to_cxl_decoder(dev); > + > + return sysfs_emit(buf, "%d\n", cxld->qtg_id); > +} > + > +static DEVICE_ATTR_RO(qtg_id); > + > static struct attribute *cxl_decoder_base_attrs[] = { > &dev_attr_start.attr, > &dev_attr_size.attr, > @@ -303,6 +313,7 @@ static struct attribute *cxl_decoder_root_attrs[] = { > &dev_attr_cap_type2.attr, > &dev_attr_cap_type3.attr, > &dev_attr_target_list.attr, > + &dev_attr_qtg_id.attr, > SET_CXL_REGION_ATTR(create_pmem_region) > SET_CXL_REGION_ATTR(delete_region) > NULL, > @@ -1606,6 +1617,7 @@ struct cxl_root_decoder *cxl_root_decoder_alloc(struct cxl_port *port, > } > > atomic_set(&cxlrd->region_id, rc); > + cxld->qtg_id = CXL_QTG_ID_INVALID; > return cxlrd; > } > EXPORT_SYMBOL_NS_GPL(cxl_root_decoder_alloc, CXL); > @@ -1643,6 +1655,7 @@ struct cxl_switch_decoder *cxl_switch_decoder_alloc(struct cxl_port *port, > > cxld = &cxlsd->cxld; > cxld->dev.type = &cxl_decoder_switch_type; > + cxld->qtg_id = CXL_QTG_ID_INVALID; > return cxlsd; > } > EXPORT_SYMBOL_NS_GPL(cxl_switch_decoder_alloc, CXL); > @@ -1675,6 +1688,7 @@ struct cxl_endpoint_decoder *cxl_endpoint_decoder_alloc(struct cxl_port *port) > } > > cxld->dev.type = &cxl_decoder_endpoint_type; > + cxld->qtg_id = CXL_QTG_ID_INVALID; > return cxled; > } > EXPORT_SYMBOL_NS_GPL(cxl_endpoint_decoder_alloc, CXL); > diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h > index 1b1cf459ac77..f558bbfc0332 100644 > --- a/drivers/cxl/cxl.h > +++ b/drivers/cxl/cxl.h > @@ -279,6 +279,7 @@ enum cxl_decoder_type { > */ > #define CXL_DECODER_MAX_INTERLEAVE 16 > > +#define CXL_QTG_ID_INVALID -1 > > /** > * struct cxl_decoder - Common CXL HDM Decoder Attributes > @@ -290,6 +291,7 @@ enum cxl_decoder_type { > * @target_type: accelerator vs expander (type2 vs type3) selector > * @region: currently assigned region for this decoder > * @flags: memory type capabilities and locking > + * @qtg_id: QoS Throttling Group ID > * @commit: device/decoder-type specific callback to commit settings to hw > * @reset: device/decoder-type specific callback to reset hw settings > */ > @@ -302,6 +304,7 @@ struct cxl_decoder { > enum cxl_decoder_type target_type; > struct cxl_region *region; > unsigned long flags; > + int qtg_id; > int (*commit)(struct cxl_decoder *cxld); > int (*reset)(struct cxl_decoder *cxld); > }; > >
On Mon, 06 Feb 2023 13:49:48 -0700 Dave Jiang <dave.jiang@intel.com> wrote: > A CDAT table is available from a CXL device. The table is read by the > driver and cached in software. With the CXL subsystem needing to parse the > CDAT table, the checksum should be verified. Add checksum verification > after the CDAT table is read from device. > > Signed-off-by: Dave Jiang <dave.jiang@intel.com> Hi Dave, Some comments on this follow on from previous patch so may not be relevant once you've updated how that is done. Jonathan > --- > drivers/cxl/core/pci.c | 11 +++++++++++ > 1 file changed, 11 insertions(+) > > diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c > index 57764e9cd19d..a24dac36bedd 100644 > --- a/drivers/cxl/core/pci.c > +++ b/drivers/cxl/core/pci.c > @@ -3,6 +3,7 @@ > #include <linux/io-64-nonatomic-lo-hi.h> > #include <linux/device.h> > #include <linux/delay.h> > +#include <linux/acpi.h> > #include <linux/pci.h> > #include <linux/pci-doe.h> > #include <cxlpci.h> > @@ -592,6 +593,7 @@ void read_cdat_data(struct cxl_port *port) > struct device *dev = &port->dev; > struct device *uport = port->uport; > size_t cdat_length; > + acpi_status status; > int rc; > > cdat_doe = find_cdat_doe(uport); > @@ -620,5 +622,14 @@ void read_cdat_data(struct cxl_port *port) > port->cdat.length = 0; > dev_err(dev, "CDAT data read error\n"); > } > + > + status = acpi_ut_verify_cdat_checksum(port->cdat.table, port->cdat.length); > + if (status != AE_OK) { if (ACPI_FAILURE(acpi_ut...)) or better still put that in the wrapper I suggeste in previous patch so that we have normal kernel return code handling out here. > + /* Don't leave table data allocated on error */ > + devm_kfree(dev, port->cdat.table); > + port->cdat.table = NULL; > + port->cdat.length = 0; I'd rather see us manipulate a local copy of cdat_length, and cdat_table then only assign them to the port->cdat fields the success path rather than setting them then unsetting the on error. Diff will be bigger, but nicer resulting code (and hopefully diff won't be too big!) > + dev_err(dev, "CDAT data checksum error\n"); > + } > } > EXPORT_SYMBOL_NS_GPL(read_cdat_data, CXL); > >
On Mon, 06 Feb 2023 13:50:23 -0700 Dave Jiang <dave.jiang@intel.com> wrote: > Provide a callback to parse the Device Scoped Latency and Bandwidth > Information Structure (DSLBIS) in the CDAT structures. The DSLBIS > contains the bandwidth and latency information that's tied to a DSMAS > handle. The driver will retrieve the read and write latency and > bandwidth associated with the DSMAS which is tied to a DPA range. > > Signed-off-by: Dave Jiang <dave.jiang@intel.com> A few comments inline, Thanks, Jonathan > --- > drivers/cxl/core/cdat.c | 34 ++++++++++++++++++++++++++++++++++ > drivers/cxl/cxl.h | 2 ++ > drivers/cxl/port.c | 9 ++++++++- > include/acpi/actbl1.h | 5 +++++ > 4 files changed, 49 insertions(+), 1 deletion(-) > > diff --git a/drivers/cxl/core/cdat.c b/drivers/cxl/core/cdat.c > index f9a64a0f1ee4..3c8f3956487e 100644 > --- a/drivers/cxl/core/cdat.c > +++ b/drivers/cxl/core/cdat.c > @@ -121,3 +121,37 @@ int cxl_dsmas_parse_entry(struct acpi_cdat_header *header, void *arg) > return 0; > } > EXPORT_SYMBOL_NS_GPL(cxl_dsmas_parse_entry, CXL); > + > +int cxl_dslbis_parse_entry(struct acpi_cdat_header *header, void *arg) > +{ > + struct cxl_port *port = (struct cxl_port *)arg; > + struct dsmas_entry *dent; > + struct acpi_cdat_dslbis *dslbis; Perhaps reorder to maintain the pretty upside-down Christmas trees (I don't care :) > + u64 val; > + > + if (header->type != ACPI_CDAT_TYPE_DSLBIS) > + return -EINVAL; Isn't this guaranteed by the caller? Seems overkill do it twice and I don't think these will ever be called outside of that wrapper that loops over the entries. I could be wrong though! > + > + dslbis = (struct acpi_cdat_dslbis *)((unsigned long)header + sizeof(*header)); header + 1 > + if ((dslbis->flags & ACPI_CEDT_DSLBIS_MEM_MASK) != This field 'must be ignored' if the DSMAS handle isn't a match (as it's an initiator only entry) Odd though it may seem I think we might see one of those on a type 3 device and we are probably going to have other users of this function anyway. I think you need to do the walk below to check we have a DSMAS match, before running this check. > + ACPI_CEDT_DSLBIS_MEM_MEMORY) > + return 0; > + > + if (dslbis->data_type > ACPI_HMAT_WRITE_BANDWIDTH) > + return -ENXIO; This would probably imply a new HMAT spec value, so probably just log it and ignore rather than error out. > + > + /* Value calculation with base_unit, see ACPI Spec 6.5 5.2.28.4 */ > + val = dslbis->entry[0] * dslbis->entry_base_unit; In theory this might overflow as u64 * u16. Doubt it will ever happen in reality, but maybe a check and debug print if it does? > + > + mutex_lock(&port->cdat.dsmas_lock); > + list_for_each_entry(dent, &port->cdat.dsmas_list, list) { > + if (dslbis->handle == dent->handle) { > + dent->qos[dslbis->data_type] = val; > + break; > + } > + } > + mutex_unlock(&port->cdat.dsmas_lock); > + > + return 0; > +} > +EXPORT_SYMBOL_NS_GPL(cxl_dslbis_parse_entry, CXL); > diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h > index 1e5e69f08480..849b22236f1d 100644 > --- a/drivers/cxl/cxl.h > +++ b/drivers/cxl/cxl.h > @@ -705,6 +705,7 @@ struct dsmas_entry { > struct list_head list; > struct range dpa_range; > u16 handle; > + u64 qos[ACPI_HMAT_WRITE_BANDWIDTH + 1]; > }; > > typedef int (*cdat_tbl_entry_handler)(struct acpi_cdat_header *header, void *arg); > @@ -716,6 +717,7 @@ int cdat_table_parse_dslbis(void *table, cdat_tbl_entry_handler handler, > void *arg); > > int cxl_dsmas_parse_entry(struct acpi_cdat_header *header, void *arg); > +int cxl_dslbis_parse_entry(struct acpi_cdat_header *header, void *arg); > > /* > * Unit test builds overrides this to __weak, find the 'strong' version > diff --git a/drivers/cxl/port.c b/drivers/cxl/port.c > index b1da73e99bab..8de311208b37 100644 > --- a/drivers/cxl/port.c > +++ b/drivers/cxl/port.c > @@ -65,8 +65,15 @@ static int cxl_port_probe(struct device *dev) > rc = cdat_table_parse_dsmas(port->cdat.table, > cxl_dsmas_parse_entry, > (void *)port); > - if (rc < 0) > + if (rc > 0) { > + rc = cdat_table_parse_dslbis(port->cdat.table, > + cxl_dslbis_parse_entry, > + (void *)port); > + if (rc <= 0) > + dev_dbg(dev, "Failed to parse DSLBIS: %d\n", rc); If we have entries and they won't parse, I think we should be screaming louder. dev_warn() would be my preference for this and the one in the previous patch. Sure we can carry on, but something on the device is not working as expected. > + } else { > dev_dbg(dev, "Failed to parse DSMAS: %d\n", rc); > + } > } > > rc = cxl_hdm_decode_init(cxlds, cxlhdm); > diff --git a/include/acpi/actbl1.h b/include/acpi/actbl1.h > index e8297cefde09..ff6092e45196 100644 > --- a/include/acpi/actbl1.h > +++ b/include/acpi/actbl1.h > @@ -369,6 +369,11 @@ struct acpi_cdat_dslbis { > u16 reserved2; > }; > > +/* Flags for subtable above */ > + > +#define ACPI_CEDT_DSLBIS_MEM_MASK GENMASK(3, 0) > +#define ACPI_CEDT_DSLBIS_MEM_MEMORY 0 > + > /* Subtable 2: Device Scoped Memory Side Cache Information Structure (DSMSCIS) */ > > struct acpi_cdat_dsmscis { > >
On Mon, 06 Feb 2023 13:50:42 -0700 Dave Jiang <dave.jiang@intel.com> wrote: > Provide a helper to find the ACPI0017 device in order to issue the _DSM. > The helper will take the 'struct device' from a cxl_port and iterate until > the root device is reached. The ACPI handle will be returned from the root > device. > > Signed-off-by: Dave Jiang <dave.jiang@intel.com> > --- > drivers/cxl/core/acpi.c | 30 ++++++++++++++++++++++++++++++ > drivers/cxl/cxl.h | 1 + > 2 files changed, 31 insertions(+) > > diff --git a/drivers/cxl/core/acpi.c b/drivers/cxl/core/acpi.c > index 86dc6c9c1f24..05fcd4751619 100644 > --- a/drivers/cxl/core/acpi.c > +++ b/drivers/cxl/core/acpi.c > @@ -5,6 +5,7 @@ > #include <linux/kernel.h> > #include <linux/acpi.h> > #include <linux/pci.h> > +#include <linux/platform_device.h> > #include <asm/div64.h> > #include "cxlpci.h" > #include "cxl.h" > @@ -13,6 +14,35 @@ const guid_t acpi_cxl_qtg_id_guid = > GUID_INIT(0xF365F9A6, 0xA7DE, 0x4071, > 0xA6, 0x6A, 0xB4, 0x0C, 0x0B, 0x4F, 0x8E, 0x52); > > +/** > + * cxl_acpi_get_root_acpi_handle - get the ACPI handle of the CXL root device > + * @dev: 'struct device' to start searching from. Should be from cxl_port->dev. > + * Looks for the ACPI0017 device and return the ACPI handle > + **/ Inconsistent comment style. > +acpi_handle cxl_acpi_get_rootdev_handle(struct device *dev) > +{ > + struct device *itr = dev, *root_dev; Not nice for readability to have an assignment in a list of definitions all on the same line. > + acpi_handle handle; > + > + if (!dev) > + return ERR_PTR(-EINVAL); > + > + while (itr->parent) { > + root_dev = itr; > + itr = itr->parent; > + } > + > + if (!dev_is_platform(root_dev)) > + return ERR_PTR(-ENODEV); > + > + handle = ACPI_HANDLE(root_dev); > + if (!handle) > + return ERR_PTR(-ENODEV); > + > + return handle; > +} > +EXPORT_SYMBOL_NS_GPL(cxl_acpi_get_rootdev_handle, CXL); > + > /** > * cxl_acpi_evaluate_qtg_dsm - Retrieve QTG ids via ACPI _DSM > * @handle: ACPI handle > diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h > index e70df07f9b4b..ac6ea550ab0a 100644 > --- a/drivers/cxl/cxl.h > +++ b/drivers/cxl/cxl.h > @@ -733,6 +733,7 @@ struct qtg_dsm_output { > > struct qtg_dsm_output *cxl_acpi_evaluate_qtg_dsm(acpi_handle handle, > struct qtg_dsm_input *input); > +acpi_handle cxl_acpi_get_rootdev_handle(struct device *dev); > > /* > * Unit test builds overrides this to __weak, find the 'strong' version > >
On Mon, 06 Feb 2023 13:51:19 -0700 Dave Jiang <dave.jiang@intel.com> wrote: > CXL Memory Device SW Guide rev1.0 2.11.2 provides instruction on how to > caluclate latency and bandwidth for CXL memory device. Calculate minimum Spell check your descriptions (I often forget to do this as well! ) > bandwidth and total latency for the path from the CXL device to the root > port. The calculates values are stored in the cached DSMAS entries attached > to the cxl_port of the CXL device. > > For example for a device that is directly attached to a host bus: > Total Latency = Device Latency (from CDAT) + Dev to Host Bus (HB) Link > Latency > Min Bandwidth = Link Bandwidth between Host Bus and CXL device > > For a device that has a switch in between host bus and CXL device: > Total Latency = Device (CDAT) Latency + Dev to Switch Link Latency + > Switch (CDAT) Latency + Switch to HB Link Latency For QTG purposes, are we also supposed to take into account HB to system interconnect type latency (or maybe nearest CPU?). That is likely to be non trivial. > Min Bandwidth = min(dev to switch bandwidth, switch to HB bandwidth) > Signed-off-by: Dave Jiang <dave.jiang@intel.com> Stray sign off. > > The internal latency for a switch can be retrieved from the CDAT of the > switch PCI device. However, since there's no easy way to retrieve that > right now on Linux, a guesstimated constant is used per switch to simplify > the driver code. I'd like to see that gap closed asap. I think it is fairly obvious how to do it, so shouldn't be too hard, just needs a dance to get the DOE for a switch port using Lukas' updated handling of DOE mailboxes. > > Signed-off-by: Dave Jiang <dave.jiang@intel.com> > --- > drivers/cxl/core/port.c | 60 +++++++++++++++++++++++++++++++++++++++++++++++ > drivers/cxl/cxl.h | 9 +++++++ > drivers/cxl/port.c | 42 +++++++++++++++++++++++++++++++++ > 3 files changed, 111 insertions(+) > > diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c > index 2b27319cfd42..aa260361ba7d 100644 > --- a/drivers/cxl/core/port.c > +++ b/drivers/cxl/core/port.c > @@ -1899,6 +1899,66 @@ bool schedule_cxl_memdev_detach(struct cxl_memdev *cxlmd) > } > EXPORT_SYMBOL_NS_GPL(schedule_cxl_memdev_detach, CXL); > > +int cxl_port_get_downstream_qos(struct cxl_port *port, long *bw, long *lat) > +{ > + long total_lat = 0, latency; Similar to before, not good for readability to hide asignments in a list all on one line.
On Mon, 06 Feb 2023 13:51:37 -0700 Dave Jiang <dave.jiang@intel.com> wrote: > Move the enumeration of device capacity to cxl_port_probe() from > cxl_pci_probe(). The size and capacity information should be read > after cxl_await_media_ready() so the data is valid. > > Signed-off-by: Dave Jiang <dave.jiang@intel.com> Fix? > --- > drivers/cxl/pci.c | 8 -------- > drivers/cxl/port.c | 8 ++++++++ > 2 files changed, 8 insertions(+), 8 deletions(-) > > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c > index 258004f34281..e35ed250214e 100644 > --- a/drivers/cxl/pci.c > +++ b/drivers/cxl/pci.c > @@ -484,14 +484,6 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id) > if (rc) > return rc; > > - rc = cxl_dev_state_identify(cxlds); > - if (rc) > - return rc; > - > - rc = cxl_mem_create_range_info(cxlds); > - if (rc) > - return rc; > - > cxlmd = devm_cxl_add_memdev(cxlds); > if (IS_ERR(cxlmd)) > return PTR_ERR(cxlmd); > diff --git a/drivers/cxl/port.c b/drivers/cxl/port.c > index 03380c18fc52..b7a4a1be2945 100644 > --- a/drivers/cxl/port.c > +++ b/drivers/cxl/port.c > @@ -127,6 +127,14 @@ static int cxl_port_probe(struct device *dev) > if (rc) > dev_dbg(dev, "Failed to do QoS calculations\n"); > } > + > + rc = cxl_dev_state_identify(cxlds); > + if (rc) > + return rc; > + > + rc = cxl_mem_create_range_info(cxlds); > + if (rc) > + return rc; > } > > rc = devm_cxl_enumerate_decoders(cxlhdm); > >
On Mon, 06 Feb 2023 13:51:55 -0700 Dave Jiang <dave.jiang@intel.com> wrote: > Match the DPA ranges of the mem device and the calcuated DPA range attached > to the DSMAS. If a match is found, then assign the QTG ID to the relevant > DPA range of the memory device. > > Signed-off-by: Dave Jiang <dave.jiang@intel.com> > --- > drivers/cxl/core/mbox.c | 2 ++ > drivers/cxl/cxlmem.h | 2 ++ > drivers/cxl/port.c | 35 +++++++++++++++++++++++++++++++++++ > 3 files changed, 39 insertions(+) > > diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c > index b03fba212799..2a7b07d65010 100644 > --- a/drivers/cxl/core/mbox.c > +++ b/drivers/cxl/core/mbox.c > @@ -869,6 +869,8 @@ struct cxl_dev_state *cxl_dev_state_create(struct device *dev) > > mutex_init(&cxlds->mbox_mutex); > cxlds->dev = dev; > + cxlds->pmem_qtg_id = -1; > + cxlds->ram_qtg_id = -1; > > return cxlds; > } > diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h > index ab138004f644..d88b88ecc807 100644 > --- a/drivers/cxl/cxlmem.h > +++ b/drivers/cxl/cxlmem.h > @@ -251,6 +251,8 @@ struct cxl_dev_state { > struct resource dpa_res; > struct resource pmem_res; > struct resource ram_res; > + int pmem_qtg_id; > + int ram_qtg_id; > u64 total_bytes; > u64 volatile_only_bytes; > u64 persistent_only_bytes; > diff --git a/drivers/cxl/port.c b/drivers/cxl/port.c > index 6b2ad22487f5..c4cee69d6625 100644 > --- a/drivers/cxl/port.c > +++ b/drivers/cxl/port.c > @@ -68,6 +68,39 @@ static int cxl_port_qos_calculate(struct cxl_port *port) > return 0; > } > > +static bool dpa_match_qtg_range(struct range *dpa, struct range *qtg) > +{ > + if (dpa->start >= qtg->start && dpa->end <= qtg->end) > + return true; > + return false; return dpa->start >= qtg->start && dpa->end <= qtg-end; > +} > + > +static void cxl_dev_set_qtg(struct cxl_port *port, struct cxl_dev_state *cxlds) > +{ > + struct dsmas_entry *dent; > + struct range ram_range = { > + .start = cxlds->ram_res.start, > + .end = cxlds->ram_res.end, > + }; > + struct range pmem_range = { > + .start = cxlds->pmem_res.start, > + .end = cxlds->pmem_res.end, > + }; > + > + mutex_lock(&port->cdat.dsmas_lock); > + list_for_each_entry(dent, &port->cdat.dsmas_list, list) { > + if (dpa_match_qtg_range(&ram_range, &dent->dpa_range)) { > + cxlds->ram_qtg_id = dent->qtg_id; > + break; > + } > + if (dpa_match_qtg_range(&pmem_range, &dent->dpa_range)) { > + cxlds->pmem_qtg_id = dent->qtg_id; Could be multiple ranges in ram and pmem. I guess we can leave that for future work. > + break; > + } > + } > + mutex_unlock(&port->cdat.dsmas_lock); > +} > + > static int cxl_port_probe(struct device *dev) > { > struct cxl_port *port = to_cxl_port(dev); > @@ -134,6 +167,8 @@ static int cxl_port_probe(struct device *dev) > rc = cxl_mem_create_range_info(cxlds); > if (rc) > return rc; > + > + cxl_dev_set_qtg(port, cxlds); > } > > rc = devm_cxl_enumerate_decoders(cxlhdm); > >
On 2/9/23 4:15 AM, Jonathan Cameron wrote: > On Mon, 06 Feb 2023 13:49:30 -0700 > Dave Jiang <dave.jiang@intel.com> wrote: > >> Export the QoS Throttling Group ID from the CXL Fixed Memory Window >> Structure (CFMWS) under the root decoder sysfs attributes. >> CXL rev3.0 9.17.1.3 CXL Fixed Memory Window Structure (CFMWS) >> >> Signed-off-by: Dave Jiang <dave.jiang@intel.com> > > Hi Dave, > > > I've no objection to this, but would good to say why this > might be of use to userspace. What tooling needs it? Will do. > > One comment on docs inline. With those two things tidied up > Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> > > >> --- >> Documentation/ABI/testing/sysfs-bus-cxl | 7 +++++++ >> drivers/cxl/acpi.c | 3 +++ >> drivers/cxl/core/port.c | 14 ++++++++++++++ >> drivers/cxl/cxl.h | 3 +++ >> 4 files changed, 27 insertions(+) >> >> diff --git a/Documentation/ABI/testing/sysfs-bus-cxl b/Documentation/ABI/testing/sysfs-bus-cxl >> index 8494ef27e8d2..0932c2f6fbf4 100644 >> --- a/Documentation/ABI/testing/sysfs-bus-cxl >> +++ b/Documentation/ABI/testing/sysfs-bus-cxl >> @@ -294,6 +294,13 @@ Description: >> (WO) Write a string in the form 'regionZ' to delete that region, >> provided it is currently idle / not bound to a driver. >> >> +What: /sys/bus/cxl/devices/decoderX.Y/qtg_id >> +Date: Jan, 2023 >> +KernelVersion: v6.3 >> +Contact: linux-cxl@vger.kernel.org >> +Description: >> + (RO) Shows the QoS Throttling Group ID. The QTG ID for a root >> + decoder comes from the CFMWS structure of the CEDT. > > Document the -1 value for no ID in here. Hopefully people will write > their userspace against this document and we want them to know about that > corner case! Ok I will add. > >> >> What: /sys/bus/cxl/devices/regionZ/uuid >> Date: May, 2022 >> diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c >> index 13cde44c6086..7a71bb5041c7 100644 >> --- a/drivers/cxl/acpi.c >> +++ b/drivers/cxl/acpi.c >> @@ -289,6 +289,9 @@ static int cxl_parse_cfmws(union acpi_subtable_headers *header, void *arg, >> } >> } >> } >> + >> + cxld->qtg_id = cfmws->qtg_id; >> + >> rc = cxl_decoder_add(cxld, target_map); >> err_xormap: >> if (rc) >> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c >> index b631a0520456..fe78daf7e7c8 100644 >> --- a/drivers/cxl/core/port.c >> +++ b/drivers/cxl/core/port.c >> @@ -284,6 +284,16 @@ static ssize_t interleave_ways_show(struct device *dev, >> >> static DEVICE_ATTR_RO(interleave_ways); >> >> +static ssize_t qtg_id_show(struct device *dev, >> + struct device_attribute *attr, char *buf) >> +{ >> + struct cxl_decoder *cxld = to_cxl_decoder(dev); >> + >> + return sysfs_emit(buf, "%d\n", cxld->qtg_id); >> +} >> + >> +static DEVICE_ATTR_RO(qtg_id); >> + >> static struct attribute *cxl_decoder_base_attrs[] = { >> &dev_attr_start.attr, >> &dev_attr_size.attr, >> @@ -303,6 +313,7 @@ static struct attribute *cxl_decoder_root_attrs[] = { >> &dev_attr_cap_type2.attr, >> &dev_attr_cap_type3.attr, >> &dev_attr_target_list.attr, >> + &dev_attr_qtg_id.attr, >> SET_CXL_REGION_ATTR(create_pmem_region) >> SET_CXL_REGION_ATTR(delete_region) >> NULL, >> @@ -1606,6 +1617,7 @@ struct cxl_root_decoder *cxl_root_decoder_alloc(struct cxl_port *port, >> } >> >> atomic_set(&cxlrd->region_id, rc); >> + cxld->qtg_id = CXL_QTG_ID_INVALID; >> return cxlrd; >> } >> EXPORT_SYMBOL_NS_GPL(cxl_root_decoder_alloc, CXL); >> @@ -1643,6 +1655,7 @@ struct cxl_switch_decoder *cxl_switch_decoder_alloc(struct cxl_port *port, >> >> cxld = &cxlsd->cxld; >> cxld->dev.type = &cxl_decoder_switch_type; >> + cxld->qtg_id = CXL_QTG_ID_INVALID; >> return cxlsd; >> } >> EXPORT_SYMBOL_NS_GPL(cxl_switch_decoder_alloc, CXL); >> @@ -1675,6 +1688,7 @@ struct cxl_endpoint_decoder *cxl_endpoint_decoder_alloc(struct cxl_port *port) >> } >> >> cxld->dev.type = &cxl_decoder_endpoint_type; >> + cxld->qtg_id = CXL_QTG_ID_INVALID; >> return cxled; >> } >> EXPORT_SYMBOL_NS_GPL(cxl_endpoint_decoder_alloc, CXL); >> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h >> index 1b1cf459ac77..f558bbfc0332 100644 >> --- a/drivers/cxl/cxl.h >> +++ b/drivers/cxl/cxl.h >> @@ -279,6 +279,7 @@ enum cxl_decoder_type { >> */ >> #define CXL_DECODER_MAX_INTERLEAVE 16 >> >> +#define CXL_QTG_ID_INVALID -1 >> >> /** >> * struct cxl_decoder - Common CXL HDM Decoder Attributes >> @@ -290,6 +291,7 @@ enum cxl_decoder_type { >> * @target_type: accelerator vs expander (type2 vs type3) selector >> * @region: currently assigned region for this decoder >> * @flags: memory type capabilities and locking >> + * @qtg_id: QoS Throttling Group ID >> * @commit: device/decoder-type specific callback to commit settings to hw >> * @reset: device/decoder-type specific callback to reset hw settings >> */ >> @@ -302,6 +304,7 @@ struct cxl_decoder { >> enum cxl_decoder_type target_type; >> struct cxl_region *region; >> unsigned long flags; >> + int qtg_id; >> int (*commit)(struct cxl_decoder *cxld); >> int (*reset)(struct cxl_decoder *cxld); >> }; >> >> >
On 2/9/23 4:34 AM, Jonathan Cameron wrote: > On Mon, 06 Feb 2023 13:49:48 -0700 > Dave Jiang <dave.jiang@intel.com> wrote: > >> A CDAT table is available from a CXL device. The table is read by the >> driver and cached in software. With the CXL subsystem needing to parse the >> CDAT table, the checksum should be verified. Add checksum verification >> after the CDAT table is read from device. >> >> Signed-off-by: Dave Jiang <dave.jiang@intel.com> > Hi Dave, > > Some comments on this follow on from previous patch so may not > be relevant once you've updated how that is done. Dan advised to just drop the ACPICA changes and just do it locally since the verification code is tiny and simple. > > Jonathan > >> --- >> drivers/cxl/core/pci.c | 11 +++++++++++ >> 1 file changed, 11 insertions(+) >> >> diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c >> index 57764e9cd19d..a24dac36bedd 100644 >> --- a/drivers/cxl/core/pci.c >> +++ b/drivers/cxl/core/pci.c >> @@ -3,6 +3,7 @@ >> #include <linux/io-64-nonatomic-lo-hi.h> >> #include <linux/device.h> >> #include <linux/delay.h> >> +#include <linux/acpi.h> >> #include <linux/pci.h> >> #include <linux/pci-doe.h> >> #include <cxlpci.h> >> @@ -592,6 +593,7 @@ void read_cdat_data(struct cxl_port *port) >> struct device *dev = &port->dev; >> struct device *uport = port->uport; >> size_t cdat_length; >> + acpi_status status; >> int rc; >> >> cdat_doe = find_cdat_doe(uport); >> @@ -620,5 +622,14 @@ void read_cdat_data(struct cxl_port *port) >> port->cdat.length = 0; >> dev_err(dev, "CDAT data read error\n"); >> } >> + >> + status = acpi_ut_verify_cdat_checksum(port->cdat.table, port->cdat.length); >> + if (status != AE_OK) { > > if (ACPI_FAILURE(acpi_ut...)) or better still put that in the wrapper I suggeste > in previous patch so that we have normal kernel return code handling out here. > > >> + /* Don't leave table data allocated on error */ >> + devm_kfree(dev, port->cdat.table); >> + port->cdat.table = NULL; >> + port->cdat.length = 0; > > I'd rather see us manipulate a local copy of cdat_length, and cdat_table > then only assign them to the port->cdat fields the success path rather > than setting them then unsetting the on error. > > Diff will be bigger, but nicer resulting code (and hopefully diff > won't be too big!) Ok, I'll create a prep patch to change this as you suggested. > > >> + dev_err(dev, "CDAT data checksum error\n"); >> + } >> } >> EXPORT_SYMBOL_NS_GPL(read_cdat_data, CXL); >> >> >
On 2/9/23 6:50 AM, Jonathan Cameron wrote: > On Mon, 06 Feb 2023 13:50:23 -0700 > Dave Jiang <dave.jiang@intel.com> wrote: > >> Provide a callback to parse the Device Scoped Latency and Bandwidth >> Information Structure (DSLBIS) in the CDAT structures. The DSLBIS >> contains the bandwidth and latency information that's tied to a DSMAS >> handle. The driver will retrieve the read and write latency and >> bandwidth associated with the DSMAS which is tied to a DPA range. >> >> Signed-off-by: Dave Jiang <dave.jiang@intel.com> > A few comments inline, > > Thanks, > > Jonathan > >> --- >> drivers/cxl/core/cdat.c | 34 ++++++++++++++++++++++++++++++++++ >> drivers/cxl/cxl.h | 2 ++ >> drivers/cxl/port.c | 9 ++++++++- >> include/acpi/actbl1.h | 5 +++++ >> 4 files changed, 49 insertions(+), 1 deletion(-) >> >> diff --git a/drivers/cxl/core/cdat.c b/drivers/cxl/core/cdat.c >> index f9a64a0f1ee4..3c8f3956487e 100644 >> --- a/drivers/cxl/core/cdat.c >> +++ b/drivers/cxl/core/cdat.c >> @@ -121,3 +121,37 @@ int cxl_dsmas_parse_entry(struct acpi_cdat_header *header, void *arg) >> return 0; >> } >> EXPORT_SYMBOL_NS_GPL(cxl_dsmas_parse_entry, CXL); >> + >> +int cxl_dslbis_parse_entry(struct acpi_cdat_header *header, void *arg) >> +{ >> + struct cxl_port *port = (struct cxl_port *)arg; >> + struct dsmas_entry *dent; >> + struct acpi_cdat_dslbis *dslbis; > > Perhaps reorder to maintain the pretty upside-down Christmas trees > (I don't care :) will fix > >> + u64 val; >> + >> + if (header->type != ACPI_CDAT_TYPE_DSLBIS) >> + return -EINVAL; > > Isn't this guaranteed by the caller? Seems overkill do it twice > and I don't think these will ever be called outside of that wrapper that > loops over the entries. I could be wrong though! > ok will remove >> + >> + dslbis = (struct acpi_cdat_dslbis *)((unsigned long)header + sizeof(*header)); > header + 1 > >> + if ((dslbis->flags & ACPI_CEDT_DSLBIS_MEM_MASK) != > > This field 'must be ignored' if the DSMAS handle isn't a match > (as it's an initiator only entry) Odd though it may seem I think we > might see one of those on a type 3 device and we are probably going to > have other users of this function anyway. > > I think you need to do the walk below to check we have a DSMAS match, before > running this check. ok, will move down to where entry is matched > >> + ACPI_CEDT_DSLBIS_MEM_MEMORY) >> + return 0; >> + >> + if (dslbis->data_type > ACPI_HMAT_WRITE_BANDWIDTH) >> + return -ENXIO; > > This would probably imply a new HMAT spec value, so probably just > log it and ignore rather than error out. ok > >> + >> + /* Value calculation with base_unit, see ACPI Spec 6.5 5.2.28.4 */ >> + val = dslbis->entry[0] * dslbis->entry_base_unit; > > In theory this might overflow as u64 * u16. > Doubt it will ever happen in reality, but maybe a check and debug print if it does? ok will use check_mul_overflow() > >> + >> + mutex_lock(&port->cdat.dsmas_lock); >> + list_for_each_entry(dent, &port->cdat.dsmas_list, list) { >> + if (dslbis->handle == dent->handle) { >> + dent->qos[dslbis->data_type] = val; >> + break; >> + } >> + } >> + mutex_unlock(&port->cdat.dsmas_lock); >> + >> + return 0; >> +} >> +EXPORT_SYMBOL_NS_GPL(cxl_dslbis_parse_entry, CXL); >> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h >> index 1e5e69f08480..849b22236f1d 100644 >> --- a/drivers/cxl/cxl.h >> +++ b/drivers/cxl/cxl.h >> @@ -705,6 +705,7 @@ struct dsmas_entry { >> struct list_head list; >> struct range dpa_range; >> u16 handle; >> + u64 qos[ACPI_HMAT_WRITE_BANDWIDTH + 1]; >> }; >> >> typedef int (*cdat_tbl_entry_handler)(struct acpi_cdat_header *header, void *arg); >> @@ -716,6 +717,7 @@ int cdat_table_parse_dslbis(void *table, cdat_tbl_entry_handler handler, >> void *arg); >> >> int cxl_dsmas_parse_entry(struct acpi_cdat_header *header, void *arg); >> +int cxl_dslbis_parse_entry(struct acpi_cdat_header *header, void *arg); >> >> /* >> * Unit test builds overrides this to __weak, find the 'strong' version >> diff --git a/drivers/cxl/port.c b/drivers/cxl/port.c >> index b1da73e99bab..8de311208b37 100644 >> --- a/drivers/cxl/port.c >> +++ b/drivers/cxl/port.c >> @@ -65,8 +65,15 @@ static int cxl_port_probe(struct device *dev) >> rc = cdat_table_parse_dsmas(port->cdat.table, >> cxl_dsmas_parse_entry, >> (void *)port); >> - if (rc < 0) >> + if (rc > 0) { >> + rc = cdat_table_parse_dslbis(port->cdat.table, >> + cxl_dslbis_parse_entry, >> + (void *)port); >> + if (rc <= 0) >> + dev_dbg(dev, "Failed to parse DSLBIS: %d\n", rc); > > If we have entries and they won't parse, I think we should be screaming louder. > dev_warn() would be my preference for this and the one in the previous patch. > Sure we can carry on, but something on the device is not working as expected. ok will fix this one and previous. > >> + } else { >> dev_dbg(dev, "Failed to parse DSMAS: %d\n", rc); >> + } >> } >> >> rc = cxl_hdm_decode_init(cxlds, cxlhdm); >> diff --git a/include/acpi/actbl1.h b/include/acpi/actbl1.h >> index e8297cefde09..ff6092e45196 100644 >> --- a/include/acpi/actbl1.h >> +++ b/include/acpi/actbl1.h >> @@ -369,6 +369,11 @@ struct acpi_cdat_dslbis { >> u16 reserved2; >> }; >> >> +/* Flags for subtable above */ >> + >> +#define ACPI_CEDT_DSLBIS_MEM_MASK GENMASK(3, 0) >> +#define ACPI_CEDT_DSLBIS_MEM_MEMORY 0 >> + >> /* Subtable 2: Device Scoped Memory Side Cache Information Structure (DSMSCIS) */ >> >> struct acpi_cdat_dsmscis { >> >> >
On 2/9/23 7:10 AM, Jonathan Cameron wrote: > On Mon, 06 Feb 2023 13:50:42 -0700 > Dave Jiang <dave.jiang@intel.com> wrote: > >> Provide a helper to find the ACPI0017 device in order to issue the _DSM. >> The helper will take the 'struct device' from a cxl_port and iterate until >> the root device is reached. The ACPI handle will be returned from the root >> device. >> >> Signed-off-by: Dave Jiang <dave.jiang@intel.com> >> --- >> drivers/cxl/core/acpi.c | 30 ++++++++++++++++++++++++++++++ >> drivers/cxl/cxl.h | 1 + >> 2 files changed, 31 insertions(+) >> >> diff --git a/drivers/cxl/core/acpi.c b/drivers/cxl/core/acpi.c >> index 86dc6c9c1f24..05fcd4751619 100644 >> --- a/drivers/cxl/core/acpi.c >> +++ b/drivers/cxl/core/acpi.c >> @@ -5,6 +5,7 @@ >> #include <linux/kernel.h> >> #include <linux/acpi.h> >> #include <linux/pci.h> >> +#include <linux/platform_device.h> >> #include <asm/div64.h> >> #include "cxlpci.h" >> #include "cxl.h" >> @@ -13,6 +14,35 @@ const guid_t acpi_cxl_qtg_id_guid = >> GUID_INIT(0xF365F9A6, 0xA7DE, 0x4071, >> 0xA6, 0x6A, 0xB4, 0x0C, 0x0B, 0x4F, 0x8E, 0x52); >> >> +/** >> + * cxl_acpi_get_root_acpi_handle - get the ACPI handle of the CXL root device >> + * @dev: 'struct device' to start searching from. Should be from cxl_port->dev. >> + * Looks for the ACPI0017 device and return the ACPI handle >> + **/ > > Inconsistent comment style. ok > >> +acpi_handle cxl_acpi_get_rootdev_handle(struct device *dev) >> +{ >> + struct device *itr = dev, *root_dev; > > Not nice for readability to have an assignment in a list of definitions > all on the same line. ok > >> + acpi_handle handle; >> + >> + if (!dev) >> + return ERR_PTR(-EINVAL); >> + >> + while (itr->parent) { >> + root_dev = itr; >> + itr = itr->parent; >> + } >> + >> + if (!dev_is_platform(root_dev)) >> + return ERR_PTR(-ENODEV); >> + >> + handle = ACPI_HANDLE(root_dev); >> + if (!handle) >> + return ERR_PTR(-ENODEV); >> + >> + return handle; >> +} >> +EXPORT_SYMBOL_NS_GPL(cxl_acpi_get_rootdev_handle, CXL); >> + >> /** >> * cxl_acpi_evaluate_qtg_dsm - Retrieve QTG ids via ACPI _DSM >> * @handle: ACPI handle >> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h >> index e70df07f9b4b..ac6ea550ab0a 100644 >> --- a/drivers/cxl/cxl.h >> +++ b/drivers/cxl/cxl.h >> @@ -733,6 +733,7 @@ struct qtg_dsm_output { >> >> struct qtg_dsm_output *cxl_acpi_evaluate_qtg_dsm(acpi_handle handle, >> struct qtg_dsm_input *input); >> +acpi_handle cxl_acpi_get_rootdev_handle(struct device *dev); >> >> /* >> * Unit test builds overrides this to __weak, find the 'strong' version >> >> >
On 2/9/23 8:24 AM, Jonathan Cameron wrote: > On Mon, 06 Feb 2023 13:51:19 -0700 > Dave Jiang <dave.jiang@intel.com> wrote: > >> CXL Memory Device SW Guide rev1.0 2.11.2 provides instruction on how to >> caluclate latency and bandwidth for CXL memory device. Calculate minimum > > Spell check your descriptions (I often forget to do this as well! > ) >> bandwidth and total latency for the path from the CXL device to the root >> port. The calculates values are stored in the cached DSMAS entries attached >> to the cxl_port of the CXL device. >> >> For example for a device that is directly attached to a host bus: >> Total Latency = Device Latency (from CDAT) + Dev to Host Bus (HB) Link >> Latency >> Min Bandwidth = Link Bandwidth between Host Bus and CXL device >> >> For a device that has a switch in between host bus and CXL device: >> Total Latency = Device (CDAT) Latency + Dev to Switch Link Latency + >> Switch (CDAT) Latency + Switch to HB Link Latency > > For QTG purposes, are we also supposed to take into account HB to > system interconnect type latency (or maybe nearest CPU?). > That is likely to be non trivial. Dan brought this ECN [1] to my attention. We can add this if we can find a BIOS that implements the ECN. Or should we code a place holder for it until this is available? https://lore.kernel.org/linux-cxl/e1a52da9aec90766da5de51b1b839fd95d63a5af.camel@intel.com/ > >> Min Bandwidth = min(dev to switch bandwidth, switch to HB bandwidth) >> Signed-off-by: Dave Jiang <dave.jiang@intel.com> > > Stray sign off. > >> >> The internal latency for a switch can be retrieved from the CDAT of the >> switch PCI device. However, since there's no easy way to retrieve that >> right now on Linux, a guesstimated constant is used per switch to simplify >> the driver code. > > I'd like to see that gap closed asap. I think it is fairly obvious how to do > it, so shouldn't be too hard, just needs a dance to get the DOE for a switch > port using Lukas' updated handling of DOE mailboxes. Talked to Lukas and this may not be difficult with his latest changes. I can take a look. Do we support switch CDAT in QEMU yet? > >> >> Signed-off-by: Dave Jiang <dave.jiang@intel.com> >> --- >> drivers/cxl/core/port.c | 60 +++++++++++++++++++++++++++++++++++++++++++++++ >> drivers/cxl/cxl.h | 9 +++++++ >> drivers/cxl/port.c | 42 +++++++++++++++++++++++++++++++++ >> 3 files changed, 111 insertions(+) >> >> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c >> index 2b27319cfd42..aa260361ba7d 100644 >> --- a/drivers/cxl/core/port.c >> +++ b/drivers/cxl/core/port.c >> @@ -1899,6 +1899,66 @@ bool schedule_cxl_memdev_detach(struct cxl_memdev *cxlmd) >> } >> EXPORT_SYMBOL_NS_GPL(schedule_cxl_memdev_detach, CXL); >> >> +int cxl_port_get_downstream_qos(struct cxl_port *port, long *bw, long *lat) >> +{ >> + long total_lat = 0, latency; > > Similar to before, not good for readability to hide asignments in a list all on one line. >
On Tue, 14 Feb 2023 16:03:27 -0700 Dave Jiang <dave.jiang@intel.com> wrote: > On 2/9/23 8:24 AM, Jonathan Cameron wrote: > > On Mon, 06 Feb 2023 13:51:19 -0700 > > Dave Jiang <dave.jiang@intel.com> wrote: > > > >> CXL Memory Device SW Guide rev1.0 2.11.2 provides instruction on how to > >> caluclate latency and bandwidth for CXL memory device. Calculate minimum > > > > Spell check your descriptions (I often forget to do this as well! > > ) > >> bandwidth and total latency for the path from the CXL device to the root > >> port. The calculates values are stored in the cached DSMAS entries attached > >> to the cxl_port of the CXL device. > >> > >> For example for a device that is directly attached to a host bus: > >> Total Latency = Device Latency (from CDAT) + Dev to Host Bus (HB) Link > >> Latency > >> Min Bandwidth = Link Bandwidth between Host Bus and CXL device > >> > >> For a device that has a switch in between host bus and CXL device: > >> Total Latency = Device (CDAT) Latency + Dev to Switch Link Latency + > >> Switch (CDAT) Latency + Switch to HB Link Latency > > > > For QTG purposes, are we also supposed to take into account HB to > > system interconnect type latency (or maybe nearest CPU?). > > That is likely to be non trivial. > > Dan brought this ECN [1] to my attention. We can add this if we can find > a BIOS that implements the ECN. Or should we code a place holder for it > until this is available? > > https://lore.kernel.org/linux-cxl/e1a52da9aec90766da5de51b1b839fd95d63a5af.camel@intel.com/ I've had Generic Ports on my list to add to QEMU for a while but not been high enough priority to either do it myself, or make it someone else's problem. I suspect the biggest barrier in QEMU is going to be the interface to add these to the NUMA description. It's easy enough to hand build and inject a SRAT /SLIT/HMAT tables with these in (that's how we developed the Generic Initiator support in Linux before any BIOS support). So I'd like to see it soon, but I'm not hugely bothered if that element follows this patch set. However, we are potentially going to see different decisions made when that detail is added so it 'might' count as ABI breakage if it's not there from the start. I think we are fine as probably no BIOS' yet though. > > > > >> Min Bandwidth = min(dev to switch bandwidth, switch to HB bandwidth) > >> Signed-off-by: Dave Jiang <dave.jiang@intel.com> > > > > Stray sign off. > > > >> > >> The internal latency for a switch can be retrieved from the CDAT of the > >> switch PCI device. However, since there's no easy way to retrieve that > >> right now on Linux, a guesstimated constant is used per switch to simplify > >> the driver code. > > > > I'd like to see that gap closed asap. I think it is fairly obvious how to do > > it, so shouldn't be too hard, just needs a dance to get the DOE for a switch > > port using Lukas' updated handling of DOE mailboxes. > > Talked to Lukas and this may not be difficult with his latest changes. I > can take a look. Do we support switch CDAT in QEMU yet? I started typing no, then thought I'd just check. Seems I did write support for CDAT on switches (and then completely forgot about it ;) It's upstream and everything! https://elixir.bootlin.com/qemu/latest/source/hw/pci-bridge/cxl_upstream.c#L194
On 2/15/23 6:17 AM, Jonathan Cameron wrote: > On Tue, 14 Feb 2023 16:03:27 -0700 > Dave Jiang <dave.jiang@intel.com> wrote: > >> On 2/9/23 8:24 AM, Jonathan Cameron wrote: >>> On Mon, 06 Feb 2023 13:51:19 -0700 >>> Dave Jiang <dave.jiang@intel.com> wrote: >>> >>>> CXL Memory Device SW Guide rev1.0 2.11.2 provides instruction on how to >>>> caluclate latency and bandwidth for CXL memory device. Calculate minimum >>> >>> Spell check your descriptions (I often forget to do this as well! >>> ) >>>> bandwidth and total latency for the path from the CXL device to the root >>>> port. The calculates values are stored in the cached DSMAS entries attached >>>> to the cxl_port of the CXL device. >>>> >>>> For example for a device that is directly attached to a host bus: >>>> Total Latency = Device Latency (from CDAT) + Dev to Host Bus (HB) Link >>>> Latency >>>> Min Bandwidth = Link Bandwidth between Host Bus and CXL device >>>> >>>> For a device that has a switch in between host bus and CXL device: >>>> Total Latency = Device (CDAT) Latency + Dev to Switch Link Latency + >>>> Switch (CDAT) Latency + Switch to HB Link Latency >>> >>> For QTG purposes, are we also supposed to take into account HB to >>> system interconnect type latency (or maybe nearest CPU?). >>> That is likely to be non trivial. >> >> Dan brought this ECN [1] to my attention. We can add this if we can find >> a BIOS that implements the ECN. Or should we code a place holder for it >> until this is available? >> >> https://lore.kernel.org/linux-cxl/e1a52da9aec90766da5de51b1b839fd95d63a5af.camel@intel.com/ > > I've had Generic Ports on my list to add to QEMU for a while but not been > high enough priority to either do it myself, or make it someone else's problem. > I suspect the biggest barrier in QEMU is going to be the interface to add > these to the NUMA description. > > It's easy enough to hand build and inject a SRAT /SLIT/HMAT tables with > these in (that's how we developed the Generic Initiator support in Linux before > any BIOS support). > > So I'd like to see it soon, but I'm not hugely bothered if that element > follows this patch set. However, we are potentially going to see different > decisions made when that detail is added so it 'might' count as ABI > breakage if it's not there from the start. I think we are fine as probably > no BIOS' yet though. > >> >>> >>>> Min Bandwidth = min(dev to switch bandwidth, switch to HB bandwidth) >>>> Signed-off-by: Dave Jiang <dave.jiang@intel.com> >>> >>> Stray sign off. >>> >>>> >>>> The internal latency for a switch can be retrieved from the CDAT of the >>>> switch PCI device. However, since there's no easy way to retrieve that >>>> right now on Linux, a guesstimated constant is used per switch to simplify >>>> the driver code. >>> >>> I'd like to see that gap closed asap. I think it is fairly obvious how to do >>> it, so shouldn't be too hard, just needs a dance to get the DOE for a switch >>> port using Lukas' updated handling of DOE mailboxes. >> >> Talked to Lukas and this may not be difficult with his latest changes. I >> can take a look. Do we support switch CDAT in QEMU yet? > > I started typing no, then thought I'd just check. Seems I did write support > for CDAT on switches (and then completely forgot about it ;) > It's upstream and everything! > https://elixir.bootlin.com/qemu/latest/source/hw/pci-bridge/cxl_upstream.c#L194 > Awesome! I'll go poke around a bit. Also it's very helpful to see the creation code. Helped me realize that I need to support parsing of SSLBIS sub-table for switches. Thanks!