Message ID | 20230421141723.2405942-9-peternewman@google.com |
---|---|
State | New |
Headers | show |
Series | x86/resctrl: Use soft RMIDs for reliable MBM on AMD | expand |
Hi Peter, On 4/21/2023 7:17 AM, Peter Newman wrote: > @@ -806,12 +811,27 @@ void mbm_handle_overflow(struct work_struct *work) > r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl; > d = container_of(work, struct rdt_domain, mbm_over.work); > > + if (rdt_mon_soft_rmid) { > + /* > + * HW RMIDs are permanently assigned to CPUs, so only a per-CPU > + * flush is needed. > + */ > + on_each_cpu_mask(&d->cpu_mask, mbm_flush_cpu_handler, NULL, > + false); > + } > + > list_for_each_entry(prgrp, &rdt_all_groups, rdtgroup_list) { > - mbm_update(r, d, prgrp->mon.rmid); > + /* > + * mbm_update() on every RMID would result in excessive IPIs > + * when RMIDs are soft. > + */ > + if (!rdt_mon_soft_rmid) { > + mbm_update(r, d, prgrp->mon.rmid); > > - head = &prgrp->mon.crdtgrp_list; > - list_for_each_entry(crgrp, head, mon.crdtgrp_list) > - mbm_update(r, d, crgrp->mon.rmid); > + head = &prgrp->mon.crdtgrp_list; > + list_for_each_entry(crgrp, head, mon.crdtgrp_list) > + mbm_update(r, d, crgrp->mon.rmid); > + } > > if (is_mba_sc(NULL)) > update_mba_bw(prgrp, d); hmmm ... I think that update_mba_bw() relies on mbm_update() to call mbm_bw_count() to update the data it depends on. Keeping update_mba_bw() while dropping mbm_update() thus seems problematic. AMD does not support the software controller though so it may make things simpler if support for software RMIDs disables support for software controller (in a clear way). Reinette
Hi Reinette, On Thu, May 11, 2023 at 11:40 PM Reinette Chatre <reinette.chatre@intel.com> wrote: > On 4/21/2023 7:17 AM, Peter Newman wrote: > > list_for_each_entry(prgrp, &rdt_all_groups, rdtgroup_list) { > > - mbm_update(r, d, prgrp->mon.rmid); > > + /* > > + * mbm_update() on every RMID would result in excessive IPIs > > + * when RMIDs are soft. > > + */ > > + if (!rdt_mon_soft_rmid) { > > + mbm_update(r, d, prgrp->mon.rmid); > > > > - head = &prgrp->mon.crdtgrp_list; > > - list_for_each_entry(crgrp, head, mon.crdtgrp_list) > > - mbm_update(r, d, crgrp->mon.rmid); > > + head = &prgrp->mon.crdtgrp_list; > > + list_for_each_entry(crgrp, head, mon.crdtgrp_list) > > + mbm_update(r, d, crgrp->mon.rmid); > > + } > > > > if (is_mba_sc(NULL)) > > update_mba_bw(prgrp, d); > > > hmmm ... I think that update_mba_bw() relies on mbm_update() to call > mbm_bw_count() to update the data it depends on. Keeping update_mba_bw() > while dropping mbm_update() thus seems problematic. AMD does not support the > software controller though so it may make things simpler if support for > software RMIDs disables support for software controller (in a clear way). I looked over this again and realized that the rationale for skipping mbm_update() in this patch is incorrect. __mon_event_count_soft_rmid() does not send any IPIs, so it's perfectly fine to call mbm_update() and update_mba_bw() when using soft RMIDs. Even if we don't use the software controller on AMD, it seems conceptually cleaner to just allow soft and hard RMIDs to be used interchangeably wherever they work. -Peter
On Fri, Apr 21, 2023 at 4:18 PM Peter Newman <peternewman@google.com> wrote: > > __mon_event_count() only reads the current software count and does not > cause CPUs in the domain to flush. For mbm_update() to be effective in > preventing overflow in hardware counters with soft RMIDs, it needs to > flush the domain CPUs so that all of the HW RMIDs are read. > > When RMIDs are soft, mbm_update() is intended to push bandwidth counts > to the software counters rather than pulling the counts from hardware > when userspace reads event counts, as this is a lot more efficient when > the number of HW RMIDs is fixed. The low frequency with which the overflow handler is run would introduce too much error into bandwidth calculations and running it more frequently regardless of whether event count reads are being requested by the user is not a good use of CPU time. mon_event_read() needs to pull fresh event count values from hardware. > When RMIDs are soft, mbm_update() only calls mbm_flush_cpu_handler() on > each CPU in the domain rather than reading all RMIDs. I'll try going back to Stephane's original approach of rate-limiting how often domain CPUs need to be flushed and allowing the user to configure the time threshold. This will allow mbm_update() to read all of the RMIDs without triggering lots of redundant IPIs. (redundant because only the current RMID on each CPU can change when RMIDs are soft)
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c index 3d54a634471a..9575cb79b8ee 100644 --- a/arch/x86/kernel/cpu/resctrl/monitor.c +++ b/arch/x86/kernel/cpu/resctrl/monitor.c @@ -487,6 +487,11 @@ void resctrl_mbm_flush_cpu(void) __mbm_flush(QOS_L3_MBM_TOTAL_EVENT_ID, r, d); } +static void mbm_flush_cpu_handler(void *p) +{ + resctrl_mbm_flush_cpu(); +} + static int __mon_event_count_soft_rmid(u32 rmid, struct rmid_read *rr) { struct mbm_state *m; @@ -806,12 +811,27 @@ void mbm_handle_overflow(struct work_struct *work) r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl; d = container_of(work, struct rdt_domain, mbm_over.work); + if (rdt_mon_soft_rmid) { + /* + * HW RMIDs are permanently assigned to CPUs, so only a per-CPU + * flush is needed. + */ + on_each_cpu_mask(&d->cpu_mask, mbm_flush_cpu_handler, NULL, + false); + } + list_for_each_entry(prgrp, &rdt_all_groups, rdtgroup_list) { - mbm_update(r, d, prgrp->mon.rmid); + /* + * mbm_update() on every RMID would result in excessive IPIs + * when RMIDs are soft. + */ + if (!rdt_mon_soft_rmid) { + mbm_update(r, d, prgrp->mon.rmid); - head = &prgrp->mon.crdtgrp_list; - list_for_each_entry(crgrp, head, mon.crdtgrp_list) - mbm_update(r, d, crgrp->mon.rmid); + head = &prgrp->mon.crdtgrp_list; + list_for_each_entry(crgrp, head, mon.crdtgrp_list) + mbm_update(r, d, crgrp->mon.rmid); + } if (is_mba_sc(NULL)) update_mba_bw(prgrp, d);
__mon_event_count() only reads the current software count and does not cause CPUs in the domain to flush. For mbm_update() to be effective in preventing overflow in hardware counters with soft RMIDs, it needs to flush the domain CPUs so that all of the HW RMIDs are read. When RMIDs are soft, mbm_update() is intended to push bandwidth counts to the software counters rather than pulling the counts from hardware when userspace reads event counts, as this is a lot more efficient when the number of HW RMIDs is fixed. When RMIDs are soft, mbm_update() only calls mbm_flush_cpu_handler() on each CPU in the domain rather than reading all RMIDs. Signed-off-by: Peter Newman <peternewman@google.com> --- arch/x86/kernel/cpu/resctrl/monitor.c | 28 +++++++++++++++++++++++---- 1 file changed, 24 insertions(+), 4 deletions(-)