[v2,0/4] cgroup/cpuset: Improve CPU isolation in isolated partitions

Message ID 20231025182555.4155614-1-longman@redhat.com

Waiman Long Oct. 25, 2023, 6:25 p.m. UTC
v2:
 - Add 2 read-only workqueue sysfs files to expose the user requested
   cpumask as well as the isolated CPUs to be excluded from
   wq_unbound_cpumask.
 - Ensure that the caller of the new workqueue_unbound_exclude_cpumask()
   holds cpus_read_lock.
 - Update the cpuset code to make sure the cpus_read_lock is held
   whenever workqueue_unbound_exclude_cpumask() may be called.

An isolated cpuset partition can currently be created to contain an
exclusive set of CPUs not used in other cgroups and with load balancing
disabled to reduce interference from the scheduler.

The main purpose of this isolated partition type is to dynamically
emulate what can be done via the "isolcpus" boot command line option,
specifically the default domain flag. One effect of the "isolcpus" option
is to remove the isolated CPUs from the cpumasks of unbound workqueues
since running work functions on an isolated CPU can be a major source
of interference. Changing the unbound workqueue cpumasks can be done at
run time by writing an appropriate cpumask without the isolated CPUs to
/sys/devices/virtual/workqueue/cpumask. So one can set up an isolated
cpuset partition and then write to the cpumask sysfs file to achieve a
similar level of CPU isolation. However, this manual process can be
error-prone.
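
To illustrate why the manual step is easy to get wrong, here is a minimal
sketch of computing the mask string by hand. The helper names
(parse_cpulist, to_hex_mask, unbound_mask_without) are hypothetical and
not part of the kernel or of this series. It assumes plain "a-b,c" CPU
lists (no stride syntax) and the usual comma-separated 32-bit hex groups
for the cpumask string.

```python
def parse_cpulist(text):
    """Parse a simple CPU list such as "0-3,8" into a set of CPU numbers.

    Assumption: only "N" and "N-M" items are handled, not stride forms.
    """
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def to_hex_mask(cpus):
    """Format a set of CPUs as hex, in comma-separated 32-bit groups."""
    value = 0
    for cpu in cpus:
        value |= 1 << cpu
    digits = f"{value:x}"
    groups = []
    # Peel off 8 hex digits (32 bits) at a time, starting from the right.
    while digits:
        groups.append(digits[-8:])
        digits = digits[:-8]
    return ",".join(reversed(groups))


def unbound_mask_without(all_cpus, isolated):
    """Mask covering all_cpus minus the isolated CPUs (both CPU lists)."""
    return to_hex_mask(parse_cpulist(all_cpus) - parse_cpulist(isolated))


if __name__ == "__main__":
    # CPUs 0-7 with CPUs 2-3 isolated.
    print(unbound_mask_without("0-7", "2-3"))
```

The resulting string is what would then be written, as root, to
/sys/devices/virtual/workqueue/cpumask. It has to be recomputed and
rewritten every time an isolated partition changes.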

This patch series implements automatic exclusion of isolated CPUs from
unbound workqueue cpumasks when an isolated cpuset partition is created
and then adds those CPUs back when the isolated partition is destroyed.
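
The intended behaviour can be pictured with a toy model. This is only a
sketch under stated assumptions, not the kernel implementation: the
class and method names are hypothetical, CPUs are modelled as Python
sets, and any handling of the case where every requested CPU is
isolated is left out.

```python
class UnboundWorkqueueMask:
    """Toy model: effective mask = user-requested CPUs minus isolated CPUs."""

    def __init__(self, requested):
        # CPUs the user asked unbound workqueues to run on.
        self.requested = frozenset(requested)
        # CPUs currently held by isolated cpuset partitions.
        self.isolated = frozenset()

    @property
    def effective(self):
        # Isolated CPUs are excluded automatically.
        return self.requested - self.isolated

    def create_isolated_partition(self, cpus):
        self.isolated |= frozenset(cpus)

    def destroy_isolated_partition(self, cpus):
        # The CPUs return to the effective mask once no longer isolated.
        self.isolated -= frozenset(cpus)


if __name__ == "__main__":
    wq = UnboundWorkqueueMask(range(8))
    wq.create_isolated_partition({2, 3})
    print(sorted(wq.effective))
    wq.destroy_isolated_partition({2, 3})
    print(sorted(wq.effective))
```

Keeping the requested and isolated sets separate is what lets the CPUs
be added back without the user having to rewrite the cpumask file.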

There are also other places in the kernel that look at the HK_FLAG_DOMAIN
cpumask or other HK_FLAG_* cpumasks and exclude the isolated CPUs from
certain actions to further reduce interference. CPUs in an isolated
cpuset partition cannot avoid that interference yet. That may change
in the future as the need arises.

Waiman Long (4):
  workqueue: Add workqueue_unbound_exclude_cpumask() to exclude CPUs
    from wq_unbound_cpumask
  selftests/cgroup: Minor code cleanup and reorganization of
    test_cpuset_prs.sh
  cgroup/cpuset: Keep track of CPUs in isolated partitions
  cgroup/cpuset: Take isolated CPUs out of workqueue unbound cpumask

 Documentation/admin-guide/cgroup-v2.rst       |  10 +-
 include/linux/workqueue.h                     |   2 +-
 kernel/cgroup/cpuset.c                        | 286 +++++++++++++-----
 kernel/workqueue.c                            |  91 +++++-
 .../selftests/cgroup/test_cpuset_prs.sh       | 216 ++++++++-----
 5 files changed, 438 insertions(+), 167 deletions(-)

Zhang, Rui Nov. 9, 2023, 9:02 a.m. UTC | #1
Hi, Waiman,

May I know which kernel this patch series is based on?

I'd like to test this feature, but cannot apply it cleanly on top of
v6.6.

thanks,
rui

Waiman Long Nov. 10, 2023, 1:24 a.m. UTC | #2
On 11/9/23 04:02, Zhang, Rui wrote:
> Hi, Waiman,
>
> May I know which kernel this patch series is based on?
>
> I'd like to test this feature, but cannot apply it cleanly on top of
> v6.6.

It was originally based on the cgroup/for-6.7 branch. It should be
applicable to the v6.7 kernel now.

Cheers,
Longman

Tejun Heo Nov. 12, 2023, 9:10 p.m. UTC | #3
On Wed, Oct 25, 2023 at 02:25:51PM -0400, Waiman Long wrote:
> v2:
>  - Add 2 read-only workqueue sysfs files to expose the user requested
>    cpumask as well as the isolated CPUs to be excluded from
>    wq_unbound_cpumask.
>  - Ensure that caller of the new workqueue_unbound_exclude_cpumask()
>    hold cpus_read_lock.
>  - Update the cpuset code to make sure the cpus_read_lock is held
>    whenever workqueue_unbound_exclude_cpumask() may be called.

Applied to cgroup/for-6.8 with the patch description for the third patch
updated to reflect the dropping of the __DEBUG__ prefix.

Thanks.