Message ID | 20210315193707.359702-1-peterx@redhat.com |
---|---|
State | New |
Headers | show |
Series | rt-tests: Drop use_current_cpuset() check | expand |
Hi Peter, On Mon, Mar 15, 2021 at 03:37:07PM -0400, Peter Xu wrote: > CPU list should allow to be any list rather than only the cores that are > allowed to schedule for the current process. See commit aaa57168dfd3 ("rt-tests: cyclictest: Only run on runtime affinity and user supplied affinity") > Before this patch, cyclictest will fail with below condition: > > $ taskset -pc $$ > pid 2316's current affinity list: 0,2,4,6,8 > $ sudo cyclictest -m -N -p 1 -a 1,3,5,7 -t 4 > WARN: Couldn't setaffinity in main thread: Invalid argument > > After this patch, it'll be allowed to run. As John writes in the commit above message I think cyclictest should honor to the runtime settings. The warning above could be extended and telling you what's the problem is. Your commit message should contain the argument why it's better not to check the environment settings. So the question is what is the least surprise for the user? Thanks, Daniel
On Tue, Mar 16, 2021 at 09:18:26AM +0100, Daniel Wagner wrote: > Hi Peter, Daniel, > > On Mon, Mar 15, 2021 at 03:37:07PM -0400, Peter Xu wrote: > > CPU list should allow to be any list rather than only the cores that are > > allowed to schedule for the current process. > > See commit aaa57168dfd3 ("rt-tests: cyclictest: Only run on runtime > affinity and user supplied affinity") (Thanks for the commit ID) > > > Before this patch, cyclictest will fail with below condition: > > > > $ taskset -pc $$ > > pid 2316's current affinity list: 0,2,4,6,8 > > $ sudo cyclictest -m -N -p 1 -a 1,3,5,7 -t 4 > > WARN: Couldn't setaffinity in main thread: Invalid argument > > > > After this patch, it'll be allowed to run. > > As John writes in the commit above message I think cyclictest should > honor to the runtime settings. The warning above could be extended and > telling you what's the problem is. > > Your commit message should contain the argument why it's better not to > check the environment settings. So the question is what is the least > surprise for the user? I think what I'm missing is why we had such a restriction. Quotting from the commit ID: Currently if the user passes the affinity to cyclictest, threads are run there even if they should be excluded according to the affinity of the runtime environment. So I'm not sure I understand the word "runtime environment". If it's defined as "the set of cores that this process is allowed to run", I don't understand why it's not allowed to schedule things outside of this set of cores, especially if sched_setaffinity() syscall would succeed. IOW, I'm afraid we got the idea slightly mixed up on "cores allowed to schedule this process" with "cores allowed to be set as schedule target by this process", and IMHO we should simply rely on sched_setaffinity() syscall to decide whether the affinity setting is legal or not as a single checkpoint. John and I had some discussion offlist about this last time on oslat, it should be the same thing here I think. But I'd like John to help confirm too since I could be missing something. Thanks, -- Peter Xu
On Tue, Mar 16, 2021 at 04:07:05PM -0400, Peter Xu wrote: > I think what I'm missing is why we had such a restriction. Quotting from the > commit ID: IIRC, the current behavior allows the process to be placed into a cgroup with a subset of CPUs and you just can do 'cyclictest -a -t'. Process should not ignore external configuration. That's my whole point here. > So I'm not sure I understand the word "runtime environment". > > If it's defined as "the set of cores that this process is allowed to > run", I am trying to say is, the tests should not assume they have the full control of the placement as this is not what I would expect. If I as 'admin' limits the sched mask then cyclictest should not overwrite it. If you insist removing this code, please add a section in the documentation explaining why the tools ignore it. > John and I had some discussion offlist about this last time on oslat, As I said, I think this behavior is wrong. Anyway, I stopped caring.
Hi, Daniel, On Wed, Mar 17, 2021 at 08:49:03AM +0100, Daniel Wagner wrote: > On Tue, Mar 16, 2021 at 04:07:05PM -0400, Peter Xu wrote: > > I think what I'm missing is why we had such a restriction. Quotting from the > > commit ID: > > IIRC, the current behavior allows the process to be placed into a cgroup > with a subset of CPUs and you just can do 'cyclictest -a -t'. Process > should not ignore external configuration. That's my whole point here. In that case again I think a sane solution is not to check the cpu list in every single tool we use, because even if we do that for all tools in rt-teets repo, we can't guarantee to have this check for the rest tools to not ignore this restriction. A simple example is: what if the user specified "taskset -c $CPU cyclictest -a $CPU -t 1 ..." where $CPU is not in the allowed list of current bash? As long as the taskset would work the so-called "environment" will be changed before even loading cyclictest. If you see that's the point I said we should fail at the same check point of sched_setaffinity() rather than checking it explicitly in the tool, because if we want a real-world restriction that's the only place I think it's possible.. But I'm not a cgroup/container guy, please correct me if I understood. -- Peter Xu
On Wed, 17 Mar 2021, Peter Xu wrote: > Hi, Daniel, > > On Wed, Mar 17, 2021 at 08:49:03AM +0100, Daniel Wagner wrote: > > On Tue, Mar 16, 2021 at 04:07:05PM -0400, Peter Xu wrote: > > > I think what I'm missing is why we had such a restriction. Quotting from the > > > commit ID: > > > > IIRC, the current behavior allows the process to be placed into a cgroup > > with a subset of CPUs and you just can do 'cyclictest -a -t'. Process > > should not ignore external configuration. That's my whole point here. > > In that case again I think a sane solution is not to check the cpu list in > every single tool we use, because even if we do that for all tools in rt-teets > repo, we can't guarantee to have this check for the rest tools to not ignore > this restriction. > > A simple example is: what if the user specified "taskset -c $CPU cyclictest -a > $CPU -t 1 ..." where $CPU is not in the allowed list of current bash? As long > as the taskset would work the so-called "environment" will be changed before > even loading cyclictest. > > If you see that's the point I said we should fail at the same check point of > sched_setaffinity() rather than checking it explicitly in the tool, because > if we want a real-world restriction that's the only place I think it's possible.. > > But I'm not a cgroup/container guy, please correct me if I understood. > > -- > Peter Xu > > When cyclictest and friends were originally written, we had this view point that we "owned" the whole machine, and didn't have any restrictions on where to schedule. As machines grew in size, and we added numa awareness, and cgroups became more prominent we added this code that tried to schedule according to the ill-defined environment that we found ourselves in. As Peter points out we may have restricted ourselves more than is necessary, and can rely a bit more on the operating system to restrict us. On the otherhand using taskset is an easy workaround if the current code is to restrictive. Because we can use taskset and things are working well otherwise I don't see this as super urgent, but I am willing to revisit this code and make it less restrictive if that makes sense. I also am not a cgroup / container person, and would like to play around with this a bit more before we make some decisions on which direction to go in. Does that make sense to everyone? John
On Wed, Mar 17, 2021 at 11:08:31AM -0400, John Kacur wrote: > > > On Wed, 17 Mar 2021, Peter Xu wrote: > > > Hi, Daniel, > > > > On Wed, Mar 17, 2021 at 08:49:03AM +0100, Daniel Wagner wrote: > > > On Tue, Mar 16, 2021 at 04:07:05PM -0400, Peter Xu wrote: > > > > I think what I'm missing is why we had such a restriction. Quotting from the > > > > commit ID: > > > > > > IIRC, the current behavior allows the process to be placed into a cgroup > > > with a subset of CPUs and you just can do 'cyclictest -a -t'. Process > > > should not ignore external configuration. That's my whole point here. > > > > In that case again I think a sane solution is not to check the cpu list in > > every single tool we use, because even if we do that for all tools in rt-teets > > repo, we can't guarantee to have this check for the rest tools to not ignore > > this restriction. > > > > A simple example is: what if the user specified "taskset -c $CPU cyclictest -a > > $CPU -t 1 ..." where $CPU is not in the allowed list of current bash? As long > > as the taskset would work the so-called "environment" will be changed before > > even loading cyclictest. > > > > If you see that's the point I said we should fail at the same check point of > > sched_setaffinity() rather than checking it explicitly in the tool, because > > if we want a real-world restriction that's the only place I think it's possible.. > > > > But I'm not a cgroup/container guy, please correct me if I understood. > > > > -- > > Peter Xu > > > > > > When cyclictest and friends were originally written, we had this view > point that we "owned" the whole machine, and didn't have any restrictions > on where to schedule. As machines grew in size, and we added numa > awareness, and cgroups became more prominent we added this code that tried > to schedule according to the ill-defined environment that we found > ourselves in. > > As Peter points out we may have restricted ourselves more than is > necessary, and can rely a bit more on the operating system to restrict us. > On the otherhand using taskset is an easy workaround if the current code > is to restrictive. > > Because we can use taskset and things are working well otherwise I don't > see this as super urgent, but I am willing to revisit this code and make > it less restrictive if that makes sense. > > I also am not a cgroup / container person, and would like to play around > with this a bit more before we make some decisions on which direction to > go in. > > Does that make sense to everyone? Sure thing on my side. No bug reported so far this time, so I'll wait at least until then :) I just don't know why it's not hit just like oslat since I don't see a difference. When I fixed the oslat thing, I thought cyclictest didn't have such issue for some reason so I didn't consider to touch it at all. But when yesterday I rerun some tests I see this issue on rhel8, hence this patch. Thanks, -- Peter Xu
On Wed, 17 Mar 2021, Peter Xu wrote: > On Wed, Mar 17, 2021 at 11:08:31AM -0400, John Kacur wrote: > > > > > > On Wed, 17 Mar 2021, Peter Xu wrote: > > > > > Hi, Daniel, > > > > > > On Wed, Mar 17, 2021 at 08:49:03AM +0100, Daniel Wagner wrote: > > > > On Tue, Mar 16, 2021 at 04:07:05PM -0400, Peter Xu wrote: > > > > > I think what I'm missing is why we had such a restriction. Quotting from the > > > > > commit ID: > > > > > > > > IIRC, the current behavior allows the process to be placed into a cgroup > > > > with a subset of CPUs and you just can do 'cyclictest -a -t'. Process > > > > should not ignore external configuration. That's my whole point here. > > > > > > In that case again I think a sane solution is not to check the cpu list in > > > every single tool we use, because even if we do that for all tools in rt-teets > > > repo, we can't guarantee to have this check for the rest tools to not ignore > > > this restriction. > > > > > > A simple example is: what if the user specified "taskset -c $CPU cyclictest -a > > > $CPU -t 1 ..." where $CPU is not in the allowed list of current bash? As long > > > as the taskset would work the so-called "environment" will be changed before > > > even loading cyclictest. > > > > > > If you see that's the point I said we should fail at the same check point of > > > sched_setaffinity() rather than checking it explicitly in the tool, because > > > if we want a real-world restriction that's the only place I think it's possible.. > > > > > > But I'm not a cgroup/container guy, please correct me if I understood. > > > > > > -- > > > Peter Xu > > > > > > > > > > When cyclictest and friends were originally written, we had this view > > point that we "owned" the whole machine, and didn't have any restrictions > > on where to schedule. As machines grew in size, and we added numa > > awareness, and cgroups became more prominent we added this code that tried > > to schedule according to the ill-defined environment that we found > > ourselves in. > > > > As Peter points out we may have restricted ourselves more than is > > necessary, and can rely a bit more on the operating system to restrict us. > > On the otherhand using taskset is an easy workaround if the current code > > is to restrictive. > > > > Because we can use taskset and things are working well otherwise I don't > > see this as super urgent, but I am willing to revisit this code and make > > it less restrictive if that makes sense. > > > > I also am not a cgroup / container person, and would like to play around > > with this a bit more before we make some decisions on which direction to > > go in. > > > > Does that make sense to everyone? > > Sure thing on my side. No bug reported so far this time, so I'll wait at least > until then :) I just don't know why it's not hit just like oslat since I don't > see a difference. When I fixed the oslat thing, I thought cyclictest didn't > have such issue for some reason so I didn't consider to touch it at all. But > when yesterday I rerun some tests I see this issue on rhel8, hence this patch. > > Thanks, > > -- > Peter Xu > > Thanks Peter - I appreciate it! John
On Wed, Mar 17, 2021 at 11:08:31AM -0400, John Kacur wrote: > > I also am not a cgroup / container person, and would like to play around > with this a bit more before we make some decisions on which direction to > go in. > Pardon my ignorance, but (scheduling wise) what does an RT task inside a container/cgroup mean? AFAIK, the RT task will still command the CPU as it pleases, no matter the cgroup's cpu.shares configuration. Thanks, -- Ahmed S. Darwish
On Wed, Mar 17, 2021 at 05:40:56PM +0100, Ahmed S. Darwish wrote: > Pardon my ignorance, but (scheduling wise) what does an RT task inside a > container/cgroup mean? > > AFAIK, the RT task will still command the CPU as it pleases, no matter > the cgroup's cpu.shares configuration. It's not about scheduling it's about affinity only.
On Wed, Mar 17, 2021 at 11:47:51AM -0400, John Kacur wrote: > > > Does that make sense to everyone? > > > > Sure thing on my side. No bug reported so far this time, so I'll wait at least > > until then :) I just don't know why it's not hit just like oslat since I don't > > see a difference. When I fixed the oslat thing, I thought cyclictest didn't > > have such issue for some reason so I didn't consider to touch it at all. But > > when yesterday I rerun some tests I see this issue on rhel8, hence this patch. Don't let me stop you ripping out the code. I am not agreeing with the reasoning why it should go but it's a detail I don't care.
diff --git a/src/lib/rt-numa.c b/src/lib/rt-numa.c index babcc63..f581020 100644 --- a/src/lib/rt-numa.c +++ b/src/lib/rt-numa.c @@ -93,32 +93,6 @@ int cpu_for_thread_ua(int thread_num, int max_cpus) return 0; } -/* - * After this function is called, affinity_mask is the intersection of - * the user supplied affinity mask and the affinity mask from the run - * time environment - */ -static void use_current_cpuset(int max_cpus, struct bitmask *cpumask) -{ - struct bitmask *curmask; - int i; - - curmask = numa_allocate_cpumask(); - numa_sched_getaffinity(getpid(), curmask); - - /* - * Clear bits that are not set in both the cpuset from the - * environment, and in the user specified affinity. - */ - for (i = 0; i < max_cpus; i++) { - if ((!numa_bitmask_isbitset(cpumask, i)) || - (!numa_bitmask_isbitset(curmask, i))) - numa_bitmask_clearbit(cpumask, i); - } - - numa_bitmask_free(curmask); -} - int parse_cpumask(char *str, int max_cpus, struct bitmask **cpumask) { struct bitmask *mask; @@ -133,7 +107,6 @@ int parse_cpumask(char *str, int max_cpus, struct bitmask **cpumask) return 0; } - use_current_cpuset(max_cpus, mask); *cpumask = mask; return 0;
CPU list should allow to be any list rather than only the cores that are allowed to schedule for the current process. Before this patch, cyclictest will fail with below condition: $ taskset -pc $$ pid 2316's current affinity list: 0,2,4,6,8 $ sudo cyclictest -m -N -p 1 -a 1,3,5,7 -t 4 WARN: Couldn't setaffinity in main thread: Invalid argument After this patch, it'll be allowed to run. Cc: Daniel Wagner <dwagner@suse.de> Signed-off-by: Peter Xu <peterx@redhat.com> --- src/lib/rt-numa.c | 27 --------------------------- 1 file changed, 27 deletions(-)