Message ID | 20230713131932.133258-5-ilpo.jarvinen@linux.intel.com |
---|---|
State | New |
Series | selftests/resctrl: Fixes and cleanups |
Hi Ilpo,

On 7/14/2023 3:35 AM, Ilpo Järvinen wrote:
> On Thu, 13 Jul 2023, Reinette Chatre wrote:
>> On 7/13/2023 6:19 AM, Ilpo Järvinen wrote:
>>> Perf event fd (fd_lm) is not closed on some error paths.
>>>
>>> Always close fd_lm in get_llc_perf() and add close into an error
>>> handling block in cat_val().
>>>
>>> Fixes: 790bf585b0ee ("selftests/resctrl: Add Cache Allocation Technology (CAT) selftest")
>>> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
>>> ---
>>>  tools/testing/selftests/resctrl/cache.c | 10 +++++-----
>>>  1 file changed, 5 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/tools/testing/selftests/resctrl/cache.c b/tools/testing/selftests/resctrl/cache.c
>>> index 8a4fe8693be6..ced47b445d1e 100644
>>> --- a/tools/testing/selftests/resctrl/cache.c
>>> +++ b/tools/testing/selftests/resctrl/cache.c
>>> @@ -87,21 +87,20 @@ static int reset_enable_llc_perf(pid_t pid, int cpu_no)
>>>  static int get_llc_perf(unsigned long *llc_perf_miss)
>>>  {
>>>  	__u64 total_misses;
>>> +	int ret;
>>>
>>>  	/* Stop counters after one span to get miss rate */
>>>
>>>  	ioctl(fd_lm, PERF_EVENT_IOC_DISABLE, 0);
>>>
>>> -	if (read(fd_lm, &rf_cqm, sizeof(struct read_format)) == -1) {
>>> +	ret = read(fd_lm, &rf_cqm, sizeof(struct read_format));
>>> +	close(fd_lm);
>>> +	if (ret == -1) {
>>>  		perror("Could not get llc misses through perf");
>>> -
>>>  		return -1;
>>>  	}
>>>
>>>  	total_misses = rf_cqm.values[0].value;
>>> -
>>> -	close(fd_lm);
>>> -
>>>  	*llc_perf_miss = total_misses;
>>>
>>>  	return 0;
>>> @@ -253,6 +252,7 @@ int cat_val(struct resctrl_val_param *param)
>>>  					 memflush, operation, resctrl_val)) {
>>>  				fprintf(stderr, "Error-running fill buffer\n");
>>>  				ret = -1;
>>> +				close(fd_lm);
>>>  				break;
>>>  			}
>>>
>>
>> Instead of fixing these existing patterns I think it would make the code
>> easier to understand and maintain if it is made symmetrical.
>> Having the perf event fd opened in one place but its close()
>> scattered elsewhere has the potential for confusion and making later
>> mistakes easy to miss.
>>
>> What if perf event fd is closed in a new "disable_llc_perf()" that
>> is matched with "reset_enable_llc_perf()" and called
>> from cat_val()?
>>
>> I think this raises another issue with the test trickery where
>> measure_cache_vals() has some assumptions about state based on the
>> test name.
>
> I very much agree on the principle here, and thus I already have created
> patches which will do a major cleanup on this area. The cleaned-up code
> has pe_fd local var to cat_val() and handles closing it in cat_val() with
> the usual patterns.
>
> However, the patch currently resides post L3 CAT test rewrite.
> Backporting the cleanups/refactors into this series would require
> considerable effort due to how convoluted all those n-step cleanup patches
> and L3 CAT test rewrite are in this area. There's just very much to
> clean up here and L3 rewrite will touch the same areas so it's a net
> full of conflicts.
>
> Do you want me to spend the effort to backport them into this series
> (I expect it will take some time)?

Considering the "Fixes" tag, having a smaller fix that can easily
be backported would be ideal so I am ok with deferring a bigger
rework.

I do think this fix can be made more robust with a couple of small
changes that should not introduce significant conflicts:
* initialize fd_lm to -1
* do not close() fd_lm in get_llc_perf() but instead move its
  close() to the exit of cat_val().
* add check in get_llc_perf() that it does not attempt ioctl()
  on "fd_lm == -1" (later addition would be error checking of
  the ioctl())

> I currently have these items pending besides this series (in order):
> - L3 CAT test rewrite and its preparatory patches
> - More cleanups (including the pe_fd cleanup)
> - New generalized test framework
> - L2 CAT test

Thank you very much for taking this on.

Reinette
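For illustration only, the three suggestions above (start the fd at -1, close it in one place at the exit of the caller, and refuse to touch an fd that is still -1) can be sketched outside the selftest. This is a minimal sketch under stated assumptions, not the code in cache.c: open_counter(), read_counter(), close_counter() and run_measurement() are hypothetical names, and /dev/zero stands in for the perf event fd so that the sketch needs no perf privileges.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Stand-in for the selftest's global perf event fd; -1 means "not open". */
static int fd_lm = -1;

/* Hypothetical: plays the role of the perf_event_open() call. */
static int open_counter(void)
{
	fd_lm = open("/dev/zero", O_RDONLY);
	return fd_lm < 0 ? -1 : 0;
}

/* Hypothetical reader: refuses to use the fd unless it was opened. */
static int read_counter(unsigned long *value)
{
	unsigned long raw;

	if (fd_lm == -1) {
		fprintf(stderr, "counter fd is not open\n");
		return -1;
	}

	if (read(fd_lm, &raw, sizeof(raw)) != (ssize_t)sizeof(raw)) {
		fprintf(stderr, "could not read a full value\n");
		return -1;
	}

	*value = raw;
	return 0;
}

/* The only place that closes the fd; safe to call when nothing is open. */
static void close_counter(void)
{
	if (fd_lm != -1) {
		close(fd_lm);
		fd_lm = -1;
	}
}

/* Hypothetical caller in the role of cat_val(): one exit path, one close. */
static int run_measurement(int fail_midway, unsigned long *value)
{
	int ret;

	ret = open_counter();
	if (ret)
		return ret;

	if (fail_midway) {
		/* Simulated failure between open and read. */
		ret = -1;
		goto out;
	}

	ret = read_counter(value);
out:
	close_counter();
	return ret;
}
```

Calling run_measurement(1, &v) exercises the error path: the function returns -1 and fd_lm is back at -1 afterwards, so a later read_counter() fails cleanly instead of operating on a stale descriptor.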
On Fri, 14 Jul 2023, Reinette Chatre wrote:
> On 7/14/2023 3:35 AM, Ilpo Järvinen wrote:
> > On Thu, 13 Jul 2023, Reinette Chatre wrote:
> >> On 7/13/2023 6:19 AM, Ilpo Järvinen wrote:
> >>> Perf event fd (fd_lm) is not closed on some error paths.
> >>>
> >>> Always close fd_lm in get_llc_perf() and add close into an error
> >>> handling block in cat_val().
> >>>
> >>> Fixes: 790bf585b0ee ("selftests/resctrl: Add Cache Allocation Technology (CAT) selftest")
> >>> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
> >>> ---
> >>>  tools/testing/selftests/resctrl/cache.c | 10 +++++-----
> >>>  1 file changed, 5 insertions(+), 5 deletions(-)
> >>>
> >>> diff --git a/tools/testing/selftests/resctrl/cache.c b/tools/testing/selftests/resctrl/cache.c
> >>> index 8a4fe8693be6..ced47b445d1e 100644
> >>> --- a/tools/testing/selftests/resctrl/cache.c
> >>> +++ b/tools/testing/selftests/resctrl/cache.c
> >>> @@ -87,21 +87,20 @@ static int reset_enable_llc_perf(pid_t pid, int cpu_no)
> >>>  static int get_llc_perf(unsigned long *llc_perf_miss)
> >>>  {
> >>>  	__u64 total_misses;
> >>> +	int ret;
> >>>
> >>>  	/* Stop counters after one span to get miss rate */
> >>>
> >>>  	ioctl(fd_lm, PERF_EVENT_IOC_DISABLE, 0);
> >>>
> >>> -	if (read(fd_lm, &rf_cqm, sizeof(struct read_format)) == -1) {
> >>> +	ret = read(fd_lm, &rf_cqm, sizeof(struct read_format));
> >>> +	close(fd_lm);
> >>> +	if (ret == -1) {
> >>>  		perror("Could not get llc misses through perf");
> >>> -
> >>>  		return -1;
> >>>  	}
> >>>
> >>>  	total_misses = rf_cqm.values[0].value;
> >>> -
> >>> -	close(fd_lm);
> >>> -
> >>>  	*llc_perf_miss = total_misses;
> >>>
> >>>  	return 0;
> >>> @@ -253,6 +252,7 @@ int cat_val(struct resctrl_val_param *param)
> >>>  					 memflush, operation, resctrl_val)) {
> >>>  				fprintf(stderr, "Error-running fill buffer\n");
> >>>  				ret = -1;
> >>> +				close(fd_lm);
> >>>  				break;
> >>>  			}
> >>>
> >>
> >> Instead of fixing these existing patterns I think it would make the code
> >> easier to understand and maintain if it is made symmetrical.
> >> Having the perf event fd opened in one place but its close()
> >> scattered elsewhere has the potential for confusion and making later
> >> mistakes easy to miss.
> >>
> >> What if perf event fd is closed in a new "disable_llc_perf()" that
> >> is matched with "reset_enable_llc_perf()" and called
> >> from cat_val()?
> >>
> >> I think this raises another issue with the test trickery where
> >> measure_cache_vals() has some assumptions about state based on the
> >> test name.
> >
> > I very much agree on the principle here, and thus I already have created
> > patches which will do a major cleanup on this area. The cleaned-up code
> > has pe_fd local var to cat_val() and handles closing it in cat_val() with
> > the usual patterns.
> >
> > However, the patch currently resides post L3 CAT test rewrite.
> > Backporting the cleanups/refactors into this series would require
> > considerable effort due to how convoluted all those n-step cleanup patches
> > and L3 CAT test rewrite are in this area. There's just very much to
> > clean up here and L3 rewrite will touch the same areas so it's a net
> > full of conflicts.
> >
> > Do you want me to spend the effort to backport them into this series
> > (I expect it will take some time)?
>
> Considering the "Fixes" tag, having a smaller fix that can easily
> be backported would be ideal so I am ok with deferring a bigger
> rework.
>
> I do think this fix can be made more robust with a couple of small
> changes that should not introduce significant conflicts:
> * initialize fd_lm to -1
> * do not close() fd_lm in get_llc_perf() but instead move its
>   close() to the exit of cat_val().

I changed the test to only close the fd in cat_val(), which is the
direction the later refactor/cleanup changes (not in this series) were
moving anyway.

> * add check in get_llc_perf() that it does not attempt ioctl()
>   on "fd_lm == -1" (later addition would be error checking of
>   the ioctl())

The other two things suggested seem unnecessary and I've not implemented
them; I don't think fd_lm can be -1 at ioctl(). Given this code is going
to be replaced soonish, putting any extra "safety" effort into it now
seems a waste of time.
Hi Ilpo,

On 7/17/2023 6:05 AM, Ilpo Järvinen wrote:
> On Fri, 14 Jul 2023, Reinette Chatre wrote:

>> * add check in get_llc_perf() that it does not attempt ioctl()
>>   on "fd_lm == -1" (later addition would be error checking of
>>   the ioctl())
>
> The other two things suggested seem unnecessary and I've not implemented
> them; I don't think fd_lm can be -1 at ioctl(). Given this code is going
> to be replaced soonish, putting any extra "safety" effort into it now
> seems a waste of time.

Yes, this suggestion was indeed to make the code more robust. I certainly
do not want to waste your time. Please keep in mind when you respond that
I do not have insight into the reworks you are still planning.

Reinette
diff --git a/tools/testing/selftests/resctrl/cache.c b/tools/testing/selftests/resctrl/cache.c
index 8a4fe8693be6..ced47b445d1e 100644
--- a/tools/testing/selftests/resctrl/cache.c
+++ b/tools/testing/selftests/resctrl/cache.c
@@ -87,21 +87,20 @@ static int reset_enable_llc_perf(pid_t pid, int cpu_no)
 static int get_llc_perf(unsigned long *llc_perf_miss)
 {
 	__u64 total_misses;
+	int ret;
 
 	/* Stop counters after one span to get miss rate */
 
 	ioctl(fd_lm, PERF_EVENT_IOC_DISABLE, 0);
 
-	if (read(fd_lm, &rf_cqm, sizeof(struct read_format)) == -1) {
+	ret = read(fd_lm, &rf_cqm, sizeof(struct read_format));
+	close(fd_lm);
+	if (ret == -1) {
 		perror("Could not get llc misses through perf");
-
 		return -1;
 	}
 
 	total_misses = rf_cqm.values[0].value;
-
-	close(fd_lm);
-
 	*llc_perf_miss = total_misses;
 
 	return 0;
@@ -253,6 +252,7 @@ int cat_val(struct resctrl_val_param *param)
 					 memflush, operation, resctrl_val)) {
 				fprintf(stderr, "Error-running fill buffer\n");
 				ret = -1;
+				close(fd_lm);
 				break;
 			}
 
Perf event fd (fd_lm) is not closed on some error paths.

Always close fd_lm in get_llc_perf() and add close into an error
handling block in cat_val().

Fixes: 790bf585b0ee ("selftests/resctrl: Add Cache Allocation Technology (CAT) selftest")
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
---
 tools/testing/selftests/resctrl/cache.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)