[v3,00/10] selftests/mm: Some cleanups from trying to run them

Message ID 20250228-mm-selftests-v3-0-958e3b6f0203@google.com

Message

Brendan Jackman Feb. 28, 2025, 4:54 p.m. UTC
I never had much luck running mm selftests so I spent a few hours
digging into why.

Looks like most of the reason is missing SKIP checks, so this series
just adds a bunch of those that I found. This is nowhere near all of
them, just the ones I spotted in gup_longterm, gup_test, mmap,
userfaultfd and memfd_secret.

It's a bit unfortunate to have to skip those tests when ftruncate()
fails, but I don't have time to dig deep enough into it to actually make
them pass. I have observed the issue on 9pfs and heard rumours that NFS
has a similar problem.

I'm now able to run these test groups successfully:

- mmap
- gup_test
- compaction
- migration
- page_frag
- userfaultfd

Signed-off-by: Brendan Jackman <jackmanb@google.com>
---
Changes in v3:
- Added fix for userfaultfd tests.
- Dropped attempts to use sudo.
- Fixed garbage printf in uffd-stress.
  (Added EXTRA_CFLAGS=-Werror FORCE_TARGETS=1 to my scripts to prevent
   such errors happening again).
- Fixed missing newlines in ksft_test_result_skip() calls.
- Link to v2: https://lore.kernel.org/r/20250221-mm-selftests-v2-0-28c4d66383c5@google.com

Changes in v2 (Thanks to Dev for the reviews):
- Improve and cleanup some error messages
- Add some extra SKIPs
- Fix misnaming of nr_cpus variable in uffd tests
- Link to v1: https://lore.kernel.org/r/20250220-mm-selftests-v1-0-9bbf57d64463@google.com

---
Brendan Jackman (10):
      selftests/mm: Report errno when things fail in gup_longterm
      selftests/mm: Skip uffd-stress if userfaultfd not available
      selftests/mm: Skip uffd-wp-mremap if userfaultfd not available
      selftests/mm/uffd: Rename nr_cpus -> nr_threads
      selftests/mm: Print some details when uffd-stress gets bad params
      selftests/mm: Don't fail uffd-stress if too many CPUs
      selftests/mm: Skip map_populate on weird filesystems
      selftests/mm: Skip gup_longerm tests on weird filesystems
      selftests/mm: Drop unnecessary sudo usage
      selftests/mm: Ensure uffd-wp-mremap gets pages of each size

 tools/testing/selftests/mm/gup_longterm.c    | 45 ++++++++++++++++++----------
 tools/testing/selftests/mm/map_populate.c    |  7 +++++
 tools/testing/selftests/mm/run_vmtests.sh    | 25 ++++++++++++++--
 tools/testing/selftests/mm/uffd-common.c     |  8 ++---
 tools/testing/selftests/mm/uffd-common.h     |  2 +-
 tools/testing/selftests/mm/uffd-stress.c     | 42 ++++++++++++++++----------
 tools/testing/selftests/mm/uffd-unit-tests.c |  2 +-
 tools/testing/selftests/mm/uffd-wp-mremap.c  |  5 +++-
 8 files changed, 95 insertions(+), 41 deletions(-)
---
base-commit: 76544811c850a1f4c055aa182b513b7a843868ea
change-id: 20250220-mm-selftests-2d7d0542face

Best regards,

Comments

Brendan Jackman March 3, 2025, 10:48 a.m. UTC | #1
On Fri, Feb 28, 2025 at 10:55:00PM +0530, Dev Jain wrote:
> 
> 
> On 28/02/25 10:24 pm, Brendan Jackman wrote:
> > It's obvious that this should fail in that case, but still, save the
> > reader the effort of figuring out that they've run into this by just
> > SKIPping
> > 
> > Signed-off-by: Brendan Jackman <jackmanb@google.com>
> > ---
> >   tools/testing/selftests/mm/uffd-wp-mremap.c | 5 ++++-
> >   1 file changed, 4 insertions(+), 1 deletion(-)
> > 
> > diff --git a/tools/testing/selftests/mm/uffd-wp-mremap.c b/tools/testing/selftests/mm/uffd-wp-mremap.c
> > index 2c4f984bd73caa17e12b9f4a5bb71e7fdf5d8554..c2ba7d46c7b4581a3c32a6b6acd148e3e89c2172 100644
> > --- a/tools/testing/selftests/mm/uffd-wp-mremap.c
> > +++ b/tools/testing/selftests/mm/uffd-wp-mremap.c
> > @@ -182,7 +182,10 @@ static void test_one_folio(size_t size, bool private, bool swapout, bool hugetlb
> >   	/* Register range for uffd-wp. */
> >   	if (userfaultfd_open(&features)) {
> > -		ksft_test_result_fail("userfaultfd_open() failed\n");
> > +		if (errno == ENOENT)
> > +			ksft_test_result_skip("userfaultfd not available\n");
> > +		else
> > +			ksft_test_result_fail("userfaultfd_open() failed\n");
> >   		goto out;
> >   	}
> >   	if (uffd_register(uffd, mem, size, false, true, false)) {
> > 
> 
> I think you are correct, just want to confirm whether "uffd not available"
> if and only if "errno == ENOENT" is true. That is,
> is it possible that errno can be something else and uffd is still not
> available, 

Yeah, I strongly suspect this can happen. This is an attempt to
improve things but I don't think it's a full solution.

I've been pondering this a bit and I think it's impractical to solve
problems like this in the code of individual tests. I think the right
thing to do is either:

1. Have a centralised facility for detecting conditions like
   "userfaultfd not available" that tests can just query, so they
   say something like:

   ksft_test_requires("userfaultfd");

   Which would do some sort of actual principled check for presence
   and then skip the test with an informative message when it's not
   there. There would be a list of these "system requirements" in the
   code so you can easily see in one place what things might be needed
   to successfully run all the tests.

or

2. Specify out of band that there's a fixed set of requirements for
   running the tests and document that you shouldn't run them without
   satisfying them. Then just don't bother with SKIPs and call it user
   error.

   This would require some reasonably usable tooling for actually
   getting a system that satisfies the requirements.

But both of them require a deeper investment. I would quite like to
explore option 1 a bit but that's for a future Brendan. 
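
To make option 1 a bit more concrete, here is a rough sketch of what
such a facility might look like. Everything in it is hypothetical
(enum req_status, struct requirement, probe_userfaultfd() and
check_requirement() are names made up for illustration, not existing
kselftest API), and it assumes that calling the userfaultfd syscall
directly is an acceptable probe, which is not necessarily how the
tests open it today:

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for syscall() */
#endif
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>
#include <sys/syscall.h>

/* Hypothetical: outcome of probing one system requirement. */
enum req_status { REQ_PRESENT, REQ_MISSING, REQ_UNKNOWN };

/* Hypothetical: one entry in the central list of requirements. */
struct requirement {
	const char *name;
	enum req_status (*probe)(void);
};

/* Assumption: probe by calling the syscall directly with no flags. */
static enum req_status probe_userfaultfd(void)
{
#ifdef __NR_userfaultfd
	long fd = syscall(__NR_userfaultfd, 0);

	if (fd >= 0) {
		close((int)fd);
		return REQ_PRESENT;
	}
	/* ENOSYS: the kernel does not implement the syscall at all. */
	if (errno == ENOSYS)
		return REQ_MISSING;
	/* Anything else (e.g. EPERM) does not prove absence. */
	return REQ_UNKNOWN;
#else
	return REQ_MISSING;
#endif
}

/* The one place that lists what the tests might need. */
static const struct requirement requirements[] = {
	{ "userfaultfd", probe_userfaultfd },
};

/* Look up a requirement by name; unlisted names are REQ_UNKNOWN. */
static enum req_status check_requirement(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(requirements) / sizeof(requirements[0]); i++) {
		if (!strcmp(requirements[i].name, name))
			return requirements[i].probe();
	}
	return REQ_UNKNOWN;
}
```

A ksft_test_requires() wrapper could then call
check_requirement("userfaultfd") and turn REQ_MISSING into a
ksft_test_result_skip() with an informative message. What to do with
REQ_UNKNOWN is a policy decision that this sketch leaves open.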

In the meantime I'm just trying to get these tests running on
virtme-ng. (I'm not even gonna add all of them, because e.g. once I
noticed this one I added a `scripts/config -e USERFAULTFD` to my
script, so I won't notice if anything else is missing the check).

> or errno can be ENOENT even if uffd is available.

I think it's probably possible for this to happen too, e.g. if the
system has a perverted /dev or something. But again I think that can
only be solved with the kinda stuff I mentioned above.

Sorry for the essay :D