mbox series

[0/2] unicode: kunit: refactor selftest to kunit tests

Message ID 20240923173454.264852-1-gbittencourt@lkcamp.dev
Headers show
Series unicode: kunit: refactor selftest to kunit tests | expand

Message

Gabriela Bittencourt Sept. 23, 2024, 5:34 p.m. UTC
Hey all,

We are making these changes as part of a KUnit Hackathon at LKCamp [1].

This patch sets out to refactor fs/unicode/utf8-selftest.c to KUnit tests.

The first commit is the refactoring itself from self test into KUnit, while
the second one applies the naming style conventions.

We appreciate any feedback and suggestions. :)

(Resending patch series with the right lists on cc: kselftest and
 kunit-dev).

[1] https://lkcamp.dev/about/

Co-developed-by: Pedro Orlando <porlando@lkcamp.dev>
Co-developed-by: Danilo Pereira <dpereira@lkcamp.dev>
Signed-off-by: Pedro Orlando <porlando@lkcamp.dev>
Signed-off-by: Danilo Pereira <dpereira@lkcamp.dev>
Signed-off-by: Gabriela Bittencourt <gbittencourt@lkcamp.dev>

Gabriela Bittencourt (2):
  unicode: kunit: refactor selftest to kunit tests
  unicode: kunit: change tests filename and path

 fs/unicode/Kconfig                            |   5 +-
 fs/unicode/Makefile                           |   2 +-
 fs/unicode/tests/.kunitconfig                 |   3 +
 .../{utf8-selftest.c => tests/utf8_kunit.c}   | 152 ++++++++----------
 4 files changed, 76 insertions(+), 86 deletions(-)
 create mode 100644 fs/unicode/tests/.kunitconfig
 rename fs/unicode/{utf8-selftest.c => tests/utf8_kunit.c} (63%)

Comments

Shuah Khan Sept. 23, 2024, 7:18 p.m. UTC | #1
On 9/23/24 11:34, Gabriela Bittencourt wrote:
> Instead of creating 'test' functions, use kunit functions to test
> utf-8 support in unicode subsystem.

Can you elaborate on the reefactoring changes. This will help
others who would want to take on such refactoring in the future.

> 
> Co-developed-by: Pedro Orlando <porlando@lkcamp.dev>
> Signed-off-by: Pedro Orlando <porlando@lkcamp.dev>
> Co-developed-by: Danilo Pereira <dpereira@lkcamp.dev>
> Signed-off-by: Danilo Pereira <dpereira@lkcamp.dev>
> Signed-off-by: Gabriela Bittencourt <gbittencourt@lkcamp.dev>
> ---
>   fs/unicode/.kunitconfig    |   3 +
>   fs/unicode/Kconfig         |   5 +-
>   fs/unicode/Makefile        |   2 +-
>   fs/unicode/utf8-selftest.c | 152 +++++++++++++++++--------------------
>   4 files changed, 76 insertions(+), 86 deletions(-)
>   create mode 100644 fs/unicode/.kunitconfig
> 
> diff --git a/fs/unicode/.kunitconfig b/fs/unicode/.kunitconfig
> new file mode 100644
> index 000000000000..62dd5c171f9c
> --- /dev/null
> +++ b/fs/unicode/.kunitconfig
> @@ -0,0 +1,3 @@
> +CONFIG_KUNIT=y
> +CONFIG_UNICODE=y
> +CONFIG_UNICODE_NORMALIZATION_KUNIT_TEST=y
> diff --git a/fs/unicode/Kconfig b/fs/unicode/Kconfig
> index da786a687fdc..4ad2c36550f1 100644
> --- a/fs/unicode/Kconfig
> +++ b/fs/unicode/Kconfig
> @@ -10,6 +10,7 @@ config UNICODE
>   	  be a separate loadable module that gets requested only when a file
>   	  system actually use it.
>   
> -config UNICODE_NORMALIZATION_SELFTEST
> +config UNICODE_NORMALIZATION_KUNIT_TEST
>   	tristate "Test UTF-8 normalization support"
> -	depends on UNICODE
> +	depends on UNICODE && KUNIT
> +	default KUNIT_ALL_TESTS
> diff --git a/fs/unicode/Makefile b/fs/unicode/Makefile
> index e309afe2b2bb..37bbcbc628a1 100644
> --- a/fs/unicode/Makefile
> +++ b/fs/unicode/Makefile
> @@ -4,7 +4,7 @@ ifneq ($(CONFIG_UNICODE),)
>   obj-y			+= unicode.o
>   endif
>   obj-$(CONFIG_UNICODE)	+= utf8data.o
> -obj-$(CONFIG_UNICODE_NORMALIZATION_SELFTEST) += utf8-selftest.o
> +obj-$(CONFIG_UNICODE_NORMALIZATION_KUNIT_TEST) += utf8-selftest.o
>   
>   unicode-y := utf8-norm.o utf8-core.o
>   
> diff --git a/fs/unicode/utf8-selftest.c b/fs/unicode/utf8-selftest.c
> index 600e15efe9ed..54ded8db6b1c 100644
> --- a/fs/unicode/utf8-selftest.c
> +++ b/fs/unicode/utf8-selftest.c
> @@ -1,38 +1,18 @@
>   // SPDX-License-Identifier: GPL-2.0-only
>   /*
> - * Kernel module for testing utf-8 support.
> + * KUnit tests for utf-8 support
>    *
>    * Copyright 2017 Collabora Ltd.
>    */
>   
> -#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
> -
> -#include <linux/module.h>
> -#include <linux/printk.h>
>   #include <linux/unicode.h>
> -#include <linux/dcache.h>
> +#include <kunit/test.h>
>   
>   #include "utf8n.h"
>   
> -static unsigned int failed_tests;
> -static unsigned int total_tests;
> -
>   /* Tests will be based on this version. */
>   #define UTF8_LATEST	UNICODE_AGE(12, 1, 0)
>   
> -#define _test(cond, func, line, fmt, ...) do {				\
> -		total_tests++;						\
> -		if (!cond) {						\
> -			failed_tests++;					\
> -			pr_err("test %s:%d Failed: %s%s",		\
> -			       func, line, #cond, (fmt?":":"."));	\
> -			if (fmt)					\
> -				pr_err(fmt, ##__VA_ARGS__);		\
> -		}							\
> -	} while (0)
> -#define test_f(cond, fmt, ...) _test(cond, __func__, __LINE__, fmt, ##__VA_ARGS__)
> -#define test(cond) _test(cond, __func__, __LINE__, "")
> -
>   static const struct {
>   	/* UTF-8 strings in this vector _must_ be NULL-terminated. */
>   	unsigned char str[10];
> @@ -158,22 +138,22 @@ static const struct {
>   	}
>   };
>   
> -static ssize_t utf8len(const struct unicode_map *um, enum utf8_normalization n,
> -		const char *s)
> +static ssize_t utf8len(const struct unicode_map *um, enum utf8_normalization n, const char *s)

Keep "const char *s" on the second line.

>   {
>   	return utf8nlen(um, n, s, (size_t)-1);
>   }
>   

Rest looks good to me.

thanks,
-- Shuah
André Almeida Sept. 24, 2024, 9:30 p.m. UTC | #2
Hey!

On 9/23/24 19:34, Gabriela Bittencourt wrote:
> Instead of creating 'test' functions, use kunit functions to test
> utf-8 support in unicode subsystem.

I think it would be nice to explain in the commit message what are the 
benefits of this change, why refactoring into KUnit is a good idea?

> Co-developed-by: Pedro Orlando <porlando@lkcamp.dev>
> Signed-off-by: Pedro Orlando <porlando@lkcamp.dev>
> Co-developed-by: Danilo Pereira <dpereira@lkcamp.dev>
> Signed-off-by: Danilo Pereira <dpereira@lkcamp.dev>
> Signed-off-by: Gabriela Bittencourt <gbittencourt@lkcamp.dev>
> ---
>   fs/unicode/.kunitconfig    |   3 +
>   fs/unicode/Kconfig         |   5 +-
>   fs/unicode/Makefile        |   2 +-
>   fs/unicode/utf8-selftest.c | 152 +++++++++++++++++--------------------
>   4 files changed, 76 insertions(+), 86 deletions(-)
>   create mode 100644 fs/unicode/.kunitconfig
>
> diff --git a/fs/unicode/.kunitconfig b/fs/unicode/.kunitconfig
> new file mode 100644
> index 000000000000..62dd5c171f9c
> --- /dev/null
> +++ b/fs/unicode/.kunitconfig
> @@ -0,0 +1,3 @@
> +CONFIG_KUNIT=y
> +CONFIG_UNICODE=y
> +CONFIG_UNICODE_NORMALIZATION_KUNIT_TEST=y
> diff --git a/fs/unicode/Kconfig b/fs/unicode/Kconfig
> index da786a687fdc..4ad2c36550f1 100644
> --- a/fs/unicode/Kconfig
> +++ b/fs/unicode/Kconfig
> @@ -10,6 +10,7 @@ config UNICODE
>   	  be a separate loadable module that gets requested only when a file
>   	  system actually use it.
>   
> -config UNICODE_NORMALIZATION_SELFTEST
> +config UNICODE_NORMALIZATION_KUNIT_TEST
>   	tristate "Test UTF-8 normalization support"
> -	depends on UNICODE
> +	depends on UNICODE && KUNIT
> +	default KUNIT_ALL_TESTS
> diff --git a/fs/unicode/Makefile b/fs/unicode/Makefile
> index e309afe2b2bb..37bbcbc628a1 100644
> --- a/fs/unicode/Makefile
> +++ b/fs/unicode/Makefile
> @@ -4,7 +4,7 @@ ifneq ($(CONFIG_UNICODE),)
>   obj-y			+= unicode.o
>   endif
>   obj-$(CONFIG_UNICODE)	+= utf8data.o
> -obj-$(CONFIG_UNICODE_NORMALIZATION_SELFTEST) += utf8-selftest.o
> +obj-$(CONFIG_UNICODE_NORMALIZATION_KUNIT_TEST) += utf8-selftest.o
>   
>   unicode-y := utf8-norm.o utf8-core.o
>   
> diff --git a/fs/unicode/utf8-selftest.c b/fs/unicode/utf8-selftest.c
> index 600e15efe9ed..54ded8db6b1c 100644
> --- a/fs/unicode/utf8-selftest.c
> +++ b/fs/unicode/utf8-selftest.c
> @@ -1,38 +1,18 @@
>   // SPDX-License-Identifier: GPL-2.0-only
>   /*
> - * Kernel module for testing utf-8 support.
> + * KUnit tests for utf-8 support
>    *
>    * Copyright 2017 Collabora Ltd.
>    */
>   
> -#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
> -
> -#include <linux/module.h>
> -#include <linux/printk.h>
>   #include <linux/unicode.h>
> -#include <linux/dcache.h>
> +#include <kunit/test.h>
>   
>   #include "utf8n.h"
>   
> -static unsigned int failed_tests;
> -static unsigned int total_tests;
> -
>   /* Tests will be based on this version. */
>   #define UTF8_LATEST	UNICODE_AGE(12, 1, 0)
>   
> -#define _test(cond, func, line, fmt, ...) do {				\
> -		total_tests++;						\
> -		if (!cond) {						\
> -			failed_tests++;					\
> -			pr_err("test %s:%d Failed: %s%s",		\
> -			       func, line, #cond, (fmt?":":"."));	\
> -			if (fmt)					\
> -				pr_err(fmt, ##__VA_ARGS__);		\
> -		}							\
> -	} while (0)
> -#define test_f(cond, fmt, ...) _test(cond, __func__, __LINE__, fmt, ##__VA_ARGS__)
> -#define test(cond) _test(cond, __func__, __LINE__, "")
> -
>   static const struct {
>   	/* UTF-8 strings in this vector _must_ be NULL-terminated. */
>   	unsigned char str[10];
> @@ -158,22 +138,22 @@ static const struct {
>   	}
>   };
>   
> -static ssize_t utf8len(const struct unicode_map *um, enum utf8_normalization n,
> -		const char *s)
> +static ssize_t utf8len(const struct unicode_map *um, enum utf8_normalization n, const char *s)
>   {
>   	return utf8nlen(um, n, s, (size_t)-1);
>   }
>   
>   static int utf8cursor(struct utf8cursor *u8c, const struct unicode_map *um,
> -		enum utf8_normalization n, const char *s)
> +		      enum utf8_normalization n, const char *s)
>   {
>   	return utf8ncursor(u8c, um, n, s, (unsigned int)-1);
>   }
>   
> -static void check_utf8_nfdi(struct unicode_map *um)
> +static void check_utf8_nfdi(struct kunit *test)
>   {
>   	int i;
>   	struct utf8cursor u8c;
> +	struct unicode_map *um = test->priv;
>   
>   	for (i = 0; i < ARRAY_SIZE(nfdi_test_data); i++) {
>   		int len = strlen(nfdi_test_data[i].str);
> @@ -181,28 +161,29 @@ static void check_utf8_nfdi(struct unicode_map *um)
>   		int j = 0;
>   		unsigned char c;
>   
> -		test((utf8len(um, UTF8_NFDI, nfdi_test_data[i].str) == nlen));
> -		test((utf8nlen(um, UTF8_NFDI, nfdi_test_data[i].str, len) ==
> -			nlen));
> +		KUNIT_EXPECT_EQ(test, utf8len(um, UTF8_NFDI, nfdi_test_data[i].str), nlen);
> +		KUNIT_EXPECT_EQ(test, utf8nlen(um, UTF8_NFDI, nfdi_test_data[i].str, len),
> +				nlen);
>   
> -		if (utf8cursor(&u8c, um, UTF8_NFDI, nfdi_test_data[i].str) < 0)
> -			pr_err("can't create cursor\n");
> +		KUNIT_EXPECT_GE_MSG(test, utf8cursor(&u8c, um, UTF8_NFDI, nfdi_test_data[i].str),
> +				    0, "Can't create cursor\n");
>   
>   		while ((c = utf8byte(&u8c)) > 0) {
> -			test_f((c == nfdi_test_data[i].dec[j]),
> -			       "Unexpected byte 0x%x should be 0x%x\n",
> -			       c, nfdi_test_data[i].dec[j]);
> +			KUNIT_EXPECT_EQ_MSG(test, c, nfdi_test_data[i].dec[j],
> +					    "Unexpected byte 0x%x should be 0x%x\n",
> +					    c, nfdi_test_data[i].dec[j]);
>   			j++;
>   		}
>   
> -		test((j == nlen));
> +		KUNIT_EXPECT_EQ(test, j, nlen);
>   	}
>   }
>   
> -static void check_utf8_nfdicf(struct unicode_map *um)
> +static void check_utf8_nfdicf(struct kunit *test)
>   {
>   	int i;
>   	struct utf8cursor u8c;
> +	struct unicode_map *um = test->priv;
>   
>   	for (i = 0; i < ARRAY_SIZE(nfdicf_test_data); i++) {
>   		int len = strlen(nfdicf_test_data[i].str);
> @@ -210,29 +191,30 @@ static void check_utf8_nfdicf(struct unicode_map *um)
>   		int j = 0;
>   		unsigned char c;
>   
> -		test((utf8len(um, UTF8_NFDICF, nfdicf_test_data[i].str) ==
> -				nlen));
> -		test((utf8nlen(um, UTF8_NFDICF, nfdicf_test_data[i].str, len) ==
> -				nlen));
> +		KUNIT_EXPECT_EQ(test, utf8len(um, UTF8_NFDICF, nfdicf_test_data[i].str),
> +				nlen);
> +		KUNIT_EXPECT_EQ(test, utf8nlen(um, UTF8_NFDICF, nfdicf_test_data[i].str, len),
> +				nlen);
>   
> -		if (utf8cursor(&u8c, um, UTF8_NFDICF,
> -				nfdicf_test_data[i].str) < 0)
> -			pr_err("can't create cursor\n");
> +		KUNIT_EXPECT_GE_MSG(test,
> +				    utf8cursor(&u8c, um, UTF8_NFDICF, nfdicf_test_data[i].str),
> +				    0, "Can't create cursor\n");
>   
>   		while ((c = utf8byte(&u8c)) > 0) {
> -			test_f((c == nfdicf_test_data[i].ncf[j]),
> -			       "Unexpected byte 0x%x should be 0x%x\n",
> -			       c, nfdicf_test_data[i].ncf[j]);
> +			KUNIT_EXPECT_EQ_MSG(test, c, nfdicf_test_data[i].ncf[j],
> +					    "Unexpected byte 0x%x should be 0x%x\n",
> +					    c, nfdicf_test_data[i].ncf[j]);
>   			j++;
>   		}
>   
> -		test((j == nlen));
> +		KUNIT_EXPECT_EQ(test, j, nlen);
>   	}
>   }
>   
> -static void check_utf8_comparisons(struct unicode_map *table)
> +static void check_utf8_comparisons(struct kunit *test)
>   {
>   	int i;
> +	struct unicode_map *um = test->priv;
>   
>   	for (i = 0; i < ARRAY_SIZE(nfdi_test_data); i++) {
>   		const struct qstr s1 = {.name = nfdi_test_data[i].str,
> @@ -240,8 +222,9 @@ static void check_utf8_comparisons(struct unicode_map *table)
>   		const struct qstr s2 = {.name = nfdi_test_data[i].dec,
>   					.len = sizeof(nfdi_test_data[i].dec)};
>   
> -		test_f(!utf8_strncmp(table, &s1, &s2),
> -		       "%s %s comparison mismatch\n", s1.name, s2.name);
> +		// strncmp returns 0 when strings are equal
> +		KUNIT_EXPECT_EQ_MSG(test, utf8_strncmp(um, &s1, &s2), 0,
> +				    "%s %s comparison mismatch\n", s1.name, s2.name);
>   	}
>   
>   	for (i = 0; i < ARRAY_SIZE(nfdicf_test_data); i++) {
> @@ -250,62 +233,65 @@ static void check_utf8_comparisons(struct unicode_map *table)
>   		const struct qstr s2 = {.name = nfdicf_test_data[i].ncf,
>   					.len = sizeof(nfdicf_test_data[i].ncf)};
>   
> -		test_f(!utf8_strncasecmp(table, &s1, &s2),
> -		       "%s %s comparison mismatch\n", s1.name, s2.name);
> +		// strncasecmp returns 0 when strings are equal
> +		KUNIT_EXPECT_EQ_MSG(test, utf8_strncasecmp(um, &s1, &s2), 0,
> +				    "%s %s comparison mismatch\n", s1.name, s2.name);
>   	}
>   }
>   
> -static void check_supported_versions(struct unicode_map *um)
> +static void check_supported_versions(struct kunit *test)
>   {
> +	struct unicode_map *um = test->priv;
>   	/* Unicode 7.0.0 should be supported. */
> -	test(utf8version_is_supported(um, UNICODE_AGE(7, 0, 0)));
> +	KUNIT_EXPECT_TRUE(test, utf8version_is_supported(um, UNICODE_AGE(7, 0, 0)));
>   
>   	/* Unicode 9.0.0 should be supported. */
> -	test(utf8version_is_supported(um, UNICODE_AGE(9, 0, 0)));
> +	KUNIT_EXPECT_TRUE(test, utf8version_is_supported(um, UNICODE_AGE(9, 0, 0)));
>   
>   	/* Unicode 1x.0.0 (the latest version) should be supported. */
> -	test(utf8version_is_supported(um, UTF8_LATEST));
> +	KUNIT_EXPECT_TRUE(test, utf8version_is_supported(um, UTF8_LATEST));
>   
>   	/* Next versions don't exist. */
> -	test(!utf8version_is_supported(um, UNICODE_AGE(13, 0, 0)));
> -	test(!utf8version_is_supported(um, UNICODE_AGE(0, 0, 0)));
> -	test(!utf8version_is_supported(um, UNICODE_AGE(-1, -1, -1)));
> +	KUNIT_EXPECT_FALSE(test, utf8version_is_supported(um, UNICODE_AGE(13, 0, 0)));
> +	KUNIT_EXPECT_FALSE(test, utf8version_is_supported(um, UNICODE_AGE(0, 0, 0)));
> +	KUNIT_EXPECT_FALSE(test, utf8version_is_supported(um, UNICODE_AGE(-1, -1, -1)));
>   }
>   
> -static int __init init_test_ucd(void)
> +static struct kunit_case unicode_normalization_test_cases[] = {
> +	KUNIT_CASE(check_supported_versions),
> +	KUNIT_CASE(check_utf8_comparisons),
> +	KUNIT_CASE(check_utf8_nfdicf),
> +	KUNIT_CASE(check_utf8_nfdi),
> +	{}
> +};
> +
> +static int init_test_ucd(struct kunit *test)
>   {
> -	struct unicode_map *um;
> +	struct unicode_map *um = utf8_load(UTF8_LATEST);
>   
> -	failed_tests = 0;
> -	total_tests = 0;
> +	test->priv = um;
>   
> -	um = utf8_load(UTF8_LATEST);
> -	if (IS_ERR(um)) {
> -		pr_err("%s: Unable to load utf8 table.\n", __func__);
> -		return PTR_ERR(um);
> -	}
> +	KUNIT_EXPECT_EQ_MSG(test, IS_ERR(um), 0,
> +			    "%s: Unable to load utf8 table.\n", __func__);
>   
> -	check_supported_versions(um);
> -	check_utf8_nfdi(um);
> -	check_utf8_nfdicf(um);
> -	check_utf8_comparisons(um);
> -
> -	if (!failed_tests)
> -		pr_info("All %u tests passed\n", total_tests);
> -	else
> -		pr_err("%u out of %u tests failed\n", failed_tests,
> -		       total_tests);
> -	utf8_unload(um);
>   	return 0;
>   }
>   
> -static void __exit exit_test_ucd(void)
> +static void exit_test_ucd(struct kunit *test)
>   {
> +	utf8_unload(test->priv);
>   }
>   
> -module_init(init_test_ucd);
> -module_exit(exit_test_ucd);
> +static struct kunit_suite unicode_normalization_test_suite = {
> +	.name = "unicode_normalization",
> +	.test_cases = unicode_normalization_test_cases,
> +	.init = init_test_ucd,
> +	.exit = exit_test_ucd,
> +};
> +
> +kunit_test_suite(unicode_normalization_test_suite);
> +
>   
>   MODULE_AUTHOR("Gabriel Krisman Bertazi <krisman@collabora.co.uk>");
> -MODULE_DESCRIPTION("Kernel module for testing utf-8 support");
> +MODULE_DESCRIPTION("KUnit tests for utf-8 support");
>   MODULE_LICENSE("GPL");
Gabriel Krisman Bertazi Sept. 24, 2024, 9:40 p.m. UTC | #3
André Almeida <andrealmeid@riseup.net> writes:

> Hey!
>
> On 9/23/24 19:34, Gabriela Bittencourt wrote:
>> Instead of creating 'test' functions, use kunit functions to test
>> utf-8 support in unicode subsystem.
>
> I think it would be nice to explain in the commit message what are the
> benefits of this change, why refactoring into KUnit is a good idea?

TBH, I wouldn't mind dropping this code altogether.  We have similar
coverage in fstests already. It was useful when developing fs/unicode
but nowadays, it seems moot.
André Almeida Sept. 24, 2024, 10:16 p.m. UTC | #4
Em 24 de setembro de 2024 23:40:27 GMT+02:00, Gabriel Krisman Bertazi <gabriel@krisman.be> escreveu:
>André Almeida <andrealmeid@riseup.net> writes:
>
>> Hey!
>>
>> On 9/23/24 19:34, Gabriela Bittencourt wrote:
>>> Instead of creating 'test' functions, use kunit functions to test
>>> utf-8 support in unicode subsystem.
>>
>> I think it would be nice to explain in the commit message what are the
>> benefits of this change, why refactoring into KUnit is a good idea?
>
>TBH, I wouldn't mind dropping this code altogether.  We have similar
>coverage in fstests already. It was useful when developing fs/unicode
>but nowadays, it seems moot.
>

I see, I still find it useful when doing Unicode changes. Even if userspace tests show that something is broken, unit testing might be more precise in that regard.

I'm sure David also has some good arguments about that matter too :D