Message ID | 20201216105906.6607-1-luca.dariz@hitachi-powergrids.com |
---|---|
State | New |
Series | [v2] hwrng: fix khwrng lifecycle |
On Wed, Dec 16, 2020 at 11:59:06AM +0100, Luca Dariz wrote:
>
> @@ -432,12 +433,15 @@ static int hwrng_fillfn(void *unused)
>  {
>  	long rc;
>
> +	complete(&hwrng_started);
>  	while (!kthread_should_stop()) {
>  		struct hwrng *rng;
>
>  		rng = get_current_rng();
> -		if (IS_ERR(rng) || !rng)
> -			break;
> +		if (IS_ERR(rng) || !rng) {
> +			msleep_interruptible(10);
> +			continue;

Please fix this properly with reference counting.

Cheers,
--
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

>On Wed, Dec 16, 2020 at 11:59:06AM +0100, Luca Dariz wrote:
>>
>> @@ -432,12 +433,15 @@ static int hwrng_fillfn(void *unused)
>>  {
>>  	long rc;
>>
>> +	complete(&hwrng_started);
>>  	while (!kthread_should_stop()) {
>>  		struct hwrng *rng;
>>
>>  		rng = get_current_rng();
>> -		if (IS_ERR(rng) || !rng)
>> -			break;
>> +		if (IS_ERR(rng) || !rng) {
>> +			msleep_interruptible(10);
>> +			continue;
>
>Please fix this properly with reference counting.

I thought a bit more about it, but I always find a potential race condition
with kthread_stop() and the hwrng_fill NULL pointer check. In my opinion the
thread termination should be only triggered with kthread_stop(), otherwise it
might be called with an invalid or NULL hwrng_fill.

Am I missing something?

Thanks
Luca

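The position argued in the reply above (the worker should exit only when explicitly asked to stop) can be sketched in userspace, assuming pthreads in place of kthreads. This is only an illustration of the control flow, not the kernel code and not a reproduction of the race: `struct source`, `current_source`, `stop_requested`, `fill_worker()` and `run_demo()` are hypothetical names, and `sched_yield()` stands in for `msleep_interruptible(10)`.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct hwrng. */
struct source {
	int (*read)(void);
};

static int read_one(void) { return 4; }
static struct source demo_source = { read_one };

static _Atomic(struct source *) current_source; /* NULL: no source registered */
static atomic_bool stop_requested;              /* stand-in for kthread_should_stop() */
static atomic_int idle_polls;                   /* loop passes that found no source */
static atomic_int reads_done;                   /* loop passes that read from a source */

static void *fill_worker(void *unused)
{
	(void)unused;
	while (!atomic_load(&stop_requested)) {
		struct source *src = atomic_load(&current_source);

		if (!src) {
			/* No source: idle and re-check instead of exiting. */
			atomic_fetch_add(&idle_polls, 1);
			sched_yield();
			continue;
		}
		(void)src->read();
		atomic_fetch_add(&reads_done, 1);
	}
	return NULL;
}

/* Runs the worker with no source, then with one, then stops it. 0 on success. */
static int run_demo(void)
{
	pthread_t tid;

	atomic_store(&stop_requested, false);
	atomic_store(&current_source, (struct source *)NULL);
	if (pthread_create(&tid, NULL, fill_worker, NULL) != 0)
		return -1;

	while (atomic_load(&idle_polls) == 0)   /* worker is alive and idling */
		sched_yield();
	atomic_store(&current_source, &demo_source);
	while (atomic_load(&reads_done) == 0)   /* worker picked up the new source */
		sched_yield();

	/* The only exit path: an explicit stop request followed by a join. */
	atomic_store(&stop_requested, true);
	return pthread_join(tid, NULL);
}
```

Because the worker never returns on its own, the stopping side always joins a thread that is still running its loop, which is the property the reply asks for. The price is a thread that keeps polling while no source is registered.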
diff --git a/drivers/char/hw_random/core.c b/drivers/char/hw_random/core.c
index 8c1c47dd9f46..367b122c1d70 100644
--- a/drivers/char/hw_random/core.c
+++ b/drivers/char/hw_random/core.c
@@ -31,6 +31,7 @@ static struct hwrng *current_rng;
 /* the current rng has been explicitly chosen by user via sysfs */
 static int cur_rng_set_by_user;
 static struct task_struct *hwrng_fill;
+static struct completion hwrng_started = COMPLETION_INITIALIZER(hwrng_started);
 /* list of registered rngs, sorted decending by quality */
 static LIST_HEAD(rng_list);
 /* Protects rng_list and current_rng */
@@ -432,12 +433,15 @@ static int hwrng_fillfn(void *unused)
 {
 	long rc;
 
+	complete(&hwrng_started);
 	while (!kthread_should_stop()) {
 		struct hwrng *rng;
 
 		rng = get_current_rng();
-		if (IS_ERR(rng) || !rng)
-			break;
+		if (IS_ERR(rng) || !rng) {
+			msleep_interruptible(10);
+			continue;
+		}
 		mutex_lock(&reading_mutex);
 		rc = rng_get_data(rng, rng_fillbuf,
 				  rng_buffer_size(), 1);
@@ -462,6 +466,8 @@ static void start_khwrngd(void)
 	if (IS_ERR(hwrng_fill)) {
 		pr_err("hwrng_fill thread creation failed\n");
 		hwrng_fill = NULL;
+	} else {
+		wait_for_completion(&hwrng_started);
 	}
 }

There are two issues with the management of the kernel thread to gather
entropy:

* it can terminate also if the rng is removed, and in this case it doesn't
  synchronize with kthread_should_stop(), but it directly sets hwrng_fill
  to NULL. If this happens after the NULL check but before kthread_stop()
  is called, we'll have a NULL pointer dereference.
* if we have a register/unregister too fast, it can happen that the kthread
  is not yet started when kthread_stop is called, and this seems to leave a
  corrupted or uninitialized kthread struct. This is detected by the
  WARN_ON at kernel/kthread.c:75 and later causes a page domain fault.

CC: Matt Mackall <mpm@selenic.com>
CC: Herbert Xu <herbert@gondor.apana.org.au>
CC: Colin Ian King <colin.king@canonical.com>
CC: Holger Brunck <holger.brunck@hitachi-powergrids.com>
CC: Valentin Longchamp <valentin.longchamp@hitachi-powergrids.com>
Signed-off-by: Luca Dariz <luca.dariz@hitachi-powergrids.com>
---
v2:
* reduced sleep from 10s to 10ms in case there is no rng; the termination
  should be faster in this case as it could block a pending register or
  unregister.

 drivers/char/hw_random/core.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)
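
The second issue, stopping a thread before it has started running, is addressed in the patch with a completion. A rough userspace sketch of that handshake follows, assuming pthreads and a hand-rolled `struct demo_completion` built from a mutex and a condition variable. Every name here is hypothetical and only mirrors the shape of the kernel's `complete()` / `wait_for_completion()` pair. In userspace `pthread_join()` is safe even on a thread that has not run yet, so this shows the ordering only and does not reproduce the kernel fault.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the kernel's struct completion. */
struct demo_completion {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	unsigned int done;
};

#define DEMO_COMPLETION_INIT \
	{ PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0 }

static void demo_complete(struct demo_completion *c)
{
	pthread_mutex_lock(&c->lock);
	c->done++;
	pthread_cond_signal(&c->cond);
	pthread_mutex_unlock(&c->lock);
}

static void demo_wait_for_completion(struct demo_completion *c)
{
	pthread_mutex_lock(&c->lock);
	while (c->done == 0)
		pthread_cond_wait(&c->cond, &c->lock);
	c->done--;                      /* consume one completion */
	pthread_mutex_unlock(&c->lock);
}

static struct demo_completion worker_started = DEMO_COMPLETION_INIT;
static atomic_bool should_stop;

static void *fill_thread(void *unused)
{
	(void)unused;
	demo_complete(&worker_started); /* first action: report that we are running */
	while (!atomic_load(&should_stop))
		;                       /* the real loop would gather entropy here */
	return NULL;
}

/* Starts the worker and stops it right away. Returns 0 on success. */
static int start_then_stop(void)
{
	pthread_t tid;

	atomic_store(&should_stop, false);
	if (pthread_create(&tid, NULL, fill_thread, NULL) != 0)
		return -1;

	/* Do not return to the caller until the worker has entered its function. */
	demo_wait_for_completion(&worker_started);

	atomic_store(&should_stop, true);
	return pthread_join(tid, NULL);
}
```

Calling `start_then_stop()` repeatedly models a fast register/unregister sequence: each start blocks until the worker has signalled, so a stop request can never arrive before the thread body has begun.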