Message ID | cover.1673539719.git.ydroneaud@opteya.com |
---|---|
Headers | show |
Series | random: a simple vDSO mechanism for reseeding userspace CSPRNGs | expand |
Sorry Yann, but I'm not interested in this approach, and I don't think reviewing the details of it are a good allocation of time. I don't want to lock the kernel into having specific reseeding semantics that are a contract with userspace, which is what this approach does. Please just let me iterate on my original patchset for a little bit, without adding more junk to the already overly large conversation.
[...] >GRND_TIMESTAMP allows userspace to ask the kernel if previous >"timestamp" has changed as the result of an event that >triggered kernel CSPRNG reseed, and to update the "timestamp". >In case the "timestamp" hasn't changed, userspace CSPRNG can >consume a slice of its buffered random stream. >If it has changed, remaining userspace buffered random values >should be discarded. Userspace should call getrandom() to fill >and/or generate its buffer with updated seed. >It's advised to test again the "timestamp" to deal with the >race condition, where the kernel reseed just after the call >to getrandom() to get entropy. This second check is the 'safe thing' to do, but if you're that worried about race conditions then this api is useless. You can't ignore the inherent TOCTOU problem with the GRND_TIMESTAMP calls. That said, the race condition is so tiny that the added overhead of a third syscall (without vDSO support) starts to negate the value in buffering the random numbers (not to mention adding significant latency to every nth arc4random() call, for example). IMO, for callers that cannot accept the risk, the getrandom(,,0) option is the perfect alternative. [...] >+static ssize_t get_random_timestamp(char __user *ubuf, size_t len, unsigned int flags) >+{ >+ u64 ts; >+ >+ /* other combination not supported */ >+ if (WARN(flags != GRND_TIMESTAMP, "GRND_TIMESTAMP cannot be used with other flags")) >+ return -EINVAL; If userspace messes up the flags then it's the problem of the caller. Why clutter the logs in that case? At the very least this should be WARN_ONCE() to avoid log spam. >+ /* shorter structure not supported */ >+ if (len < sizeof(ts)) >+ return -EINVAL; This should be sizeof(u64) to match the vDSO patch and to avoid having to change this condition if ts becomes larger. >+ >+ if (copy_from_user(&ts, ubuf, sizeof(ts))) >+ return -EFAULT; >+ >+ /* longer structure supported, only if 0-padded, >+ timestamp might be extended in the future with more fields */ >+ if (len > sizeof(ts)) { >+ char __user *p = ubuf + sizeof(ts); >+ size_t l = len - sizeof(ts); >+ >+ while (l) { >+ char b; >+ >+ if (get_user(b, p++)) >+ return -EFAULT; >+ >+ if (b) >+ return -EINVAL; >+ } >+ } >+ >+ if (!get_random_timestamp_update(&ts, READ_ONCE(base_crng.generation))) >+ return 0; >+ >+ if (copy_to_user(ubuf, &ts, sizeof(ts))) >+ return -EFAULT; >+ >+ return sizeof(ts); >+} >+ > SYSCALL_DEFINE3(getrandom, char __user *, ubuf, size_t, len, unsigned int, flags) > { > struct iov_iter iter; > struct iovec iov; > int ret; > >- if (flags & ~(GRND_NONBLOCK | GRND_RANDOM | GRND_INSECURE)) >+ if (flags & ~(GRND_NONBLOCK | GRND_RANDOM | GRND_INSECURE | GRND_TIMESTAMP)) > return -EINVAL; > >+ if (unlikely(flags & GRND_TIMESTAMP)) >+ return get_random_timestamp(ubuf, len, flags); >+ I'd remove the unlikely(). I don't like to assume usage patterns. > /* > * Requesting insecure and blocking randomness at the same time makes > * no sense. >diff --git a/include/linux/random.h b/include/linux/random.h >index b0a940af4fff..bc219b5a96a5 100644 >--- a/include/linux/random.h >+++ b/include/linux/random.h >@@ -161,4 +161,35 @@ int random_online_cpu(unsigned int cpu); > extern const struct file_operations random_fops, urandom_fops; > #endif > >+/* >+ * get_random_timestamp_update() >+ * >+ * @generation: current CRNG generation (from base_crng.generation >+ * or _vdso_rng_data.generation) >+ * >+ * Return: timestamp size if previous timestamp is expired and is updated, >+ * 0 if not expired (and not updated) >+ */ >+static inline bool get_random_timestamp_update(u64 *user_ts, >+ u64 generation) >+{ >+ u64 ts; >+ >+ /* base_crng.generation is never ULONG_MAX, >+ * OTOH userspace will initialize its timestamp >+ * to 0, so inverting base_crng.generation ensure >+ * first call to getrandom(,,GRND_TIMESTAMP) will >+ * update >+ */ Rather than assuming that the timestamp will start zero-initilized, expect it to be uninitilized. Either way the code works. >+ ts = ~generation; >+ >+ /* not expired ? no refresh suggested */ >+ if (*user_ts == ts) >+ return false; >+ >+ *user_ts = ts; >+ >+ return true; >+} >+ Not that it matters much, but you could make generation a u64* that gets dereferenced by get_random_timestamp_update(). It's cleaner for the caller, barely changes this function, and will get inlined anyway. I might just be imposing my personal style in this case though. After a cursory skimming of the rest of the series I think that this is a worthwhile direction to pursue. Jason's series is growing bulky and this provides the needed slimming while solving the root problem. The only thing I see immediately is the TOCTOU problem and the extra steps needed to guarantee forward secrecy. I should mention that I'm not a security or rng expert at all. Cheers, Peter Lafreniere <peter@n8pjl.ca>
On 1/12/23 11:55, Yann Droneaud wrote: > Hi > > 12 janvier 2023 à 18:07 "Jason A. Donenfeld" <Jason@zx2c4.com> a écrit: > >> Sorry Yann, but I'm not interested in this approach, and I don't think >> reviewing the details of it are a good allocation of time. I don't >> want to lock the kernel into having specific reseeding semantics that >> are a contract with userspace, which is what this approach does. > > This patch adds a mean for the kernel to tell userspace: between the > last time you call us with getrandom(timestamp,, GRND_TIMESTAMP), > something happened that trigger an update to the opaque cookie given > to getrandom(timestamp, GRND_TIMESTAMP). When such update happen, > userspace is advised to discard buffered random data and retry. > > The meaning of the timestamp cookie is up to the kernel, and can be > changed anytime. Userspace is not expected to read the content of this > blob. Userspace only acts on the length returned by getrandom(,, GRND_TIMESTAMP): > -1 : not supported > 0 : cookie not updated, no need to discard buffered data > >0 : cookie updated, userspace should discard buffered data > > For the cookie, I've used a single u64, but two u64 could be a better start, > providing room for implementing improved behavior in future kernel versions. > >> Please just let me iterate on my original patchset for a little bit, >> without adding more junk to the already overly large conversation. > > I like the simplicity of my so called "junk". It's streamlined, doesn't > require a new syscall, doesn't require a new copy of ChaCha20 code. > > I'm sorry it doesn't fit your expectations. > Why would anything more than a 64-bit counter be ever necessary? It only needs to be incremented. Let user space manage keeping track of the cookie matching its own buffers. You do NOT want this to be stateful, because that's just begging for multiple libraries to step on each other. Export the cookie from the vdso and volià, a very cheap check around any user space randomness buffer will work: static clone_cookie_t last_cookie; clone_cookie_t this_cookie; this_cookie = get_clone_cookie(); do { while (this_cookie != last_cookie) { last_cookie = this_cookie; reinit_randomness(); this_cookie = get_clone_cookie(); } extract_randomness_from_buffer(); this_cookie = get_clone_cookie(); } while (this_cookie != last_cookie); last_cookie = this_cookie; -hpa
> On Jan 13, 2023, at 7:16 PM, H. Peter Anvin <hpa@zytor.com> wrote: > > On 1/12/23 11:55, Yann Droneaud wrote: >> Hi >> 12 janvier 2023 à 18:07 "Jason A. Donenfeld" <Jason@zx2c4.com> a écrit: >> >>> Sorry Yann, but I'm not interested in this approach, and I don't think >>> reviewing the details of it are a good allocation of time. I don't >>> want to lock the kernel into having specific reseeding semantics that >>> are a contract with userspace, which is what this approach does. >> This patch adds a mean for the kernel to tell userspace: between the >> last time you call us with getrandom(timestamp,, GRND_TIMESTAMP), >> something happened that trigger an update to the opaque cookie given >> to getrandom(timestamp, GRND_TIMESTAMP). When such update happen, >> userspace is advised to discard buffered random data and retry. >> The meaning of the timestamp cookie is up to the kernel, and can be >> changed anytime. Userspace is not expected to read the content of this >> blob. Userspace only acts on the length returned by getrandom(,, GRND_TIMESTAMP): >> -1 : not supported >> 0 : cookie not updated, no need to discard buffered data >> >0 : cookie updated, userspace should discard buffered data >> For the cookie, I've used a single u64, but two u64 could be a better start, >> providing room for implementing improved behavior in future kernel versions. >>> Please just let me iterate on my original patchset for a little bit, >>> without adding more junk to the already overly large conversation. >> I like the simplicity of my so called "junk". It's streamlined, doesn't >> require a new syscall, doesn't require a new copy of ChaCha20 code. >> I'm sorry it doesn't fit your expectations. > > Why would anything more than a 64-bit counter be ever necessary? It only needs to be incremented. This is completely broken with CRIU or, for that matter, with VM forking. > > Let user space manage keeping track of the cookie matching its own buffers. You do NOT want this to be stateful, because that's just begging for multiple libraries to step on each other. > > Export the cookie from the vdso and volià, a very cheap check around any user space randomness buffer will work: > > static clone_cookie_t last_cookie; > clone_cookie_t this_cookie; > > this_cookie = get_clone_cookie(); > do { > while (this_cookie != last_cookie) { > last_cookie = this_cookie; > reinit_randomness(); > this_cookie = get_clone_cookie(); > } > > extract_randomness_from_buffer(); > this_cookie = get_clone_cookie(); > } while (this_cookie != last_cookie); > > last_cookie = this_cookie; > > -hpa
Hi, 16 janvier 2023 à 20:50 "Andy Lutomirski" <luto@amacapital.net> a écrit: > > On Jan 13, 2023, at 7:16 PM, H. Peter Anvin <hpa@zytor.com> wrote: > > On 1/12/23 11:55, Yann Droneaud wrote: > > > 12 janvier 2023 à 18:07 "Jason A. Donenfeld" <Jason@zx2c4.com> a écrit: > > > > > > > Sorry Yann, but I'm not interested in this approach, and I don't think > > reviewing the details of it are a good allocation of time. I don't > > want to lock the kernel into having specific reseeding semantics that > > are a contract with userspace, which is what this approach does. > > > > > > > > This patch adds a mean for the kernel to tell userspace: between the > > > last time you call us with getrandom(timestamp,, GRND_TIMESTAMP), > > > something happened that trigger an update to the opaque cookie given > > > to getrandom(timestamp, GRND_TIMESTAMP). When such update happen, > > > userspace is advised to discard buffered random data and retry. > > > The meaning of the timestamp cookie is up to the kernel, and can be > > > changed anytime. Userspace is not expected to read the content of this > > > blob. Userspace only acts on the length returned by getrandom(,, GRND_TIMESTAMP): > > > -1 : not supported > > > 0 : cookie not updated, no need to discard buffered data > > > >0 : cookie updated, userspace should discard buffered data > > > For the cookie, I've used a single u64, but two u64 could be a better start, > > > providing room for implementing improved behavior in future kernel versions. > > > > > > > Please just let me iterate on my original patchset for a little bit, > > without adding more junk to the already overly large conversation. > > > > > > > > I like the simplicity of my so called "junk". It's streamlined, doesn't > > > require a new syscall, doesn't require a new copy of ChaCha20 code. > > > I'm sorry it doesn't fit your expectations. > > > > > > > > > Why would anything more than a 64-bit counter be ever necessary? It only needs to be incremented. > > > > This is completely broken with CRIU or, for that matter, with VM forking. > Which raise the question of the support of CRIU with Jason's vDSO proposal. AFAIK CRIU handle vDSO[1] by interposing symbols so that, on restore, the process will call the interposed functions, which will resolve the new vDSO's functions. vgetrandom_alloc() would have been called before the checkpoint, allocating one opaque state of size x. After the restore, the vDSO's getrandom() would be given this opaque state, expecting it having size y. As the content of the opaque state should have been cleared per MADV_WIPEONFORK, there's nothing in the state that could help vDSO's getrandom() to achieve backward compatibility. I think backward compatibility can be achieved by adding an opaque state size argument to vDSO's getrandom(). What to think Jason ? [1] https://criu.org/Vdso Regards.