Message ID | 20230710223304.1174642-7-almasrymina@google.com |
---|---|
State | New |
Headers | show |
Series | Device Memory TCP | expand |
On 7/10/23 15:32, Mina Almasry wrote: > Add an interface for the user to notify the kernel that it is done reading > the NET_RX dmabuf pages returned as cmsg. The kernel will drop the > reference on the NET_RX pages to make them available for re-use. > > Signed-off-by: Mina Almasry <almasrymina@google.com> > --- > + for (i = 0; i < num_tokens; i++) { > + for (j = 0; j < tokens[i].token_count; j++) { > + struct page *pg = xa_erase(&sk->sk_pagepool, > + tokens[i].token_start + j); > + > + if (pg) > + put_page(pg); > + else > + /* -EINTR here notifies the userspace > + * that not all tokens passed to it have > + * been freed. > + */ > + ret = -EINTR; Unless I'm missing something, this type of error reporting is unrecoverable -- userspace doesn't know how many tokens have been freed. I think you should either make it explicitly unrecoverable (somehow shut down dmabuf handling entirely) or tell userspace how many tokens were successfully freed. --Andy
On Sun, Jul 16, 2023 at 4:57 PM Andy Lutomirski <luto@kernel.org> wrote: > > On 7/10/23 15:32, Mina Almasry wrote: > > Add an interface for the user to notify the kernel that it is done reading > > the NET_RX dmabuf pages returned as cmsg. The kernel will drop the > > reference on the NET_RX pages to make them available for re-use. > > > > Signed-off-by: Mina Almasry <almasrymina@google.com> > > --- > > > + for (i = 0; i < num_tokens; i++) { > > + for (j = 0; j < tokens[i].token_count; j++) { > > + struct page *pg = xa_erase(&sk->sk_pagepool, > > + tokens[i].token_start + j); > > + > > + if (pg) > > + put_page(pg); > > + else > > + /* -EINTR here notifies the userspace > > + * that not all tokens passed to it have > > + * been freed. > > + */ > > + ret = -EINTR; > > Unless I'm missing something, this type of error reporting is > unrecoverable -- userspace doesn't know how many tokens have been freed. > > I think you should either make it explicitly unrecoverable (somehow shut > down dmabuf handling entirely) or tell userspace how many tokens were > successfully freed. > Thank you, the latter suggestion sounds perfect actually. The user can't do much if the kernel fails to free all the tokens, but at least it can know that something wrong is happening and can log an error for further debugging. Great suggestion! Thanks!
diff --git a/include/uapi/asm-generic/socket.h b/include/uapi/asm-generic/socket.h index 88f9234f78cb..2a5a7f5da358 100644 --- a/include/uapi/asm-generic/socket.h +++ b/include/uapi/asm-generic/socket.h @@ -132,6 +132,7 @@ #define SO_RCVMARK 75 +#define SO_DEVMEM_DONTNEED 97 #define SO_DEVMEM_HEADER 98 #define SCM_DEVMEM_HEADER SO_DEVMEM_HEADER #define SO_DEVMEM_OFFSET 99 diff --git a/include/uapi/linux/uio.h b/include/uapi/linux/uio.h index 8b0be0f50838..faaa765fd5a4 100644 --- a/include/uapi/linux/uio.h +++ b/include/uapi/linux/uio.h @@ -26,6 +26,10 @@ struct cmsg_devmem { __u32 frag_token; }; +struct devmemtoken { + __u32 token_start; + __u32 token_count; +}; /* * UIO_MAXIOV shall be at least 16 1003.1g (5.4.1.1) */ diff --git a/net/core/sock.c b/net/core/sock.c index 24f2761bdb1d..f9b9d9ec7322 100644 --- a/net/core/sock.c +++ b/net/core/sock.c @@ -1531,7 +1531,48 @@ int sk_setsockopt(struct sock *sk, int level, int optname, /* Paired with READ_ONCE() in tcp_rtx_synack() */ WRITE_ONCE(sk->sk_txrehash, (u8)val); break; + case SO_DEVMEM_DONTNEED: { + struct devmemtoken tokens[128]; + unsigned int num_tokens, i, j; + if (sk->sk_type != SOCK_STREAM || + sk->sk_protocol != IPPROTO_TCP) { + ret = -EBADF; + break; + } + + if (optlen % sizeof(struct devmemtoken) || + optlen > sizeof(tokens)) { + ret = -EINVAL; + break; + } + + num_tokens = optlen / sizeof(struct devmemtoken); + if (copy_from_sockptr(tokens, optval, optlen)) { + ret = -EFAULT; + break; + } + + ret = 0; + + for (i = 0; i < num_tokens; i++) { + for (j = 0; j < tokens[i].token_count; j++) { + struct page *pg = xa_erase(&sk->sk_pagepool, + tokens[i].token_start + j); + + if (pg) + put_page(pg); + else + /* -EINTR here notifies the userspace + * that not all tokens passed to it have + * been freed. + */ + ret = -EINTR; + } + } + + break; + } default: ret = -ENOPROTOOPT; break;
Add an interface for the user to notify the kernel that it is done reading the NET_RX dmabuf pages returned as cmsg. The kernel will drop the reference on the NET_RX pages to make them available for re-use. Signed-off-by: Mina Almasry <almasrymina@google.com> --- include/uapi/asm-generic/socket.h | 1 + include/uapi/linux/uio.h | 4 +++ net/core/sock.c | 41 +++++++++++++++++++++++++++++++ 3 files changed, 46 insertions(+)