Message ID | 20230120144356.40717-4-gregory.price@memverge.com |
---|---|
State | Superseded |
Series | Checkpoint Support for Syscall User Dispatch |
On Fri, Jan 20, 2023 at 7:05 AM Gregory Price <gourry.memverge@gmail.com> wrote:
>
> Implement ptrace getter/setter interface for syscall user dispatch.
>
> Presently, these settings are write-only via prctl, making it impossible
> to implement transparent checkpoint (coordination with the software is
> required).
>
> This is modeled after a similar interface for SECCOMP, which can have
> its configuration dumped by ptrace for software like CRIU.
>
> Signed-off-by: Gregory Price <gregory.price@memverge.com>
> ---
>  .../admin-guide/syscall-user-dispatch.rst |  5 +-
>  include/linux/syscall_user_dispatch.h     | 19 +++++++
>  include/uapi/linux/ptrace.h               | 10 ++++
>  kernel/entry/syscall_user_dispatch.c      | 49 +++++++++++++++++++
>  kernel/ptrace.c                           |  9 ++++
>  5 files changed, 91 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/admin-guide/syscall-user-dispatch.rst b/Documentation/admin-guide/syscall-user-dispatch.rst
> index 60314953c728..a23ae21a1d5b 100644
> --- a/Documentation/admin-guide/syscall-user-dispatch.rst
> +++ b/Documentation/admin-guide/syscall-user-dispatch.rst

<snip>

> +
> +int syscall_user_dispatch_get_config(struct task_struct *task, unsigned long size,
> +	void __user *data)
> +{
> +	struct syscall_user_dispatch *sd = &task->syscall_dispatch;
> +	struct syscall_user_dispatch_config config;
> +
> +	if (size != sizeof(struct syscall_user_dispatch_config))
> +		return -EINVAL;
> +
> +	if (sd->selector) {
> +		config.mode = PR_SYS_DISPATCH_ON;
> +		config.offset = sd->offset;
> +		config.len = sd->len;
> +		config.selector = sd->selector;
> +		config.on_dispatch = sd->on_dispatch;
> +	} else {

This doesn't look right for me. selector is optional and if it is 0,
it doesn't mean that mode is PR_SYS_DISPATCH_OFF, does it?

> +		config.mode = PR_SYS_DISPATCH_OFF;
> +		config.offset = 0;
> +		config.len = 0;
> +		config.selector = NULL;
> +		config.on_dispatch = false;
> +	}
> +	if (copy_to_user(data, &config, sizeof(config)))
> +		return -EFAULT;
> +
> +	return 0;
> +}
> +
On Fri, Jan 20, 2023 at 07:18:49PM -0800, Andrei Vagin wrote:
> On Fri, Jan 20, 2023 at 7:05 AM Gregory Price <gourry.memverge@gmail.com> wrote:
> >
> > Implement ptrace getter/setter interface for syscall user dispatch.
> >
> > Presently, these settings are write-only via prctl, making it impossible
> > to implement transparent checkpoint (coordination with the software is
> > required).
> >
> > This is modeled after a similar interface for SECCOMP, which can have
> > its configuration dumped by ptrace for software like CRIU.
> >
> > Signed-off-by: Gregory Price <gregory.price@memverge.com>
> > ---
> >  .../admin-guide/syscall-user-dispatch.rst |  5 +-
> >  include/linux/syscall_user_dispatch.h     | 19 +++++++
> >  include/uapi/linux/ptrace.h               | 10 ++++
> >  kernel/entry/syscall_user_dispatch.c      | 49 +++++++++++++++++++
> >  kernel/ptrace.c                           |  9 ++++
> >  5 files changed, 91 insertions(+), 1 deletion(-)
> >
> > diff --git a/Documentation/admin-guide/syscall-user-dispatch.rst b/Documentation/admin-guide/syscall-user-dispatch.rst
> > index 60314953c728..a23ae21a1d5b 100644
> > --- a/Documentation/admin-guide/syscall-user-dispatch.rst
> > +++ b/Documentation/admin-guide/syscall-user-dispatch.rst
>
> <snip>
>
> > +
> > +int syscall_user_dispatch_get_config(struct task_struct *task, unsigned long size,
> > +	void __user *data)
> > +{
> > +	struct syscall_user_dispatch *sd = &task->syscall_dispatch;
> > +	struct syscall_user_dispatch_config config;
> > +
> > +	if (size != sizeof(struct syscall_user_dispatch_config))
> > +		return -EINVAL;
> > +
> > +	if (sd->selector) {
> > +		config.mode = PR_SYS_DISPATCH_ON;
> > +		config.offset = sd->offset;
> > +		config.len = sd->len;
> > +		config.selector = sd->selector;
> > +		config.on_dispatch = sd->on_dispatch;
> > +	} else {
>
> This doesn't look right for me. selector is optional and if it is 0,
> it doesn't mean that
> mode is PR_SYS_DISPATCH_OFF, does it?
>
> > +		config.mode = PR_SYS_DISPATCH_OFF;
> > +		config.offset = 0;
> > +		config.len = 0;
> > +		config.selector = NULL;
> > +		config.on_dispatch = false;
> > +	}
> > +	if (copy_to_user(data, &config, sizeof(config)))
> > +		return -EFAULT;
> > +
> > +	return 0;
> > +}
> > +

Hm. Right you are. I suppose I should change this to checking offset
instead. Will need to validate the fields are correctly cleared on
disable and on task allocate (I presume this is true). Otherwise it
might behoove us to actually add a state field.

Thank you, I'll push an update tomorrow. I also need to change patch 2/3
as well.
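The point raised in this exchange can be modelled in plain userspace C. This is only a sketch, not kernel code: `struct sud_model`, its `enabled` member (standing in for the "state field" floated above) and both `report_mode_*` helpers are hypothetical names, and the two mode constants are local copies of the values of PR_SYS_DISPATCH_OFF and PR_SYS_DISPATCH_ON so the example builds without kernel headers.

```c
#include <stdbool.h>
#include <stddef.h>

/* Local copies of PR_SYS_DISPATCH_OFF (0) and PR_SYS_DISPATCH_ON (1). */
enum { MODEL_DISPATCH_OFF = 0, MODEL_DISPATCH_ON = 1 };

/* Hypothetical userspace stand-in for the per-task dispatch state. */
struct sud_model {
	bool enabled;          /* hypothetical explicit state field */
	const char *selector;  /* optional: may be NULL while dispatch is on */
	unsigned long offset;
	unsigned long len;
};

/* Mirrors the posted getter: infers the mode from the selector pointer. */
static int report_mode_from_selector(const struct sud_model *sd)
{
	return sd->selector ? MODEL_DISPATCH_ON : MODEL_DISPATCH_OFF;
}

/* Alternative under discussion: report the mode from explicit state. */
static int report_mode_from_state(const struct sud_model *sd)
{
	return sd->enabled ? MODEL_DISPATCH_ON : MODEL_DISPATCH_OFF;
}
```

With dispatch enabled and no selector, the first helper reports OFF while the second reports ON, which is the mismatch the review comment describes.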
diff --git a/Documentation/admin-guide/syscall-user-dispatch.rst b/Documentation/admin-guide/syscall-user-dispatch.rst
index 60314953c728..a23ae21a1d5b 100644
--- a/Documentation/admin-guide/syscall-user-dispatch.rst
+++ b/Documentation/admin-guide/syscall-user-dispatch.rst
@@ -43,7 +43,10 @@ doesn't rely on any of the syscall ABI to make the filtering. It uses
 only the syscall dispatcher address and the userspace key.
 
 As the ABI of these intercepted syscalls is unknown to Linux, these
-syscalls are not instrumentable via ptrace or the syscall tracepoints.
+syscalls are not instrumentable via ptrace or the syscall tracepoints,
+however an interfaces to suspend, checkpoint, and restore syscall user
+dispatch configuration has been added to ptrace to assist userland
+checkpoint/restart software.
 
 Interface
 ---------
diff --git a/include/linux/syscall_user_dispatch.h b/include/linux/syscall_user_dispatch.h
index a0ae443fb7df..9e1bd0d87c1e 100644
--- a/include/linux/syscall_user_dispatch.h
+++ b/include/linux/syscall_user_dispatch.h
@@ -22,6 +22,13 @@ int set_syscall_user_dispatch(unsigned long mode, unsigned long offset,
 #define clear_syscall_work_syscall_user_dispatch(tsk) \
 	clear_task_syscall_work(tsk, SYSCALL_USER_DISPATCH)
 
+int syscall_user_dispatch_get_config(struct task_struct *task, unsigned long size,
+	void __user *data);
+
+int syscall_user_dispatch_set_config(struct task_struct *task, unsigned long size,
+	void __user *data);
+
+
 #else
 struct syscall_user_dispatch {};
 
@@ -35,6 +42,18 @@ static inline void clear_syscall_work_syscall_user_dispatch(struct task_struct *
 {
 }
 
+static inline int syscall_user_dispatch_get_config(struct task_struct *task, unsigned long size,
+	void __user *data)
+{
+	return -EINVAL;
+}
+
+static inline int syscall_user_dispatch_set_config(struct task_struct *task, unsigned long size,
+	void __user *data)
+{
+	return -EINVAL;
+}
+
 #endif /* CONFIG_GENERIC_ENTRY */
 
 #endif /* _SYSCALL_USER_DISPATCH_H */
diff --git a/include/uapi/linux/ptrace.h b/include/uapi/linux/ptrace.h
index ba9e3f19a22c..8b93c78189b5 100644
--- a/include/uapi/linux/ptrace.h
+++ b/include/uapi/linux/ptrace.h
@@ -112,6 +112,16 @@ struct ptrace_rseq_configuration {
 	__u32 pad;
 };
 
+#define PTRACE_SET_SYSCALL_USER_DISPATCH_CONFIG 0x4210
+#define PTRACE_GET_SYSCALL_USER_DISPATCH_CONFIG 0x4211
+struct syscall_user_dispatch_config {
+	__u64 mode;
+	__s8 *selector;
+	__u64 offset;
+	__u64 len;
+	__u8 on_dispatch;
+};
+
 /*
  * These values are stored in task->ptrace_message
  * by ptrace_stop to describe the current syscall-stop.
diff --git a/kernel/entry/syscall_user_dispatch.c b/kernel/entry/syscall_user_dispatch.c
index b5ec75164805..a3b24d498b39 100644
--- a/kernel/entry/syscall_user_dispatch.c
+++ b/kernel/entry/syscall_user_dispatch.c
@@ -111,3 +111,52 @@ int set_syscall_user_dispatch(unsigned long mode, unsigned long offset,
 
 	return 0;
 }
+
+int syscall_user_dispatch_get_config(struct task_struct *task, unsigned long size,
+	void __user *data)
+{
+	struct syscall_user_dispatch *sd = &task->syscall_dispatch;
+	struct syscall_user_dispatch_config config;
+
+	if (size != sizeof(struct syscall_user_dispatch_config))
+		return -EINVAL;
+
+	if (sd->selector) {
+		config.mode = PR_SYS_DISPATCH_ON;
+		config.offset = sd->offset;
+		config.len = sd->len;
+		config.selector = sd->selector;
+		config.on_dispatch = sd->on_dispatch;
+	} else {
+		config.mode = PR_SYS_DISPATCH_OFF;
+		config.offset = 0;
+		config.len = 0;
+		config.selector = NULL;
+		config.on_dispatch = false;
+	}
+	if (copy_to_user(data, &config, sizeof(config)))
+		return -EFAULT;
+
+	return 0;
+}
+
+int syscall_user_dispatch_set_config(struct task_struct *task, unsigned long size,
+	void __user *data)
+{
+	struct syscall_user_dispatch_config config;
+	int ret;
+
+	if (size != sizeof(struct syscall_user_dispatch_config))
+		return -EINVAL;
+
+	if (copy_from_user(&config, data, sizeof(config)))
+		return -EFAULT;
+
+	ret = set_syscall_user_dispatch(config.mode, config.offset, config.len,
+		config.selector);
+	if (ret)
+		return ret;
+
+	task->syscall_dispatch.on_dispatch = config.on_dispatch;
+	return 0;
+}
diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index 99467ba5f55b..d1e9c0808905 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -32,6 +32,7 @@
 #include <linux/compat.h>
 #include <linux/sched/signal.h>
 #include <linux/minmax.h>
+#include <linux/syscall_user_dispatch.h>
 
 #include <asm/syscall.h>	/* for syscall_get_* */
 
@@ -1263,6 +1264,14 @@ int ptrace_request(struct task_struct *child, long request,
 		break;
 #endif
 
+	case PTRACE_SET_SYSCALL_USER_DISPATCH_CONFIG:
+		ret = syscall_user_dispatch_set_config(child, addr, datavp);
+		break;
+
+	case PTRACE_GET_SYSCALL_USER_DISPATCH_CONFIG:
+		ret = syscall_user_dispatch_get_config(child, addr, datavp);
+		break;
+
 	default:
 		break;
 	}
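Both handlers in the diff reject any `size` other than `sizeof(struct syscall_user_dispatch_config)`, so a tracer has to pass exactly that value in the ptrace `addr` argument. The sketch below mirrors the posted struct with `<stdint.h>` types to show what that number would be for a given compiler ABI. `struct sud_config_mirror` and the two helpers are hypothetical names used only for illustration, and the figures they return depend on the target's pointer width and alignment rules.

```c
#include <stddef.h>
#include <stdint.h>

/* Userspace mirror of the struct posted in this patch (field order kept). */
struct sud_config_mirror {
	uint64_t mode;
	int8_t  *selector;
	uint64_t offset;
	uint64_t len;
	uint8_t  on_dispatch;
};

/* The value a tracer would have to pass as the ptrace addr argument. */
static size_t sud_config_size(void)
{
	return sizeof(struct sud_config_mirror);
}

/* Bytes after on_dispatch that exist only to satisfy alignment. */
static size_t sud_config_tail_padding(void)
{
	return sizeof(struct sud_config_mirror)
	       - offsetof(struct sud_config_mirror, on_dispatch)
	       - sizeof(uint8_t);
}
```

On a typical LP64 build this gives 40 bytes, 7 of them tail padding. A build with 4-byte pointers would report a different size, so tracer and kernel would need to agree on the layout for the size check to pass.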
Implement ptrace getter/setter interface for syscall user dispatch.

Presently, these settings are write-only via prctl, making it impossible
to implement transparent checkpoint (coordination with the software is
required).

This is modeled after a similar interface for SECCOMP, which can have
its configuration dumped by ptrace for software like CRIU.

Signed-off-by: Gregory Price <gregory.price@memverge.com>
---
 .../admin-guide/syscall-user-dispatch.rst |  5 +-
 include/linux/syscall_user_dispatch.h     | 19 +++++++
 include/uapi/linux/ptrace.h               | 10 ++++
 kernel/entry/syscall_user_dispatch.c      | 49 +++++++++++++++++++
 kernel/ptrace.c                           |  9 ++++
 5 files changed, 91 insertions(+), 1 deletion(-)
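As a rough illustration of how checkpoint software might call the getter described above, here is a userspace sketch. The request number and struct layout are copied from this superseded revision of the patch and are assumptions that may not match any released kernel. `SKETCH_PTRACE_GET_SUD_CONFIG`, `struct sketch_sud_config` and `sketch_get_sud_config()` are hypothetical names, and the tracee is assumed to be ptrace-attached and stopped already.

```c
#include <errno.h>
#include <stdint.h>
#include <sys/ptrace.h>
#include <sys/types.h>

/* Request number as posted in this patch revision (assumption). */
#define SKETCH_PTRACE_GET_SUD_CONFIG 0x4211

/* Layout as posted in this patch revision (assumption). */
struct sketch_sud_config {
	uint64_t mode;
	int8_t  *selector;
	uint64_t offset;
	uint64_t len;
	uint8_t  on_dispatch;
};

/*
 * Fetch the dispatch configuration of an attached, stopped tracee.
 * Returns 0 on success or a negative errno value on failure.
 */
static int sketch_get_sud_config(pid_t pid, struct sketch_sud_config *cfg)
{
	/* The patch takes the struct size in addr and the buffer in data. */
	long ret = ptrace(SKETCH_PTRACE_GET_SUD_CONFIG, pid,
			  (void *)sizeof(*cfg), cfg);

	return ret == -1 ? -errno : 0;
}
```

A restore path would presumably make the matching call with the setter request and the saved struct. On a kernel without this interface, or for a task that is not being traced, the helper simply returns a negative errno.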