Message ID | 20210529060526.422987-1-changbin.du@gmail.com |
---|---|
State | New |
Series | net: fix oops in socket ioctl cmd SIOCGSKNS when NET_NS is disabled |
On Sat, 29 May 2021 14:05:26 +0800 Changbin Du wrote:
> When NET_NS is not enabled, socket ioctl cmd SIOCGSKNS should do nothing
> but acknowledge userspace it is not supported. Otherwise, kernel would
> panic wherever nsfs trys to access ns->ops since the proc_ns_operations
> is not implemented in this case.
>
> [7.670023] Unable to handle kernel NULL pointer dereference at virtual address 00000010
> [7.670268] pgd = 32b54000
> [7.670544] [00000010] *pgd=00000000
> [7.671861] Internal error: Oops: 5 [#1] SMP ARM
> [7.672315] Modules linked in:
> [7.672918] CPU: 0 PID: 1 Comm: systemd Not tainted 5.13.0-rc3-00375-g6799d4f2da49 #16
> [7.673309] Hardware name: Generic DT based system
> [7.673642] PC is at nsfs_evict+0x24/0x30
> [7.674486] LR is at clear_inode+0x20/0x9c
>
> Signed-off-by: Changbin Du <changbin.du@gmail.com>
> Cc: <stable@vger.kernel.org> # v4.9

Please provide a Fixes tag.

> diff --git a/net/socket.c b/net/socket.c
> index 27e3e7d53f8e..644b46112d35 100644
> --- a/net/socket.c
> +++ b/net/socket.c
> @@ -1149,11 +1149,15 @@ static long sock_ioctl(struct file *file, unsigned cmd, unsigned long arg)
>  			mutex_unlock(&vlan_ioctl_mutex);
>  			break;
>  		case SIOCGSKNS:
> +#ifdef CONFIG_NET_NS
>  			err = -EPERM;
>  			if (!ns_capable(net->user_ns, CAP_NET_ADMIN))
>  				break;
>
>  			err = open_related_ns(&net->ns, get_net_ns);

There's a few more places with this exact code. Can we please add the
check in get_net_ns? That should fix all callers.

> +#else
> +			err = -ENOTSUPP;

EOPNOTSUPP, you shouldn't return ENOTSUPP to user space.

> +#endif
>  			break;
>  		case SIOCGSTAMP_OLD:
>  		case SIOCGSTAMPNS_OLD:
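The remark about error codes can be illustrated with a small userspace sketch. ENOTSUPP is a kernel-internal code (value 524 in include/linux/errno.h) that the userspace <errno.h> does not define, whereas EOPNOTSUPP is a standard errno value. The macro KERNEL_ENOTSUPP and the helper to_user_errno() below are hypothetical names used only for illustration; they are not kernel APIs.

```c
#include <errno.h>

/* Assumption: mirrors the kernel-internal ENOTSUPP value (524).
 * Userspace <errno.h> does not provide this constant. */
#define KERNEL_ENOTSUPP 524

/* Hypothetical helper: map a negative kernel-style return code to one
 * that a userspace caller can interpret. */
int to_user_errno(int err)
{
	if (err == -KERNEL_ENOTSUPP)
		return -EOPNOTSUPP;
	return err;	/* all other codes pass through unchanged */
}
```

In the thread itself the point is simpler: the ioctl path should return -EOPNOTSUPP directly, so no translation step is needed.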
On Fri, May 28, 2021 at 11:08 PM Changbin Du <changbin.du@gmail.com> wrote:
> diff --git a/net/socket.c b/net/socket.c
> index 27e3e7d53f8e..644b46112d35 100644
> --- a/net/socket.c
> +++ b/net/socket.c
> @@ -1149,11 +1149,15 @@ static long sock_ioctl(struct file *file, unsigned cmd, unsigned long arg)
>  			mutex_unlock(&vlan_ioctl_mutex);
>  			break;
>  		case SIOCGSKNS:
> +#ifdef CONFIG_NET_NS
>  			err = -EPERM;
>  			if (!ns_capable(net->user_ns, CAP_NET_ADMIN))
>  				break;
>
>  			err = open_related_ns(&net->ns, get_net_ns);
> +#else
> +			err = -ENOTSUPP;
> +#endif

I wonder if it is easier if we just reject ns->ops==NULL case
in open_related_ns(). For 1) we can save an ugly #ifdef here;
2) drivers/net/tun.c has the same bugs.

Something like this:

diff --git a/fs/nsfs.c b/fs/nsfs.c
index 800c1d0eb0d0..d63414604e99 100644
--- a/fs/nsfs.c
+++ b/fs/nsfs.c
@@ -152,6 +152,9 @@ int open_related_ns(struct ns_common *ns,
 	int err;
 	int fd;

+	if (!ns->ops)
+		return -EOPNOTSUPP;
+
 	fd = get_unused_fd_flags(O_CLOEXEC);
 	if (fd < 0)
 		return fd;
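The guard proposed above can be modelled in userspace as a minimal sketch. The types ns_ops_model and ns_model and the function open_ns_model() are hypothetical stand-ins for proc_ns_operations, ns_common and open_related_ns(); they only show the shape of the check, not the kernel implementation.

```c
#include <errno.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's proc_ns_operations. */
struct ns_ops_model {
	const char *name;
};

/* Simplified stand-in for ns_common. */
struct ns_model {
	const struct ns_ops_model *ops;	/* NULL when the namespace type is compiled out */
};

/* Hypothetical model of an open helper: refuse early if ops is missing,
 * before any later code can dereference it. */
int open_ns_model(const struct ns_model *ns)
{
	if (!ns->ops)
		return -EOPNOTSUPP;
	/* A real implementation would allocate and return a file descriptor here. */
	return 0;
}
```

Placing the check in the shared helper means every caller is covered without a per-caller #ifdef, which is the argument made in the message.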
From: Cong Wang
> Sent: 29 May 2021 20:15
>
> On Fri, May 28, 2021 at 11:08 PM Changbin Du <changbin.du@gmail.com> wrote:
> > diff --git a/net/socket.c b/net/socket.c
> > index 27e3e7d53f8e..644b46112d35 100644
> > --- a/net/socket.c
> > +++ b/net/socket.c
> > @@ -1149,11 +1149,15 @@ static long sock_ioctl(struct file *file, unsigned cmd, unsigned long arg)
> >  			mutex_unlock(&vlan_ioctl_mutex);
> >  			break;
> >  		case SIOCGSKNS:
> > +#ifdef CONFIG_NET_NS
> >  			err = -EPERM;
> >  			if (!ns_capable(net->user_ns, CAP_NET_ADMIN))
> >  				break;
> >
> >  			err = open_related_ns(&net->ns, get_net_ns);
> > +#else
> > +			err = -ENOTSUPP;
> > +#endif
>
> I wonder if it is easier if we just reject ns->ops==NULL case
> in open_related_ns(). For 1) we can save an ugly #ifdef here;
> 2) drivers/net/tun.c has the same bugs.

If CONFIG_NET_NS is unset then why not make both ns_capable()
and open_related_ns() unconditionally return -ENOTSUPP?

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
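One common way to express "unconditionally return an error when the option is unset" is a compile-time stub, sketched below under stated assumptions. MODEL_CONFIG_NET_NS is a hypothetical macro standing in for CONFIG_NET_NS, and model_open_net_ns() and model_ioctl_get_ns() are illustrative names. Because ns_capable() returns a boolean in the kernel, only an open helper is stubbed here, and the sketch returns -EOPNOTSUPP in line with the earlier review remark about userspace-visible codes.

```c
#include <errno.h>

/* Hypothetical switch standing in for CONFIG_NET_NS; left undefined here. */
/* #define MODEL_CONFIG_NET_NS 1 */

#ifdef MODEL_CONFIG_NET_NS
int model_open_net_ns(void);	/* real implementation would live elsewhere */
#else
/* Feature compiled out: every caller gets the same error. */
static inline int model_open_net_ns(void)
{
	return -EOPNOTSUPP;
}
#endif

/* The caller stays free of preprocessor conditionals. */
int model_ioctl_get_ns(void)
{
	return model_open_net_ns();
}
```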
On Sat, May 29, 2021 at 11:27:35AM -0700, Jakub Kicinski wrote:
> On Sat, 29 May 2021 14:05:26 +0800 Changbin Du wrote:
> > When NET_NS is not enabled, socket ioctl cmd SIOCGSKNS should do nothing
> > but acknowledge userspace it is not supported. Otherwise, kernel would
> > panic wherever nsfs trys to access ns->ops since the proc_ns_operations
> > is not implemented in this case.
> >
> > [7.670023] Unable to handle kernel NULL pointer dereference at virtual address 00000010
> > [7.670268] pgd = 32b54000
> > [7.670544] [00000010] *pgd=00000000
> > [7.671861] Internal error: Oops: 5 [#1] SMP ARM
> > [7.672315] Modules linked in:
> > [7.672918] CPU: 0 PID: 1 Comm: systemd Not tainted 5.13.0-rc3-00375-g6799d4f2da49 #16
> > [7.673309] Hardware name: Generic DT based system
> > [7.673642] PC is at nsfs_evict+0x24/0x30
> > [7.674486] LR is at clear_inode+0x20/0x9c
> >
> > Signed-off-by: Changbin Du <changbin.du@gmail.com>
> > Cc: <stable@vger.kernel.org> # v4.9
>
> Please provide a Fixes tag.
>
Now it will be fixed on the nsfs side. And the code has been changed too many times.

> > diff --git a/net/socket.c b/net/socket.c
> > index 27e3e7d53f8e..644b46112d35 100644
> > --- a/net/socket.c
> > +++ b/net/socket.c
> > @@ -1149,11 +1149,15 @@ static long sock_ioctl(struct file *file, unsigned cmd, unsigned long arg)
> >  			mutex_unlock(&vlan_ioctl_mutex);
> >  			break;
> >  		case SIOCGSKNS:
> > +#ifdef CONFIG_NET_NS
> >  			err = -EPERM;
> >  			if (!ns_capable(net->user_ns, CAP_NET_ADMIN))
> >  				break;
> >
> >  			err = open_related_ns(&net->ns, get_net_ns);
>
> There's a few more places with this exact code. Can we please add the
> check in get_net_ns? That should fix all callers.
>
> > +#else
> > +			err = -ENOTSUPP;
>
> EOPNOTSUPP, you shouldn't return ENOTSUPP to user space.
>
Thanks for pointing that out. Will change it.

> > +#endif
> >  			break;
> >  		case SIOCGSTAMP_OLD:
> >  		case SIOCGSTAMPNS_OLD:
>
On Mon, May 31, 2021 at 08:30:58AM +0000, David Laight wrote:
> From: Cong Wang
> > Sent: 29 May 2021 20:15
> >
> > On Fri, May 28, 2021 at 11:08 PM Changbin Du <changbin.du@gmail.com> wrote:
> > > diff --git a/net/socket.c b/net/socket.c
> > > index 27e3e7d53f8e..644b46112d35 100644
> > > --- a/net/socket.c
> > > +++ b/net/socket.c
> > > @@ -1149,11 +1149,15 @@ static long sock_ioctl(struct file *file, unsigned cmd, unsigned long arg)
> > >  			mutex_unlock(&vlan_ioctl_mutex);
> > >  			break;
> > >  		case SIOCGSKNS:
> > > +#ifdef CONFIG_NET_NS
> > >  			err = -EPERM;
> > >  			if (!ns_capable(net->user_ns, CAP_NET_ADMIN))
> > >  				break;
> > >
> > >  			err = open_related_ns(&net->ns, get_net_ns);
> > > +#else
> > > +			err = -ENOTSUPP;
> > > +#endif
> >
> > I wonder if it is easier if we just reject ns->ops==NULL case
> > in open_related_ns(). For 1) we can save an ugly #ifdef here;
> > 2) drivers/net/tun.c has the same bugs.
>
> If CONFIG_NET_NS is unset then why not make both ns_capable()
> and open_related_ns() unconditionally return -ENOTSUPP?
>
Here is the new fix, which rejects creating an inode for a disabled ns.

--- a/fs/nsfs.c
+++ b/fs/nsfs.c
@@ -62,6 +62,10 @@ static int __ns_get_path(struct path *path, struct ns_common *ns)
 	struct inode *inode;
 	unsigned long d;

+	/* In case the namespace is not actually enabled. */
+	if (!ns->ops)
+		return -EOPNOTSUPP;

> 	David
>
> -
> Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
> Registration No: 1397386 (Wales)
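From the userspace side, the effect of returning -EOPNOTSUPP can be sketched as follows. The wrapper socket_netns_fd() is a hypothetical name; the fallback definition of SIOCGSKNS uses the value from linux/sockios.h and is only there so the snippet builds where that header is not pulled in. The errno a caller actually sees depends on which version of the fix is applied, so treat the EOPNOTSUPP case as an assumption.

```c
#include <errno.h>
#include <sys/ioctl.h>

#ifndef SIOCGSKNS
#define SIOCGSKNS 0x894C	/* value from linux/sockios.h */
#endif

/* Hypothetical wrapper: returns a file descriptor referring to the
 * socket's network namespace, or a negative errno on failure. */
int socket_netns_fd(int sock)
{
	int fd = ioctl(sock, SIOCGSKNS);

	return fd < 0 ? -errno : fd;
}
```

A caller could then treat -EOPNOTSUPP as "network namespaces are not built into this kernel" and -EPERM as a missing CAP_NET_ADMIN, and fall back accordingly instead of crashing the system.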
diff --git a/net/socket.c b/net/socket.c
index 27e3e7d53f8e..644b46112d35 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -1149,11 +1149,15 @@ static long sock_ioctl(struct file *file, unsigned cmd, unsigned long arg)
 			mutex_unlock(&vlan_ioctl_mutex);
 			break;
 		case SIOCGSKNS:
+#ifdef CONFIG_NET_NS
 			err = -EPERM;
 			if (!ns_capable(net->user_ns, CAP_NET_ADMIN))
 				break;

 			err = open_related_ns(&net->ns, get_net_ns);
+#else
+			err = -ENOTSUPP;
+#endif
 			break;
 		case SIOCGSTAMP_OLD:
 		case SIOCGSTAMPNS_OLD:
When NET_NS is not enabled, socket ioctl cmd SIOCGSKNS should do nothing
but tell userspace that it is not supported. Otherwise, the kernel would
panic wherever nsfs tries to access ns->ops, since the proc_ns_operations
is not implemented in this case.

[7.670023] Unable to handle kernel NULL pointer dereference at virtual address 00000010
[7.670268] pgd = 32b54000
[7.670544] [00000010] *pgd=00000000
[7.671861] Internal error: Oops: 5 [#1] SMP ARM
[7.672315] Modules linked in:
[7.672918] CPU: 0 PID: 1 Comm: systemd Not tainted 5.13.0-rc3-00375-g6799d4f2da49 #16
[7.673309] Hardware name: Generic DT based system
[7.673642] PC is at nsfs_evict+0x24/0x30
[7.674486] LR is at clear_inode+0x20/0x9c

Signed-off-by: Changbin Du <changbin.du@gmail.com>
Cc: <stable@vger.kernel.org> # v4.9
---
 net/socket.c | 4 ++++
 1 file changed, 4 insertions(+)
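The faulting address 00000010 in the log is consistent with a member load through a NULL ns->ops. As a rough check, and assuming that nsfs_evict() calls ns->ops->put(ns) and that the member order of proc_ns_operations matches include/linux/proc_ns.h with no layout randomization, the offset of put on a 32-bit build can be computed with a model struct. The struct and function names below are hypothetical, and pointers are represented as uint32_t so the result does not depend on the host.

```c
#include <stddef.h>
#include <stdint.h>

/* Model of proc_ns_operations on a 32-bit target. Each pointer is
 * represented by a uint32_t so the layout is identical on any host.
 * Assumption: member order as recalled from include/linux/proc_ns.h. */
struct proc_ns_operations_model32 {
	uint32_t name;		/* const char * */
	uint32_t real_ns_name;	/* const char * */
	int32_t  type;
	uint32_t get;		/* function pointer */
	uint32_t put;		/* function pointer */
};

/* Address touched by a load of ops->put when ops is NULL. */
size_t put_fault_address(void)
{
	return offsetof(struct proc_ns_operations_model32, put);
}
```

Under these assumptions the offset is 0x10, which lines up with the address reported in the oops; this is a plausibility check rather than a confirmed diagnosis.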