Message ID | 20201126132312.3593725-1-yangyingliang@huawei.com |
---|---|
State | New |
Headers | show |
Series | [net] net: fix memory leak in register_netdevice() on error path | expand |
On 2020/11/26 22:23, Yang Yingliang wrote: ... > Reported-by: Hulk Robot <hulkci@huawei.com> > Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> > --- > net/core/dev.c | 11 +++++++++++ > 1 file changed, 11 insertions(+) > > diff --git a/net/core/dev.c b/net/core/dev.c > index 82dc6b48e45f..907204395b64 100644 > --- a/net/core/dev.c > +++ b/net/core/dev.c > @@ -10000,6 +10000,17 @@ int register_netdevice(struct net_device *dev) > ret = notifier_to_errno(ret); > if (ret) { > rollback_registered(dev); > + /* > + * In common case, priv_destructor() will be As per netdev-faq, the comment style should be /* foobar blah blah blah * another line of text */ rather than /* * foobar blah blah blah * another line of text */ > + * called in netdev_run_todo() after calling > + * ndo_uninit() in rollback_registered(). > + * But in this case, priv_destructor() will > + * never be called, then it causes memory > + * leak, so we should call priv_destructor() > + * here. > + */ > + if (dev->priv_destructor) > + dev->priv_destructor(dev); To be in line with netdev_run_todo(), I think priv_destructor() should be called after "dev->reg_state = NETREG_UNREGISTERED". Toshiaki Makita > rcu_barrier(); > > dev->reg_state = NETREG_UNREGISTERED; >
On Thu, 26 Nov 2020 21:23:12 +0800 Yang Yingliang <yangyingliang@huawei.com> wrote: > I got a memleak report when doing fault-inject test: > > unreferenced object 0xffff88810ace9000 (size 1024): > comm "ip", pid 4622, jiffies 4295457037 (age 43.378s) > hex dump (first 32 bytes): > 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ > 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ > backtrace: > [<00000000008abe41>] __kmalloc+0x10f/0x210 > [<000000005d3533a6>] veth_dev_init+0x140/0x310 > [<0000000088353c64>] register_netdevice+0x496/0x7a0 > [<000000001324d322>] veth_newlink+0x40b/0x960 > [<00000000d0799866>] __rtnl_newlink+0xd8c/0x1360 > [<00000000d616040a>] rtnl_newlink+0x6b/0xa0 > [<00000000e0a1600d>] rtnetlink_rcv_msg+0x3cc/0x9e0 > [<000000009eeff98b>] netlink_rcv_skb+0x130/0x3a0 > [<00000000500f8be1>] netlink_unicast+0x4da/0x700 > [<00000000666c03b3>] netlink_sendmsg+0x7fe/0xcb0 > [<0000000073b28103>] sock_sendmsg+0x143/0x180 > [<00000000ad746a30>] ____sys_sendmsg+0x677/0x810 > [<0000000087dd98e5>] ___sys_sendmsg+0x105/0x180 > [<00000000028dd365>] __sys_sendmsg+0xf0/0x1c0 > [<00000000a6bfbae6>] do_syscall_64+0x33/0x40 > [<00000000e00521b4>] entry_SYSCALL_64_after_hwframe+0x44/0xa9 > > It seems ifb and loopback may also hit the leak, so I try to fix this in > register_netdevice(). > > In common case, priv_destructor() will be called in netdev_run_todo() > after calling ndo_uninit() in rollback_registered(), on other error > path in register_netdevice(), ndo_uninit() and priv_destructor() are > called before register_netdevice() return, but in this case, > priv_destructor() will never be called, then it causes memory leak, > so we should call priv_destructor() here. > > Reported-by: Hulk Robot <hulkci@huawei.com> > Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> > --- > net/core/dev.c | 11 +++++++++++ > 1 file changed, 11 insertions(+) > > diff --git a/net/core/dev.c b/net/core/dev.c > index 82dc6b48e45f..907204395b64 100644 > --- a/net/core/dev.c > +++ b/net/core/dev.c > @@ -10000,6 +10000,17 @@ int register_netdevice(struct net_device *dev) > ret = notifier_to_errno(ret); > if (ret) { > rollback_registered(dev); > + /* > + * In common case, priv_destructor() will be > + * called in netdev_run_todo() after calling > + * ndo_uninit() in rollback_registered(). > + * But in this case, priv_destructor() will > + * never be called, then it causes memory > + * leak, so we should call priv_destructor() > + * here. > + */ > + if (dev->priv_destructor) > + dev->priv_destructor(dev); Are you sure this is safe? Several devices have destructors that call free_netdev. Up until now a common pattern for those devices was to call free_netdev on error. After this change it would lead to double free.
On 2020/11/30 12:39, Stephen Hemminger wrote: > On Thu, 26 Nov 2020 21:23:12 +0800 > Yang Yingliang <yangyingliang@huawei.com> wrote: > >> I got a memleak report when doing fault-inject test: >> >> unreferenced object 0xffff88810ace9000 (size 1024): >> comm "ip", pid 4622, jiffies 4295457037 (age 43.378s) >> hex dump (first 32 bytes): >> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ >> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ >> backtrace: >> [<00000000008abe41>] __kmalloc+0x10f/0x210 >> [<000000005d3533a6>] veth_dev_init+0x140/0x310 >> [<0000000088353c64>] register_netdevice+0x496/0x7a0 >> [<000000001324d322>] veth_newlink+0x40b/0x960 >> [<00000000d0799866>] __rtnl_newlink+0xd8c/0x1360 >> [<00000000d616040a>] rtnl_newlink+0x6b/0xa0 >> [<00000000e0a1600d>] rtnetlink_rcv_msg+0x3cc/0x9e0 >> [<000000009eeff98b>] netlink_rcv_skb+0x130/0x3a0 >> [<00000000500f8be1>] netlink_unicast+0x4da/0x700 >> [<00000000666c03b3>] netlink_sendmsg+0x7fe/0xcb0 >> [<0000000073b28103>] sock_sendmsg+0x143/0x180 >> [<00000000ad746a30>] ____sys_sendmsg+0x677/0x810 >> [<0000000087dd98e5>] ___sys_sendmsg+0x105/0x180 >> [<00000000028dd365>] __sys_sendmsg+0xf0/0x1c0 >> [<00000000a6bfbae6>] do_syscall_64+0x33/0x40 >> [<00000000e00521b4>] entry_SYSCALL_64_after_hwframe+0x44/0xa9 >> >> It seems ifb and loopback may also hit the leak, so I try to fix this in >> register_netdevice(). >> >> In common case, priv_destructor() will be called in netdev_run_todo() >> after calling ndo_uninit() in rollback_registered(), on other error >> path in register_netdevice(), ndo_uninit() and priv_destructor() are >> called before register_netdevice() return, but in this case, >> priv_destructor() will never be called, then it causes memory leak, >> so we should call priv_destructor() here. >> >> Reported-by: Hulk Robot <hulkci@huawei.com> >> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> >> --- >> net/core/dev.c | 11 +++++++++++ >> 1 file changed, 11 insertions(+) >> >> diff --git a/net/core/dev.c b/net/core/dev.c >> index 82dc6b48e45f..907204395b64 100644 >> --- a/net/core/dev.c >> +++ b/net/core/dev.c >> @@ -10000,6 +10000,17 @@ int register_netdevice(struct net_device *dev) >> ret = notifier_to_errno(ret); >> if (ret) { >> rollback_registered(dev); >> + /* >> + * In common case, priv_destructor() will be >> + * called in netdev_run_todo() after calling >> + * ndo_uninit() in rollback_registered(). >> + * But in this case, priv_destructor() will >> + * never be called, then it causes memory >> + * leak, so we should call priv_destructor() >> + * here. >> + */ >> + if (dev->priv_destructor) >> + dev->priv_destructor(dev); > Are you sure this is safe? > Several devices have destructors that call free_netdev. > Up until now a common pattern for those devices was to call > free_netdev on error. After this change it would lead to double free. After commit cf124db566e6 ("net: Fix inconsistent teardown and release of private netdev state."), free_netdev() is not be called in priv_destructor(). > > .
On 2020/11/29 21:56, Toshiaki Makita wrote: > On 2020/11/26 22:23, Yang Yingliang wrote: > ... >> Reported-by: Hulk Robot <hulkci@huawei.com> >> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> >> --- >> net/core/dev.c | 11 +++++++++++ >> 1 file changed, 11 insertions(+) >> >> diff --git a/net/core/dev.c b/net/core/dev.c >> index 82dc6b48e45f..907204395b64 100644 >> --- a/net/core/dev.c >> +++ b/net/core/dev.c >> @@ -10000,6 +10000,17 @@ int register_netdevice(struct net_device *dev) >> ret = notifier_to_errno(ret); >> if (ret) { >> rollback_registered(dev); >> + /* >> + * In common case, priv_destructor() will be > > As per netdev-faq, the comment style should be > > /* foobar blah blah blah > * another line of text > */ > > rather than > > /* > * foobar blah blah blah > * another line of text > */ > >> + * called in netdev_run_todo() after calling >> + * ndo_uninit() in rollback_registered(). >> + * But in this case, priv_destructor() will >> + * never be called, then it causes memory >> + * leak, so we should call priv_destructor() >> + * here. >> + */ >> + if (dev->priv_destructor) >> + dev->priv_destructor(dev); > > To be in line with netdev_run_todo(), I think priv_destructor() should be > called after "dev->reg_state = NETREG_UNREGISTERED". OK, I will send a v2 later. > > Toshiaki Makita > >> rcu_barrier(); >> dev->reg_state = NETREG_UNREGISTERED; >>
diff --git a/net/core/dev.c b/net/core/dev.c index 82dc6b48e45f..907204395b64 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -10000,6 +10000,17 @@ int register_netdevice(struct net_device *dev) ret = notifier_to_errno(ret); if (ret) { rollback_registered(dev); + /* + * In common case, priv_destructor() will be + * called in netdev_run_todo() after calling + * ndo_uninit() in rollback_registered(). + * But in this case, priv_destructor() will + * never be called, then it causes memory + * leak, so we should call priv_destructor() + * here. + */ + if (dev->priv_destructor) + dev->priv_destructor(dev); rcu_barrier(); dev->reg_state = NETREG_UNREGISTERED;
I got a memleak report when doing fault-inject test: unreferenced object 0xffff88810ace9000 (size 1024): comm "ip", pid 4622, jiffies 4295457037 (age 43.378s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<00000000008abe41>] __kmalloc+0x10f/0x210 [<000000005d3533a6>] veth_dev_init+0x140/0x310 [<0000000088353c64>] register_netdevice+0x496/0x7a0 [<000000001324d322>] veth_newlink+0x40b/0x960 [<00000000d0799866>] __rtnl_newlink+0xd8c/0x1360 [<00000000d616040a>] rtnl_newlink+0x6b/0xa0 [<00000000e0a1600d>] rtnetlink_rcv_msg+0x3cc/0x9e0 [<000000009eeff98b>] netlink_rcv_skb+0x130/0x3a0 [<00000000500f8be1>] netlink_unicast+0x4da/0x700 [<00000000666c03b3>] netlink_sendmsg+0x7fe/0xcb0 [<0000000073b28103>] sock_sendmsg+0x143/0x180 [<00000000ad746a30>] ____sys_sendmsg+0x677/0x810 [<0000000087dd98e5>] ___sys_sendmsg+0x105/0x180 [<00000000028dd365>] __sys_sendmsg+0xf0/0x1c0 [<00000000a6bfbae6>] do_syscall_64+0x33/0x40 [<00000000e00521b4>] entry_SYSCALL_64_after_hwframe+0x44/0xa9 It seems ifb and loopback may also hit the leak, so I try to fix this in register_netdevice(). In common case, priv_destructor() will be called in netdev_run_todo() after calling ndo_uninit() in rollback_registered(), on other error path in register_netdevice(), ndo_uninit() and priv_destructor() are called before register_netdevice() return, but in this case, priv_destructor() will never be called, then it causes memory leak, so we should call priv_destructor() here. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> --- net/core/dev.c | 11 +++++++++++ 1 file changed, 11 insertions(+)