diff mbox series

[net] net: fix memory leak in register_netdevice() on error path

Message ID 20201126132312.3593725-1-yangyingliang@huawei.com
State New
Headers show
Series [net] net: fix memory leak in register_netdevice() on error path | expand

Commit Message

Yang Yingliang Nov. 26, 2020, 1:23 p.m. UTC
I got a memleak report when doing fault-inject test:

unreferenced object 0xffff88810ace9000 (size 1024):
  comm "ip", pid 4622, jiffies 4295457037 (age 43.378s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<00000000008abe41>] __kmalloc+0x10f/0x210
    [<000000005d3533a6>] veth_dev_init+0x140/0x310
    [<0000000088353c64>] register_netdevice+0x496/0x7a0
    [<000000001324d322>] veth_newlink+0x40b/0x960
    [<00000000d0799866>] __rtnl_newlink+0xd8c/0x1360
    [<00000000d616040a>] rtnl_newlink+0x6b/0xa0
    [<00000000e0a1600d>] rtnetlink_rcv_msg+0x3cc/0x9e0
    [<000000009eeff98b>] netlink_rcv_skb+0x130/0x3a0
    [<00000000500f8be1>] netlink_unicast+0x4da/0x700
    [<00000000666c03b3>] netlink_sendmsg+0x7fe/0xcb0
    [<0000000073b28103>] sock_sendmsg+0x143/0x180
    [<00000000ad746a30>] ____sys_sendmsg+0x677/0x810
    [<0000000087dd98e5>] ___sys_sendmsg+0x105/0x180
    [<00000000028dd365>] __sys_sendmsg+0xf0/0x1c0
    [<00000000a6bfbae6>] do_syscall_64+0x33/0x40
    [<00000000e00521b4>] entry_SYSCALL_64_after_hwframe+0x44/0xa9

It seems ifb and loopback may also hit the leak, so I try to fix this in
register_netdevice().

In common case, priv_destructor() will be called in netdev_run_todo()
after calling ndo_uninit() in rollback_registered(), on other error
path in register_netdevice(), ndo_uninit() and priv_destructor() are
called before register_netdevice() return, but in this case,
priv_destructor() will never be called, then it causes memory leak,
so we should call priv_destructor() here.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
---
 net/core/dev.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

Comments

Toshiaki Makita Nov. 29, 2020, 1:56 p.m. UTC | #1
On 2020/11/26 22:23, Yang Yingliang wrote:
...
> Reported-by: Hulk Robot <hulkci@huawei.com>

> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

> ---

>   net/core/dev.c | 11 +++++++++++

>   1 file changed, 11 insertions(+)

> 

> diff --git a/net/core/dev.c b/net/core/dev.c

> index 82dc6b48e45f..907204395b64 100644

> --- a/net/core/dev.c

> +++ b/net/core/dev.c

> @@ -10000,6 +10000,17 @@ int register_netdevice(struct net_device *dev)

>   	ret = notifier_to_errno(ret);

>   	if (ret) {

>   		rollback_registered(dev);

> +		/*

> +		 * In common case, priv_destructor() will be


As per netdev-faq, the comment style should be

/* foobar blah blah blah
  * another line of text
  */

rather than

/*
  * foobar blah blah blah
  * another line of text
  */

> +		 * called in netdev_run_todo() after calling

> +		 * ndo_uninit() in rollback_registered().

> +		 * But in this case, priv_destructor() will

> +		 * never be called, then it causes memory

> +		 * leak, so we should call priv_destructor()

> +		 * here.

> +		 */

> +		if (dev->priv_destructor)

> +			dev->priv_destructor(dev);


To be in line with netdev_run_todo(), I think priv_destructor() should be
called after "dev->reg_state = NETREG_UNREGISTERED".

Toshiaki Makita

>   		rcu_barrier();

>   

>   		dev->reg_state = NETREG_UNREGISTERED;

>
Stephen Hemminger Nov. 30, 2020, 4:39 a.m. UTC | #2
On Thu, 26 Nov 2020 21:23:12 +0800
Yang Yingliang <yangyingliang@huawei.com> wrote:

> I got a memleak report when doing fault-inject test:

> 

> unreferenced object 0xffff88810ace9000 (size 1024):

>   comm "ip", pid 4622, jiffies 4295457037 (age 43.378s)

>   hex dump (first 32 bytes):

>     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................

>     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................

>   backtrace:

>     [<00000000008abe41>] __kmalloc+0x10f/0x210

>     [<000000005d3533a6>] veth_dev_init+0x140/0x310

>     [<0000000088353c64>] register_netdevice+0x496/0x7a0

>     [<000000001324d322>] veth_newlink+0x40b/0x960

>     [<00000000d0799866>] __rtnl_newlink+0xd8c/0x1360

>     [<00000000d616040a>] rtnl_newlink+0x6b/0xa0

>     [<00000000e0a1600d>] rtnetlink_rcv_msg+0x3cc/0x9e0

>     [<000000009eeff98b>] netlink_rcv_skb+0x130/0x3a0

>     [<00000000500f8be1>] netlink_unicast+0x4da/0x700

>     [<00000000666c03b3>] netlink_sendmsg+0x7fe/0xcb0

>     [<0000000073b28103>] sock_sendmsg+0x143/0x180

>     [<00000000ad746a30>] ____sys_sendmsg+0x677/0x810

>     [<0000000087dd98e5>] ___sys_sendmsg+0x105/0x180

>     [<00000000028dd365>] __sys_sendmsg+0xf0/0x1c0

>     [<00000000a6bfbae6>] do_syscall_64+0x33/0x40

>     [<00000000e00521b4>] entry_SYSCALL_64_after_hwframe+0x44/0xa9

> 

> It seems ifb and loopback may also hit the leak, so I try to fix this in

> register_netdevice().

> 

> In common case, priv_destructor() will be called in netdev_run_todo()

> after calling ndo_uninit() in rollback_registered(), on other error

> path in register_netdevice(), ndo_uninit() and priv_destructor() are

> called before register_netdevice() return, but in this case,

> priv_destructor() will never be called, then it causes memory leak,

> so we should call priv_destructor() here.

> 

> Reported-by: Hulk Robot <hulkci@huawei.com>

> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

> ---

>  net/core/dev.c | 11 +++++++++++

>  1 file changed, 11 insertions(+)

> 

> diff --git a/net/core/dev.c b/net/core/dev.c

> index 82dc6b48e45f..907204395b64 100644

> --- a/net/core/dev.c

> +++ b/net/core/dev.c

> @@ -10000,6 +10000,17 @@ int register_netdevice(struct net_device *dev)

>  	ret = notifier_to_errno(ret);

>  	if (ret) {

>  		rollback_registered(dev);

> +		/*

> +		 * In common case, priv_destructor() will be

> +		 * called in netdev_run_todo() after calling

> +		 * ndo_uninit() in rollback_registered().

> +		 * But in this case, priv_destructor() will

> +		 * never be called, then it causes memory

> +		 * leak, so we should call priv_destructor()

> +		 * here.

> +		 */

> +		if (dev->priv_destructor)

> +			dev->priv_destructor(dev);


Are you sure this is safe?
Several devices have destructors that call free_netdev.
Up until now a common pattern for those devices was to call
free_netdev on error. After this change it would lead to double free.
Yang Yingliang Nov. 30, 2020, 11:12 a.m. UTC | #3
On 2020/11/30 12:39, Stephen Hemminger wrote:
> On Thu, 26 Nov 2020 21:23:12 +0800

> Yang Yingliang <yangyingliang@huawei.com> wrote:

>

>> I got a memleak report when doing fault-inject test:

>>

>> unreferenced object 0xffff88810ace9000 (size 1024):

>>    comm "ip", pid 4622, jiffies 4295457037 (age 43.378s)

>>    hex dump (first 32 bytes):

>>      00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................

>>      00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................

>>    backtrace:

>>      [<00000000008abe41>] __kmalloc+0x10f/0x210

>>      [<000000005d3533a6>] veth_dev_init+0x140/0x310

>>      [<0000000088353c64>] register_netdevice+0x496/0x7a0

>>      [<000000001324d322>] veth_newlink+0x40b/0x960

>>      [<00000000d0799866>] __rtnl_newlink+0xd8c/0x1360

>>      [<00000000d616040a>] rtnl_newlink+0x6b/0xa0

>>      [<00000000e0a1600d>] rtnetlink_rcv_msg+0x3cc/0x9e0

>>      [<000000009eeff98b>] netlink_rcv_skb+0x130/0x3a0

>>      [<00000000500f8be1>] netlink_unicast+0x4da/0x700

>>      [<00000000666c03b3>] netlink_sendmsg+0x7fe/0xcb0

>>      [<0000000073b28103>] sock_sendmsg+0x143/0x180

>>      [<00000000ad746a30>] ____sys_sendmsg+0x677/0x810

>>      [<0000000087dd98e5>] ___sys_sendmsg+0x105/0x180

>>      [<00000000028dd365>] __sys_sendmsg+0xf0/0x1c0

>>      [<00000000a6bfbae6>] do_syscall_64+0x33/0x40

>>      [<00000000e00521b4>] entry_SYSCALL_64_after_hwframe+0x44/0xa9

>>

>> It seems ifb and loopback may also hit the leak, so I try to fix this in

>> register_netdevice().

>>

>> In common case, priv_destructor() will be called in netdev_run_todo()

>> after calling ndo_uninit() in rollback_registered(), on other error

>> path in register_netdevice(), ndo_uninit() and priv_destructor() are

>> called before register_netdevice() return, but in this case,

>> priv_destructor() will never be called, then it causes memory leak,

>> so we should call priv_destructor() here.

>>

>> Reported-by: Hulk Robot <hulkci@huawei.com>

>> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

>> ---

>>   net/core/dev.c | 11 +++++++++++

>>   1 file changed, 11 insertions(+)

>>

>> diff --git a/net/core/dev.c b/net/core/dev.c

>> index 82dc6b48e45f..907204395b64 100644

>> --- a/net/core/dev.c

>> +++ b/net/core/dev.c

>> @@ -10000,6 +10000,17 @@ int register_netdevice(struct net_device *dev)

>>   	ret = notifier_to_errno(ret);

>>   	if (ret) {

>>   		rollback_registered(dev);

>> +		/*

>> +		 * In common case, priv_destructor() will be

>> +		 * called in netdev_run_todo() after calling

>> +		 * ndo_uninit() in rollback_registered().

>> +		 * But in this case, priv_destructor() will

>> +		 * never be called, then it causes memory

>> +		 * leak, so we should call priv_destructor()

>> +		 * here.

>> +		 */

>> +		if (dev->priv_destructor)

>> +			dev->priv_destructor(dev);

> Are you sure this is safe?

> Several devices have destructors that call free_netdev.

> Up until now a common pattern for those devices was to call

> free_netdev on error. After this change it would lead to double free.


After commit cf124db566e6 ("net: Fix inconsistent teardown and release 
of private netdev state."),

free_netdev() is not be called in priv_destructor().

>

> .
Yang Yingliang Nov. 30, 2020, 11:13 a.m. UTC | #4
On 2020/11/29 21:56, Toshiaki Makita wrote:
> On 2020/11/26 22:23, Yang Yingliang wrote:

> ...

>> Reported-by: Hulk Robot <hulkci@huawei.com>

>> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

>> ---

>>   net/core/dev.c | 11 +++++++++++

>>   1 file changed, 11 insertions(+)

>>

>> diff --git a/net/core/dev.c b/net/core/dev.c

>> index 82dc6b48e45f..907204395b64 100644

>> --- a/net/core/dev.c

>> +++ b/net/core/dev.c

>> @@ -10000,6 +10000,17 @@ int register_netdevice(struct net_device *dev)

>>       ret = notifier_to_errno(ret);

>>       if (ret) {

>>           rollback_registered(dev);

>> +        /*

>> +         * In common case, priv_destructor() will be

>

> As per netdev-faq, the comment style should be

>

> /* foobar blah blah blah

>  * another line of text

>  */

>

> rather than

>

> /*

>  * foobar blah blah blah

>  * another line of text

>  */

>

>> +         * called in netdev_run_todo() after calling

>> +         * ndo_uninit() in rollback_registered().

>> +         * But in this case, priv_destructor() will

>> +         * never be called, then it causes memory

>> +         * leak, so we should call priv_destructor()

>> +         * here.

>> +         */

>> +        if (dev->priv_destructor)

>> +            dev->priv_destructor(dev);

>

> To be in line with netdev_run_todo(), I think priv_destructor() should be

> called after "dev->reg_state = NETREG_UNREGISTERED".

OK,  I will send a v2 later.
>

> Toshiaki Makita

>

>>           rcu_barrier();

>>             dev->reg_state = NETREG_UNREGISTERED;

>>
diff mbox series

Patch

diff --git a/net/core/dev.c b/net/core/dev.c
index 82dc6b48e45f..907204395b64 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -10000,6 +10000,17 @@  int register_netdevice(struct net_device *dev)
 	ret = notifier_to_errno(ret);
 	if (ret) {
 		rollback_registered(dev);
+		/*
+		 * In common case, priv_destructor() will be
+		 * called in netdev_run_todo() after calling
+		 * ndo_uninit() in rollback_registered().
+		 * But in this case, priv_destructor() will
+		 * never be called, then it causes memory
+		 * leak, so we should call priv_destructor()
+		 * here.
+		 */
+		if (dev->priv_destructor)
+			dev->priv_destructor(dev);
 		rcu_barrier();
 
 		dev->reg_state = NETREG_UNREGISTERED;