[v6,net-next,03/15] net: procfs: hold netif_lists_lock when retrieving device statistics

Message ID: 20210109172624.2028156-4-olteanv@gmail.com
State: New
Series: [v6,net-next,01/15] net: mark dev_base_lock for deprecation

Commit Message

Vladimir Oltean Jan. 9, 2021, 5:26 p.m. UTC
From: Vladimir Oltean <vladimir.oltean@nxp.com>

In the effort to make .ndo_get_stats64 able to sleep, we need to
ensure that callers of dev_get_stats do not run in atomic context.

The /proc/net/dev file uses an RCU read-side critical section to ensure
the integrity of the list of network interfaces, because it iterates
through all net devices in the netns to show their statistics.

To offer the equivalent protection against an interface registering or
deregistering, while also remaining in sleepable context, we can use the
netns mutex for the interface lists.

Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
Changes in v6:
None.

Changes in v5:
None.

Changes in v4:
None.

Changes in v3:
None.

Changes in v2:
None.

 net/core/net-procfs.c | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

Comments

Saeed Mahameed Jan. 11, 2021, 11:46 p.m. UTC | #1
On Sat, 2021-01-09 at 19:26 +0200, Vladimir Oltean wrote:
> [...]
>  static void *dev_seq_start(struct seq_file *seq, loff_t *pos)
> -	__acquires(RCU)
>  {
> -	rcu_read_lock();
> +	struct net *net = seq_file_net(seq);
> +
> +	netif_lists_lock(net);
> +

This can be very costly: holding a mutex while traversing the whole
netdev list and reading each device's stats means we serialize all
readers. We need to at least allow multiple readers to enter, as was
the case before, so maybe you want to use an rw_semaphore instead of
the mutex.

Or just take the unified approach of RCU + refcnt (dev_hold), as you
did for the bonding and failover patches #13..#14. I used the same
approach to achieve this for sysfs and procfs more than 2 years ago;
you are welcome to use my patches:
https://lore.kernel.org/netdev/4cc44e85-cb5e-502c-30f3-c6ea564fe9ac@gmail.com/


Vladimir Oltean Jan. 12, 2021, 1:44 p.m. UTC | #2
On Mon, Jan 11, 2021 at 03:46:32PM -0800, Saeed Mahameed wrote:
> This can be very costly, holding a mutex while traversing the whole
> netdev lists and reading their stats, we need to at least allow
> multiple readers to enter as it was before, so maybe you want to use
> rw_semaphore instead of the mutex.
>
> or just have a unified approach of rcu+refcnt/dev_hold as you did for
> bonding and failover patches #13..#14, I used the same approach to
> achieve the same for sysfs and procfs more than 2 years ago, you are
> welcome to use my patches:
> https://lore.kernel.org/netdev/4cc44e85-cb5e-502c-30f3-c6ea564fe9ac@gmail.com/

Ok, what mail address do you want me to keep for your sign off?
Saeed Mahameed Jan. 12, 2021, 8:06 p.m. UTC | #3
On Tue, 2021-01-12 at 15:44 +0200, Vladimir Oltean wrote:
> On Mon, Jan 11, 2021 at 03:46:32PM -0800, Saeed Mahameed wrote:
> [...]
>
> Ok, what mail address do you want me to keep for your sign off?


Either is fine. Just make sure author and signed-off are the same :).
Thanks!

Patch

diff --git a/net/core/net-procfs.c b/net/core/net-procfs.c
index c714e6a9dad4..4784703c1e39 100644
--- a/net/core/net-procfs.c
+++ b/net/core/net-procfs.c
@@ -21,7 +21,7 @@ static inline struct net_device *dev_from_same_bucket(struct seq_file *seq, loff
 	unsigned int count = 0, offset = get_offset(*pos);
 
 	h = &net->dev_index_head[get_bucket(*pos)];
-	hlist_for_each_entry_rcu(dev, h, index_hlist) {
+	hlist_for_each_entry(dev, h, index_hlist) {
 		if (++count == offset)
 			return dev;
 	}
@@ -51,9 +51,11 @@ static inline struct net_device *dev_from_bucket(struct seq_file *seq, loff_t *p
  *	in detail.
  */
 static void *dev_seq_start(struct seq_file *seq, loff_t *pos)
-	__acquires(RCU)
 {
-	rcu_read_lock();
+	struct net *net = seq_file_net(seq);
+
+	netif_lists_lock(net);
+
 	if (!*pos)
 		return SEQ_START_TOKEN;
 
@@ -70,9 +72,10 @@ static void *dev_seq_next(struct seq_file *seq, void *v, loff_t *pos)
 }
 
 static void dev_seq_stop(struct seq_file *seq, void *v)
-	__releases(RCU)
 {
-	rcu_read_unlock();
+	struct net *net = seq_file_net(seq);
+
+	netif_lists_unlock(net);
 }
 
 static void dev_seq_printf_stats(struct seq_file *seq, struct net_device *dev)