diff mbox series

[net-next] bcm63xx_enet: batch process rx path

Message ID 20201204054616.26876-1-liew.s.piaw@gmail.com
State New
Series [net-next] bcm63xx_enet: batch process rx path

Commit Message

Sieng Piaw Liew Dec. 4, 2020, 5:46 a.m. UTC
Use netif_receive_skb_list() to batch-process rx skbs.
Tested on BCM6328 320 MHz using iperf3 -M 512, increasing performance
by 12.5%.

Before:
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-30.00  sec   120 MBytes  33.7 Mbits/sec  277         sender
[  4]   0.00-30.00  sec   120 MBytes  33.5 Mbits/sec            receiver

After:
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-30.00  sec   136 MBytes  37.9 Mbits/sec  203         sender
[  4]   0.00-30.00  sec   135 MBytes  37.7 Mbits/sec            receiver

Signed-off-by: Sieng Piaw Liew <liew.s.piaw@gmail.com>
---
 drivers/net/ethernet/broadcom/bcm63xx_enet.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

Comments

Sieng Piaw Liew Dec. 9, 2020, 3:33 a.m. UTC | #1
On Fri, Dec 04, 2020 at 10:50:45AM +0100, Eric Dumazet wrote:
> On 12/4/20 6:46 AM, Sieng Piaw Liew wrote:
> > Use netif_receive_skb_list to batch process rx skb.
> > Tested on BCM6328 320 MHz using iperf3 -M 512, increasing performance
> > by 12.5%.
> 
> Well, the real question is why you do not simply use GRO,
> to get 100% performance gain or more for TCP flows.
> 
> netif_receive_skb_list() is no longer needed,
> GRO layer already uses batching for non TCP packets.
> 
> We probably should mark it deprecated.
> 
> diff --git a/drivers/net/ethernet/broadcom/bcm63xx_enet.c b/drivers/net/ethernet/broadcom/bcm63xx_enet.c
> index 916824cca3fda194c42fefec7f514ced1a060043..6fdbe231b7c1b27f523889bda8a20ab7eaab65a6 100644
> --- a/drivers/net/ethernet/broadcom/bcm63xx_enet.c
> +++ b/drivers/net/ethernet/broadcom/bcm63xx_enet.c
> @@ -391,7 +391,7 @@ static int bcm_enet_receive_queue(struct net_device *dev, int budget)
>                 skb->protocol = eth_type_trans(skb, dev);
>                 dev->stats.rx_packets++;
>                 dev->stats.rx_bytes += len;
> -               netif_receive_skb(skb);
> +               napi_gro_receive(&priv->napi, skb);
>  
>         } while (--budget > 0);

The bcm63xx router SoC has neither the CPU power nor a hardware
accelerator to validate checksums fast enough for GRO/GSO.

I have tested napi_gro_receive() on a LAN-WAN setup. The resulting
bandwidth dropped from 95 Mbps wire speed down to 80 Mbps, and it was
inconsistent, with swings of more than 5 Mbps.

The ag71xx driver for ath79 router SoC reverted its use for the same
reason.
http://lists.infradead.org/pipermail/lede-commits/2017-October/004864.html

Patch

diff --git a/drivers/net/ethernet/broadcom/bcm63xx_enet.c b/drivers/net/ethernet/broadcom/bcm63xx_enet.c
index 916824cca3fd..b82b7805c36a 100644
--- a/drivers/net/ethernet/broadcom/bcm63xx_enet.c
+++ b/drivers/net/ethernet/broadcom/bcm63xx_enet.c
@@ -297,10 +297,12 @@  static void bcm_enet_refill_rx_timer(struct timer_list *t)
 static int bcm_enet_receive_queue(struct net_device *dev, int budget)
 {
 	struct bcm_enet_priv *priv;
+	struct list_head rx_list;
 	struct device *kdev;
 	int processed;
 
 	priv = netdev_priv(dev);
+	INIT_LIST_HEAD(&rx_list);
 	kdev = &priv->pdev->dev;
 	processed = 0;
 
@@ -391,10 +393,12 @@  static int bcm_enet_receive_queue(struct net_device *dev, int budget)
 		skb->protocol = eth_type_trans(skb, dev);
 		dev->stats.rx_packets++;
 		dev->stats.rx_bytes += len;
-		netif_receive_skb(skb);
+		list_add_tail(&skb->list, &rx_list);
 
 	} while (--budget > 0);
 
+	netif_receive_skb_list(&rx_list);
+
 	if (processed || !priv->rx_desc_count) {
 		bcm_enet_refill_rx(dev);