Message ID | 20201204054616.26876-1-liew.s.piaw@gmail.com |
---|---|
State | New |
Series | [net-next] bcm63xx_enet: batch process rx path |
On Fri, Dec 04, 2020 at 10:50:45AM +0100, Eric Dumazet wrote:
>
> On 12/4/20 6:46 AM, Sieng Piaw Liew wrote:
> > Use netif_receive_skb_list to batch process rx skb.
> > Tested on BCM6328 320 MHz using iperf3 -M 512, increasing performance
> > by 12.5%.
> >
>
> Well, the real question is why you do not simply use GRO,
> to get 100% performance gain or more for TCP flows.
>
> netif_receive_skb_list() is no longer needed,
> GRO layer already uses batching for non TCP packets.
>
> We probably should mark is deprecated.
>
> diff --git a/drivers/net/ethernet/broadcom/bcm63xx_enet.c b/drivers/net/ethernet/broadcom/bcm63xx_enet.c
> index 916824cca3fda194c42fefec7f514ced1a060043..6fdbe231b7c1b27f523889bda8a20ab7eaab65a6 100644
> --- a/drivers/net/ethernet/broadcom/bcm63xx_enet.c
> +++ b/drivers/net/ethernet/broadcom/bcm63xx_enet.c
> @@ -391,7 +391,7 @@ static int bcm_enet_receive_queue(struct net_device *dev, int budget)
>  		skb->protocol = eth_type_trans(skb, dev);
>  		dev->stats.rx_packets++;
>  		dev->stats.rx_bytes += len;
> -		netif_receive_skb(skb);
> +		napi_gro_receive_skb(&priv->napi, skb);
>
>  	} while (--budget > 0);
>

The bcm63xx router SoC does not have enough CPU power nor hardware
accelerator to process checksum validation fast enough for GRO/GSO.

I have tested napi_gro_receive() on LAN-WAN setup. The resulting
bandwidth dropped from 95Mbps wire speed down to 80Mbps. And it's
inconsistent, with spikes and drops of >5Mbps.

The ag71xx driver for ath79 router SoC reverted its use for the same
reason.
http://lists.infradead.org/pipermail/lede-commits/2017-October/004864.html
diff --git a/drivers/net/ethernet/broadcom/bcm63xx_enet.c b/drivers/net/ethernet/broadcom/bcm63xx_enet.c
index 916824cca3fd..b82b7805c36a 100644
--- a/drivers/net/ethernet/broadcom/bcm63xx_enet.c
+++ b/drivers/net/ethernet/broadcom/bcm63xx_enet.c
@@ -297,10 +297,12 @@ static void bcm_enet_refill_rx_timer(struct timer_list *t)
 static int bcm_enet_receive_queue(struct net_device *dev, int budget)
 {
 	struct bcm_enet_priv *priv;
+	struct list_head rx_list;
 	struct device *kdev;
 	int processed;
 
 	priv = netdev_priv(dev);
+	INIT_LIST_HEAD(&rx_list);
 	kdev = &priv->pdev->dev;
 	processed = 0;
 
@@ -391,10 +393,12 @@ static int bcm_enet_receive_queue(struct net_device *dev, int budget)
 		skb->protocol = eth_type_trans(skb, dev);
 		dev->stats.rx_packets++;
 		dev->stats.rx_bytes += len;
-		netif_receive_skb(skb);
+		list_add_tail(&skb->list, &rx_list);
 
 	} while (--budget > 0);
 
+	netif_receive_skb_list(&rx_list);
+
 	if (processed || !priv->rx_desc_count) {
 		bcm_enet_refill_rx(dev);
 
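For readers who want to try the idea outside the kernel, the batching
pattern in the diff above can be sketched in user space. This is only an
illustration under stated assumptions: struct pkt, struct pkt_list,
deliver_one(), deliver_list(), poll_per_packet() and poll_batched() are
hypothetical stand-ins invented here, not the kernel's sk_buff, list_head,
netif_receive_skb() or netif_receive_skb_list(). The sketch only counts how
often the "stack" is entered; it does not model where the real savings come
from.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a received packet (not struct sk_buff). */
struct pkt {
	int len;
	struct pkt *next;
};

/* Hypothetical singly linked queue (not the kernel's struct list_head). */
struct pkt_list {
	struct pkt *head;
	struct pkt **tail;	/* link to fill on the next append */
};

static int stack_entries;	/* times the upper layer was entered */
static long bytes_delivered;	/* payload bytes handed to the upper layer */

static void pkt_list_init(struct pkt_list *l)
{
	l->head = NULL;
	l->tail = &l->head;
}

/* Append in arrival order, playing the role of list_add_tail() in the diff. */
static void pkt_list_add_tail(struct pkt_list *l, struct pkt *p)
{
	p->next = NULL;
	*l->tail = p;
	l->tail = &p->next;
}

/* Per-packet hand-off: the upper layer is entered once per packet. */
static void deliver_one(struct pkt *p)
{
	stack_entries++;
	bytes_delivered += p->len;
}

/* Batched hand-off: the upper layer is entered once for the whole list. */
static void deliver_list(struct pkt_list *l)
{
	struct pkt *p;

	if (!l->head)
		return;
	stack_entries++;
	for (p = l->head; p; p = p->next)
		bytes_delivered += p->len;
	pkt_list_init(l);
}

/* Shape of the receive loop before the patch. */
static int poll_per_packet(struct pkt *ring, int avail, int budget)
{
	int processed = 0;

	assert(budget > 0);
	while (processed < avail && processed < budget)
		deliver_one(&ring[processed++]);
	return processed;
}

/* Shape of the receive loop after the patch: queue first, deliver once. */
static int poll_batched(struct pkt *ring, int avail, int budget)
{
	struct pkt_list rx_list;
	int processed = 0;

	assert(budget > 0);
	pkt_list_init(&rx_list);
	while (processed < avail && processed < budget)
		pkt_list_add_tail(&rx_list, &ring[processed++]);
	deliver_list(&rx_list);
	return processed;
}
```

Calling poll_per_packet() and poll_batched() on the same four-entry ring
delivers the same bytes, but the batched variant enters the upper layer once
instead of four times.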
Use netif_receive_skb_list to batch process rx skb.
Tested on BCM6328 320 MHz using iperf3 -M 512, increasing performance
by 12.5%.

Before:
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-30.00  sec   120 MBytes  33.7 Mbits/sec  277             sender
[  4]   0.00-30.00  sec   120 MBytes  33.5 Mbits/sec                  receiver

After:
[ ID] Interval           Transfer     Bandwidth       Retr
[  4]   0.00-30.00  sec   136 MBytes  37.9 Mbits/sec  203             sender
[  4]   0.00-30.00  sec   135 MBytes  37.7 Mbits/sec                  receiver

Signed-off-by: Sieng Piaw Liew <liew.s.piaw@gmail.com>
---
 drivers/net/ethernet/broadcom/bcm63xx_enet.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)
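As a quick cross-check of the 12.5% figure against the iperf3 numbers
above, the relative change can be computed directly. gain_percent() is a
hypothetical helper written for this note and is not part of the patch.

```c
/* Relative change from `before` to `after`, in percent. */
double gain_percent(double before, double after)
{
	return (after - before) / before * 100.0;
}
```

With the reported bandwidths, gain_percent(33.7, 37.9) is about 12.5 for
the sender and gain_percent(33.5, 37.7) is about 12.5 for the receiver.
Applied to the Retr column, gain_percent(277, 203) is about -26.7, i.e.
roughly a quarter fewer retransmits in the batched run.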