[bpf-next] cpumap: bulk skb using netif_receive_skb_list

Message ID e01b1a562c523f64049fa45da6c031b0749ca412.1617267115.git.lorenzo@kernel.org
State New

Commit Message

Lorenzo Bianconi April 1, 2021, 8:56 a.m. UTC
Rely on netif_receive_skb_list routine to send skbs converted from
xdp_frames in cpu_map_kthread_run in order to improve i-cache usage.
The proposed patch has been tested running xdp_redirect_cpu bpf sample
available in the kernel tree that is used to redirect UDP frames from
ixgbe driver to a cpumap entry and then to the networking stack.
UDP frames are generated using pkt_gen.

$xdp_redirect_cpu  --cpu <cpu> --progname xdp_cpu_map0 --dev <eth>

bpf-next: ~2.2Mpps
bpf-next + cpumap skb-list: ~3.15Mpps

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
---
 kernel/bpf/cpumap.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

Comments

Song Liu April 1, 2021, 4:40 p.m. UTC | #1
On Thu, Apr 1, 2021 at 1:57 AM Lorenzo Bianconi <lorenzo@kernel.org> wrote:
>
> Rely on netif_receive_skb_list routine to send skbs converted from
> xdp_frames in cpu_map_kthread_run in order to improve i-cache usage.
> The proposed patch has been tested running xdp_redirect_cpu bpf sample
> available in the kernel tree that is used to redirect UDP frames from
> ixgbe driver to a cpumap entry and then to the networking stack.
> UDP frames are generated using pkt_gen.
>
> $xdp_redirect_cpu  --cpu <cpu> --progname xdp_cpu_map0 --dev <eth>
>
> bpf-next: ~2.2Mpps
> bpf-next + cpumap skb-list: ~3.15Mpps
>
> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
> ---
>  kernel/bpf/cpumap.c | 9 ++++-----
>  1 file changed, 4 insertions(+), 5 deletions(-)
>
> diff --git a/kernel/bpf/cpumap.c b/kernel/bpf/cpumap.c
> index 0cf2791d5099..b33114ce2e2b 100644
> --- a/kernel/bpf/cpumap.c
> +++ b/kernel/bpf/cpumap.c
> @@ -257,6 +257,7 @@ static int cpu_map_kthread_run(void *data)
>                 void *frames[CPUMAP_BATCH];
>                 void *skbs[CPUMAP_BATCH];
>                 int i, n, m, nframes;
> +               LIST_HEAD(list);
>
>                 /* Release CPU reschedule checks */
>                 if (__ptr_ring_empty(rcpu->queue)) {
> @@ -305,7 +306,6 @@ static int cpu_map_kthread_run(void *data)
>                 for (i = 0; i < nframes; i++) {
>                         struct xdp_frame *xdpf = frames[i];
>                         struct sk_buff *skb = skbs[i];
> -                       int ret;
>
>                         skb = __xdp_build_skb_from_frame(xdpf, skb,
>                                                          xdpf->dev_rx);
> @@ -314,11 +314,10 @@ static int cpu_map_kthread_run(void *data)
>                                 continue;
>                         }
>
> -                       /* Inject into network stack */
> -                       ret = netif_receive_skb_core(skb);
> -                       if (ret == NET_RX_DROP)
> -                               drops++;

I guess we stop tracking "drops" with this patch?

Thanks,
Song

> +                       list_add_tail(&skb->list, &list);
>                 }
> +               netif_receive_skb_list(&list);
> +
>                 /* Feedback loop via tracepoint */
>                 trace_xdp_cpumap_kthread(rcpu->map_id, n, drops, sched, &stats);
>
> --
> 2.30.2
>
Lorenzo Bianconi April 1, 2021, 4:49 p.m. UTC | #2
> On Thu, Apr 1, 2021 at 1:57 AM Lorenzo Bianconi <lorenzo@kernel.org> wrote:
> >

[...]

> > -                       /* Inject into network stack */
> > -                       ret = netif_receive_skb_core(skb);
> > -                       if (ret == NET_RX_DROP)
> > -                               drops++;
> 
> I guess we stop tracking "drops" with this patch?
> 
> Thanks,
> Song

Hi Song,

we do not report the packets dropped by the stack but we still count the drops
in the cpumap. If you think they are really important I guess we can change
return value of netif_receive_skb_list returning the dropped packets or
similar. What do you think?

Regards,
Lorenzo

> 
> > +                       list_add_tail(&skb->list, &list);
> >                 }
> > +               netif_receive_skb_list(&list);
> > +
> >                 /* Feedback loop via tracepoint */
> >                 trace_xdp_cpumap_kthread(rcpu->map_id, n, drops, sched, &stats);
> >
> > --
> > 2.30.2
> >
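
The idea floated above, having the list-based receive report how many packets were dropped, can be sketched in userspace as follows. This is only a model under stated assumptions: struct pkt, receive_one(), count_drops_per_packet() and receive_list_count_drops() are hypothetical names, not kernel APIs, and the real netif_receive_skb_list() returns void. The RX_SUCCESS/RX_DROP values mirror the kernel's NET_RX_SUCCESS/NET_RX_DROP convention.

```c
#include <assert.h>
#include <stddef.h>

/* Mirrors the kernel's NET_RX_SUCCESS / NET_RX_DROP values. */
enum { RX_SUCCESS = 0, RX_DROP = 1 };

/* Hypothetical packet; a singly linked batch is enough for this sketch. */
struct pkt {
	struct pkt *next;
	int len;
};

/* Hypothetical per-packet receive: treats empty packets as dropped. */
static int receive_one(const struct pkt *p)
{
	assert(p != NULL);
	return p->len > 0 ? RX_SUCCESS : RX_DROP;
}

/* Old shape: the caller delivers one packet at a time and counts drops. */
static int count_drops_per_packet(const struct pkt *head)
{
	int drops = 0;

	for (; head; head = head->next) {
		int ret = receive_one(head);

		if (ret == RX_DROP)
			drops++;
	}
	return drops;
}

/*
 * Suggested shape: one call takes the whole batch and returns the number
 * of packets dropped. The per-packet verdict check now lives inside the
 * list walk itself.
 */
static int receive_list_count_drops(const struct pkt *head)
{
	int drops = 0;

	for (; head; head = head->next)
		drops += (receive_one(head) == RX_DROP);
	return drops;
}
```

Both shapes report the same count for the same batch; the difference is only where the per-packet check sits, which is the cost being weighed in this thread.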
Lorenzo Bianconi April 8, 2021, 10:10 a.m. UTC | #3
> On Thu, Apr 1, 2021 at 9:49 AM Lorenzo Bianconi <lorenzo@kernel.org> wrote:
> >
> > > On Thu, Apr 1, 2021 at 1:57 AM Lorenzo Bianconi <lorenzo@kernel.org> wrote:
> > > >
> >
> > [...]
> >
> > > > -                       /* Inject into network stack */
> > > > -                       ret = netif_receive_skb_core(skb);
> > > > -                       if (ret == NET_RX_DROP)
> > > > -                               drops++;
> > >
> > > I guess we stop tracking "drops" with this patch?
> > >
> > > Thanks,
> > > Song
> >
> > Hi Song,
> >
> > we do not report the packets dropped by the stack but we still count the drops
> > in the cpumap. If you think they are really important I guess we can change
> > return value of netif_receive_skb_list returning the dropped packets or
> > similar. What do you think?
>
> I think we shouldn't silently change the behavior of the tracepoint below:
>
> trace_xdp_cpumap_kthread(rcpu->map_id, n, drops, sched, &stats);
>
> Returning dropped packets from netif_receive_skb_list() sounds good to me.

Hi Song,

I reviewed the netif_receive_skb_list() and I guess the code needed to count
number of dropped frames is a bit intrusive and we need to add some checks
in the hot path.
Moreover the dropped frames are already accounted in the networking stack
(e.g. mib counters for the ip traffic).
Since drop counter is just exported in a tracepoint in cpu_map_kthread_run,
I guess we can just not count dropped packets in the networking stack here
and rely on the mib counters. What do you think?

@Jesper: since you added the original code, what do you think about it?

Regards,
Lorenzo

>
> Thanks,
> Song
>
Jesper Dangaard Brouer April 8, 2021, 6:20 p.m. UTC | #4
On Thu, 8 Apr 2021 12:10:19 +0200
Lorenzo Bianconi <lorenzo.bianconi@redhat.com> wrote:

> > On Thu, Apr 1, 2021 at 9:49 AM Lorenzo Bianconi <lorenzo@kernel.org> wrote:
> > >
> > > > On Thu, Apr 1, 2021 at 1:57 AM Lorenzo Bianconi <lorenzo@kernel.org> wrote:
> > > > >
> > >
> > > [...]
> > >
> > > > > -                       /* Inject into network stack */
> > > > > -                       ret = netif_receive_skb_core(skb);
> > > > > -                       if (ret == NET_RX_DROP)
> > > > > -                               drops++;
> > > >
> > > > I guess we stop tracking "drops" with this patch?
> > > >
> > > > Thanks,
> > > > Song
> > >
> > > Hi Song,
> > >
> > > we do not report the packets dropped by the stack but we still count the drops
> > > in the cpumap. If you think they are really important I guess we can change
> > > return value of netif_receive_skb_list returning the dropped packets or
> > > similar. What do you think?
> >
> > I think we shouldn't silently change the behavior of the tracepoint below:
> >
> > trace_xdp_cpumap_kthread(rcpu->map_id, n, drops, sched, &stats);
> >
> > Returning dropped packets from netif_receive_skb_list() sounds good to me.
>
> Hi Song,
>
> I reviewed the netif_receive_skb_list() and I guess the code needed to count
> number of dropped frames is a bit intrusive and we need to add some checks
> in the hot path.
> Moreover the dropped frames are already accounted in the networking stack
> (e.g. mib counters for the ip traffic).
> Since drop counter is just exported in a tracepoint in cpu_map_kthread_run,
> I guess we can just not count dropped packets in the networking stack here
> and rely on the mib counters. What do you think?
>
> @Jesper: since you added the original code, what do you think about it?

I'm actually fine with not counting the drops in the tracepoint.

As you say, it is already accounted elsewhere, in the MIB counters for the
network stack.  That is actually better, as having this drop counter
in the tracepoint has confused people before (when using xdp_redirect_cpu).
If they instead looked at the MIB counters, it should be easier to
understand why the drop happens.

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer
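
For readers who want to check the MIB counters mentioned above: on Linux they are exported, among other places, in /proc/net/snmp as pairs of lines, one listing field names and one listing values, both starting with the same prefix such as "Ip:". The helper below is a minimal sketch of looking up one field in such a pair. The function names are hypothetical, and which field best explains a given drop depends on where the stack discarded the packet; the thread does not single one out.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Skip separators, then return the next token and its length (or NULL). */
static const char *next_token(const char **s, size_t *len)
{
	const char *start;

	*s += strspn(*s, " \n");
	start = *s;
	*len = strcspn(start, " \n");
	*s = start + *len;
	return *len ? start : NULL;
}

/*
 * Look up @field in a "Prefix: name name ..." / "Prefix: value value ..."
 * line pair. Returns 0 and stores the value in @out, or -1 if the prefixes
 * differ or the field is not present.
 */
static int mib_lookup(const char *names, const char *values,
		      const char *field, long long *out)
{
	size_t nlen, vlen, flen = strlen(field);
	const char *n, *v;

	assert(out != NULL);

	/* The first token on both lines is the shared prefix, e.g. "Ip:". */
	n = next_token(&names, &nlen);
	v = next_token(&values, &vlen);
	if (!n || !v || nlen != vlen || strncmp(n, v, nlen) != 0)
		return -1;

	/* Walk both lines in lockstep: the i-th name labels the i-th value. */
	while ((n = next_token(&names, &nlen)) != NULL &&
	       (v = next_token(&values, &vlen)) != NULL) {
		if (nlen == flen && strncmp(n, field, flen) == 0) {
			*out = strtoll(v, NULL, 10);
			return 0;
		}
	}
	return -1;
}
```

A caller would read two consecutive lines from /proc/net/snmp with fgets() and pass them in, for example asking for "InDiscards" on the "Ip:" pair.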
Patch

diff --git a/kernel/bpf/cpumap.c b/kernel/bpf/cpumap.c
index 0cf2791d5099..b33114ce2e2b 100644
--- a/kernel/bpf/cpumap.c
+++ b/kernel/bpf/cpumap.c
@@ -257,6 +257,7 @@  static int cpu_map_kthread_run(void *data)
 		void *frames[CPUMAP_BATCH];
 		void *skbs[CPUMAP_BATCH];
 		int i, n, m, nframes;
+		LIST_HEAD(list);
 
 		/* Release CPU reschedule checks */
 		if (__ptr_ring_empty(rcpu->queue)) {
@@ -305,7 +306,6 @@  static int cpu_map_kthread_run(void *data)
 		for (i = 0; i < nframes; i++) {
 			struct xdp_frame *xdpf = frames[i];
 			struct sk_buff *skb = skbs[i];
-			int ret;
 
 			skb = __xdp_build_skb_from_frame(xdpf, skb,
 							 xdpf->dev_rx);
@@ -314,11 +314,10 @@  static int cpu_map_kthread_run(void *data)
 				continue;
 			}
 
-			/* Inject into network stack */
-			ret = netif_receive_skb_core(skb);
-			if (ret == NET_RX_DROP)
-				drops++;
+			list_add_tail(&skb->list, &list);
 		}
+		netif_receive_skb_list(&list);
+
 		/* Feedback loop via tracepoint */
 		trace_xdp_cpumap_kthread(rcpu->map_id, n, drops, sched, &stats);
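
The control-flow change in the patch (queue each skb on a local list inside the loop, then hand the whole list to the stack with one call) can be modelled in userspace as below. This is a sketch, not kernel code: struct list_node, list_append(), struct pkt, receive_list() and run_batch() are hypothetical stand-ins for the kernel's list_head, list_add_tail(), sk_buff, netif_receive_skb_list() and the loop in cpu_map_kthread_run(). The built[] flags model whether the skb build succeeded for a frame.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's circular doubly linked list. */
struct list_node {
	struct list_node *next, *prev;
};

static void list_init(struct list_node *head)
{
	head->next = head;
	head->prev = head;
}

/* Link @node just before @head, i.e. at the tail of the list. */
static void list_append(struct list_node *node, struct list_node *head)
{
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

/* Hypothetical packet; only the embedded list node matters here. */
struct pkt {
	struct list_node list;
	int id;
};

#define pkt_of(ptr) \
	((struct pkt *)((char *)(ptr) - offsetof(struct pkt, list)))

/* One call consumes the whole batch in order; returns packets seen. */
static int receive_list(struct list_node *head, int *order, int max)
{
	struct list_node *pos;
	int n = 0;

	for (pos = head->next; pos != head; pos = pos->next) {
		if (n < max)
			order[n] = pkt_of(pos)->id;
		n++;
	}
	list_init(head);	/* the batch is consumed */
	return n;
}

/* Model of the patched loop: skip failed builds, queue the rest, deliver once. */
static int run_batch(struct pkt *pkts, const int *built, int nframes,
		     int *order)
{
	struct list_node batch;
	int i;

	assert(nframes >= 0);
	list_init(&batch);
	for (i = 0; i < nframes; i++) {
		if (!built[i])
			continue;
		list_append(&pkts[i].list, &batch);
	}
	return receive_list(&batch, order, nframes);
}
```

Appending at the tail keeps the packets in arrival order, and the single receive_list() call is where the batching happens: the per-packet work inside the loop shrinks to a list append.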