Message ID | a14a30d3c06fff24e13f836c733d80efc0bd6eb5.1611957532.git.lorenzo@kernel.org
---|---
State | New
Series | [v3,bpf-next] net: veth: alloc skb in bulk for ndo_xdp_xmit
On 2021/01/30 7:04, Lorenzo Bianconi wrote:
> Split ndo_xdp_xmit and ndo_start_xmit use cases in veth_xdp_rcv routine
> in order to alloc skbs in bulk for XDP_PASS verdict.
> Introduce xdp_alloc_skb_bulk utility routine to alloc skb bulk list.
> The proposed approach has been tested in the following scenario:
>
> eth (ixgbe) --> XDP_REDIRECT --> veth0 --> (remote-ns) veth1 --> XDP_PASS
>
> XDP_REDIRECT: xdp_redirect_map bpf sample
> XDP_PASS: xdp_rxq_info bpf sample
>
> traffic generator: pkt_gen sending udp traffic on a remote device
>
> bpf-next master: ~3.64Mpps
> bpf-next + skb bulking allocation: ~3.79Mpps
>
> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>

Reviewed-by: Toshiaki Makita <toshiaki.makita1@gmail.com>

Thanks,
Toshiaki Makita
On Fri, 29 Jan 2021 23:04:08 +0100
Lorenzo Bianconi <lorenzo@kernel.org> wrote:

> Split ndo_xdp_xmit and ndo_start_xmit use cases in veth_xdp_rcv routine
> in order to alloc skbs in bulk for XDP_PASS verdict.
> Introduce xdp_alloc_skb_bulk utility routine to alloc skb bulk list.
> The proposed approach has been tested in the following scenario:
>
> eth (ixgbe) --> XDP_REDIRECT --> veth0 --> (remote-ns) veth1 --> XDP_PASS
>
> XDP_REDIRECT: xdp_redirect_map bpf sample
> XDP_PASS: xdp_rxq_info bpf sample
>
> traffic generator: pkt_gen sending udp traffic on a remote device
>
> bpf-next master: ~3.64Mpps
> bpf-next + skb bulking allocation: ~3.79Mpps
>
> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
> ---

I wanted Lorenzo to test 8 vs 16 bulking, but after much testing and IRC
dialog, we cannot find and measure any difference with enough accuracy.
Thus:

Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>

> Changes since v2:
> - use __GFP_ZERO flag instead of memset
> - move some veth_xdp_rcv_batch() logic in veth_xdp_rcv_skb()
>
> Changes since v1:
> - drop patch 2/3, squash patch 1/3 and 3/3
> - set VETH_XDP_BATCH to 16
> - rework veth_xdp_rcv to use __ptr_ring_consume
> ---
>  drivers/net/veth.c | 78 ++++++++++++++++++++++++++++++++++------------
>  include/net/xdp.h  |  1 +
>  net/core/xdp.c     | 11 +++++++
>  3 files changed, 70 insertions(+), 20 deletions(-)
>
> diff --git a/drivers/net/veth.c b/drivers/net/veth.c
> index 6e03b619c93c..aa1a66ad2ce5 100644
> --- a/drivers/net/veth.c
> +++ b/drivers/net/veth.c
> @@ -35,6 +35,7 @@
>  #define VETH_XDP_HEADROOM	(XDP_PACKET_HEADROOM + NET_IP_ALIGN)
>
>  #define VETH_XDP_TX_BULK_SIZE	16
> +#define VETH_XDP_BATCH		16

-- 
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
LinkedIn: http://www.linkedin.com/in/brouer
Hello:

This patch was applied to bpf/bpf-next.git (refs/heads/master):

On Fri, 29 Jan 2021 23:04:08 +0100 you wrote:
> Split ndo_xdp_xmit and ndo_start_xmit use cases in veth_xdp_rcv routine
> in order to alloc skbs in bulk for XDP_PASS verdict.
> Introduce xdp_alloc_skb_bulk utility routine to alloc skb bulk list.
> The proposed approach has been tested in the following scenario:
>
> eth (ixgbe) --> XDP_REDIRECT --> veth0 --> (remote-ns) veth1 --> XDP_PASS
>
> [...]

Here is the summary with links:
  - [v3,bpf-next] net: veth: alloc skb in bulk for ndo_xdp_xmit
    https://git.kernel.org/bpf/bpf-next/c/65e6dcf73398

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
On 1/29/21 11:04 PM, Lorenzo Bianconi wrote:
> Split ndo_xdp_xmit and ndo_start_xmit use cases in veth_xdp_rcv routine
> in order to alloc skbs in bulk for XDP_PASS verdict.
> Introduce xdp_alloc_skb_bulk utility routine to alloc skb bulk list.
> The proposed approach has been tested in the following scenario:
[...]
> diff --git a/net/core/xdp.c b/net/core/xdp.c
> index 0d2630a35c3e..05354976c1fc 100644
> --- a/net/core/xdp.c
> +++ b/net/core/xdp.c
> @@ -514,6 +514,17 @@ void xdp_warn(const char *msg, const char *func, const int line)
>  };
>  EXPORT_SYMBOL_GPL(xdp_warn);
>
> +int xdp_alloc_skb_bulk(void **skbs, int n_skb, gfp_t gfp)
> +{
> +	n_skb = kmem_cache_alloc_bulk(skbuff_head_cache, gfp,
> +				      n_skb, skbs);

Applied, but one question I was wondering about when reading the
kmem_cache_alloc_bulk() code was whether it would be safer to simply test
for kmem_cache_alloc_bulk() != n_skb given it could potentially in future
also alloc less objs than requested, but I presume if such extension would
get implemented then call-sites might need to indicate 'best effort'
somehow via flag instead (to handle < n_skb case). Either way all current
callers assume for != 0 that everything went well, so lgtm.

> +	if (unlikely(!n_skb))
> +		return -ENOMEM;
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(xdp_alloc_skb_bulk);
> +
>  struct sk_buff *__xdp_build_skb_from_frame(struct xdp_frame *xdpf,
>  					   struct sk_buff *skb,
>  					   struct net_device *dev)
On Thu, 4 Feb 2021 01:14:56 +0100
Daniel Borkmann <daniel@iogearbox.net> wrote:

> On 1/29/21 11:04 PM, Lorenzo Bianconi wrote:
> > Split ndo_xdp_xmit and ndo_start_xmit use cases in veth_xdp_rcv routine
> > in order to alloc skbs in bulk for XDP_PASS verdict.
> > Introduce xdp_alloc_skb_bulk utility routine to alloc skb bulk list.
> > The proposed approach has been tested in the following scenario:
> [...]
> > diff --git a/net/core/xdp.c b/net/core/xdp.c
> > index 0d2630a35c3e..05354976c1fc 100644
> > --- a/net/core/xdp.c
> > +++ b/net/core/xdp.c
> > @@ -514,6 +514,17 @@ void xdp_warn(const char *msg, const char *func, const int line)
> >  };
> >  EXPORT_SYMBOL_GPL(xdp_warn);
> >
> > +int xdp_alloc_skb_bulk(void **skbs, int n_skb, gfp_t gfp)
> > +{
> > +	n_skb = kmem_cache_alloc_bulk(skbuff_head_cache, gfp,
> > +				      n_skb, skbs);
>
> Applied, but one question I was wondering about when reading the
> kmem_cache_alloc_bulk() code was whether it would be safer to simply test
> for kmem_cache_alloc_bulk() != n_skb given it could potentially in future
> also alloc less objs than requested, but I presume if such extension would
> get implemented then call-sites might need to indicate 'best effort'
> somehow via flag instead (to handle < n_skb case). Either way all current
> callers assume for != 0 that everything went well, so lgtm.

It was Andrew (AKPM) that wanted the API to either return the requested
number of objects or fail. I respected the MM-maintainers' request at that
point, even though I wanted the other API as there is a small performance
advantage (not crossing page boundary in SLUB).

At that time we discussed it on MM-list, and I see his/the point: if the
API can allocate less objs than requested, then think about how this
complicates the surrounding code. E.g. in this specific code we already
have VETH_XDP_BATCH(16) xdp_frame objects, which we need to get 16 SKB
objects for. What should the code do if it cannot get 16 SKBs(?).
-- 
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
LinkedIn: http://www.linkedin.com/in/brouer
On 2/4/21 10:05 AM, Jesper Dangaard Brouer wrote:
[...]
> It was Andrew (AKPM) that wanted the API to either return the requested
> number of objects or fail. I respected the MM-maintainers' request at
> that point, even though I wanted the other API as there is a small
> performance advantage (not crossing page boundary in SLUB).
>
> At that time we discussed it on MM-list, and I see his/the point: if the
> API can allocate less objs than requested, then think about how this
> complicates the surrounding code. E.g. in this specific code we already
> have VETH_XDP_BATCH(16) xdp_frame objects, which we need to get 16 SKB
> objects for. What should the code do if it cannot get 16 SKBs(?).

Right, I mentioned the error handling complications above wrt the < n_skb
case. I think if this ever gets implemented and there's a need, it would
probably be best to add a new flag like __GFP_BULK_BEST_EFFORT to indicate
that it would be okay to return x elements with x being in (0, size], so
that only those callers need to deal with this, and all others can expect
[as today] that != 0 means all #size elements were bulk alloc'ed.

Thanks,
Daniel
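The all-or-nothing contract discussed in this sub-thread can be sketched in
user-space C. This is a toy stand-in for kmem_cache_alloc_bulk(), not kernel
code (the function name bulk_alloc_all_or_nothing is made up); it shows why
callers such as xdp_alloc_skb_bulk() only need a success/failure check rather
than a partial-count path:

```c
#include <stdlib.h>

/* Toy model of the all-or-nothing bulk-alloc contract: either every
 * slot in objs[] is filled, or the partial batch is unwound and the
 * call fails as a whole, so no caller ever sees 0 < got < n. */
static int bulk_alloc_all_or_nothing(void **objs, int n, size_t size)
{
	int i;

	for (i = 0; i < n; i++) {
		objs[i] = calloc(1, size);	/* zeroed, like __GFP_ZERO */
		if (!objs[i]) {
			while (i--)		/* unwind the partial batch */
				free(objs[i]);
			return -1;		/* -ENOMEM in the kernel */
		}
	}
	return 0;				/* all n objects are valid */
}
```

Under this contract a caller like veth_xdp_rcv_bulk_skb() has exactly two
paths: drop the whole batch of xdp_frames on failure, or consume all n
objects on success. A best-effort variant (such as the hypothetical
__GFP_BULK_BEST_EFFORT flag floated above) would instead push the < n case
onto every call-site.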
diff --git a/drivers/net/veth.c b/drivers/net/veth.c
index 6e03b619c93c..aa1a66ad2ce5 100644
--- a/drivers/net/veth.c
+++ b/drivers/net/veth.c
@@ -35,6 +35,7 @@
 #define VETH_XDP_HEADROOM	(XDP_PACKET_HEADROOM + NET_IP_ALIGN)
 
 #define VETH_XDP_TX_BULK_SIZE	16
+#define VETH_XDP_BATCH		16
 
 struct veth_stats {
 	u64	rx_drops;
@@ -562,14 +563,13 @@ static int veth_xdp_tx(struct veth_rq *rq, struct xdp_buff *xdp,
 	return 0;
 }
 
-static struct sk_buff *veth_xdp_rcv_one(struct veth_rq *rq,
-					struct xdp_frame *frame,
-					struct veth_xdp_tx_bq *bq,
-					struct veth_stats *stats)
+static struct xdp_frame *veth_xdp_rcv_one(struct veth_rq *rq,
+					  struct xdp_frame *frame,
+					  struct veth_xdp_tx_bq *bq,
+					  struct veth_stats *stats)
 {
 	struct xdp_frame orig_frame;
 	struct bpf_prog *xdp_prog;
-	struct sk_buff *skb;
 
 	rcu_read_lock();
 	xdp_prog = rcu_dereference(rq->xdp_prog);
@@ -623,13 +623,7 @@ static struct sk_buff *veth_xdp_rcv_one(struct veth_rq *rq,
 	}
 	rcu_read_unlock();
 
-	skb = xdp_build_skb_from_frame(frame, rq->dev);
-	if (!skb) {
-		xdp_return_frame(frame);
-		stats->rx_drops++;
-	}
-
-	return skb;
+	return frame;
 err_xdp:
 	rcu_read_unlock();
 	xdp_return_frame(frame);
@@ -637,6 +631,37 @@ static struct sk_buff *veth_xdp_rcv_one(struct veth_rq *rq,
 	return NULL;
 }
 
+/* frames array contains VETH_XDP_BATCH at most */
+static void veth_xdp_rcv_bulk_skb(struct veth_rq *rq, void **frames,
+				  int n_xdpf, struct veth_xdp_tx_bq *bq,
+				  struct veth_stats *stats)
+{
+	void *skbs[VETH_XDP_BATCH];
+	int i;
+
+	if (xdp_alloc_skb_bulk(skbs, n_xdpf,
+			       GFP_ATOMIC | __GFP_ZERO) < 0) {
+		for (i = 0; i < n_xdpf; i++)
+			xdp_return_frame(frames[i]);
+		stats->rx_drops += n_xdpf;
+
+		return;
+	}
+
+	for (i = 0; i < n_xdpf; i++) {
+		struct sk_buff *skb = skbs[i];
+
+		skb = __xdp_build_skb_from_frame(frames[i], skb,
+						 rq->dev);
+		if (!skb) {
+			xdp_return_frame(frames[i]);
+			stats->rx_drops++;
+			continue;
+		}
+		napi_gro_receive(&rq->xdp_napi, skb);
+	}
+}
+
 static struct sk_buff *veth_xdp_rcv_skb(struct veth_rq *rq, struct sk_buff *skb,
 					struct veth_xdp_tx_bq *bq,
@@ -784,32 +809,45 @@ static int veth_xdp_rcv(struct veth_rq *rq, int budget,
 			struct veth_xdp_tx_bq *bq,
 			struct veth_stats *stats)
 {
-	int i, done = 0;
+	int i, done = 0, n_xdpf = 0;
+	void *xdpf[VETH_XDP_BATCH];
 
 	for (i = 0; i < budget; i++) {
 		void *ptr = __ptr_ring_consume(&rq->xdp_ring);
-		struct sk_buff *skb;
 
 		if (!ptr)
 			break;
 
 		if (veth_is_xdp_frame(ptr)) {
+			/* ndo_xdp_xmit */
 			struct xdp_frame *frame = veth_ptr_to_xdp(ptr);
 
 			stats->xdp_bytes += frame->len;
-			skb = veth_xdp_rcv_one(rq, frame, bq, stats);
+			frame = veth_xdp_rcv_one(rq, frame, bq, stats);
+			if (frame) {
+				/* XDP_PASS */
+				xdpf[n_xdpf++] = frame;
+				if (n_xdpf == VETH_XDP_BATCH) {
+					veth_xdp_rcv_bulk_skb(rq, xdpf, n_xdpf,
+							      bq, stats);
+					n_xdpf = 0;
+				}
+			}
 		} else {
-			skb = ptr;
+			/* ndo_start_xmit */
+			struct sk_buff *skb = ptr;
+
 			stats->xdp_bytes += skb->len;
 			skb = veth_xdp_rcv_skb(rq, skb, bq, stats);
+			if (skb)
+				napi_gro_receive(&rq->xdp_napi, skb);
 		}
-
-		if (skb)
-			napi_gro_receive(&rq->xdp_napi, skb);
-
 		done++;
 	}
 
+	if (n_xdpf)
+		veth_xdp_rcv_bulk_skb(rq, xdpf, n_xdpf, bq, stats);
+
 	u64_stats_update_begin(&rq->stats.syncp);
 	rq->stats.vs.xdp_redirect += stats->xdp_redirect;
 	rq->stats.vs.xdp_bytes += stats->xdp_bytes;
diff --git a/include/net/xdp.h b/include/net/xdp.h
index c4bfdc9a8b79..a5bc214a49d9 100644
--- a/include/net/xdp.h
+++ b/include/net/xdp.h
@@ -169,6 +169,7 @@ struct sk_buff *__xdp_build_skb_from_frame(struct xdp_frame *xdpf,
 					   struct net_device *dev);
 struct sk_buff *xdp_build_skb_from_frame(struct xdp_frame *xdpf,
 					 struct net_device *dev);
+int xdp_alloc_skb_bulk(void **skbs, int n_skb, gfp_t gfp);
 
 static inline
 void xdp_convert_frame_to_buff(struct xdp_frame *frame, struct xdp_buff *xdp)
diff --git a/net/core/xdp.c b/net/core/xdp.c
index 0d2630a35c3e..05354976c1fc 100644
--- a/net/core/xdp.c
+++ b/net/core/xdp.c
@@ -514,6 +514,17 @@ void xdp_warn(const char *msg, const char *func, const int line)
 };
 EXPORT_SYMBOL_GPL(xdp_warn);
 
+int xdp_alloc_skb_bulk(void **skbs, int n_skb, gfp_t gfp)
+{
+	n_skb = kmem_cache_alloc_bulk(skbuff_head_cache, gfp,
+				      n_skb, skbs);
+	if (unlikely(!n_skb))
+		return -ENOMEM;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(xdp_alloc_skb_bulk);
+
 struct sk_buff *__xdp_build_skb_from_frame(struct xdp_frame *xdpf,
 					   struct sk_buff *skb,
 					   struct net_device *dev)
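Stripped of kernel details, the consume/flush logic in veth_xdp_rcv() above
is a generic fixed-size batching loop. A minimal user-space C sketch (all
names here are made up for illustration; BATCH mirrors VETH_XDP_BATCH):

```c
#include <stddef.h>

#define BATCH 16	/* mirrors VETH_XDP_BATCH */

/* Flush callback: receives a batch of up to BATCH items. */
typedef void (*flush_fn)(void **items, int n, void *priv);

/* Walk the items one at a time, but hand them to flush() in groups of
 * up to BATCH, with a final flush for the partial tail -- the same
 * shape as the xdpf[] accumulation in veth_xdp_rcv(). */
static void batch_consume(void **items, int count, flush_fn flush, void *priv)
{
	void *batch[BATCH];
	int n = 0;

	for (int i = 0; i < count; i++) {
		batch[n++] = items[i];
		if (n == BATCH) {	/* full batch: flush and reset */
			flush(batch, n, priv);
			n = 0;
		}
	}
	if (n)	/* tail batch, like the trailing veth_xdp_rcv_bulk_skb() call */
		flush(batch, n, priv);
}

/* Demo flush that just counts items and invocations. */
struct counter { int items; int flushes; };

static void count_flush(void **items, int n, void *priv)
{
	struct counter *c = priv;

	(void)items;
	c->items += n;
	c->flushes++;
}
```

Batching this way is what lets the SKB allocation go through one
kmem_cache_alloc_bulk() call per 16 frames instead of 16 separate
allocations, which is where the ~3.64 to ~3.79 Mpps gain comes from.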
Split ndo_xdp_xmit and ndo_start_xmit use cases in veth_xdp_rcv routine
in order to alloc skbs in bulk for XDP_PASS verdict.

Introduce xdp_alloc_skb_bulk utility routine to alloc skb bulk list.
The proposed approach has been tested in the following scenario:

eth (ixgbe) --> XDP_REDIRECT --> veth0 --> (remote-ns) veth1 --> XDP_PASS

XDP_REDIRECT: xdp_redirect_map bpf sample
XDP_PASS: xdp_rxq_info bpf sample

traffic generator: pkt_gen sending udp traffic on a remote device

bpf-next master: ~3.64Mpps
bpf-next + skb bulking allocation: ~3.79Mpps

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
---
Changes since v2:
- use __GFP_ZERO flag instead of memset
- move some veth_xdp_rcv_batch() logic in veth_xdp_rcv_skb()

Changes since v1:
- drop patch 2/3, squash patch 1/3 and 3/3
- set VETH_XDP_BATCH to 16
- rework veth_xdp_rcv to use __ptr_ring_consume
---
 drivers/net/veth.c | 78 ++++++++++++++++++++++++++++++++++------------
 include/net/xdp.h  |  1 +
 net/core/xdp.c     | 11 +++++++
 3 files changed, 70 insertions(+), 20 deletions(-)