mbox series

[RFC,bpf-next,0/4] add partial rx hw csum offload support for XDP

Message ID cover.1622222367.git.lorenzo@kernel.org
Headers show
Series add partial rx hw csum offload support for XDP | expand

Message

Lorenzo Bianconi May 28, 2021, 5:43 p.m. UTC
Enable xdp rx checksum offload support for CHECKSUM_UNNECESSARY use-case.
Introduce flag field in xdp_buff/xdp_frame in order to save the checksum
result from the NIC and have a fast access to it performing XDP_REDIRECT.
CHECKSUM_COMPLETE is not supported yet since it will require adding the
csum result to the xdp_metadata area.
Moreover flag field will be reused for xdp multi-buff support.
This series has been tested generating UDP traffic with pktgen and performing
a xdp_redirect from an ixgbe device to a remote CPUMAP entry. PPS results show
a negligible penalty respect to the baseline where the UDP checksum has been
disabled. More info about the test can be found here [0].

[0] https://github.com/xdp-project/xdp-project/blob/master/areas/core/xdp_frame01_checksum.org

Lorenzo Bianconi (4):
  net: xdp: introduce flags field in xdp_buff and xdp_frame
  mvneta: return csum computation result from mvneta_rx_csum
  net: mvneta: report csum result in xdp_buff
  net: xdp: update csum building the skb

 drivers/net/ethernet/marvell/mvneta.c | 27 ++++++++------------
 include/net/xdp.h                     | 36 +++++++++++++++++++++++++++
 net/core/xdp.c                        |  2 +-
 3 files changed, 48 insertions(+), 17 deletions(-)

Comments

David Ahern May 29, 2021, 1:33 a.m. UTC | #1
On 5/28/21 3:18 PM, Tom Herbert wrote:
> On Fri, May 28, 2021 at 10:44 AM Lorenzo Bianconi <lorenzo@kernel.org> wrote:

>>

>> Introduce flag field in xdp_buff and xdp_frame data structure in order

>> to report xdp_buffer metadata. For the moment just hw checksum hints

>> are defined but flags field will be reused for xdp multi-buffer

>> For the moment just CHECKSUM_UNNECESSARY is supported.

>> CHECKSUM_COMPLETE will need to set csum value in metada space.

>>

> Lorenzo,

> 

> This isn't sufficient for the checksum-unnecessary interface, we'd

> also need ability to set csum_level for cases the device validated

> more than one checksum.


That's on me. The original patch was for XDP_REDIRECT to VMs and the
VIRTIO_NET_HDR_ API does not support csum_level.
VIRTIO_NET_HDR_F_DATA_VALID means CHECKSUM_UNNECESSARY, an API
implemented 10 years ago.

> 

> IMO, we shouldn't support CHECKSUM_UNNECESSARY for new uses like this.

> For years now, the Linux community has been pleading with vendors to

> provide CHECKSUM_COMPLETE which is far more useful and robust than

> CHECSUM_UNNECESSARY, and yet some still haven't got with the program

> even though we see more and more instances where CHECKSUM_UNNECESSARY

> doesn't even work at all (e.g. cases with SRv6, new encaps device

> doesn't understand). I believe it's time to take a stand! :-)

>


There is no new hardware or new feature at play here. This about XDP
frames getting the checksum validation setting that an skb enjoys today.
You are taking a stand against S/W equivalency with the existing NICs?
That basically penalizes XDP, continuing to limit its usefulness with
very well established use cases that could benefit from it.
Jakub Kicinski May 29, 2021, 4:56 a.m. UTC | #2
On Fri, 28 May 2021 14:18:33 -0700 Tom Herbert wrote:
> On Fri, May 28, 2021 at 10:44 AM Lorenzo Bianconi <lorenzo@kernel.org> wrote:

> > Introduce flag field in xdp_buff and xdp_frame data structure in order

> > to report xdp_buffer metadata. For the moment just hw checksum hints

> > are defined but flags field will be reused for xdp multi-buffer

> > For the moment just CHECKSUM_UNNECESSARY is supported.

> > CHECKSUM_COMPLETE will need to set csum value in metada space.

> >  

> Lorenzo,

> 

> This isn't sufficient for the checksum-unnecessary interface, we'd

> also need ability to set csum_level for cases the device validated

> more than one checksum.

> 

> IMO, we shouldn't support CHECKSUM_UNNECESSARY for new uses like this.

> For years now, the Linux community has been pleading with vendors to

> provide CHECKSUM_COMPLETE which is far more useful and robust than

> CHECSUM_UNNECESSARY, and yet some still haven't got with the program

> even though we see more and more instances where CHECKSUM_UNNECESSARY

> doesn't even work at all (e.g. cases with SRv6, new encaps device

> doesn't understand). I believe it's time to take a stand! :-)


I must agree. Not supporting CHECKSUM_COMPLETE seems like a step back.
Lorenzo Bianconi May 29, 2021, 1:34 p.m. UTC | #3
> On Fri, May 28, 2021 at 10:44 AM Lorenzo Bianconi <lorenzo@kernel.org> wrote:

> >

> > Introduce flag field in xdp_buff and xdp_frame data structure in order

> > to report xdp_buffer metadata. For the moment just hw checksum hints

> > are defined but flags field will be reused for xdp multi-buffer

> > For the moment just CHECKSUM_UNNECESSARY is supported.

> > CHECKSUM_COMPLETE will need to set csum value in metada space.

> >

> Lorenzo,


Hi Tom,

> 

> This isn't sufficient for the checksum-unnecessary interface, we'd

> also need ability to set csum_level for cases the device validated

> more than one checksum.


ack, right. I guess we can put this info in xdp metadata or do you think we can
add it in xdp_buff/xdp_frame as well?

> 

> IMO, we shouldn't support CHECKSUM_UNNECESSARY for new uses like this.

> For years now, the Linux community has been pleading with vendors to

> provide CHECKSUM_COMPLETE which is far more useful and robust than

> CHECSUM_UNNECESSARY, and yet some still haven't got with the program

> even though we see more and more instances where CHECKSUM_UNNECESSARY

> doesn't even work at all (e.g. cases with SRv6, new encaps device

> doesn't understand). I believe it's time to take a stand! :-)


I completely agree CHECKSUM_COMPLETE is more useful and robust than
CHECSUM_UNNECESSARY and I want to add support for it as soon as we
agree on the best way to do it. At the same time there are plenty of
XDP NICs where this feature is quite useful since they support just
CHECSUM_UNNECESSARY.

Regards,
Lorenzo

> 

> Tom

> 

> > Signed-off-by: David Ahern <dsahern@kernel.org>

> > Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>

> > ---

> >  include/net/xdp.h | 36 ++++++++++++++++++++++++++++++++++++

> >  1 file changed, 36 insertions(+)

> >

> > diff --git a/include/net/xdp.h b/include/net/xdp.h

> > index 5533f0ab2afc..e81ac505752b 100644

> > --- a/include/net/xdp.h

> > +++ b/include/net/xdp.h

> > @@ -66,6 +66,13 @@ struct xdp_txq_info {

> >         struct net_device *dev;

> >  };

> >

> > +/* xdp metadata bitmask */

> > +#define XDP_CSUM_MASK          GENMASK(1, 0)

> > +enum xdp_flags {

> > +       XDP_CSUM_UNNECESSARY    = BIT(0),

> > +       XDP_CSUM_COMPLETE       = BIT(1),

> > +};

> > +

> >  struct xdp_buff {

> >         void *data;

> >         void *data_end;

> > @@ -74,6 +81,7 @@ struct xdp_buff {

> >         struct xdp_rxq_info *rxq;

> >         struct xdp_txq_info *txq;

> >         u32 frame_sz; /* frame size to deduce data_hard_end/reserved tailroom*/

> > +       u16 flags; /* xdp_flags */

> >  };

> >

> >  static __always_inline void

> > @@ -81,6 +89,7 @@ xdp_init_buff(struct xdp_buff *xdp, u32 frame_sz, struct xdp_rxq_info *rxq)

> >  {

> >         xdp->frame_sz = frame_sz;

> >         xdp->rxq = rxq;

> > +       xdp->flags = 0;

> >  }

> >

> >  static __always_inline void

> > @@ -95,6 +104,18 @@ xdp_prepare_buff(struct xdp_buff *xdp, unsigned char *hard_start,

> >         xdp->data_meta = meta_valid ? data : data + 1;

> >  }

> >

> > +static __always_inline void

> > +xdp_buff_get_csum(struct xdp_buff *xdp, struct sk_buff *skb)

> > +{

> > +       switch (xdp->flags & XDP_CSUM_MASK) {

> > +       case XDP_CSUM_UNNECESSARY:

> > +               skb->ip_summed = CHECKSUM_UNNECESSARY;

> > +               break;

> > +       default:

> > +               break;

> > +       }

> > +}

> > +

> >  /* Reserve memory area at end-of data area.

> >   *

> >   * This macro reserves tailroom in the XDP buffer by limiting the

> > @@ -122,8 +143,21 @@ struct xdp_frame {

> >          */

> >         struct xdp_mem_info mem;

> >         struct net_device *dev_rx; /* used by cpumap */

> > +       u16 flags; /* xdp_flags */

> >  };

> >

> > +static __always_inline void

> > +xdp_frame_get_csum(struct xdp_frame *xdpf, struct sk_buff *skb)

> > +{

> > +       switch (xdpf->flags & XDP_CSUM_MASK) {

> > +       case XDP_CSUM_UNNECESSARY:

> > +               skb->ip_summed = CHECKSUM_UNNECESSARY;

> > +               break;

> > +       default:

> > +               break;

> > +       }

> > +}

> > +

> >  #define XDP_BULK_QUEUE_SIZE    16

> >  struct xdp_frame_bulk {

> >         int count;

> > @@ -180,6 +214,7 @@ void xdp_convert_frame_to_buff(struct xdp_frame *frame, struct xdp_buff *xdp)

> >         xdp->data_end = frame->data + frame->len;

> >         xdp->data_meta = frame->data - frame->metasize;

> >         xdp->frame_sz = frame->frame_sz;

> > +       xdp->flags = frame->flags;

> >  }

> >

> >  static inline

> > @@ -206,6 +241,7 @@ int xdp_update_frame_from_buff(struct xdp_buff *xdp,

> >         xdp_frame->headroom = headroom - sizeof(*xdp_frame);

> >         xdp_frame->metasize = metasize;

> >         xdp_frame->frame_sz = xdp->frame_sz;

> > +       xdp_frame->flags = xdp->flags;

> >

> >         return 0;

> >  }

> > --

> > 2.31.1

> >
Lorenzo Bianconi May 29, 2021, 1:37 p.m. UTC | #4
> On Fri, 28 May 2021 14:18:33 -0700 Tom Herbert wrote:

> > On Fri, May 28, 2021 at 10:44 AM Lorenzo Bianconi <lorenzo@kernel.org> wrote:

> > > Introduce flag field in xdp_buff and xdp_frame data structure in order

> > > to report xdp_buffer metadata. For the moment just hw checksum hints

> > > are defined but flags field will be reused for xdp multi-buffer

> > > For the moment just CHECKSUM_UNNECESSARY is supported.

> > > CHECKSUM_COMPLETE will need to set csum value in metada space.

> > >  

> > Lorenzo,

> > 

> > This isn't sufficient for the checksum-unnecessary interface, we'd

> > also need ability to set csum_level for cases the device validated

> > more than one checksum.

> > 

> > IMO, we shouldn't support CHECKSUM_UNNECESSARY for new uses like this.

> > For years now, the Linux community has been pleading with vendors to

> > provide CHECKSUM_COMPLETE which is far more useful and robust than

> > CHECSUM_UNNECESSARY, and yet some still haven't got with the program

> > even though we see more and more instances where CHECKSUM_UNNECESSARY

> > doesn't even work at all (e.g. cases with SRv6, new encaps device

> > doesn't understand). I believe it's time to take a stand! :-)

> 

> I must agree. Not supporting CHECKSUM_COMPLETE seems like a step back.


I completely agree on it and I want add support for CHECKSUM_COMPLETE as soon
as we decide what is the best way to store csum value (xdp_metadata?). At the
same time this preliminary series wants to add support just for
CHECSUM_UNNECESSARY. Moreover the flags field in xdp_buff/xdp_frame will be
reused for xdp multi-buff work.

Regards,
Lorenzo
Jesper Dangaard Brouer May 31, 2021, 11:07 a.m. UTC | #5
On Sat, 29 May 2021 15:34:18 +0200
Lorenzo Bianconi <lorenzo@kernel.org> wrote:

> > On Fri, May 28, 2021 at 10:44 AM Lorenzo Bianconi <lorenzo@kernel.org> wrote:  

> > >

> > > Introduce flag field in xdp_buff and xdp_frame data structure in order

> > > to report xdp_buffer metadata. For the moment just hw checksum hints

> > > are defined but flags field will be reused for xdp multi-buffer

> > > For the moment just CHECKSUM_UNNECESSARY is supported.

> > > CHECKSUM_COMPLETE will need to set csum value in metada space.

> > >  

> > Lorenzo,  

> 

> Hi Tom,

> 

> > 

> > This isn't sufficient for the checksum-unnecessary interface, we'd

> > also need ability to set csum_level for cases the device validated

> > more than one checksum.  

> 

> ack, right. I guess we can put this info in xdp metadata or do you

> think we can add it in xdp_buff/xdp_frame as well?


This is an interesting question as where do we draw the line.  I
definitely only don't want to add the same information in two places.

As is clear by the XDP-hints discussion: Today we are lacking *kernel*
infrastructure to interpret the XDP-metadata area when we create the
SKB from xdp_frame.  I want to add this capability...

(See XDP-hints discussion, as wisely pointed out by John, the BPF-prog
infrastructure to interpret XDP-metadata via BTF-info is already
available to userspace loading BPF-progs, but AFAIK the kernel is
lacking this capability. Maybe we will end-up loading BPF-progs that
populate SKB fields based on xdp_frame + XDP-metadata area... I'm
keeping an open mind for the solutions in this space.).

The question is really, should we wait for this infra (that can use
csum value from metadata) or should be go ahead and expand
xdp_buff/xdp_frame with an csum value (+ ip_summed) for the
CHECKSUM_COMPLETE use-case?

 
> > 

> > IMO, we shouldn't support CHECKSUM_UNNECESSARY for new uses like this.

> > For years now, the Linux community has been pleading with vendors to

> > provide CHECKSUM_COMPLETE which is far more useful and robust than

> > CHECSUM_UNNECESSARY, and yet some still haven't got with the program

> > even though we see more and more instances where CHECKSUM_UNNECESSARY

> > doesn't even work at all (e.g. cases with SRv6, new encaps device

> > doesn't understand). I believe it's time to take a stand! :-)  

> 

> I completely agree CHECKSUM_COMPLETE is more useful and robust than

> CHECSUM_UNNECESSARY and I want to add support for it as soon as we

> agree on the best way to do it. At the same time there are plenty of

> XDP NICs where this feature is quite useful since they support just

> CHECSUM_UNNECESSARY.

> 

> Regards,

> Lorenzo

> 

> > 

> > Tom

> >   

> > > Signed-off-by: David Ahern <dsahern@kernel.org>

> > > Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>

> > > ---

> > >  include/net/xdp.h | 36 ++++++++++++++++++++++++++++++++++++

> > >  1 file changed, 36 insertions(+)

> > >

> > > diff --git a/include/net/xdp.h b/include/net/xdp.h

> > > index 5533f0ab2afc..e81ac505752b 100644

> > > --- a/include/net/xdp.h

> > > +++ b/include/net/xdp.h

> > > @@ -66,6 +66,13 @@ struct xdp_txq_info {

> > >         struct net_device *dev;

> > >  };

> > >

> > > +/* xdp metadata bitmask */

> > > +#define XDP_CSUM_MASK          GENMASK(1, 0)

> > > +enum xdp_flags {

> > > +       XDP_CSUM_UNNECESSARY    = BIT(0),

> > > +       XDP_CSUM_COMPLETE       = BIT(1),

> > > +};

> > > +

> > >  struct xdp_buff {

> > >         void *data;

> > >         void *data_end;

> > > @@ -74,6 +81,7 @@ struct xdp_buff {

> > >         struct xdp_rxq_info *rxq;

> > >         struct xdp_txq_info *txq;

> > >         u32 frame_sz; /* frame size to deduce data_hard_end/reserved tailroom*/

> > > +       u16 flags; /* xdp_flags */

> > >  };

> > >

> > >  static __always_inline void

> > > @@ -81,6 +89,7 @@ xdp_init_buff(struct xdp_buff *xdp, u32 frame_sz, struct xdp_rxq_info *rxq)

> > >  {

> > >         xdp->frame_sz = frame_sz;

> > >         xdp->rxq = rxq;

> > > +       xdp->flags = 0;

> > >  }

> > >

> > >  static __always_inline void

> > > @@ -95,6 +104,18 @@ xdp_prepare_buff(struct xdp_buff *xdp, unsigned char *hard_start,

> > >         xdp->data_meta = meta_valid ? data : data + 1;

> > >  }

> > >

> > > +static __always_inline void

> > > +xdp_buff_get_csum(struct xdp_buff *xdp, struct sk_buff *skb)

> > > +{

> > > +       switch (xdp->flags & XDP_CSUM_MASK) {

> > > +       case XDP_CSUM_UNNECESSARY:

> > > +               skb->ip_summed = CHECKSUM_UNNECESSARY;

> > > +               break;

> > > +       default:

> > > +               break;

> > > +       }

> > > +}

> > > +

> > >  /* Reserve memory area at end-of data area.

> > >   *

> > >   * This macro reserves tailroom in the XDP buffer by limiting the

> > > @@ -122,8 +143,21 @@ struct xdp_frame {

> > >          */

> > >         struct xdp_mem_info mem;

> > >         struct net_device *dev_rx; /* used by cpumap */

> > > +       u16 flags; /* xdp_flags */

> > >  };

> > >

> > > +static __always_inline void

> > > +xdp_frame_get_csum(struct xdp_frame *xdpf, struct sk_buff *skb)

> > > +{

> > > +       switch (xdpf->flags & XDP_CSUM_MASK) {

> > > +       case XDP_CSUM_UNNECESSARY:

> > > +               skb->ip_summed = CHECKSUM_UNNECESSARY;

> > > +               break;

> > > +       default:

> > > +               break;

> > > +       }

> > > +}

> > > +

> > >  #define XDP_BULK_QUEUE_SIZE    16

> > >  struct xdp_frame_bulk {

> > >         int count;

> > > @@ -180,6 +214,7 @@ void xdp_convert_frame_to_buff(struct xdp_frame *frame, struct xdp_buff *xdp)

> > >         xdp->data_end = frame->data + frame->len;

> > >         xdp->data_meta = frame->data - frame->metasize;

> > >         xdp->frame_sz = frame->frame_sz;

> > > +       xdp->flags = frame->flags;

> > >  }

> > >

> > >  static inline

> > > @@ -206,6 +241,7 @@ int xdp_update_frame_from_buff(struct xdp_buff *xdp,

> > >         xdp_frame->headroom = headroom - sizeof(*xdp_frame);

> > >         xdp_frame->metasize = metasize;

> > >         xdp_frame->frame_sz = xdp->frame_sz;

> > > +       xdp_frame->flags = xdp->flags;

> > >

> > >         return 0;

> > >  }

> > > --

> > > 2.31.1

> > >  




-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer