Message ID | cover.1622222367.git.lorenzo@kernel.org |
---|---|
Headers | show |
Series | add partial rx hw csum offload support for XDP | expand |
On 5/28/21 3:18 PM, Tom Herbert wrote: > On Fri, May 28, 2021 at 10:44 AM Lorenzo Bianconi <lorenzo@kernel.org> wrote: >> >> Introduce flag field in xdp_buff and xdp_frame data structure in order >> to report xdp_buffer metadata. For the moment just hw checksum hints >> are defined but flags field will be reused for xdp multi-buffer >> For the moment just CHECKSUM_UNNECESSARY is supported. >> CHECKSUM_COMPLETE will need to set csum value in metada space. >> > Lorenzo, > > This isn't sufficient for the checksum-unnecessary interface, we'd > also need ability to set csum_level for cases the device validated > more than one checksum. That's on me. The original patch was for XDP_REDIRECT to VMs and the VIRTIO_NET_HDR_ API does not support csum_level. VIRTIO_NET_HDR_F_DATA_VALID means CHECKSUM_UNNECESSARY, an API implemented 10 years ago. > > IMO, we shouldn't support CHECKSUM_UNNECESSARY for new uses like this. > For years now, the Linux community has been pleading with vendors to > provide CHECKSUM_COMPLETE which is far more useful and robust than > CHECSUM_UNNECESSARY, and yet some still haven't got with the program > even though we see more and more instances where CHECKSUM_UNNECESSARY > doesn't even work at all (e.g. cases with SRv6, new encaps device > doesn't understand). I believe it's time to take a stand! :-) > There is no new hardware or new feature at play here. This about XDP frames getting the checksum validation setting that an skb enjoys today. You are taking a stand against S/W equivalency with the existing NICs? That basically penalizes XDP, continuing to limit its usefulness with very well established use cases that could benefit from it.
On Fri, 28 May 2021 14:18:33 -0700 Tom Herbert wrote: > On Fri, May 28, 2021 at 10:44 AM Lorenzo Bianconi <lorenzo@kernel.org> wrote: > > Introduce flag field in xdp_buff and xdp_frame data structure in order > > to report xdp_buffer metadata. For the moment just hw checksum hints > > are defined but flags field will be reused for xdp multi-buffer > > For the moment just CHECKSUM_UNNECESSARY is supported. > > CHECKSUM_COMPLETE will need to set csum value in metada space. > > > Lorenzo, > > This isn't sufficient for the checksum-unnecessary interface, we'd > also need ability to set csum_level for cases the device validated > more than one checksum. > > IMO, we shouldn't support CHECKSUM_UNNECESSARY for new uses like this. > For years now, the Linux community has been pleading with vendors to > provide CHECKSUM_COMPLETE which is far more useful and robust than > CHECSUM_UNNECESSARY, and yet some still haven't got with the program > even though we see more and more instances where CHECKSUM_UNNECESSARY > doesn't even work at all (e.g. cases with SRv6, new encaps device > doesn't understand). I believe it's time to take a stand! :-) I must agree. Not supporting CHECKSUM_COMPLETE seems like a step back.
> On Fri, May 28, 2021 at 10:44 AM Lorenzo Bianconi <lorenzo@kernel.org> wrote: > > > > Introduce flag field in xdp_buff and xdp_frame data structure in order > > to report xdp_buffer metadata. For the moment just hw checksum hints > > are defined but flags field will be reused for xdp multi-buffer > > For the moment just CHECKSUM_UNNECESSARY is supported. > > CHECKSUM_COMPLETE will need to set csum value in metada space. > > > Lorenzo, Hi Tom, > > This isn't sufficient for the checksum-unnecessary interface, we'd > also need ability to set csum_level for cases the device validated > more than one checksum. ack, right. I guess we can put this info in xdp metadata or do you think we can add it in xdp_buff/xdp_frame as well? > > IMO, we shouldn't support CHECKSUM_UNNECESSARY for new uses like this. > For years now, the Linux community has been pleading with vendors to > provide CHECKSUM_COMPLETE which is far more useful and robust than > CHECSUM_UNNECESSARY, and yet some still haven't got with the program > even though we see more and more instances where CHECKSUM_UNNECESSARY > doesn't even work at all (e.g. cases with SRv6, new encaps device > doesn't understand). I believe it's time to take a stand! :-) I completely agree CHECKSUM_COMPLETE is more useful and robust than CHECSUM_UNNECESSARY and I want to add support for it as soon as we agree on the best way to do it. At the same time there are plenty of XDP NICs where this feature is quite useful since they support just CHECSUM_UNNECESSARY. Regards, Lorenzo > > Tom > > > Signed-off-by: David Ahern <dsahern@kernel.org> > > Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> > > --- > > include/net/xdp.h | 36 ++++++++++++++++++++++++++++++++++++ > > 1 file changed, 36 insertions(+) > > > > diff --git a/include/net/xdp.h b/include/net/xdp.h > > index 5533f0ab2afc..e81ac505752b 100644 > > --- a/include/net/xdp.h > > +++ b/include/net/xdp.h > > @@ -66,6 +66,13 @@ struct xdp_txq_info { > > struct net_device *dev; > > }; > > > > +/* xdp metadata bitmask */ > > +#define XDP_CSUM_MASK GENMASK(1, 0) > > +enum xdp_flags { > > + XDP_CSUM_UNNECESSARY = BIT(0), > > + XDP_CSUM_COMPLETE = BIT(1), > > +}; > > + > > struct xdp_buff { > > void *data; > > void *data_end; > > @@ -74,6 +81,7 @@ struct xdp_buff { > > struct xdp_rxq_info *rxq; > > struct xdp_txq_info *txq; > > u32 frame_sz; /* frame size to deduce data_hard_end/reserved tailroom*/ > > + u16 flags; /* xdp_flags */ > > }; > > > > static __always_inline void > > @@ -81,6 +89,7 @@ xdp_init_buff(struct xdp_buff *xdp, u32 frame_sz, struct xdp_rxq_info *rxq) > > { > > xdp->frame_sz = frame_sz; > > xdp->rxq = rxq; > > + xdp->flags = 0; > > } > > > > static __always_inline void > > @@ -95,6 +104,18 @@ xdp_prepare_buff(struct xdp_buff *xdp, unsigned char *hard_start, > > xdp->data_meta = meta_valid ? data : data + 1; > > } > > > > +static __always_inline void > > +xdp_buff_get_csum(struct xdp_buff *xdp, struct sk_buff *skb) > > +{ > > + switch (xdp->flags & XDP_CSUM_MASK) { > > + case XDP_CSUM_UNNECESSARY: > > + skb->ip_summed = CHECKSUM_UNNECESSARY; > > + break; > > + default: > > + break; > > + } > > +} > > + > > /* Reserve memory area at end-of data area. > > * > > * This macro reserves tailroom in the XDP buffer by limiting the > > @@ -122,8 +143,21 @@ struct xdp_frame { > > */ > > struct xdp_mem_info mem; > > struct net_device *dev_rx; /* used by cpumap */ > > + u16 flags; /* xdp_flags */ > > }; > > > > +static __always_inline void > > +xdp_frame_get_csum(struct xdp_frame *xdpf, struct sk_buff *skb) > > +{ > > + switch (xdpf->flags & XDP_CSUM_MASK) { > > + case XDP_CSUM_UNNECESSARY: > > + skb->ip_summed = CHECKSUM_UNNECESSARY; > > + break; > > + default: > > + break; > > + } > > +} > > + > > #define XDP_BULK_QUEUE_SIZE 16 > > struct xdp_frame_bulk { > > int count; > > @@ -180,6 +214,7 @@ void xdp_convert_frame_to_buff(struct xdp_frame *frame, struct xdp_buff *xdp) > > xdp->data_end = frame->data + frame->len; > > xdp->data_meta = frame->data - frame->metasize; > > xdp->frame_sz = frame->frame_sz; > > + xdp->flags = frame->flags; > > } > > > > static inline > > @@ -206,6 +241,7 @@ int xdp_update_frame_from_buff(struct xdp_buff *xdp, > > xdp_frame->headroom = headroom - sizeof(*xdp_frame); > > xdp_frame->metasize = metasize; > > xdp_frame->frame_sz = xdp->frame_sz; > > + xdp_frame->flags = xdp->flags; > > > > return 0; > > } > > -- > > 2.31.1 > >
> On Fri, 28 May 2021 14:18:33 -0700 Tom Herbert wrote: > > On Fri, May 28, 2021 at 10:44 AM Lorenzo Bianconi <lorenzo@kernel.org> wrote: > > > Introduce flag field in xdp_buff and xdp_frame data structure in order > > > to report xdp_buffer metadata. For the moment just hw checksum hints > > > are defined but flags field will be reused for xdp multi-buffer > > > For the moment just CHECKSUM_UNNECESSARY is supported. > > > CHECKSUM_COMPLETE will need to set csum value in metada space. > > > > > Lorenzo, > > > > This isn't sufficient for the checksum-unnecessary interface, we'd > > also need ability to set csum_level for cases the device validated > > more than one checksum. > > > > IMO, we shouldn't support CHECKSUM_UNNECESSARY for new uses like this. > > For years now, the Linux community has been pleading with vendors to > > provide CHECKSUM_COMPLETE which is far more useful and robust than > > CHECSUM_UNNECESSARY, and yet some still haven't got with the program > > even though we see more and more instances where CHECKSUM_UNNECESSARY > > doesn't even work at all (e.g. cases with SRv6, new encaps device > > doesn't understand). I believe it's time to take a stand! :-) > > I must agree. Not supporting CHECKSUM_COMPLETE seems like a step back. I completely agree on it and I want add support for CHECKSUM_COMPLETE as soon as we decide what is the best way to store csum value (xdp_metadata?). At the same time this preliminary series wants to add support just for CHECSUM_UNNECESSARY. Moreover the flags field in xdp_buff/xdp_frame will be reused for xdp multi-buff work. Regards, Lorenzo
On Sat, 29 May 2021 15:34:18 +0200 Lorenzo Bianconi <lorenzo@kernel.org> wrote: > > On Fri, May 28, 2021 at 10:44 AM Lorenzo Bianconi <lorenzo@kernel.org> wrote: > > > > > > Introduce flag field in xdp_buff and xdp_frame data structure in order > > > to report xdp_buffer metadata. For the moment just hw checksum hints > > > are defined but flags field will be reused for xdp multi-buffer > > > For the moment just CHECKSUM_UNNECESSARY is supported. > > > CHECKSUM_COMPLETE will need to set csum value in metada space. > > > > > Lorenzo, > > Hi Tom, > > > > > This isn't sufficient for the checksum-unnecessary interface, we'd > > also need ability to set csum_level for cases the device validated > > more than one checksum. > > ack, right. I guess we can put this info in xdp metadata or do you > think we can add it in xdp_buff/xdp_frame as well? This is an interesting question as where do we draw the line. I definitely only don't want to add the same information in two places. As is clear by the XDP-hints discussion: Today we are lacking *kernel* infrastructure to interpret the XDP-metadata area when we create the SKB from xdp_frame. I want to add this capability... (See XDP-hints discussion, as wisely pointed out by John, the BPF-prog infrastructure to interpret XDP-metadata via BTF-info is already available to userspace loading BPF-progs, but AFAIK the kernel is lacking this capability. Maybe we will end-up loading BPF-progs that populate SKB fields based on xdp_frame + XDP-metadata area... I'm keeping an open mind for the solutions in this space.). The question is really, should we wait for this infra (that can use csum value from metadata) or should be go ahead and expand xdp_buff/xdp_frame with an csum value (+ ip_summed) for the CHECKSUM_COMPLETE use-case? > > > > IMO, we shouldn't support CHECKSUM_UNNECESSARY for new uses like this. > > For years now, the Linux community has been pleading with vendors to > > provide CHECKSUM_COMPLETE which is far more useful and robust than > > CHECSUM_UNNECESSARY, and yet some still haven't got with the program > > even though we see more and more instances where CHECKSUM_UNNECESSARY > > doesn't even work at all (e.g. cases with SRv6, new encaps device > > doesn't understand). I believe it's time to take a stand! :-) > > I completely agree CHECKSUM_COMPLETE is more useful and robust than > CHECSUM_UNNECESSARY and I want to add support for it as soon as we > agree on the best way to do it. At the same time there are plenty of > XDP NICs where this feature is quite useful since they support just > CHECSUM_UNNECESSARY. > > Regards, > Lorenzo > > > > > Tom > > > > > Signed-off-by: David Ahern <dsahern@kernel.org> > > > Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> > > > --- > > > include/net/xdp.h | 36 ++++++++++++++++++++++++++++++++++++ > > > 1 file changed, 36 insertions(+) > > > > > > diff --git a/include/net/xdp.h b/include/net/xdp.h > > > index 5533f0ab2afc..e81ac505752b 100644 > > > --- a/include/net/xdp.h > > > +++ b/include/net/xdp.h > > > @@ -66,6 +66,13 @@ struct xdp_txq_info { > > > struct net_device *dev; > > > }; > > > > > > +/* xdp metadata bitmask */ > > > +#define XDP_CSUM_MASK GENMASK(1, 0) > > > +enum xdp_flags { > > > + XDP_CSUM_UNNECESSARY = BIT(0), > > > + XDP_CSUM_COMPLETE = BIT(1), > > > +}; > > > + > > > struct xdp_buff { > > > void *data; > > > void *data_end; > > > @@ -74,6 +81,7 @@ struct xdp_buff { > > > struct xdp_rxq_info *rxq; > > > struct xdp_txq_info *txq; > > > u32 frame_sz; /* frame size to deduce data_hard_end/reserved tailroom*/ > > > + u16 flags; /* xdp_flags */ > > > }; > > > > > > static __always_inline void > > > @@ -81,6 +89,7 @@ xdp_init_buff(struct xdp_buff *xdp, u32 frame_sz, struct xdp_rxq_info *rxq) > > > { > > > xdp->frame_sz = frame_sz; > > > xdp->rxq = rxq; > > > + xdp->flags = 0; > > > } > > > > > > static __always_inline void > > > @@ -95,6 +104,18 @@ xdp_prepare_buff(struct xdp_buff *xdp, unsigned char *hard_start, > > > xdp->data_meta = meta_valid ? data : data + 1; > > > } > > > > > > +static __always_inline void > > > +xdp_buff_get_csum(struct xdp_buff *xdp, struct sk_buff *skb) > > > +{ > > > + switch (xdp->flags & XDP_CSUM_MASK) { > > > + case XDP_CSUM_UNNECESSARY: > > > + skb->ip_summed = CHECKSUM_UNNECESSARY; > > > + break; > > > + default: > > > + break; > > > + } > > > +} > > > + > > > /* Reserve memory area at end-of data area. > > > * > > > * This macro reserves tailroom in the XDP buffer by limiting the > > > @@ -122,8 +143,21 @@ struct xdp_frame { > > > */ > > > struct xdp_mem_info mem; > > > struct net_device *dev_rx; /* used by cpumap */ > > > + u16 flags; /* xdp_flags */ > > > }; > > > > > > +static __always_inline void > > > +xdp_frame_get_csum(struct xdp_frame *xdpf, struct sk_buff *skb) > > > +{ > > > + switch (xdpf->flags & XDP_CSUM_MASK) { > > > + case XDP_CSUM_UNNECESSARY: > > > + skb->ip_summed = CHECKSUM_UNNECESSARY; > > > + break; > > > + default: > > > + break; > > > + } > > > +} > > > + > > > #define XDP_BULK_QUEUE_SIZE 16 > > > struct xdp_frame_bulk { > > > int count; > > > @@ -180,6 +214,7 @@ void xdp_convert_frame_to_buff(struct xdp_frame *frame, struct xdp_buff *xdp) > > > xdp->data_end = frame->data + frame->len; > > > xdp->data_meta = frame->data - frame->metasize; > > > xdp->frame_sz = frame->frame_sz; > > > + xdp->flags = frame->flags; > > > } > > > > > > static inline > > > @@ -206,6 +241,7 @@ int xdp_update_frame_from_buff(struct xdp_buff *xdp, > > > xdp_frame->headroom = headroom - sizeof(*xdp_frame); > > > xdp_frame->metasize = metasize; > > > xdp_frame->frame_sz = xdp->frame_sz; > > > + xdp_frame->flags = xdp->flags; > > > > > > return 0; > > > } > > > -- > > > 2.31.1 > > > -- Best regards, Jesper Dangaard Brouer MSc.CS, Principal Kernel Engineer at Red Hat LinkedIn: http://www.linkedin.com/in/brouer