Message ID | cover.1616692794.git.pabeni@redhat.com |
---|---|
Series | udp: GRO L4 improvements |

On Thu, Mar 25, 2021 at 1:24 PM Paolo Abeni <pabeni@redhat.com> wrote:
>
> If NETIF_F_GRO_FRAGLIST or NETIF_F_GRO_UDP_FWD are enabled, and there
> are UDP tunnels available in the system, udp_gro_receive() could end-up
> doing L4 aggregation (either SKB_GSO_UDP_L4 or SKB_GSO_FRAGLIST) at
> the outer UDP tunnel level for packets effectively carrying a UDP
> tunnel header.
>
> That could cause inner protocol corruption. If e.g. the relevant
> packets carry a vxlan header, different vxlan ids will be ignored/
> aggregated to the same GSO packet. Inner headers will be ignored, too,
> so that e.g. TCP over vxlan push packets will be held in the GRO
> engine till the next flush, etc.
>
> Just skip the SKB_GSO_UDP_L4 and SKB_GSO_FRAGLIST code path if the
> current packet could land in a UDP tunnel, and let udp_gro_receive()
> do GRO via udp_sk(sk)->gro_receive.
>
> The check implemented in this patch is broader than what is strictly
> needed, as the existing UDP tunnel could be e.g. configured on top of
> a different device: we could end-up skipping GRO at-all for some packets.
>
> Anyhow, the latter is a very thin corner case and covering it would add
> quite a bit of complexity.
>
> v1 -> v2:
> - hopefully clarify the commit message
>
> Fixes: 9fd1ff5d2ac7 ("udp: Support UDP fraglist GRO/GSO.")
> Fixes: 36707061d6ba ("udp: allow forwarding of plain (non-fraglisted) UDP GRO packets")
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Reviewed-by: Willem de Bruijn <willemb@google.com>

The key is that UDP tunnel GRO must take precedence over transport GRO,
but the way the code is structured, the latter is tried first.

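For readers following the precedence argument, here is a minimal sketch
of the decision described in the commit message above. It is a toy
model, not the kernel code: the enum, the struct and the function name
are all hypothetical, and the three booleans only stand in for the
conditions the commit message mentions (L4 GRO features enabled, the
packet possibly landing in a UDP tunnel, and the socket providing a
gro_receive hook).

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these names exist in the kernel. */
enum toy_gro_path {
	TOY_GRO_TUNNEL,	/* GRO via the tunnel socket's gro_receive hook */
	TOY_GRO_L4,	/* plain UDP aggregation (UDP_L4 or fraglist) */
	TOY_GRO_NONE	/* packet is not aggregated */
};

struct toy_gro_ctx {
	bool l4_gro_enabled;		/* stands in for the NETIF_F_GRO_* features */
	bool could_land_in_tunnel;	/* UDP tunnels are available in the system */
	bool sk_has_gro_receive;	/* a socket with a gro_receive hook was found */
};

static enum toy_gro_path toy_pick_gro_path(const struct toy_gro_ctx *c)
{
	assert(c);

	/*
	 * Tunnel GRO takes precedence: a packet that could carry a tunnel
	 * header is never L4-aggregated at the outer level. If no tunnel
	 * socket matches (e.g. the tunnel sits on another device), GRO is
	 * skipped entirely, which is the corner case the patch accepts.
	 */
	if (c->could_land_in_tunnel)
		return c->sk_has_gro_receive ? TOY_GRO_TUNNEL : TOY_GRO_NONE;

	return c->l4_gro_enabled ? TOY_GRO_L4 : TOY_GRO_NONE;
}
```

The point of the ordering is that the L4 branch is only reachable once
the tunnel case has been ruled out, which is the reverse of trying
transport GRO first.
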
On Thu, Mar 25, 2021 at 1:24 PM Paolo Abeni <pabeni@redhat.com> wrote:
>
> When UDP packets generated locally by a socket with UDP_SEGMENT
> traverse the following path:
>
> UDP tunnel(xmit) -> veth (segmentation) -> veth (gro) ->
> UDP tunnel (rx) -> UDP socket (no UDP_GRO)
>
> they are segmented as part of the rx socket receive operation, and
> present a CHECKSUM_NONE after segmentation.

would be good to capture how this happens, as it was not immediately obvious.

>
> Additionally the segmented packets UDP CB still refers to the original
> GSO packet len. Overall that causes unexpected/wrong csum validation
> errors later in the UDP receive path.
>
> We could possibly address the issue with some additional checks and
> csum mangling in the UDP tunnel code. Since the issue affects only
> this UDP receive slow path, let's set a suitable csum status there.
>
> v1 -> v2:
> - restrict the csum update to the packets strictly needing them
> - hopefully clarify the commit message and code comments
>
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>

> +       if (skb->ip_summed == CHECKSUM_NONE && !skb->csum_valid)
> +               skb->csum_valid = 1;

Not entirely obvious is that UDP packets arriving on a device with rx
checksum offload off, i.e., with CHECKSUM_NONE, are not matched by
this test.

I assume that such packets are not coalesced by the GRO layer in the
first place. But I can't immediately spot the reason for it..

On Fri, 2021-03-26 at 14:30 -0400, Willem de Bruijn wrote:
> On Thu, Mar 25, 2021 at 1:24 PM Paolo Abeni <pabeni@redhat.com> wrote:
> > When UDP packets generated locally by a socket with UDP_SEGMENT
> > traverse the following path:
> >
> > UDP tunnel(xmit) -> veth (segmentation) -> veth (gro) ->
> > UDP tunnel (rx) -> UDP socket (no UDP_GRO)
> >
> > they are segmented as part of the rx socket receive operation, and
> > present a CHECKSUM_NONE after segmentation.
>
> would be good to capture how this happens, as it was not immediately obvious.

The CHECKSUM_PARTIAL is propagated up to the UDP tunnel processing,
where we have:

__iptunnel_pull_header() -> skb_pull_rcsum() ->
skb_postpull_rcsum() -> __skb_postpull_rcsum()

and the latter does the conversion.

> > Additionally the segmented packets UDP CB still refers to the original
> > GSO packet len. Overall that causes unexpected/wrong csum validation
> > errors later in the UDP receive path.
> >
> > We could possibly address the issue with some additional checks and
> > csum mangling in the UDP tunnel code. Since the issue affects only
> > this UDP receive slow path, let's set a suitable csum status there.
> >
> > v1 -> v2:
> > - restrict the csum update to the packets strictly needing them
> > - hopefully clarify the commit message and code comments
> >
> > Signed-off-by: Paolo Abeni <pabeni@redhat.com>
>
> > +       if (skb->ip_summed == CHECKSUM_NONE && !skb->csum_valid)
> > +               skb->csum_valid = 1;
>
> Not entirely obvious is that UDP packets arriving on a device with rx
> checksum offload off, i.e., with CHECKSUM_NONE, are not matched by
> this test.
>
> I assume that such packets are not coalesced by the GRO layer in the
> first place. But I can't immediately spot the reason for it..

Packets with CHECKSUM_NONE are actually aggregated by the GRO engine.

Their checksum is validated by:

udp4_gro_receive -> skb_gro_checksum_validate_zero_check()
        -> __skb_gro_checksum_validate -> __skb_gro_checksum_validate_complete()

and skb->ip_summed is changed to CHECKSUM_UNNECESSARY by:

__skb_gro_checksum_validate -> skb_gro_incr_csum_unnecessary
        -> __skb_incr_checksum_unnecessary()

and finally to CHECKSUM_PARTIAL by:

udp4_gro_complete() -> udp_gro_complete() -> udp_gro_complete_segment()

Do you prefer I resubmit with some more comments, either in the commit
message or in the code?

Thanks

Paolo

side note: perf probe here is fooled by skb->ip_summed being a bitfield
and does not dump the real value. I had to look at
skb->__pkt_type_offset[0] instead.

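The ip_summed transitions listed in the message above can be sketched
as a small state model. This is only an illustration under stated
assumptions: the toy_* names are hypothetical, the three functions stand
in for the GRO receive, GRO complete and tunnel header pull steps named
in the call chains, and the tunnel pull always converts PARTIAL to NONE
here, whereas the real __skb_postpull_rcsum() may apply further
conditions that are not modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for the CHECKSUM_* states discussed in the thread. */
enum toy_csum { TOY_CSUM_NONE, TOY_CSUM_UNNECESSARY, TOY_CSUM_PARTIAL };

struct toy_skb {
	enum toy_csum ip_summed;
};

/*
 * GRO receive step: a CHECKSUM_NONE packet has its checksum validated in
 * software; on success it is marked UNNECESSARY, on failure it is not
 * aggregated. Returns true when the packet may be aggregated.
 */
static bool toy_gro_receive(struct toy_skb *skb, bool csum_ok)
{
	assert(skb);
	if (skb->ip_summed == TOY_CSUM_NONE) {
		if (!csum_ok)
			return false;
		skb->ip_summed = TOY_CSUM_UNNECESSARY;
	}
	return true;
}

/* GRO complete step: the aggregated packet ends up as PARTIAL. */
static void toy_gro_complete(struct toy_skb *skb)
{
	assert(skb);
	skb->ip_summed = TOY_CSUM_PARTIAL;
}

/* Tunnel header pull: models the PARTIAL -> NONE conversion. */
static void toy_tunnel_pull(struct toy_skb *skb)
{
	assert(skb);
	if (skb->ip_summed == TOY_CSUM_PARTIAL)
		skb->ip_summed = TOY_CSUM_NONE;
}
```

Running a CHECKSUM_NONE packet through receive, complete and tunnel pull
in that order ends in CHECKSUM_NONE again, even though the checksum was
already validated once; that is the state the csum_valid check in the
patch is meant to catch.
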
On Mon, Mar 29, 2021 at 7:26 AM Paolo Abeni <pabeni@redhat.com> wrote:
>
> On Fri, 2021-03-26 at 14:30 -0400, Willem de Bruijn wrote:
> > On Thu, Mar 25, 2021 at 1:24 PM Paolo Abeni <pabeni@redhat.com> wrote:
> > > When UDP packets generated locally by a socket with UDP_SEGMENT
> > > traverse the following path:
> > >
> > > UDP tunnel(xmit) -> veth (segmentation) -> veth (gro) ->
> > > UDP tunnel (rx) -> UDP socket (no UDP_GRO)
> > >
> > > they are segmented as part of the rx socket receive operation, and
> > > present a CHECKSUM_NONE after segmentation.
> >
> > would be good to capture how this happens, as it was not immediately obvious.
>
> The CHECKSUM_PARTIAL is propagated up to the UDP tunnel processing,
> where we have:
>
> __iptunnel_pull_header() -> skb_pull_rcsum() ->
> skb_postpull_rcsum() -> __skb_postpull_rcsum() and the latter do the
> conversion.

Please capture this in the commit message.

> > > Additionally the segmented packets UDP CB still refers to the original
> > > GSO packet len. Overall that causes unexpected/wrong csum validation
> > > errors later in the UDP receive path.
> > >
> > > We could possibly address the issue with some additional checks and
> > > csum mangling in the UDP tunnel code. Since the issue affects only
> > > this UDP receive slow path, let's set a suitable csum status there.
> > >
> > > v1 -> v2:
> > > - restrict the csum update to the packets strictly needing them
> > > - hopefully clarify the commit message and code comments
> > >
> > > Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> >
> > > +       if (skb->ip_summed == CHECKSUM_NONE && !skb->csum_valid)
> > > +               skb->csum_valid = 1;
> >
> > Not entirely obvious is that UDP packets arriving on a device with rx
> > checksum offload off, i.e., with CHECKSUM_NONE, are not matched by
> > this test.
> >
> > I assume that such packets are not coalesced by the GRO layer in the
> > first place. But I can't immediately spot the reason for it..
>
> Packets with CHECKSUM_NONE are actually aggregated by the GRO engine.
>
> Their checksum is validated by:
>
> udp4_gro_receive -> skb_gro_checksum_validate_zero_check()
>         -> __skb_gro_checksum_validate -> __skb_gro_checksum_validate_complete()
>
> and skb->ip_summed is changed to CHECKSUM_UNNECESSARY by:
>
> __skb_gro_checksum_validate -> skb_gro_incr_csum_unnecessary
>         -> __skb_incr_checksum_unnecessary()
>
> and finally to CHECKSUM_PARTIAL by:
>
> udp4_gro_complete() -> udp_gro_complete() -> udp_gro_complete_segment()
>
> Do you prefer I resubmit with some more comments, either in the commit
> message or in the code?

That breaks the checksum-and-copy optimization when delivering to
local sockets. I wonder if that is a regression.

On Mon, 2021-03-29 at 08:28 -0400, Willem de Bruijn wrote:
> On Mon, Mar 29, 2021 at 7:26 AM Paolo Abeni <pabeni@redhat.com> wrote:
> > On Fri, 2021-03-26 at 14:30 -0400, Willem de Bruijn wrote:
> > > On Thu, Mar 25, 2021 at 1:24 PM Paolo Abeni <pabeni@redhat.com> wrote:
> > > > When UDP packets generated locally by a socket with UDP_SEGMENT
> > > > traverse the following path:
> > > >
> > > > UDP tunnel(xmit) -> veth (segmentation) -> veth (gro) ->
> > > > UDP tunnel (rx) -> UDP socket (no UDP_GRO)
> > > >
> > > > they are segmented as part of the rx socket receive operation, and
> > > > present a CHECKSUM_NONE after segmentation.
> > >
> > > would be good to capture how this happens, as it was not immediately obvious.
> >
> > The CHECKSUM_PARTIAL is propagated up to the UDP tunnel processing,
> > where we have:
> >
> > __iptunnel_pull_header() -> skb_pull_rcsum() ->
> > skb_postpull_rcsum() -> __skb_postpull_rcsum() and the latter do the
> > conversion.
>
> Please capture this in the commit message.

I will do.

> > > > Additionally the segmented packets UDP CB still refers to the original
> > > > GSO packet len. Overall that causes unexpected/wrong csum validation
> > > > errors later in the UDP receive path.
> > > >
> > > > We could possibly address the issue with some additional checks and
> > > > csum mangling in the UDP tunnel code. Since the issue affects only
> > > > this UDP receive slow path, let's set a suitable csum status there.
> > > >
> > > > v1 -> v2:
> > > > - restrict the csum update to the packets strictly needing them
> > > > - hopefully clarify the commit message and code comments
> > > >
> > > > Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> > >
> > > > +       if (skb->ip_summed == CHECKSUM_NONE && !skb->csum_valid)
> > > > +               skb->csum_valid = 1;
> > >
> > > Not entirely obvious is that UDP packets arriving on a device with rx
> > > checksum offload off, i.e., with CHECKSUM_NONE, are not matched by
> > > this test.
> > >
> > > I assume that such packets are not coalesced by the GRO layer in the
> > > first place. But I can't immediately spot the reason for it..
> >
> > Packets with CHECKSUM_NONE are actually aggregated by the GRO engine.
> >
> > Their checksum is validated by:
> >
> > udp4_gro_receive -> skb_gro_checksum_validate_zero_check()
> >         -> __skb_gro_checksum_validate -> __skb_gro_checksum_validate_complete()
> >
> > and skb->ip_summed is changed to CHECKSUM_UNNECESSARY by:
> >
> > __skb_gro_checksum_validate -> skb_gro_incr_csum_unnecessary
> >         -> __skb_incr_checksum_unnecessary()
> >
> > and finally to CHECKSUM_PARTIAL by:
> >
> > udp4_gro_complete() -> udp_gro_complete() -> udp_gro_complete_segment()
> >
> > Do you prefer I resubmit with some more comments, either in the commit
> > message or in the code?
>
> That breaks the checksum-and-copy optimization when delivering to
> local sockets. I wonder if that is a regression.

The conversion to CHECKSUM_UNNECESSARY happens since
commit 573e8fca255a27e3573b51f9b183d62641c47a3d.

Even the conversion to CHECKSUM_PARTIAL happens independently from this
series, since commit 6f1c0ea133a6e4a193a7b285efe209664caeea43.

I don't see a regression here ?!?

Thanks!

Paolo

> > > > > Additionally the segmented packets UDP CB still refers to the original
> > > > > GSO packet len. Overall that causes unexpected/wrong csum validation
> > > > > errors later in the UDP receive path.
> > > > >
> > > > > We could possibly address the issue with some additional checks and
> > > > > csum mangling in the UDP tunnel code. Since the issue affects only
> > > > > this UDP receive slow path, let's set a suitable csum status there.
> > > > >
> > > > > v1 -> v2:
> > > > > - restrict the csum update to the packets strictly needing them
> > > > > - hopefully clarify the commit message and code comments
> > > > >
> > > > > Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> > > >
> > > > > +       if (skb->ip_summed == CHECKSUM_NONE && !skb->csum_valid)
> > > > > +               skb->csum_valid = 1;
> > > >
> > > > Not entirely obvious is that UDP packets arriving on a device with rx
> > > > checksum offload off, i.e., with CHECKSUM_NONE, are not matched by
> > > > this test.
> > > >
> > > > I assume that such packets are not coalesced by the GRO layer in the
> > > > first place. But I can't immediately spot the reason for it..
> > >
> > > Packets with CHECKSUM_NONE are actually aggregated by the GRO engine.
> > >
> > > Their checksum is validated by:
> > >
> > > udp4_gro_receive -> skb_gro_checksum_validate_zero_check()
> > >         -> __skb_gro_checksum_validate -> __skb_gro_checksum_validate_complete()
> > >
> > > and skb->ip_summed is changed to CHECKSUM_UNNECESSARY by:
> > >
> > > __skb_gro_checksum_validate -> skb_gro_incr_csum_unnecessary
> > >         -> __skb_incr_checksum_unnecessary()
> > >
> > > and finally to CHECKSUM_PARTIAL by:
> > >
> > > udp4_gro_complete() -> udp_gro_complete() -> udp_gro_complete_segment()
> > >
> > > Do you prefer I resubmit with some more comments, either in the commit
> > > message or in the code?
> >
> > That breaks the checksum-and-copy optimization when delivering to
> > local sockets. I wonder if that is a regression.
>
> The conversion to CHECKSUM_UNNECESSARY happens since
> commit 573e8fca255a27e3573b51f9b183d62641c47a3d.
>
> Even the conversion to CHECKSUM_PARTIAL happens independently from this
> series, since commit 6f1c0ea133a6e4a193a7b285efe209664caeea43.
>
> I don't see a regression here ?!?

I mean that UDP packets with local destination socket and no tunnels
that arrive with CHECKSUM_NONE normally benefit from the
checksum-and-copy optimization in recvmsg() when copying to user.

If those packets are now checksummed during GRO, that voids that
optimization, and the packet payload is now touched twice.

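To make the "touched twice" concern concrete, here is a minimal
userspace sketch of the checksum-and-copy idea: the Internet checksum
(ones' complement sum of 16-bit words, as in RFC 1071) is accumulated
in the same loop that copies the payload, so each byte is read once.
The function name toy_csum_and_copy() is hypothetical and this is not
the kernel's implementation, which is arch-optimized and works on
partial sums; it only illustrates why a separate validation pass during
GRO costs a second walk over the data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Copy len bytes from src to dst and return the 16-bit Internet checksum
 * of the copied data, reading every source byte exactly once.
 */
static uint16_t toy_csum_and_copy(uint8_t *dst, const uint8_t *src, size_t len)
{
	uint64_t sum = 0;
	size_t i;

	assert(dst && src);

	/* Copy and sum one big-endian 16-bit word per iteration. */
	for (i = 0; i + 1 < len; i += 2) {
		dst[i] = src[i];
		dst[i + 1] = src[i + 1];
		sum += ((uint32_t)src[i] << 8) | src[i + 1];
	}

	/* An odd trailing byte is padded with a zero low byte. */
	if (i < len) {
		dst[i] = src[i];
		sum += (uint32_t)src[i] << 8;
	}

	/* Fold the carries back in (ones' complement addition). */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);

	return (uint16_t)~sum;
}
```

If the checksum has already been verified earlier, for example by the
GRO engine, the copy to user space degenerates to a plain copy and the
verification has been paid for as an extra pass over the payload.
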
On Mon, 2021-03-29 at 09:52 -0400, Willem de Bruijn wrote:
> > > That breaks the checksum-and-copy optimization when delivering to
> > > local sockets. I wonder if that is a regression.
> >
> > The conversion to CHECKSUM_UNNECESSARY happens since
> > commit 573e8fca255a27e3573b51f9b183d62641c47a3d.
> >
> > Even the conversion to CHECKSUM_PARTIAL happens independently from this
> > series, since commit 6f1c0ea133a6e4a193a7b285efe209664caeea43.
> >
> > I don't see a regression here ?!?
>
> I mean that UDP packets with local destination socket and no tunnels
> that arrive with CHECKSUM_NONE normally benefit from the
> checksum-and-copy optimization in recvmsg() when copying to user.
>
> If those packets are now checksummed during GRO, that voids that
> optimization, and the packet payload is now touched twice.

The 'now' part confuses me. Nothing in this patch or this series
changes the processing of CHECKSUM_NONE UDP packets with no tunnel.

I do see checksum validation in the GRO engine for CHECKSUM_NONE UDP
packet prior to this series.

I *think* the checksum-and-copy optimization is lost
since 573e8fca255a27e3573b51f9b183d62641c47a3d.

Regards,

Paolo

On Mon, Mar 29, 2021 at 11:01 AM Paolo Abeni <pabeni@redhat.com> wrote:
>
> On Mon, 2021-03-29 at 09:52 -0400, Willem de Bruijn wrote:
> > > > That breaks the checksum-and-copy optimization when delivering to
> > > > local sockets. I wonder if that is a regression.
> > >
> > > The conversion to CHECKSUM_UNNECESSARY happens since
> > > commit 573e8fca255a27e3573b51f9b183d62641c47a3d.
> > >
> > > Even the conversion to CHECKSUM_PARTIAL happens independently from this
> > > series, since commit 6f1c0ea133a6e4a193a7b285efe209664caeea43.
> > >
> > > I don't see a regression here ?!?
> >
> > I mean that UDP packets with local destination socket and no tunnels
> > that arrive with CHECKSUM_NONE normally benefit from the
> > checksum-and-copy optimization in recvmsg() when copying to user.
> >
> > If those packets are now checksummed during GRO, that voids that
> > optimization, and the packet payload is now touched twice.
>
> The 'now' part confuses me. Nothing in this patch or this series
> changes the processing of CHECKSUM_NONE UDP packets with no tunnel.

Agreed. I did not mean to imply that this patch changes that.

I was responding to

> > > +       if (skb->ip_summed == CHECKSUM_NONE && !skb->csum_valid)
> > > +               skb->csum_valid = 1;
> >
> > Not entirely obvious is that UDP packets arriving on a device with rx
> > checksum offload off, i.e., with CHECKSUM_NONE, are not matched by
> > this test.
> >
> > I assume that such packets are not coalesced by the GRO layer in the
> > first place. But I can't immediately spot the reason for it..

As you point out, such packets will already have had their checksum
verified at this point, so this branch only matches tunneled packets.
That point is just not immediately obvious from the code.

> I do see checksum validation in the GRO engine for CHECKSUM_NONE UDP
> packet prior to this series.
>
> I *think* the checksum-and-copy optimization is lost
> since 573e8fca255a27e3573b51f9b183d62641c47a3d.

Wouldn't this have been introduced with UDP_GRO?

On Mon, 2021-03-29 at 11:24 -0400, Willem de Bruijn wrote:
> On Mon, Mar 29, 2021 at 11:01 AM Paolo Abeni <pabeni@redhat.com> wrote:
> > On Mon, 2021-03-29 at 09:52 -0400, Willem de Bruijn wrote:
> > > > +       if (skb->ip_summed == CHECKSUM_NONE && !skb->csum_valid)
> > > > +               skb->csum_valid = 1;
> > >
> > > Not entirely obvious is that UDP packets arriving on a device with rx
> > > checksum offload off, i.e., with CHECKSUM_NONE, are not matched by
> > > this test.
> > >
> > > I assume that such packets are not coalesced by the GRO layer in the
> > > first place. But I can't immediately spot the reason for it..
>
> As you point out, such packets will already have had their checksum
> verified at this point, so this branch only matches tunneled packets.
> That point is just not immediately obvious from the code.

I understand this is a matter of comment clarity ?!?

I'll rewrite the related code comment - in udp_post_segment_fix_csum()
- as:

/* UDP packets generated with UDP_SEGMENT and traversing:
 *
 * UDP tunnel(xmit) -> veth (segmentation) -> veth (gro) -> UDP tunnel (rx)
 *
 * land here with CHECKSUM_NONE, because __iptunnel_pull_header() converts
 * CHECKSUM_PARTIAL into NONE.
 * SKB_GSO_UDP_L4 or SKB_GSO_FRAGLIST packets with no UDP tunnel will land
 * here with valid checksum, as the GRO engine validates the UDP csum
 * before the aggregation and nobody strips such info in between.
 * Instead of adding another check in the tunnel fastpath, we can force
 * a valid csum here.
 * Additionally fixup the UDP CB.
 */

Would that be clear enough?

> > I do see checksum validation in the GRO engine for CHECKSUM_NONE UDP
> > packet prior to this series.
> >
> > I *think* the checksum-and-copy optimization is lost
> > since 573e8fca255a27e3573b51f9b183d62641c47a3d.
>
> Wouldn't this have been introduced with UDP_GRO?

Uhmm.... looks like the checksum-and-copy optimization has been lost
and recovered a few times. I think the last time was with
9fd1ff5d2ac7181844735806b0a703c942365291, which moves the csum
validation before the static branch on udp_encap_needed_key.

Can we agree re-introducing the optimization is independent from this
series?

Thanks!

Paolo

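As a rough illustration of what the proposed comment describes, the
post-segmentation fixup can be modelled as below. This is a sketch, not
the patch itself: struct toy_seg, its fields and
toy_post_segment_fix_csum() are hypothetical stand-ins for the skb, the
UDP CB length and udp_post_segment_fix_csum(); only the csum_valid test
is taken from the quoted hunk.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for the CHECKSUM_* states discussed in the thread. */
enum toy_csum { TOY_CSUM_NONE, TOY_CSUM_UNNECESSARY, TOY_CSUM_PARTIAL };

struct toy_seg {
	enum toy_csum ip_summed;
	bool csum_valid;
	unsigned int len;	/* length of this segment */
	unsigned int cb_len;	/* length recorded in the per-packet UDP CB */
};

static void toy_post_segment_fix_csum(struct toy_seg *seg)
{
	assert(seg);

	/* The CB still describes the original GSO packet: refresh it. */
	seg->cb_len = seg->len;

	/*
	 * Only segments that arrive as NONE and not yet validated (the
	 * tunneled case in the thread) get a forced valid checksum.
	 * Anything else is left untouched.
	 */
	if (seg->ip_summed == TOY_CSUM_NONE && !seg->csum_valid)
		seg->csum_valid = true;
}
```

Keeping the fixup in the post-segmentation slow path, rather than in the
tunnel code, matches the trade-off stated in the commit message: no
extra check on the tunnel fast path.
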
On Mon, Mar 29, 2021 at 12:24 PM Paolo Abeni <pabeni@redhat.com> wrote:
>
> On Mon, 2021-03-29 at 11:24 -0400, Willem de Bruijn wrote:
> > On Mon, Mar 29, 2021 at 11:01 AM Paolo Abeni <pabeni@redhat.com> wrote:
> > > On Mon, 2021-03-29 at 09:52 -0400, Willem de Bruijn wrote:
> > > > > +       if (skb->ip_summed == CHECKSUM_NONE && !skb->csum_valid)
> > > > > +               skb->csum_valid = 1;
> > > >
> > > > Not entirely obvious is that UDP packets arriving on a device with rx
> > > > checksum offload off, i.e., with CHECKSUM_NONE, are not matched by
> > > > this test.
> > > >
> > > > I assume that such packets are not coalesced by the GRO layer in the
> > > > first place. But I can't immediately spot the reason for it..
> >
> > As you point out, such packets will already have had their checksum
> > verified at this point, so this branch only matches tunneled packets.
> > That point is just not immediately obvious from the code.
>
> I understand is a matter of comment clarity ?!?
>
> I'll rewrite the related code comment - in udp_post_segment_fix_csum()
> - as:
>
> /* UDP packets generated with UDP_SEGMENT and traversing:
>  *
>  * UDP tunnel(xmit) -> veth (segmentation) -> veth (gro) -> UDP tunnel (rx)
>  *
>  * land here with CHECKSUM_NONE, because __iptunnel_pull_header() converts
>  * CHECKSUM_PARTIAL into NONE.
>  * SKB_GSO_UDP_L4 or SKB_GSO_FRAGLIST packets with no UDP tunnel will land
>  * here with valid checksum, as the GRO engine validates the UDP csum
>  * before the aggregation and nobody strips such info in between.
>  * Instead of adding another check in the tunnel fastpath, we can force
>  * a valid csum here.
>  * Additionally fixup the UDP CB.
>  */
>
> Would that be clear enough?

Definitely. Thanks!

> > > I do see checksum validation in the GRO engine for CHECKSUM_NONE UDP
> > > packet prior to this series.
> > >
> > > I *think* the checksum-and-copy optimization is lost
> > > since 573e8fca255a27e3573b51f9b183d62641c47a3d.
> >
> > Wouldn't this have been introduced with UDP_GRO?
>
> Uhmm.... looks like the checksum-and-copy optimization has been lost
> and recovered a few times. I think the last one
> with 9fd1ff5d2ac7181844735806b0a703c942365291, which move the csum
> validation before the static branch on udp_encap_needed_key.
>
> Can we agree re-introducing the optimization is independent from this
> series?

Yep :)

> Thanks!
>
> Paolo
>
>