Message ID | 20210226212248.8300-1-daniel@iogearbox.net |
---|---|
State | New |
Headers | show |
Series | [net] net: Fix gro aggregation for udp encaps with zero csum | expand |
On Fri, Feb 26, 2021 at 4:23 PM Daniel Borkmann <daniel@iogearbox.net> wrote: > > We noticed a GRO issue for UDP-based encaps such as vxlan/geneve when the > csum for the UDP header itself is 0. In that case, GRO aggregation does > not take place on the phys dev, but instead is deferred to the vxlan/geneve > driver (see trace below). > > The reason is essentially that GRO aggregation bails out in udp_gro_receive() > for such case when drivers marked the skb with CHECKSUM_UNNECESSARY (ice, i40e, > others) where for non-zero csums 2abb7cdc0dc8 ("udp: Add support for doing > checksum unnecessary conversion") promotes those skbs to CHECKSUM_COMPLETE > and napi context has csum_valid set. This is however not the case for zero > UDP csum (here: csum_cnt is still 0 and csum_valid continues to be false). > > At the same time 57c67ff4bd92 ("udp: additional GRO support") added matches > on !uh->check ^ !uh2->check as part to determine candidates for aggregation, > so it certainly is expected to handle zero csums in udp_gro_receive(). The > purpose of the check added via 662880f44203 ("net: Allow GRO to use and set > levels of checksum unnecessary") seems to catch bad csum and stop aggregation > right away. > > One way to fix aggregation in the zero case is to only perform the !csum_valid > check in udp_gro_receive() if uh->check is infact non-zero. > > Before: > > [...] > swapper 0 [008] 731.946506: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497100400 len=1500 (1) > swapper 0 [008] 731.946507: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497100200 len=1500 > swapper 0 [008] 731.946507: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497101100 len=1500 > swapper 0 [008] 731.946508: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497101700 len=1500 > swapper 0 [008] 731.946508: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497101b00 len=1500 > swapper 0 [008] 731.946508: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497100600 len=1500 > swapper 0 [008] 731.946508: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497100f00 len=1500 > swapper 0 [008] 731.946509: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497100a00 len=1500 > swapper 0 [008] 731.946516: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497100500 len=1500 > swapper 0 [008] 731.946516: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497100700 len=1500 > swapper 0 [008] 731.946516: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497101d00 len=1500 (2) > swapper 0 [008] 731.946517: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497101000 len=1500 > swapper 0 [008] 731.946517: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497101c00 len=1500 > swapper 0 [008] 731.946517: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497101400 len=1500 > swapper 0 [008] 731.946518: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497100e00 len=1500 > swapper 0 [008] 731.946518: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497101600 len=1500 > swapper 0 [008] 731.946521: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497100800 len=774 > swapper 0 [008] 731.946530: net:netif_receive_skb: dev=test_vxlan skbaddr=0xffff966497100400 len=14032 (1) > swapper 0 [008] 731.946530: net:netif_receive_skb: dev=test_vxlan skbaddr=0xffff966497101d00 len=9112 (2) > [...] > > # netperf -H 10.55.10.4 -t TCP_STREAM -l 20 > MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.55.10.4 () port 0 AF_INET : demo > Recv Send Send > Socket Socket Message Elapsed > Size Size Size Time Throughput > bytes bytes bytes secs. 10^6bits/sec > > 87380 16384 16384 20.01 13129.24 > > After: > > [...] > swapper 0 [026] 521.862641: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff93ab0d479000 len=11286 (1) > swapper 0 [026] 521.862643: net:netif_receive_skb: dev=test_vxlan skbaddr=0xffff93ab0d479000 len=11236 (1) > swapper 0 [026] 521.862650: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff93ab0d478500 len=2898 (2) > swapper 0 [026] 521.862650: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff93ab0d479f00 len=8490 (3) > swapper 0 [026] 521.862653: net:netif_receive_skb: dev=test_vxlan skbaddr=0xffff93ab0d478500 len=2848 (2) > swapper 0 [026] 521.862653: net:netif_receive_skb: dev=test_vxlan skbaddr=0xffff93ab0d479f00 len=8440 (3) > [...] > > # netperf -H 10.55.10.4 -t TCP_STREAM -l 20 > MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.55.10.4 () port 0 AF_INET : demo > Recv Send Send > Socket Socket Message Elapsed > Size Size Size Time Throughput > bytes bytes bytes secs. 10^6bits/sec > > 87380 16384 16384 20.01 24576.53 > > Fixes: 57c67ff4bd92 ("udp: additional GRO support") > Fixes: 662880f44203 ("net: Allow GRO to use and set levels of checksum unnecessary") > Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> > Cc: Eric Dumazet <edumazet@google.com> > Cc: Willem de Bruijn <willemb@google.com> > Cc: John Fastabend <john.fastabend@gmail.com> > Cc: Jesse Brandeburg <jesse.brandeburg@intel.com> > Cc: Tom Herbert <tom@herbertland.com> Makes sense to me. We cannot do checksum conversion with zero field, but that does not have to limit coalescing. CHECKSUM_COMPLETE with a checksum validated by skb_gro_checksum_validate_zero_check implies csum_valid. So the test > (skb->ip_summed != CHECKSUM_PARTIAL && > NAPI_GRO_CB(skb)->csum_cnt == 0 && > !NAPI_GRO_CB(skb)->csum_valid) || Basically matches - CHECKSUM_NONE - CHECKSUM_UNNECESSARY which has already used up its valid state on a prior header - CHECKSUM_COMPLETE with bad checksum. This change just refines to not drop for in the first two cases on a zero checksum field. Making this explicit in case anyone sees holes in the logic. Else, Acked-by: Willem de Bruijn <willemb@google.com>
Willem de Bruijn wrote: > On Fri, Feb 26, 2021 at 4:23 PM Daniel Borkmann <daniel@iogearbox.net> wrote: > > > > We noticed a GRO issue for UDP-based encaps such as vxlan/geneve when the > > csum for the UDP header itself is 0. In that case, GRO aggregation does > > not take place on the phys dev, but instead is deferred to the vxlan/geneve > > driver (see trace below). > > > > The reason is essentially that GRO aggregation bails out in udp_gro_receive() > > for such case when drivers marked the skb with CHECKSUM_UNNECESSARY (ice, i40e, > > others) where for non-zero csums 2abb7cdc0dc8 ("udp: Add support for doing > > checksum unnecessary conversion") promotes those skbs to CHECKSUM_COMPLETE > > and napi context has csum_valid set. This is however not the case for zero > > UDP csum (here: csum_cnt is still 0 and csum_valid continues to be false). > > > > At the same time 57c67ff4bd92 ("udp: additional GRO support") added matches > > on !uh->check ^ !uh2->check as part to determine candidates for aggregation, > > so it certainly is expected to handle zero csums in udp_gro_receive(). The > > purpose of the check added via 662880f44203 ("net: Allow GRO to use and set > > levels of checksum unnecessary") seems to catch bad csum and stop aggregation > > right away. > > > > One way to fix aggregation in the zero case is to only perform the !csum_valid > > check in udp_gro_receive() if uh->check is infact non-zero. > > [...] > We cannot do checksum conversion with zero field, but that does not > have to limit coalescing. > > CHECKSUM_COMPLETE with a checksum validated by > skb_gro_checksum_validate_zero_check implies csum_valid. > > So the test > > > (skb->ip_summed != CHECKSUM_PARTIAL && > > NAPI_GRO_CB(skb)->csum_cnt == 0 && > > !NAPI_GRO_CB(skb)->csum_valid) || > > Basically matches > > - CHECKSUM_NONE > - CHECKSUM_UNNECESSARY which has already used up its valid state on a > prior header > - CHECKSUM_COMPLETE with bad checksum. > > This change just refines to not drop for in the first two cases on a > zero checksum field. +1 > > Making this explicit in case anyone sees holes in the logic. Else, > > Acked-by: Willem de Bruijn <willemb@google.com> LGTM, Acked-by: John Fastabend <john.fastabend@gmail.com>
On Sat, 27 Feb 2021 09:16:38 -0800 John Fastabend wrote: > Willem de Bruijn wrote: > > On Fri, Feb 26, 2021 at 4:23 PM Daniel Borkmann <daniel@iogearbox.net> wrote: > > > > > > We noticed a GRO issue for UDP-based encaps such as vxlan/geneve when the > > > csum for the UDP header itself is 0. In that case, GRO aggregation does > > > not take place on the phys dev, but instead is deferred to the vxlan/geneve > > > driver (see trace below). > > > > > > The reason is essentially that GRO aggregation bails out in udp_gro_receive() > > > for such case when drivers marked the skb with CHECKSUM_UNNECESSARY (ice, i40e, > > > others) where for non-zero csums 2abb7cdc0dc8 ("udp: Add support for doing > > > checksum unnecessary conversion") promotes those skbs to CHECKSUM_COMPLETE > > > and napi context has csum_valid set. This is however not the case for zero > > > UDP csum (here: csum_cnt is still 0 and csum_valid continues to be false). > > > > > > At the same time 57c67ff4bd92 ("udp: additional GRO support") added matches > > > on !uh->check ^ !uh2->check as part to determine candidates for aggregation, > > > so it certainly is expected to handle zero csums in udp_gro_receive(). The > > > purpose of the check added via 662880f44203 ("net: Allow GRO to use and set > > > levels of checksum unnecessary") seems to catch bad csum and stop aggregation > > > right away. > > > > > > One way to fix aggregation in the zero case is to only perform the !csum_valid > > > check in udp_gro_receive() if uh->check is infact non-zero. > > > > Acked-by: Willem de Bruijn <willemb@google.com> > > Acked-by: John Fastabend <john.fastabend@gmail.com> Applied, thanks!
diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c index ff39e94781bf..a7f714c4b068 100644 --- a/net/ipv4/udp_offload.c +++ b/net/ipv4/udp_offload.c @@ -466,7 +466,7 @@ struct sk_buff *udp_gro_receive(struct list_head *head, struct sk_buff *skb, } if (!sk || NAPI_GRO_CB(skb)->encap_mark || - (skb->ip_summed != CHECKSUM_PARTIAL && + (uh->check && skb->ip_summed != CHECKSUM_PARTIAL && NAPI_GRO_CB(skb)->csum_cnt == 0 && !NAPI_GRO_CB(skb)->csum_valid) || !udp_sk(sk)->gro_receive)
We noticed a GRO issue for UDP-based encaps such as vxlan/geneve when the csum for the UDP header itself is 0. In that case, GRO aggregation does not take place on the phys dev, but instead is deferred to the vxlan/geneve driver (see trace below). The reason is essentially that GRO aggregation bails out in udp_gro_receive() for such case when drivers marked the skb with CHECKSUM_UNNECESSARY (ice, i40e, others) where for non-zero csums 2abb7cdc0dc8 ("udp: Add support for doing checksum unnecessary conversion") promotes those skbs to CHECKSUM_COMPLETE and napi context has csum_valid set. This is however not the case for zero UDP csum (here: csum_cnt is still 0 and csum_valid continues to be false). At the same time 57c67ff4bd92 ("udp: additional GRO support") added matches on !uh->check ^ !uh2->check as part to determine candidates for aggregation, so it certainly is expected to handle zero csums in udp_gro_receive(). The purpose of the check added via 662880f44203 ("net: Allow GRO to use and set levels of checksum unnecessary") seems to catch bad csum and stop aggregation right away. One way to fix aggregation in the zero case is to only perform the !csum_valid check in udp_gro_receive() if uh->check is infact non-zero. Before: [...] swapper 0 [008] 731.946506: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497100400 len=1500 (1) swapper 0 [008] 731.946507: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497100200 len=1500 swapper 0 [008] 731.946507: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497101100 len=1500 swapper 0 [008] 731.946508: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497101700 len=1500 swapper 0 [008] 731.946508: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497101b00 len=1500 swapper 0 [008] 731.946508: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497100600 len=1500 swapper 0 [008] 731.946508: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497100f00 len=1500 swapper 0 [008] 731.946509: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497100a00 len=1500 swapper 0 [008] 731.946516: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497100500 len=1500 swapper 0 [008] 731.946516: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497100700 len=1500 swapper 0 [008] 731.946516: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497101d00 len=1500 (2) swapper 0 [008] 731.946517: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497101000 len=1500 swapper 0 [008] 731.946517: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497101c00 len=1500 swapper 0 [008] 731.946517: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497101400 len=1500 swapper 0 [008] 731.946518: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497100e00 len=1500 swapper 0 [008] 731.946518: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497101600 len=1500 swapper 0 [008] 731.946521: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497100800 len=774 swapper 0 [008] 731.946530: net:netif_receive_skb: dev=test_vxlan skbaddr=0xffff966497100400 len=14032 (1) swapper 0 [008] 731.946530: net:netif_receive_skb: dev=test_vxlan skbaddr=0xffff966497101d00 len=9112 (2) [...] # netperf -H 10.55.10.4 -t TCP_STREAM -l 20 MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.55.10.4 () port 0 AF_INET : demo Recv Send Send Socket Socket Message Elapsed Size Size Size Time Throughput bytes bytes bytes secs. 10^6bits/sec 87380 16384 16384 20.01 13129.24 After: [...] swapper 0 [026] 521.862641: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff93ab0d479000 len=11286 (1) swapper 0 [026] 521.862643: net:netif_receive_skb: dev=test_vxlan skbaddr=0xffff93ab0d479000 len=11236 (1) swapper 0 [026] 521.862650: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff93ab0d478500 len=2898 (2) swapper 0 [026] 521.862650: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff93ab0d479f00 len=8490 (3) swapper 0 [026] 521.862653: net:netif_receive_skb: dev=test_vxlan skbaddr=0xffff93ab0d478500 len=2848 (2) swapper 0 [026] 521.862653: net:netif_receive_skb: dev=test_vxlan skbaddr=0xffff93ab0d479f00 len=8440 (3) [...] # netperf -H 10.55.10.4 -t TCP_STREAM -l 20 MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.55.10.4 () port 0 AF_INET : demo Recv Send Send Socket Socket Message Elapsed Size Size Size Time Throughput bytes bytes bytes secs. 10^6bits/sec 87380 16384 16384 20.01 24576.53 Fixes: 57c67ff4bd92 ("udp: additional GRO support") Fixes: 662880f44203 ("net: Allow GRO to use and set levels of checksum unnecessary") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Willem de Bruijn <willemb@google.com> Cc: John Fastabend <john.fastabend@gmail.com> Cc: Jesse Brandeburg <jesse.brandeburg@intel.com> Cc: Tom Herbert <tom@herbertland.com> --- net/ipv4/udp_offload.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)