Message ID | 20201016111156.26927-1-ovov@yandex-team.ru |
---|---|
State | Superseded |
Headers | show |
Series | [net] ip6_tunnel: set inner ipproto before ip6_tnl_encap. | expand |
On Fri, Oct 16, 2020 at 7:14 AM Alexander Ovechkin <ovov@yandex-team.ru> wrote: > > ip6_tnl_encap assigns to proto transport protocol which > encapsulates inner packet, but we must pass to set_inner_ipproto > protocol of that inner packet. > > Calling set_inner_ipproto after ip6_tnl_encap might break gso. > For example, in case of encapsulating ipv6 packet in fou6 packet, inner_ipproto > would be set to IPPROTO_UDP instead of IPPROTO_IPV6. This would lead to > incorrect calling sequence of gso functions: > ipv6_gso_segment -> udp6_ufo_fragment -> skb_udp_tunnel_segment -> udp6_ufo_fragment > instead of: > ipv6_gso_segment -> udp6_ufo_fragment -> skb_udp_tunnel_segment -> ip6ip6_gso_segment > > Signed-off-by: Alexander Ovechkin <ovov@yandex-team.ru> Commit 6c11fbf97e69 ("ip6_tunnel: add MPLS transmit support") moved the call from ip6_tnl_encap's caller to inside ip6_tnl_encap. It makes sense that that likely broke this behavior for UDP (L4) tunnels. But it was moved on purpose to avoid setting the inner protocol to IPPROTO_MPLS. That needs to use skb->inner_protocol to further segment. I suspect we need to set this before or after conditionally to avoid breaking that use case.
On 16.10.2020 18:55, Willem de Bruijn wrote: > On Fri, Oct 16, 2020 at 7:14 AM Alexander Ovechkin <ovov@yandex-team.ru> wrote: >> ip6_tnl_encap assigns to proto transport protocol which >> encapsulates inner packet, but we must pass to set_inner_ipproto >> protocol of that inner packet. >> >> Calling set_inner_ipproto after ip6_tnl_encap might break gso. >> For example, in case of encapsulating ipv6 packet in fou6 packet, inner_ipproto >> would be set to IPPROTO_UDP instead of IPPROTO_IPV6. This would lead to >> incorrect calling sequence of gso functions: >> ipv6_gso_segment -> udp6_ufo_fragment -> skb_udp_tunnel_segment -> udp6_ufo_fragment >> instead of: >> ipv6_gso_segment -> udp6_ufo_fragment -> skb_udp_tunnel_segment -> ip6ip6_gso_segment >> >> Signed-off-by: Alexander Ovechkin <ovov@yandex-team.ru> > Commit 6c11fbf97e69 ("ip6_tunnel: add MPLS transmit support") moved > the call from ip6_tnl_encap's caller to inside ip6_tnl_encap. > > It makes sense that that likely broke this behavior for UDP (L4) tunnels. > > But it was moved on purpose to avoid setting the inner protocol to > IPPROTO_MPLS. That needs to use skb->inner_protocol to further > segment. > > I suspect we need to set this before or after conditionally to avoid > breaking that use case. I hope it could be fixed with something like this: diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c index a0217e5..87368b0 100644 --- a/net/ipv6/ip6_tunnel.c +++ b/net/ipv6/ip6_tunnel.c @@ -1121,6 +1121,7 @@ int ip6_tnl_xmit(struct sk_buff *skb, struct net_device *dev, __u8 dsfield, bool use_cache = false; u8 hop_limit; int err = -1; + __u8 pproto = proto; if (t->parms.collect_md) { hop_limit = skb_tunnel_info(skb)->key.ttl; @@ -1280,7 +1281,7 @@ int ip6_tnl_xmit(struct sk_buff *skb, struct net_device *dev, __u8 dsfield, ipv6_push_frag_opts(skb, &opt.ops, &proto); } - skb_set_inner_ipproto(skb, proto); + skb_set_inner_ipproto(skb, pproto == IPPROTO_MPLS ? proto : pproto); skb_push(skb, sizeof(struct ipv6hdr)); skb_reset_network_header(skb);
> But it was moved on purpose to avoid setting the inner protocol to IPPROTO_MPLS. That needs to use skb->inner_protocol to further segment. And why do we need to avoid setting the inner protocol to IPPROTO_MPLS? Currently skb->inner_protocol is used before call of ip6_tnl_xmit. Can you please give example when this patch breaks MPLS segmentation? > On 16 Oct 2020, at 20:55, Willem de Bruijn <willemdebruijn.kernel@gmail.com> wrote: > > On Fri, Oct 16, 2020 at 7:14 AM Alexander Ovechkin <ovov@yandex-team.ru> wrote: >> >> ip6_tnl_encap assigns to proto transport protocol which >> encapsulates inner packet, but we must pass to set_inner_ipproto >> protocol of that inner packet. >> >> Calling set_inner_ipproto after ip6_tnl_encap might break gso. >> For example, in case of encapsulating ipv6 packet in fou6 packet, inner_ipproto >> would be set to IPPROTO_UDP instead of IPPROTO_IPV6. This would lead to >> incorrect calling sequence of gso functions: >> ipv6_gso_segment -> udp6_ufo_fragment -> skb_udp_tunnel_segment -> udp6_ufo_fragment >> instead of: >> ipv6_gso_segment -> udp6_ufo_fragment -> skb_udp_tunnel_segment -> ip6ip6_gso_segment >> >> Signed-off-by: Alexander Ovechkin <ovov@yandex-team.ru> > > Commit 6c11fbf97e69 ("ip6_tunnel: add MPLS transmit support") moved > the call from ip6_tnl_encap's caller to inside ip6_tnl_encap. > > It makes sense that that likely broke this behavior for UDP (L4) tunnels. > > But it was moved on purpose to avoid setting the inner protocol to > IPPROTO_MPLS. That needs to use skb->inner_protocol to further > segment. > > I suspect we need to set this before or after conditionally to avoid > breaking that use case.
But ip6_tnl_encap assigns to proto number of outer protocol (i.e. = protocol that encapsulates our original packet). Setting inner_ipproto = to this value makes no sense.=20 For example in case of ipv6 inside MPLS inside fou6 encapsulation we = have following packet structure: +--------------+ <---+ | ipv6 | | +--------------+ +- fou6-encap | udp | | +--------------+ <---+ | mpls | <--- mpls-enacp +--------------+ <---+ | inner-ipv6 | | +--------------+ +- original packet | ... | | +--------------+ <---+ After ip6_tnl_encap proto will be equal to IPPROTO_UDP, that is = obviously is not inner ipproto. Actually if pproto =3D=3D IPPROTO_MPLS than we have two layers of = encapsulation and it is meaningless to set inner ipproto, cause = currently there is no support for segmentation of packets with two = layers of encapsulation. > On 17 Oct 2020, at 03:59, Vadim Fedorenko <vfedorenko@novek.ru> wrote: > > On 16.10.2020 18:55, Willem de Bruijn wrote: >> On Fri, Oct 16, 2020 at 7:14 AM Alexander Ovechkin <ovov@yandex-team.ru> wrote: >>> ip6_tnl_encap assigns to proto transport protocol which >>> encapsulates inner packet, but we must pass to set_inner_ipproto >>> protocol of that inner packet. >>> >>> Calling set_inner_ipproto after ip6_tnl_encap might break gso. >>> For example, in case of encapsulating ipv6 packet in fou6 packet, inner_ipproto >>> would be set to IPPROTO_UDP instead of IPPROTO_IPV6. This would lead to >>> incorrect calling sequence of gso functions: >>> ipv6_gso_segment -> udp6_ufo_fragment -> skb_udp_tunnel_segment -> udp6_ufo_fragment >>> instead of: >>> ipv6_gso_segment -> udp6_ufo_fragment -> skb_udp_tunnel_segment -> ip6ip6_gso_segment >>> >>> Signed-off-by: Alexander Ovechkin <ovov@yandex-team.ru> >> Commit 6c11fbf97e69 ("ip6_tunnel: add MPLS transmit support") moved >> the call from ip6_tnl_encap's caller to inside ip6_tnl_encap. >> >> It makes sense that that likely broke this behavior for UDP (L4) tunnels. >> >> But it was moved on purpose to avoid setting the inner protocol to >> IPPROTO_MPLS. That needs to use skb->inner_protocol to further >> segment. >> >> I suspect we need to set this before or after conditionally to avoid >> breaking that use case. > I hope it could be fixed with something like this: > > diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c > index a0217e5..87368b0 100644 > --- a/net/ipv6/ip6_tunnel.c > +++ b/net/ipv6/ip6_tunnel.c > @@ -1121,6 +1121,7 @@ int ip6_tnl_xmit(struct sk_buff *skb, struct net_device *dev, __u8 dsfield, > bool use_cache = false; > u8 hop_limit; > int err = -1; > + __u8 pproto = proto; > > if (t->parms.collect_md) { > hop_limit = skb_tunnel_info(skb)->key.ttl; > @@ -1280,7 +1281,7 @@ int ip6_tnl_xmit(struct sk_buff *skb, struct net_device *dev, __u8 dsfield, > ipv6_push_frag_opts(skb, &opt.ops, &proto); > } > > - skb_set_inner_ipproto(skb, proto); > + skb_set_inner_ipproto(skb, pproto == IPPROTO_MPLS ? proto : pproto); > > skb_push(skb, sizeof(struct ipv6hdr)); > skb_reset_network_header(skb); >
On Tue, Oct 27, 2020 at 5:52 PM Alexander Ovechkin <ovov@yandex-team.ru> wrote: > > > But it was moved on purpose to avoid setting the inner protocol to IPPROTO_MPLS. That needs to use skb->inner_protocol to further segment. > And why do we need to avoid setting the inner protocol to IPPROTO_MPLS? Currently skb->inner_protocol is used before call of ip6_tnl_xmit. > Can you please give example when this patch breaks MPLS segmentation? mpls_gso_segment calls skb_mac_gso_segment on the inner packet. After setting skb->protocol based on skb->inner_protocol. perhaps mpls encap gso and udp tunnel gso simply cannot be enabled together, because both use skb->inner_(ipproto|protocol) to demultiplex: 18 163 net/ipv4/udp_offload.c <<skb_udp_tunnel_segment>> protocol = skb->inner_protocol; 19 35 net/mpls/mpls_gso.c <<mpls_gso_segment>> skb->protocol = skb->inner_protocol; 3 168 net/ipv4/udp_offload.c <<skb_udp_tunnel_segment>> ops = rcu_dereference(offloads[skb->inner_ipproto]); Please don't top post, btw. > > On 16 Oct 2020, at 20:55, Willem de Bruijn <willemdebruijn.kernel@gmail.com> wrote: > > > > On Fri, Oct 16, 2020 at 7:14 AM Alexander Ovechkin <ovov@yandex-team.ru> wrote: > >> > >> ip6_tnl_encap assigns to proto transport protocol which > >> encapsulates inner packet, but we must pass to set_inner_ipproto > >> protocol of that inner packet. > >> > >> Calling set_inner_ipproto after ip6_tnl_encap might break gso. > >> For example, in case of encapsulating ipv6 packet in fou6 packet, inner_ipproto > >> would be set to IPPROTO_UDP instead of IPPROTO_IPV6. This would lead to > >> incorrect calling sequence of gso functions: > >> ipv6_gso_segment -> udp6_ufo_fragment -> skb_udp_tunnel_segment -> udp6_ufo_fragment > >> instead of: > >> ipv6_gso_segment -> udp6_ufo_fragment -> skb_udp_tunnel_segment -> ip6ip6_gso_segment > >> > >> Signed-off-by: Alexander Ovechkin <ovov@yandex-team.ru> > > > > Commit 6c11fbf97e69 ("ip6_tunnel: add MPLS transmit support") moved > > the call from ip6_tnl_encap's caller to inside ip6_tnl_encap. > > > > It makes sense that that likely broke this behavior for UDP (L4) tunnels. > > > > But it was moved on purpose to avoid setting the inner protocol to > > IPPROTO_MPLS. That needs to use skb->inner_protocol to further > > segment. > > > > I suspect we need to set this before or after conditionally to avoid > > breaking that use case. >
On 28 Oct 2020, at 01:53 UTC Willem de Bruijn <willemdebruijn.kernel@gmail.com> wrote: > On Tue, Oct 27, 2020 at 5:52 PM Alexander Ovechkin <ovov@yandex-team.ru> wrote: > > > > > But it was moved on purpose to avoid setting the inner protocol to IPPROTO_MPLS. That needs to use skb->inner_protocol to further segment. > > And why do we need to avoid setting the inner protocol to IPPROTO_MPLS? Currently skb->inner_protocol is used before call of ip6_tnl_xmit. > > Can you please give example when this patch breaks MPLS segmentation? > > mpls_gso_segment calls skb_mac_gso_segment on the inner packet. After > setting skb->protocol based on skb->inner_protocol. Yeah, but mpls_gso_segment is called before ip6_tnl_xmit (because tun devices don't have NETIF_F_GSO_SOFTWARE in their mpls_features), so it does not matter to what value ip6_tnl_xmit sets skb->inner_ipproto. And even if gso would been called after both mpls_xmit and ip6_tnl_xmit it would fail as you have written.
On Thu, Oct 29, 2020 at 3:46 AM Alexander Ovechkin <ovov@yandex-team.ru> wrote: > > On 28 Oct 2020, at 01:53 UTC Willem de Bruijn <willemdebruijn.kernel@gmail.com> wrote: > > On Tue, Oct 27, 2020 at 5:52 PM Alexander Ovechkin <ovov@yandex-team.ru> wrote: > > > > > > > But it was moved on purpose to avoid setting the inner protocol to IPPROTO_MPLS. That needs to use skb->inner_protocol to further segment. > > > And why do we need to avoid setting the inner protocol to IPPROTO_MPLS? Currently skb->inner_protocol is used before call of ip6_tnl_xmit. > > > Can you please give example when this patch breaks MPLS segmentation? > > > > mpls_gso_segment calls skb_mac_gso_segment on the inner packet. After > > setting skb->protocol based on skb->inner_protocol. > > Yeah, but mpls_gso_segment is called before ip6_tnl_xmit (because tun devices don't have NETIF_F_GSO_SOFTWARE in their mpls_features), so it does not matter to what value ip6_tnl_xmit sets skb->inner_ipproto. > And even if gso would been called after both mpls_xmit and ip6_tnl_xmit it would fail as you have written. Good point. Okay, if no mpls gso packets can make it here, then it should not matter. Vadim, are we missing another reason for this move? Else, no other concerns from me. Please do add a Fixes tag.
On 29.10.2020 14:40, Willem de Bruijn wrote: > On Thu, Oct 29, 2020 at 3:46 AM Alexander Ovechkin <ovov@yandex-team.ru> wrote: >> On 28 Oct 2020, at 01:53 UTC Willem de Bruijn <willemdebruijn.kernel@gmail.com> wrote: >>> On Tue, Oct 27, 2020 at 5:52 PM Alexander Ovechkin <ovov@yandex-team.ru> wrote: >>>>> But it was moved on purpose to avoid setting the inner protocol to IPPROTO_MPLS. That needs to use skb->inner_protocol to further segment. >>>> And why do we need to avoid setting the inner protocol to IPPROTO_MPLS? Currently skb->inner_protocol is used before call of ip6_tnl_xmit. >>>> Can you please give example when this patch breaks MPLS segmentation? >>> mpls_gso_segment calls skb_mac_gso_segment on the inner packet. After >>> setting skb->protocol based on skb->inner_protocol. >> Yeah, but mpls_gso_segment is called before ip6_tnl_xmit (because tun devices don't have NETIF_F_GSO_SOFTWARE in their mpls_features), so it does not matter to what value ip6_tnl_xmit sets skb->inner_ipproto. >> And even if gso would been called after both mpls_xmit and ip6_tnl_xmit it would fail as you have written. > Good point. Okay, if no mpls gso packets can make it here, then it > should not matter. > > Vadim, are we missing another reason for this move? > > Else, no other concerns from me. Please do add a Fixes tag. I need a bit of time to repeat all the tests I've done earlier. Will be back soon with the results.
On 29.10.2020 14:40, Willem de Bruijn wrote: > On Thu, Oct 29, 2020 at 3:46 AM Alexander Ovechkin <ovov@yandex-team.ru> wrote: >> On 28 Oct 2020, at 01:53 UTC Willem de Bruijn <willemdebruijn.kernel@gmail.com> wrote: >>> On Tue, Oct 27, 2020 at 5:52 PM Alexander Ovechkin <ovov@yandex-team.ru> wrote: >>>>> But it was moved on purpose to avoid setting the inner protocol to IPPROTO_MPLS. That needs to use skb->inner_protocol to further segment. >>>> And why do we need to avoid setting the inner protocol to IPPROTO_MPLS? Currently skb->inner_protocol is used before call of ip6_tnl_xmit. >>>> Can you please give example when this patch breaks MPLS segmentation? >>> mpls_gso_segment calls skb_mac_gso_segment on the inner packet. After >>> setting skb->protocol based on skb->inner_protocol. >> Yeah, but mpls_gso_segment is called before ip6_tnl_xmit (because tun devices don't have NETIF_F_GSO_SOFTWARE in their mpls_features), so it does not matter to what value ip6_tnl_xmit sets skb->inner_ipproto. >> And even if gso would been called after both mpls_xmit and ip6_tnl_xmit it would fail as you have written. > Good point. Okay, if no mpls gso packets can make it here, then it > should not matter. > > Vadim, are we missing another reason for this move? > > Else, no other concerns from me. Please do add a Fixes tag. Could not reproduce the bug. Could you please provide a test scenario? Anyway, all my scenarious with MPLS-in-IPv6 and MPLS-in-GUE are working so I'm ok with moving
On 30 Oct 2020, at 14:01, Vadim Fedorenko <vfedorenko@novek.ru> wrote:
> Could not reproduce the bug. Could you please provide a test scenario?
It can be reproduced if your net device doesn’t support udp tunnel segmentation (i.e its features do not have SKB_GSO_UDP_TUNNEL).
If you try to send packet larger than the MTU fou6-only tunnel (without any other encap) it will be dropped, because of invalid skb->inner_ipproto (that will be equal to IPPROTO_UDP — outer protocol, instead of IPPROTO_IPV6).
skb->inner_ipproto is used here:
https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git/tree/net/ipv4/udp_offload.c?id=07e0887302450a62f51dba72df6afb5fabb23d1c#n168
On 30.10.2020 12:54, Alexander Ovechkin wrote: > On 30 Oct 2020, at 14:01, Vadim Fedorenko <vfedorenko@novek.ru> wrote: >> Could not reproduce the bug. Could you please provide a test scenario? > It can be reproduced if your net device doesn’t support udp tunnel segmentation (i.e its features do not have SKB_GSO_UDP_TUNNEL). > If you try to send packet larger than the MTU fou6-only tunnel (without any other encap) it will be dropped, because of invalid skb->inner_ipproto (that will be equal to IPPROTO_UDP — outer protocol, instead of IPPROTO_IPV6). > skb->inner_ipproto is used here: > https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git/tree/net/ipv4/udp_offload.c?id=07e0887302450a62f51dba72df6afb5fabb23d1c#n168 Ok, all my tests show that MPLS encap is working after this moving, so I have no concerns too.
diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c index a0217e5bf3bc..648db3fe508f 100644 --- a/net/ipv6/ip6_tunnel.c +++ b/net/ipv6/ip6_tunnel.c @@ -1271,6 +1271,8 @@ int ip6_tnl_xmit(struct sk_buff *skb, struct net_device *dev, __u8 dsfield, if (max_headroom > dev->needed_headroom) dev->needed_headroom = max_headroom; + skb_set_inner_ipproto(skb, proto); + err = ip6_tnl_encap(skb, t, &proto, fl6); if (err) return err; @@ -1280,8 +1282,6 @@ int ip6_tnl_xmit(struct sk_buff *skb, struct net_device *dev, __u8 dsfield, ipv6_push_frag_opts(skb, &opt.ops, &proto); } - skb_set_inner_ipproto(skb, proto); - skb_push(skb, sizeof(struct ipv6hdr)); skb_reset_network_header(skb); ipv6h = ipv6_hdr(skb);
ip6_tnl_encap assigns to proto transport protocol which encapsulates inner packet, but we must pass to set_inner_ipproto protocol of that inner packet. Calling set_inner_ipproto after ip6_tnl_encap might break gso. For example, in case of encapsulating ipv6 packet in fou6 packet, inner_ipproto would be set to IPPROTO_UDP instead of IPPROTO_IPV6. This would lead to incorrect calling sequence of gso functions: ipv6_gso_segment -> udp6_ufo_fragment -> skb_udp_tunnel_segment -> udp6_ufo_fragment instead of: ipv6_gso_segment -> udp6_ufo_fragment -> skb_udp_tunnel_segment -> ip6ip6_gso_segment Signed-off-by: Alexander Ovechkin <ovov@yandex-team.ru> --- net/ipv6/ip6_tunnel.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)