Message ID | 20210819083854.156996-1-bpoirier@nvidia.com |
---|---|
State | New |
Series | [net-next] doc: Document unexpected tcp_l3mdev_accept=1 behavior |
On 8/19/21 2:38 AM, Benjamin Poirier wrote:
> As suggested by David, document a somewhat unexpected behavior that results
> from net.ipv4.tcp_l3mdev_accept=1. This behavior was encountered while
> debugging FRR, a VRF-aware application, on a system which used
> net.ipv4.tcp_l3mdev_accept=1 and where TCP connections for BGP with MD5
> keys were failing to establish.
>
> Cc: David Ahern <dsahern@gmail.com>
> Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com>
> ---
>  Documentation/networking/vrf.rst | 13 +++++++++++++
>  1 file changed, 13 insertions(+)
>
> diff --git a/Documentation/networking/vrf.rst b/Documentation/networking/vrf.rst
> index 0dde145043bc..0a9a6f968cb9 100644
> --- a/Documentation/networking/vrf.rst
> +++ b/Documentation/networking/vrf.rst
> @@ -144,6 +144,19 @@ default VRF are only handled by a socket not bound to any VRF::
>  netfilter rules on the VRF device can be used to limit access to services
>  running in the default VRF context as well.
>
> +Using VRF-aware applications (applications which simultaneously create sockets
> +outside and inside VRFs) in conjunction with ``net.ipv4.tcp_l3mdev_accept=1``
> +is possible but may lead to problems in some situations. With that sysctl
> +value, it is unspecified which listening socket will be selected to handle
> +connections for VRF traffic; ie. either a socket bound to the VRF or an unbound
> +socket may be used to accept new connections from a VRF. This somewhat
> +unexpected behavior can lead to problems if sockets are configured with extra
> +options (ex. TCP MD5 keys) with the expectation that VRF traffic will
> +exclusively be handled by sockets bound to VRFs, as would be the case with
> +``net.ipv4.tcp_l3mdev_accept=0``. Finally and as a reminder, regardless of
> +which listening socket is selected, established sockets will be created in the
> +VRF based on the ingress interface, as documented earlier.
> +
>  --------------------------------------------------------------------------------
>
>  Using iproute2 for VRFs
>

Reviewed-by: David Ahern <dsahern@kernel.org>

I don't have the cycles right now, but if you or someone else has time
it would be good to look at ways to improve the current situation.
diff --git a/Documentation/networking/vrf.rst b/Documentation/networking/vrf.rst
index 0dde145043bc..0a9a6f968cb9 100644
--- a/Documentation/networking/vrf.rst
+++ b/Documentation/networking/vrf.rst
@@ -144,6 +144,19 @@ default VRF are only handled by a socket not bound to any VRF::
 netfilter rules on the VRF device can be used to limit access to services
 running in the default VRF context as well.
 
+Using VRF-aware applications (applications which simultaneously create sockets
+outside and inside VRFs) in conjunction with ``net.ipv4.tcp_l3mdev_accept=1``
+is possible but may lead to problems in some situations. With that sysctl
+value, it is unspecified which listening socket will be selected to handle
+connections for VRF traffic; ie. either a socket bound to the VRF or an unbound
+socket may be used to accept new connections from a VRF. This somewhat
+unexpected behavior can lead to problems if sockets are configured with extra
+options (ex. TCP MD5 keys) with the expectation that VRF traffic will
+exclusively be handled by sockets bound to VRFs, as would be the case with
+``net.ipv4.tcp_l3mdev_accept=0``. Finally and as a reminder, regardless of
+which listening socket is selected, established sockets will be created in the
+VRF based on the ingress interface, as documented earlier.
+
 --------------------------------------------------------------------------------
 
 Using iproute2 for VRFs
As suggested by David, document a somewhat unexpected behavior that results
from net.ipv4.tcp_l3mdev_accept=1. This behavior was encountered while
debugging FRR, a VRF-aware application, on a system which used
net.ipv4.tcp_l3mdev_accept=1 and where TCP connections for BGP with MD5
keys were failing to establish.

Cc: David Ahern <dsahern@gmail.com>
Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com>
---
 Documentation/networking/vrf.rst | 13 +++++++++++++
 1 file changed, 13 insertions(+)
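
For readers who want the selection rule from the added paragraph spelled out, here is a minimal sketch in Python. It is a toy model of the documentation text only, not kernel code: the function names, the listener labels and the VRF name "red" are hypothetical, and the only real interface used is the procfs path that corresponds to net.ipv4.tcp_l3mdev_accept. With the sysctl set to 1 the model returns both kinds of listener because the text says the choice between them is unspecified; the list order carries no meaning.

```python
from pathlib import Path

# procfs file backing the net.ipv4.tcp_l3mdev_accept sysctl.
SYSCTL_PATH = Path("/proc/sys/net/ipv4/tcp_l3mdev_accept")


def read_tcp_l3mdev_accept(path=SYSCTL_PATH):
    """Return the sysctl value as an int, or None if it cannot be read."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


def candidate_listeners(l3mdev_accept, listeners, ingress_vrf):
    """Model which listening sockets may accept a connection from a VRF.

    listeners is a list of (label, bound_vrf) pairs, where bound_vrf is
    None for a socket not bound to any VRF. ingress_vrf is the name of
    the VRF the connection arrives in (hypothetical names throughout).
    """
    if ingress_vrf is None:
        raise ValueError("this sketch only models traffic arriving in a VRF")
    bound = [label for label, vrf in listeners if vrf == ingress_vrf]
    unbound = [label for label, vrf in listeners if vrf is None]
    if l3mdev_accept == 1:
        # Unspecified which one is picked: either kind is a candidate.
        return bound + unbound
    # With the sysctl at 0, only sockets bound to the VRF handle VRF traffic.
    return bound


if __name__ == "__main__":
    listeners = [("bgp-default", None), ("bgp-red", "red")]
    value = read_tcp_l3mdev_accept()
    print("tcp_l3mdev_accept:", value)
    print("candidates:", candidate_listeners(value, listeners, "red"))
```

In the FRR case from the commit message, an MD5 key configured only on the VRF-bound listener would not help a connection that the unbound listener ends up handling, which is the situation the model's two-element result is meant to illustrate.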