diff mbox series

[RFC,v2,net-next,17/17] net: bridge: offloaded ports are always promiscuous

Message ID 20210224114350.2791260-18-olteanv@gmail.com
State New
Headers show
Series RX filtering in DSA | expand

Commit Message

Vladimir Oltean Feb. 24, 2021, 11:43 a.m. UTC
From: Vladimir Oltean <vladimir.oltean@nxp.com>

Automatic bridge ports are ones with source address learning or unicast
flooding enabled (corollary: non-automatic ports have learning and
flooding disabled).

The bridge driver has an optimization which says that if all ports are
non-automatic, they don't need to operate in promiscuous mode (i.e. they
don't need to receive all packets). Instead, if a non-automatic port
supports unicast filtering, it can be made to receive only the packets
which have a static FDB entry towards another port in the same forwarding
domain. The logic is that, if a packet is received and does not have a
static FDB entry for routing, it would be dropped anyway because all the
other ports have flooding disabled. So it makes sense to not accept the
packet on RX in the first place.

When a non-automatic port switches to non-promiscuous mode, the static
FDB entries towards the other bridge ports are synced to the RX filtering
list of this ingress port using dev_uc_add.

However, this optimization doesn't bring any benefit for switchdev ports
that offload the bridge. Their hardware data path is promiscuous by its
very definition, i.e. they accept packets regardless of destination MAC
address, because they need to forward them towards the correct station.

Not only is the optimization not necessary, it is also detrimential.
The promise of switchdev is that it looks just like a regular interface
to user space, and it offers extra offloading functionality for stacked
virtual interfaces that can benefit from it. Therefore, it is imaginable
that a switchdev driver might want to implement IFF_UNICAST_FLT too.
When not offloading the bridge, a switchdev interface should really be
indistinguishable from a normal port from user space's perspective,
which means that addresses installed with dev_uc_add and dev_mc_add
should be accepted, and those which aren't should be dropped.

So a switchdev driver might implement dev_uc_add and dev_mc_add by
extracting these addresses from the hardware data path and delivering
them to the CPU, and drop the rest by disabling flooding to the CPU,
and this makes perfect sense when there is no bridge involved on top.

However, things get complicated when the bridge ports are non-automatic
and enter non-promiscuous mode. The bridge will then panic 'oh no, I
need to do something in order for my packets to not get dropped', and
will do the dev_uc_add procedure mentioned above. This will result in
the undesirable behavior that the switchdev driver will extract those
MAC addresses to the CPU, when in fact all that the bridge wanted was
for the packets to not be dropped.

To avoid this situation, the switchdev driver would need to conditionally
accept an address added through dev_uc_add and extract it to the CPU
only if it wasn't added by the bridge, which is both complicated,
strange and counterproductive. It is already unfortunate enough that the
bridge uses its own notification mechanisms for addresses which need to
be extracted (SWITCHDEV_OBJ_ID_HOST_MDB, SWITCHDEV_FDB_ADD_TO_DEVICE
with dev=br0). It shouldn't monopolize the switchdev driver's
functionality and instead it should allow it to offer its services to
other layers which are unaware of switchdev.

So this patch's premise is that the bridge, which is fully aware of
switchdev anyway, is the one that needs to compromise, and must not do
something which isn't needed if switchdev is being used to offload a
port. This way, dev_uc_add and dev_mc_add can be used as a valid mechanism
for address filtering towards the CPU requested by switchdev-unaware
layers ('towards the CPU' because switchdev-unaware will not benefit
from the hardware offload datapath anyway, that's effectively the only
destination which is relevant for them).

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
 net/bridge/br_if.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)
diff mbox series

Patch

diff --git a/net/bridge/br_if.c b/net/bridge/br_if.c
index 680fc3bed549..fc32066bbfdc 100644
--- a/net/bridge/br_if.c
+++ b/net/bridge/br_if.c
@@ -111,9 +111,13 @@  static void br_port_clear_promisc(struct net_bridge_port *p)
 	/* Check if the port is already non-promisc or if it doesn't
 	 * support UNICAST filtering.  Without unicast filtering support
 	 * we'll end up re-enabling promisc mode anyway, so just check for
-	 * it here.
+	 * it here. Also, a switchdev offloading this port needs to be
+	 * promiscuous by definition, so don't even attempt to get it out of
+	 * promiscuous mode or sync unicast FDB entries to it, since that is
+	 * pointless and not necessary.
 	 */
-	if (!br_promisc_port(p) || !(p->dev->priv_flags & IFF_UNICAST_FLT))
+	if (!br_promisc_port(p) || !(p->dev->priv_flags & IFF_UNICAST_FLT) ||
+	    p->offloaded)
 		return;
 
 	/* Since we'll be clearing the promisc mode, program the port