Message ID | 20220707152930.1789437-6-netdev@kapio-technology.com |
---|---|
State | New |
Headers | show |
Series | Extend locked port feature with FDB locked flag (MAC-Auth/MAB) | expand |
On Thu, Jul 07, 2022 at 05:29:29PM +0200, Hans Schultz wrote: > This implementation for the Marvell mv88e6xxx chip series, > is based on handling ATU miss violations occurring when packets > ingress on a port that is locked. The mac address triggering > the ATU miss violation will be added to the ATU with a zero-DPV, > and is then communicated through switchdev to the bridge module, > which adds a fdb entry with the fdb locked flag set. The entry > is kept according to the bridges ageing time, thus simulating a > dynamic entry. > > As this is essentially a form of CPU based learning, the amount > of locked entries will be limited by a hardcoded value for now, > so as to prevent DOS attacks. > > Signed-off-by: Hans Schultz <netdev@kapio-technology.com> > --- > static void assert_reg_lock(struct mv88e6xxx_chip *chip) > { > @@ -919,6 +920,13 @@ static void mv88e6xxx_mac_link_down(struct dsa_switch *ds, int port, > if (err) > dev_err(chip->dev, > "p%d: failed to force MAC link down\n", port); > + else > + if (mv88e6xxx_port_is_locked(chip, port)) { > + err = mv88e6xxx_atu_locked_entry_flush(ds, port); > + if (err) > + dev_err(chip->dev, > + "p%d: failed to clear locked entries\n", port); > + } > } > > static void mv88e6xxx_mac_link_up(struct dsa_switch *ds, int port, > @@ -1685,6 +1693,12 @@ static void mv88e6xxx_port_fast_age(struct dsa_switch *ds, int port) > struct mv88e6xxx_chip *chip = ds->priv; > int err; > > + if (mv88e6xxx_port_is_locked(chip, port)) > + err = mv88e6xxx_atu_locked_entry_flush(ds, port); > + if (err) Kernel build robot pointed this out too, but it's worth repeating to make sure it doesn't get missed. If the port isn't locked, this function returns a junk integer from the stack which isn't zero, so it's treated as an error code. > + dev_err(chip->ds->dev, "p%d: failed to clear locked entries: %d\n", > + port, err); > + > mv88e6xxx_reg_lock(chip); > err = mv88e6xxx_port_fast_age_fid(chip, port, 0); > mv88e6xxx_reg_unlock(chip); > @@ -1721,11 +1735,11 @@ static int mv88e6xxx_vtu_get(struct mv88e6xxx_chip *chip, u16 vid, > return err; > } > > int mv88e6xxx_port_set_lock(struct mv88e6xxx_chip *chip, int port, > bool locked) > { > @@ -1257,13 +1270,18 @@ int mv88e6xxx_port_set_lock(struct mv88e6xxx_chip *chip, int port, > if (err) > return err; > > - err = mv88e6xxx_port_read(chip, port, MV88E6XXX_PORT_ASSOC_VECTOR, ®); > - if (err) > - return err; > + if (!locked) { > + err = mv88e6xxx_atu_locked_entry_flush(chip->ds, port); Did you re-run the selftest with v4? Because this deadlocks due to the double reg_lock illustrated below: + vrf_name=vlan1 + ip vrf exec vlan1 ping 192.0.2.2 -c 10 -i 0.1 -w 5 + check_err 1 'Ping did not work after locking port and adding FDB entry' + local err=1 + local 'msg=Ping did not work after locking port and adding FDB entry' + [[ 0 -eq 0 ]] + [[ 1 -ne 0 ]] + RET=1 + retmsg='Ping did not work after locking port and adding FDB entry' + bridge link set dev lan2 locked off [ 733.273994] [ 733.275515] ============================================ [ 733.280823] WARNING: possible recursive locking detected [ 733.286133] 5.19.0-rc6-07010-ga9b9500ffaac-dirty #3293 Not tainted [ 733.292311] -------------------------------------------- [ 733.297613] kworker/0:0/601 is trying to acquire lock: [ 733.302751] ffff00000213c110 (&chip->reg_lock){+.+.}-{4:4}, at: mv88e6xxx_atu_locked_entry_purge+0x70/0x1a4 [ 733.312524] [ 733.312524] but task is already holding lock: [ 733.318351] ffff00000213c110 (&chip->reg_lock){+.+.}-{4:4}, at: mv88e6xxx_port_bridge_flags+0x48/0x234 [ 733.327674] [ 733.327674] other info that might help us debug this: [ 733.334198] Possible unsafe locking scenario: [ 733.334198] [ 733.340109] CPU0 [ 733.342549] ---- [ 733.344990] lock(&chip->reg_lock); [ 733.348567] lock(&chip->reg_lock); [ 733.352149] [ 733.352149] *** DEADLOCK *** [ 733.352149] [ 733.358063] May be due to missing lock nesting notation [ 733.358063] [ 733.364846] 6 locks held by kworker/0:0/601: [ 733.369115] #0: ffff00000000b748 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x1f4/0x6c4 [ 733.378541] #1: ffff80000c43bdc8 (deferred_process_work){+.+.}-{0:0}, at: process_one_work+0x1f4/0x6c4 [ 733.387966] #2: ffff80000b020cb0 (rtnl_mutex){+.+.}-{4:4}, at: rtnl_lock+0x1c/0x30 [ 733.395660] #3: ffff80000b073370 ((switchdev_blocking_notif_chain).rwsem){++++}-{4:4}, at: blocking_notifier_call_chain+0x34/0xa0 [ 733.407432] #4: ffff00000213c110 (&chip->reg_lock){+.+.}-{4:4}, at: mv88e6xxx_port_bridge_flags+0x48/0x234 [ 733.417202] #5: ffff00000213e0f0 (&p->ale_list_lock){+.+.}-{4:4}, at: mv88e6xxx_atu_locked_entry_flush+0x4c/0xc0 [ 733.427495] [ 733.427495] stack backtrace: [ 733.431858] CPU: 0 PID: 601 Comm: kworker/0:0 Not tainted 5.19.0-rc6-07010-ga9b9500ffaac-dirty #3293 [ 733.440992] Hardware name: CZ.NIC Turris Mox Board (DT) [ 733.446219] Workqueue: events switchdev_deferred_process_work [ 733.451982] Call trace: [ 733.454424] dump_backtrace.part.0+0xcc/0xe0 [ 733.458702] show_stack+0x18/0x6c [ 733.462028] dump_stack_lvl+0x8c/0xb8 [ 733.465703] dump_stack+0x18/0x34 [ 733.469029] __lock_acquire+0x1074/0x1fd0 [ 733.473052] lock_acquire.part.0+0xe4/0x220 [ 733.477244] lock_acquire+0x68/0x8c [ 733.480744] __mutex_lock+0x9c/0x460 [ 733.484332] mutex_lock_nested+0x40/0x50 [ 733.488268] mv88e6xxx_atu_locked_entry_purge+0x70/0x1a4 [ 733.493592] mv88e6xxx_atu_locked_entry_flush+0x7c/0xc0 [ 733.498827] mv88e6xxx_port_set_lock+0xfc/0x10c [ 733.503374] mv88e6xxx_port_bridge_flags+0x200/0x234 [ 733.508351] dsa_port_bridge_flags+0x44/0xe0 [ 733.512635] dsa_slave_port_attr_set+0x1ec/0x230 [ 733.517268] __switchdev_handle_port_attr_set+0x58/0x100 [ 733.522594] switchdev_handle_port_attr_set+0x10/0x24 [ 733.527659] dsa_slave_switchdev_blocking_event+0x8c/0xd4 [ 733.533074] blocking_notifier_call_chain+0x6c/0xa0 [ 733.537968] switchdev_port_attr_notify.constprop.0+0x4c/0xb0 [ 733.543729] switchdev_port_attr_set_deferred+0x24/0x80 [ 733.548967] switchdev_deferred_process+0x90/0x164 [ 733.553774] switchdev_deferred_process_work+0x14/0x2c [ 733.558926] process_one_work+0x28c/0x6c4 [ 733.562949] worker_thread+0x74/0x450 [ 733.566623] kthread+0x118/0x11c [ 733.569860] ret_from_fork+0x10/0x20 ++ mac_get lan1 ++ local if_name=lan1 ++ jq -r '.[]["address"]' ++ ip -j link show dev lan1 I've tentatively fixed this in my tree by taking the register lock more localized to the sub-functions of mv88e6xxx_port_bridge_flags(), and not taking it at caller side for mv88e6xxx_port_set_lock(), but rather letting the callee take it: diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c index 7b57ac121589..ec5954f32774 100644 --- a/drivers/net/dsa/mv88e6xxx/chip.c +++ b/drivers/net/dsa/mv88e6xxx/chip.c @@ -6557,41 +6557,47 @@ static int mv88e6xxx_port_bridge_flags(struct dsa_switch *ds, int port, struct mv88e6xxx_chip *chip = ds->priv; int err = -EOPNOTSUPP; - mv88e6xxx_reg_lock(chip); - if (flags.mask & BR_LEARNING) { bool learning = !!(flags.val & BR_LEARNING); u16 pav = learning ? (1 << port) : 0; + mv88e6xxx_reg_lock(chip); err = mv88e6xxx_port_set_assoc_vector(chip, port, pav); + mv88e6xxx_reg_unlock(chip); if (err) - goto out; + return err; } if (flags.mask & BR_FLOOD) { bool unicast = !!(flags.val & BR_FLOOD); + mv88e6xxx_reg_lock(chip); err = chip->info->ops->port_set_ucast_flood(chip, port, unicast); + mv88e6xxx_reg_unlock(chip); if (err) - goto out; + return err; } if (flags.mask & BR_MCAST_FLOOD) { bool multicast = !!(flags.val & BR_MCAST_FLOOD); + mv88e6xxx_reg_lock(chip); err = chip->info->ops->port_set_mcast_flood(chip, port, multicast); + mv88e6xxx_reg_unlock(chip); if (err) - goto out; + return err; } if (flags.mask & BR_BCAST_FLOOD) { bool broadcast = !!(flags.val & BR_BCAST_FLOOD); + mv88e6xxx_reg_lock(chip); err = mv88e6xxx_port_broadcast_sync(chip, port, broadcast); + mv88e6xxx_reg_unlock(chip); if (err) - goto out; + return err; } if (flags.mask & BR_PORT_LOCKED) { @@ -6599,10 +6605,8 @@ static int mv88e6xxx_port_bridge_flags(struct dsa_switch *ds, int port, err = mv88e6xxx_port_set_lock(chip, port, locked); if (err) - goto out; + return err; } -out: - mv88e6xxx_reg_unlock(chip); return err; } diff --git a/drivers/net/dsa/mv88e6xxx/port.c b/drivers/net/dsa/mv88e6xxx/port.c index 5e4d5e9265a4..2a60aded6fbe 100644 --- a/drivers/net/dsa/mv88e6xxx/port.c +++ b/drivers/net/dsa/mv88e6xxx/port.c @@ -1222,15 +1222,19 @@ int mv88e6xxx_port_set_lock(struct mv88e6xxx_chip *chip, int port, u16 reg; int err; + mv88e6xxx_reg_lock(chip); err = mv88e6xxx_port_read(chip, port, MV88E6XXX_PORT_CTL0, ®); - if (err) + if (err) { + mv88e6xxx_reg_unlock(chip); return err; + } reg &= ~MV88E6XXX_PORT_CTL0_SA_FILT_MASK; if (locked) reg |= MV88E6XXX_PORT_CTL0_SA_FILT_DROP_ON_LOCK; err = mv88e6xxx_port_write(chip, port, MV88E6XXX_PORT_CTL0, reg); + mv88e6xxx_reg_unlock(chip); if (err) return err; @@ -1247,7 +1251,11 @@ int mv88e6xxx_port_set_lock(struct mv88e6xxx_chip *chip, int port, MV88E6XXX_PORT_ASSOC_VECTOR_LOCKED_PORT; } - return mv88e6xxx_port_write(chip, port, MV88E6XXX_PORT_ASSOC_VECTOR, reg); + mv88e6xxx_reg_lock(chip); + err = mv88e6xxx_port_write(chip, port, MV88E6XXX_PORT_ASSOC_VECTOR, reg); + mv88e6xxx_reg_unlock(chip); + + return err; } int mv88e6xxx_port_set_8021q_mode(struct mv88e6xxx_chip *chip, int port, > + if (err) > + return err; > + } > > - reg &= ~MV88E6XXX_PORT_ASSOC_VECTOR_LOCKED_PORT; > - if (locked) > - reg |= MV88E6XXX_PORT_ASSOC_VECTOR_LOCKED_PORT; > + reg = 0; > + if (locked) { > + reg = (1 << port); > + reg |= MV88E6XXX_PORT_ASSOC_VECTOR_IGNORE_WRONG | > + MV88E6XXX_PORT_ASSOC_VECTOR_LOCKED_PORT; > + } > > return mv88e6xxx_port_write(chip, port, MV88E6XXX_PORT_ASSOC_VECTOR, reg); > } > diff --git a/drivers/net/dsa/mv88e6xxx/port.h b/drivers/net/dsa/mv88e6xxx/port.h > index e0a705d82019..5c1d485e7442 100644 > --- a/drivers/net/dsa/mv88e6xxx/port.h > +++ b/drivers/net/dsa/mv88e6xxx/port.h > @@ -231,6 +231,7 @@ > #define MV88E6XXX_PORT_ASSOC_VECTOR_LOCKED_PORT 0x2000 > #define MV88E6XXX_PORT_ASSOC_VECTOR_IGNORE_WRONG 0x1000 > #define MV88E6XXX_PORT_ASSOC_VECTOR_REFRESH_LOCKED 0x0800 > +#define MV88E6XXX_PORT_ASSOC_VECTOR_PAV_MASK 0x07ff > > /* Offset 0x0C: Port ATU Control */ > #define MV88E6XXX_PORT_ATU_CTL 0x0c > @@ -374,6 +375,7 @@ int mv88e6xxx_port_set_fid(struct mv88e6xxx_chip *chip, int port, u16 fid); > int mv88e6xxx_port_get_pvid(struct mv88e6xxx_chip *chip, int port, u16 *pvid); > int mv88e6xxx_port_set_pvid(struct mv88e6xxx_chip *chip, int port, u16 pvid); > > +bool mv88e6xxx_port_is_locked(struct mv88e6xxx_chip *chip, int port); > int mv88e6xxx_port_set_lock(struct mv88e6xxx_chip *chip, int port, > bool locked); > Other than that, after going through some hoops and patching iproute2, the switch appears to do something, I do see locked FDB entries: [root@mox:~/.../drivers/net/dsa] # bridge fdb show dev lan2 00:01:02:03:04:01 vlan 1 locked extern_learn offload master br0 00:01:02:03:04:02 vlan 1 master br0 permanent 00:01:02:03:04:02 master br0 permanent however all selftests except for MAB appear to fail: [root@mox:~/.../drivers/net/dsa] # ./bridge_locked_port.sh lan1 lan2 lan3 lan4 [ 2394.084584] mv88e6085 d0032004.mdio-mii:10: ATU miss violation for 00:01:02:03:04:01 portvec 4 spid 2 [ 2398.984189] mv88e6085 d0032004.mdio-mii:10: ATU miss violation for 00:01:02:03:04:01 portvec 4 spid 2 RTNETLINK answers: File exists TEST: Locked port ipv4 [FAIL] Ping did not work after locking port and adding FDB entry [ 2410.452638] mv88e6085 d0032004.mdio-mii:10: ATU miss violation for 00:01:02:03:04:01 portvec 4 spid 2 [ 2415.366824] mv88e6085 d0032004.mdio-mii:10: ATU miss violation for 00:01:02:03:04:01 portvec 4 spid 2 RTNETLINK answers: File exists TEST: Locked port ipv6 [FAIL] Ping6 did not work after locking port and adding FDB entry [ 2425.653013] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 2425.663784] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 2425.752771] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 2425.853188] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 2425.954193] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 2426.054593] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 2426.155646] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 2426.256733] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 2426.357159] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 2426.457638] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 2427.354018] mv88e6085 d0032004.mdio-mii:10: ATU miss violation for 00:01:02:03:04:01 portvec 4 spid 2 [ 2430.679471] mv88e6xxx_g1_vtu_prob_irq_thread_fn: 33 callbacks suppressed [ 2430.679502] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 2430.723742] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 2430.783752] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 2430.887742] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 2430.991738] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 2431.095714] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 2431.199505] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 2431.303700] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 2431.407707] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 2431.511674] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 RTNETLINK answers: File exists [ 2435.683201] mv88e6xxx_g1_vtu_prob_irq_thread_fn: 28 callbacks suppressed [ 2435.683232] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 2435.787060] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 2435.891223] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 2435.995249] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 2436.099218] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 2436.203215] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 2436.307207] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 2436.411175] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 2436.515179] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 2436.619220] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 2440.712164] mv88e6xxx_g1_vtu_prob_irq_thread_fn: 26 callbacks suppressed [ 2440.712194] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 2440.812777] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 2440.913630] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 2441.014523] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 2441.114196] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 2441.214961] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 TEST: Locked port vlan [FAIL] Ping through vlan did not work after locking port and adding FDB entry [ 2443.945008] mv88e6085 d0032004.mdio-mii:10: ATU miss violation for 00:01:02:03:04:01 portvec 4 spid 2 [ 2448.131708] mv88e6085 d0032004.mdio-mii:10: ATU miss violation for 00:01:02:03:04:01 portvec 4 spid 2 TEST: Locked port MAB [ OK ] [ 2455.139841] mv88e6085 d0032004.mdio-mii:10 lan3: Link is Down [ 2455.158623] br0: port 2(lan3) entered disabled state [ 2455.288477] mv88e6085 d0032004.mdio-mii:10 lan2: Link is Down [ 2455.309996] br0: port 1(lan2) entered disabled state [ 2455.446341] device lan3 left promiscuous mode [ 2455.452208] br0: port 2(lan3) entered disabled state [ 2455.574232] device lan2 left promiscuous mode [ 2455.580378] br0: port 1(lan2) entered disabled state [ 2455.906058] mv88e6085 d0032004.mdio-mii:10 lan4: Link is Down [ 2456.069399] mv88e6085 d0032004.mdio-mii:10 lan1: Link is Down Compare this to the same selftest, run just before applying the mv88e6xxx MAB patch: TEST: Locked port ipv4 [ OK ] TEST: Locked port ipv6 [ OK ] [ 119.080114] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 119.091009] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 119.180557] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 119.280916] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 119.381256] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 119.481618] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 119.581993] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 119.682416] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 119.782799] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 119.883177] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 124.100262] mv88e6xxx_g1_vtu_prob_irq_thread_fn: 33 callbacks suppressed [ 124.100293] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 124.136182] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 124.204408] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 124.312440] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 124.416384] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 124.520361] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 124.624347] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 124.728344] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 124.832336] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 124.936341] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 130.014610] mv88e6xxx_g1_vtu_prob_irq_thread_fn: 20 callbacks suppressed [ 130.014637] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 130.118292] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 130.219203] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 130.324289] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 130.335695] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 130.424680] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 130.525600] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 130.626466] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 130.728275] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 130.829422] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 TEST: Locked port vlan [ OK ] [ 133.926477] mv88e6085 d0032004.mdio-mii:10: ATU miss violation for 00:01:02:03:04:01 portvec 4 spid 2 [ 134.028380] mv88e6085 d0032004.mdio-mii:10: ATU miss violation for 00:01:02:03:04:01 portvec 4 spid 2 [ 134.132571] mv88e6085 d0032004.mdio-mii:10: ATU miss violation for 00:01:02:03:04:01 portvec 4 spid 2 [ 134.244056] mv88e6085 d0032004.mdio-mii:10: ATU miss violation for 00:01:02:03:04:01 portvec 4 spid 2 [ 134.344703] mv88e6085 d0032004.mdio-mii:10: ATU miss violation for 00:01:02:03:04:01 portvec 4 spid 2 [ 134.448416] mv88e6085 d0032004.mdio-mii:10: ATU miss violation for 00:01:02:03:04:01 portvec 4 spid 2 [ 134.552692] mv88e6085 d0032004.mdio-mii:10: ATU miss violation for 00:01:02:03:04:01 portvec 4 spid 2 [ 134.656642] mv88e6085 d0032004.mdio-mii:10: ATU miss violation for 00:01:02:03:04:01 portvec 4 spid 2 [ 134.760786] mv88e6085 d0032004.mdio-mii:10: ATU miss violation for 00:01:02:03:04:01 portvec 4 spid 2 [ 134.864605] mv88e6085 d0032004.mdio-mii:10: ATU miss violation for 00:01:02:03:04:01 portvec 4 spid 2 [ 139.247209] mv88e6xxx_g1_atu_prob_irq_thread_fn: 41 callbacks suppressed [ 139.247241] mv88e6085 d0032004.mdio-mii:10: ATU miss violation for 00:01:02:03:04:01 portvec 4 spid 2 [ 144.358773] mv88e6085 d0032004.mdio-mii:10: ATU miss violation for 00:01:02:03:04:01 portvec 4 spid 2 TEST: Locked port MAB [FAIL] MAB: No locked fdb entry after ping on locked port In other words, this patch set makes MAB work and breaks everything else. I'm willing to investigate exactly what is it that breaks the other selftest, but not today. It may be related to the "RTNETLINK answers: File exists" messages, which themselves come from the commands | bridge fdb add 00:01:02:03:04:01 dev lan2 master static If I were to randomly guess at almost 4AM in the morning, it has to do with "bridge fdb add" rather than the "bridge fdb replace" that's used for the MAB selftest. The fact I pointed out a few revisions ago, that MAB needs to be opt-in, is now coming back to bite us. Since it's not opt-in, the mv88e6xxx driver always creates locked FDB entries, and when we try to "bridge fdb add", the kernel says "hey, the FDB entry is already there!". Is that it? As for how to opt into MAB. Hmm. MAB seems to be essentially CPU assisted learning, which creates locked FDB entries. I wonder whether we should reconsider the position that address learning makes no sense on locked ports, and say that "+locked -learning" means no MAB, and "+locked +learning" means MAB? This would make a bunch of things more natural to handle in the kernel, and would also give us the opt-in we need. Side note, the VTU and ATU member violation printks annoy me so badly. They aren't stating something super useful and they're a DoS attack vector in itself, even if they're rate limited. I wonder whether we could just turn the prints into a set of ethtool counters and call it a day?
> > In other words, this patch set makes MAB work and breaks everything > else. > I'm willing to investigate exactly what is it that breaks the other > selftest, but not today. It may be related to the "RTNETLINK answers: > File exists" > messages, which themselves come from the commands > | bridge fdb add 00:01:02:03:04:01 dev lan2 master static > > If I were to randomly guess at almost 4AM in the morning, it has to do > with > "bridge fdb add" rather than the "bridge fdb replace" that's used for > the MAB selftest. The fact I pointed out a few revisions ago, that MAB > needs to be opt-in, is now coming back to bite us. Since it's not > opt-in, the mv88e6xxx driver always creates locked FDB entries, and > when > we try to "bridge fdb add", the kernel says "hey, the FDB entry is > already there!". Is that it? Yes, that sounds like a reasonable explanation, as it adds 'ext learned, offloaded' entries. If you try and replace the 'add' with 'replace' in those tests, does it work? > > As for how to opt into MAB. Hmm. MAB seems to be essentially CPU > assisted learning, which creates locked FDB entries. I wonder whether > we > should reconsider the position that address learning makes no sense on > locked ports, and say that "+locked -learning" means no MAB, and > "+locked +learning" means MAB? This would make a bunch of things more > natural to handle in the kernel, and would also give us the opt-in we > need. I have done the one and then the other. We need to have some final decision on this point. And remember that this gave rise to an extra patch to fix link-local learning if learning is turned on on a locked port, which resulted in the decision to allways have learning off on locked ports. > > > > Side note, the VTU and ATU member violation printks annoy me so badly. > They aren't stating something super useful and they're a DoS attack > vector in itself, even if they're rate limited. I wonder whether we > could just turn the prints into a set of ethtool counters and call it a > day? Sounds like a good idea to me. :-)
On Sun, Jul 17, 2022 at 02:34:22PM +0200, netdev@kapio-technology.com wrote: > > If I were to randomly guess at almost 4AM in the morning, it has to do with > > "bridge fdb add" rather than the "bridge fdb replace" that's used for > > the MAB selftest. The fact I pointed out a few revisions ago, that MAB > > needs to be opt-in, is now coming back to bite us. Since it's not > > opt-in, the mv88e6xxx driver always creates locked FDB entries, and when > > we try to "bridge fdb add", the kernel says "hey, the FDB entry is > > already there!". Is that it? > > Yes, that sounds like a reasonable explanation, as it adds 'ext learned, > offloaded' entries. If you try and replace the 'add' with 'replace' in those > tests, does it work? Well, you have access to the selftests too... But yes, that is the reason, and it works when I change 'add' to 'replace', although of course this isn't the correct solution. > > As for how to opt into MAB. Hmm. MAB seems to be essentially CPU > > assisted learning, which creates locked FDB entries. I wonder whether we > > should reconsider the position that address learning makes no sense on > > locked ports, and say that "+locked -learning" means no MAB, and > > "+locked +learning" means MAB? This would make a bunch of things more > > natural to handle in the kernel, and would also give us the opt-in we > > need. > > I have done the one and then the other. We need to have some final decision > on this point. And remember that this gave rise to an extra patch to fix > link-local learning if learning is turned on on a locked port, which > resulted in the decision to allways have learning off on locked ports. I think part of the reason for the back-and-forth was not making a very clear distinction between basic 802.1X using hostapd, and MAB. While I agree hostapd doesn't have what to do with learning, for MAB I'm still wondering. It's the same situation for mv88e6xxx's Port Association Vector in fact. > > Side note, the VTU and ATU member violation printks annoy me so badly. > > They aren't stating something super useful and they're a DoS attack > > vector in itself, even if they're rate limited. I wonder whether we > > could just turn the prints into a set of ethtool counters and call it a > > day? > > Sounds like a good idea to me. :-) Thinking this through, what we really want is trace points here, otherwise we'd lose information about which MAC address/VID/FID was it that caused the violation.
On 2022-07-17 02:47, Vladimir Oltean wrote: >> @@ -1721,11 +1735,11 @@ static int mv88e6xxx_vtu_get(struct >> mv88e6xxx_chip *chip, u16 vid, >> return err; >> } >> >> int mv88e6xxx_port_set_lock(struct mv88e6xxx_chip *chip, int port, >> bool locked) >> { >> @@ -1257,13 +1270,18 @@ int mv88e6xxx_port_set_lock(struct >> mv88e6xxx_chip *chip, int port, >> if (err) >> return err; >> >> - err = mv88e6xxx_port_read(chip, port, MV88E6XXX_PORT_ASSOC_VECTOR, >> ®); >> - if (err) >> - return err; >> + if (!locked) { >> + err = mv88e6xxx_atu_locked_entry_flush(chip->ds, port); > > Did you re-run the selftest with v4? Because this deadlocks due to the > double reg_lock illustrated below: I did some selftest, but I didn't experience the problem of the double chip lock as it only happens on unlock which is only called where the MAB test ends. > > + vrf_name=vlan1 > + ip vrf exec vlan1 ping 192.0.2.2 -c 10 -i 0.1 -w 5 > + check_err 1 'Ping did not work after locking port and adding FDB > entry' > + local err=1 > + local 'msg=Ping did not work after locking port and adding FDB entry' > + [[ 0 -eq 0 ]] > + [[ 1 -ne 0 ]] > + RET=1 > + retmsg='Ping did not work after locking port and adding FDB entry' > + bridge link set dev lan2 locked off > [ 733.273994] > [ 733.275515] ============================================ > [ 733.280823] WARNING: possible recursive locking detected > [ 733.286133] 5.19.0-rc6-07010-ga9b9500ffaac-dirty #3293 Not tainted > [ 733.292311] -------------------------------------------- > [ 733.297613] kworker/0:0/601 is trying to acquire lock: > [ 733.302751] ffff00000213c110 (&chip->reg_lock){+.+.}-{4:4}, at: > mv88e6xxx_atu_locked_entry_purge+0x70/0x1a4 > [ 733.312524] > [ 733.312524] but task is already holding lock: > [ 733.318351] ffff00000213c110 (&chip->reg_lock){+.+.}-{4:4}, at: > mv88e6xxx_port_bridge_flags+0x48/0x234 > [ 733.327674] > [ 733.327674] other info that might help us debug this: > [ 733.334198] Possible unsafe locking scenario: > [ 733.334198] > [ 733.340109] CPU0 > [ 733.342549] ---- > [ 733.344990] lock(&chip->reg_lock); > [ 733.348567] lock(&chip->reg_lock); > [ 733.352149] > [ 733.352149] *** DEADLOCK *** > [ 733.352149] > [ 733.358063] May be due to missing lock nesting notation > [ 733.358063] > [ 733.364846] 6 locks held by kworker/0:0/601: > [ 733.369115] #0: ffff00000000b748 > ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x1f4/0x6c4 > [ 733.378541] #1: ffff80000c43bdc8 > (deferred_process_work){+.+.}-{0:0}, at: process_one_work+0x1f4/0x6c4 > [ 733.387966] #2: ffff80000b020cb0 (rtnl_mutex){+.+.}-{4:4}, at: > rtnl_lock+0x1c/0x30 > [ 733.395660] #3: ffff80000b073370 > ((switchdev_blocking_notif_chain).rwsem){++++}-{4:4}, at: > blocking_notifier_call_chain+0x34/0xa0 > [ 733.407432] #4: ffff00000213c110 (&chip->reg_lock){+.+.}-{4:4}, > at: mv88e6xxx_port_bridge_flags+0x48/0x234 > [ 733.417202] #5: ffff00000213e0f0 (&p->ale_list_lock){+.+.}-{4:4}, > at: mv88e6xxx_atu_locked_entry_flush+0x4c/0xc0 > [ 733.427495] > [ 733.427495] stack backtrace: > [ 733.431858] CPU: 0 PID: 601 Comm: kworker/0:0 Not tainted > 5.19.0-rc6-07010-ga9b9500ffaac-dirty #3293 > [ 733.440992] Hardware name: CZ.NIC Turris Mox Board (DT) > [ 733.446219] Workqueue: events switchdev_deferred_process_work > [ 733.451982] Call trace: > [ 733.454424] dump_backtrace.part.0+0xcc/0xe0 > [ 733.458702] show_stack+0x18/0x6c > [ 733.462028] dump_stack_lvl+0x8c/0xb8 > [ 733.465703] dump_stack+0x18/0x34 > [ 733.469029] __lock_acquire+0x1074/0x1fd0 > [ 733.473052] lock_acquire.part.0+0xe4/0x220 > [ 733.477244] lock_acquire+0x68/0x8c > [ 733.480744] __mutex_lock+0x9c/0x460 > [ 733.484332] mutex_lock_nested+0x40/0x50 > [ 733.488268] mv88e6xxx_atu_locked_entry_purge+0x70/0x1a4 > [ 733.493592] mv88e6xxx_atu_locked_entry_flush+0x7c/0xc0 > [ 733.498827] mv88e6xxx_port_set_lock+0xfc/0x10c > [ 733.503374] mv88e6xxx_port_bridge_flags+0x200/0x234 > [ 733.508351] dsa_port_bridge_flags+0x44/0xe0 > [ 733.512635] dsa_slave_port_attr_set+0x1ec/0x230 > [ 733.517268] __switchdev_handle_port_attr_set+0x58/0x100 > [ 733.522594] switchdev_handle_port_attr_set+0x10/0x24 > [ 733.527659] dsa_slave_switchdev_blocking_event+0x8c/0xd4 > [ 733.533074] blocking_notifier_call_chain+0x6c/0xa0 > [ 733.537968] switchdev_port_attr_notify.constprop.0+0x4c/0xb0 > [ 733.543729] switchdev_port_attr_set_deferred+0x24/0x80 > [ 733.548967] switchdev_deferred_process+0x90/0x164 > [ 733.553774] switchdev_deferred_process_work+0x14/0x2c > [ 733.558926] process_one_work+0x28c/0x6c4 > [ 733.562949] worker_thread+0x74/0x450 > [ 733.566623] kthread+0x118/0x11c > [ 733.569860] ret_from_fork+0x10/0x20 > ++ mac_get lan1 > ++ local if_name=lan1 > ++ jq -r '.[]["address"]' > ++ ip -j link show dev lan1 > > I've tentatively fixed this in my tree by taking the register lock more > localized to the sub-functions of mv88e6xxx_port_bridge_flags(), and > not > taking it at caller side for mv88e6xxx_port_set_lock(), but rather > letting the callee take it: Yes, the double chip lock came from calling mv88e6xxx_atu_locked_entry_flush() in mv88e6xxx_port_set_lock() that is called from mv88e6xxx_port_bridge_flags() whic has the lock taken. I have fixed this now, but when I unlock a port with locked entries, I cannot have notify the bridge to clear the same locked entries by having mv88e6xxx_atu_locked_entry_flush() calling with the notify on, thus taking the rtnl lock, as this also results in a double lock. To remove the locked entries in the bridge FDB it is then necessary as I see it, to take the link down or have a userspace daemon clear them. I hope that is okay if documented? > > diff --git a/drivers/net/dsa/mv88e6xxx/chip.c > b/drivers/net/dsa/mv88e6xxx/chip.c > index 7b57ac121589..ec5954f32774 100644 > --- a/drivers/net/dsa/mv88e6xxx/chip.c > +++ b/drivers/net/dsa/mv88e6xxx/chip.c > @@ -6557,41 +6557,47 @@ static int mv88e6xxx_port_bridge_flags(struct > dsa_switch *ds, int port, > struct mv88e6xxx_chip *chip = ds->priv; > int err = -EOPNOTSUPP; > > - mv88e6xxx_reg_lock(chip); > - > if (flags.mask & BR_LEARNING) { > bool learning = !!(flags.val & BR_LEARNING); > u16 pav = learning ? (1 << port) : 0; > > + mv88e6xxx_reg_lock(chip); > err = mv88e6xxx_port_set_assoc_vector(chip, port, pav); > + mv88e6xxx_reg_unlock(chip); > if (err) > - goto out; > + return err; > } > > if (flags.mask & BR_FLOOD) { > bool unicast = !!(flags.val & BR_FLOOD); > > + mv88e6xxx_reg_lock(chip); > err = chip->info->ops->port_set_ucast_flood(chip, port, > unicast); > + mv88e6xxx_reg_unlock(chip); > if (err) > - goto out; > + return err; > } > > if (flags.mask & BR_MCAST_FLOOD) { > bool multicast = !!(flags.val & BR_MCAST_FLOOD); > > + mv88e6xxx_reg_lock(chip); > err = chip->info->ops->port_set_mcast_flood(chip, port, > multicast); > + mv88e6xxx_reg_unlock(chip); > if (err) > - goto out; > + return err; > } > > if (flags.mask & BR_BCAST_FLOOD) { > bool broadcast = !!(flags.val & BR_BCAST_FLOOD); > > + mv88e6xxx_reg_lock(chip); > err = mv88e6xxx_port_broadcast_sync(chip, port, broadcast); > + mv88e6xxx_reg_unlock(chip); > if (err) > - goto out; > + return err; > } > > if (flags.mask & BR_PORT_LOCKED) { > @@ -6599,10 +6605,8 @@ static int mv88e6xxx_port_bridge_flags(struct > dsa_switch *ds, int port, > > err = mv88e6xxx_port_set_lock(chip, port, locked); > if (err) > - goto out; > + return err; > } > -out: > - mv88e6xxx_reg_unlock(chip); > > return err; > } > diff --git a/drivers/net/dsa/mv88e6xxx/port.c > b/drivers/net/dsa/mv88e6xxx/port.c > index 5e4d5e9265a4..2a60aded6fbe 100644 > --- a/drivers/net/dsa/mv88e6xxx/port.c > +++ b/drivers/net/dsa/mv88e6xxx/port.c > @@ -1222,15 +1222,19 @@ int mv88e6xxx_port_set_lock(struct > mv88e6xxx_chip *chip, int port, > u16 reg; > int err; > > + mv88e6xxx_reg_lock(chip); > err = mv88e6xxx_port_read(chip, port, MV88E6XXX_PORT_CTL0, ®); > - if (err) > + if (err) { > + mv88e6xxx_reg_unlock(chip); > return err; > + } > > reg &= ~MV88E6XXX_PORT_CTL0_SA_FILT_MASK; > if (locked) > reg |= MV88E6XXX_PORT_CTL0_SA_FILT_DROP_ON_LOCK; > > err = mv88e6xxx_port_write(chip, port, MV88E6XXX_PORT_CTL0, reg); > + mv88e6xxx_reg_unlock(chip); > if (err) > return err; > > @@ -1247,7 +1251,11 @@ int mv88e6xxx_port_set_lock(struct > mv88e6xxx_chip *chip, int port, > MV88E6XXX_PORT_ASSOC_VECTOR_LOCKED_PORT; > } > > - return mv88e6xxx_port_write(chip, port, MV88E6XXX_PORT_ASSOC_VECTOR, > reg); > + mv88e6xxx_reg_lock(chip); > + err = mv88e6xxx_port_write(chip, port, MV88E6XXX_PORT_ASSOC_VECTOR, > reg); > + mv88e6xxx_reg_unlock(chip); > + > + return err; > } > > int mv88e6xxx_port_set_8021q_mode(struct mv88e6xxx_chip *chip, int > port, > >> + if (err) >> + return err; >> + } >> >> - reg &= ~MV88E6XXX_PORT_ASSOC_VECTOR_LOCKED_PORT; >> - if (locked) >> - reg |= MV88E6XXX_PORT_ASSOC_VECTOR_LOCKED_PORT; >> + reg = 0; >> + if (locked) { >> + reg = (1 << port); >> + reg |= MV88E6XXX_PORT_ASSOC_VECTOR_IGNORE_WRONG | >> + MV88E6XXX_PORT_ASSOC_VECTOR_LOCKED_PORT; >> + } >> >> return mv88e6xxx_port_write(chip, port, MV88E6XXX_PORT_ASSOC_VECTOR, >> reg); >> } >> diff --git a/drivers/net/dsa/mv88e6xxx/port.h >> b/drivers/net/dsa/mv88e6xxx/port.h >> index e0a705d82019..5c1d485e7442 100644 >> --- a/drivers/net/dsa/mv88e6xxx/port.h >> +++ b/drivers/net/dsa/mv88e6xxx/port.h >> @@ -231,6 +231,7 @@ >> #define MV88E6XXX_PORT_ASSOC_VECTOR_LOCKED_PORT 0x2000 >> #define MV88E6XXX_PORT_ASSOC_VECTOR_IGNORE_WRONG 0x1000 >> #define MV88E6XXX_PORT_ASSOC_VECTOR_REFRESH_LOCKED 0x0800 >> +#define MV88E6XXX_PORT_ASSOC_VECTOR_PAV_MASK 0x07ff >> >> /* Offset 0x0C: Port ATU Control */ >> #define MV88E6XXX_PORT_ATU_CTL 0x0c >> @@ -374,6 +375,7 @@ int mv88e6xxx_port_set_fid(struct mv88e6xxx_chip >> *chip, int port, u16 fid); >> int mv88e6xxx_port_get_pvid(struct mv88e6xxx_chip *chip, int port, >> u16 *pvid); >> int mv88e6xxx_port_set_pvid(struct mv88e6xxx_chip *chip, int port, >> u16 pvid); >> >> +bool mv88e6xxx_port_is_locked(struct mv88e6xxx_chip *chip, int port); >> int mv88e6xxx_port_set_lock(struct mv88e6xxx_chip *chip, int port, >> bool locked); >>
diff --git a/drivers/net/dsa/mv88e6xxx/Makefile b/drivers/net/dsa/mv88e6xxx/Makefile index c8eca2b6f959..be903a983780 100644 --- a/drivers/net/dsa/mv88e6xxx/Makefile +++ b/drivers/net/dsa/mv88e6xxx/Makefile @@ -15,3 +15,4 @@ mv88e6xxx-objs += port_hidden.o mv88e6xxx-$(CONFIG_NET_DSA_MV88E6XXX_PTP) += ptp.o mv88e6xxx-objs += serdes.o mv88e6xxx-objs += smi.o +mv88e6xxx-objs += switchdev.o diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c index d134153ef023..c1fb7b5ba3ac 100644 --- a/drivers/net/dsa/mv88e6xxx/chip.c +++ b/drivers/net/dsa/mv88e6xxx/chip.c @@ -42,6 +42,7 @@ #include "ptp.h" #include "serdes.h" #include "smi.h" +#include "switchdev.h" static void assert_reg_lock(struct mv88e6xxx_chip *chip) { @@ -919,6 +920,13 @@ static void mv88e6xxx_mac_link_down(struct dsa_switch *ds, int port, if (err) dev_err(chip->dev, "p%d: failed to force MAC link down\n", port); + else + if (mv88e6xxx_port_is_locked(chip, port)) { + err = mv88e6xxx_atu_locked_entry_flush(ds, port); + if (err) + dev_err(chip->dev, + "p%d: failed to clear locked entries\n", port); + } } static void mv88e6xxx_mac_link_up(struct dsa_switch *ds, int port, @@ -1685,6 +1693,12 @@ static void mv88e6xxx_port_fast_age(struct dsa_switch *ds, int port) struct mv88e6xxx_chip *chip = ds->priv; int err; + if (mv88e6xxx_port_is_locked(chip, port)) + err = mv88e6xxx_atu_locked_entry_flush(ds, port); + if (err) + dev_err(chip->ds->dev, "p%d: failed to clear locked entries: %d\n", + port, err); + mv88e6xxx_reg_lock(chip); err = mv88e6xxx_port_fast_age_fid(chip, port, 0); mv88e6xxx_reg_unlock(chip); @@ -1721,11 +1735,11 @@ static int mv88e6xxx_vtu_get(struct mv88e6xxx_chip *chip, u16 vid, return err; } -static int mv88e6xxx_vtu_walk(struct mv88e6xxx_chip *chip, - int (*cb)(struct mv88e6xxx_chip *chip, - const struct mv88e6xxx_vtu_entry *entry, - void *priv), - void *priv) +int mv88e6xxx_vtu_walk(struct mv88e6xxx_chip *chip, + int (*cb)(struct mv88e6xxx_chip *chip, + const struct mv88e6xxx_vtu_entry *entry, + void *priv), + void *priv) { struct mv88e6xxx_vtu_entry entry = { .vid = mv88e6xxx_max_vid(chip), @@ -2727,6 +2741,9 @@ static int mv88e6xxx_port_fdb_add(struct dsa_switch *ds, int port, if (is_locked) return 0; + if (mv88e6xxx_port_is_locked(chip, port)) + mv88e6xxx_atu_locked_entry_find_purge(ds, port, addr, vid); + mv88e6xxx_reg_lock(chip); err = mv88e6xxx_port_db_load_purge(chip, port, addr, vid, MV88E6XXX_G1_ATU_DATA_STATE_UC_STATIC); @@ -2740,12 +2757,17 @@ static int mv88e6xxx_port_fdb_del(struct dsa_switch *ds, int port, struct dsa_db db) { struct mv88e6xxx_chip *chip = ds->priv; - int err; + bool locked_found = false; + int err = 0; - mv88e6xxx_reg_lock(chip); - err = mv88e6xxx_port_db_load_purge(chip, port, addr, vid, 0); - mv88e6xxx_reg_unlock(chip); + if (mv88e6xxx_port_is_locked(chip, port)) + locked_found = mv88e6xxx_atu_locked_entry_find_purge(ds, port, addr, vid); + if (!locked_found) { + mv88e6xxx_reg_lock(chip); + err = mv88e6xxx_port_db_load_purge(chip, port, addr, vid, 0); + mv88e6xxx_reg_unlock(chip); + } return err; } @@ -3814,11 +3836,18 @@ static int mv88e6xxx_setup(struct dsa_switch *ds) static int mv88e6xxx_port_setup(struct dsa_switch *ds, int port) { - return mv88e6xxx_setup_devlink_regions_port(ds, port); + int err; + + err = mv88e6xxx_setup_devlink_regions_port(ds, port); + if (!err) + return mv88e6xxx_init_violation_handler(ds, port); + + return err; } static void mv88e6xxx_port_teardown(struct dsa_switch *ds, int port) { + mv88e6xxx_teardown_violation_handler(ds, port); mv88e6xxx_teardown_devlink_regions_port(ds, port); } diff --git a/drivers/net/dsa/mv88e6xxx/chip.h b/drivers/net/dsa/mv88e6xxx/chip.h index 5e03cfe50156..3cc2db722ad9 100644 --- a/drivers/net/dsa/mv88e6xxx/chip.h +++ b/drivers/net/dsa/mv88e6xxx/chip.h @@ -280,6 +280,12 @@ struct mv88e6xxx_port { unsigned int serdes_irq; char serdes_irq_name[64]; struct devlink_region *region; + + /* List and maintenance of ATU locked entries */ + struct mutex ale_list_lock; + struct list_head ale_list; + struct delayed_work ale_work; + int ale_cnt; }; enum mv88e6xxx_region_id { @@ -399,6 +405,9 @@ struct mv88e6xxx_chip { int egress_dest_port; int ingress_dest_port; + /* Keep the register written age time for easy access */ + u8 age_time; + /* Per-port timestamping resources. */ struct mv88e6xxx_port_hwtstamp port_hwtstamp[DSA_MAX_PORTS]; @@ -803,6 +812,12 @@ static inline void mv88e6xxx_reg_unlock(struct mv88e6xxx_chip *chip) mutex_unlock(&chip->reg_lock); } +int mv88e6xxx_vtu_walk(struct mv88e6xxx_chip *chip, + int (*cb)(struct mv88e6xxx_chip *chip, + const struct mv88e6xxx_vtu_entry *entry, + void *priv), + void *priv); + int mv88e6xxx_fid_map(struct mv88e6xxx_chip *chip, unsigned long *bitmap); #endif /* _MV88E6XXX_CHIP_H */ diff --git a/drivers/net/dsa/mv88e6xxx/global1.h b/drivers/net/dsa/mv88e6xxx/global1.h index 65958b2a0d3a..503fbf216670 100644 --- a/drivers/net/dsa/mv88e6xxx/global1.h +++ b/drivers/net/dsa/mv88e6xxx/global1.h @@ -136,6 +136,7 @@ #define MV88E6XXX_G1_ATU_DATA_TRUNK 0x8000 #define MV88E6XXX_G1_ATU_DATA_TRUNK_ID_MASK 0x00f0 #define MV88E6XXX_G1_ATU_DATA_PORT_VECTOR_MASK 0x3ff0 +#define MV88E6XXX_G1_ATU_DATA_PORT_VECTOR_NO_EGRESS 0x0000 #define MV88E6XXX_G1_ATU_DATA_STATE_MASK 0x000f #define MV88E6XXX_G1_ATU_DATA_STATE_UC_UNUSED 0x0000 #define MV88E6XXX_G1_ATU_DATA_STATE_UC_AGE_1_OLDEST 0x0001 diff --git a/drivers/net/dsa/mv88e6xxx/global1_atu.c b/drivers/net/dsa/mv88e6xxx/global1_atu.c index 5d120d53823c..d4ca976dac91 100644 --- a/drivers/net/dsa/mv88e6xxx/global1_atu.c +++ b/drivers/net/dsa/mv88e6xxx/global1_atu.c @@ -12,6 +12,8 @@ #include "chip.h" #include "global1.h" +#include "port.h" +#include "switchdev.h" /* Offset 0x01: ATU FID Register */ @@ -54,6 +56,7 @@ int mv88e6xxx_g1_atu_set_age_time(struct mv88e6xxx_chip *chip, /* Round to nearest multiple of coeff */ age_time = (msecs + coeff / 2) / coeff; + chip->age_time = age_time; err = mv88e6xxx_g1_read(chip, MV88E6XXX_G1_ATU_CTL, &val); if (err) @@ -369,6 +372,7 @@ static irqreturn_t mv88e6xxx_g1_atu_prob_irq_thread_fn(int irq, void *dev_id) int spid; int err; u16 val; + u16 fid; mv88e6xxx_reg_lock(chip); @@ -380,6 +384,10 @@ static irqreturn_t mv88e6xxx_g1_atu_prob_irq_thread_fn(int irq, void *dev_id) if (err) goto out; + err = mv88e6xxx_g1_read(chip, MV88E6352_G1_ATU_FID, &fid); + if (err) + goto out; + err = mv88e6xxx_g1_atu_data_read(chip, &entry); if (err) goto out; @@ -388,6 +396,8 @@ static irqreturn_t mv88e6xxx_g1_atu_prob_irq_thread_fn(int irq, void *dev_id) if (err) goto out; + mv88e6xxx_reg_unlock(chip); + spid = entry.state; if (val & MV88E6XXX_G1_ATU_OP_AGE_OUT_VIOLATION) { @@ -408,6 +418,11 @@ static irqreturn_t mv88e6xxx_g1_atu_prob_irq_thread_fn(int irq, void *dev_id) "ATU miss violation for %pM portvec %x spid %d\n", entry.mac, entry.portvec, spid); chip->ports[spid].atu_miss_violation++; + if (fid && mv88e6xxx_port_is_locked(chip, spid)) + err = mv88e6xxx_handle_violation(chip, spid, &entry, fid, + MV88E6XXX_G1_ATU_OP_MISS_VIOLATION); + if (err) + goto out; } if (val & MV88E6XXX_G1_ATU_OP_FULL_VIOLATION) { @@ -416,7 +431,6 @@ static irqreturn_t mv88e6xxx_g1_atu_prob_irq_thread_fn(int irq, void *dev_id) entry.mac, entry.portvec, spid); chip->ports[spid].atu_full_violation++; } - mv88e6xxx_reg_unlock(chip); return IRQ_HANDLED; diff --git a/drivers/net/dsa/mv88e6xxx/port.c b/drivers/net/dsa/mv88e6xxx/port.c index 795b3128768f..57eb25e9eb63 100644 --- a/drivers/net/dsa/mv88e6xxx/port.c +++ b/drivers/net/dsa/mv88e6xxx/port.c @@ -14,9 +14,11 @@ #include <linux/phylink.h> #include "chip.h" +#include "global1.h" #include "global2.h" #include "port.h" #include "serdes.h" +#include "switchdev.h" int mv88e6xxx_port_read(struct mv88e6xxx_chip *chip, int port, int reg, u16 *val) @@ -1239,6 +1241,17 @@ int mv88e6xxx_port_set_mirror(struct mv88e6xxx_chip *chip, int port, return err; } +bool mv88e6xxx_port_is_locked(struct mv88e6xxx_chip *chip, int port) +{ + u16 reg; + + mv88e6xxx_reg_lock(chip); + mv88e6xxx_port_read(chip, port, MV88E6XXX_PORT_CTL0, ®); + mv88e6xxx_reg_unlock(chip); + + return reg & MV88E6XXX_PORT_CTL0_SA_FILT_DROP_ON_LOCK; +} + int mv88e6xxx_port_set_lock(struct mv88e6xxx_chip *chip, int port, bool locked) { @@ -1257,13 +1270,18 @@ int mv88e6xxx_port_set_lock(struct mv88e6xxx_chip *chip, int port, if (err) return err; - err = mv88e6xxx_port_read(chip, port, MV88E6XXX_PORT_ASSOC_VECTOR, ®); - if (err) - return err; + if (!locked) { + err = mv88e6xxx_atu_locked_entry_flush(chip->ds, port); + if (err) + return err; + } - reg &= ~MV88E6XXX_PORT_ASSOC_VECTOR_LOCKED_PORT; - if (locked) - reg |= MV88E6XXX_PORT_ASSOC_VECTOR_LOCKED_PORT; + reg = 0; + if (locked) { + reg = (1 << port); + reg |= MV88E6XXX_PORT_ASSOC_VECTOR_IGNORE_WRONG | + MV88E6XXX_PORT_ASSOC_VECTOR_LOCKED_PORT; + } return mv88e6xxx_port_write(chip, port, MV88E6XXX_PORT_ASSOC_VECTOR, reg); } diff --git a/drivers/net/dsa/mv88e6xxx/port.h b/drivers/net/dsa/mv88e6xxx/port.h index e0a705d82019..5c1d485e7442 100644 --- a/drivers/net/dsa/mv88e6xxx/port.h +++ b/drivers/net/dsa/mv88e6xxx/port.h @@ -231,6 +231,7 @@ #define MV88E6XXX_PORT_ASSOC_VECTOR_LOCKED_PORT 0x2000 #define MV88E6XXX_PORT_ASSOC_VECTOR_IGNORE_WRONG 0x1000 #define MV88E6XXX_PORT_ASSOC_VECTOR_REFRESH_LOCKED 0x0800 +#define MV88E6XXX_PORT_ASSOC_VECTOR_PAV_MASK 0x07ff /* Offset 0x0C: Port ATU Control */ #define MV88E6XXX_PORT_ATU_CTL 0x0c @@ -374,6 +375,7 @@ int mv88e6xxx_port_set_fid(struct mv88e6xxx_chip *chip, int port, u16 fid); int mv88e6xxx_port_get_pvid(struct mv88e6xxx_chip *chip, int port, u16 *pvid); int mv88e6xxx_port_set_pvid(struct mv88e6xxx_chip *chip, int port, u16 pvid); +bool mv88e6xxx_port_is_locked(struct mv88e6xxx_chip *chip, int port); int mv88e6xxx_port_set_lock(struct mv88e6xxx_chip *chip, int port, bool locked); diff --git a/drivers/net/dsa/mv88e6xxx/switchdev.c b/drivers/net/dsa/mv88e6xxx/switchdev.c new file mode 100644 index 000000000000..e6c67fb741af --- /dev/null +++ b/drivers/net/dsa/mv88e6xxx/switchdev.c @@ -0,0 +1,280 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * switchdev.c + * + * Authors: + * Hans J. Schultz <hans.schultz@westermo.com> + * + */ + +#include <net/switchdev.h> +#include <linux/list.h> +#include "chip.h" +#include "global1.h" +#include "switchdev.h" + +static void mv88e6xxx_atu_locked_entry_purge(struct mv88e6xxx_atu_locked_entry *ale, bool notify) +{ + struct switchdev_notifier_fdb_info info = { + .addr = ale->mac, + .vid = ale->vid, + .is_locked = true, + .offloaded = true, + }; + struct mv88e6xxx_atu_entry entry; + struct net_device *brport; + struct dsa_port *dp; + + entry.portvec = MV88E6XXX_G1_ATU_DATA_PORT_VECTOR_NO_EGRESS; + entry.state = MV88E6XXX_G1_ATU_DATA_STATE_UC_UNUSED; + entry.trunk = false; + ether_addr_copy(entry.mac, ale->mac); + + mv88e6xxx_reg_lock(ale->chip); + mv88e6xxx_g1_atu_loadpurge(ale->chip, ale->fid, &entry); + mv88e6xxx_reg_unlock(ale->chip); + + dp = dsa_to_port(ale->chip->ds, ale->port); + + if (notify) { + rtnl_lock(); + brport = dsa_port_to_bridge_port(dp); + + if (brport) { + call_switchdev_notifiers(SWITCHDEV_FDB_DEL_TO_BRIDGE, + brport, &info.info, NULL); + } else { + dev_err(ale->chip->dev, "No bridge port for dsa port belonging to port %d\n", + ale->port); + } + rtnl_unlock(); + } + + list_del(&ale->list); + kfree(ale); +} + +static void mv88e6xxx_atu_locked_entry_cleanup(struct work_struct *work) +{ + struct mv88e6xxx_port *p = container_of(work, struct mv88e6xxx_port, ale_work.work); + struct mv88e6xxx_atu_locked_entry *ale, *tmp; + + mutex_lock(&p->ale_list_lock); + list_for_each_entry_safe(ale, tmp, &p->ale_list, list) { + if (time_after(jiffies, ale->expires)) { + mv88e6xxx_atu_locked_entry_purge(ale, true); + p->ale_cnt--; + } + } + mutex_unlock(&p->ale_list_lock); + + mod_delayed_work(system_long_wq, &p->ale_work, msecs_to_jiffies(100)); +} + +static int mv88e6xxx_new_atu_locked_entry(struct mv88e6xxx_chip *chip, const unsigned char *addr, + int port, u16 fid, u16 vid, + struct mv88e6xxx_atu_locked_entry **alep) +{ + struct mv88e6xxx_atu_locked_entry *ale; + unsigned long now, age_time; + + ale = kmalloc(sizeof(*ale), GFP_ATOMIC); + if (!ale) + return -ENOMEM; + + ether_addr_copy(ale->mac, addr); + ale->chip = chip; + ale->port = port; + ale->fid = fid; + ale->vid = vid; + now = jiffies; + age_time = chip->age_time * chip->info->age_time_coeff; + ale->expires = now + age_time; + + *alep = ale; + return 0; +} + +struct mv88e6xxx_fid_search_ctx { + u16 fid_search; + u16 vid_found; +}; + +static int mv88e6xxx_find_vid_on_matching_fid(struct mv88e6xxx_chip *chip, + const struct mv88e6xxx_vtu_entry *entry, + void *priv) +{ + struct mv88e6xxx_fid_search_ctx *ctx = priv; + + if (ctx->fid_search == entry->fid) { + ctx->vid_found = entry->vid; + return 1; + } + + return 0; +} + +int mv88e6xxx_handle_violation(struct mv88e6xxx_chip *chip, int port, + struct mv88e6xxx_atu_entry *entry, + u16 fid, u16 type) +{ + struct switchdev_notifier_fdb_info info = { + .addr = entry->mac, + .vid = 0, + .is_locked = true, + .offloaded = true, + }; + struct mv88e6xxx_atu_locked_entry *ale; + struct mv88e6xxx_fid_search_ctx ctx; + struct net_device *brport; + struct mv88e6xxx_port *p; + struct dsa_port *dp; + int err; + + if (!mv88e6xxx_is_invalid_port(chip, port)) + p = &chip->ports[port]; + else + return -ENODEV; + + ctx.fid_search = fid; + mv88e6xxx_reg_lock(chip); + err = mv88e6xxx_vtu_walk(chip, mv88e6xxx_find_vid_on_matching_fid, &ctx); + mv88e6xxx_reg_unlock(chip); + if (err < 0) + return err; + if (err == 1) + info.vid = ctx.vid_found; + else + return -ENODATA; + + switch (type) { + case MV88E6XXX_G1_ATU_OP_MISS_VIOLATION: + mutex_lock(&p->ale_list_lock); + if (p->ale_cnt >= ATU_LOCKED_ENTRIES_MAX) + goto exit; + mutex_unlock(&p->ale_list_lock); + entry->portvec = MV88E6XXX_G1_ATU_DATA_PORT_VECTOR_NO_EGRESS; + entry->state = MV88E6XXX_G1_ATU_DATA_STATE_UC_STATIC; + entry->trunk = false; + + mv88e6xxx_reg_lock(chip); + err = mv88e6xxx_g1_atu_loadpurge(chip, fid, entry); + if (err) + goto fail; + mv88e6xxx_reg_unlock(chip); + + dp = dsa_to_port(chip->ds, port); + err = mv88e6xxx_new_atu_locked_entry(chip, entry->mac, port, fid, + info.vid, &ale); + if (err) + return err; + + mutex_lock(&p->ale_list_lock); + list_add(&ale->list, &p->ale_list); + p->ale_cnt++; + mutex_unlock(&p->ale_list_lock); + + rtnl_lock(); + brport = dsa_port_to_bridge_port(dp); + if (!brport) { + rtnl_unlock(); + return -ENODEV; + } + err = call_switchdev_notifiers(SWITCHDEV_FDB_ADD_TO_BRIDGE, + brport, &info.info, NULL); + rtnl_unlock(); + break; + } + + return err; + +fail: + mv88e6xxx_reg_unlock(chip); + return err; + +exit: + mutex_unlock(&p->ale_list_lock); + return 0; +} + +bool mv88e6xxx_atu_locked_entry_find_purge(struct dsa_switch *ds, int port, + const unsigned char *addr, u16 vid) +{ + struct mv88e6xxx_atu_locked_entry *ale, *tmp; + struct mv88e6xxx_chip *chip = ds->priv; + struct mv88e6xxx_port *p; + bool found = false; + + p = &chip->ports[port]; + mutex_lock(&p->ale_list_lock); + list_for_each_entry_safe(ale, tmp, &p->ale_list, list) { + if (ether_addr_equal(ale->mac, addr) == 0) { + if (ale->vid == vid) { + mv88e6xxx_atu_locked_entry_purge(ale, false); + p->ale_cnt--; + found = true; + break; + } + } + } + mutex_unlock(&p->ale_list_lock); + return found; +} + +int mv88e6xxx_atu_locked_entry_flush(struct dsa_switch *ds, int port) +{ + struct mv88e6xxx_atu_locked_entry *ale, *tmp; + struct mv88e6xxx_chip *chip = ds->priv; + struct mv88e6xxx_port *p; + + if (!mv88e6xxx_is_invalid_port(chip, port)) + p = &chip->ports[port]; + else + return -ENODEV; + + mutex_lock(&p->ale_list_lock); + list_for_each_entry_safe(ale, tmp, &p->ale_list, list) { + mv88e6xxx_atu_locked_entry_purge(ale, false); + p->ale_cnt--; + } + mutex_unlock(&p->ale_list_lock); + + return 0; +} + +int mv88e6xxx_init_violation_handler(struct dsa_switch *ds, int port) +{ + struct mv88e6xxx_chip *chip = ds->priv; + struct mv88e6xxx_port *p; + + if (!mv88e6xxx_is_invalid_port(chip, port)) + p = &chip->ports[port]; + else + return -ENODEV; + + INIT_LIST_HEAD(&p->ale_list); + mutex_init(&p->ale_list_lock); + p->ale_cnt = 0; + INIT_DELAYED_WORK(&p->ale_work, mv88e6xxx_atu_locked_entry_cleanup); + mod_delayed_work(system_long_wq, &p->ale_work, msecs_to_jiffies(500)); + + ds->ageing_time_min = 100; + return 0; +} + +int mv88e6xxx_teardown_violation_handler(struct dsa_switch *ds, int port) +{ + struct mv88e6xxx_chip *chip = ds->priv; + struct mv88e6xxx_port *p; + + if (!mv88e6xxx_is_invalid_port(chip, port)) + p = &chip->ports[port]; + else + return -ENODEV; + + cancel_delayed_work(&p->ale_work); + mv88e6xxx_atu_locked_entry_flush(ds, port); + mutex_destroy(&p->ale_list_lock); + + return 0; +} diff --git a/drivers/net/dsa/mv88e6xxx/switchdev.h b/drivers/net/dsa/mv88e6xxx/switchdev.h new file mode 100644 index 000000000000..df2005c36f47 --- /dev/null +++ b/drivers/net/dsa/mv88e6xxx/switchdev.h @@ -0,0 +1,37 @@ +/* SPDX-License-Identifier: GPL-2.0-or-later + * + * switchdev.h + * + * Authors: + * Hans J. Schultz <hans.schultz@westermo.com> + * + */ + +#ifndef DRIVERS_NET_DSA_MV88E6XXX_SWITCHDEV_H_ +#define DRIVERS_NET_DSA_MV88E6XXX_SWITCHDEV_H_ + +#include <net/switchdev.h> +#include "chip.h" + +#define ATU_LOCKED_ENTRIES_MAX 64 + +struct mv88e6xxx_atu_locked_entry { + struct list_head list; + struct mv88e6xxx_chip *chip; + int port; + u8 mac[ETH_ALEN]; + u16 fid; + u16 vid; + unsigned long expires; +}; + +int mv88e6xxx_handle_violation(struct mv88e6xxx_chip *chip, int port, + struct mv88e6xxx_atu_entry *entry, + u16 fid, u16 type); +bool mv88e6xxx_atu_locked_entry_find_purge(struct dsa_switch *ds, int port, + const unsigned char *addr, u16 vid); +int mv88e6xxx_atu_locked_entry_flush(struct dsa_switch *ds, int port); +int mv88e6xxx_init_violation_handler(struct dsa_switch *ds, int port); +int mv88e6xxx_teardown_violation_handler(struct dsa_switch *ds, int port); + +#endif /* DRIVERS_NET_DSA_MV88E6XXX_SWITCHDEV_H_ */
This implementation for the Marvell mv88e6xxx chip series, is based on handling ATU miss violations occurring when packets ingress on a port that is locked. The mac address triggering the ATU miss violation will be added to the ATU with a zero-DPV, and is then communicated through switchdev to the bridge module, which adds a fdb entry with the fdb locked flag set. The entry is kept according to the bridges ageing time, thus simulating a dynamic entry. As this is essentially a form of CPU based learning, the amount of locked entries will be limited by a hardcoded value for now, so as to prevent DOS attacks. Signed-off-by: Hans Schultz <netdev@kapio-technology.com> --- drivers/net/dsa/mv88e6xxx/Makefile | 1 + drivers/net/dsa/mv88e6xxx/chip.c | 49 ++++- drivers/net/dsa/mv88e6xxx/chip.h | 15 ++ drivers/net/dsa/mv88e6xxx/global1.h | 1 + drivers/net/dsa/mv88e6xxx/global1_atu.c | 16 +- drivers/net/dsa/mv88e6xxx/port.c | 30 ++- drivers/net/dsa/mv88e6xxx/port.h | 2 + drivers/net/dsa/mv88e6xxx/switchdev.c | 280 ++++++++++++++++++++++++ drivers/net/dsa/mv88e6xxx/switchdev.h | 37 ++++ 9 files changed, 414 insertions(+), 17 deletions(-) create mode 100644 drivers/net/dsa/mv88e6xxx/switchdev.c create mode 100644 drivers/net/dsa/mv88e6xxx/switchdev.h