Message ID | 20210116005943.219479-1-olteanv@gmail.com |
---|---|
Headers | show |
Series | LAG offload for Ocelot DSA switches | expand |
On Sat, Jan 16, 2021 at 02:59:29AM +0200, Vladimir Oltean wrote: > From: Vladimir Oltean <vladimir.oltean@nxp.com> > > This patch series reworks the ocelot switchdev driver such that it could > share the same implementation for LAG offload as the felix DSA driver. Jakub, I sent these patches a few hours early because I didn't want to wait for the devlink-sb series to get accepted. Now that it did, can you move the patches back from the RFC state into review, or do I need to resend them?
On Sat, 16 Jan 2021 17:51:03 +0200 Vladimir Oltean wrote: > On Sat, Jan 16, 2021 at 02:59:29AM +0200, Vladimir Oltean wrote: > > From: Vladimir Oltean <vladimir.oltean@nxp.com> > > > > This patch series reworks the ocelot switchdev driver such that it could > > share the same implementation for LAG offload as the felix DSA driver. > > Jakub, I sent these patches a few hours early because I didn't want to > wait for the devlink-sb series to get accepted. Now that it did, can you > move the patches back from the RFC state into review, or do I need to > resend them? I tried to convince the build bot to take a look at this series again, but failed :( Let me look at the patches now, but you'll have to repost to get them merged.
On Sat, 16 Jan 2021 02:59:30 +0200 Vladimir Oltean wrote: > From: Vladimir Oltean <vladimir.oltean@nxp.com> > > Commit 7afb3e575e5a ("net: mscc: ocelot: don't handle netdev events for > other netdevs") was too aggressive, and it made ocelot_netdevice_event > react only to network interface events emitted for the ocelot switch > ports. > > In fact, only the PRECHANGEUPPER should have had that check. > > When we ignore all events that are not for us, we miss the fact that the > upper of the LAG changes, and the bonding interface gets enslaved to a > bridge. This is an operation we could offload under certain conditions. I see the commit in question is in net, perhaps worth spelling out why this is not a fix? Perhaps add some "in the future" to the last sentence if it's the case that this will only matter with the following patches applied?
On Sat, 16 Jan 2021 17:25:10 -0800 Jakub Kicinski wrote: > On Sat, 16 Jan 2021 17:51:03 +0200 Vladimir Oltean wrote: > > On Sat, Jan 16, 2021 at 02:59:29AM +0200, Vladimir Oltean wrote: > > > From: Vladimir Oltean <vladimir.oltean@nxp.com> > > > > > > This patch series reworks the ocelot switchdev driver such that it could > > > share the same implementation for LAG offload as the felix DSA driver. > > > > Jakub, I sent these patches a few hours early because I didn't want to > > wait for the devlink-sb series to get accepted. Now that it did, can you > > move the patches back from the RFC state into review, or do I need to > > resend them? > > I tried to convince the build bot to take a look at this series again, > but failed :( Let me look at the patches now, but you'll have to repost > to get them merged. The code LGTM, FWIW. I'm a little surprised you opted in for allocation in ocelot_set_aggr_pgids() but admittedly that makes the code much simpler than trying to for instance use lower bits of pointers as markers, or even a bitmask on the stack..
On Sat, Jan 16, 2021 at 05:26:23PM -0800, Jakub Kicinski wrote: > On Sat, 16 Jan 2021 02:59:30 +0200 Vladimir Oltean wrote: > > From: Vladimir Oltean <vladimir.oltean@nxp.com> > > > > Commit 7afb3e575e5a ("net: mscc: ocelot: don't handle netdev events for > > other netdevs") was too aggressive, and it made ocelot_netdevice_event > > react only to network interface events emitted for the ocelot switch > > ports. > > > > In fact, only the PRECHANGEUPPER should have had that check. > > > > When we ignore all events that are not for us, we miss the fact that the > > upper of the LAG changes, and the bonding interface gets enslaved to a > > bridge. This is an operation we could offload under certain conditions. > > I see the commit in question is in net, perhaps worth spelling out why > this is not a fix? Perhaps add some "in the future" to the last > sentence if it's the case that this will only matter with the following > patches applied? It is a fix. However, so is patch 13/14 "net: mscc: ocelot: rebalance LAGs on link up/down events", but I didn't see an easy way to backport that. Honestly the reasons why I did not attempt to split this series into a part for "net" and one for "net-next" are: (a) It would unnecessarily complicate my work for felix DSA, where this is considered a new feature as opposed to ocelot switchdev where it was supposedly already working (although.. not quite, due to the lack of rebalancing, a link down would throw off the LAG). I don't really think that anybody was seriously using LAG offload on ocelot so far. (b) Even if I were to split this patch, it can only be trivially backported as far as commit 9c90eea310f8 ("net: mscc: ocelot: move net_device related functions to ocelot_net.c") from June 2020 anyway. (c) I cannot test the mscc_ocelot.ko switchdev driver with traffic, since I don't have the hardware (I just have a local patch that I keep rebasing on top of net-next which makes me able to at least probe it and access its registers on a different switch revision, but the traffic I/O procedure there is completely different). So I can not really confirm what is the state I'm leaving the mscc_ocelot driver in, for stable kernels. At least now, I've made the entry points into the control code path very similar to those of DSA, and I've exercised the switchdev driver in blind (without traffic), so I have a bit more confidence that it should work. (d) Had the AUTOSEL guys picked up this patch, I would have probably had no objection (since my belief is that there's nothing to break and nothing to fix in stable kernels). That being said, if we want to engage in a rigid demonstration of procedures, sure we can do that. I have other patches anyway to fill the pipeline until "net" is merged back into "net-next" :)
On Sun, 17 Jan 2021 14:37:44 +0200 Vladimir Oltean wrote: > That being said, if we want to engage in a rigid demonstration of > procedures, sure we can do that. I have other patches anyway to fill the > pipeline until "net" is merged back into "net-next" :) If you don't mind I'd rather apply the fix to net, and the rest on Thu/Fri after the trees get merged.
On Mon, Jan 18, 2021 at 11:04:47AM -0800, Jakub Kicinski wrote: > On Sun, 17 Jan 2021 14:37:44 +0200 Vladimir Oltean wrote: > > That being said, if we want to engage in a rigid demonstration of > > procedures, sure we can do that. I have other patches anyway to fill the > > pipeline until "net" is merged back into "net-next" :) > > If you don't mind I'd rather apply the fix to net, and the rest on > Thu/Fri after the trees get merged. Sure, I already split this patch and sent it to "net": https://patchwork.kernel.org/project/netdevbpf/patch/20210118135210.2666246-1-olteanv@gmail.com/
From: Vladimir Oltean <vladimir.oltean@nxp.com> This patch series reworks the ocelot switchdev driver such that it could share the same implementation for LAG offload as the felix DSA driver. Testing has been done in the following topology: +----------------------------------+ | Board 1 br0 | | +---------+ | | / \ | | | | | | | bond0 | | | +-----+ | | | / \ | | eno0 swp0 swp1 swp2 | +---|--------|-------|-------|-----+ | | | | +--------+ | | Cable | | Cable| |Cable Cable | | +--------+ | | | | | | +---|--------|-------|-------|-----+ | eno0 swp0 swp1 swp2 | | | \ / | | | +-----+ | | | bond0 | | | | | | \ / | | +---------+ | | Board 2 br0 | +----------------------------------+ The same script can be run on both Board 1 and Board 2 to set this up: #!/bin/bash ip link del bond0 ip link add bond0 type bond mode 802.3ad ip link set swp1 down && ip link set swp1 master bond0 && ip link set swp1 up ip link set swp2 down && ip link set swp2 master bond0 && ip link set swp2 up ip link del br0 ip link add br0 type bridge ip link set bond0 master br0 ip link set swp0 master br0 Then traffic can be tested between eno0 of Board 1 and eno0 of Board 2. Note: series applies on top of: https://patchwork.kernel.org/project/netdevbpf/cover/20210115021120.3055988-1-olteanv@gmail.com/ Vladimir Oltean (14): net: mscc: ocelot: allow offloading of bridge on top of LAG net: mscc: ocelot: rename ocelot_netdevice_port_event to ocelot_netdevice_changeupper net: mscc: ocelot: use a switch-case statement in ocelot_netdevice_event net: mscc: ocelot: don't refuse bonding interfaces we can't offload net: mscc: ocelot: use ipv6 in the aggregation code net: mscc: ocelot: set up the bonding mask in a way that avoids a net_device net: mscc: ocelot: avoid unneeded "lp" variable in LAG join net: mscc: ocelot: use "lag" variable name in ocelot_bridge_stp_state_set net: mscc: ocelot: reapply bridge forwarding mask on bonding join/leave net: mscc: ocelot: set up logical port IDs centrally net: mscc: ocelot: drop the use of the "lags" array net: mscc: ocelot: rename aggr_count to num_ports_in_lag net: mscc: ocelot: rebalance LAGs on link up/down events net: dsa: felix: propagate the LAG offload ops towards the ocelot lib drivers/net/dsa/ocelot/felix.c | 28 +++ drivers/net/ethernet/mscc/ocelot.c | 256 ++++++++++++++----------- drivers/net/ethernet/mscc/ocelot.h | 4 - drivers/net/ethernet/mscc/ocelot_net.c | 131 ++++++++----- include/soc/mscc/ocelot.h | 11 +- 5 files changed, 265 insertions(+), 165 deletions(-)