Message ID | 20210519171825.600110-1-kuba@kernel.org |
---|---|
State | New |
Headers | show |
Series | [net-next] mlx5: count all link events | expand |
On Wed, 19 May 2021 17:07:00 -0700 Saeed Mahameed wrote: > On Wed, 2021-05-19 at 14:06 -0700, Jakub Kicinski wrote: > > On Wed, 19 May 2021 13:49:00 -0700 Saeed Mahameed wrote: > > > Can you share more on the actual scenario that has happened ? > > > in mlx5 i know of situations where fw might generate such events, > > > just > > > as FYI for virtual ports (vports) on some configuration changes. > > > > > > another explanation is that in the driver we explicitly query the > > > link > > > state and we never take the event value, so it could have been that > > > the > > > link flapped so fast we missed the intermediate state. > > > > The link flaps quite a bit, this is likely a bad cable or port. > > I scanned the fleet a little bit more and I see a couple machines > > in such state, in each case the switch is also seeing the link flaps, > > not just the NIC. Without this patch the driver registers a full flap > > once every ~15min, with the patch it's once a second. That's much > > closer to what the switch registers. > > > > Also the issue affects all hosts in MH, and persists across reboots > > of a single host (hence I could test this patch). > > reproduces on reboots even with a good cable ? I don't have access to the machines so the cable stays the same. I was just saying that it doesn't seem like a driver issue since it persists across reboots. > you reboot the peer machine or the DUT (under test) machine ? DUT > > > According to HW spec for some reason we should always query and not > > > rely on the event. > > > > > > <quote> > > > If software retrieves this indication (port state change event), > > > this > > > signifies that the state has been > > > changed and a QUERY_VPORT_STATE command should be performed to get > > > the > > > new state. > > > </quote> > > > > I see, seems reasonable. I'm guessing the FW generates only one of > > the > > events on minor type of faults? I don't think the link goes fully > > down, > > because I can SSH to those machines, they just periodically drop > > traffic. But the can't fully retrain the link at such high rate, > > I don't think. > > > > hmm, Then i would like to get to the bottom of this, so i will have to > consult with FW. > But regardless, we can progress with the patch, I think the HW spec > description forces us to do so.. SGTM :)
On Wed, 2021-05-19 at 13:56 -0700, Jakub Kicinski wrote: > On Wed, 19 May 2021 13:18:36 -0700 Saeed Mahameed wrote: > > On Wed, 2021-05-19 at 12:51 -0700, Jakub Kicinski wrote: > > > > > > > > > I assumed netif_carrier_event() would be used specifically in > > > places > > > driver is actually servicing a link event from the device, and > > > therefore is relatively certain that _something_ has happened. > > > > then according to the above assumption it is safe to make > > netif_carrier_event() do everything. > > > > netif_carrier_event(netdev, up) { > > if (dev->reg_state == NETREG_UNINITIALIZED) > > return; > > > > if (up == netif_carrier_ok(netdev) { > > atomic_inc(&netdev->carrier_up_count); > > atomic_inc(&netdev->carrier_down_count); > > linkwatch_fire_event(netdev); > > } > > > > if (up) { > > netdev_info(netdev, "Link up\n"); > > netif_carrier_on(netdev); > > } else { > > netdev_info(netdev, "Link down\n"); > > netif_carrier_off(netdev); > > } > > } > > Two things to consider are: > - some drivers print more info than just "link up/link down" so > they'd > have to drop that extra stuff (as much as I'd like the > consistency) +1 for the consistency > - again with the unnecessary events I was afraid that drivers reuse > the same handler for device events and to read the state in which > case we may do something like: > > if (from_event && up == netif_carrier_ok(netdev) > I don't actually understand your point here .. what kind of scenarios it is wrong to use this function ? But anyway, the name of the function makes it very clear this is from event.. also we can document this. > Maybe we can revisit when there's more users? goes both ways :), we can do what fits the requirement for mlx5 now and revisit in the future, if we do believe this should be general behavior for all/most vendors of-course!
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c index bca832cdc4cb..5a67ebc0c96c 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c @@ -91,12 +91,16 @@ void mlx5e_update_carrier(struct mlx5e_priv *priv) { struct mlx5_core_dev *mdev = priv->mdev; u8 port_state; + bool up; port_state = mlx5_query_vport_state(mdev, MLX5_VPORT_STATE_OP_MOD_VNIC_VPORT, 0); - if (port_state == VPORT_STATE_UP) { + up = port_state == VPORT_STATE_UP; + if (up == netif_carrier_ok(priv->netdev)) + netif_carrier_event(priv->netdev); + if (up) { netdev_info(priv->netdev, "Link up\n"); netif_carrier_on(priv->netdev); } else { diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index 5cbc950b34df..be1dcceda5e4 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -4187,8 +4187,8 @@ unsigned long dev_trans_start(struct net_device *dev); void __netdev_watchdog_up(struct net_device *dev); void netif_carrier_on(struct net_device *dev); - void netif_carrier_off(struct net_device *dev); +void netif_carrier_event(struct net_device *dev); /** * netif_dormant_on - mark device as dormant. diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c index 44991ea726fc..3090ae32307b 100644 --- a/net/sched/sch_generic.c +++ b/net/sched/sch_generic.c @@ -515,6 +515,24 @@ void netif_carrier_off(struct net_device *dev) } EXPORT_SYMBOL(netif_carrier_off); +/** + * netif_carrier_event - report carrier state event + * @dev: network device + * + * Device has detected a carrier event but the carrier state wasn't changed. + * Use in drivers when querying carrier state asynchronously, to avoid missing + * events (link flaps) if link recovers before it's queried. + */ +void netif_carrier_event(struct net_device *dev) +{ + if (dev->reg_state == NETREG_UNINITIALIZED) + return; + atomic_inc(&dev->carrier_up_count); + atomic_inc(&dev->carrier_down_count); + linkwatch_fire_event(dev); +} +EXPORT_SYMBOL_GPL(netif_carrier_event); + /* "NOOP" scheduler: the best scheduler, recommended for all interfaces under all circumstances. It is difficult to invent anything faster or cheaper.
mlx5 devices were observed generating MLX5_PORT_CHANGE_SUBTYPE_ACTIVE events without an intervening MLX5_PORT_CHANGE_SUBTYPE_DOWN. This breaks link flap detection based on Linux carrier state transition count as netif_carrier_on() does nothing if carrier is already on. Make sure we count such events. netif_carrier_event() increments the counters and fires the linkwatch events. The latter is not necessary for the use case but seems like the right thing to do. Signed-off-by: Jakub Kicinski <kuba@kernel.org> --- .../net/ethernet/mellanox/mlx5/core/en_main.c | 6 +++++- include/linux/netdevice.h | 2 +- net/sched/sch_generic.c | 18 ++++++++++++++++++ 3 files changed, 24 insertions(+), 2 deletions(-)