From patchwork Mon May 4 22:28:17 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Roopa Prabhu
X-Patchwork-Id: 219927
From: Roopa Prabhu
To: dsahern@gmail.com, davem@davemloft.net
Cc: netdev@vger.kernel.org, nikolay@cumulusnetworks.com,
 idosch@mellanox.com, jiri@mellanox.com, petrm@mellanox.com
Subject: [RFC PATCH net-next 1/5] nexthop: support for fdb ecmp nexthops
Date: Mon, 4 May 2020 15:28:17 -0700
Message-Id: <1588631301-21564-2-git-send-email-roopa@cumulusnetworks.com>
X-Mailer: git-send-email 2.1.4
In-Reply-To: <1588631301-21564-1-git-send-email-roopa@cumulusnetworks.com>
References: <1588631301-21564-1-git-send-email-roopa@cumulusnetworks.com>
X-Mailing-List: netdev@vger.kernel.org

From: Roopa Prabhu

This patch introduces ECMP nexthops and nexthop groups for MAC fdb
entries. Subsequent patches in this series use them for vxlan driver
fdb entries. The use case is E-VPN multihoming [1,2,3], which requires
bridged vxlan traffic to be load balanced across the remote switches
(vteps) that belong to the same multi-homed ethernet segment (analogous
to a multi-homed LAG, but over vxlan).

Changes include a new nexthop attribute, NHA_FDB (a netlink flag), for
nexthops referenced by fdb entries. These nexthops carry only a gateway
IP address; the oif and encap attributes are rejected for them. This
patch also adds checks so that routes cannot reference such nexthops.
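As a rough illustration of the load balancing described above, the sketch
below models how a flow hash could pick one member of an fdb nexthop group.
It is a userspace approximation of the upper-bound comparison used by
nexthop_select_path(), not the kernel code: struct member, rebalance() and
select_member() are hypothetical names, and the 31-bit hash space is an
assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model of one group member; not a kernel struct. */
struct member {
	uint32_t id;          /* nexthop id, e.g. 12 or 13 */
	uint32_t weight;      /* relative share of traffic, must be >= 1 */
	int32_t upper_bound;  /* highest 31-bit hash value mapped to this member */
};

/* Split the hash space [0, 2^31 - 1] in proportion to the weights. */
static void rebalance(struct member *m, size_t n)
{
	uint64_t total = 0, cumulative = 0;
	size_t i;

	assert(n > 0);
	for (i = 0; i < n; i++)
		total += m[i].weight;
	assert(total > 0);

	for (i = 0; i < n; i++) {
		cumulative += m[i].weight;
		/* for the last member this evaluates to 2^31 - 1 */
		m[i].upper_bound = (int32_t)(((cumulative << 31) / total) - 1);
	}
}

/* Return the id of the first member whose upper bound covers the hash. */
static uint32_t select_member(const struct member *m, size_t n, uint32_t hash)
{
	int32_t h = (int32_t)(hash & 0x7fffffff); /* keep the low 31 bits */
	size_t i;

	for (i = 0; i < n; i++) {
		if (h <= m[i].upper_bound)
			return m[i].id;
	}
	return m[n - 1].id; /* defensive fallback; not reached after rebalance() */
}
```

With two equal-weight members (ids 12 and 13), hashes in the lower half of
the space select 12 and the rest select 13, so flows are spread across both
vteps while each flow stays on one of them.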
example:
$ip nexthop add id 12 via 172.16.1.2 fdb
$ip nexthop add id 13 via 172.16.1.3 fdb
$ip nexthop add id 102 group 12/13 fdb
$bridge fdb add 02:02:00:00:00:13 dev vxlan1000 nhid 102 self

[1] E-VPN https://tools.ietf.org/html/rfc7432
[2] E-VPN VxLAN: https://tools.ietf.org/html/rfc8365
[3] LPC talk with mention of nexthop groups for L2 ecmp
http://vger.kernel.org/lpc_net2018_talks/scaling_bridge_fdb_database_slidesV3.pdf

Signed-off-by: Roopa Prabhu
---
 include/net/nexthop.h        |  14 ++++++
 include/uapi/linux/nexthop.h |   1 +
 net/ipv4/nexthop.c           | 101 +++++++++++++++++++++++++++++++++----------
 3 files changed, 93 insertions(+), 23 deletions(-)

diff --git a/include/net/nexthop.h b/include/net/nexthop.h
index c440ccc..3ad4e97 100644
--- a/include/net/nexthop.h
+++ b/include/net/nexthop.h
@@ -26,6 +26,7 @@ struct nh_config {
 	u8		nh_family;
 	u8		nh_protocol;
 	u8		nh_blackhole;
+	u8		nh_fdb;
 	u32		nh_flags;
 
 	int		nh_ifindex;
@@ -52,6 +53,7 @@ struct nh_info {
 
 	u8			family;
 	bool			reject_nh;
+	bool			fdb_nh;
 
 	union {
 		struct fib_nh_common	fib_nhc;
@@ -80,6 +82,7 @@ struct nexthop {
 	struct rb_node		rb_node;    /* entry on netns rbtree */
 	struct list_head	fi_list;    /* v4 entries using nh */
 	struct list_head	f6i_list;   /* v6 entries using nh */
+	struct list_head	fdb_list;   /* fdb entries using this nh */
 	struct list_head	grp_list;   /* nh group entries using this nh */
 	struct net		*net;
 
@@ -88,6 +91,7 @@ struct nexthop {
 	u8			protocol;   /* app managing this nh */
 	u8			nh_flags;
 	bool			is_group;
+	bool			is_fdb_nh;
 
 	refcount_t		refcnt;
 	struct rcu_head		rcu;
@@ -304,4 +308,14 @@ static inline void nexthop_path_fib6_result(struct fib6_result *res, int hash)
 int nexthop_for_each_fib6_nh(struct nexthop *nh,
 			     int (*cb)(struct fib6_nh *nh, void *arg),
 			     void *arg);
+
+static inline struct nh_info *nexthop_path_fdb(struct nexthop *nh, u32 hash)
+{
+	struct nh_info *nhi;
+
+	nh = nexthop_select_path(nh, hash);
+	nhi = rcu_dereference(nh->nh_info);
+
+	return nhi;
+}
 #endif
diff --git a/include/uapi/linux/nexthop.h b/include/uapi/linux/nexthop.h
index 7b61867..19a234a 100644
--- a/include/uapi/linux/nexthop.h
+++ b/include/uapi/linux/nexthop.h
@@ -48,6 +48,7 @@ enum {
 	 */
 	NHA_GROUPS,	/* flag; only return nexthop groups in dump */
 	NHA_MASTER,	/* u32;  only return nexthops with given master dev */
+	NHA_FDB,	/* flag; nexthop belongs to a bridge fdb */
 
 	__NHA_MAX,
 };
diff --git a/net/ipv4/nexthop.c b/net/ipv4/nexthop.c
index 3957364..98f8d2a 100644
--- a/net/ipv4/nexthop.c
+++ b/net/ipv4/nexthop.c
@@ -33,6 +33,7 @@ static const struct nla_policy rtm_nh_policy[NHA_MAX + 1] = {
 	[NHA_ENCAP]		= { .type = NLA_NESTED },
 	[NHA_GROUPS]		= { .type = NLA_FLAG },
 	[NHA_MASTER]		= { .type = NLA_U32 },
+	[NHA_FDB]		= { .type = NLA_FLAG },
 };
 
 static unsigned int nh_dev_hashfn(unsigned int val)
@@ -107,6 +108,7 @@ static struct nexthop *nexthop_alloc(void)
 		INIT_LIST_HEAD(&nh->fi_list);
 		INIT_LIST_HEAD(&nh->f6i_list);
 		INIT_LIST_HEAD(&nh->grp_list);
+		INIT_LIST_HEAD(&nh->fdb_list);
 	}
 	return nh;
 }
@@ -227,6 +229,9 @@ static int nh_fill_node(struct sk_buff *skb, struct nexthop *nh,
 	if (nla_put_u32(skb, NHA_ID, nh->id))
 		goto nla_put_failure;
 
+	if (nh->is_fdb_nh && nla_put_flag(skb, NHA_FDB))
+		goto nla_put_failure;
+
 	if (nh->is_group) {
 		struct nh_group *nhg = rtnl_dereference(nh->nh_grp);
 
@@ -241,7 +246,7 @@ static int nh_fill_node(struct sk_buff *skb, struct nexthop *nh,
 		if (nla_put_flag(skb, NHA_BLACKHOLE))
 			goto nla_put_failure;
 		goto out;
-	} else {
+	} else if (!nh->is_fdb_nh) {
 		const struct net_device *dev;
 
 		dev = nhi->fib_nhc.nhc_dev;
@@ -393,6 +398,7 @@ static int nh_check_attr_group(struct net *net, struct nlattr *tb[],
 	unsigned int len = nla_len(tb[NHA_GROUP]);
 	struct nexthop_grp *nhg;
 	unsigned int i, j;
+	u8 nhg_fdb = 0;
 
 	if (len & (sizeof(struct nexthop_grp) - 1)) {
 		NL_SET_ERR_MSG(extack,
@@ -421,6 +427,8 @@ static int nh_check_attr_group(struct net *net, struct nlattr *tb[],
 		}
 	}
 
+	if (tb[NHA_FDB])
+		nhg_fdb = 1;
 	nhg = nla_data(tb[NHA_GROUP]);
 	for (i = 0; i < len; ++i) {
 		struct nexthop *nh;
@@ -432,11 +440,16 @@ static int nh_check_attr_group(struct net *net, struct nlattr *tb[],
 		}
 		if (!valid_group_nh(nh, len, extack))
 			return -EINVAL;
+		if (nhg_fdb && !nh->is_fdb_nh) {
+			NL_SET_ERR_MSG(extack, "FDB Multipath group can only have fdb nexthops");
+			return -EINVAL;
+		}
 	}
 	for (i = NHA_GROUP + 1; i < __NHA_MAX; ++i) {
 		if (!tb[i])
 			continue;
-
+		if (i == NHA_FDB)
+			continue;
 		NL_SET_ERR_MSG(extack,
 			       "No other attributes can be set in nexthop groups");
 		return -EINVAL;
@@ -495,6 +508,9 @@ struct nexthop *nexthop_select_path(struct nexthop *nh, int hash)
 		if (hash > atomic_read(&nhge->upper_bound))
 			continue;
 
+		if (nhge->nh->is_fdb_nh)
+			return nhge->nh;
+
 		/* nexthops always check if it is good and does
 		 * not rely on a sysctl for this behavior
 		 */
@@ -564,6 +580,11 @@ int fib6_check_nexthop(struct nexthop *nh, struct fib6_config *cfg,
 {
 	struct nh_info *nhi;
 
+	if (nh->is_fdb_nh) {
+		NL_SET_ERR_MSG(extack, "Route cannot point to a fdb nexthop");
+		return -EINVAL;
+	}
+
 	/* fib6_src is unique to a fib6_info and limits the ability to cache
 	 * routes in fib6_nh within a nexthop that is potentially shared
 	 * across multiple fib entries. If the config wants to use source
@@ -640,6 +661,12 @@ int fib_check_nexthop(struct nexthop *nh, u8 scope,
 {
 	int err = 0;
 
+	if (nh->is_fdb_nh) {
+		NL_SET_ERR_MSG(extack, "Route cannot point to a fdb nexthop");
+		err = -EINVAL;
+		goto out;
+	}
+
 	if (nh->is_group) {
 		struct nh_group *nhg;
 
@@ -1125,6 +1152,9 @@ static struct nexthop *nexthop_create_group(struct net *net,
 		nh_group_rebalance(nhg);
 	}
 
+	if (cfg->nh_fdb)
+		nh->is_fdb_nh = 1;
+
 	rcu_assign_pointer(nh->nh_grp, nhg);
 
 	return nh;
@@ -1152,7 +1182,7 @@ static int nh_create_ipv4(struct net *net, struct nexthop *nh,
 		.fc_encap = cfg->nh_encap,
 		.fc_encap_type = cfg->nh_encap_type,
 	};
-	u32 tb_id = l3mdev_fib_table(cfg->dev);
+	u32 tb_id = (cfg->dev ? l3mdev_fib_table(cfg->dev) : RT_TABLE_MAIN);
 	int err;
 
 	err = fib_nh_init(net, fib_nh, &fib_cfg, 1, extack);
@@ -1161,6 +1191,9 @@ static int nh_create_ipv4(struct net *net, struct nexthop *nh,
 		goto out;
 	}
 
+	if (nh->is_fdb_nh)
+		goto out;
+
 	/* sets nh_dev if successful */
 	err = fib_check_nh(net, fib_nh, tb_id, 0, extack);
 	if (!err) {
@@ -1227,6 +1260,9 @@ static struct nexthop *nexthop_create(struct net *net, struct nh_config *cfg,
 	nhi->family = cfg->nh_family;
 	nhi->fib_nhc.nhc_scope = RT_SCOPE_LINK;
 
+	if (cfg->nh_fdb)
+		nh->is_fdb_nh = 1;
+
 	if (cfg->nh_blackhole) {
 		nhi->reject_nh = 1;
 		cfg->nh_ifindex = net->loopback_dev->ifindex;
@@ -1248,7 +1284,8 @@ static struct nexthop *nexthop_create(struct net *net, struct nh_config *cfg,
 	}
 
 	/* add the entry to the device based hash */
-	nexthop_devhash_add(net, nhi);
+	if (!nh->is_fdb_nh)
+		nexthop_devhash_add(net, nhi);
 
 	rcu_assign_pointer(nh->nh_info, nhi);
 
@@ -1367,6 +1404,9 @@ static int rtm_to_nh_config(struct net *net, struct sk_buff *skb,
 			NL_SET_ERR_MSG(extack, "Invalid group type");
 			goto out;
 		}
+		if (tb[NHA_FDB])
+			cfg->nh_fdb = nla_get_flag(tb[NHA_FDB]);
+
 		err = nh_check_attr_group(net, tb, extack);
 
 		/* no other attributes should be set */
@@ -1385,26 +1425,38 @@ static int rtm_to_nh_config(struct net *net, struct sk_buff *skb,
 		goto out;
 	}
 
-	if (!tb[NHA_OIF]) {
-		NL_SET_ERR_MSG(extack, "Device attribute required for non-blackhole nexthops");
+	if (tb[NHA_FDB]) {
+		if (tb[NHA_OIF] ||
+		    tb[NHA_ENCAP] || tb[NHA_ENCAP_TYPE]) {
+			NL_SET_ERR_MSG(extack, "Fdb attribute can not be used with encap or oif");
+			goto out;
+		}
+
+		cfg->nh_fdb = nla_get_flag(tb[NHA_FDB]);
+	}
+
+	if (!cfg->nh_fdb && !tb[NHA_OIF]) {
+		NL_SET_ERR_MSG(extack, "Device attribute required for non-blackhole and non-fdb nexthops");
 		goto out;
 	}
 
-	cfg->nh_ifindex = nla_get_u32(tb[NHA_OIF]);
-	if (cfg->nh_ifindex)
-		cfg->dev = __dev_get_by_index(net, cfg->nh_ifindex);
+	if (!cfg->nh_fdb && tb[NHA_OIF]) {
+		cfg->nh_ifindex = nla_get_u32(tb[NHA_OIF]);
+		if (cfg->nh_ifindex)
+			cfg->dev = __dev_get_by_index(net, cfg->nh_ifindex);
 
-	if (!cfg->dev) {
-		NL_SET_ERR_MSG(extack, "Invalid device index");
-		goto out;
-	} else if (!(cfg->dev->flags & IFF_UP)) {
-		NL_SET_ERR_MSG(extack, "Nexthop device is not up");
-		err = -ENETDOWN;
-		goto out;
-	} else if (!netif_carrier_ok(cfg->dev)) {
-		NL_SET_ERR_MSG(extack, "Carrier for nexthop device is down");
-		err = -ENETDOWN;
-		goto out;
+		if (!cfg->dev) {
+			NL_SET_ERR_MSG(extack, "Invalid device index");
+			goto out;
+		} else if (!(cfg->dev->flags & IFF_UP)) {
+			NL_SET_ERR_MSG(extack, "Nexthop device is not up");
+			err = -ENETDOWN;
+			goto out;
+		} else if (!netif_carrier_ok(cfg->dev)) {
+			NL_SET_ERR_MSG(extack, "Carrier for nexthop device is down");
+			err = -ENETDOWN;
+			goto out;
+		}
 	}
 
 	err = -EINVAL;
@@ -1633,7 +1685,7 @@ static bool nh_dump_filtered(struct nexthop *nh, int dev_idx, int master_idx,
 
 static int nh_valid_dump_req(const struct nlmsghdr *nlh, int *dev_idx,
 			     int *master_idx, bool *group_filter,
-			     struct netlink_callback *cb)
+			     bool *fdb_filter, struct netlink_callback *cb)
 {
 	struct netlink_ext_ack *extack = cb->extack;
 	struct nlattr *tb[NHA_MAX + 1];
@@ -1670,6 +1722,9 @@ static int nh_valid_dump_req(const struct nlmsghdr *nlh, int *dev_idx,
 		case NHA_GROUPS:
 			*group_filter = true;
 			break;
+		case NHA_FDB:
+			*fdb_filter = true;
+			break;
 		default:
 			NL_SET_ERR_MSG(extack, "Unsupported attribute in dump request");
 			return -EINVAL;
@@ -1688,17 +1743,17 @@ static int nh_valid_dump_req(const struct nlmsghdr *nlh, int *dev_idx,
 /* rtnl */
 static int rtm_dump_nexthop(struct sk_buff *skb, struct netlink_callback *cb)
 {
+	bool group_filter = false, fdb_filter = false;
 	struct nhmsg *nhm = nlmsg_data(cb->nlh);
 	int dev_filter_idx = 0, master_idx = 0;
 	struct net *net = sock_net(skb->sk);
 	struct rb_root *root = &net->nexthop.rb_root;
-	bool group_filter = false;
 	struct rb_node *node;
 	int idx = 0, s_idx;
 	int err;
 
 	err = nh_valid_dump_req(cb->nlh, &dev_filter_idx, &master_idx,
-				&group_filter, cb);
+				&group_filter, &fdb_filter, cb);
 	if (err < 0)
 		return err;
 

From patchwork Mon May 4 22:28:19 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Roopa Prabhu
X-Patchwork-Id: 219926
From: Roopa Prabhu
To: dsahern@gmail.com, davem@davemloft.net
Cc: netdev@vger.kernel.org, nikolay@cumulusnetworks.com,
 idosch@mellanox.com, jiri@mellanox.com, petrm@mellanox.com
Subject: [RFC PATCH net-next 3/5] nexthop: add support for notifiers
Date: Mon, 4 May 2020 15:28:19 -0700
Message-Id: <1588631301-21564-4-git-send-email-roopa@cumulusnetworks.com>
X-Mailer: git-send-email 2.1.4
In-Reply-To: <1588631301-21564-1-git-send-email-roopa@cumulusnetworks.com>
References: <1588631301-21564-1-git-send-email-roopa@cumulusnetworks.com>
X-Mailing-List: netdev@vger.kernel.org

From: Roopa Prabhu

This patch adds nexthop add/del notifiers. The vxlan driver uses them
in a later patch of this series; switchdev drivers could possibly use
them in the future.
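For readers unfamiliar with notifier chains, here is a minimal userspace
sketch of the register/call/unregister pattern this patch relies on. It is
an illustration under stated assumptions, not the kernel API: struct
notifier, struct chain, chain_register(), chain_unregister(), chain_call()
and count_del() are hypothetical names, and the locking that the kernel's
atomic notifier chains provide is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Event codes; they mirror NEXTHOP_EVENT_ADD/DEL from the patch. */
enum nh_event { NH_EVENT_ADD, NH_EVENT_DEL };

/* Hypothetical stand-in for the kernel's struct notifier_block. */
struct notifier {
	int (*call)(struct notifier *nb, enum nh_event ev, void *data);
	struct notifier *next;
};

/* Hypothetical stand-in for a notifier chain head. */
struct chain {
	struct notifier *head;
};

/* Add a subscriber at the front of the chain. */
static void chain_register(struct chain *c, struct notifier *nb)
{
	assert(c && nb && nb->call);
	nb->next = c->head;
	c->head = nb;
}

/* Remove a subscriber; returns 0 on success, -1 if it was not registered. */
static int chain_unregister(struct chain *c, struct notifier *nb)
{
	struct notifier **pp;

	for (pp = &c->head; *pp; pp = &(*pp)->next) {
		if (*pp == nb) {
			*pp = nb->next;
			nb->next = NULL;
			return 0;
		}
	}
	return -1;
}

/* Deliver one event to every subscriber; returns how many were called. */
static int chain_call(struct chain *c, enum nh_event ev, void *data)
{
	struct notifier *nb;
	int called = 0;

	for (nb = c->head; nb; nb = nb->next) {
		nb->call(nb, ev, data);
		called++;
	}
	return called;
}

/* Example subscriber: counts delete events and ignores everything else. */
static int del_seen;

static int count_del(struct notifier *nb, enum nh_event ev, void *data)
{
	(void)nb;
	(void)data;
	if (ev == NH_EVENT_DEL)
		del_seen++;
	return 0;
}
```

A subscriber only sees events published on the chain it registered with, so
the register path and the call path have to name the same chain.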
Signed-off-by: Roopa Prabhu
---
 include/net/netns/nexthop.h |  1 +
 include/net/nexthop.h       | 12 ++++++++++++
 net/ipv4/nexthop.c          | 27 +++++++++++++++++++++++++++
 3 files changed, 40 insertions(+)

diff --git a/include/net/netns/nexthop.h b/include/net/netns/nexthop.h
index c712ee5..1937476 100644
--- a/include/net/netns/nexthop.h
+++ b/include/net/netns/nexthop.h
@@ -14,5 +14,6 @@ struct netns_nexthop {
 
 	unsigned int		seq;		/* protected by rtnl_mutex */
 	u32			last_id_allocated;
+	struct atomic_notifier_head notifier_chain;
 };
 #endif
diff --git a/include/net/nexthop.h b/include/net/nexthop.h
index 3ad4e97..0301740 100644
--- a/include/net/nexthop.h
+++ b/include/net/nexthop.h
@@ -10,6 +10,7 @@
 #define __LINUX_NEXTHOP_H
 
 #include
+#include
 #include
 #include
 #include
@@ -102,6 +103,17 @@ struct nexthop {
 	};
 };
 
+enum nexthop_event_type {
+	NEXTHOP_EVENT_ADD,
+	NEXTHOP_EVENT_DEL
+};
+
+int call_nexthop_notifier(struct notifier_block *nb, struct net *net,
+			  enum nexthop_event_type event_type,
+			  struct nexthop *nh);
+int register_nexthop_notifier(struct notifier_block *nb);
+int unregister_nexthop_notifier(struct notifier_block *nb);
+
 /* caller is holding rcu or rtnl; no reference taken to nexthop */
 struct nexthop *nexthop_find_by_id(struct net *net, u32 id);
 void nexthop_free_rcu(struct rcu_head *head);
diff --git a/net/ipv4/nexthop.c b/net/ipv4/nexthop.c
index 98f8d2a..514bb4e 100644
--- a/net/ipv4/nexthop.c
+++ b/net/ipv4/nexthop.c
@@ -36,6 +36,17 @@ static const struct nla_policy rtm_nh_policy[NHA_MAX + 1] = {
 	[NHA_FDB]		= { .type = NLA_FLAG },
 };
 
+static int call_nexthop_notifiers(struct net *net,
+				  enum nexthop_event_type event_type,
+				  struct nexthop *nh)
+{
+	int err;
+
+	err = atomic_notifier_call_chain(&net->nexthop.notifier_chain,
+					 event_type, nh);
+	return notifier_to_errno(err);
+}
+
 static unsigned int nh_dev_hashfn(unsigned int val)
 {
 	unsigned int mask = NH_DEV_HASHSIZE - 1;
@@ -814,6 +825,8 @@ static void __remove_nexthop_fib(struct net *net, struct nexthop *nh)
 		ipv6_stub->ip6_del_rt(net, f6i,
 				      !net->ipv4.sysctl_nexthop_compat_mode);
 	}
+
+	call_nexthop_notifiers(net, NEXTHOP_EVENT_DEL, nh);
 }
 
 static void __remove_nexthop(struct net *net, struct nexthop *nh,
@@ -1838,6 +1851,20 @@ static struct notifier_block nh_netdev_notifier = {
 	.notifier_call = nh_netdev_event,
 };
 
+static ATOMIC_NOTIFIER_HEAD(nexthop_notif_chain);
+
+int register_nexthop_notifier(struct notifier_block *nb)
+{
+	return atomic_notifier_chain_register(&nexthop_notif_chain, nb);
+}
+EXPORT_SYMBOL(register_nexthop_notifier);
+
+int unregister_nexthop_notifier(struct notifier_block *nb)
+{
+	return atomic_notifier_chain_unregister(&nexthop_notif_chain, nb);
+}
+EXPORT_SYMBOL(unregister_nexthop_notifier);
+
 static void __net_exit nexthop_net_exit(struct net *net)
 {
 	rtnl_lock();

From patchwork Mon May 4 22:28:20 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Roopa Prabhu
X-Patchwork-Id: 219925
From: Roopa Prabhu
To: dsahern@gmail.com, davem@davemloft.net
Cc: netdev@vger.kernel.org, nikolay@cumulusnetworks.com,
 idosch@mellanox.com, jiri@mellanox.com, petrm@mellanox.com
Subject: [RFC PATCH net-next 4/5] vxlan: support for nexthop notifiers
Date: Mon, 4 May 2020 15:28:20 -0700
Message-Id: <1588631301-21564-5-git-send-email-roopa@cumulusnetworks.com>
X-Mailer: git-send-email 2.1.4
In-Reply-To: <1588631301-21564-1-git-send-email-roopa@cumulusnetworks.com>
References: <1588631301-21564-1-git-send-email-roopa@cumulusnetworks.com>
X-Mailing-List: netdev@vger.kernel.org

From: Roopa Prabhu

The vxlan driver registers for nexthop notifiers so that, when a
nexthop is deleted, it can clean up the fdb entries that point to it.

Signed-off-by: Roopa Prabhu
---
 drivers/net/vxlan.c | 36 +++++++++++++++++++++++++++++++++---
 1 file changed, 33 insertions(+), 3 deletions(-)

diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index 176f2b3..df6c5ff 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -81,6 +81,7 @@ struct vxlan_fdb {
 	u16		  flags;	/* see ndm_flags and below */
 	struct list_head  nh_list;
 	struct nexthop	  *nh;
+	struct vxlan_dev  *vdev;
 };
 
 #define NTF_VXLAN_ADDED_BY_USER 0x100
@@ -811,8 +812,9 @@ static int vxlan_gro_complete(struct sock *sk, struct sk_buff *skb, int nhoff)
 	return eth_gro_complete(skb, nhoff + sizeof(struct vxlanhdr));
 }
 
-static struct vxlan_fdb *vxlan_fdb_alloc(const u8 *mac, __u16 state,
-					 __be32 src_vni, __u16 ndm_flags)
+static struct vxlan_fdb *vxlan_fdb_alloc(struct vxlan_dev *vxlan, const u8 *mac,
+					 __u16 state, __be32 src_vni,
+					 __u16 ndm_flags)
 {
 	struct vxlan_fdb *f;
 
@@ -824,6 +826,7 @@ static struct vxlan_fdb *vxlan_fdb_alloc(const u8 *mac, __u16 state,
 	f->updated = f->used = jiffies;
 	f->vni = src_vni;
 	f->nh = NULL;
+	f->vdev = vxlan;
 	INIT_LIST_HEAD(&f->nh_list);
 	INIT_LIST_HEAD(&f->remotes);
 	memcpy(f->eth_addr, mac, ETH_ALEN);
@@ -902,7 +905,7 @@ static int vxlan_fdb_create(struct vxlan_dev *vxlan,
 		return -ENOSPC;
 
 	netdev_dbg(vxlan->dev, "add %pM -> %pIS\n", mac, ip);
-	f = vxlan_fdb_alloc(mac, state, src_vni, ndm_flags);
+	f = vxlan_fdb_alloc(vxlan, mac, state, src_vni, ndm_flags);
 	if (!f)
 		return -ENOMEM;
 
@@ -954,6 +957,7 @@ static void vxlan_fdb_destroy(struct vxlan_dev *vxlan, struct vxlan_fdb *f,
 					 swdev_notify, NULL);
 
 	hlist_del_rcu(&f->hlist);
+	f->vdev = NULL;
 	call_rcu(&f->rcu, vxlan_fdb_free);
 }
 
@@ -4576,6 +4580,25 @@ static struct notifier_block vxlan_switchdev_notifier_block __read_mostly = {
 	.notifier_call = vxlan_switchdev_event,
 };
 
+static int vxlan_nexthop_event(struct notifier_block *nb,
+			       unsigned long event, void *ptr)
+{
+	struct nexthop *nh = ptr;
+	struct vxlan_fdb *fdb, *tmp;
+
+	if (!nh || event != NEXTHOP_EVENT_DEL)
+		return NOTIFY_DONE;
+
+	list_for_each_entry_safe(fdb, tmp, &nh->fdb_list, nh_list)
+		vxlan_fdb_destroy(fdb->vdev, fdb, true, false);
+
+	return NOTIFY_DONE;
+}
+
+static struct notifier_block vxlan_nexthop_notifier_block __read_mostly = {
+	.notifier_call = vxlan_nexthop_event,
+};
+
 static __net_init int vxlan_init_net(struct net *net)
 {
 	struct vxlan_net *vn = net_generic(net, vxlan_net_id);
@@ -4655,7 +4678,13 @@ static int __init vxlan_init_module(void)
 	if (rc)
 		goto out4;
 
+	rc = register_nexthop_notifier(&vxlan_nexthop_notifier_block);
+	if (rc)
+		goto out5;
+
 	return 0;
+out5:
+	rtnl_link_unregister(&vxlan_link_ops);
 out4:
 	unregister_switchdev_notifier(&vxlan_switchdev_notifier_block);
 out3:
@@ -4672,6 +4701,7 @@ static void __exit vxlan_cleanup_module(void)
 	rtnl_link_unregister(&vxlan_link_ops);
 	unregister_switchdev_notifier(&vxlan_switchdev_notifier_block);
 	unregister_netdevice_notifier(&vxlan_notifier_block);
+	unregister_nexthop_notifier(&vxlan_nexthop_notifier_block);
 	unregister_pernet_subsys(&vxlan_net_ops);
 	/* rcu_barrier() is called by netns */
 }
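
The nexthop event handler in this patch destroys fdb entries while walking
the list that links them. That is only safe if the successor is saved before
the current entry is unlinked and freed. Whether vxlan_fdb_destroy() unlinks
nh_list depends on patch 2/5, which is not shown here. The userspace sketch
below shows the general pattern; struct entry and the helper names are
hypothetical, and plain malloc/free stand in for the kernel's RCU-deferred
free.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an fdb entry on a circular doubly linked list. */
struct entry {
	int id;
	struct entry *prev, *next;
};

/* An empty list is a head that points at itself. */
static void list_init(struct entry *head)
{
	head->prev = head->next = head;
}

/* Link e in front of head, i.e. at the tail of the list. */
static void list_add_tail_entry(struct entry *head, struct entry *e)
{
	e->prev = head->prev;
	e->next = head;
	head->prev->next = e;
	head->prev = e;
}

/* Allocate one entry; aborts on allocation failure to keep the sketch short. */
static struct entry *entry_new(int id)
{
	struct entry *e = malloc(sizeof(*e));

	assert(e != NULL);
	e->id = id;
	e->prev = e->next = e;
	return e;
}

/*
 * Unlink and free every entry. tmp always holds the successor of e, so
 * freeing e never invalidates the pointer used for the next step.
 * Returns the number of entries freed.
 */
static int destroy_all(struct entry *head)
{
	struct entry *e, *tmp;
	int freed = 0;

	for (e = head->next, tmp = e->next; e != head; e = tmp, tmp = e->next) {
		e->prev->next = e->next;
		e->next->prev = e->prev;
		free(e);
		freed++;
	}
	return freed;
}
```

A loop that reads e->next only after free(e) would touch freed memory, which
is the failure mode that saving the successor up front avoids.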