From patchwork Thu Mar 12 10:23:06 2020
From: Paul Blakey
Subject: [PATCH net-next ct-offload v4 04/15] net/sched: act_ct: Instantiate
 flow table entry actions
Date: Thu, 12 Mar 2020 12:23:06 +0200
Message-Id: <1584008597-15875-5-git-send-email-paulb@mellanox.com>

The NF flow table API associates a 5-tuple rule with an action list by
calling the flow table type's action() callback to fill the rule's
actions.

In the action callback of act_ct, populate the ct offload entry actions
with a new ct_metadata action. Initialize the ct_metadata with the ct
mark, label and zone information. If ct nat was performed, also append
the relevant packet mangle actions (e.g. ipv4/ipv6/tcp/udp header
rewrites).

Drivers that offload the ft entries may match on the 5-tuple and perform
the action list.

Signed-off-by: Paul Blakey
Reviewed-by: Jiri Pirko
Reviewed-by: Edward Cree
---
Changelog:
v3->v4:
  Added "Reviewed-by: Edward Cree ", thanks!
v2->v3:
  Fix nat masks and conversions
  Moved comment above nat helpers
v1->v2:
  Remove zone from metadata
  Add add_mangle helper func (removes the unnecessary () and corrects the mask there)
  Remove "abuse" of ? operator and use switch case
  Check protocol and ports in relevant function and return err
  On error restore action entries (on the topic, validating num of actions isn't available)
  Add comment explaining nat
  Remove inline from tcf_ct_flow_table_flow_action_get_next
  Refactor tcf_ct_flow_table_add_action_nat_ipv6 with helper
  On nats, allow both src and dst mangles

 include/net/flow_offload.h            |   5 +
 include/net/netfilter/nf_flow_table.h |  23 ++++
 net/netfilter/nf_flow_table_offload.c |  23 ----
 net/sched/act_ct.c                    | 207 ++++++++++++++++++++++++++++++++++
 4 files changed, 235 insertions(+), 23 deletions(-)

diff --git a/include/net/flow_offload.h b/include/net/flow_offload.h
index d1b1e4a..ba43349 100644
--- a/include/net/flow_offload.h
+++ b/include/net/flow_offload.h
@@ -136,6 +136,7 @@ enum flow_action_id {
 	FLOW_ACTION_SAMPLE,
 	FLOW_ACTION_POLICE,
 	FLOW_ACTION_CT,
+	FLOW_ACTION_CT_METADATA,
 	FLOW_ACTION_MPLS_PUSH,
 	FLOW_ACTION_MPLS_POP,
 	FLOW_ACTION_MPLS_MANGLE,
@@ -225,6 +226,10 @@ struct flow_action_entry {
 			int action;
 			u16 zone;
 		} ct;
+		struct {
+			u32 mark;
+			u32 labels[4];
+		} ct_metadata;
 		struct { /* FLOW_ACTION_MPLS_PUSH */
 			u32 label;
 			__be16 proto;
diff --git a/include/net/netfilter/nf_flow_table.h b/include/net/netfilter/nf_flow_table.h
index d9d0945..c2d5cdd 100644
--- a/include/net/netfilter/nf_flow_table.h
+++ b/include/net/netfilter/nf_flow_table.h
@@ -16,6 +16,29 @@
 struct flow_offload;
 enum flow_offload_tuple_dir;

+struct nf_flow_key {
+	struct flow_dissector_key_meta		meta;
+	struct flow_dissector_key_control	control;
+	struct flow_dissector_key_basic		basic;
+	union {
+		struct flow_dissector_key_ipv4_addrs	ipv4;
+		struct flow_dissector_key_ipv6_addrs	ipv6;
+	};
+	struct flow_dissector_key_tcp		tcp;
+	struct flow_dissector_key_ports		tp;
+} __aligned(BITS_PER_LONG / 8); /* Ensure that we can do comparisons as longs. */
+
+struct nf_flow_match {
+	struct flow_dissector	dissector;
+	struct nf_flow_key	key;
+	struct nf_flow_key	mask;
+};
+
+struct nf_flow_rule {
+	struct nf_flow_match	match;
+	struct flow_rule	*rule;
+};
+
 struct nf_flowtable_type {
 	struct list_head	list;
 	int			family;
diff --git a/net/netfilter/nf_flow_table_offload.c b/net/netfilter/nf_flow_table_offload.c
index f5afdf0..42b73a0 100644
--- a/net/netfilter/nf_flow_table_offload.c
+++ b/net/netfilter/nf_flow_table_offload.c
@@ -23,29 +23,6 @@ struct flow_offload_work {
 	struct flow_offload	*flow;
 };

-struct nf_flow_key {
-	struct flow_dissector_key_meta		meta;
-	struct flow_dissector_key_control	control;
-	struct flow_dissector_key_basic		basic;
-	union {
-		struct flow_dissector_key_ipv4_addrs	ipv4;
-		struct flow_dissector_key_ipv6_addrs	ipv6;
-	};
-	struct flow_dissector_key_tcp		tcp;
-	struct flow_dissector_key_ports		tp;
-} __aligned(BITS_PER_LONG / 8); /* Ensure that we can do comparisons as longs.
*/ - -struct nf_flow_match { - struct flow_dissector dissector; - struct nf_flow_key key; - struct nf_flow_key mask; -}; - -struct nf_flow_rule { - struct nf_flow_match match; - struct flow_rule *rule; -}; - #define NF_FLOW_DISSECTOR(__match, __type, __field) \ (__match)->dissector.offset[__type] = \ offsetof(struct nf_flow_key, __field) diff --git a/net/sched/act_ct.c b/net/sched/act_ct.c index 3d9e678..9c522bc 100644 --- a/net/sched/act_ct.c +++ b/net/sched/act_ct.c @@ -55,7 +55,214 @@ struct tcf_ct_flow_table { .automatic_shrinking = true, }; +static struct flow_action_entry * +tcf_ct_flow_table_flow_action_get_next(struct flow_action *flow_action) +{ + int i = flow_action->num_entries++; + + return &flow_action->entries[i]; +} + +static void tcf_ct_add_mangle_action(struct flow_action *action, + enum flow_action_mangle_base htype, + u32 offset, + u32 mask, + u32 val) +{ + struct flow_action_entry *entry; + + entry = tcf_ct_flow_table_flow_action_get_next(action); + entry->id = FLOW_ACTION_MANGLE; + entry->mangle.htype = htype; + entry->mangle.mask = ~mask; + entry->mangle.offset = offset; + entry->mangle.val = val; +} + +/* The following nat helper functions check if the inverted reverse tuple + * (target) is different then the current dir tuple - meaning nat for ports + * and/or ip is needed, and add the relevant mangle actions. + */ +static void +tcf_ct_flow_table_add_action_nat_ipv4(const struct nf_conntrack_tuple *tuple, + struct nf_conntrack_tuple target, + struct flow_action *action) +{ + if (memcmp(&target.src.u3, &tuple->src.u3, sizeof(target.src.u3))) + tcf_ct_add_mangle_action(action, FLOW_ACT_MANGLE_HDR_TYPE_IP4, + offsetof(struct iphdr, saddr), + 0xFFFFFFFF, + be32_to_cpu(target.src.u3.ip)); + if (memcmp(&target.dst.u3, &tuple->dst.u3, sizeof(target.dst.u3))) + tcf_ct_add_mangle_action(action, FLOW_ACT_MANGLE_HDR_TYPE_IP4, + offsetof(struct iphdr, daddr), + 0xFFFFFFFF, + be32_to_cpu(target.dst.u3.ip)); +} + +static void +tcf_ct_add_ipv6_addr_mangle_action(struct flow_action *action, + union nf_inet_addr *addr, + u32 offset) +{ + int i; + + for (i = 0; i < sizeof(struct in6_addr) / sizeof(u32); i++) + tcf_ct_add_mangle_action(action, FLOW_ACT_MANGLE_HDR_TYPE_IP6, + i * sizeof(u32) + offset, + 0xFFFFFFFF, be32_to_cpu(addr->ip6[i])); +} + +static void +tcf_ct_flow_table_add_action_nat_ipv6(const struct nf_conntrack_tuple *tuple, + struct nf_conntrack_tuple target, + struct flow_action *action) +{ + if (memcmp(&target.src.u3, &tuple->src.u3, sizeof(target.src.u3))) + tcf_ct_add_ipv6_addr_mangle_action(action, &target.src.u3, + offsetof(struct ipv6hdr, + saddr)); + if (memcmp(&target.dst.u3, &tuple->dst.u3, sizeof(target.dst.u3))) + tcf_ct_add_ipv6_addr_mangle_action(action, &target.dst.u3, + offsetof(struct ipv6hdr, + daddr)); +} + +static void +tcf_ct_flow_table_add_action_nat_tcp(const struct nf_conntrack_tuple *tuple, + struct nf_conntrack_tuple target, + struct flow_action *action) +{ + __be16 target_src = target.src.u.tcp.port; + __be16 target_dst = target.dst.u.tcp.port; + + if (target_src != tuple->src.u.tcp.port) + tcf_ct_add_mangle_action(action, FLOW_ACT_MANGLE_HDR_TYPE_TCP, + offsetof(struct tcphdr, source), + 0xFFFF, be16_to_cpu(target_src)); + if (target_dst != tuple->dst.u.tcp.port) + tcf_ct_add_mangle_action(action, FLOW_ACT_MANGLE_HDR_TYPE_TCP, + offsetof(struct tcphdr, dest), + 0xFFFF, be16_to_cpu(target_dst)); +} + +static void +tcf_ct_flow_table_add_action_nat_udp(const struct nf_conntrack_tuple *tuple, + struct nf_conntrack_tuple target, + struct flow_action 
*action)
+{
+	__be16 target_src = target.src.u.udp.port;
+	__be16 target_dst = target.dst.u.udp.port;
+
+	if (target_src != tuple->src.u.udp.port)
+		tcf_ct_add_mangle_action(action, FLOW_ACT_MANGLE_HDR_TYPE_UDP,
+					 offsetof(struct udphdr, source),
+					 0xFFFF, be16_to_cpu(target_src));
+	if (target_dst != tuple->dst.u.udp.port)
+		tcf_ct_add_mangle_action(action, FLOW_ACT_MANGLE_HDR_TYPE_UDP,
+					 offsetof(struct udphdr, dest),
+					 0xFFFF, be16_to_cpu(target_dst));
+}
+
+static void tcf_ct_flow_table_add_action_meta(struct nf_conn *ct,
+					      enum ip_conntrack_dir dir,
+					      struct flow_action *action)
+{
+	struct nf_conn_labels *ct_labels;
+	struct flow_action_entry *entry;
+	u32 *act_ct_labels;
+
+	entry = tcf_ct_flow_table_flow_action_get_next(action);
+	entry->id = FLOW_ACTION_CT_METADATA;
+#if IS_ENABLED(CONFIG_NF_CONNTRACK_MARK)
+	entry->ct_metadata.mark = ct->mark;
+#endif
+
+	act_ct_labels = entry->ct_metadata.labels;
+	ct_labels = nf_ct_labels_find(ct);
+	if (ct_labels)
+		memcpy(act_ct_labels, ct_labels->bits, NF_CT_LABELS_MAX_SIZE);
+	else
+		memset(act_ct_labels, 0, NF_CT_LABELS_MAX_SIZE);
+}
+
+static int tcf_ct_flow_table_add_action_nat(struct net *net,
+					    struct nf_conn *ct,
+					    enum ip_conntrack_dir dir,
+					    struct flow_action *action)
+{
+	const struct nf_conntrack_tuple *tuple = &ct->tuplehash[dir].tuple;
+	struct nf_conntrack_tuple target;
+
+	nf_ct_invert_tuple(&target, &ct->tuplehash[!dir].tuple);
+
+	switch (tuple->src.l3num) {
+	case NFPROTO_IPV4:
+		tcf_ct_flow_table_add_action_nat_ipv4(tuple, target,
+						      action);
+		break;
+	case NFPROTO_IPV6:
+		tcf_ct_flow_table_add_action_nat_ipv6(tuple, target,
+						      action);
+		break;
+	default:
+		return -EOPNOTSUPP;
+	}
+
+	switch (nf_ct_protonum(ct)) {
+	case IPPROTO_TCP:
+		tcf_ct_flow_table_add_action_nat_tcp(tuple, target, action);
+		break;
+	case IPPROTO_UDP:
+		tcf_ct_flow_table_add_action_nat_udp(tuple, target, action);
+		break;
+	default:
+		return -EOPNOTSUPP;
+	}
+
+	return 0;
+}
+
+static int tcf_ct_flow_table_fill_actions(struct net *net,
+					  const struct flow_offload *flow,
+					  enum flow_offload_tuple_dir tdir,
+					  struct nf_flow_rule *flow_rule)
+{
+	struct flow_action *action = &flow_rule->rule->action;
+	int num_entries = action->num_entries;
+	struct nf_conn *ct = flow->ct;
+	enum ip_conntrack_dir dir;
+	int i, err;
+
+	switch (tdir) {
+	case FLOW_OFFLOAD_DIR_ORIGINAL:
+		dir = IP_CT_DIR_ORIGINAL;
+		break;
+	case FLOW_OFFLOAD_DIR_REPLY:
+		dir = IP_CT_DIR_REPLY;
+		break;
+	default:
+		return -EOPNOTSUPP;
+	}
+
+	err = tcf_ct_flow_table_add_action_nat(net, ct, dir, action);
+	if (err)
+		goto err_nat;
+
+	tcf_ct_flow_table_add_action_meta(ct, dir, action);
+	return 0;
+
+err_nat:
+	/* Clear filled actions */
+	for (i = num_entries; i < action->num_entries; i++)
+		memset(&action->entries[i], 0, sizeof(action->entries[i]));
+	action->num_entries = num_entries;
+
+	return err;
+}
+
 static struct nf_flowtable_type flowtable_ct = {
+	.action		= tcf_ct_flow_table_fill_actions,
 	.owner		= THIS_MODULE,
 };

From patchwork Thu Mar 12 10:23:07 2020
From: Paul Blakey
Subject: [PATCH net-next ct-offload v4 05/15] net/sched: act_ct: Support
 restoring conntrack info on skbs
Date: Thu, 12 Mar 2020 12:23:07 +0200
Message-Id: <1584008597-15875-6-git-send-email-paulb@mellanox.com>

Provide an API to restore the ct state pointer. This may be used by
drivers to restore the ct state if they miss in the tc chain after they
have already performed the hardware connection tracking action
(ct_metadata action).

For example, consider the following rule on chain 0 that is in_hw, while
chain 1 is not_in_hw:

$ tc filter add dev ... chain 0 ... \
  flower ... action ct pipe action goto chain 1

Packets of a flow offloaded by the driver (via the nf flow table offload)
that hit this rule in hardware will be marked with the ct metadata action
(mark, label, zone), which does the equivalent of the software ct action.
When such a packet jumps to hardware chain 1, there will be a miss, since
CT was already processed in hardware. Therefore, the driver's miss
handling should restore the ct state on the skb using the provided API,
and continue the packet processing in chain 1.
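As an illustration (not part of the patch), a minimal sketch of such a
miss handler follows. It assumes the driver stashed
entry->ct_metadata.cookie when it offloaded the flow, and
example_lookup_cookie() stands in for hypothetical driver glue that maps
the hardware miss metadata back to that saved cookie; only
tcf_ct_flow_table_restore_skb(), declared in include/net/tc_act/tc_ct.h,
is the API added here:

	/* Hypothetical driver miss handler: hardware already executed the
	 * ct_metadata action, so rebuild the software conntrack reference
	 * on the skb before resuming tc processing at chain 1.
	 */
	static void example_driver_ct_miss(struct sk_buff *skb, u32 miss_tag)
	{
		unsigned long cookie;

		/* Assumed driver glue: returns the ct_metadata.cookie that
		 * was saved at offload time for this hardware miss tag.
		 */
		cookie = example_lookup_cookie(miss_tag);
		if (cookie)
			tcf_ct_flow_table_restore_skb(skb, cookie);
	}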
Signed-off-by: Paul Blakey Reviewed-by: Jiri Pirko --- Changelog: v3->v4: Rev xmas tree include/net/flow_offload.h | 1 + include/net/tc_act/tc_ct.h | 7 +++++++ net/sched/act_ct.c | 16 ++++++++++++++++ 3 files changed, 24 insertions(+) diff --git a/include/net/flow_offload.h b/include/net/flow_offload.h index ba43349..a039c90 100644 --- a/include/net/flow_offload.h +++ b/include/net/flow_offload.h @@ -227,6 +227,7 @@ struct flow_action_entry { u16 zone; } ct; struct { + unsigned long cookie; u32 mark; u32 labels[4]; } ct_metadata; diff --git a/include/net/tc_act/tc_ct.h b/include/net/tc_act/tc_ct.h index cf3492e..735da59 100644 --- a/include/net/tc_act/tc_ct.h +++ b/include/net/tc_act/tc_ct.h @@ -55,6 +55,13 @@ static inline int tcf_ct_action(const struct tc_action *a) static inline int tcf_ct_action(const struct tc_action *a) { return 0; } #endif /* CONFIG_NF_CONNTRACK */ +#if IS_ENABLED(CONFIG_NET_ACT_CT) +void tcf_ct_flow_table_restore_skb(struct sk_buff *skb, unsigned long cookie); +#else +static inline void +tcf_ct_flow_table_restore_skb(struct sk_buff *skb, unsigned long cookie) { } +#endif + static inline bool is_tcf_ct(const struct tc_action *a) { #if defined(CONFIG_NET_CLS_ACT) && IS_ENABLED(CONFIG_NF_CONNTRACK) diff --git a/net/sched/act_ct.c b/net/sched/act_ct.c index 9c522bc..31eef8a 100644 --- a/net/sched/act_ct.c +++ b/net/sched/act_ct.c @@ -170,6 +170,7 @@ static void tcf_ct_flow_table_add_action_meta(struct nf_conn *ct, { struct nf_conn_labels *ct_labels; struct flow_action_entry *entry; + enum ip_conntrack_info ctinfo; u32 *act_ct_labels; entry = tcf_ct_flow_table_flow_action_get_next(action); @@ -177,6 +178,10 @@ static void tcf_ct_flow_table_add_action_meta(struct nf_conn *ct, #if IS_ENABLED(CONFIG_NF_CONNTRACK_MARK) entry->ct_metadata.mark = ct->mark; #endif + ctinfo = dir == IP_CT_DIR_ORIGINAL ? 
IP_CT_ESTABLISHED :
+					      IP_CT_ESTABLISHED_REPLY;
+	/* aligns with the CT reference on the SKB nf_ct_set */
+	entry->ct_metadata.cookie = (unsigned long)ct | ctinfo;

 	act_ct_labels = entry->ct_metadata.labels;
 	ct_labels = nf_ct_labels_find(ct);
@@ -1530,6 +1535,17 @@ static void __exit ct_cleanup_module(void)
 	destroy_workqueue(act_ct_wq);
 }

+void tcf_ct_flow_table_restore_skb(struct sk_buff *skb, unsigned long cookie)
+{
+	enum ip_conntrack_info ctinfo = cookie & NFCT_INFOMASK;
+	struct nf_conn *ct;
+
+	ct = (struct nf_conn *)(cookie & NFCT_PTRMASK);
+	nf_conntrack_get(&ct->ct_general);
+	nf_ct_set(skb, ct, ctinfo);
+}
+EXPORT_SYMBOL_GPL(tcf_ct_flow_table_restore_skb);
+
 module_init(ct_init_module);
 module_exit(ct_cleanup_module);
 MODULE_AUTHOR("Paul Blakey ");

From patchwork Thu Mar 12 10:23:08 2020
From: Paul Blakey
Subject: [PATCH net-next ct-offload v4 06/15] net/sched: act_ct: Support
 refreshing the flow table entries
Date: Thu, 12 Mar 2020 12:23:08 +0200
Message-Id: <1584008597-15875-7-git-send-email-paulb@mellanox.com>

If a driver deleted an FT entry, an FT entry failed to offload, or a
driver registered to the flow table after flows were already added, we
still get packets in software. For those packets, while restoring the ct
state from the flow table entry, refresh its hardware offload.
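For reference, a minimal sketch of the expected software-path usage,
mirroring what the act_ct lookup change below does (the function name is
illustrative; flow_offload_lookup() and the tuplehash layout are existing
nf_flow_table APIs):

	/* After a successful tuple lookup in software, refresh the entry:
	 * flow_offload_refresh() extends its timeout and, if the entry is
	 * marked NF_FLOW_HW_REFRESH, re-adds its hardware offload.
	 */
	static bool example_sw_lookup_and_refresh(struct nf_flowtable *nf_ft,
						  struct flow_offload_tuple *tuple)
	{
		struct flow_offload_tuple_rhash *tuplehash;
		struct flow_offload *flow;

		tuplehash = flow_offload_lookup(nf_ft, tuple);
		if (!tuplehash)
			return false;

		flow = container_of(tuplehash, struct flow_offload,
				    tuplehash[tuplehash->tuple.dir]);
		flow_offload_refresh(nf_ft, flow);
		return true;
	}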
Signed-off-by: Paul Blakey Reviewed-by: Jiri Pirko --- include/net/netfilter/nf_flow_table.h | 3 +++ net/netfilter/nf_flow_table_core.c | 13 +++++++++++++ net/netfilter/nf_flow_table_ip.c | 15 ++------------- net/sched/act_ct.c | 1 + 4 files changed, 19 insertions(+), 13 deletions(-) diff --git a/include/net/netfilter/nf_flow_table.h b/include/net/netfilter/nf_flow_table.h index c2d5cdd..6890f1c 100644 --- a/include/net/netfilter/nf_flow_table.h +++ b/include/net/netfilter/nf_flow_table.h @@ -162,6 +162,9 @@ int flow_offload_route_init(struct flow_offload *flow, const struct nf_flow_route *route); int flow_offload_add(struct nf_flowtable *flow_table, struct flow_offload *flow); +void flow_offload_refresh(struct nf_flowtable *flow_table, + struct flow_offload *flow); + struct flow_offload_tuple_rhash *flow_offload_lookup(struct nf_flowtable *flow_table, struct flow_offload_tuple *tuple); void nf_flow_table_cleanup(struct net_device *dev); diff --git a/net/netfilter/nf_flow_table_core.c b/net/netfilter/nf_flow_table_core.c index 4af0327..9a477bd 100644 --- a/net/netfilter/nf_flow_table_core.c +++ b/net/netfilter/nf_flow_table_core.c @@ -252,6 +252,19 @@ int flow_offload_add(struct nf_flowtable *flow_table, struct flow_offload *flow) } EXPORT_SYMBOL_GPL(flow_offload_add); +void flow_offload_refresh(struct nf_flowtable *flow_table, + struct flow_offload *flow) +{ + flow->timeout = nf_flowtable_time_stamp + NF_FLOW_TIMEOUT; + + if (likely(!nf_flowtable_hw_offload(flow_table) || + !test_and_clear_bit(NF_FLOW_HW_REFRESH, &flow->flags))) + return; + + nf_flow_offload_add(flow_table, flow); +} +EXPORT_SYMBOL_GPL(flow_offload_refresh); + static inline bool nf_flow_has_expired(const struct flow_offload *flow) { return nf_flow_timeout_delta(flow->timeout) <= 0; diff --git a/net/netfilter/nf_flow_table_ip.c b/net/netfilter/nf_flow_table_ip.c index 9e563fd..5272721 100644 --- a/net/netfilter/nf_flow_table_ip.c +++ b/net/netfilter/nf_flow_table_ip.c @@ -232,13 +232,6 @@ static unsigned int nf_flow_xmit_xfrm(struct sk_buff *skb, return NF_STOLEN; } -static bool nf_flow_offload_refresh(struct nf_flowtable *flow_table, - struct flow_offload *flow) -{ - return nf_flowtable_hw_offload(flow_table) && - test_and_clear_bit(NF_FLOW_HW_REFRESH, &flow->flags); -} - unsigned int nf_flow_offload_ip_hook(void *priv, struct sk_buff *skb, const struct nf_hook_state *state) @@ -279,8 +272,7 @@ static bool nf_flow_offload_refresh(struct nf_flowtable *flow_table, if (nf_flow_state_check(flow, ip_hdr(skb)->protocol, skb, thoff)) return NF_ACCEPT; - if (unlikely(nf_flow_offload_refresh(flow_table, flow))) - nf_flow_offload_add(flow_table, flow); + flow_offload_refresh(flow_table, flow); if (nf_flow_offload_dst_check(&rt->dst)) { flow_offload_teardown(flow); @@ -290,7 +282,6 @@ static bool nf_flow_offload_refresh(struct nf_flowtable *flow_table, if (nf_flow_nat_ip(flow, skb, thoff, dir) < 0) return NF_DROP; - flow->timeout = nf_flowtable_time_stamp + NF_FLOW_TIMEOUT; iph = ip_hdr(skb); ip_decrease_ttl(iph); skb->tstamp = 0; @@ -508,8 +499,7 @@ static int nf_flow_tuple_ipv6(struct sk_buff *skb, const struct net_device *dev, sizeof(*ip6h))) return NF_ACCEPT; - if (unlikely(nf_flow_offload_refresh(flow_table, flow))) - nf_flow_offload_add(flow_table, flow); + flow_offload_refresh(flow_table, flow); if (nf_flow_offload_dst_check(&rt->dst)) { flow_offload_teardown(flow); @@ -522,7 +512,6 @@ static int nf_flow_tuple_ipv6(struct sk_buff *skb, const struct net_device *dev, if (nf_flow_nat_ipv6(flow, skb, dir) < 0) return NF_DROP; - 
flow->timeout = nf_flowtable_time_stamp + NF_FLOW_TIMEOUT;
 	ip6h = ipv6_hdr(skb);
 	ip6h->hop_limit--;
 	skb->tstamp = 0;
diff --git a/net/sched/act_ct.c b/net/sched/act_ct.c
index 31eef8a..16fc731 100644
--- a/net/sched/act_ct.c
+++ b/net/sched/act_ct.c
@@ -531,6 +531,7 @@ static bool tcf_ct_flow_table_lookup(struct tcf_ct_params *p,
 	ctinfo = dir == FLOW_OFFLOAD_DIR_ORIGINAL ? IP_CT_ESTABLISHED :
 						    IP_CT_ESTABLISHED_REPLY;

+	flow_offload_refresh(nf_ft, flow);
 	nf_conntrack_get(&ct->ct_general);
 	nf_ct_set(skb, ct, ctinfo);

From patchwork Thu Mar 12 10:23:10 2020
From: Paul Blakey
Subject: [PATCH net-next ct-offload v4 08/15] net/mlx5: E-Switch, Introduce
 global tables
Date: Thu, 12 Mar 2020 12:23:10 +0200
Message-Id: <1584008597-15875-9-git-send-email-paulb@mellanox.com>

Currently, flow tables are automatically connected according to their
tuple. Introduce global tables, which are flow tables that are detached
from the eswitch chains processing and are connected by explicitly
referencing them from multiple chains.

Add this new table type, and allow connecting tables by reference.
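To make the intended use concrete, a minimal sketch under the assumption
that esw and attr are a caller's eswitch handle and mlx5_esw_flow_attr
(patch 12 of this series wires this up for the CT tables):

	struct mlx5_flow_table *ft;

	/* Allocate a flow table detached from the chain/prio hierarchy. */
	ft = mlx5_esw_chains_create_global_table(esw);
	if (IS_ERR(ft))
		return PTR_ERR(ft);

	/* With chain == 0 and prio == 0, offloaded rules are inserted
	 * directly into attr->fdb rather than a chain/prio-derived
	 * table ...
	 */
	attr->chain = 0;
	attr->prio = 0;
	attr->fdb = ft;

	/* ... and attr->dest_ft lets another rule forward straight to it. */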
Signed-off-by: Paul Blakey Reviewed-by: Oz Shlomo --- drivers/net/ethernet/mellanox/mlx5/core/eswitch.h | 2 ++ .../ethernet/mellanox/mlx5/core/eswitch_offloads.c | 18 +++++++++---- .../mellanox/mlx5/core/eswitch_offloads_chains.c | 30 ++++++++++++++++++++++ .../mellanox/mlx5/core/eswitch_offloads_chains.h | 6 +++++ 4 files changed, 51 insertions(+), 5 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h index ee36a8a..dae0f3e 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h @@ -421,6 +421,8 @@ struct mlx5_esw_flow_attr { u16 prio; u32 dest_chain; u32 flags; + struct mlx5_flow_table *fdb; + struct mlx5_flow_table *dest_ft; struct mlx5e_tc_flow_parse_attr *parse_attr; }; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c index c79a73b..8bfab5d 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c @@ -324,7 +324,12 @@ struct mlx5_flow_handle * if (flow_act.action & MLX5_FLOW_CONTEXT_ACTION_FWD_DEST) { struct mlx5_flow_table *ft; - if (attr->flags & MLX5_ESW_ATTR_FLAG_SLOW_PATH) { + if (attr->dest_ft) { + flow_act.flags |= FLOW_ACT_IGNORE_FLOW_LEVEL; + dest[i].type = MLX5_FLOW_DESTINATION_TYPE_FLOW_TABLE; + dest[i].ft = attr->dest_ft; + i++; + } else if (attr->flags & MLX5_ESW_ATTR_FLAG_SLOW_PATH) { flow_act.flags |= FLOW_ACT_IGNORE_FLOW_LEVEL; dest[i].type = MLX5_FLOW_DESTINATION_TYPE_FLOW_TABLE; dest[i].ft = mlx5_esw_chains_get_tc_end_ft(esw); @@ -378,8 +383,11 @@ struct mlx5_flow_handle * if (split) { fdb = esw_vport_tbl_get(esw, attr); } else { - fdb = mlx5_esw_chains_get_table(esw, attr->chain, attr->prio, - 0); + if (attr->chain || attr->prio) + fdb = mlx5_esw_chains_get_table(esw, attr->chain, + attr->prio, 0); + else + fdb = attr->fdb; mlx5_eswitch_set_rule_source_port(esw, spec, attr); } if (IS_ERR(fdb)) { @@ -402,7 +410,7 @@ struct mlx5_flow_handle * err_add_rule: if (split) esw_vport_tbl_put(esw, attr); - else + else if (attr->chain || attr->prio) mlx5_esw_chains_put_table(esw, attr->chain, attr->prio, 0); err_esw_get: if (!(attr->flags & MLX5_ESW_ATTR_FLAG_SLOW_PATH) && attr->dest_chain) @@ -499,7 +507,7 @@ struct mlx5_flow_handle * } else { if (split) esw_vport_tbl_put(esw, attr); - else + else if (attr->chain || attr->prio) mlx5_esw_chains_put_table(esw, attr->chain, attr->prio, 0); if (attr->dest_chain) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads_chains.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads_chains.c index ca3bbf8..84e3313 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads_chains.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads_chains.c @@ -722,6 +722,36 @@ struct mlx5_flow_table * return tc_end_fdb(esw); } +struct mlx5_flow_table * +mlx5_esw_chains_create_global_table(struct mlx5_eswitch *esw) +{ + int chain, prio, level, err; + + if (!fdb_ignore_flow_level_supported(esw)) { + err = -EOPNOTSUPP; + + esw_warn(esw->dev, + "Couldn't create global flow table, ignore_flow_level not supported."); + goto err_ignore; + } + + chain = mlx5_esw_chains_get_chain_range(esw), + prio = mlx5_esw_chains_get_prio_range(esw); + level = mlx5_esw_chains_get_level_range(esw); + + return mlx5_esw_chains_create_fdb_table(esw, chain, prio, level); + +err_ignore: + return ERR_PTR(err); +} + +void 
+mlx5_esw_chains_destroy_global_table(struct mlx5_eswitch *esw,
+				     struct mlx5_flow_table *ft)
+{
+	mlx5_esw_chains_destroy_fdb_table(esw, ft);
+}
+
 static int
 mlx5_esw_chains_init(struct mlx5_eswitch *esw)
 {
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads_chains.h b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads_chains.h
index e806d8d..c7bc609 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads_chains.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads_chains.h
@@ -25,6 +25,12 @@ struct mlx5_flow_table *
 struct mlx5_flow_table *
 mlx5_esw_chains_get_tc_end_ft(struct mlx5_eswitch *esw);

+struct mlx5_flow_table *
+mlx5_esw_chains_create_global_table(struct mlx5_eswitch *esw);
+void
+mlx5_esw_chains_destroy_global_table(struct mlx5_eswitch *esw,
+				     struct mlx5_flow_table *ft);
+
 int mlx5_esw_chains_create(struct mlx5_eswitch *esw);
 void mlx5_esw_chains_destroy(struct mlx5_eswitch *esw);

From patchwork Thu Mar 12 10:23:11 2020
From: Paul Blakey
Subject: [PATCH net-next ct-offload v4 09/15] net/mlx5: E-Switch, Add support
 for offloading rules with no in_port
Date: Thu, 12 Mar 2020 12:23:11 +0200
Message-Id: <1584008597-15875-10-git-send-email-paulb@mellanox.com>

FTEs in global tables may match on packets from multiple in_ports.
Provide the capability to omit the in_port match condition.
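A short sketch of the intended use, assuming esw, spec and attr describe
a rule destined for a shared global table (such as the CT tuple table of
this series) whose traffic may arrive from any vport:

	/* Skip the source-port match: the FTE should apply to packets
	 * from multiple in_ports.
	 */
	attr->flags |= MLX5_ESW_ATTR_FLAG_NO_IN_PORT;
	rule = mlx5_eswitch_add_offloaded_rule(esw, spec, attr);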
Signed-off-by: Paul Blakey
Reviewed-by: Oz Shlomo
---
 drivers/net/ethernet/mellanox/mlx5/core/eswitch.h          | 1 +
 drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c | 4 +++-
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
index dae0f3e..6254bb6 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
@@ -391,6 +391,7 @@ enum {
 enum {
 	MLX5_ESW_ATTR_FLAG_VLAN_HANDLED = BIT(0),
 	MLX5_ESW_ATTR_FLAG_SLOW_PATH    = BIT(1),
+	MLX5_ESW_ATTR_FLAG_NO_IN_PORT   = BIT(2),
 };

 struct mlx5_esw_flow_attr {
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
index 8bfab5d..21ac813 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
@@ -388,7 +388,9 @@ struct mlx5_flow_handle *
 						 attr->prio, 0);
 		else
 			fdb = attr->fdb;
-		mlx5_eswitch_set_rule_source_port(esw, spec, attr);
+
+		if (!(attr->flags & MLX5_ESW_ATTR_FLAG_NO_IN_PORT))
+			mlx5_eswitch_set_rule_source_port(esw, spec, attr);
 	}
 	if (IS_ERR(fdb)) {
 		rule = ERR_CAST(fdb);

From patchwork Thu Mar 12 10:23:12 2020
From: Paul Blakey
Subject: [PATCH net-next ct-offload v4 10/15] net/mlx5: E-Switch, Support
 getting chain mapping
Date: Thu, 12 Mar 2020 12:23:12 +0200
Message-Id: <1584008597-15875-11-git-send-email-paulb@mellanox.com>

Currently, we write the chain register mapping on a miss from the last
prio of a chain.
It is used to restore the chain in software.

To support reusing the chain register mapping on misses from global
tables (such as the CT tuple table), export the chain mapping.

Signed-off-by: Paul Blakey
Reviewed-by: Oz Shlomo
---
 .../ethernet/mellanox/mlx5/core/eswitch_offloads_chains.c | 13 +++++++++++++
 .../ethernet/mellanox/mlx5/core/eswitch_offloads_chains.h |  7 +++++++
 2 files changed, 20 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads_chains.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads_chains.c
index 84e3313..0702c21 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads_chains.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads_chains.c
@@ -900,6 +900,19 @@ struct mlx5_flow_table *
 	mlx5_esw_chains_cleanup(esw);
 }

+int
+mlx5_esw_chains_get_chain_mapping(struct mlx5_eswitch *esw, u32 chain,
+				  u32 *chain_mapping)
+{
+	return mapping_add(esw_chains_mapping(esw), &chain, chain_mapping);
+}
+
+int
+mlx5_esw_chains_put_chain_mapping(struct mlx5_eswitch *esw, u32 chain_mapping)
+{
+	return mapping_remove(esw_chains_mapping(esw), chain_mapping);
+}
+
 int mlx5_eswitch_get_chain_for_tag(struct mlx5_eswitch *esw, u32 tag,
 				   u32 *chain)
 {
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads_chains.h b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads_chains.h
index c7bc609..f3b9ae6 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads_chains.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads_chains.h
@@ -31,6 +31,13 @@ struct mlx5_flow_table *
 mlx5_esw_chains_destroy_global_table(struct mlx5_eswitch *esw,
 				     struct mlx5_flow_table *ft);

+int
+mlx5_esw_chains_get_chain_mapping(struct mlx5_eswitch *esw, u32 chain,
+				  u32 *chain_mapping);
+int
+mlx5_esw_chains_put_chain_mapping(struct mlx5_eswitch *esw,
+				  u32 chain_mapping);
+
 int mlx5_esw_chains_create(struct mlx5_eswitch *esw);
 void mlx5_esw_chains_destroy(struct mlx5_eswitch *esw);

From patchwork Thu Mar 12 10:23:14 2020
From: Paul Blakey
Subject: [PATCH net-next ct-offload v4 12/15] net/mlx5e: CT: Introduce
 connection tracking
Date: Thu, 12 Mar 2020 12:23:14 +0200
Message-Id: <1584008597-15875-13-git-send-email-paulb@mellanox.com>

Add support for offloading tc ct action and ct matches.

We translate the tc filter with a CT action to the following HW model:

+-------------------+      +--------------------+    +--------------+
+ pre_ct (tc chain) +----->+ CT (nat or no nat) +--->+ post_ct      +----->
+ original match    +  |   + tuple + zone match +  | + fte_id match +  |
+-------------------+  |   +--------------------+  | +--------------+  |
                       v                           v                  v
      set chain miss mapping  set mark             original
      set fte_id              set label            filter
      set zone                set established      actions
      set tunnel_id           do nat (if needed)
      do decap

Signed-off-by: Paul Blakey
Reviewed-by: Oz Shlomo
Reviewed-by: Roi Dayan
---
Changelog:
v2->v3:
  check reg loopback support instead of metadata support/enabled

 drivers/net/ethernet/mellanox/mlx5/core/Kconfig    |  10 +
 drivers/net/ethernet/mellanox/mlx5/core/Makefile   |   1 +
 drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c | 541 +++++++++++++++++++++
 drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.h | 140 ++++++
 drivers/net/ethernet/mellanox/mlx5/core/en_rep.h   |   3 +
 drivers/net/ethernet/mellanox/mlx5/core/en_tc.c    | 104 +++-
 drivers/net/ethernet/mellanox/mlx5/core/en_tc.h    |   8 +
 drivers/net/ethernet/mellanox/mlx5/core/eswitch.h  |   2 +
 8 files changed, 793 insertions(+), 16 deletions(-)
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.h

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/Kconfig b/drivers/net/ethernet/mellanox/mlx5/core/Kconfig
index a1f20b2..312e0a1 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/Kconfig
+++ b/drivers/net/ethernet/mellanox/mlx5/core/Kconfig
@@ -78,6 +78,16 @@ config MLX5_ESWITCH
 	  Legacy SRIOV mode (L2 mac vlan steering based).
 	  Switchdev mode (eswitch offloads).

+config MLX5_TC_CT
+	bool "MLX5 TC connection tracking offload support"
+	depends on MLX5_CORE_EN && NET_SWITCHDEV && NF_FLOW_TABLE && NET_ACT_CT && NET_TC_SKB_EXT
+	default y
+	help
+	  Say Y here if you want to support offloading connection tracking rules
+	  via tc ct action.
+ + If unsure, set to Y + config MLX5_CORE_EN_DCB bool "Data Center Bridging (DCB) Support" default y diff --git a/drivers/net/ethernet/mellanox/mlx5/core/Makefile b/drivers/net/ethernet/mellanox/mlx5/core/Makefile index a62dc81..7408ae3 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/Makefile +++ b/drivers/net/ethernet/mellanox/mlx5/core/Makefile @@ -37,6 +37,7 @@ mlx5_core-$(CONFIG_MLX5_ESWITCH) += en_rep.o en_tc.o en/tc_tun.o lib/port_tu lib/geneve.o en/mapping.o en/tc_tun_vxlan.o en/tc_tun_gre.o \ en/tc_tun_geneve.o diag/en_tc_tracepoint.o mlx5_core-$(CONFIG_PCI_HYPERV_INTERFACE) += en/hv_vhca_stats.o +mlx5_core-$(CONFIG_MLX5_TC_CT) += en/tc_ct.o # # Core extra diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c new file mode 100644 index 0000000..c113046 --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c @@ -0,0 +1,541 @@ +// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB +/* Copyright (c) 2019 Mellanox Technologies. */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "en/tc_ct.h" +#include "en.h" +#include "en_tc.h" +#include "en_rep.h" +#include "eswitch_offloads_chains.h" + +#define MLX5_CT_ZONE_BITS (mlx5e_tc_attr_to_reg_mappings[ZONE_TO_REG].mlen * 8) +#define MLX5_CT_ZONE_MASK GENMASK(MLX5_CT_ZONE_BITS - 1, 0) +#define MLX5_CT_STATE_ESTABLISHED_BIT BIT(1) +#define MLX5_CT_STATE_TRK_BIT BIT(2) + +#define MLX5_FTE_ID_BITS (mlx5e_tc_attr_to_reg_mappings[FTEID_TO_REG].mlen * 8) +#define MLX5_FTE_ID_MAX GENMASK(MLX5_FTE_ID_BITS - 1, 0) +#define MLX5_FTE_ID_MASK MLX5_FTE_ID_MAX + +#define ct_dbg(fmt, args...)\ + netdev_dbg(ct_priv->netdev, "ct_debug: " fmt "\n", ##args) + +struct mlx5_tc_ct_priv { + struct mlx5_eswitch *esw; + const struct net_device *netdev; + struct idr fte_ids; + struct mlx5_flow_table *ct; + struct mlx5_flow_table *ct_nat; + struct mlx5_flow_table *post_ct; + struct mutex control_lock; /* guards parallel adds/dels */ +}; + +struct mlx5_ct_flow { + struct mlx5_esw_flow_attr pre_ct_attr; + struct mlx5_esw_flow_attr post_ct_attr; + struct mlx5_flow_handle *pre_ct_rule; + struct mlx5_flow_handle *post_ct_rule; + u32 fte_id; + u32 chain_mapping; +}; + +static struct mlx5_tc_ct_priv * +mlx5_tc_ct_get_ct_priv(struct mlx5e_priv *priv) +{ + struct mlx5_eswitch *esw = priv->mdev->priv.eswitch; + struct mlx5_rep_uplink_priv *uplink_priv; + struct mlx5e_rep_priv *uplink_rpriv; + + uplink_rpriv = mlx5_eswitch_get_uplink_priv(esw, REP_ETH); + uplink_priv = &uplink_rpriv->uplink_priv; + return uplink_priv->ct_priv; +} + +int +mlx5_tc_ct_parse_match(struct mlx5e_priv *priv, + struct mlx5_flow_spec *spec, + struct flow_cls_offload *f, + struct netlink_ext_ack *extack) +{ + struct mlx5_tc_ct_priv *ct_priv = mlx5_tc_ct_get_ct_priv(priv); + struct flow_dissector_key_ct *mask, *key; + bool trk, est, untrk, unest, new, unnew; + u32 ctstate = 0, ctstate_mask = 0; + u16 ct_state_on, ct_state_off; + u16 ct_state, ct_state_mask; + struct flow_match_ct match; + + if (!flow_rule_match_key(f->rule, FLOW_DISSECTOR_KEY_CT)) + return 0; + + if (!ct_priv) { + NL_SET_ERR_MSG_MOD(extack, + "offload of ct matching isn't available"); + return -EOPNOTSUPP; + } + + flow_rule_match_ct(f->rule, &match); + + key = match.key; + mask = match.mask; + + ct_state = key->ct_state; + ct_state_mask = mask->ct_state; + + if (ct_state_mask & ~(TCA_FLOWER_KEY_CT_FLAGS_TRACKED | + TCA_FLOWER_KEY_CT_FLAGS_ESTABLISHED | + TCA_FLOWER_KEY_CT_FLAGS_NEW)) { + 
NL_SET_ERR_MSG_MOD(extack, + "only ct_state trk, est and new are supported for offload"); + return -EOPNOTSUPP; + } + + if (mask->ct_labels[1] || mask->ct_labels[2] || mask->ct_labels[3]) { + NL_SET_ERR_MSG_MOD(extack, + "only lower 32bits of ct_labels are supported for offload"); + return -EOPNOTSUPP; + } + + ct_state_on = ct_state & ct_state_mask; + ct_state_off = (ct_state & ct_state_mask) ^ ct_state_mask; + trk = ct_state_on & TCA_FLOWER_KEY_CT_FLAGS_TRACKED; + new = ct_state_on & TCA_FLOWER_KEY_CT_FLAGS_NEW; + est = ct_state_on & TCA_FLOWER_KEY_CT_FLAGS_ESTABLISHED; + untrk = ct_state_off & TCA_FLOWER_KEY_CT_FLAGS_TRACKED; + unnew = ct_state_off & TCA_FLOWER_KEY_CT_FLAGS_NEW; + unest = ct_state_off & TCA_FLOWER_KEY_CT_FLAGS_ESTABLISHED; + + ctstate |= trk ? MLX5_CT_STATE_TRK_BIT : 0; + ctstate |= est ? MLX5_CT_STATE_ESTABLISHED_BIT : 0; + ctstate_mask |= (untrk || trk) ? MLX5_CT_STATE_TRK_BIT : 0; + ctstate_mask |= (unest || est) ? MLX5_CT_STATE_ESTABLISHED_BIT : 0; + + if (new) { + NL_SET_ERR_MSG_MOD(extack, + "matching on ct_state +new isn't supported"); + return -EOPNOTSUPP; + } + + if (mask->ct_zone) + mlx5e_tc_match_to_reg_match(spec, ZONE_TO_REG, + key->ct_zone, MLX5_CT_ZONE_MASK); + if (ctstate_mask) + mlx5e_tc_match_to_reg_match(spec, CTSTATE_TO_REG, + ctstate, ctstate_mask); + if (mask->ct_mark) + mlx5e_tc_match_to_reg_match(spec, MARK_TO_REG, + key->ct_mark, mask->ct_mark); + if (mask->ct_labels[0]) + mlx5e_tc_match_to_reg_match(spec, LABELS_TO_REG, + key->ct_labels[0], + mask->ct_labels[0]); + + return 0; +} + +int +mlx5_tc_ct_parse_action(struct mlx5e_priv *priv, + struct mlx5_esw_flow_attr *attr, + const struct flow_action_entry *act, + struct netlink_ext_ack *extack) +{ + struct mlx5_tc_ct_priv *ct_priv = mlx5_tc_ct_get_ct_priv(priv); + + if (!ct_priv) { + NL_SET_ERR_MSG_MOD(extack, + "offload of ct action isn't available"); + return -EOPNOTSUPP; + } + + attr->ct_attr.zone = act->ct.zone; + attr->ct_attr.ct_action = act->ct.action; + + return 0; +} + +/* We translate the tc filter with CT action to the following HW model: + * + * +-------------------+ +--------------------+ +--------------+ + * + pre_ct (tc chain) +----->+ CT (nat or no nat) +--->+ post_ct +-----> + * + original match + | + tuple + zone match + | + fte_id match + | + * +-------------------+ | +--------------------+ | +--------------+ | + * v v v + * set chain miss mapping set mark original + * set fte_id set label filter + * set zone set established actions + * set tunnel_id do nat (if needed) + * do decap + */ +static int +__mlx5_tc_ct_flow_offload(struct mlx5e_priv *priv, + struct mlx5e_tc_flow *flow, + struct mlx5_flow_spec *orig_spec, + struct mlx5_esw_flow_attr *attr, + struct mlx5_flow_handle **flow_rule) +{ + struct mlx5_tc_ct_priv *ct_priv = mlx5_tc_ct_get_ct_priv(priv); + bool nat = attr->ct_attr.ct_action & TCA_CT_ACT_NAT; + struct mlx5e_tc_mod_hdr_acts pre_mod_acts = {}; + struct mlx5_eswitch *esw = ct_priv->esw; + struct mlx5_flow_spec post_ct_spec = {}; + struct mlx5_esw_flow_attr *pre_ct_attr; + struct mlx5_modify_hdr *mod_hdr; + struct mlx5_flow_handle *rule; + struct mlx5_ct_flow *ct_flow; + int chain_mapping = 0, err; + u32 fte_id = 1; + + ct_flow = kzalloc(sizeof(*ct_flow), GFP_KERNEL); + if (!ct_flow) + return -ENOMEM; + + err = idr_alloc_u32(&ct_priv->fte_ids, ct_flow, &fte_id, + MLX5_FTE_ID_MAX, GFP_KERNEL); + if (err) { + netdev_warn(priv->netdev, + "Failed to allocate fte id, err: %d\n", err); + goto err_idr; + } + ct_flow->fte_id = fte_id; + + /* Base esw attributes of both rules 
on original rule attribute */ + pre_ct_attr = &ct_flow->pre_ct_attr; + memcpy(pre_ct_attr, attr, sizeof(*attr)); + memcpy(&ct_flow->post_ct_attr, attr, sizeof(*attr)); + + /* Modify the original rule's action to fwd and modify, leave decap */ + pre_ct_attr->action = attr->action & MLX5_FLOW_CONTEXT_ACTION_DECAP; + pre_ct_attr->action |= MLX5_FLOW_CONTEXT_ACTION_FWD_DEST | + MLX5_FLOW_CONTEXT_ACTION_MOD_HDR; + + /* Write chain miss tag for miss in ct table as we + * don't go though all prios of this chain as normal tc rules + * miss. + */ + err = mlx5_esw_chains_get_chain_mapping(esw, attr->chain, + &chain_mapping); + if (err) { + ct_dbg("Failed to get chain register mapping for chain"); + goto err_get_chain; + } + ct_flow->chain_mapping = chain_mapping; + + err = mlx5e_tc_match_to_reg_set(esw->dev, &pre_mod_acts, + CHAIN_TO_REG, chain_mapping); + if (err) { + ct_dbg("Failed to set chain register mapping"); + goto err_mapping; + } + + err = mlx5e_tc_match_to_reg_set(esw->dev, &pre_mod_acts, ZONE_TO_REG, + attr->ct_attr.zone & + MLX5_CT_ZONE_MASK); + if (err) { + ct_dbg("Failed to set zone register mapping"); + goto err_mapping; + } + + err = mlx5e_tc_match_to_reg_set(esw->dev, &pre_mod_acts, + FTEID_TO_REG, fte_id); + if (err) { + ct_dbg("Failed to set fte_id register mapping"); + goto err_mapping; + } + + /* If original flow is decap, we do it before going into ct table + * so add a rewrite for the tunnel match_id. + */ + if ((pre_ct_attr->action & MLX5_FLOW_CONTEXT_ACTION_DECAP) && + attr->chain == 0) { + u32 tun_id = mlx5e_tc_get_flow_tun_id(flow); + + err = mlx5e_tc_match_to_reg_set(esw->dev, &pre_mod_acts, + TUNNEL_TO_REG, + tun_id); + if (err) { + ct_dbg("Failed to set tunnel register mapping"); + goto err_mapping; + } + } + + mod_hdr = mlx5_modify_header_alloc(esw->dev, + MLX5_FLOW_NAMESPACE_FDB, + pre_mod_acts.num_actions, + pre_mod_acts.actions); + if (IS_ERR(mod_hdr)) { + err = PTR_ERR(mod_hdr); + ct_dbg("Failed to create pre ct mod hdr"); + goto err_mapping; + } + pre_ct_attr->modify_hdr = mod_hdr; + + /* Post ct rule matches on fte_id and executes original rule's + * tc rule action + */ + mlx5e_tc_match_to_reg_match(&post_ct_spec, FTEID_TO_REG, + fte_id, MLX5_FTE_ID_MASK); + + /* Put post_ct rule on post_ct fdb */ + ct_flow->post_ct_attr.chain = 0; + ct_flow->post_ct_attr.prio = 0; + ct_flow->post_ct_attr.fdb = ct_priv->post_ct; + + ct_flow->post_ct_attr.inner_match_level = MLX5_MATCH_NONE; + ct_flow->post_ct_attr.outer_match_level = MLX5_MATCH_NONE; + ct_flow->post_ct_attr.action &= ~(MLX5_FLOW_CONTEXT_ACTION_DECAP); + rule = mlx5_eswitch_add_offloaded_rule(esw, &post_ct_spec, + &ct_flow->post_ct_attr); + ct_flow->post_ct_rule = rule; + if (IS_ERR(ct_flow->post_ct_rule)) { + err = PTR_ERR(ct_flow->post_ct_rule); + ct_dbg("Failed to add post ct rule"); + goto err_insert_post_ct; + } + + /* Change original rule point to ct table */ + pre_ct_attr->dest_chain = 0; + pre_ct_attr->dest_ft = nat ? 
ct_priv->ct_nat : ct_priv->ct; + ct_flow->pre_ct_rule = mlx5_eswitch_add_offloaded_rule(esw, + orig_spec, + pre_ct_attr); + if (IS_ERR(ct_flow->pre_ct_rule)) { + err = PTR_ERR(ct_flow->pre_ct_rule); + ct_dbg("Failed to add pre ct rule"); + goto err_insert_orig; + } + + attr->ct_attr.ct_flow = ct_flow; + *flow_rule = ct_flow->post_ct_rule; + dealloc_mod_hdr_actions(&pre_mod_acts); + + return 0; + +err_insert_orig: + mlx5_eswitch_del_offloaded_rule(ct_priv->esw, ct_flow->post_ct_rule, + &ct_flow->post_ct_attr); +err_insert_post_ct: + mlx5_modify_header_dealloc(priv->mdev, pre_ct_attr->modify_hdr); +err_mapping: + dealloc_mod_hdr_actions(&pre_mod_acts); + mlx5_esw_chains_put_chain_mapping(esw, ct_flow->chain_mapping); +err_get_chain: + idr_remove(&ct_priv->fte_ids, fte_id); +err_idr: + kfree(ct_flow); + netdev_warn(priv->netdev, "Failed to offload ct flow, err %d\n", err); + return err; +} + +struct mlx5_flow_handle * +mlx5_tc_ct_flow_offload(struct mlx5e_priv *priv, + struct mlx5e_tc_flow *flow, + struct mlx5_flow_spec *spec, + struct mlx5_esw_flow_attr *attr) +{ + struct mlx5_tc_ct_priv *ct_priv = mlx5_tc_ct_get_ct_priv(priv); + struct mlx5_flow_handle *rule; + int err; + + if (!ct_priv) + return ERR_PTR(-EOPNOTSUPP); + + mutex_lock(&ct_priv->control_lock); + err = __mlx5_tc_ct_flow_offload(priv, flow, spec, attr, &rule); + mutex_unlock(&ct_priv->control_lock); + if (err) + return ERR_PTR(err); + + return rule; +} + +static void +__mlx5_tc_ct_delete_flow(struct mlx5_tc_ct_priv *ct_priv, + struct mlx5_ct_flow *ct_flow) +{ + struct mlx5_esw_flow_attr *pre_ct_attr = &ct_flow->pre_ct_attr; + struct mlx5_eswitch *esw = ct_priv->esw; + + mlx5_eswitch_del_offloaded_rule(esw, ct_flow->pre_ct_rule, + pre_ct_attr); + mlx5_modify_header_dealloc(esw->dev, pre_ct_attr->modify_hdr); + mlx5_eswitch_del_offloaded_rule(esw, ct_flow->post_ct_rule, + &ct_flow->post_ct_attr); + mlx5_esw_chains_put_chain_mapping(esw, ct_flow->chain_mapping); + idr_remove(&ct_priv->fte_ids, ct_flow->fte_id); + kfree(ct_flow); +} + +void +mlx5_tc_ct_delete_flow(struct mlx5e_priv *priv, struct mlx5e_tc_flow *flow, + struct mlx5_esw_flow_attr *attr) +{ + struct mlx5_tc_ct_priv *ct_priv = mlx5_tc_ct_get_ct_priv(priv); + struct mlx5_ct_flow *ct_flow = attr->ct_attr.ct_flow; + + /* We are called on error to clean up stuff from parsing + * but we don't have anything for now + */ + if (!ct_flow) + return; + + mutex_lock(&ct_priv->control_lock); + __mlx5_tc_ct_delete_flow(ct_priv, ct_flow); + mutex_unlock(&ct_priv->control_lock); +} + +static int +mlx5_tc_ct_init_check_support(struct mlx5_eswitch *esw, + const char **err_msg) +{ +#if !IS_ENABLED(CONFIG_NET_TC_SKB_EXT) + /* cannot restore chain ID on HW miss */ + + *err_msg = "tc skb extension missing"; + return -EOPNOTSUPP; +#endif + + if (!MLX5_CAP_ESW_FLOWTABLE_FDB(esw->dev, ignore_flow_level)) { + *err_msg = "firmware level support is missing"; + return -EOPNOTSUPP; + } + + if (!mlx5_eswitch_vlan_actions_supported(esw->dev, 1)) { + /* vlan workaround should be avoided for multi chain rules. + * This is just a sanity check as pop vlan action should + * be supported by any FW that supports ignore_flow_level + */ + + *err_msg = "firmware vlan actions support is missing"; + return -EOPNOTSUPP; + } + + if (!MLX5_CAP_ESW_FLOWTABLE(esw->dev, + fdb_modify_header_fwd_to_table)) { + /* CT always writes to registers which are mod header actions. 
+ * Therefore, mod header and goto are both required + */ + + *err_msg = "firmware fwd and modify support is missing"; + return -EOPNOTSUPP; + } + + if (!mlx5_eswitch_reg_c1_loopback_enabled(esw)) { + *err_msg = "register loopback isn't supported"; + return -EOPNOTSUPP; + } + + return 0; +} + +static void +mlx5_tc_ct_init_err(struct mlx5e_rep_priv *rpriv, const char *msg, int err) +{ + if (msg) + netdev_warn(rpriv->netdev, + "tc ct offload not supported, %s, err: %d\n", + msg, err); + else + netdev_warn(rpriv->netdev, + "tc ct offload not supported, err: %d\n", + err); +} + +int +mlx5_tc_ct_init(struct mlx5_rep_uplink_priv *uplink_priv) +{ + struct mlx5_tc_ct_priv *ct_priv; + struct mlx5e_rep_priv *rpriv; + struct mlx5_eswitch *esw; + struct mlx5e_priv *priv; + const char *msg; + int err; + + rpriv = container_of(uplink_priv, struct mlx5e_rep_priv, uplink_priv); + priv = netdev_priv(rpriv->netdev); + esw = priv->mdev->priv.eswitch; + + err = mlx5_tc_ct_init_check_support(esw, &msg); + if (err) { + mlx5_tc_ct_init_err(rpriv, msg, err); + goto err_support; + } + + ct_priv = kzalloc(sizeof(*ct_priv), GFP_KERNEL); + if (!ct_priv) { + mlx5_tc_ct_init_err(rpriv, NULL, -ENOMEM); + goto err_alloc; + } + + ct_priv->esw = esw; + ct_priv->netdev = rpriv->netdev; + ct_priv->ct = mlx5_esw_chains_create_global_table(esw); + if (IS_ERR(ct_priv->ct)) { + err = PTR_ERR(ct_priv->ct); + mlx5_tc_ct_init_err(rpriv, "failed to create ct table", err); + goto err_ct_tbl; + } + + ct_priv->ct_nat = mlx5_esw_chains_create_global_table(esw); + if (IS_ERR(ct_priv->ct_nat)) { + err = PTR_ERR(ct_priv->ct_nat); + mlx5_tc_ct_init_err(rpriv, "failed to create ct nat table", + err); + goto err_ct_nat_tbl; + } + + ct_priv->post_ct = mlx5_esw_chains_create_global_table(esw); + if (IS_ERR(ct_priv->post_ct)) { + err = PTR_ERR(ct_priv->post_ct); + mlx5_tc_ct_init_err(rpriv, "failed to create post ct table", + err); + goto err_post_ct_tbl; + } + + idr_init(&ct_priv->fte_ids); + mutex_init(&ct_priv->control_lock); + + /* Done, set ct_priv to know it is initialized */ + uplink_priv->ct_priv = ct_priv; + + return 0; + +err_post_ct_tbl: + mlx5_esw_chains_destroy_global_table(esw, ct_priv->ct_nat); +err_ct_nat_tbl: + mlx5_esw_chains_destroy_global_table(esw, ct_priv->ct); +err_ct_tbl: + kfree(ct_priv); +err_alloc: +err_support: + + return 0; +} + +void +mlx5_tc_ct_clean(struct mlx5_rep_uplink_priv *uplink_priv) +{ + struct mlx5_tc_ct_priv *ct_priv = uplink_priv->ct_priv; + + if (!ct_priv) + return; + + mlx5_esw_chains_destroy_global_table(ct_priv->esw, ct_priv->post_ct); + mlx5_esw_chains_destroy_global_table(ct_priv->esw, ct_priv->ct_nat); + mlx5_esw_chains_destroy_global_table(ct_priv->esw, ct_priv->ct); + + mutex_destroy(&ct_priv->control_lock); + idr_destroy(&ct_priv->fte_ids); + kfree(ct_priv); + + uplink_priv->ct_priv = NULL; +} diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.h b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.h new file mode 100644 index 0000000..3a84216 --- /dev/null +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.h @@ -0,0 +1,140 @@ +/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */ +/* Copyright (c) 2018 Mellanox Technologies.
*/ + +#ifndef __MLX5_EN_TC_CT_H__ +#define __MLX5_EN_TC_CT_H__ + +#include <net/netfilter/nf_flow_table.h> +#include <net/flow_offload.h> +#include <linux/mlx5/fs.h> + +struct mlx5_esw_flow_attr; +struct mlx5_rep_uplink_priv; +struct mlx5e_tc_flow; +struct mlx5e_priv; + +struct mlx5_ct_flow; + +struct mlx5_ct_attr { + u16 zone; + u16 ct_action; + struct mlx5_ct_flow *ct_flow; +}; + +#define zone_to_reg_ct {\ + .mfield = MLX5_ACTION_IN_FIELD_METADATA_REG_C_2,\ + .moffset = 0,\ + .mlen = 2,\ + .soffset = MLX5_BYTE_OFF(fte_match_param,\ + misc_parameters_2.metadata_reg_c_2) + 2,\ +} + +#define ctstate_to_reg_ct {\ + .mfield = MLX5_ACTION_IN_FIELD_METADATA_REG_C_2,\ + .moffset = 2,\ + .mlen = 2,\ + .soffset = MLX5_BYTE_OFF(fte_match_param,\ + misc_parameters_2.metadata_reg_c_2),\ +} + +#define mark_to_reg_ct {\ + .mfield = MLX5_ACTION_IN_FIELD_METADATA_REG_C_3,\ + .moffset = 0,\ + .mlen = 4,\ + .soffset = MLX5_BYTE_OFF(fte_match_param,\ + misc_parameters_2.metadata_reg_c_3),\ +} + +#define labels_to_reg_ct {\ + .mfield = MLX5_ACTION_IN_FIELD_METADATA_REG_C_4,\ + .moffset = 0,\ + .mlen = 4,\ + .soffset = MLX5_BYTE_OFF(fte_match_param,\ + misc_parameters_2.metadata_reg_c_4),\ +} + +#define fteid_to_reg_ct {\ + .mfield = MLX5_ACTION_IN_FIELD_METADATA_REG_C_5,\ + .moffset = 0,\ + .mlen = 4,\ + .soffset = MLX5_BYTE_OFF(fte_match_param,\ + misc_parameters_2.metadata_reg_c_5),\ +} + +#if IS_ENABLED(CONFIG_MLX5_TC_CT) + +int +mlx5_tc_ct_init(struct mlx5_rep_uplink_priv *uplink_priv); +void +mlx5_tc_ct_clean(struct mlx5_rep_uplink_priv *uplink_priv); + +int +mlx5_tc_ct_parse_match(struct mlx5e_priv *priv, + struct mlx5_flow_spec *spec, + struct flow_cls_offload *f, + struct netlink_ext_ack *extack); +int +mlx5_tc_ct_parse_action(struct mlx5e_priv *priv, + struct mlx5_esw_flow_attr *attr, + const struct flow_action_entry *act, + struct netlink_ext_ack *extack); + +struct mlx5_flow_handle * +mlx5_tc_ct_flow_offload(struct mlx5e_priv *priv, + struct mlx5e_tc_flow *flow, + struct mlx5_flow_spec *spec, + struct mlx5_esw_flow_attr *attr); +void +mlx5_tc_ct_delete_flow(struct mlx5e_priv *priv, + struct mlx5e_tc_flow *flow, + struct mlx5_esw_flow_attr *attr); + +#else /* CONFIG_MLX5_TC_CT */ + +static inline int +mlx5_tc_ct_init(struct mlx5_rep_uplink_priv *uplink_priv) +{ + return 0; +} + +static inline void +mlx5_tc_ct_clean(struct mlx5_rep_uplink_priv *uplink_priv) +{ +} + +static inline int +mlx5_tc_ct_parse_match(struct mlx5e_priv *priv, + struct mlx5_flow_spec *spec, + struct flow_cls_offload *f, + struct netlink_ext_ack *extack) +{ + return -EOPNOTSUPP; +} + +static inline int +mlx5_tc_ct_parse_action(struct mlx5e_priv *priv, + struct mlx5_esw_flow_attr *attr, + const struct flow_action_entry *act, + struct netlink_ext_ack *extack) +{ + return -EOPNOTSUPP; +} + +static inline struct mlx5_flow_handle * +mlx5_tc_ct_flow_offload(struct mlx5e_priv *priv, + struct mlx5e_tc_flow *flow, + struct mlx5_flow_spec *spec, + struct mlx5_esw_flow_attr *attr) +{ + return ERR_PTR(-EOPNOTSUPP); +} + +static inline void +mlx5_tc_ct_delete_flow(struct mlx5e_priv *priv, + struct mlx5e_tc_flow *flow, + struct mlx5_esw_flow_attr *attr) +{ +} + +#endif /* !IS_ENABLED(CONFIG_MLX5_TC_CT) */ +#endif /* __MLX5_EN_TC_CT_H__ */ diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.h b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.h index 2bbdbdc..6a23379 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.h @@ -55,6 +55,7 @@ struct mlx5e_neigh_update_table { unsigned long min_interval; /* jiffies */ }; +struct mlx5_tc_ct_priv;
struct mlx5_rep_uplink_priv { /* Filters DB - instantiated by the uplink representor and shared by * the uplink's VFs @@ -86,6 +87,8 @@ struct mlx5_rep_uplink_priv { struct mapping_ctx *tunnel_mapping; /* maps tun_enc_opts to a unique id*/ struct mapping_ctx *tunnel_enc_opts_mapping; + + struct mlx5_tc_ct_priv *ct_priv; }; struct mlx5e_rep_priv { diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c index 4b04992..1f6a306 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c @@ -56,6 +56,7 @@ #include "en/port.h" #include "en/tc_tun.h" #include "en/mapping.h" +#include "en/tc_ct.h" #include "lib/devcom.h" #include "lib/geneve.h" #include "diag/en_tc_tracepoint.h" @@ -87,6 +88,7 @@ enum { MLX5E_TC_FLOW_FLAG_DUP = MLX5E_TC_FLOW_BASE + 4, MLX5E_TC_FLOW_FLAG_NOT_READY = MLX5E_TC_FLOW_BASE + 5, MLX5E_TC_FLOW_FLAG_DELETED = MLX5E_TC_FLOW_BASE + 6, + MLX5E_TC_FLOW_FLAG_CT = MLX5E_TC_FLOW_BASE + 7, }; #define MLX5E_TC_MAX_SPLITS 1 @@ -193,6 +195,11 @@ struct mlx5e_tc_attr_to_reg_mapping mlx5e_tc_attr_to_reg_mappings[] = { .soffset = MLX5_BYTE_OFF(fte_match_param, misc_parameters_2.metadata_reg_c_1), }, + [ZONE_TO_REG] = zone_to_reg_ct, + [CTSTATE_TO_REG] = ctstate_to_reg_ct, + [MARK_TO_REG] = mark_to_reg_ct, + [LABELS_TO_REG] = labels_to_reg_ct, + [FTEID_TO_REG] = fteid_to_reg_ct, }; static void mlx5e_put_flow_tunnel_id(struct mlx5e_tc_flow *flow); @@ -1144,6 +1151,10 @@ static int mlx5e_attach_encap(struct mlx5e_priv *priv, struct mlx5_esw_flow_attr *attr) { struct mlx5_flow_handle *rule; + struct mlx5e_tc_mod_hdr_acts; + + if (flow_flag_test(flow, CT)) + return mlx5_tc_ct_flow_offload(flow->priv, flow, spec, attr); rule = mlx5_eswitch_add_offloaded_rule(esw, spec, attr); if (IS_ERR(rule)) @@ -1163,10 +1174,15 @@ static int mlx5e_attach_encap(struct mlx5e_priv *priv, static void mlx5e_tc_unoffload_fdb_rules(struct mlx5_eswitch *esw, struct mlx5e_tc_flow *flow, - struct mlx5_esw_flow_attr *attr) + struct mlx5_esw_flow_attr *attr) { flow_flag_clear(flow, OFFLOADED); + if (flow_flag_test(flow, CT)) { + mlx5_tc_ct_delete_flow(flow->priv, flow, attr); + return; + } + if (attr->split_count) mlx5_eswitch_del_fwd_rule(esw, flow->rule[1], attr); @@ -1938,6 +1954,11 @@ static void mlx5e_put_flow_tunnel_id(struct mlx5e_tc_flow *flow) enc_opts_id); } +u32 mlx5e_tc_get_flow_tun_id(struct mlx5e_tc_flow *flow) +{ + return flow->tunnel_id; +} + static int parse_tunnel_attr(struct mlx5e_priv *priv, struct mlx5e_tc_flow *flow, struct mlx5_flow_spec *spec, @@ -2103,6 +2124,7 @@ static int __parse_cls_flower(struct mlx5e_priv *priv, BIT(FLOW_DISSECTOR_KEY_ENC_CONTROL) | BIT(FLOW_DISSECTOR_KEY_TCP) | BIT(FLOW_DISSECTOR_KEY_IP) | + BIT(FLOW_DISSECTOR_KEY_CT) | BIT(FLOW_DISSECTOR_KEY_ENC_IP) | BIT(FLOW_DISSECTOR_KEY_ENC_OPTS))) { NL_SET_ERR_MSG_MOD(extack, "Unsupported key"); @@ -2913,7 +2935,9 @@ struct ipv6_hoplimit_word { __u8 hop_limit; }; -static bool is_action_keys_supported(const struct flow_action_entry *act) +static int is_action_keys_supported(const struct flow_action_entry *act, + bool ct_flow, bool *modify_ip_header, + struct netlink_ext_ack *extack) { u32 mask, offset; u8 htype; @@ -2932,7 +2956,13 @@ static bool is_action_keys_supported(const struct flow_action_entry *act) if (offset != offsetof(struct iphdr, ttl) || ttl_word->protocol || ttl_word->check) { - return true; + *modify_ip_header = true; + } + + if (ct_flow && offset >= offsetof(struct iphdr, saddr)) { + 
NL_SET_ERR_MSG_MOD(extack, + "can't offload re-write of ipv4 address with action ct"); + return -EOPNOTSUPP; } } else if (htype == FLOW_ACT_MANGLE_HDR_TYPE_IP6) { struct ipv6_hoplimit_word *hoplimit_word = @@ -2941,15 +2971,27 @@ static bool is_action_keys_supported(const struct flow_action_entry *act) if (offset != offsetof(struct ipv6hdr, payload_len) || hoplimit_word->payload_len || hoplimit_word->nexthdr) { - return true; + *modify_ip_header = true; + } + + if (ct_flow && offset >= offsetof(struct ipv6hdr, saddr)) { + NL_SET_ERR_MSG_MOD(extack, + "can't offload re-write of ipv6 address with action ct"); + return -EOPNOTSUPP; } + } else if (ct_flow && (htype == FLOW_ACT_MANGLE_HDR_TYPE_TCP || + htype == FLOW_ACT_MANGLE_HDR_TYPE_UDP)) { + NL_SET_ERR_MSG_MOD(extack, + "can't offload re-write of transport header ports with action ct"); + return -EOPNOTSUPP; } - return false; + + return 0; } static bool modify_header_match_supported(struct mlx5_flow_spec *spec, struct flow_action *flow_action, - u32 actions, + u32 actions, bool ct_flow, struct netlink_ext_ack *extack) { const struct flow_action_entry *act; @@ -2957,7 +2999,7 @@ static bool modify_header_match_supported(struct mlx5_flow_spec *spec, void *headers_v; u16 ethertype; u8 ip_proto; - int i; + int i, err; headers_v = get_match_headers_value(actions, spec); ethertype = MLX5_GET(fte_match_set_lyr_2_4, headers_v, ethertype); @@ -2972,10 +3014,10 @@ static bool modify_header_match_supported(struct mlx5_flow_spec *spec, act->id != FLOW_ACTION_ADD) continue; - if (is_action_keys_supported(act)) { - modify_ip_header = true; - break; - } + err = is_action_keys_supported(act, ct_flow, + &modify_ip_header, extack); + if (err) + return err; } ip_proto = MLX5_GET(fte_match_set_lyr_2_4, headers_v, ip_protocol); @@ -2998,13 +3040,24 @@ static bool actions_match_supported(struct mlx5e_priv *priv, struct netlink_ext_ack *extack) { struct net_device *filter_dev = parse_attr->filter_dev; - bool drop_action, pop_action; + bool drop_action, pop_action, ct_flow; u32 actions; - if (mlx5e_is_eswitch_flow(flow)) + ct_flow = flow_flag_test(flow, CT); + if (mlx5e_is_eswitch_flow(flow)) { actions = flow->esw_attr->action; - else + + if (flow->esw_attr->split_count && ct_flow) { + /* All registers used by ct are cleared when using + * split rules. 
+ */ + NL_SET_ERR_MSG_MOD(extack, + "Can't offload mirroring with action ct"); + return -EOPNOTSUPP; + } + } else { actions = flow->nic_attr->action; + } drop_action = actions & MLX5_FLOW_CONTEXT_ACTION_DROP; pop_action = actions & MLX5_FLOW_CONTEXT_ACTION_VLAN_POP; @@ -3021,7 +3074,7 @@ static bool actions_match_supported(struct mlx5e_priv *priv, if (actions & MLX5_FLOW_CONTEXT_ACTION_MOD_HDR) return modify_header_match_supported(&parse_attr->spec, flow_action, actions, - extack); + ct_flow, extack); return true; } @@ -3826,6 +3879,13 @@ static int parse_tc_fdb_actions(struct mlx5e_priv *priv, action |= MLX5_FLOW_CONTEXT_ACTION_COUNT; attr->dest_chain = act->chain_index; break; + case FLOW_ACTION_CT: + err = mlx5_tc_ct_parse_action(priv, attr, act, extack); + if (err) + return err; + + flow_flag_set(flow, CT); + break; default: NL_SET_ERR_MSG_MOD(extack, "The offload action is not supported"); return -EOPNOTSUPP; @@ -4066,6 +4126,10 @@ static bool is_peer_flow_needed(struct mlx5e_tc_flow *flow) if (err) goto err_free; + err = mlx5_tc_ct_parse_match(priv, &parse_attr->spec, f, extack); + if (err) + goto err_free; + err = mlx5e_tc_add_fdb_flow(priv, flow, extack); complete_all(&flow->init_done); if (err) { @@ -4350,7 +4414,7 @@ int mlx5e_stats_flower(struct net_device *dev, struct mlx5e_priv *priv, goto errout; } - if (mlx5e_is_offloaded_flow(flow)) { + if (mlx5e_is_offloaded_flow(flow) || flow_flag_test(flow, CT)) { counter = mlx5e_tc_get_counter(flow); if (!counter) goto errout; @@ -4622,6 +4686,10 @@ int mlx5e_tc_esw_init(struct rhashtable *tc_ht) uplink_priv = container_of(tc_ht, struct mlx5_rep_uplink_priv, tc_ht); priv = container_of(uplink_priv, struct mlx5e_rep_priv, uplink_priv); + err = mlx5_tc_ct_init(uplink_priv); + if (err) + goto err_ct; + mapping = mapping_create(sizeof(struct tunnel_match_key), TUNNEL_INFO_BITS_MASK, true); if (IS_ERR(mapping)) { @@ -4648,6 +4716,8 @@ int mlx5e_tc_esw_init(struct rhashtable *tc_ht) err_enc_opts_mapping: mapping_destroy(uplink_priv->tunnel_mapping); err_tun_mapping: + mlx5_tc_ct_clean(uplink_priv); +err_ct: netdev_warn(priv->netdev, "Failed to initialize tc (eswitch), err: %d", err); return err; @@ -4662,6 +4732,8 @@ void mlx5e_tc_esw_cleanup(struct rhashtable *tc_ht) uplink_priv = container_of(tc_ht, struct mlx5_rep_uplink_priv, tc_ht); mapping_destroy(uplink_priv->tunnel_enc_opts_mapping); mapping_destroy(uplink_priv->tunnel_mapping); + + mlx5_tc_ct_clean(uplink_priv); } int mlx5e_tc_num_filters(struct mlx5e_priv *priv, unsigned long flags) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.h b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.h index 21cbde4..31c9e81 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.h @@ -94,6 +94,11 @@ void mlx5e_tc_encap_flows_del(struct mlx5e_priv *priv, enum mlx5e_tc_attr_to_reg { CHAIN_TO_REG, TUNNEL_TO_REG, + CTSTATE_TO_REG, + ZONE_TO_REG, + MARK_TO_REG, + LABELS_TO_REG, + FTEID_TO_REG, }; struct mlx5e_tc_attr_to_reg_mapping { @@ -139,6 +144,9 @@ int alloc_mod_hdr_actions(struct mlx5_core_dev *mdev, struct mlx5e_tc_mod_hdr_acts *mod_hdr_acts); void dealloc_mod_hdr_actions(struct mlx5e_tc_mod_hdr_acts *mod_hdr_acts); +struct mlx5e_tc_flow; +u32 mlx5e_tc_get_flow_tun_id(struct mlx5e_tc_flow *flow); + #else /* CONFIG_MLX5_ESWITCH */ static inline int mlx5e_tc_nic_init(struct mlx5e_priv *priv) { return 0; } static inline void mlx5e_tc_nic_cleanup(struct mlx5e_priv *priv) {} diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h 
b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h index 6254bb6..2e0417d 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h @@ -42,6 +42,7 @@ #include #include #include "lib/mpfs.h" +#include "en/tc_ct.h" #define FDB_TC_MAX_CHAIN 3 #define FDB_FT_CHAIN (FDB_TC_MAX_CHAIN + 1) @@ -424,6 +425,7 @@ struct mlx5_esw_flow_attr { u32 flags; struct mlx5_flow_table *fdb; struct mlx5_flow_table *dest_ft; + struct mlx5_ct_attr ct_attr; struct mlx5e_tc_flow_parse_attr *parse_attr; }; From patchwork Thu Mar 12 10:23:17 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Paul Blakey X-Patchwork-Id: 222654 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-9.8 required=3.0 tests=HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, UNPARSEABLE_RELAY,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D72FFC4CECE for ; Thu, 12 Mar 2020 10:23:36 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id AF6D420674 for ; Thu, 12 Mar 2020 10:23:36 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726928AbgCLKXf (ORCPT ); Thu, 12 Mar 2020 06:23:35 -0400 Received: from mail-il-dmz.mellanox.com ([193.47.165.129]:56566 "EHLO mellanox.co.il" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1726859AbgCLKXe (ORCPT ); Thu, 12 Mar 2020 06:23:34 -0400 Received: from Internal Mail-Server by MTLPINE2 (envelope-from paulb@mellanox.com) with ESMTPS (AES256-SHA encrypted); 12 Mar 2020 12:23:29 +0200 Received: from reg-r-vrt-019-120.mtr.labs.mlnx (reg-r-vrt-019-120.mtr.labs.mlnx [10.213.19.120]) by labmailer.mlnx (8.13.8/8.13.8) with ESMTP id 02CANSTn017875; Thu, 12 Mar 2020 12:23:29 +0200 From: Paul Blakey To: Paul Blakey , Saeed Mahameed , Oz Shlomo , Jakub Kicinski , Vlad Buslov , David Miller , "netdev@vger.kernel.org" , Jiri Pirko , Roi Dayan Subject: [PATCH net-next ct-offload v4 15/15] net/mlx5e: CT: Support clear action Date: Thu, 12 Mar 2020 12:23:17 +0200 Message-Id: <1584008597-15875-16-git-send-email-paulb@mellanox.com> X-Mailer: git-send-email 1.8.4.3 In-Reply-To: <1584008597-15875-1-git-send-email-paulb@mellanox.com> References: <1584008597-15875-1-git-send-email-paulb@mellanox.com> Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org Clear action, as with software, removes all ct metadata from the packet. 
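Since all ct metadata is carried in registers that are written through mod header actions, clearing reduces to a single modify-header action that zeroes those registers. A minimal sketch of the idea, assuming the mlx5_tc_ct_entry_set_registers() helper this patch calls below (the wrapper name example_ct_clear_mod_hdr is illustrative only and not part of the series):

static int
example_ct_clear_mod_hdr(struct mlx5_tc_ct_priv *ct_priv,
			 struct mlx5e_tc_mod_hdr_acts *mod_acts)
{
	/* Zero ct_state, mark, label and fte_id so any later match
	 * on ct metadata sees an untracked packet.
	 */
	return mlx5_tc_ct_entry_set_registers(ct_priv, mod_acts, 0, 0, 0, 0);
}

The resulting mod header action is attached directly to the original rule, so the clear case needs no pre/post ct rule pair, chain mapping or fte_id allocation.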
Signed-off-by: Paul Blakey Reviewed-by: Oz Shlomo --- drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c | 90 ++++++++++++++++++++-- drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.h | 7 +- drivers/net/ethernet/mellanox/mlx5/core/en_tc.c | 10 ++- 3 files changed, 95 insertions(+), 12 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c index c75dc97..956d9dd 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c @@ -1048,12 +1048,79 @@ struct mlx5_ct_entry { return err; } +static int +__mlx5_tc_ct_flow_offload_clear(struct mlx5e_priv *priv, + struct mlx5e_tc_flow *flow, + struct mlx5_flow_spec *orig_spec, + struct mlx5_esw_flow_attr *attr, + struct mlx5e_tc_mod_hdr_acts *mod_acts, + struct mlx5_flow_handle **flow_rule) +{ + struct mlx5_tc_ct_priv *ct_priv = mlx5_tc_ct_get_ct_priv(priv); + struct mlx5_eswitch *esw = ct_priv->esw; + struct mlx5_esw_flow_attr *pre_ct_attr; + struct mlx5_modify_hdr *mod_hdr; + struct mlx5_flow_handle *rule; + struct mlx5_ct_flow *ct_flow; + int err; + + ct_flow = kzalloc(sizeof(*ct_flow), GFP_KERNEL); + if (!ct_flow) + return -ENOMEM; + + /* Base esw attributes on original rule attribute */ + pre_ct_attr = &ct_flow->pre_ct_attr; + memcpy(pre_ct_attr, attr, sizeof(*attr)); + + err = mlx5_tc_ct_entry_set_registers(ct_priv, mod_acts, 0, 0, 0, 0); + if (err) { + ct_dbg("Failed to set register for ct clear"); + goto err_set_registers; + } + + mod_hdr = mlx5_modify_header_alloc(esw->dev, + MLX5_FLOW_NAMESPACE_FDB, + mod_acts->num_actions, + mod_acts->actions); + if (IS_ERR(mod_hdr)) { + err = PTR_ERR(mod_hdr); + ct_dbg("Failed to create ct clear mod hdr"); + goto err_set_registers; + } + + dealloc_mod_hdr_actions(mod_acts); + pre_ct_attr->modify_hdr = mod_hdr; + pre_ct_attr->action |= MLX5_FLOW_CONTEXT_ACTION_MOD_HDR; + + rule = mlx5_eswitch_add_offloaded_rule(esw, orig_spec, pre_ct_attr); + if (IS_ERR(rule)) { + err = PTR_ERR(rule); + ct_dbg("Failed to add ct clear rule"); + goto err_insert; + } + + attr->ct_attr.ct_flow = ct_flow; + ct_flow->pre_ct_rule = rule; + *flow_rule = rule; + + return 0; + +err_insert: + mlx5_modify_header_dealloc(priv->mdev, mod_hdr); +err_set_registers: + netdev_warn(priv->netdev, + "Failed to offload ct clear flow, err %d\n", err); + return err; +} + struct mlx5_flow_handle * mlx5_tc_ct_flow_offload(struct mlx5e_priv *priv, struct mlx5e_tc_flow *flow, struct mlx5_flow_spec *spec, - struct mlx5_esw_flow_attr *attr) + struct mlx5_esw_flow_attr *attr, + struct mlx5e_tc_mod_hdr_acts *mod_hdr_acts) { + bool clear_action = attr->ct_attr.ct_action & TCA_CT_ACT_CLEAR; struct mlx5_tc_ct_priv *ct_priv = mlx5_tc_ct_get_ct_priv(priv); struct mlx5_flow_handle *rule; int err; @@ -1062,7 +1129,12 @@ struct mlx5_flow_handle * return ERR_PTR(-EOPNOTSUPP); mutex_lock(&ct_priv->control_lock); - err = __mlx5_tc_ct_flow_offload(priv, flow, spec, attr, &rule); + if (clear_action) + err = __mlx5_tc_ct_flow_offload_clear(priv, flow, spec, attr, + mod_hdr_acts, &rule); + else + err = __mlx5_tc_ct_flow_offload(priv, flow, spec, attr, + &rule); mutex_unlock(&ct_priv->control_lock); if (err) return ERR_PTR(err); @@ -1080,11 +1152,15 @@ struct mlx5_flow_handle * mlx5_eswitch_del_offloaded_rule(esw, ct_flow->pre_ct_rule, pre_ct_attr); mlx5_modify_header_dealloc(esw->dev, pre_ct_attr->modify_hdr); - mlx5_eswitch_del_offloaded_rule(esw, ct_flow->post_ct_rule, - &ct_flow->post_ct_attr); -
mlx5_esw_chains_put_chain_mapping(esw, ct_flow->chain_mapping); - idr_remove(&ct_priv->fte_ids, ct_flow->fte_id); - mlx5_tc_ct_del_ft_cb(ct_priv, ct_flow->ft); + + if (ct_flow->post_ct_rule) { + mlx5_eswitch_del_offloaded_rule(esw, ct_flow->post_ct_rule, + &ct_flow->post_ct_attr); + mlx5_esw_chains_put_chain_mapping(esw, ct_flow->chain_mapping); + idr_remove(&ct_priv->fte_ids, ct_flow->fte_id); + mlx5_tc_ct_del_ft_cb(ct_priv, ct_flow->ft); + } + kfree(ct_flow); } diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.h b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.h index 464c865..6b2c893 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.h @@ -9,6 +9,7 @@ #include struct mlx5_esw_flow_attr; +struct mlx5e_tc_mod_hdr_acts; struct mlx5_rep_uplink_priv; struct mlx5e_tc_flow; struct mlx5e_priv; @@ -97,7 +98,8 @@ struct mlx5_flow_handle * mlx5_tc_ct_flow_offload(struct mlx5e_priv *priv, struct mlx5e_tc_flow *flow, struct mlx5_flow_spec *spec, - struct mlx5_esw_flow_attr *attr); + struct mlx5_esw_flow_attr *attr, + struct mlx5e_tc_mod_hdr_acts *mod_hdr_acts); void mlx5_tc_ct_delete_flow(struct mlx5e_priv *priv, struct mlx5e_tc_flow *flow, @@ -142,7 +144,8 @@ struct mlx5_flow_handle * mlx5_tc_ct_flow_offload(struct mlx5e_priv *priv, struct mlx5e_tc_flow *flow, struct mlx5_flow_spec *spec, - struct mlx5_esw_flow_attr *attr) + struct mlx5_esw_flow_attr *attr, + struct mlx5e_tc_mod_hdr_acts *mod_hdr_acts) { return ERR_PTR(-EOPNOTSUPP); } diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c index 5497d5e..044891a 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c @@ -1151,11 +1151,15 @@ static int mlx5e_attach_encap(struct mlx5e_priv *priv, struct mlx5_flow_spec *spec, struct mlx5_esw_flow_attr *attr) { + struct mlx5e_tc_mod_hdr_acts *mod_hdr_acts; struct mlx5_flow_handle *rule; - struct mlx5e_tc_mod_hdr_acts; - if (flow_flag_test(flow, CT)) - return mlx5_tc_ct_flow_offload(flow->priv, flow, spec, attr); + if (flow_flag_test(flow, CT)) { + mod_hdr_acts = &attr->parse_attr->mod_hdr_acts; + + return mlx5_tc_ct_flow_offload(flow->priv, flow, spec, attr, + mod_hdr_acts); + } rule = mlx5_eswitch_add_offloaded_rule(esw, spec, attr); if (IS_ERR(rule))