Message ID | 20201006083355.121018-1-nusiddiq@redhat.com |
---|---|
State | New |
Headers | show |
Series | [net-next] net: openvswitch: Add support to lookup invalid packet in ct action. | expand |
nusiddiq@redhat.com <nusiddiq@redhat.com> wrote: > From: Numan Siddique <nusiddiq@redhat.com> > > For a tcp packet which is part of an existing committed connection, > nf_conntrack_in() will return err and set skb->_nfct to NULL if it is > out of tcp window. ct action for this packet will set the ct_state > to +inv which is as expected. This is because from conntrack p.o.v., such TCP packet is NOT part of the existing connection. For example, because it is considered part of a previous incarnation of the same connection. > But a controller cannot add an OVS flow as > > table=21,priority=100,ct_state=+inv, actions=drop > > to drop such packets. That is because when ct action is executed on other > packets which are not part of existing committed connections, ct_state > can be set to invalid. Few such cases are: > - ICMP reply packets. Can you elaborate? Echo reply should not be invalid. Conntrack should mark it as established (unless such echo reply came out of the blue). > - TCP SYN/ACK packets during connection establishment. SYN/ACK should also be established state. INVALID should only be matched for packets that were never seen by conntrack, or that are deemed out of date / corrupted. > To distinguish between an invalid packet part of committed connection > and others, this patch introduces as a new ct attribute > OVS_CT_ATTR_LOOKUP_INV. If this is set in the ct action (without commit), > it tries to find the ct entry and if present, sets the ct_state to > +inv,+trk and also sets the mark and labels associated with the > connection. > > With this, a controller can add flows like > > .... > .... > table=20,ip, action=ct(table=21, lookup_invalid) > table=21,priority=100,ct_state=+inv+trk,ct_label=0x2/0x2 actions=drop > table=21,ip, actions=resubmit(,22) > .... > .... What exactly is the feature/problem that needs to be solved? I suspect this would help me to provide better feedback than the semi-random comments below .... :-) My only problem with how conntrack does things ATM is that the ruleset cannot distinguish: 1. packet was not even seen by conntrack 2. packet matches existing connection, but is "bad", for example: - contradicting tcp flags - out of window - invalid checksum There are a few sysctls to modify default behaviour, e.g. relax window checks, or ignore/skip checksum validation. The other problem i see (solveable for sure by yet-another-sysctl but i see that as last-resort) is usual compatibility problem: ct state invalid drop ct mark gt 0 accept If standard netfilter conntrack were to set skb->_nfct e.g. even if state is invalid, we could still make the above work via some internal flag. But if you reverse it, you get different behaviour: ct mark gt 0 accept ct state invalid drop First rule might now accept out-of-window packet even when "be_liberal" sysctl is off.
On Tue, Oct 6, 2020 at 4:46 PM Florian Westphal <fw@strlen.de> wrote: > > nusiddiq@redhat.com <nusiddiq@redhat.com> wrote: > > From: Numan Siddique <nusiddiq@redhat.com> > > > > For a tcp packet which is part of an existing committed connection, > > nf_conntrack_in() will return err and set skb->_nfct to NULL if it is > > out of tcp window. ct action for this packet will set the ct_state > > to +inv which is as expected. > > This is because from conntrack p.o.v., such TCP packet is NOT part of > the existing connection. > > For example, because it is considered part of a previous incarnation > of the same connection. > > > But a controller cannot add an OVS flow as > > > > table=21,priority=100,ct_state=+inv, actions=drop > > > > to drop such packets. That is because when ct action is executed on other > > packets which are not part of existing committed connections, ct_state > > can be set to invalid. Few such cases are: > > - ICMP reply packets. > > Can you elaborate? Echo reply should not be invalid. Conntrack should > mark it as established (unless such echo reply came out of the blue). Hi Florian, Thanks for providing the comments. Sorry for not being very clear. Let me brief about the present problem we see in OVN (which is a controller using ovs) When a VM/container sends a packet (in the ingress direction), we don't send all the packets to conntrack. If a packet is destined to an OVN load balancer virtual ip, only then we send the packet to conntrack in the ingress direction and then we do dnat to the backend. Eg. in the ingress direction table=1, match = (ip && ip4.dst == VIP) action = ct(table=2) tablle=2, ct_state=+new+trk && ip4.dst == VIP, action = ct(commit, nat=BACKEND_IP) ... .. However for the egress direction (when the packet is to be delivered to the VM/container), we send all the packets to conntrack and if the ct.est is set, we do undnat before delivering the packet to the VM/container. ... table=40, match = ip, action = ct(table=41) table=41, match = ct_state=+est+trk, action = ct(nat) ... What I mean here is that, since we send all the packets in the egress pipeline to conntrack, we can't add a flow like - match = ct_state=+inv, action = drop. i.e When a VM/container sends an ICMP request packet, it will not be sent to conntrack, but the reply ICMP will be sent to conntrack and it will be marked as invalid. So is the case with TCP, the TCP SYN from the VM is not sent to conntrack, but the SYN/ACK from the server would be sent to conntrack and it will be marked as invalid. > > > - TCP SYN/ACK packets during connection establishment. > > SYN/ACK should also be established state. > INVALID should only be matched for packets that were never seen > by conntrack, or that are deemed out of date / corrupted. > > > To distinguish between an invalid packet part of committed connection > > and others, this patch introduces as a new ct attribute > > OVS_CT_ATTR_LOOKUP_INV. If this is set in the ct action (without commit), > > it tries to find the ct entry and if present, sets the ct_state to > > +inv,+trk and also sets the mark and labels associated with the > > connection. > > > > With this, a controller can add flows like > > > > .... > > .... > > table=20,ip, action=ct(table=21, lookup_invalid) > > table=21,priority=100,ct_state=+inv+trk,ct_label=0x2/0x2 actions=drop > > table=21,ip, actions=resubmit(,22) > > .... > > .... > > What exactly is the feature/problem that needs to be solved? > I suspect this would help me to provide better feedback than the > semi-random comments below .... :-) > > My only problem with how conntrack does things ATM is that the ruleset > cannot distinguish: > > 1. packet was not even seen by conntrack > 2. packet matches existing connection, but is "bad", for example: > - contradicting tcp flags > - out of window > - invalid checksum We want the below to be solved (using OVS flows) : - If the packet is marked as invalid due to (2) which you mentioned above, we would like to read the ct_mark and ct_label fields as the packet is part of existing connection, so that we can add an OVS flow like ct_state=+inv+trk,ct_label=0x2 actions=drop Right now it is not possible. This patch does another lookup if skb->_nfct is NULL after nf_conntrack_in() to check if (2) is the case. If the lookup is successful, it updates the ct flow key with the ct_mark and ct_label. This is made optional using a netlink attribute. I'm not sure if it's possible for nf_conntrack_in() to provide this information for its callers so that the caller can come to know that the state is invalid because of (2). I tested by setting 'be_liberal' sysctl flag and since skb->_nfct was set for (2), OVS datapath module set the ct_state to +est. Thanks Numan > > There are a few sysctls to modify default behaviour, e.g. relax window > checks, or ignore/skip checksum validation. > > The other problem i see (solveable for sure by yet-another-sysctl but i > see that as last-resort) is usual compatibility problem: > > ct state invalid drop > ct mark gt 0 accept > > If standard netfilter conntrack were to set skb->_nfct e.g. even if > state is invalid, we could still make the above work via some internal > flag. > > But if you reverse it, you get different behaviour: > > ct mark gt 0 accept > ct state invalid drop > > First rule might now accept out-of-window packet even when "be_liberal" > sysctl is off. >
On Tue, Oct 6, 2020 at 5:49 PM Numan Siddique <nusiddiq@redhat.com> wrote: > > On Tue, Oct 6, 2020 at 4:46 PM Florian Westphal <fw@strlen.de> wrote: > > > > nusiddiq@redhat.com <nusiddiq@redhat.com> wrote: > > > From: Numan Siddique <nusiddiq@redhat.com> > > > > > > For a tcp packet which is part of an existing committed connection, > > > nf_conntrack_in() will return err and set skb->_nfct to NULL if it is > > > out of tcp window. ct action for this packet will set the ct_state > > > to +inv which is as expected. > > > > This is because from conntrack p.o.v., such TCP packet is NOT part of > > the existing connection. > > > > For example, because it is considered part of a previous incarnation > > of the same connection. > > > > > But a controller cannot add an OVS flow as > > > > > > table=21,priority=100,ct_state=+inv, actions=drop > > > > > > to drop such packets. That is because when ct action is executed on other > > > packets which are not part of existing committed connections, ct_state > > > can be set to invalid. Few such cases are: > > > - ICMP reply packets. > > > > Can you elaborate? Echo reply should not be invalid. Conntrack should > > mark it as established (unless such echo reply came out of the blue). > > Hi Florian, > > Thanks for providing the comments. > > Sorry for not being very clear. > > Let me brief about the present problem we see in OVN (which is a > controller using ovs) > > When a VM/container sends a packet (in the ingress direction), we don't send all > the packets to conntrack. If a packet is destined to an OVN load > balancer virtual ip, > only then we send the packet to conntrack in the ingress direction and > then we do dnat > to the backend. > > Eg. in the ingress direction > > table=1, match = (ip && ip4.dst == VIP) action = ct(table=2) > tablle=2, ct_state=+new+trk && ip4.dst == VIP, action = ct(commit, > nat=BACKEND_IP) > ... > .. > > However for the egress direction (when the packet is to be delivered > to the VM/container), > we send all the packets to conntrack and if the ct.est is set, we do > undnat before delivering > the packet to the VM/container. > ... > table=40, match = ip, action = ct(table=41) > table=41, match = ct_state=+est+trk, action = ct(nat) > ... > > What I mean here is that, since we send all the packets in the egress > pipeline to conntrack, > we can't add a flow like - match = ct_state=+inv, action = drop. > > i.e When a VM/container sends an ICMP request packet, it will not be > sent to conntrack, but > the reply ICMP will be sent to conntrack and it will be marked as invalid. > > So is the case with TCP, the TCP SYN from the VM is not sent to > conntrack, but the SYN/ACK > from the server would be sent to conntrack and it will be marked as invalid. > > > > > > - TCP SYN/ACK packets during connection establishment. > > > > SYN/ACK should also be established state. > > INVALID should only be matched for packets that were never seen > > by conntrack, or that are deemed out of date / corrupted. > > > > > To distinguish between an invalid packet part of committed connection > > > and others, this patch introduces as a new ct attribute > > > OVS_CT_ATTR_LOOKUP_INV. If this is set in the ct action (without commit), > > > it tries to find the ct entry and if present, sets the ct_state to > > > +inv,+trk and also sets the mark and labels associated with the > > > connection. > > > > > > With this, a controller can add flows like > > > > > > .... > > > .... > > > table=20,ip, action=ct(table=21, lookup_invalid) > > > table=21,priority=100,ct_state=+inv+trk,ct_label=0x2/0x2 actions=drop > > > table=21,ip, actions=resubmit(,22) > > > .... > > > .... > > > > What exactly is the feature/problem that needs to be solved? > > I suspect this would help me to provide better feedback than the > > semi-random comments below .... :-) > > > > My only problem with how conntrack does things ATM is that the ruleset > > cannot distinguish: > > > > 1. packet was not even seen by conntrack > > 2. packet matches existing connection, but is "bad", for example: > > - contradicting tcp flags > > - out of window > > - invalid checksum > > We want the below to be solved (using OVS flows) : > - If the packet is marked as invalid due to (2) which you mentioned above, > we would like to read the ct_mark and ct_label fields as the packet is > part of existing connection, so that we can add an OVS flow like > > ct_state=+inv+trk,ct_label=0x2 actions=drop > > Right now it is not possible. I forgot to mention the side effect of it. Since the tcp out of window packet is set as +inv, this packet is delivered to the VM/container without undnat and because of this VM/container resets the connection. Thanks Numan > > This patch does another lookup if skb->_nfct is NULL after > nf_conntrack_in() to check > if (2) is the case. If the lookup is successful, it updates the ct flow > key with the ct_mark and ct_label. This is made optional using a > netlink attribute. > > I'm not sure if it's possible for nf_conntrack_in() to provide this > information for > its callers so that the caller can come to know that the state is > invalid because of (2). > > I tested by setting 'be_liberal' sysctl flag and since skb->_nfct was > set for (2), OVS > datapath module set the ct_state to +est. > > Thanks > Numan > > > > > > There are a few sysctls to modify default behaviour, e.g. relax window > > checks, or ignore/skip checksum validation. > > > > The other problem i see (solveable for sure by yet-another-sysctl but i > > see that as last-resort) is usual compatibility problem: > > > > ct state invalid drop > > ct mark gt 0 accept > > > > If standard netfilter conntrack were to set skb->_nfct e.g. even if > > state is invalid, we could still make the above work via some internal > > flag. > > > > But if you reverse it, you get different behaviour: > > > > ct mark gt 0 accept > > ct state invalid drop > > > > First rule might now accept out-of-window packet even when "be_liberal" > > sysctl is off. > >
Numan Siddique <nusiddiq@redhat.com> wrote: > On Tue, Oct 6, 2020 at 4:46 PM Florian Westphal <fw@strlen.de> wrote: > > > > nusiddiq@redhat.com <nusiddiq@redhat.com> wrote: > > > From: Numan Siddique <nusiddiq@redhat.com> > > > > > > For a tcp packet which is part of an existing committed connection, > > > nf_conntrack_in() will return err and set skb->_nfct to NULL if it is > > > out of tcp window. ct action for this packet will set the ct_state > > > to +inv which is as expected. > > > > This is because from conntrack p.o.v., such TCP packet is NOT part of > > the existing connection. > > > > For example, because it is considered part of a previous incarnation > > of the same connection. > > > > > But a controller cannot add an OVS flow as > > > > > > table=21,priority=100,ct_state=+inv, actions=drop > > > > > > to drop such packets. That is because when ct action is executed on other > > > packets which are not part of existing committed connections, ct_state > > > can be set to invalid. Few such cases are: > > > - ICMP reply packets. > > > > Can you elaborate? Echo reply should not be invalid. Conntrack should > > mark it as established (unless such echo reply came out of the blue). > > Hi Florian, > > Thanks for providing the comments. > > Sorry for not being very clear. > > Let me brief about the present problem we see in OVN (which is a > controller using ovs) > > When a VM/container sends a packet (in the ingress direction), we don't send all > the packets to conntrack. If a packet is destined to an OVN load > balancer virtual ip, > only then we send the packet to conntrack in the ingress direction and > then we do dnat > to the backend. Ah, okay. That explains the INVALID then, but in that case I think this patch/direction is less desirable from conntrack point of view. More below. > table=1, match = (ip && ip4.dst == VIP) action = ct(table=2) > tablle=2, ct_state=+new+trk && ip4.dst == VIP, action = ct(commit, > nat=BACKEND_IP) > ... > .. > > However for the egress direction (when the packet is to be delivered > to the VM/container), > we send all the packets to conntrack and if the ct.est is set, we do > undnat before delivering > the packet to the VM/container. > ... > table=40, match = ip, action = ct(table=41) > table=41, match = ct_state=+est+trk, action = ct(nat) > ... > > What I mean here is that, since we send all the packets in the egress > pipeline to conntrack, > we can't add a flow like - match = ct_state=+inv, action = drop. Basically this is like a normal routing/netfitler (no ovs), where the machine in question only sees unidirectional traffic (reply packets taking different route). > i.e When a VM/container sends an ICMP request packet, it will not be > sent to conntrack, but > the reply ICMP will be sent to conntrack and it will be marked as invalid. Yes. For plain netfilter, the solution would be to accept INVALID icmp replies in the iptables/nftables ruleset. > So is the case with TCP, the TCP SYN from the VM is not sent to > conntrack, but the SYN/ACK > from the server would be sent to conntrack and it will be marked as invalid. Right. > > 1. packet was not even seen by conntrack > > 2. packet matches existing connection, but is "bad", for example: > > - contradicting tcp flags > > - out of window > > - invalid checksum > > We want the below to be solved (using OVS flows) : > - If the packet is marked as invalid due to (2) which you mentioned above, > we would like to read the ct_mark and ct_label fields as the packet is > part of existing connection, so that we can add an OVS flow like > > ct_state=+inv+trk,ct_label=0x2 actions=drop Wouldn't it make more sense to let tcp conntrack skip the in-window check? AFAIU thats the only problem and its identical to other cases that we have at the moment, for example, conntrack supports mid-stream pickup (on by default) which also disables tcp window checks. > I tested by setting 'be_liberal' sysctl flag and since skb->_nfct was > set for (2), OVS > datapath module set the ct_state to +est. Yes. This flag can be set programatically on a per-ct basis. See nft_flow_offload_eval() for example (BE_LIBERAL). Given we now have multiple places that set this I think it would make sense to add a helper, say, e.g. void nf_ct_tcp_be_liberal(struct nf_conn *ct); ? (Or perhaps nf_ct_tcp_disable_window_checks() , might be more clear/descriptive as to what this is doing). Do you think that this resolves the OVN problem?
On Wed, Oct 7, 2020 at 12:22 AM Florian Westphal <fw@strlen.de> wrote: > > Numan Siddique <nusiddiq@redhat.com> wrote: > > On Tue, Oct 6, 2020 at 4:46 PM Florian Westphal <fw@strlen.de> wrote: > > > > > > nusiddiq@redhat.com <nusiddiq@redhat.com> wrote: > > > > From: Numan Siddique <nusiddiq@redhat.com> > > > > > > > > For a tcp packet which is part of an existing committed connection, > > > > nf_conntrack_in() will return err and set skb->_nfct to NULL if it is > > > > out of tcp window. ct action for this packet will set the ct_state > > > > to +inv which is as expected. > > > > > > This is because from conntrack p.o.v., such TCP packet is NOT part of > > > the existing connection. > > > > > > For example, because it is considered part of a previous incarnation > > > of the same connection. > > > > > > > But a controller cannot add an OVS flow as > > > > > > > > table=21,priority=100,ct_state=+inv, actions=drop > > > > > > > > to drop such packets. That is because when ct action is executed on other > > > > packets which are not part of existing committed connections, ct_state > > > > can be set to invalid. Few such cases are: > > > > - ICMP reply packets. > > > > > > Can you elaborate? Echo reply should not be invalid. Conntrack should > > > mark it as established (unless such echo reply came out of the blue). > > > > Hi Florian, > > > > Thanks for providing the comments. > > > > Sorry for not being very clear. > > > > Let me brief about the present problem we see in OVN (which is a > > controller using ovs) > > > > When a VM/container sends a packet (in the ingress direction), we don't send all > > the packets to conntrack. If a packet is destined to an OVN load > > balancer virtual ip, > > only then we send the packet to conntrack in the ingress direction and > > then we do dnat > > to the backend. > > Ah, okay. That explains the INVALID then, but in that case I think this > patch/direction is less desirable from conntrack point of view. > > More below. > > > table=1, match = (ip && ip4.dst == VIP) action = ct(table=2) > > tablle=2, ct_state=+new+trk && ip4.dst == VIP, action = ct(commit, > > nat=BACKEND_IP) > > ... > > .. > > > > However for the egress direction (when the packet is to be delivered > > to the VM/container), > > we send all the packets to conntrack and if the ct.est is set, we do > > undnat before delivering > > the packet to the VM/container. > > ... > > table=40, match = ip, action = ct(table=41) > > table=41, match = ct_state=+est+trk, action = ct(nat) > > ... > > > > What I mean here is that, since we send all the packets in the egress > > pipeline to conntrack, > > we can't add a flow like - match = ct_state=+inv, action = drop. > > Basically this is like a normal routing/netfitler (no ovs), where > the machine in question only sees unidirectional traffic (reply packets > taking different route). > > > i.e When a VM/container sends an ICMP request packet, it will not be > > sent to conntrack, but > > the reply ICMP will be sent to conntrack and it will be marked as invalid. > > Yes. For plain netfilter, the solution would be to accept INVALID icmp > replies in the iptables/nftables ruleset. > > > So is the case with TCP, the TCP SYN from the VM is not sent to > > conntrack, but the SYN/ACK > > from the server would be sent to conntrack and it will be marked as invalid. > > Right. > > > > 1. packet was not even seen by conntrack > > > 2. packet matches existing connection, but is "bad", for example: > > > - contradicting tcp flags > > > - out of window > > > - invalid checksum > > > > We want the below to be solved (using OVS flows) : > > - If the packet is marked as invalid due to (2) which you mentioned above, > > we would like to read the ct_mark and ct_label fields as the packet is > > part of existing connection, so that we can add an OVS flow like > > > > ct_state=+inv+trk,ct_label=0x2 actions=drop > > Wouldn't it make more sense to let tcp conntrack skip the in-window > check? AFAIU thats the only problem and its identical to other cases > that we have at the moment, for example, conntrack supports mid-stream > pickup (on by default) which also disables tcp window checks. > > > I tested by setting 'be_liberal' sysctl flag and since skb->_nfct was > > set for (2), OVS > > datapath module set the ct_state to +est. > > Yes. This flag can be set programatically on a per-ct basis. > > See nft_flow_offload_eval() for example (BE_LIBERAL). > Given we now have multiple places that set this I think it would make > sense to add a helper, say, e.g. > > void nf_ct_tcp_be_liberal(struct nf_conn *ct); > ? > > (Or perhaps nf_ct_tcp_disable_window_checks() , might be more > clear/descriptive as to what this is doing). > > Do you think that this resolves the OVN problem? Thanks for the comments. I think this should resolve the problem for OVN. Numan >
diff --git a/include/uapi/linux/openvswitch.h b/include/uapi/linux/openvswitch.h index 8300cc29dec8..db942986c5b7 100644 --- a/include/uapi/linux/openvswitch.h +++ b/include/uapi/linux/openvswitch.h @@ -768,6 +768,9 @@ struct ovs_action_hash { * respectively. Remaining bits control the changes for which an event is * delivered on the NFNLGRP_CONNTRACK_UPDATE group. * @OVS_CT_ATTR_TIMEOUT: Variable length string defining conntrack timeout. + * @OVS_CT_ATTR_LOOKUP_INV: If present, looks up and sets the state, mark and + * labels for an invalid packet (eg. out of tcp window) if it is part of + * committed connection. */ enum ovs_ct_attr { OVS_CT_ATTR_UNSPEC, @@ -782,6 +785,7 @@ enum ovs_ct_attr { OVS_CT_ATTR_EVENTMASK, /* u32 mask of IPCT_* events. */ OVS_CT_ATTR_TIMEOUT, /* Associate timeout with this connection for * fine-grain timeout tuning. */ + OVS_CT_ATTR_LOOKUP_INV, /* No argument. */ __OVS_CT_ATTR_MAX }; diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c index e6fe26a9c892..a6f96d9b4452 100644 --- a/net/openvswitch/conntrack.c +++ b/net/openvswitch/conntrack.c @@ -62,6 +62,7 @@ struct ovs_conntrack_info { u8 nat : 3; /* enum ovs_ct_nat */ u8 force : 1; u8 have_eventmask : 1; + u8 lookup_invalid : 1; u16 family; u32 eventmask; /* Mask of 1 << IPCT_*. */ struct md_mark mark; @@ -601,12 +602,13 @@ ovs_ct_get_info(const struct nf_conntrack_tuple_hash *h) * * Must be called with rcu_read_lock. * - * On success, populates skb->_nfct and returns the connection. Returns NULL - * if there is no existing entry. + * On success, populates skb->_nfct if 'skb_set_ct' is true and returns the + * connection. Returns NULL if there is no existing entry. */ static struct nf_conn * ovs_ct_find_existing(struct net *net, const struct nf_conntrack_zone *zone, - u8 l3num, struct sk_buff *skb, bool natted) + u8 l3num, struct sk_buff *skb, bool natted, + bool skb_set_ct) { struct nf_conntrack_tuple tuple; struct nf_conntrack_tuple_hash *h; @@ -636,14 +638,17 @@ ovs_ct_find_existing(struct net *net, const struct nf_conntrack_zone *zone, ct = nf_ct_tuplehash_to_ctrack(h); - /* Inverted packet tuple matches the reverse direction conntrack tuple, - * select the other tuplehash to get the right 'ctinfo' bits for this - * packet. - */ - if (natted) - h = &ct->tuplehash[!h->tuple.dst.dir]; + if (skb_set_ct) { + /* Inverted packet tuple matches the reverse direction + * conntrack tuple, select the other tuplehash to get the + * right 'ctinfo' bits for this packet. + */ + if (natted) + h = &ct->tuplehash[!h->tuple.dst.dir]; + + nf_ct_set(skb, ct, ovs_ct_get_info(h)); + } - nf_ct_set(skb, ct, ovs_ct_get_info(h)); return ct; } @@ -669,7 +674,7 @@ struct nf_conn *ovs_ct_executed(struct net *net, if (*ct_executed || (!key->ct_state && info->force)) { ct = ovs_ct_find_existing(net, &info->zone, info->family, skb, !!(key->ct_state & - OVS_CS_F_NAT_MASK)); + OVS_CS_F_NAT_MASK), true); } return ct; @@ -1033,6 +1038,20 @@ static int __ovs_ct_lookup(struct net *net, struct sw_flow_key *key, ovs_ct_helper(skb, info->family) != NF_ACCEPT) { return -EINVAL; } + } else if (info->lookup_invalid) { + /* nf_conntrack_in() sets skb->_nfct to NULL if the packet is + * invalid (eg. out of tcp window) even if it belongs to + * an existing connection. Check if there is an existing entry + * and if so, update the key with the mark and ct_labels. + */ + ct = ovs_ct_find_existing(net, &info->zone, info->family, skb, + false, false); + if (ct) { + u8 state; + + state = OVS_CS_F_TRACKED | OVS_CS_F_INVALID; + __ovs_ct_update_key(key, state, &info->zone, ct); + } } return 0; @@ -1602,6 +1621,9 @@ static int parse_ct(const struct nlattr *attr, struct ovs_conntrack_info *info, } break; #endif + case OVS_CT_ATTR_LOOKUP_INV: + info->lookup_invalid = true; + break; default: OVS_NLERR(log, "Unknown conntrack attr (%d)", @@ -1819,6 +1841,9 @@ int ovs_ct_action_to_attr(const struct ovs_conntrack_info *ct_info, if (nla_put_string(skb, OVS_CT_ATTR_TIMEOUT, ct_info->timeout)) return -EMSGSIZE; } + if (ct_info->lookup_invalid && + nla_put_flag(skb, OVS_CT_ATTR_LOOKUP_INV)) + return -EMSGSIZE; #if IS_ENABLED(CONFIG_NF_NAT) if (ct_info->nat && !ovs_ct_nat_to_attr(ct_info, skb))