Message ID | 160259304349.181017.7492443293310262978.stgit@ebuild |
---|---|
State | New |
Headers | show |
Series | [net-next] net: openvswitch: fix to make sure flow_lookup() is not preempted | expand |
On 2020-10-13 14:44:19 [+0200], Eelco Chaudron wrote: > The flow_lookup() function uses per CPU variables, which must not be > preempted. However, this is fine in the general napi use case where > the local BH is disabled. But, it's also called in the netlink > context, which is preemptible. The below patch makes sure that even > in the netlink path, preemption is disabled. > > Fixes: eac87c413bf9 ("net: openvswitch: reorder masks array based on usage") > Reported-by: Juri Lelli <jlelli@redhat.com> > Signed-off-by: Eelco Chaudron <echaudro@redhat.com> > --- > net/openvswitch/flow_table.c | 10 +++++++++- > 1 file changed, 9 insertions(+), 1 deletion(-) > > diff --git a/net/openvswitch/flow_table.c b/net/openvswitch/flow_table.c > index 87c286ad660e..16289386632b 100644 > --- a/net/openvswitch/flow_table.c > +++ b/net/openvswitch/flow_table.c > @@ -850,9 +850,17 @@ struct sw_flow *ovs_flow_tbl_lookup(struct flow_table *tbl, > struct mask_array *ma = rcu_dereference_ovsl(tbl->mask_array); > u32 __always_unused n_mask_hit; > u32 __always_unused n_cache_hit; > + struct sw_flow *flow; > u32 index = 0; > > - return flow_lookup(tbl, ti, ma, key, &n_mask_hit, &n_cache_hit, &index); > + /* This function gets called trough the netlink interface and therefore > + * is preemptible. However, flow_lookup() function needs to be called > + * with preemption disabled due to CPU specific variables. > + */ Once again. u64_stats_update_begin(). What protects you against concurrent access. > + preempt_disable(); > + flow = flow_lookup(tbl, ti, ma, key, &n_mask_hit, &n_cache_hit, &index); > + preempt_enable(); > + return flow; > } > > struct sw_flow *ovs_flow_tbl_lookup_exact(struct flow_table *tbl, > Sebastian
On 13 Oct 2020, at 14:53, Sebastian Andrzej Siewior wrote: > On 2020-10-13 14:44:19 [+0200], Eelco Chaudron wrote: >> The flow_lookup() function uses per CPU variables, which must not be >> preempted. However, this is fine in the general napi use case where >> the local BH is disabled. But, it's also called in the netlink >> context, which is preemptible. The below patch makes sure that even >> in the netlink path, preemption is disabled. >> >> Fixes: eac87c413bf9 ("net: openvswitch: reorder masks array based on >> usage") >> Reported-by: Juri Lelli <jlelli@redhat.com> >> Signed-off-by: Eelco Chaudron <echaudro@redhat.com> >> --- >> net/openvswitch/flow_table.c | 10 +++++++++- >> 1 file changed, 9 insertions(+), 1 deletion(-) >> >> diff --git a/net/openvswitch/flow_table.c >> b/net/openvswitch/flow_table.c >> index 87c286ad660e..16289386632b 100644 >> --- a/net/openvswitch/flow_table.c >> +++ b/net/openvswitch/flow_table.c >> @@ -850,9 +850,17 @@ struct sw_flow *ovs_flow_tbl_lookup(struct >> flow_table *tbl, >> struct mask_array *ma = rcu_dereference_ovsl(tbl->mask_array); >> u32 __always_unused n_mask_hit; >> u32 __always_unused n_cache_hit; >> + struct sw_flow *flow; >> u32 index = 0; >> >> - return flow_lookup(tbl, ti, ma, key, &n_mask_hit, &n_cache_hit, >> &index); >> + /* This function gets called trough the netlink interface and >> therefore >> + * is preemptible. However, flow_lookup() function needs to be >> called >> + * with preemption disabled due to CPU specific variables. >> + */ > > Once again. u64_stats_update_begin(). What protects you against > concurrent access. Thanks Sebastian for repeating this, as I thought I went over the seqcount code and thought it should be fine for my use case. However based on this comment I went over it again, and found the logic part I was constantly missing :) My idea is to send a v2 patch and in addition to the preempt_disable() also make the seqcount part per CPU. I noticed other parts of the networking stack doing it the same way. So the patch would look something like: @@ -731,7 +732,7 @@ static struct sw_flow *flow_lookup(struct flow_table *tbl, u32 *n_cache_hit, u32 *index) { - u64 *usage_counters = this_cpu_ptr(ma->masks_usage_cntr); + struct mask_array_stats *stats = this_cpu_ptr(ma->masks_usage_stats); struct sw_flow *flow; struct sw_flow_mask *mask; int i; @@ -741,9 +742,9 @@ static struct sw_flow *flow_lookup(struct flow_table *tbl, if (mask) { flow = masked_flow_lookup(ti, key, mask, n_mask_hit); if (flow) { - u64_stats_update_begin(&ma->syncp); - usage_counters[*index]++; - u64_stats_update_end(&ma->syncp); + u64_stats_update_begin(&stats->syncp); + stats->usage_cntr[*index]++; + u64_stats_update_end(&stats->syncp); (*n_cache_hit)++; return flow; } Let me know your thoughts. Thanks, Eelco
On 2020-10-14 12:44:23 [+0200], Eelco Chaudron wrote: > Let me know your thoughts. better. If your seccount is per-CPU then you get away without explicit writer locking if you rely on global per-CPU locking. You can't do preempt_disable() because this section can be interrupt by softirq. You need something stronger :) Side note: Adding a fixes tag and net-next looks like "stable material starting next merge window". > Thanks, > > Eelco Sebastian
On 15 Oct 2020, at 9:55, Sebastian Andrzej Siewior wrote: > On 2020-10-14 12:44:23 [+0200], Eelco Chaudron wrote: >> Let me know your thoughts. > > better. If your seccount is per-CPU then you get away without explicit > writer locking if you rely on global per-CPU locking. You can't do > preempt_disable() because this section can be interrupt by softirq. > You > need something stronger :) Thanks for your reply! Yes I had it replaced with local_bh_disable() in my v2, as I noticed the hard IRQ to softirq part earlier. > Side note: Adding a fixes tag and net-next looks like "stable material > starting next merge window". Have the patch on net-next, but will send it out on next (will do the conversion later today and sent it out). Thanks, Eelco
On 2020-10-15 10:11:31 [+0200], Eelco Chaudron wrote: > Thanks for your reply! Yes I had it replaced with local_bh_disable() in my > v2, as I noticed the hard IRQ to softirq part earlier. Okay. Resend the complete thing once you ready and I take a look. > Thanks, > > Eelco Sebastian
diff --git a/net/openvswitch/flow_table.c b/net/openvswitch/flow_table.c index 87c286ad660e..16289386632b 100644 --- a/net/openvswitch/flow_table.c +++ b/net/openvswitch/flow_table.c @@ -850,9 +850,17 @@ struct sw_flow *ovs_flow_tbl_lookup(struct flow_table *tbl, struct mask_array *ma = rcu_dereference_ovsl(tbl->mask_array); u32 __always_unused n_mask_hit; u32 __always_unused n_cache_hit; + struct sw_flow *flow; u32 index = 0; - return flow_lookup(tbl, ti, ma, key, &n_mask_hit, &n_cache_hit, &index); + /* This function gets called trough the netlink interface and therefore + * is preemptible. However, flow_lookup() function needs to be called + * with preemption disabled due to CPU specific variables. + */ + preempt_disable(); + flow = flow_lookup(tbl, ti, ma, key, &n_mask_hit, &n_cache_hit, &index); + preempt_enable(); + return flow; } struct sw_flow *ovs_flow_tbl_lookup_exact(struct flow_table *tbl,
The flow_lookup() function uses per CPU variables, which must not be preempted. However, this is fine in the general napi use case where the local BH is disabled. But, it's also called in the netlink context, which is preemptible. The below patch makes sure that even in the netlink path, preemption is disabled. Fixes: eac87c413bf9 ("net: openvswitch: reorder masks array based on usage") Reported-by: Juri Lelli <jlelli@redhat.com> Signed-off-by: Eelco Chaudron <echaudro@redhat.com> --- net/openvswitch/flow_table.c | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-)