diff mbox series

[net-next,v6] net: psample: Introduce stubs to remove NIC driver dependency

Message ID 20210201020412.52790-1-cmi@nvidia.com
State New
Headers show
Series [net-next,v6] net: psample: Introduce stubs to remove NIC driver dependency | expand

Commit Message

Chris Mi Feb. 1, 2021, 2:04 a.m. UTC
In order to send sampled packets to userspace, NIC driver calls
psample api directly. But it creates a hard dependency on module
psample. Introduce psample_ops to remove the hard dependency.
It is initialized when psample module is loaded and set to NULL
when the module is unloaded.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Chris Mi <cmi@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
---
v1->v2:
 - fix sparse errors
v2->v3:
 - remove inline
v3->v4:
 - add inline back
v4->v5:
 - address Jakub's comments
v5-v6:
 - fix psample_stub.c copyright

 include/net/psample.h    | 26 ++++++++++++++++++++++++++
 net/psample/psample.c    | 14 +++++++++++++-
 net/sched/Makefile       |  2 +-
 net/sched/psample_stub.c |  5 +++++
 4 files changed, 45 insertions(+), 2 deletions(-)
 create mode 100644 net/sched/psample_stub.c

Comments

Jakub Kicinski Feb. 2, 2021, 1:14 a.m. UTC | #1
On Mon,  1 Feb 2021 10:04:12 +0800 Chris Mi wrote:
> In order to send sampled packets to userspace, NIC driver calls

> psample api directly. But it creates a hard dependency on module

> psample. Introduce psample_ops to remove the hard dependency.

> It is initialized when psample module is loaded and set to NULL

> when the module is unloaded.

> 

> Reported-by: kernel test robot <lkp@intel.com>

> Signed-off-by: Chris Mi <cmi@nvidia.com>

> Reviewed-by: Jiri Pirko <jiri@nvidia.com>


The necessity of making this change is not obvious (you can fix the
distro scripts instead), you did not include a clarification in the
commit message even though two people asked you why it's needed and 
on top of that you keep sending code which doesn't build. 

Please consider this change rejected and do not send a v7.
Saeed Mahameed Feb. 2, 2021, 7:31 a.m. UTC | #2
On Mon, 2021-02-01 at 17:14 -0800, Jakub Kicinski wrote:
> On Mon,  1 Feb 2021 10:04:12 +0800 Chris Mi wrote:

> > In order to send sampled packets to userspace, NIC driver calls

> > psample api directly. But it creates a hard dependency on module

> > psample. Introduce psample_ops to remove the hard dependency.

> > It is initialized when psample module is loaded and set to NULL

> > when the module is unloaded.

> > 

> > Reported-by: kernel test robot <lkp@intel.com>

> > Signed-off-by: Chris Mi <cmi@nvidia.com>

> > Reviewed-by: Jiri Pirko <jiri@nvidia.com>

> 

> The necessity of making this change is not obvious (you can fix the

> distro scripts instead), you did not include a clarification in the

> commit message even though two people asked you why it's needed and 

> on top of that you keep sending code which doesn't build. 

> 

> Please consider this change rejected and do not send a v7.


Jakub, it is not only about installation dependencies, the issue is
more critical than this, 

We had some other issues with similar dependency problem where
mlx5_core had hard dependency with netfilter, firewalld when disabled,
removes netfilter and all its dependencies including mlx5, this is a no
go for our users.

Again, having a hard dependency between a hardware driver and a
peripheral module in the kernel is a bad design.
Jakub Kicinski Feb. 2, 2021, 9:24 p.m. UTC | #3
On Mon, 01 Feb 2021 23:31:15 -0800 Saeed Mahameed wrote:
> On Mon, 2021-02-01 at 17:14 -0800, Jakub Kicinski wrote:

> > On Mon,  1 Feb 2021 10:04:12 +0800 Chris Mi wrote:  

> > > In order to send sampled packets to userspace, NIC driver calls

> > > psample api directly. But it creates a hard dependency on module

> > > psample. Introduce psample_ops to remove the hard dependency.

> > > It is initialized when psample module is loaded and set to NULL

> > > when the module is unloaded.

> > > 

> > > Reported-by: kernel test robot <lkp@intel.com>

> > > Signed-off-by: Chris Mi <cmi@nvidia.com>

> > > Reviewed-by: Jiri Pirko <jiri@nvidia.com>  

> > 

> > The necessity of making this change is not obvious (you can fix the

> > distro scripts instead), you did not include a clarification in the

> > commit message even though two people asked you why it's needed and 

> > on top of that you keep sending code which doesn't build. 

> > 

> > Please consider this change rejected and do not send a v7.  

> 

> Jakub, it is not only about installation dependencies, the issue is

> more critical than this, 

> 

> We had some other issues with similar dependency problem where

> mlx5_core had hard dependency with netfilter, firewalld when disabled,

> removes netfilter and all its dependencies including mlx5, this is a no

> go for our users.

> 

> Again, having a hard dependency between a hardware driver and a

> peripheral module in the kernel is a bad design.


That is not the point.

The technical problem is minor, and it's a problem for _your_ driver.
Yet, it appears to be my responsibility to make sure the patch even
compiles.

I believe there should be a limit to the ignorance a community
volunteer is expected to put up with when dealing with patches 
from and for the benefit of a commercial vendor.

This is up for discussion, if you disagree let's talk it out. I'm 
not particularly patient (to put it mildly), but I don't understand 
how v5 could have built, and yet v6 gets posted with the same exact
problem :/

So from my perspective it seems like the right step to push back.
If you, Tariq, Jiri, Ido or any other seasoned kernel contributor
reposts this after making sure it's up to snuff themselves I will
most certainly take a look / apply.
Saeed Mahameed Feb. 2, 2021, 9:43 p.m. UTC | #4
On Tue, 2021-02-02 at 13:24 -0800, Jakub Kicinski wrote:
> On Mon, 01 Feb 2021 23:31:15 -0800 Saeed Mahameed wrote:


...

> > 

> > Jakub, it is not only about installation dependencies, the issue is

> > more critical than this, 

> > 

> > We had some other issues with similar dependency problem where

> > mlx5_core had hard dependency with netfilter, firewalld when

> > disabled,

> > removes netfilter and all its dependencies including mlx5, this is

> > a no

> > go for our users.

> > 

> > Again, having a hard dependency between a hardware driver and a

> > peripheral module in the kernel is a bad design.

> 

> That is not the point.

> 

> The technical problem is minor, and it's a problem for _your_ driver.

> Yet, it appears to be my responsibility to make sure the patch even

> compiles.

> 


I understand your frustration, We should have been more professional
with this patch submission, 

> I believe there should be a limit to the ignorance a community

> volunteer is expected to put up with when dealing with patches 

> from and for the benefit of a commercial vendor.

> 


totally agree, we have the tools internally to avoid clutter in mailing
list and we do this for pure mlx5 patches, but for net patches, people
tend to hurry and skip the queue so they get feedback ASAP.. 
this doesn't make it right, I will work to improve this.

> This is up for discussion, if you disagree let's talk it out. I'm 

> not particularly patient (to put it mildly), but I don't understand 

> how v5 could have built, and yet v6 gets posted with the same exact

> problem :/

> 


given the circumstance i would've done the same, even on v4.. 
sorry for the inconvenience .. :(.. thanks for your patience. 

> So from my perspective it seems like the right step to push back.


And I back you up !

> If you, Tariq, Jiri, Ido or any other seasoned kernel contributor

> reposts this after making sure it's up to snuff themselves I will

> most certainly take a look / apply.


My point wasn't about psample, it's about hw drivers vs stack/modules
dependency in general, maybe I used the wrong patch to discuss this
matter :D.. 

I will post a separate RFC for discussion later not related to psample
at all. this patch smells so bad we can't even discuss the general
issue here ..

Again thanks for your review and patience.
Chris Mi Feb. 3, 2021, 2:55 a.m. UTC | #5
Hi Jakub,

On 2/3/2021 5:24 AM, Jakub Kicinski wrote:
> On Mon, 01 Feb 2021 23:31:15 -0800 Saeed Mahameed wrote:

>> On Mon, 2021-02-01 at 17:14 -0800, Jakub Kicinski wrote:

>>> On Mon,  1 Feb 2021 10:04:12 +0800 Chris Mi wrote:

>>>> In order to send sampled packets to userspace, NIC driver calls

>>>> psample api directly. But it creates a hard dependency on module

>>>> psample. Introduce psample_ops to remove the hard dependency.

>>>> It is initialized when psample module is loaded and set to NULL

>>>> when the module is unloaded.

>>>>

>>>> Reported-by: kernel test robot <lkp@intel.com>

>>>> Signed-off-by: Chris Mi <cmi@nvidia.com>

>>>> Reviewed-by: Jiri Pirko <jiri@nvidia.com>

>>> The necessity of making this change is not obvious (you can fix the

>>> distro scripts instead), you did not include a clarification in the

>>> commit message even though two people asked you why it's needed and

>>> on top of that you keep sending code which doesn't build.

>>>

>>> Please consider this change rejected and do not send a v7.

>> Jakub, it is not only about installation dependencies, the issue is

>> more critical than this,

>>

>> We had some other issues with similar dependency problem where

>> mlx5_core had hard dependency with netfilter, firewalld when disabled,

>> removes netfilter and all its dependencies including mlx5, this is a no

>> go for our users.

>>

>> Again, having a hard dependency between a hardware driver and a

>> peripheral module in the kernel is a bad design.

> That is not the point.

>

> The technical problem is minor, and it's a problem for _your_ driver.

> Yet, it appears to be my responsibility to make sure the patch even

> compiles.

>

> I believe there should be a limit to the ignorance a community

> volunteer is expected to put up with when dealing with patches

> from and for the benefit of a commercial vendor.

>

> This is up for discussion, if you disagree let's talk it out. I'm

> not particularly patient (to put it mildly), but I don't understand

> how v5 could have built, and yet v6 gets posted with the same exact

> problem :/

>

> So from my perspective it seems like the right step to push back.

> If you, Tariq, Jiri, Ido or any other seasoned kernel contributor

> reposts this after making sure it's up to snuff themselves I will

> most certainly take a look / apply.

Just now I tested v5 using sparse. I did hit some errors for psample. 
The reason I
didn't hit it is because I only build the psample module. But I forget I 
changed
the kernel. So I should build the whole kernel.

I'm really sorry that I waste you a lot of time for this bad quality 
patch. You know
usually Saeed submits most of the patches on behalf of us. So I'm not 
very familiar
with rules here.

So how about I submit v7 based on v4. I know you are very busy, if it is 
not ok,
I'll wait for Saeed's further information about next step.

Thanks,
Chris
diff mbox series

Patch

diff --git a/include/net/psample.h b/include/net/psample.h
index 68ae16bb0a4a..d0f1cfc56f6f 100644
--- a/include/net/psample.h
+++ b/include/net/psample.h
@@ -14,6 +14,15 @@  struct psample_group {
 	struct rcu_head rcu;
 };
 
+struct psample_ops {
+	void (*sample_packet)(struct psample_group *group, struct sk_buff *skb,
+			      u32 trunc_size, int in_ifindex, int out_ifindex,
+			      u32 sample_rate);
+
+};
+
+extern const struct psample_ops __rcu *psample_ops __read_mostly;
+
 struct psample_group *psample_group_get(struct net *net, u32 group_num);
 void psample_group_take(struct psample_group *group);
 void psample_group_put(struct psample_group *group);
@@ -35,4 +44,21 @@  static inline void psample_sample_packet(struct psample_group *group,
 
 #endif
 
+static inline void
+psample_nic_sample_packet(struct psample_group *group,
+			  struct sk_buff *skb, u32 trunc_size,
+			  int in_ifindex, int out_ifindex,
+			  u32 sample_rate)
+{
+	const struct psample_ops *ops;
+
+	rcu_read_lock();
+	ops = rcu_dereference(psample_ops);
+	if (ops)
+		ops->sample_packet(group, skb, trunc_size,
+				   in_ifindex, out_ifindex,
+				   sample_rate);
+	rcu_read_unlock();
+}
+
 #endif /* __NET_PSAMPLE_H */
diff --git a/net/psample/psample.c b/net/psample/psample.c
index 33e238c965bd..983ca5b698fe 100644
--- a/net/psample/psample.c
+++ b/net/psample/psample.c
@@ -8,6 +8,7 @@ 
 #include <linux/kernel.h>
 #include <linux/skbuff.h>
 #include <linux/module.h>
+#include <linux/rcupdate.h>
 #include <net/net_namespace.h>
 #include <net/sock.h>
 #include <net/netlink.h>
@@ -35,6 +36,10 @@  static const struct genl_multicast_group psample_nl_mcgrps[] = {
 
 static struct genl_family psample_nl_family __ro_after_init;
 
+static const struct psample_ops psample_sample_ops = {
+	.sample_packet	= psample_sample_packet,
+};
+
 static int psample_group_nl_fill(struct sk_buff *msg,
 				 struct psample_group *group,
 				 enum psample_command cmd, u32 portid, u32 seq,
@@ -456,11 +461,18 @@  EXPORT_SYMBOL_GPL(psample_sample_packet);
 
 static int __init psample_module_init(void)
 {
-	return genl_register_family(&psample_nl_family);
+	int ret;
+
+	ret = genl_register_family(&psample_nl_family);
+	if (!ret)
+		RCU_INIT_POINTER(psample_ops, &psample_sample_ops);
+	return ret;
 }
 
 static void __exit psample_module_exit(void)
 {
+	rcu_assign_pointer(psample_ops, NULL);
+	synchronize_rcu();
 	genl_unregister_family(&psample_nl_family);
 }
 
diff --git a/net/sched/Makefile b/net/sched/Makefile
index dd14ef413fda..0d92bb98bb26 100644
--- a/net/sched/Makefile
+++ b/net/sched/Makefile
@@ -3,7 +3,7 @@ 
 # Makefile for the Linux Traffic Control Unit.
 #
 
-obj-y	:= sch_generic.o sch_mq.o
+obj-y	:= sch_generic.o sch_mq.o psample_stub.o
 
 obj-$(CONFIG_INET)		+= sch_frag.o
 obj-$(CONFIG_NET_SCHED)		+= sch_api.o sch_blackhole.o
diff --git a/net/sched/psample_stub.c b/net/sched/psample_stub.c
new file mode 100644
index 000000000000..0541b8c5100d
--- /dev/null
+++ b/net/sched/psample_stub.c
@@ -0,0 +1,5 @@ 
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright (c) 2021 Mellanox Technologies. */
+
+const struct psample_ops __rcu *psample_ops __read_mostly;
+EXPORT_SYMBOL_GPL(psample_ops);