Message ID | 20210617095020.28628-2-callum.sinclair@alliedtelesis.co.nz |
---|---|
State | New |
Series | [1/1] net: Allow all multicast packets to be received on a interface. |
Hi Callum,

On Thu, Jun 17, 2021 at 09:50:20PM +1200, Callum Sinclair wrote:
> +mc_snooping - BOOLEAN
> +	Enable multicast snooping on the interface. This allows any given
> +	multicast group to be received without explicitly being joined.
> +	The kernel needs to be compiled with CONFIG_MROUTE and/or
> +	CONFIG_IPV6_MROUTE.
> +	conf/all/mc_snooping must also be set to TRUE to enable multicast
> +	snooping for the interface.
> +

Generally this sounds like a useful feature.

One note: When there are snooping bridges/switches involved, you might run into issues in receiving all multicast packets, as due to the missing IGMP/MLD reports the snooping switches won't forward to you.

In that case, to conform to RFC4541 you would also need to become the selected IGMP/MLD querier and send IGMP/MLD query messages. Or better, you'd need to send Multicast Router Advertisements (RFC4286). The latter is the recommended, more flexible way but might not be supported by all multicast snooping switches yet. The Linux bridge supports this.

There is a userspace tool called mrdisc you can use for MRD-Adv. though: https://github.com/troglobit/mrdisc. So no need to implement MRD Advertisements in the kernel with this patch (though I could imagine that it might be a useful feature to have, having MRD support out-of-the-box with this option).

A note in the mc_snooping description that some IGMP/MLD Querier or MRD Adv. is needed when IGMP/MLD snooping switches are involved might be nice to have for now, to avoid potential confusion later.

I'm also wondering if it could be useful to configure it via setsockopt() per application instead of per device via sysctl. Either by adding a new socket option. Or by allowing the "any" IP address 0.0.0.0 / :: with IP_ADD_MEMBERSHIP/IPV6_JOIN_GROUP. So that you could use this for instance:

$ socat -u UDP6-RECV:1234,reuseaddr,ipv6-join-group="[::]:eth0" -

(currently :: fails with "Invalid argument")

I'm not sure however what the requirements for adding or extending socket options are, if there are some POSIX standards that'd need to be followed for compatibility with other OSes, for instance.

Hm, actually, I just noticed that there seem to be some multicast related setsockopt()s already:

- PACKET_MR_PROMISC
- PACKET_MR_MULTICAST
- PACKET_MR_ALLMULTI

The last one seems to be what you are looking for, I think, the manpage here says:

"PACKET_MR_ALLMULTI sets the socket up to receive all multicast packets arriving at the interface"

https://www.man7.org/linux/man-pages/man7/packet.7.html

Or would you prefer to be able to use a layer 3 IP instead of a layer 2 packet socket?

Regards, Linus
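For reference, a minimal sketch of the PACKET_MR_ALLMULTI route mentioned above. Per packet(7), the PACKET_MR_* values are not socket options themselves; they are the mr_type of a struct packet_mreq passed with PACKET_ADD_MEMBERSHIP. The helper names fill_allmulti_req() and open_allmulti_socket() are made up for this illustration, and the choice of SOCK_DGRAM with ETH_P_IP is an assumption, not something taken from the patch.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>        /* htons() */
#include <sys/socket.h>       /* socket(), setsockopt(), SOL_PACKET */
#include <linux/if_ether.h>   /* ETH_P_IP */
#include <linux/if_packet.h>  /* struct packet_mreq, PACKET_MR_ALLMULTI */

/* Hypothetical helper: build a request for all multicast frames on ifindex. */
void fill_allmulti_req(struct packet_mreq *mreq, int ifindex)
{
	assert(mreq != NULL);
	memset(mreq, 0, sizeof(*mreq));
	mreq->mr_ifindex = ifindex;
	mreq->mr_type = PACKET_MR_ALLMULTI;
}

/*
 * Hypothetical helper: open a packet socket for IPv4 frames and ask the
 * kernel to deliver all multicast frames arriving on the given interface.
 * Needs CAP_NET_RAW. The socket is not bound to the interface here, so it
 * also sees IPv4 frames from other interfaces.
 * Returns the socket fd, or -1 on error (errno is set by the failing call).
 */
int open_allmulti_socket(int ifindex)
{
	struct packet_mreq mreq;
	int fd;

	fd = socket(AF_PACKET, SOCK_DGRAM, htons(ETH_P_IP));
	if (fd < 0)
		return -1;

	fill_allmulti_req(&mreq, ifindex);
	if (setsockopt(fd, SOL_PACKET, PACKET_ADD_MEMBERSHIP,
		       &mreq, sizeof(mreq)) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}
```

A caller would typically obtain ifindex with if_nametoindex("eth0") from <net/if.h> and then read frames with recvfrom(). This is a layer 2 socket, so none of the layer 3 ancillary data is available on it.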
On Thu, Jun 17, 2021 at 09:50:20PM +1200, Callum Sinclair wrote:
> To receive IGMP or MLD packets on a IP socket on any interface the
> multicast group needs to be explicitly joined. This works well for when
> the multicast group the user is interested in is known, but does not
> provide an easy way to snoop all packets in the 224.0.0.0/8 or the
> FF00::/8 range.
>
> Define a new sysctl to allow a given interface to become a IGMP or MLD
> snooper. When set the interface will allow any IGMP or MLD packet to be
> received on sockets bound to these devices.

Hi Callum

What is the big picture here? Are you trying to move the snooping algorithm into user space? Will user space then add/remove multicast FIB entries to the bridge to control where multicast frames are sent?

In the past I have written a multicast routing daemon. It is a similar problem. You need access to all the joins/leaves. But the stack does provide them, if you bind to the multicast routing socket. Why not use that mechanism? Look in the mrouted sources for an example.

Andrew
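As a rough illustration of the multicast routing socket Andrew refers to, the sketch below opens a raw IGMP socket and registers it with MRT_INIT, which is the pattern multicast routing daemons commonly use on Linux. The function names open_mroute_socket() and demo_igmp_type() are invented for this example; it is not taken from mrouted, and a real daemon would normally also register interfaces with MRT_ADD_VIF, which is omitted here.

```c
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>
#include <netinet/in.h>     /* IPPROTO_IGMP, IPPROTO_IP; include before linux/mroute.h */
#include <sys/socket.h>
#include <linux/mroute.h>   /* MRT_INIT */

/*
 * Hypothetical helper: open the IPv4 multicast routing socket.
 * Needs CAP_NET_ADMIN and a kernel built with CONFIG_IP_MROUTE.
 * MRT_INIT fails if another process already owns the routing socket.
 * Returns the socket fd, or -1 on error.
 */
int open_mroute_socket(void)
{
	int one = 1;
	int fd = socket(AF_INET, SOCK_RAW, IPPROTO_IGMP);

	if (fd < 0)
		return -1;
	if (setsockopt(fd, IPPROTO_IP, MRT_INIT, &one, sizeof(one)) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}

/*
 * Hypothetical helper: a raw IPv4 socket delivers the IP header followed
 * by the IGMP message. Skip the header using its IHL field and return the
 * IGMP type byte, or -1 if the buffer is too short or not IPv4.
 */
int demo_igmp_type(const uint8_t *pkt, size_t len)
{
	size_t ihl;

	if (pkt == NULL || len < 20 || (pkt[0] >> 4) != 4)
		return -1;
	ihl = (size_t)(pkt[0] & 0x0f) * 4;
	if (ihl < 20 || len < ihl + 1)
		return -1;
	return pkt[ihl];
}
```

Buffers read from the socket with recv() can be passed to demo_igmp_type() to tell queries (0x11) from reports (0x12, 0x16, 0x22) and leaves (0x17).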
Hi Callum,
Thank you for the patch! Yet something to improve:
[auto build test ERROR on linus/master]
[also build test ERROR on v5.13-rc6 next-20210617]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]
url: https://github.com/0day-ci/linux/commits/Callum-Sinclair/Create-multicast-snooping-sysctl-option/20210617-175212
base: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 70585216fe7730d9fb5453d3e2804e149d0fe201
config: x86_64-rhel-8.3-kselftests (attached as .config)
compiler: gcc-9 (Debian 9.3.0-22) 9.3.0
reproduce (this is a W=1 build):
# https://github.com/0day-ci/linux/commit/4220b6837f4315ff557bd44f7aada23b69e181b6
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Callum-Sinclair/Create-multicast-snooping-sysctl-option/20210617-175212
git checkout 4220b6837f4315ff557bd44f7aada23b69e181b6
# save the attached .config to linux build tree
make W=1 ARCH=x86_64
If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>
All errors (new ones prefixed by >>):
In file included from net/ipv4/devinet.c:47:
net/ipv4/devinet.c: In function 'inet_netconf_fill_devconf':
>> include/linux/inetdevice.h:53:45: error: 'IPV4_DEVCONF_NETCONFA_MC_SNOOPING' undeclared (first use in this function); did you mean 'IPV4_DEVCONF_MC_SNOOPING'?
53 | #define IPV4_DEVCONF(cnf, attr) ((cnf).data[IPV4_DEVCONF_ ## attr - 1])
| ^~~~~~~~~~~~~
net/ipv4/devinet.c:2069:4: note: in expansion of macro 'IPV4_DEVCONF'
2069 | IPV4_DEVCONF(*devconf, NETCONFA_MC_SNOOPING)) < 0)
| ^~~~~~~~~~~~
include/linux/inetdevice.h:53:45: note: each undeclared identifier is reported only once for each function it appears in
53 | #define IPV4_DEVCONF(cnf, attr) ((cnf).data[IPV4_DEVCONF_ ## attr - 1])
| ^~~~~~~~~~~~~
net/ipv4/devinet.c:2069:4: note: in expansion of macro 'IPV4_DEVCONF'
2069 | IPV4_DEVCONF(*devconf, NETCONFA_MC_SNOOPING)) < 0)
| ^~~~~~~~~~~~
vim +53 include/linux/inetdevice.h
^1da177e4c3f41 Linus Torvalds 2005-04-16 52
02291680ffba92 Eric W. Biederman 2010-02-14 @53 #define IPV4_DEVCONF(cnf, attr) ((cnf).data[IPV4_DEVCONF_ ## attr - 1])
586f12115264b7 Pavel Emelyanov 2007-12-16 54 #define IPV4_DEVCONF_ALL(net, attr) \
586f12115264b7 Pavel Emelyanov 2007-12-16 55 IPV4_DEVCONF((*(net)->ipv4.devconf_all), attr)
42f811b8bcdf66 Herbert Xu 2007-06-04 56
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
Hi Callum,
Thank you for the patch! Yet something to improve:
[auto build test ERROR on linus/master]
[also build test ERROR on v5.13-rc6 next-20210617]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]
url: https://github.com/0day-ci/linux/commits/Callum-Sinclair/Create-multicast-snooping-sysctl-option/20210617-175212
base: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 70585216fe7730d9fb5453d3e2804e149d0fe201
config: m68k-allmodconfig (attached as .config)
compiler: m68k-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/0day-ci/linux/commit/4220b6837f4315ff557bd44f7aada23b69e181b6
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Callum-Sinclair/Create-multicast-snooping-sysctl-option/20210617-175212
git checkout 4220b6837f4315ff557bd44f7aada23b69e181b6
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=m68k
If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>
All errors (new ones prefixed by >>):
In file included from net/ipv4/devinet.c:47:
net/ipv4/devinet.c: In function 'inet_netconf_fill_devconf':
>> include/linux/inetdevice.h:53:45: error: 'IPV4_DEVCONF_NETCONFA_MC_SNOOPING' undeclared (first use in this function); did you mean 'IPV4_DEVCONF_MC_SNOOPING'?
53 | #define IPV4_DEVCONF(cnf, attr) ((cnf).data[IPV4_DEVCONF_ ## attr - 1])
| ^~~~~~~~~~~~~
net/ipv4/devinet.c:2069:4: note: in expansion of macro 'IPV4_DEVCONF'
2069 | IPV4_DEVCONF(*devconf, NETCONFA_MC_SNOOPING)) < 0)
| ^~~~~~~~~~~~
include/linux/inetdevice.h:53:45: note: each undeclared identifier is reported only once for each function it appears in
53 | #define IPV4_DEVCONF(cnf, attr) ((cnf).data[IPV4_DEVCONF_ ## attr - 1])
| ^~~~~~~~~~~~~
net/ipv4/devinet.c:2069:4: note: in expansion of macro 'IPV4_DEVCONF'
2069 | IPV4_DEVCONF(*devconf, NETCONFA_MC_SNOOPING)) < 0)
| ^~~~~~~~~~~~
vim +53 include/linux/inetdevice.h
^1da177e4c3f41 Linus Torvalds 2005-04-16 52
02291680ffba92 Eric W. Biederman 2010-02-14 @53 #define IPV4_DEVCONF(cnf, attr) ((cnf).data[IPV4_DEVCONF_ ## attr - 1])
586f12115264b7 Pavel Emelyanov 2007-12-16 54 #define IPV4_DEVCONF_ALL(net, attr) \
586f12115264b7 Pavel Emelyanov 2007-12-16 55 IPV4_DEVCONF((*(net)->ipv4.devconf_all), attr)
42f811b8bcdf66 Herbert Xu 2007-06-04 56
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
Hi Linus

> I'm also wondering if it could be useful to configure it via
> setsockopt() per application instead of per device via sysctl. Either by
> adding a new socket option. Or by allowing the any IP address
> 0.0.0.0 / :: with IP_ADD_MEMBERSHIP/IPV6_JOIN_GROUP. So that you
> could use this for instance:

Yes perhaps this would be a better way to get multicast snooping working with the existing options. I can see that using a multicast routing IP socket I will receive all IGMP and MLD data from that. I was just not creating the socket as a multicast routing socket.

> Or would you prefer to be able to use a layer 3 IP instead of
> a layer 2 packet socket?

Yes I was preferring to use a L3 IP socket instead of a L2 packet socket. This was to have access to additional data from IP_PKTINFO.

Cheers
Callum
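To make the IP_PKTINFO point concrete, here is a small sketch of how a layer 3 socket exposes the receiving interface and destination address as ancillary data, as described in ip(7). The helper names enable_pktinfo() and pktinfo_from_msg() are hypothetical and not part of the patch.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE           /* exposes struct in_pktinfo on glibc */
#endif
#include <assert.h>
#include <string.h>
#include <netinet/in.h>       /* IPPROTO_IP, IP_PKTINFO, struct in_pktinfo */
#include <sys/socket.h>       /* struct msghdr, CMSG_* macros */

/* Hypothetical helper: ask the kernel to attach IP_PKTINFO to received packets. */
int enable_pktinfo(int fd)
{
	int one = 1;

	return setsockopt(fd, IPPROTO_IP, IP_PKTINFO, &one, sizeof(one));
}

/*
 * Hypothetical helper: walk the control messages filled in by recvmsg()
 * and copy out the first IP_PKTINFO entry.
 * Returns 0 and fills *out on success, -1 if no such entry is present.
 */
int pktinfo_from_msg(struct msghdr *msg, struct in_pktinfo *out)
{
	struct cmsghdr *cmsg;

	assert(msg != NULL && out != NULL);
	for (cmsg = CMSG_FIRSTHDR(msg); cmsg != NULL;
	     cmsg = CMSG_NXTHDR(msg, cmsg)) {
		if (cmsg->cmsg_level == IPPROTO_IP &&
		    cmsg->cmsg_type == IP_PKTINFO) {
			/* memcpy avoids relying on the alignment of CMSG_DATA() */
			memcpy(out, CMSG_DATA(cmsg), sizeof(*out));
			return 0;
		}
	}
	return -1;
}
```

After a successful recvmsg() with a control buffer of at least CMSG_SPACE(sizeof(struct in_pktinfo)) bytes, out->ipi_ifindex gives the receiving interface and out->ipi_addr the destination address from the IP header. A packet socket does not carry this ancillary data.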
diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst
index a5c250044500..12f82da52684 100644
--- a/Documentation/networking/ip-sysctl.rst
+++ b/Documentation/networking/ip-sysctl.rst
@@ -1357,6 +1357,14 @@ mc_forwarding - BOOLEAN
 	conf/all/mc_forwarding must also be set to TRUE to enable multicast
 	routing for the interface
 
+mc_snooping - BOOLEAN
+	Enable multicast snooping on the interface. This allows any given
+	multicast group to be received without explicitly being joined.
+	The kernel needs to be compiled with CONFIG_MROUTE and/or
+	CONFIG_IPV6_MROUTE.
+	conf/all/mc_snooping must also be set to TRUE to enable multicast
+	snooping for the interface.
+
 medium_id - INTEGER
 	Integer value used to differentiate the devices by the medium they
 	are attached to. Two devices can have different id values when
diff --git a/include/linux/inetdevice.h b/include/linux/inetdevice.h
index 53aa0343bf69..071edf7d4f9c 100644
--- a/include/linux/inetdevice.h
+++ b/include/linux/inetdevice.h
@@ -95,6 +95,7 @@ static inline void ipv4_devconf_setall(struct in_device *in_dev)
 
 #define IN_DEV_FORWARD(in_dev)		IN_DEV_CONF_GET((in_dev), FORWARDING)
 #define IN_DEV_MFORWARD(in_dev)	IN_DEV_ANDCONF((in_dev), MC_FORWARDING)
+#define IN_DEV_MSNOOPING(in_dev)	IN_DEV_ANDCONF((in_dev), MC_SNOOPING)
 #define IN_DEV_BFORWARD(in_dev)	IN_DEV_ANDCONF((in_dev), BC_FORWARDING)
 #define IN_DEV_RPFILTER(in_dev)	IN_DEV_MAXCONF((in_dev), RP_FILTER)
 #define IN_DEV_SRC_VMARK(in_dev)	IN_DEV_ORCONF((in_dev), SRC_VMARK)
diff --git a/include/linux/ipv6.h b/include/linux/ipv6.h
index 70b2ad3b9884..d88c34b1b3ae 100644
--- a/include/linux/ipv6.h
+++ b/include/linux/ipv6.h
@@ -52,6 +52,7 @@ struct ipv6_devconf {
 #endif
 #ifdef CONFIG_IPV6_MROUTE
 	__s32		mc_forwarding;
+	__s32		mc_snooping;
 #endif
 	__s32		disable_ipv6;
 	__s32		drop_unicast_in_l2_multicast;
diff --git a/include/uapi/linux/ip.h b/include/uapi/linux/ip.h
index e42d13b55cf3..07956b4613d0 100644
--- a/include/uapi/linux/ip.h
+++ b/include/uapi/linux/ip.h
@@ -169,6 +169,7 @@ enum
 	IPV4_DEVCONF_DROP_UNICAST_IN_L2_MULTICAST,
 	IPV4_DEVCONF_DROP_GRATUITOUS_ARP,
 	IPV4_DEVCONF_BC_FORWARDING,
+	IPV4_DEVCONF_MC_SNOOPING,
 	__IPV4_DEVCONF_MAX
 };
 
diff --git a/include/uapi/linux/ipv6.h b/include/uapi/linux/ipv6.h
index 70603775fe91..aa9389e1c1fd 100644
--- a/include/uapi/linux/ipv6.h
+++ b/include/uapi/linux/ipv6.h
@@ -190,6 +190,7 @@ enum {
 	DEVCONF_NDISC_TCLASS,
 	DEVCONF_RPL_SEG_ENABLED,
 	DEVCONF_RA_DEFRTR_METRIC,
+	DEVCONF_MC_SNOOPING,
 	DEVCONF_MAX
 };
 
diff --git a/include/uapi/linux/netconf.h b/include/uapi/linux/netconf.h
index fac4edd55379..5259742a700b 100644
--- a/include/uapi/linux/netconf.h
+++ b/include/uapi/linux/netconf.h
@@ -19,6 +19,7 @@ enum {
 	NETCONFA_IGNORE_ROUTES_WITH_LINKDOWN,
 	NETCONFA_INPUT,
 	NETCONFA_BC_FORWARDING,
+	NETCONFA_MC_SNOOPING,
 	__NETCONFA_MAX
 };
 #define NETCONFA_MAX	(__NETCONFA_MAX - 1)
diff --git a/include/uapi/linux/sysctl.h b/include/uapi/linux/sysctl.h
index 1e05d3caa712..1b7be9dc78de 100644
--- a/include/uapi/linux/sysctl.h
+++ b/include/uapi/linux/sysctl.h
@@ -482,6 +482,7 @@ enum
 	NET_IPV4_CONF_PROMOTE_SECONDARIES=20,
 	NET_IPV4_CONF_ARP_ACCEPT=21,
 	NET_IPV4_CONF_ARP_NOTIFY=22,
+	NET_IPV4_CONF_MC_SNOOPING=23,
 };
 
 /* /proc/sys/net/ipv4/netfilter */
diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
index 50deeff48c8b..3e4ac6aead9d 100644
--- a/net/ipv4/devinet.c
+++ b/net/ipv4/devinet.c
@@ -2014,6 +2014,8 @@ static int inet_netconf_msgsize_devconf(int type)
 		size += nla_total_size(4);
 	if (all || type == NETCONFA_MC_FORWARDING)
 		size += nla_total_size(4);
+	if (all || type == NETCONFA_MC_SNOOPING)
+		size += nla_total_size(4);
 	if (all || type == NETCONFA_BC_FORWARDING)
 		size += nla_total_size(4);
 	if (all || type == NETCONFA_PROXY_NEIGH)
@@ -2062,6 +2064,10 @@ static int inet_netconf_fill_devconf(struct sk_buff *skb, int ifindex,
 	    nla_put_s32(skb, NETCONFA_MC_FORWARDING,
 			IPV4_DEVCONF(*devconf, MC_FORWARDING)) < 0)
 		goto nla_put_failure;
+	if ((all || type == NETCONFA_MC_SNOOPING) &&
+	    nla_put_s32(skb, NETCONFA_MC_SNOOPING,
+			IPV4_DEVCONF(*devconf, NETCONFA_MC_SNOOPING)) < 0)
+		goto nla_put_failure;
 	if ((all || type == NETCONFA_BC_FORWARDING) &&
 	    nla_put_s32(skb, NETCONFA_BC_FORWARDING,
 			IPV4_DEVCONF(*devconf, BC_FORWARDING)) < 0)
@@ -2506,6 +2512,7 @@ static struct devinet_sysctl_table {
 		DEVINET_SYSCTL_COMPLEX_ENTRY(FORWARDING, "forwarding",
 					     devinet_sysctl_forward),
 		DEVINET_SYSCTL_RO_ENTRY(MC_FORWARDING, "mc_forwarding"),
+		DEVINET_SYSCTL_RW_ENTRY(MC_SNOOPING, "mc_snooping"),
 		DEVINET_SYSCTL_RW_ENTRY(BC_FORWARDING, "bc_forwarding"),
 
 		DEVINET_SYSCTL_RW_ENTRY(ACCEPT_REDIRECTS, "accept_redirects"),
diff --git a/net/ipv4/igmp.c b/net/ipv4/igmp.c
index 7b272bbed2b4..cd5a837dfb0c 100644
--- a/net/ipv4/igmp.c
+++ b/net/ipv4/igmp.c
@@ -2692,6 +2692,11 @@ int ip_check_mc_rcu(struct in_device *in_dev, __be32 mc_addr, __be32 src_addr, u
 	struct ip_sf_list *psf;
 	int rv = 0;
 
+#ifdef CONFIG_IP_MROUTE
+	if (IN_DEV_MSNOOPING(in_dev))
+		return 1;
+#endif /* CONFIG_IP_MROUTE */
+
 	mc_hash = rcu_dereference(in_dev->mc_hash);
 	if (mc_hash) {
 		u32 hash = hash_32((__force u32)mc_addr, MC_HASH_SZ_LOG);
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 048570900fdf..b92ac4e8f37d 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -502,6 +502,8 @@ static int inet6_netconf_msgsize_devconf(int type)
 #ifdef CONFIG_IPV6_MROUTE
 	if (all || type == NETCONFA_MC_FORWARDING)
 		size += nla_total_size(4);
+	if (all || type == NETCONFA_MC_SNOOPING)
+		size += nla_total_size(4);
 #endif
 	if (all || type == NETCONFA_PROXY_NEIGH)
 		size += nla_total_size(4);
@@ -546,6 +548,10 @@ static int inet6_netconf_fill_devconf(struct sk_buff *skb, int ifindex,
 	    nla_put_s32(skb, NETCONFA_MC_FORWARDING,
 			devconf->mc_forwarding) < 0)
 		goto nla_put_failure;
+	if ((all || type == NETCONFA_MC_SNOOPING) &&
+	    nla_put_s32(skb, NETCONFA_MC_SNOOPING,
+			devconf->mc_snooping) < 0)
+		goto nla_put_failure;
 #endif
 	if ((all || type == NETCONFA_PROXY_NEIGH) &&
 	    nla_put_s32(skb, NETCONFA_PROXY_NEIGH, devconf->proxy_ndp) < 0)
@@ -5503,6 +5509,7 @@ static inline void ipv6_store_devconf(struct ipv6_devconf *cnf,
 #endif
 #ifdef CONFIG_IPV6_MROUTE
 	array[DEVCONF_MC_FORWARDING] = cnf->mc_forwarding;
+	array[DEVCONF_MC_SNOOPING] = cnf->mc_snooping;
 #endif
 	array[DEVCONF_DISABLE_IPV6] = cnf->disable_ipv6;
 	array[DEVCONF_ACCEPT_DAD] = cnf->accept_dad;
@@ -6786,6 +6793,13 @@ static const struct ctl_table addrconf_sysctl[] = {
 		.mode		= 0444,
 		.proc_handler	= proc_dointvec,
 	},
+	{
+		.procname	= "mc_snooping",
+		.data		= &ipv6_devconf.mc_snooping,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec,
+	},
 #endif
 	{
 		.procname	= "disable_ipv6",
diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c
index 54ec163fbafa..25046ee8276f 100644
--- a/net/ipv6/mcast.c
+++ b/net/ipv6/mcast.c
@@ -1013,6 +1013,11 @@ bool ipv6_chk_mcast_addr(struct net_device *dev, const struct in6_addr *group,
 	struct ifmcaddr6 *mc;
 	bool rv = false;
 
+#ifdef CONFIG_IPV6_MROUTE
+	if (dev_net(dev)->ipv6.devconf_all->mc_snooping)
+		return true;
+#endif
+
 	rcu_read_lock();
 	idev = __in6_dev_get(dev);
 	if (idev) {
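For illustration only, this is how user space might toggle the proposed knob, assuming a kernel with this patch applied; the sysctl does not exist otherwise. The documentation hunk says both conf/all/mc_snooping and the per-interface entry must be set, so a caller would write to both. The helper names mc_snooping_path() and write_mc_snooping() are invented for this sketch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: build the procfs path of the proposed sysctl.
 * family is "ipv4" or "ipv6"; ifname is an interface name or "all".
 * Returns 0 on success, -1 if the path does not fit in buf.
 */
int mc_snooping_path(char *buf, size_t len, const char *family,
		     const char *ifname)
{
	int n;

	assert(buf != NULL && family != NULL && ifname != NULL);
	n = snprintf(buf, len, "/proc/sys/net/%s/conf/%s/mc_snooping",
		     family, ifname);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Hypothetical helper: write 0 or 1 to the proposed sysctl.
 * Fails with -1 on kernels without the patch (the file is absent)
 * or when the caller lacks permission to write it.
 */
int write_mc_snooping(const char *family, const char *ifname, int enable)
{
	char path[128];
	FILE *f;
	int ok;

	if (mc_snooping_path(path, sizeof(path), family, ifname) < 0)
		return -1;
	f = fopen(path, "w");
	if (f == NULL)
		return -1;
	ok = fprintf(f, "%d\n", enable ? 1 : 0) > 0;
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}
```

Under these assumptions, enabling snooping on eth0 for IPv4 would be write_mc_snooping("ipv4", "all", 1) followed by write_mc_snooping("ipv4", "eth0", 1). Note that the IPv6 hunk in net/ipv6/mcast.c consults only devconf_all, so there the per-interface value has no effect on the check as posted.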