mbox series

[V4,net-next,0/5] add support for RFC 8335 PROBE

Message ID cover.1615738431.git.andreas.a.roeseler@gmail.com
Headers show
Series add support for RFC 8335 PROBE | expand

Message

Andreas Roeseler March 14, 2021, 4:48 p.m. UTC
The popular utility ping has several severe limitations, such as the
inability to query specific interfaces on a node and requiring
bidirectional connectivity between the probing and probed interfaces.
RFC 8335 attempts to solve these limitations by creating the new utility
PROBE which is a specialized ICMP message that makes use of the ICMP
Extension Structure outlined in RFC 4884.

This patchset adds definitions for the ICMP Extended Echo Request and
Reply (PROBE) types for both IPV4 and IPV6, adds a sysctl to enable
responses to PROBE messages, expands the list of supported ICMP messages
to accommodate PROBE types, and adds functionality to respond to PROBE
requests.

Changes:
v1 -> v2:
 - Add AFI definitions
 - Switch to functions such as dev_get_by_name and ip_dev_find to lookup
   net devices

v2 -> v3:
Suggested by Willem de Bruijn <willemdebruijn.kernel@gmail.com>
 - Add verification of incoming messages before looking up netdev
 - Add prefix for PROBE specific defined variables
 - Use proc_dointvec_minmax with zero and one for sysctl
 - Create struct icmp_ext_echo_iio for parsing incoming packets
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
 - Include net/addrconf.h library for ipv6_dev_find

v3 -> v4:
 - Use in_addr instead of __be32 for storing IPV4 addresses
 - Use IFNAMSIZ to statically allocate space for name in
   icmp_ext_echo_iio
Suggested by Willem de Bruijn <willemdebruijn.kernel@gmail.com>
 - Use skb_header_pointer to verify fields in incoming message
 - Add check to ensure that extobj_hdr.length is valid
 - Check to ensure object payload is padded with ASCII NULL characters
   when probing by name, as specified by RFC 8335
 - Statically allocate buff using IFNAMSIZ
 - Add rcu blocking around ipv6_dev_find
 - Use __in_dev_get_rcu to access IPV4 addresses of identified
   net_device
 - Remove check for ICMPV6 PROBE types

Andreas Roeseler (5):
  icmp: add support for RFC 8335 PROBE
  ICMPV6: add support for RFC 8335 PROBE
  net: add sysctl for enabling RFC 8335 PROBE messages
  net: add support for sending RFC 8335 PROBE messages
  icmp: add response to RFC 8335 PROBE messages

 include/net/netns/ipv4.h    |   1 +
 include/uapi/linux/icmp.h   |  42 +++++++++++
 include/uapi/linux/icmpv6.h |   3 +
 net/ipv4/icmp.c             | 145 ++++++++++++++++++++++++++++++++----
 net/ipv4/ping.c             |   4 +-
 net/ipv4/sysctl_net_ipv4.c  |   9 +++
 6 files changed, 188 insertions(+), 16 deletions(-)

Comments

Willem de Bruijn March 15, 2021, 3:50 p.m. UTC | #1
On Sun, Mar 14, 2021 at 12:50 PM Andreas Roeseler
<andreas.a.roeseler@gmail.com> wrote:
>

> Modify the icmp_rcv function to check PROBE messages and call icmp_echo

> if a PROBE request is detected.

>

> Modify the existing icmp_echo function to respond ot both ping and PROBE

> requests.

>

> This was tested using a custom modification to the iputils package and

> wireshark. It supports IPV4 probing by name, ifindex, and probing by

> both IPV4 and IPV6 addresses. It currently does not support responding

> to probes off the proxy node (see RFC 8335 Section 2).


If you happen to use github or something similar, if you don't mind
sharing the code, you could clone the iputils repo and publish the
changes. No pressure.

>  net/ipv4/icmp.c | 145 +++++++++++++++++++++++++++++++++++++++++++-----

>  1 file changed, 130 insertions(+), 15 deletions(-)

>

> diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c

> index 616e2dc1c8fa..f1530011b7bc 100644

> --- a/net/ipv4/icmp.c

> +++ b/net/ipv4/icmp.c

> @@ -93,6 +93,10 @@

>  #include <net/ip_fib.h>

>  #include <net/l3mdev.h>

>

> +#if IS_ENABLED(CONFIG_IPV6)

> +#include <net/addrconf.h>

> +#endif


I don't think the conditional is needed (?)

>  static bool icmp_echo(struct sk_buff *skb)

>  {

> +       struct icmp_ext_hdr *ext_hdr, _ext_hdr;

> +       struct icmp_ext_echo_iio *iio, _iio;

> +       struct icmp_bxm icmp_param;

> +       struct net_device *dev;

>         struct net *net;

> +       u16 ident_len;

> +       char *buff;

> +       u8 status;

>

>         net = dev_net(skb_dst(skb)->dev);

> -       if (!net->ipv4.sysctl_icmp_echo_ignore_all) {

> -               struct icmp_bxm icmp_param;

> -

> -               icmp_param.data.icmph      = *icmp_hdr(skb);

> -               icmp_param.data.icmph.type = ICMP_ECHOREPLY;

> -               icmp_param.skb             = skb;

> -               icmp_param.offset          = 0;

> -               icmp_param.data_len        = skb->len;

> -               icmp_param.head_len        = sizeof(struct icmphdr);

> -               icmp_reply(&icmp_param, skb);

> -       }

>         /* should there be an ICMP stat for ignored echos? */

> -       return true;

> +       if (net->ipv4.sysctl_icmp_echo_ignore_all)

> +               return true;

> +

> +       icmp_param.data.icmph      = *icmp_hdr(skb);

> +       icmp_param.skb             = skb;

> +       icmp_param.offset          = 0;

> +       icmp_param.data_len        = skb->len;

> +       icmp_param.head_len        = sizeof(struct icmphdr);

> +

> +       if (icmp_param.data.icmph.type == ICMP_ECHO)

> +               goto send_reply;


Is this path now missing

               icmp_param.data.icmph.type = ICMP_ECHOREPLY;

> +       if (!net->ipv4.sysctl_icmp_echo_enable_probe)

> +               return true;

> +       /* We currently only support probing interfaces on the proxy node

> +        * Check to ensure L-bit is set

> +        */

> +       if (!(ntohs(icmp_param.data.icmph.un.echo.sequence) & 1))

> +               return true;

> +       /* Clear status bits in reply message */

> +       icmp_param.data.icmph.un.echo.sequence &= htons(0xFF00);

> +       icmp_param.data.icmph.type = ICMP_EXT_ECHOREPLY;

> +       ext_hdr = skb_header_pointer(skb, 0, sizeof(_ext_hdr), &_ext_hdr);

> +       iio = skb_header_pointer(skb, sizeof(_ext_hdr), sizeof(_iio), &_iio);

> +       if (!ext_hdr || !iio)

> +               goto send_mal_query;

> +       if (ntohs(iio->extobj_hdr.length) <= sizeof(iio->extobj_hdr))

> +               goto send_mal_query;

> +       ident_len = ntohs(iio->extobj_hdr.length) - sizeof(iio->extobj_hdr);

> +       status = 0;

> +       dev = NULL;

> +       switch (iio->extobj_hdr.class_type) {

> +       case EXT_ECHO_CTYPE_NAME:

> +               if (ident_len >= IFNAMSIZ)

> +                       goto send_mal_query;

> +               buff = kcalloc(IFNAMSIZ, sizeof(char), GFP_KERNEL);

> +               if (!buff)

> +                       return -ENOMEM;

> +               memcpy(buff, &iio->ident.name, ident_len);

> +               /* RFC 8335 2.1 If the Object Payload would not otherwise terminate

> +                * on a 32-bit boundary, it MUST be padded with ASCII NULL characters

> +                */

> +               if (ident_len % sizeof(u32) != 0) {

> +                       u8 i;

> +

> +                       for (i = ident_len; i % sizeof(u32) != 0; i++) {

> +                               if (buff[i] != '\0')

> +                                       goto send_mal_query;


Memory leak. IFNAMSIZ is small enough that you can use on-stack allocation

Also, I think you can ignore if there are non-zero bytes beyond the
len. We need to safely parse to avoid integrity bugs in the kernel.
Beyond that, it's fine to be strict about what you send, liberal what
you accept.

> +                       }

> +               }

> +               dev = dev_get_by_name(net, buff);

> +               kfree(buff);

> +               break;

> +       case EXT_ECHO_CTYPE_INDEX:

> +               if (ident_len != sizeof(iio->ident.ifindex))

> +                       goto send_mal_query;

> +               dev = dev_get_by_index(net, ntohl(iio->ident.ifindex));

> +               break;

> +       case EXT_ECHO_CTYPE_ADDR:

> +               if (ident_len != sizeof(iio->ident.addr.ctype3_hdr) + iio->ident.addr.ctype3_hdr.addrlen)

> +                       goto send_mal_query;

> +               switch (ntohs(iio->ident.addr.ctype3_hdr.afi)) {

> +               case ICMP_AFI_IP:

> +                       if (ident_len != sizeof(iio->ident.addr.ctype3_hdr) + sizeof(struct in_addr))

> +                               goto send_mal_query;

> +                       dev = ip_dev_find(net, iio->ident.addr.ip_addr.ipv4_addr.s_addr);

> +                       break;

> +#if IS_ENABLED(CONFIG_IPV6)

> +               case ICMP_AFI_IP6:

> +                       if (ident_len != sizeof(iio->ident.addr.ctype3_hdr) + sizeof(struct in6_addr))

> +                               goto send_mal_query;

> +                       rcu_read_lock();

> +                       dev = ipv6_dev_find(net, &iio->ident.addr.ip_addr.ipv6_addr, dev);

> +                       if (dev)

> +                               dev_hold(dev);

> +                       rcu_read_unlock();

> +                       break;

> +#endif

> +               default:

> +                       goto send_mal_query;

> +               }

> +               break;

> +       default:

> +               goto send_mal_query;

> +       }

> +       if (!dev) {

> +               icmp_param.data.icmph.code = ICMP_EXT_NO_IF;

> +               goto send_reply;

> +       }

> +       /* Fill bits in reply message */

> +       if (dev->flags & IFF_UP)

> +               status |= EXT_ECHOREPLY_ACTIVE;

> +       if (__in_dev_get_rcu(dev) && __in_dev_get_rcu(dev)->ifa_list)

> +               status |= EXT_ECHOREPLY_IPV4;

> +       if (!list_empty(&dev->ip6_ptr->addr_list))

> +               status |= EXT_ECHOREPLY_IPV6;

> +       dev_put(dev);

> +       icmp_param.data.icmph.un.echo.sequence |= htons(status);

> +send_reply:

> +       icmp_reply(&icmp_param, skb);

> +               return true;

> +send_mal_query:

> +       icmp_param.data.icmph.code = ICMP_EXT_MAL_QUERY;

> +       goto send_reply;

>  }

>

>  /*

> @@ -1088,6 +1192,11 @@ int icmp_rcv(struct sk_buff *skb)

>         icmph = icmp_hdr(skb);

>

>         ICMPMSGIN_INC_STATS(net, icmph->type);

> +

> +       /* Check for ICMP Extended Echo (PROBE) messages */

> +       if (icmph->type == ICMP_EXT_ECHO)

> +               goto probe;

> +

>         /*

>          *      18 is the highest 'known' ICMP type. Anything else is a mystery

>          *

> @@ -1097,7 +1206,6 @@ int icmp_rcv(struct sk_buff *skb)

>         if (icmph->type > NR_ICMP_TYPES)

>                 goto error;

>

> -

>         /*

>          *      Parse the ICMP message

>          */

> @@ -1123,7 +1231,7 @@ int icmp_rcv(struct sk_buff *skb)

>         }

>

>         success = icmp_pointers[icmph->type].handler(skb);

> -

> +success_check:

>         if (success)  {

>                 consume_skb(skb);

>                 return NET_RX_SUCCESS;

> @@ -1137,6 +1245,12 @@ int icmp_rcv(struct sk_buff *skb)

>  error:

>         __ICMP_INC_STATS(net, ICMP_MIB_INERRORS);

>         goto drop;

> +probe:

> +       /* We can't use icmp_pointers[].handler() because it is an array of

> +        * size NR_ICMP_TYPES + 1 (19 elements) and PROBE has code 42.

> +        */

> +       success = icmp_echo(skb);

> +       goto success_check;


just make this a branch instead of
Andreas Roeseler March 15, 2021, 7:09 p.m. UTC | #2
On Mon, 2021-03-15 at 11:50 -0400, Willem de Bruijn wrote:
> On Sun, Mar 14, 2021 at 12:50 PM Andreas Roeseler

> <andreas.a.roeseler@gmail.com> wrote:

> > 

> > Modify the icmp_rcv function to check PROBE messages and call

> > icmp_echo

> > if a PROBE request is detected.

> > 

> > Modify the existing icmp_echo function to respond ot both ping and

> > PROBE

> > requests.

> > 

> > This was tested using a custom modification to the iputils package

> > and

> > wireshark. It supports IPV4 probing by name, ifindex, and probing

> > by

> > both IPV4 and IPV6 addresses. It currently does not support

> > responding

> > to probes off the proxy node (see RFC 8335 Section 2).

> 

> If you happen to use github or something similar, if you don't mind

> sharing the code, you could clone the iputils repo and publish the

> changes. No pressure.


Should I include the link to the github repo in the patch?

> 

> >  net/ipv4/icmp.c | 145 +++++++++++++++++++++++++++++++++++++++++++-

> > ----

> >  1 file changed, 130 insertions(+), 15 deletions(-)

> > 

> > diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c

> > index 616e2dc1c8fa..f1530011b7bc 100644

> > --- a/net/ipv4/icmp.c

> > +++ b/net/ipv4/icmp.c

> > @@ -93,6 +93,10 @@

> >  #include <net/ip_fib.h>

> >  #include <net/l3mdev.h>

> > 

> > +#if IS_ENABLED(CONFIG_IPV6)

> > +#include <net/addrconf.h>

> > +#endif

> 

> I don't think the conditional is needed (?)

> 

> >  static bool icmp_echo(struct sk_buff *skb)

> >  {

> > +       struct icmp_ext_hdr *ext_hdr, _ext_hdr;

> > +       struct icmp_ext_echo_iio *iio, _iio;

> > +       struct icmp_bxm icmp_param;

> > +       struct net_device *dev;

> >         struct net *net;

> > +       u16 ident_len;

> > +       char *buff;

> > +       u8 status;

> > 

> >         net = dev_net(skb_dst(skb)->dev);

> > -       if (!net->ipv4.sysctl_icmp_echo_ignore_all) {

> > -               struct icmp_bxm icmp_param;

> > -

> > -               icmp_param.data.icmph      = *icmp_hdr(skb);

> > -               icmp_param.data.icmph.type = ICMP_ECHOREPLY;

> > -               icmp_param.skb             = skb;

> > -               icmp_param.offset          = 0;

> > -               icmp_param.data_len        = skb->len;

> > -               icmp_param.head_len        = sizeof(struct

> > icmphdr);

> > -               icmp_reply(&icmp_param, skb);

> > -       }

> >         /* should there be an ICMP stat for ignored echos? */

> > -       return true;

> > +       if (net->ipv4.sysctl_icmp_echo_ignore_all)

> > +               return true;

> > +

> > +       icmp_param.data.icmph      = *icmp_hdr(skb);

> > +       icmp_param.skb             = skb;

> > +       icmp_param.offset          = 0;

> > +       icmp_param.data_len        = skb->len;

> > +       icmp_param.head_len        = sizeof(struct icmphdr);

> > +

> > +       if (icmp_param.data.icmph.type == ICMP_ECHO)

> > +               goto send_reply;

> 

> Is this path now missing

> 

>                icmp_param.data.icmph.type = ICMP_ECHOREPLY;

> 

> > +       if (!net->ipv4.sysctl_icmp_echo_enable_probe)

> > +               return true;

> > +       /* We currently only support probing interfaces on the

> > proxy node

> > +        * Check to ensure L-bit is set

> > +        */

> > +       if (!(ntohs(icmp_param.data.icmph.un.echo.sequence) & 1))

> > +               return true;

> > +       /* Clear status bits in reply message */

> > +       icmp_param.data.icmph.un.echo.sequence &= htons(0xFF00);

> > +       icmp_param.data.icmph.type = ICMP_EXT_ECHOREPLY;

> > +       ext_hdr = skb_header_pointer(skb, 0, sizeof(_ext_hdr),

> > &_ext_hdr);

> > +       iio = skb_header_pointer(skb, sizeof(_ext_hdr),

> > sizeof(_iio), &_iio);

> > +       if (!ext_hdr || !iio)

> > +               goto send_mal_query;

> > +       if (ntohs(iio->extobj_hdr.length) <= sizeof(iio-

> > >extobj_hdr))

> > +               goto send_mal_query;

> > +       ident_len = ntohs(iio->extobj_hdr.length) - sizeof(iio-

> > >extobj_hdr);

> > +       status = 0;

> > +       dev = NULL;

> > +       switch (iio->extobj_hdr.class_type) {

> > +       case EXT_ECHO_CTYPE_NAME:

> > +               if (ident_len >= IFNAMSIZ)

> > +                       goto send_mal_query;

> > +               buff = kcalloc(IFNAMSIZ, sizeof(char), GFP_KERNEL);

> > +               if (!buff)

> > +                       return -ENOMEM;

> > +               memcpy(buff, &iio->ident.name, ident_len);

> > +               /* RFC 8335 2.1 If the Object Payload would not

> > otherwise terminate

> > +                * on a 32-bit boundary, it MUST be padded with

> > ASCII NULL characters

> > +                */

> > +               if (ident_len % sizeof(u32) != 0) {

> > +                       u8 i;

> > +

> > +                       for (i = ident_len; i % sizeof(u32) != 0;

> > i++) {

> > +                               if (buff[i] != '\0')

> > +                                       goto send_mal_query;

> 

> Memory leak. IFNAMSIZ is small enough that you can use on-stack

> allocation

> 

> Also, I think you can ignore if there are non-zero bytes beyond the

> len. We need to safely parse to avoid integrity bugs in the kernel.

> Beyond that, it's fine to be strict about what you send, liberal what

> you accept.


This checks that the incoming request is padded to the nearest 32-bit
boundary by ASCII NULL characters, as specified by RFC 8335

> 

> > +                       }

> > +               }

> > +               dev = dev_get_by_name(net, buff);

> > +               kfree(buff);

> > +               break;

> > +       case EXT_ECHO_CTYPE_INDEX:

> > +               if (ident_len != sizeof(iio->ident.ifindex))

> > +                       goto send_mal_query;

> > +               dev = dev_get_by_index(net, ntohl(iio-

> > >ident.ifindex));

> > +               break;

> > +       case EXT_ECHO_CTYPE_ADDR:

> > +               if (ident_len != sizeof(iio->ident.addr.ctype3_hdr)

> > + iio->ident.addr.ctype3_hdr.addrlen)

> > +                       goto send_mal_query;

> > +               switch (ntohs(iio->ident.addr.ctype3_hdr.afi)) {

> > +               case ICMP_AFI_IP:

> > +                       if (ident_len != sizeof(iio-

> > >ident.addr.ctype3_hdr) + sizeof(struct in_addr))

> > +                               goto send_mal_query;

> > +                       dev = ip_dev_find(net, iio-

> > >ident.addr.ip_addr.ipv4_addr.s_addr);

> > +                       break;

> > +#if IS_ENABLED(CONFIG_IPV6)

> > +               case ICMP_AFI_IP6:

> > +                       if (ident_len != sizeof(iio-

> > >ident.addr.ctype3_hdr) + sizeof(struct in6_addr))

> > +                               goto send_mal_query;

> > +                       rcu_read_lock();

> > +                       dev = ipv6_dev_find(net, &iio-

> > >ident.addr.ip_addr.ipv6_addr, dev);

> > +                       if (dev)

> > +                               dev_hold(dev);

> > +                       rcu_read_unlock();

> > +                       break;

> > +#endif

> > +               default:

> > +                       goto send_mal_query;

> > +               }

> > +               break;

> > +       default:

> > +               goto send_mal_query;

> > +       }

> > +       if (!dev) {

> > +               icmp_param.data.icmph.code = ICMP_EXT_NO_IF;

> > +               goto send_reply;

> > +       }

> > +       /* Fill bits in reply message */

> > +       if (dev->flags & IFF_UP)

> > +               status |= EXT_ECHOREPLY_ACTIVE;

> > +       if (__in_dev_get_rcu(dev) && __in_dev_get_rcu(dev)-

> > >ifa_list)

> > +               status |= EXT_ECHOREPLY_IPV4;

> > +       if (!list_empty(&dev->ip6_ptr->addr_list))

> > +               status |= EXT_ECHOREPLY_IPV6;

> > +       dev_put(dev);

> > +       icmp_param.data.icmph.un.echo.sequence |= htons(status);

> > +send_reply:

> > +       icmp_reply(&icmp_param, skb);

> > +               return true;

> > +send_mal_query:

> > +       icmp_param.data.icmph.code = ICMP_EXT_MAL_QUERY;

> > +       goto send_reply;

> >  }

> > 

> >  /*

> > @@ -1088,6 +1192,11 @@ int icmp_rcv(struct sk_buff *skb)

> >         icmph = icmp_hdr(skb);

> > 

> >         ICMPMSGIN_INC_STATS(net, icmph->type);

> > +

> > +       /* Check for ICMP Extended Echo (PROBE) messages */

> > +       if (icmph->type == ICMP_EXT_ECHO)

> > +               goto probe;

> > +

> >         /*

> >          *      18 is the highest 'known' ICMP type. Anything else

> > is a mystery

> >          *

> > @@ -1097,7 +1206,6 @@ int icmp_rcv(struct sk_buff *skb)

> >         if (icmph->type > NR_ICMP_TYPES)

> >                 goto error;

> > 

> > -

> >         /*

> >          *      Parse the ICMP message

> >          */

> > @@ -1123,7 +1231,7 @@ int icmp_rcv(struct sk_buff *skb)

> >         }

> > 

> >         success = icmp_pointers[icmph->type].handler(skb);

> > -

> > +success_check:

> >         if (success)  {

> >                 consume_skb(skb);

> >                 return NET_RX_SUCCESS;

> > @@ -1137,6 +1245,12 @@ int icmp_rcv(struct sk_buff *skb)

> >  error:

> >         __ICMP_INC_STATS(net, ICMP_MIB_INERRORS);

> >         goto drop;

> > +probe:

> > +       /* We can't use icmp_pointers[].handler() because it is an

> > array of

> > +        * size NR_ICMP_TYPES + 1 (19 elements) and PROBE has code

> > 42.

> > +        */

> > +       success = icmp_echo(skb);

> > +       goto success_check;

> 

> just make this a branch instead of


Could you clarify this comment? Do you mean move this code to the
section where we check for PROBE messages instead of jumping to probe
and then jumping to success_check?
Willem de Bruijn March 15, 2021, 7:34 p.m. UTC | #3
On Mon, Mar 15, 2021 at 3:10 PM Andreas Roeseler
<andreas.a.roeseler@gmail.com> wrote:
>

> On Mon, 2021-03-15 at 11:50 -0400, Willem de Bruijn wrote:

> > On Sun, Mar 14, 2021 at 12:50 PM Andreas Roeseler

> > <andreas.a.roeseler@gmail.com> wrote:

> > >

> > > Modify the icmp_rcv function to check PROBE messages and call

> > > icmp_echo

> > > if a PROBE request is detected.

> > >

> > > Modify the existing icmp_echo function to respond ot both ping and

> > > PROBE

> > > requests.

> > >

> > > This was tested using a custom modification to the iputils package

> > > and

> > > wireshark. It supports IPV4 probing by name, ifindex, and probing

> > > by

> > > both IPV4 and IPV6 addresses. It currently does not support

> > > responding

> > > to probes off the proxy node (see RFC 8335 Section 2).

> >

> > If you happen to use github or something similar, if you don't mind

> > sharing the code, you could clone the iputils repo and publish the

> > changes. No pressure.

>

> Should I include the link to the github repo in the patch?


If you don't mind, please do. It can serve as example for others to
use the feature, too.

>

> >

> > >  net/ipv4/icmp.c | 145 +++++++++++++++++++++++++++++++++++++++++++-

> > > ----

> > >  1 file changed, 130 insertions(+), 15 deletions(-)

> > >

> > > diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c

> > > index 616e2dc1c8fa..f1530011b7bc 100644

> > > --- a/net/ipv4/icmp.c

> > > +++ b/net/ipv4/icmp.c

> > > @@ -93,6 +93,10 @@

> > >  #include <net/ip_fib.h>

> > >  #include <net/l3mdev.h>

> > >

> > > +#if IS_ENABLED(CONFIG_IPV6)

> > > +#include <net/addrconf.h>

> > > +#endif

> >

> > I don't think the conditional is needed (?)

> >

> > >  static bool icmp_echo(struct sk_buff *skb)

> > >  {

> > > +       struct icmp_ext_hdr *ext_hdr, _ext_hdr;

> > > +       struct icmp_ext_echo_iio *iio, _iio;

> > > +       struct icmp_bxm icmp_param;

> > > +       struct net_device *dev;

> > >         struct net *net;

> > > +       u16 ident_len;

> > > +       char *buff;

> > > +       u8 status;

> > >

> > >         net = dev_net(skb_dst(skb)->dev);

> > > -       if (!net->ipv4.sysctl_icmp_echo_ignore_all) {

> > > -               struct icmp_bxm icmp_param;

> > > -

> > > -               icmp_param.data.icmph      = *icmp_hdr(skb);

> > > -               icmp_param.data.icmph.type = ICMP_ECHOREPLY;

> > > -               icmp_param.skb             = skb;

> > > -               icmp_param.offset          = 0;

> > > -               icmp_param.data_len        = skb->len;

> > > -               icmp_param.head_len        = sizeof(struct

> > > icmphdr);

> > > -               icmp_reply(&icmp_param, skb);

> > > -       }

> > >         /* should there be an ICMP stat for ignored echos? */

> > > -       return true;

> > > +       if (net->ipv4.sysctl_icmp_echo_ignore_all)

> > > +               return true;

> > > +

> > > +       icmp_param.data.icmph      = *icmp_hdr(skb);

> > > +       icmp_param.skb             = skb;

> > > +       icmp_param.offset          = 0;

> > > +       icmp_param.data_len        = skb->len;

> > > +       icmp_param.head_len        = sizeof(struct icmphdr);

> > > +

> > > +       if (icmp_param.data.icmph.type == ICMP_ECHO)

> > > +               goto send_reply;

> >

> > Is this path now missing

> >

> >                icmp_param.data.icmph.type = ICMP_ECHOREPLY;

> >

> > > +       if (!net->ipv4.sysctl_icmp_echo_enable_probe)

> > > +               return true;

> > > +       /* We currently only support probing interfaces on the

> > > proxy node

> > > +        * Check to ensure L-bit is set

> > > +        */

> > > +       if (!(ntohs(icmp_param.data.icmph.un.echo.sequence) & 1))

> > > +               return true;

> > > +       /* Clear status bits in reply message */

> > > +       icmp_param.data.icmph.un.echo.sequence &= htons(0xFF00);

> > > +       icmp_param.data.icmph.type = ICMP_EXT_ECHOREPLY;

> > > +       ext_hdr = skb_header_pointer(skb, 0, sizeof(_ext_hdr),

> > > &_ext_hdr);

> > > +       iio = skb_header_pointer(skb, sizeof(_ext_hdr),

> > > sizeof(_iio), &_iio);

> > > +       if (!ext_hdr || !iio)

> > > +               goto send_mal_query;

> > > +       if (ntohs(iio->extobj_hdr.length) <= sizeof(iio-

> > > >extobj_hdr))

> > > +               goto send_mal_query;

> > > +       ident_len = ntohs(iio->extobj_hdr.length) - sizeof(iio-

> > > >extobj_hdr);

> > > +       status = 0;

> > > +       dev = NULL;

> > > +       switch (iio->extobj_hdr.class_type) {

> > > +       case EXT_ECHO_CTYPE_NAME:

> > > +               if (ident_len >= IFNAMSIZ)

> > > +                       goto send_mal_query;

> > > +               buff = kcalloc(IFNAMSIZ, sizeof(char), GFP_KERNEL);

> > > +               if (!buff)

> > > +                       return -ENOMEM;

> > > +               memcpy(buff, &iio->ident.name, ident_len);

> > > +               /* RFC 8335 2.1 If the Object Payload would not

> > > otherwise terminate

> > > +                * on a 32-bit boundary, it MUST be padded with

> > > ASCII NULL characters

> > > +                */

> > > +               if (ident_len % sizeof(u32) != 0) {

> > > +                       u8 i;

> > > +

> > > +                       for (i = ident_len; i % sizeof(u32) != 0;

> > > i++) {

> > > +                               if (buff[i] != '\0')

> > > +                                       goto send_mal_query;

> >

> > Memory leak. IFNAMSIZ is small enough that you can use on-stack

> > allocation

> >

> > Also, I think you can ignore if there are non-zero bytes beyond the

> > len. We need to safely parse to avoid integrity bugs in the kernel.

> > Beyond that, it's fine to be strict about what you send, liberal what

> > you accept.

>

> This checks that the incoming request is padded to the nearest 32-bit

> boundary by ASCII NULL characters, as specified by RFC 8335


Right. The question is whether you need to be strict in enforcing
that. Perhaps the RFC states that explicitly. Else, it's up to your
interpretation. I suggested the robustness principle, which is
commonly employed in such instances.

> >

> > > +                       }

> > > +               }

> > > +               dev = dev_get_by_name(net, buff);

> > > +               kfree(buff);

> > > +               break;

> > > +       case EXT_ECHO_CTYPE_INDEX:

> > > +               if (ident_len != sizeof(iio->ident.ifindex))

> > > +                       goto send_mal_query;

> > > +               dev = dev_get_by_index(net, ntohl(iio-

> > > >ident.ifindex));

> > > +               break;

> > > +       case EXT_ECHO_CTYPE_ADDR:

> > > +               if (ident_len != sizeof(iio->ident.addr.ctype3_hdr)

> > > + iio->ident.addr.ctype3_hdr.addrlen)

> > > +                       goto send_mal_query;

> > > +               switch (ntohs(iio->ident.addr.ctype3_hdr.afi)) {

> > > +               case ICMP_AFI_IP:

> > > +                       if (ident_len != sizeof(iio-

> > > >ident.addr.ctype3_hdr) + sizeof(struct in_addr))

> > > +                               goto send_mal_query;

> > > +                       dev = ip_dev_find(net, iio-

> > > >ident.addr.ip_addr.ipv4_addr.s_addr);

> > > +                       break;

> > > +#if IS_ENABLED(CONFIG_IPV6)

> > > +               case ICMP_AFI_IP6:

> > > +                       if (ident_len != sizeof(iio-

> > > >ident.addr.ctype3_hdr) + sizeof(struct in6_addr))

> > > +                               goto send_mal_query;

> > > +                       rcu_read_lock();

> > > +                       dev = ipv6_dev_find(net, &iio-

> > > >ident.addr.ip_addr.ipv6_addr, dev);

> > > +                       if (dev)

> > > +                               dev_hold(dev);

> > > +                       rcu_read_unlock();

> > > +                       break;

> > > +#endif

> > > +               default:

> > > +                       goto send_mal_query;

> > > +               }

> > > +               break;

> > > +       default:

> > > +               goto send_mal_query;

> > > +       }

> > > +       if (!dev) {

> > > +               icmp_param.data.icmph.code = ICMP_EXT_NO_IF;

> > > +               goto send_reply;

> > > +       }

> > > +       /* Fill bits in reply message */

> > > +       if (dev->flags & IFF_UP)

> > > +               status |= EXT_ECHOREPLY_ACTIVE;

> > > +       if (__in_dev_get_rcu(dev) && __in_dev_get_rcu(dev)-

> > > >ifa_list)

> > > +               status |= EXT_ECHOREPLY_IPV4;

> > > +       if (!list_empty(&dev->ip6_ptr->addr_list))

> > > +               status |= EXT_ECHOREPLY_IPV6;

> > > +       dev_put(dev);

> > > +       icmp_param.data.icmph.un.echo.sequence |= htons(status);

> > > +send_reply:

> > > +       icmp_reply(&icmp_param, skb);

> > > +               return true;

> > > +send_mal_query:

> > > +       icmp_param.data.icmph.code = ICMP_EXT_MAL_QUERY;

> > > +       goto send_reply;

> > >  }

> > >

> > >  /*

> > > @@ -1088,6 +1192,11 @@ int icmp_rcv(struct sk_buff *skb)

> > >         icmph = icmp_hdr(skb);

> > >

> > >         ICMPMSGIN_INC_STATS(net, icmph->type);

> > > +

> > > +       /* Check for ICMP Extended Echo (PROBE) messages */

> > > +       if (icmph->type == ICMP_EXT_ECHO)

> > > +               goto probe;

> > > +

> > >         /*

> > >          *      18 is the highest 'known' ICMP type. Anything else

> > > is a mystery

> > >          *

> > > @@ -1097,7 +1206,6 @@ int icmp_rcv(struct sk_buff *skb)

> > >         if (icmph->type > NR_ICMP_TYPES)

> > >                 goto error;

> > >

> > > -

> > >         /*

> > >          *      Parse the ICMP message

> > >          */

> > > @@ -1123,7 +1231,7 @@ int icmp_rcv(struct sk_buff *skb)

> > >         }

> > >

> > >         success = icmp_pointers[icmph->type].handler(skb);

> > > -

> > > +success_check:

> > >         if (success)  {

> > >                 consume_skb(skb);

> > >                 return NET_RX_SUCCESS;

> > > @@ -1137,6 +1245,12 @@ int icmp_rcv(struct sk_buff *skb)

> > >  error:

> > >         __ICMP_INC_STATS(net, ICMP_MIB_INERRORS);

> > >         goto drop;

> > > +probe:

> > > +       /* We can't use icmp_pointers[].handler() because it is an

> > > array of

> > > +        * size NR_ICMP_TYPES + 1 (19 elements) and PROBE has code

> > > 42.

> > > +        */

> > > +       success = icmp_echo(skb);

> > > +       goto success_check;

> >

> > just make this a branch instead of

>

> Could you clarify this comment? Do you mean move this code to the

> section where we check for PROBE messages instead of jumping to probe

> and then jumping to success_check?


Exactly