[RESEND,net-next,v3,0/5] Support for the IOAM Pre-allocated Trace with IPv6

Message ID 20210526171640.9722-1-justin.iurman@uliege.be

Message

Justin Iurman May 26, 2021, 5:16 p.m. UTC
v3:
 - Fix warning "unused label 'out_unregister_genl'" by adding conditional macro
 - Fix lwtunnel output redirect bug: dst cache useless in this case, use
   orig_output instead

v2:
 - Fix warning with static for __ioam6_fill_trace_data
 - Fix sparse warning with __force when casting __be64 to __be32
 - Fix unchecked dereference when removing IOAM namespaces or schemas
 - exthdrs.c: Don't drop by default (now: ignore) to match the act bits "00"
 - Add control plane support for the inline insertion (lwtunnel)
 - Provide uapi structures
 - Use __net_timestamp if skb->tstamp is empty
 - Add note about the temporary IANA allocation
 - Remove support for "removable" TLVs
 - Remove support for virtual/anonymous tunnel decapsulation

In-situ Operations, Administration, and Maintenance (IOAM) records
operational and telemetry information in a packet while it traverses
a path between two points in an IOAM domain. It is defined in
draft-ietf-ippm-ioam-data [1]. IOAM data fields can be encapsulated
into a variety of protocols. The IPv6 encapsulation is defined in
draft-ietf-ippm-ioam-ipv6-options [2], via extension headers. IOAM
can be used to complement OAM mechanisms based on e.g. ICMP or other
types of probe packets.

This patchset implements support for the Pre-allocated Trace, carried
in a Hop-by-Hop Options header. Therefore, a new IPv6 Hop-by-Hop TLV option is
introduced, see IANA [3]. The three other IOAM options are not included
in this patchset (Incremental Trace, Proof-of-Transit and Edge-to-Edge).
The main idea behind the IOAM Pre-allocated Trace is that a node
pre-allocates some room in packets for IOAM data. Then, each IOAM node
on the path will insert its data. There exist several interesting use-
cases, e.g. Fast failure detection/isolation or Smart service selection.
Another killer use-case is what we have called Cross-Layer Telemetry,
see the demo video on its repository [4], that aims to make the entire
stack (L2/L3 -> L7) visible for distributed tracing tools (e.g. Jaeger),
instead of the current L5 -> L7 limited view. So, basically, this is a
nice feature for the Linux Kernel.
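
To make the pre-allocation idea above concrete, here is a minimal userspace sketch. It is not the code from this patchset: the struct, its field names and the 12-octet buffer are hypothetical simplifications, and the back-to-front fill order and the overflow flag follow my reading of draft-ietf-ippm-ioam-data. The first node reserves the room, and each node on the path then writes its data into the last free slot and shrinks the remaining length.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified trace buffer; not the kernel's structure. */
struct trace_sketch {
	uint8_t nodelen;   /* per-node data length, in 4-octet units */
	uint8_t remlen;    /* free room left, in 4-octet units */
	uint8_t overflow;  /* set once a node finds too little room */
	uint8_t data[12];  /* room pre-allocated by the first node */
};

/* The encapsulating node pre-allocates the (empty) room. */
static void trace_init(struct trace_sketch *t, uint8_t nodelen)
{
	memset(t, 0, sizeof(*t));
	t->nodelen = nodelen;
	t->remlen = sizeof(t->data) / 4;
}

/*
 * Each IOAM node on the path inserts its own data. Slots are filled from
 * the end of the buffer towards the start (assumption based on the draft).
 * Returns 0 on success, -1 if there was no room left.
 */
static int trace_insert(struct trace_sketch *t, const uint8_t *node_data)
{
	if (t->overflow)
		return -1;

	if (t->remlen < t->nodelen) {
		t->overflow = 1;
		return -1;
	}

	t->remlen -= t->nodelen;
	memcpy(t->data + t->remlen * 4, node_data, t->nodelen * 4);
	return 0;
}
```

With nodelen set to 1 (one 4-octet field per node), the 12-octet buffer holds data for three nodes; a fourth insertion fails and sets the overflow flag instead of growing the packet.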

This patchset also provides support for the control plane part, but only for the
inline insertion (host-to-host use case), through lightweight tunnels. Indeed,
for in-transit traffic, the solution is to have an IPv6-in-IPv6 encapsulation,
which brings some difficulties and still requires a little bit of work and
discussion (i.e. anonymous tunnel decapsulation and multi egress resolution).

- Patch 1: IPv6 IOAM headers definition
- Patch 2: Data plane support for Pre-allocated Trace
- Patch 3: IOAM Generic Netlink API
- Patch 4: Support for IOAM injection with lwtunnels
- Patch 5: Documentation for new IOAM sysctls

  [1] https://tools.ietf.org/html/draft-ietf-ippm-ioam-data
  [2] https://tools.ietf.org/html/draft-ietf-ippm-ioam-ipv6-options
  [3] https://www.iana.org/assignments/ipv6-parameters/ipv6-parameters.xhtml#ipv6-parameters-2
  [4] https://github.com/iurmanj/cross-layer-telemetry

Justin Iurman (5):
  uapi: IPv6 IOAM headers definition
  ipv6: ioam: Data plane support for Pre-allocated Trace
  ipv6: ioam: IOAM Generic Netlink API
  ipv6: ioam: Support for IOAM injection with lwtunnels
  ipv6: ioam: Documentation for new IOAM sysctls

 Documentation/networking/ioam6-sysctl.rst |  20 +
 Documentation/networking/ip-sysctl.rst    |   5 +
 include/linux/ioam6.h                     |  13 +
 include/linux/ioam6_genl.h                |  13 +
 include/linux/ioam6_iptunnel.h            |  13 +
 include/linux/ipv6.h                      |   2 +
 include/net/ioam6.h                       |  65 ++
 include/net/netns/ipv6.h                  |   2 +
 include/uapi/linux/in6.h                  |   1 +
 include/uapi/linux/ioam6.h                | 124 +++
 include/uapi/linux/ioam6_genl.h           |  49 ++
 include/uapi/linux/ioam6_iptunnel.h       |  19 +
 include/uapi/linux/ipv6.h                 |   2 +
 include/uapi/linux/lwtunnel.h             |   1 +
 net/core/lwtunnel.c                       |   2 +
 net/ipv6/Kconfig                          |  11 +
 net/ipv6/Makefile                         |   3 +-
 net/ipv6/addrconf.c                       |  20 +
 net/ipv6/af_inet6.c                       |   7 +
 net/ipv6/exthdrs.c                        |  51 ++
 net/ipv6/ioam6.c                          | 872 ++++++++++++++++++++++
 net/ipv6/ioam6_iptunnel.c                 | 273 +++++++
 net/ipv6/sysctl_net_ipv6.c                |   7 +
 23 files changed, 1574 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/networking/ioam6-sysctl.rst
 create mode 100644 include/linux/ioam6.h
 create mode 100644 include/linux/ioam6_genl.h
 create mode 100644 include/linux/ioam6_iptunnel.h
 create mode 100644 include/net/ioam6.h
 create mode 100644 include/uapi/linux/ioam6.h
 create mode 100644 include/uapi/linux/ioam6_genl.h
 create mode 100644 include/uapi/linux/ioam6_iptunnel.h
 create mode 100644 net/ipv6/ioam6.c
 create mode 100644 net/ipv6/ioam6_iptunnel.c

Comments

Jakub Kicinski May 27, 2021, 12:34 a.m. UTC | #1
On Wed, 26 May 2021 19:16:39 +0200 Justin Iurman wrote:
> Add support for the IOAM inline insertion (only for the host-to-host use case)
> which is per-route configured with lightweight tunnels. The target is iproute2
> and the patch is ready. It will be posted as soon as this patchset is merged.
> Here is an overview:
> 
> $ ip -6 ro ad fc00::1/128 encap ioam6 trace type 0x800000 ns 1 size 12 dev eth0
> 
> This example configures an IOAM Pre-allocated Trace option attached to the
> fc00::1/128 prefix. The IOAM namespace (ns) is 1, the size of the pre-allocated
> trace data block is 12 octets (size) and only the first IOAM data (bit 0:
> hop_limit + node id) is included in the trace (type) represented as a bitfield.
> 
> The reason why the in-transit (IPv6-in-IPv6 encapsulation) use case is not
> implemented is explained on the patchset cover.
> 
> Signed-off-by: Justin Iurman <justin.iurman@uliege.be>

Please address the warnings from checkpatch --strict on this patches.

For all patches please make sure you don't use static inline in C
files, and let the compiler decide what to inline by itself.

> +	if (trace->type.bit0) trace->nodelen += sizeof(__be32) / 4;
> +	if (trace->type.bit1) trace->nodelen += sizeof(__be32) / 4;
> +	if (trace->type.bit2) trace->nodelen += sizeof(__be32) / 4;
> +	if (trace->type.bit3) trace->nodelen += sizeof(__be32) / 4;
> +	if (trace->type.bit4) trace->nodelen += sizeof(__be32) / 4;
> +	if (trace->type.bit5) trace->nodelen += sizeof(__be32) / 4;
> +	if (trace->type.bit6) trace->nodelen += sizeof(__be32) / 4;
> +	if (trace->type.bit7) trace->nodelen += sizeof(__be32) / 4;
> +	if (trace->type.bit8) trace->nodelen += sizeof(__be64) / 4;
> +	if (trace->type.bit9) trace->nodelen += sizeof(__be64) / 4;
> +	if (trace->type.bit10) trace->nodelen += sizeof(__be64) / 4;
> +	if (trace->type.bit11) trace->nodelen += sizeof(__be32) / 4;

Seems simpler to do:

	nodelen += hweight16(field & MASK1) * (sizeof(__be32) / 4);
	nodelen += hweight16(field & MASK2) * (sizeof(__be64) / 4);
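
A hedged userspace sketch of the suggestion above, to check that the population-count form matches the if-chain. Everything named here is an assumption for illustration: the trace type is viewed as a 16-bit value in which IOAM bit n is stored at position 15 - n (so bit 0 is the most significant bit, matching the top 16 bits of the 0x800000 example), MASK_4B and MASK_8B are hypothetical names standing in for MASK1 and MASK2, and GCC/Clang's __builtin_popcount stands in for the kernel's hweight16().

```c
#include <stdint.h>

/* Hypothetical masks: IOAM bit n sits at position (15 - n). */
#define MASK_4B 0xFF10u /* bits 0-7 and 11: 4-octet fields */
#define MASK_8B 0x00E0u /* bits 8-10: 8-octet fields */

/* Population-count form; __builtin_popcount replaces hweight16() here. */
static unsigned int nodelen_popcount(uint16_t field)
{
	unsigned int nodelen = 0;

	nodelen += __builtin_popcount(field & MASK_4B) * (sizeof(uint32_t) / 4);
	nodelen += __builtin_popcount(field & MASK_8B) * (sizeof(uint64_t) / 4);
	return nodelen;
}

/* Reference form: one test per bit, mirroring the quoted if-chain. */
static unsigned int nodelen_ifchain(uint16_t field)
{
	unsigned int nodelen = 0;
	int n;

	for (n = 0; n <= 11; n++) {
		if (!(field & (0x8000u >> n)))
			continue;
		/* Bits 8, 9 and 10 carry 8-octet data; the rest carry 4. */
		nodelen += (n >= 8 && n <= 10) ? sizeof(uint64_t) / 4
					       : sizeof(uint32_t) / 4;
	}
	return nodelen;
}
```

Both functions return the per-node length in 4-octet units. With only bit 0 set the result is 1, and with bits 0 to 11 all set it is 9 * 1 + 3 * 2 = 15.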
Jakub Kicinski May 27, 2021, 12:36 a.m. UTC | #2
On Wed, 26 May 2021 17:34:02 -0700 Jakub Kicinski wrote:
> Please address the warnings from checkpatch --strict on this patches.


I meant to say "this patch", the warnings in patch 1 seem to be related
to unnamed bitfields, those can be ignored.
Justin Iurman May 27, 2021, 1:16 p.m. UTC | #3
> On Wed, 26 May 2021 19:16:39 +0200 Justin Iurman wrote:
>> Add support for the IOAM inline insertion (only for the host-to-host use case)
>> which is per-route configured with lightweight tunnels. The target is iproute2
>> and the patch is ready. It will be posted as soon as this patchset is merged.
>> Here is an overview:
>> 
>> $ ip -6 ro ad fc00::1/128 encap ioam6 trace type 0x800000 ns 1 size 12 dev eth0
>> 
>> This example configures an IOAM Pre-allocated Trace option attached to the
>> fc00::1/128 prefix. The IOAM namespace (ns) is 1, the size of the pre-allocated
>> trace data block is 12 octets (size) and only the first IOAM data (bit 0:
>> hop_limit + node id) is included in the trace (type) represented as a bitfield.
>> 
>> The reason why the in-transit (IPv6-in-IPv6 encapsulation) use case is not
>> implemented is explained on the patchset cover.
>> 
>> Signed-off-by: Justin Iurman <justin.iurman@uliege.be>
> 
> Please address the warnings from checkpatch --strict on this patches.

My mistake, I'll do that.

> For all patches please make sure you don't use static inline in C
> files, and let the compiler decide what to inline by itself.

Will do as well.

>> +	if (trace->type.bit0) trace->nodelen += sizeof(__be32) / 4;
>> +	if (trace->type.bit1) trace->nodelen += sizeof(__be32) / 4;
>> +	if (trace->type.bit2) trace->nodelen += sizeof(__be32) / 4;
>> +	if (trace->type.bit3) trace->nodelen += sizeof(__be32) / 4;
>> +	if (trace->type.bit4) trace->nodelen += sizeof(__be32) / 4;
>> +	if (trace->type.bit5) trace->nodelen += sizeof(__be32) / 4;
>> +	if (trace->type.bit6) trace->nodelen += sizeof(__be32) / 4;
>> +	if (trace->type.bit7) trace->nodelen += sizeof(__be32) / 4;
>> +	if (trace->type.bit8) trace->nodelen += sizeof(__be64) / 4;
>> +	if (trace->type.bit9) trace->nodelen += sizeof(__be64) / 4;
>> +	if (trace->type.bit10) trace->nodelen += sizeof(__be64) / 4;
>> +	if (trace->type.bit11) trace->nodelen += sizeof(__be32) / 4;
> 
> Seems simpler to do:
> 
> 	nodelen += hweight16(field & MASK1) * (sizeof(__be32) / 4);
> 	nodelen += hweight16(field & MASK2) * (sizeof(__be64) / 4);

Indeed, I didn't know about this macro. Will post a rev ASAP. Thanks Jakub for the feedback, I appreciate it.

Justin