From patchwork Wed Sep 7 00:49:30 2022
X-Patchwork-Submitter: Adel Abouchaev
X-Patchwork-Id: 604109
From: Adel Abouchaev
To: davem@davemloft.net, edumazet@google.com, kuba@kernel.org,
 pabeni@redhat.com, corbet@lwn.net, dsahern@kernel.org, shuah@kernel.org,
 netdev@vger.kernel.org, linux-doc@vger.kernel.org,
 linux-kselftest@vger.kernel.org
Cc: kernel test robot
Subject: [net-next v3 1/6] net: Documentation on QUIC kernel Tx crypto.
Date: Tue, 6 Sep 2022 17:49:30 -0700
Message-Id: <20220907004935.3971173-2-adel.abushaev@gmail.com>
In-Reply-To: <20220907004935.3971173-1-adel.abushaev@gmail.com>
References: <20220907004935.3971173-1-adel.abushaev@gmail.com>
X-Mailing-List: linux-kselftest@vger.kernel.org

Add documentation for kernel QUIC code.

Signed-off-by: Adel Abouchaev
---
Added quic.rst reference to the index.rst file; indentation in quic.rst file.
Reported-by: kernel test robot
Added SPDX license GPL 2.0.
v2: Removed whitespace at EOF.
v3: Added explanation of features.
---
 Documentation/networking/index.rst |   1 +
 Documentation/networking/quic.rst  | 211 +++++++++++++++++++++++++++++
 2 files changed, 212 insertions(+)
 create mode 100644 Documentation/networking/quic.rst

diff --git a/Documentation/networking/index.rst b/Documentation/networking/index.rst
index bacadd09e570..0dacd8c8a3ff 100644
--- a/Documentation/networking/index.rst
+++ b/Documentation/networking/index.rst
@@ -89,6 +89,7 @@ Contents:
    plip
    ppp_generic
    proc_net_tcp
+   quic
    radiotap-headers
    rds
    regulatory
diff --git a/Documentation/networking/quic.rst b/Documentation/networking/quic.rst
new file mode 100644
index 000000000000..2e6ec72f4eea
--- /dev/null
+++ b/Documentation/networking/quic.rst
@@ -0,0 +1,211 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===========
+KERNEL QUIC
+===========
+
+Overview
+========
+
+QUIC is a secure general-purpose transport protocol that creates a stateful
+interaction between a client and a server. QUIC provides end-to-end integrity
+and confidentiality. Refer to RFC 9000 for more information on QUIC.
+
+The kernel Tx side offload covers the encryption of the application streams
+in the kernel rather than in the application. These packets are 1-RTT packets
+in a QUIC connection. Encryption of all other packets is still done by the
+QUIC library in user space.
+
+The flow match is performed using 5 parameters: source and destination IP
+addresses, source and destination UDP ports and destination QUIC connection ID.
+Not all 5 parameters are always needed. The Tx direction matches the flow on
+the destination IP, port and destination connection ID, while the Rx part would
+later match on source IP, port and destination connection ID. This will cover
+multiple scenarios where the server is using SO_REUSEADDR and/or empty
+destination connection IDs or a combination of these.
+
+The Rx direction is not implemented in this set of patches.
+
+The connection migration scenario is not handled by the kernel code and will
+be handled by the user space portion of the QUIC library. In the Tx direction,
+the new key would be installed before a packet with an updated destination is
+sent. In the Rx direction, the behavior will be to drop a packet if a flow is
+missing.
+
+For the key rotation, the behavior is to drop packets on Tx when the encryption
+key with a matching key rotation bit is not present. In the Rx direction, the
+packet will be sent to the userspace library with an unencrypted header and an
+encrypted payload. A separate indication will be added to the ancillary data to
+indicate the status of the operation as not matching the current key bit. It is
+not possible to use the key rotation bit as part of the key for flow lookup as
+that bit is protected by the header protection. A special provision will need
+to be made in user space to still attempt the decryption of the payload to
+prevent a timing attack.
+
+
+User Interface
+==============
+
+Creating a QUIC connection
+--------------------------
+
+A QUIC connection originates and terminates in the application, using one of
+many available QUIC libraries. The code instantiates a QUIC client and a QUIC
+server in some form and configures them to use certain addresses and ports for
+the source and destination. The client and server negotiate the set of keys to
+protect the communication during different phases of the connection, maintain
+the connection and perform congestion control.
+
+Requesting to add QUIC Tx kernel encryption to the connection
+-------------------------------------------------------------
+
+Each flow that should be encrypted by the kernel needs to be registered with
+the kernel using the socket API. A setsockopt() call on the socket creates an
+association between the QUIC connection ID of the flow and the encryption
+parameters for the crypto operations:
+
+.. code-block:: c
+
+  struct quic_connection_info conn_info;
+  char conn_id[5] = {0x01, 0x02, 0x03, 0x04, 0x05};
+  const size_t conn_id_len = sizeof(conn_id);
+  char conn_key[16] = {0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07,
+                       0x08, 0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f};
+  char conn_iv[12] = {0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07,
+                      0x08, 0x09, 0x0a, 0x0b};
+  char conn_hdr_key[16] = {0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17,
+                           0x18, 0x19, 0x1a, 0x1b, 0x1c, 0x1d, 0x1e, 0x1f
+                          };
+
+  conn_info.conn_payload_key_gen = 0;
+  conn_info.cipher_type = TLS_CIPHER_AES_GCM_128;
+
+  memset(&conn_info.key, 0, sizeof(struct quic_connection_info_key));
+  conn_info.key.dst_conn_id_length = conn_id_len;
+  memcpy(&conn_info.key.dst_conn_id[QUIC_MAX_CONNECTION_ID_SIZE
+                                    - conn_id_len],
+         &conn_id, conn_id_len);
+
+  memcpy(&conn_info.aes_gcm_128.payload_key, conn_key, sizeof(conn_key));
+  memcpy(&conn_info.aes_gcm_128.payload_iv, conn_iv, sizeof(conn_iv));
+  memcpy(&conn_info.aes_gcm_128.header_key, conn_hdr_key, sizeof(conn_hdr_key));
+
+  setsockopt(fd, SOL_UDP, UDP_QUIC_ADD_TX_CONNECTION, &conn_info,
+             sizeof(conn_info));
+
+
+Requesting to remove QUIC Tx kernel crypto offload control messages
+-------------------------------------------------------------------
+
+All flows are removed when the socket is closed. To explicitly remove the
+offload for a connection during the lifetime of the socket, the process
+is similar to adding the flow. Only the connection ID and its length need
+to be supplied to remove the connection from the offload:
+
+.. code-block:: c
+
+  memset(&conn_info.key, 0, sizeof(struct quic_connection_info_key));
+  conn_info.key.dst_conn_id_length = conn_id_len;
+  memcpy(&conn_info.key.dst_conn_id[QUIC_MAX_CONNECTION_ID_SIZE
+                                    - conn_id_len],
+         &conn_id, conn_id_len);
+  setsockopt(fd, SOL_UDP, UDP_QUIC_DEL_TX_CONNECTION, &conn_info,
+             sizeof(conn_info));
+
+Sending QUIC application data
+-----------------------------
+
+For QUIC Tx encryption offload, the application should use the sendmsg() socket
+call and provide ancillary data with information on connection ID length and
+offload flags for the kernel to perform the encryption and GSO support if
+requested.
+
+.. code-block:: c
+
+  size_t cmsg_tx_len = sizeof(struct quic_tx_ancillary_data);
+  uint8_t cmsg_buf[CMSG_SPACE(cmsg_tx_len)];
+  struct quic_tx_ancillary_data *anc_data;
+  size_t quic_data_len = 4500;
+  struct cmsghdr *cmsg_hdr;
+  char quic_data[9000];
+  struct iovec iov[2];
+  int send_len = 9000;
+  struct msghdr msg = {0};
+  int err;
+
+  iov[0].iov_base = quic_data;
+  iov[0].iov_len = quic_data_len;
+  iov[1].iov_base = quic_data + 4500;
+  iov[1].iov_len = quic_data_len;
+
+  if (client.addr.sin_family == AF_INET) {
+          msg.msg_name = &client.addr;
+          msg.msg_namelen = sizeof(client.addr);
+  } else {
+          msg.msg_name = &client.addr6;
+          msg.msg_namelen = sizeof(client.addr6);
+  }
+
+  msg.msg_iov = iov;
+  msg.msg_iovlen = 2;
+  msg.msg_control = cmsg_buf;
+  msg.msg_controllen = sizeof(cmsg_buf);
+  cmsg_hdr = CMSG_FIRSTHDR(&msg);
+  cmsg_hdr->cmsg_level = IPPROTO_UDP;
+  cmsg_hdr->cmsg_type = UDP_QUIC_ENCRYPT;
+  cmsg_hdr->cmsg_len = CMSG_LEN(cmsg_tx_len);
+  anc_data = (struct quic_tx_ancillary_data *)CMSG_DATA(cmsg_hdr);
+  anc_data->flags = 0;
+  anc_data->next_pkt_num = 0x0d65c9;
+  anc_data->dst_conn_id_length = conn_id_len;
+  err = sendmsg(self->sfd, &msg, 0);
+
+The QUIC Tx offload in the kernel reads the data from user space, then encrypts
+and copies it into the ciphertext buffer within the same operation.
+
+
+Sending QUIC application data with GSO
+--------------------------------------
+When GSO is in use, the kernel will use the GSO fragment size as the target
+for ciphertext. The packets from user space should align on the boundary
+of the GSO fragment size minus the size of the tag for the chosen cipher. For a
+GSO fragment size of 1200, the plain packets should follow each other at every
+1184 bytes, given the tag size of 16. After the encryption, the rest of the UDP
+and IP stacks will follow the defined value of GSO fragment which will include
+the trailing tag bytes.
+
+To set up GSO fragmentation:
+
+.. code-block:: c
+
+  setsockopt(self->sfd, SOL_UDP, UDP_SEGMENT, &frag_size,
+             sizeof(frag_size));
+
+If the GSO fragment size is provided in ancillary data within the sendmsg()
+call, the value in ancillary data will take precedence over the segment size
+provided in setsockopt to split the payload into packets. This is consistent
+with the UDP stack behavior.
+
+Integrating with userspace QUIC libraries
+-----------------------------------------
+
+Userspace QUIC libraries integration would depend on the implementation of the
+QUIC protocol. For the MVFST library, the control plane is integrated into the
+handshake callbacks to properly configure the flows into the socket; and the
+data plane is integrated into the methods that perform encryption and send
+the packets to the batch scheduler for transmissions to the socket.
+
+MVFST library can be found at https://github.com/facebookincubator/mvfst.
+
+Statistics
+==========
+
+QUIC Tx offload to the kernel has counters
+(``/proc/net/quic_stat``):
+
+- ``QuicCurrTxSw`` -
+  number of currently active kernel offloaded QUIC connections
+- ``QuicTxSw`` -
+  accumulative total number of offloaded QUIC connections
+- ``QuicTxSwError`` -
+  accumulative total number of errors during QUIC Tx offload to kernel

From patchwork Wed Sep 7 00:49:31 2022
X-Patchwork-Submitter: Adel Abouchaev
X-Patchwork-Id: 603635
From: Adel Abouchaev
Subject: [net-next v3 2/6] net: Define QUIC specific constants, control and
 data plane structures
Date: Tue, 6 Sep 2022 17:49:31 -0700
Message-Id: <20220907004935.3971173-3-adel.abushaev@gmail.com>
In-Reply-To: <20220907004935.3971173-1-adel.abushaev@gmail.com>
References: <20220907004935.3971173-1-adel.abushaev@gmail.com>

Define control and data plane structures to pass in control plane for
flow add/remove and during packet send within ancillary data. Define
constants to use within SOL_UDP to program QUIC sockets.

Signed-off-by: Adel Abouchaev
---
v3: added a 3-tuple to map a flow to a key, added key generation to
    include into flow context.
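
For illustration only, and not part of this patch: a minimal user-space
sketch of how the v3 flow key (destination connection ID, destination
address and UDP port) might be filled in. The struct is a local mirror of
the proposed quic_connection_info_key so that the example builds without the
new header, and the helper name quic_fill_flow_key_v4 is hypothetical.
Right-aligning the connection ID in the array follows the example in the
documentation patch (1/6); whether the kernel requires that layout is an
assumption here.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>

#define QUIC_MAX_CONNECTION_ID_SIZE 20

/* Local mirror of the proposed uAPI struct (assumption: same field order). */
struct quic_flow_key {
	uint8_t dst_conn_id[QUIC_MAX_CONNECTION_ID_SIZE];
	uint8_t dst_conn_id_length;
	union {
		struct in6_addr ipv6_addr;
		struct in_addr ipv4_addr;
	} addr;
	uint16_t udp_port;	/* network byte order */
};

/* Hypothetical helper: returns 0 on success, -1 on invalid input. */
static int quic_fill_flow_key_v4(struct quic_flow_key *key,
				 const uint8_t *cid, size_t cid_len,
				 const char *dst_ip, uint16_t dst_port)
{
	if (cid_len > QUIC_MAX_CONNECTION_ID_SIZE)
		return -1;

	memset(key, 0, sizeof(*key));
	key->dst_conn_id_length = (uint8_t)cid_len;

	/* Right-align the connection ID, as the documentation example does. */
	if (cid_len)
		memcpy(&key->dst_conn_id[QUIC_MAX_CONNECTION_ID_SIZE - cid_len],
		       cid, cid_len);

	/* inet_pton() returns 1 only for a valid dotted-quad string. */
	if (inet_pton(AF_INET, dst_ip, &key->addr.ipv4_addr) != 1)
		return -1;

	key->udp_port = htons(dst_port);
	return 0;
}
```

Zeroing the whole key first matters for any lookup that hashes or compares
the struct as raw bytes, since unused connection ID bytes and padding would
otherwise differ between calls.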
---
 include/uapi/linux/quic.h | 66 +++++++++++++++++++++++++++++++++++++++
 include/uapi/linux/udp.h  |  3 ++
 2 files changed, 69 insertions(+)
 create mode 100644 include/uapi/linux/quic.h

diff --git a/include/uapi/linux/quic.h b/include/uapi/linux/quic.h
new file mode 100644
index 000000000000..1fd9d2ed8683
--- /dev/null
+++ b/include/uapi/linux/quic.h
@@ -0,0 +1,66 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+
+#ifndef _UAPI_LINUX_QUIC_H
+#define _UAPI_LINUX_QUIC_H
+
+#include
+#include
+
+#define QUIC_MAX_CONNECTION_ID_SIZE	20
+
+/* Side by side data for QUIC egress operations */
+#define QUIC_BYPASS_ENCRYPTION	0x01
+
+struct quic_tx_ancillary_data {
+	__aligned_u64	next_pkt_num;
+	__u8	flags;
+	__u8	dst_conn_id_length;
+};
+
+struct quic_connection_info_key {
+	__u8	dst_conn_id[QUIC_MAX_CONNECTION_ID_SIZE];
+	__u8	dst_conn_id_length;
+	union {
+		struct in6_addr ipv6_addr;
+		struct in_addr ipv4_addr;
+	} addr;
+	__be16	udp_port;
+};
+
+struct quic_aes_gcm_128 {
+	__u8	header_key[TLS_CIPHER_AES_GCM_128_KEY_SIZE];
+	__u8	payload_key[TLS_CIPHER_AES_GCM_128_KEY_SIZE];
+	__u8	payload_iv[TLS_CIPHER_AES_GCM_128_IV_SIZE];
+};
+
+struct quic_aes_gcm_256 {
+	__u8	header_key[TLS_CIPHER_AES_GCM_256_KEY_SIZE];
+	__u8	payload_key[TLS_CIPHER_AES_GCM_256_KEY_SIZE];
+	__u8	payload_iv[TLS_CIPHER_AES_GCM_256_IV_SIZE];
+};
+
+struct quic_aes_ccm_128 {
+	__u8	header_key[TLS_CIPHER_AES_CCM_128_KEY_SIZE];
+	__u8	payload_key[TLS_CIPHER_AES_CCM_128_KEY_SIZE];
+	__u8	payload_iv[TLS_CIPHER_AES_CCM_128_IV_SIZE];
+};
+
+struct quic_chacha20_poly1305 {
+	__u8	header_key[TLS_CIPHER_CHACHA20_POLY1305_KEY_SIZE];
+	__u8	payload_key[TLS_CIPHER_CHACHA20_POLY1305_KEY_SIZE];
+	__u8	payload_iv[TLS_CIPHER_CHACHA20_POLY1305_IV_SIZE];
+};
+
+struct quic_connection_info {
+	__u16	cipher_type;
+	struct quic_connection_info_key		key;
+	__u8	conn_payload_key_gen;
+	union {
+		struct quic_aes_gcm_128 aes_gcm_128;
+		struct quic_aes_gcm_256 aes_gcm_256;
+		struct quic_aes_ccm_128 aes_ccm_128;
+		struct quic_chacha20_poly1305 chacha20_poly1305;
+	};
+};
+
+#endif
diff --git a/include/uapi/linux/udp.h b/include/uapi/linux/udp.h
index 4828794efcf8..0ee4c598e70b 100644
--- a/include/uapi/linux/udp.h
+++ b/include/uapi/linux/udp.h
@@ -34,6 +34,9 @@ struct udphdr {
 #define UDP_NO_CHECK6_RX 102	/* Disable accpeting checksum for UDP6 */
 #define UDP_SEGMENT	103	/* Set GSO segmentation size */
 #define UDP_GRO		104	/* This socket can receive UDP GRO packets */
+#define UDP_QUIC_ADD_TX_CONNECTION	106 /* Add QUIC Tx crypto offload */
+#define UDP_QUIC_DEL_TX_CONNECTION	107 /* Del QUIC Tx crypto offload */
+#define UDP_QUIC_ENCRYPT		108 /* QUIC encryption parameters */
 
 /* UDP encapsulation types */
 #define UDP_ENCAP_ESPINUDP_NON_IKE	1 /* draft-ietf-ipsec-nat-t-ike-00/01 */

From patchwork Wed Sep 7 00:49:32 2022
X-Patchwork-Submitter: Adel Abouchaev
X-Patchwork-Id: 604108
From: Adel Abouchaev
Subject: [net-next v3 3/6] net: Add UDP ULP operations, initialization and
 handling prototype functions.
Date: Tue, 6 Sep 2022 17:49:32 -0700
Message-Id: <20220907004935.3971173-4-adel.abushaev@gmail.com>
In-Reply-To: <20220907004935.3971173-1-adel.abushaev@gmail.com>
References: <20220907004935.3971173-1-adel.abushaev@gmail.com>

Define functions to add UDP ULP handling, registration with UDP protocol
and supporting data structures. Create structure for QUIC ULP and add empty
prototype functions to support it.

Signed-off-by: Adel Abouchaev
---
Moved the reference to net/quic/Kconfig from this patch into the next.
Fixed formatting around brackets.
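
For illustration only, and not part of this patch: a sketch of how user
space might attach a ULP to a UDP socket through the new UDP_ULP option. The
helper name udp_attach_ulp is hypothetical, UDP_ULP is defined locally with
the value proposed by this series (it is not in released kernel headers), and
the ULP name "quic" used in the usage note is an assumption about what a
later patch registers. The length check mirrors the bounds in
udp_setsockopt_ulp() below.

```c
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

#ifndef SOL_UDP
#define SOL_UDP 17
#endif
#ifndef UDP_ULP
#define UDP_ULP 105		/* value proposed by this series */
#endif
#define UDP_ULP_NAME_MAX 16	/* mirrors the kernel-side limit in this patch */

/* Hypothetical helper: attach the named ULP; returns 0 or a negative errno. */
static int udp_attach_ulp(int fd, const char *name)
{
	size_t len = strlen(name);

	/* The name must be non-empty and leave room for the terminator. */
	if (len == 0 || len >= UDP_ULP_NAME_MAX)
		return -EINVAL;

	if (setsockopt(fd, SOL_UDP, UDP_ULP, name, (socklen_t)len) < 0)
		return -errno;

	return 0;
}
```

A caller would use udp_attach_ulp(fd, "quic") before installing any Tx
connections. On a kernel without this series the call is expected to fail
with -ENOPROTOOPT.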
--- include/net/inet_sock.h | 2 + include/net/udp.h | 33 +++++++ include/uapi/linux/udp.h | 1 + net/Makefile | 1 + net/ipv4/Makefile | 3 +- net/ipv4/udp.c | 6 ++ net/ipv4/udp_ulp.c | 192 +++++++++++++++++++++++++++++++++++++++ 7 files changed, 237 insertions(+), 1 deletion(-) create mode 100644 net/ipv4/udp_ulp.c diff --git a/include/net/inet_sock.h b/include/net/inet_sock.h index bf5654ce711e..650e332bdb50 100644 --- a/include/net/inet_sock.h +++ b/include/net/inet_sock.h @@ -249,6 +249,8 @@ struct inet_sock { __be32 mc_addr; struct ip_mc_socklist __rcu *mc_list; struct inet_cork_full cork; + const struct udp_ulp_ops *udp_ulp_ops; + void __rcu *ulp_data; }; #define IPCORK_OPT 1 /* ip-options has been held in ipcork.opt */ diff --git a/include/net/udp.h b/include/net/udp.h index 5ee88ddf79c3..f22ebabbb186 100644 --- a/include/net/udp.h +++ b/include/net/udp.h @@ -523,4 +523,37 @@ struct proto *udp_bpf_get_proto(struct sock *sk, struct sk_psock *psock); int udp_bpf_update_proto(struct sock *sk, struct sk_psock *psock, bool restore); #endif +/* + * Interface for adding Upper Level Protocols over UDP + */ + +#define UDP_ULP_NAME_MAX 16 +#define UDP_ULP_MAX 128 + +struct udp_ulp_ops { + struct list_head list; + + /* initialize ulp */ + int (*init)(struct sock *sk); + /* cleanup ulp */ + void (*release)(struct sock *sk); + + char name[UDP_ULP_NAME_MAX]; + struct module *owner; +}; + +int udp_register_ulp(struct udp_ulp_ops *type); +void udp_unregister_ulp(struct udp_ulp_ops *type); +int udp_set_ulp(struct sock *sk, const char *name); +void udp_get_available_ulp(char *buf, size_t len); +void udp_cleanup_ulp(struct sock *sk); +int udp_setsockopt_ulp(struct sock *sk, sockptr_t optval, + unsigned int optlen); +int udp_getsockopt_ulp(struct sock *sk, char __user *optval, + int __user *optlen); + +#define MODULE_ALIAS_UDP_ULP(name)\ + __MODULE_INFO(alias, alias_userspace, name);\ + __MODULE_INFO(alias, alias_udp_ulp, "udp-ulp-" name) + #endif /* _UDP_H */ diff --git 
a/include/uapi/linux/udp.h b/include/uapi/linux/udp.h index 0ee4c598e70b..893691f0108a 100644 --- a/include/uapi/linux/udp.h +++ b/include/uapi/linux/udp.h @@ -34,6 +34,7 @@ struct udphdr { #define UDP_NO_CHECK6_RX 102 /* Disable accpeting checksum for UDP6 */ #define UDP_SEGMENT 103 /* Set GSO segmentation size */ #define UDP_GRO 104 /* This socket can receive UDP GRO packets */ +#define UDP_ULP 105 /* Attach ULP to a UDP socket */ #define UDP_QUIC_ADD_TX_CONNECTION 106 /* Add QUIC Tx crypto offload */ #define UDP_QUIC_DEL_TX_CONNECTION 107 /* Del QUIC Tx crypto offload */ #define UDP_QUIC_ENCRYPT 108 /* QUIC encryption parameters */ diff --git a/net/Makefile b/net/Makefile index 6a62e5b27378..021ea3698d3a 100644 --- a/net/Makefile +++ b/net/Makefile @@ -16,6 +16,7 @@ obj-y += ethernet/ 802/ sched/ netlink/ bpf/ ethtool/ obj-$(CONFIG_NETFILTER) += netfilter/ obj-$(CONFIG_INET) += ipv4/ obj-$(CONFIG_TLS) += tls/ +obj-$(CONFIG_QUIC) += quic/ obj-$(CONFIG_XFRM) += xfrm/ obj-$(CONFIG_UNIX_SCM) += unix/ obj-y += ipv6/ diff --git a/net/ipv4/Makefile b/net/ipv4/Makefile index bbdd9c44f14e..88d3baf4af95 100644 --- a/net/ipv4/Makefile +++ b/net/ipv4/Makefile @@ -14,7 +14,8 @@ obj-y := route.o inetpeer.o protocol.o \ udp_offload.o arp.o icmp.o devinet.o af_inet.o igmp.o \ fib_frontend.o fib_semantics.o fib_trie.o fib_notifier.o \ inet_fragment.o ping.o ip_tunnel_core.o gre_offload.o \ - metrics.o netlink.o nexthop.o udp_tunnel_stub.o + metrics.o netlink.o nexthop.o udp_tunnel_stub.o \ + udp_ulp.o obj-$(CONFIG_BPFILTER) += bpfilter/ diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c index 34eda973bbf1..027c4513a9cd 100644 --- a/net/ipv4/udp.c +++ b/net/ipv4/udp.c @@ -2779,6 +2779,9 @@ int udp_lib_setsockopt(struct sock *sk, int level, int optname, up->pcflag |= UDPLITE_RECV_CC; break; + case UDP_ULP: + return udp_setsockopt_ulp(sk, optval, optlen); + default: err = -ENOPROTOOPT; break; @@ -2847,6 +2850,9 @@ int udp_lib_getsockopt(struct sock *sk, int level, int optname, val = 
up->pcrlen; break; + case UDP_ULP: + return udp_getsockopt_ulp(sk, optval, optlen); + default: return -ENOPROTOOPT; } diff --git a/net/ipv4/udp_ulp.c b/net/ipv4/udp_ulp.c new file mode 100644 index 000000000000..138818690151 --- /dev/null +++ b/net/ipv4/udp_ulp.c @@ -0,0 +1,192 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Pluggable UDP upper layer protocol support, based on pluggable TCP upper + * layer protocol support. + * + * Copyright (c) 2016-2017, Mellanox Technologies. All rights reserved. + * Copyright (c) 2016-2017, Dave Watson . All rights + * reserved. + * Copyright (c) 2021-2022, Meta Platforms, Inc. All rights reserved. + */ + +#include +#include +#include +#include +#include +#include +#include +#include + +static DEFINE_SPINLOCK(udp_ulp_list_lock); +static LIST_HEAD(udp_ulp_list); + +/* Simple linear search, don't expect many entries! */ +static struct udp_ulp_ops *udp_ulp_find(const char *name) +{ + struct udp_ulp_ops *e; + + list_for_each_entry_rcu(e, &udp_ulp_list, list, + lockdep_is_held(&udp_ulp_list_lock)) { + if (strcmp(e->name, name) == 0) + return e; + } + + return NULL; +} + +static const struct udp_ulp_ops *__udp_ulp_find_autoload(const char *name) +{ + const struct udp_ulp_ops *ulp = NULL; + + rcu_read_lock(); + ulp = udp_ulp_find(name); + +#ifdef CONFIG_MODULES + if (!ulp && capable(CAP_NET_ADMIN)) { + rcu_read_unlock(); + request_module("udp-ulp-%s", name); + rcu_read_lock(); + ulp = udp_ulp_find(name); + } +#endif + if (!ulp || !try_module_get(ulp->owner)) + ulp = NULL; + + rcu_read_unlock(); + return ulp; +} + +/* Attach new upper layer protocol to the list + * of available protocols. 
+ */ +int udp_register_ulp(struct udp_ulp_ops *ulp) +{ + int ret = 0; + + spin_lock(&udp_ulp_list_lock); + if (udp_ulp_find(ulp->name)) + ret = -EEXIST; + else + list_add_tail_rcu(&ulp->list, &udp_ulp_list); + + spin_unlock(&udp_ulp_list_lock); + + return ret; +} +EXPORT_SYMBOL_GPL(udp_register_ulp); + +void udp_unregister_ulp(struct udp_ulp_ops *ulp) +{ + spin_lock(&udp_ulp_list_lock); + list_del_rcu(&ulp->list); + spin_unlock(&udp_ulp_list_lock); + + synchronize_rcu(); +} +EXPORT_SYMBOL_GPL(udp_unregister_ulp); + +void udp_cleanup_ulp(struct sock *sk) +{ + struct inet_sock *inet = inet_sk(sk); + + /* No sock_owned_by_me() check here as at the time the + * stack calls this function, the socket is dead and + * about to be destroyed. + */ + if (!inet->udp_ulp_ops) + return; + + if (inet->udp_ulp_ops->release) + inet->udp_ulp_ops->release(sk); + module_put(inet->udp_ulp_ops->owner); + + inet->udp_ulp_ops = NULL; +} + +static int __udp_set_ulp(struct sock *sk, const struct udp_ulp_ops *ulp_ops) +{ + struct inet_sock *inet = inet_sk(sk); + int err; + + err = -EEXIST; + if (inet->udp_ulp_ops) + goto out_err; + + err = ulp_ops->init(sk); + if (err) + goto out_err; + + inet->udp_ulp_ops = ulp_ops; + return 0; + +out_err: + module_put(ulp_ops->owner); + return err; +} + +int udp_set_ulp(struct sock *sk, const char *name) +{ + struct sk_psock *psock = sk_psock_get(sk); + const struct udp_ulp_ops *ulp_ops; + + if (psock) { + sk_psock_put(sk, psock); + return -EINVAL; + } + + sock_owned_by_me(sk); + ulp_ops = __udp_ulp_find_autoload(name); + if (!ulp_ops) + return -ENOENT; + + return __udp_set_ulp(sk, ulp_ops); +} + +int udp_setsockopt_ulp(struct sock *sk, sockptr_t optval, unsigned int optlen) +{ + char name[UDP_ULP_NAME_MAX]; + int val, err; + + if (!optlen || optlen > UDP_ULP_NAME_MAX) + return -EINVAL; + + val = strncpy_from_sockptr(name, optval, optlen); + if (val < 0) + return -EFAULT; + + if (val == UDP_ULP_NAME_MAX) + return -EINVAL; + + name[val] = 0; + 
lock_sock(sk); + err = udp_set_ulp(sk, name); + release_sock(sk); + return err; +} + +int udp_getsockopt_ulp(struct sock *sk, char __user *optval, int __user *optlen) +{ + struct inet_sock *inet = inet_sk(sk); + int len; + + if (get_user(len, optlen)) + return -EFAULT; + + if (len < 0) + return -EINVAL; + len = min_t(unsigned int, len, UDP_ULP_NAME_MAX); + + if (!inet->udp_ulp_ops) { + if (put_user(0, optlen)) + return -EFAULT; + return 0; + } + + if (put_user(len, optlen)) + return -EFAULT; + if (copy_to_user(optval, inet->udp_ulp_ops->name, len)) + return -EFAULT; + + return 0; +}
From patchwork Wed Sep 7 00:49:33 2022 X-Patchwork-Submitter: Adel Abouchaev X-Patchwork-Id: 603634 From: Adel Abouchaev To: davem@davemloft.net, edumazet@google.com, kuba@kernel.org, pabeni@redhat.com, corbet@lwn.net, dsahern@kernel.org, shuah@kernel.org, netdev@vger.kernel.org, linux-doc@vger.kernel.org, linux-kselftest@vger.kernel.org Subject: [net-next v3 4/6] net: Implement QUIC offload functions Date: Tue, 6 Sep 2022 17:49:33 -0700 Message-Id: <20220907004935.3971173-5-adel.abushaev@gmail.com> In-Reply-To: <20220907004935.3971173-1-adel.abushaev@gmail.com> References: <20220907004935.3971173-1-adel.abushaev@gmail.com>
Add a connection hash to the context to support add and remove operations on QUIC connections for the control plane and lookup for the data plane. Implement setsockopt and add placeholders to add and delete Tx connections. Signed-off-by: Adel Abouchaev --- Added net/quic/Kconfig reference to net/Kconfig in this commit. Initialized pointers with NULL instead of 0. Restricted AES counter to __le32. Added address space qualifiers to user space addresses. Removed empty lines. Updated code alignment. Removed inlines. v3: removed ITER_KVEC flag from iov_iter_kvec call. v3: fixed Chacha20 encryption bug. v3: updated to match the uAPI struct fields. v3: updated Tx flow to match on dst ip, dst port and connection id. v3: updated to drop packets if key generations do not match.
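Note for reviewers: the first-byte handling behind the v3 Tx matching rules can be sketched in userspace C. This is only an illustrative sketch, not code from the patch. The names quic_hdr_info, quic_parse_short_hdr and quic_key_gen_matches are hypothetical. The bit masks (0xc0/0x40 for the short-header form, 0x04 for the key phase, 0x03 for the packet number length) follow the checks made in quic_sendmsg() and quic_copy_header() in this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_CONN_ID_SIZE 20	/* mirrors QUIC_MAX_CONNECTION_ID_SIZE */
#define MAX_SHORT_HDR_SIZE 25	/* mirrors QUIC_MAX_SHORT_HEADER_SIZE */

/* Hypothetical helper type, not part of the patch. */
struct quic_hdr_info {
	uint8_t key_gen;	/* key phase bit of the first byte */
	uint8_t pn_len;		/* packet number length, 1..4 bytes */
	size_t hdr_len;		/* flags byte + connection ID + packet number */
};

/* Returns 0 and fills *info for a short-header packet, 1 if the packet is
 * not a short-header packet (the patch hands those to plain UDP sendmsg),
 * -1 if the header cannot be valid.
 */
static int quic_parse_short_hdr(const uint8_t *pkt, size_t len,
				size_t conn_id_len, struct quic_hdr_info *info)
{
	assert(pkt && info);

	if (len < 2 || (pkt[0] & 0xc0) != 0x40)
		return 1;
	if (conn_id_len > MAX_CONN_ID_SIZE)
		return -1;

	info->key_gen = (pkt[0] & 0x04) >> 2;
	info->pn_len = (pkt[0] & 0x03) + 1;
	info->hdr_len = 1 + conn_id_len + info->pn_len;
	if (info->hdr_len > MAX_SHORT_HDR_SIZE || info->hdr_len > len)
		return -1;
	return 0;
}

/* A packet is dropped when its key phase differs from the key generation
 * registered for the connection.
 */
static int quic_key_gen_matches(const struct quic_hdr_info *info,
				uint8_t registered_key_gen)
{
	return info->key_gen == registered_key_gen;
}
```

With a 20-byte connection ID and a 4-byte packet number the header length is 1 + 20 + 4 = 25, which is where QUIC_MAX_SHORT_HEADER_SIZE comes from.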
--- include/net/quic.h | 53 ++ net/Kconfig | 1 + net/ipv4/udp.c | 9 + net/quic/Kconfig | 16 + net/quic/Makefile | 8 + net/quic/quic_main.c | 1487 ++++++++++++++++++++++++++++++++++++++++++ 6 files changed, 1574 insertions(+) create mode 100644 include/net/quic.h create mode 100644 net/quic/Kconfig create mode 100644 net/quic/Makefile create mode 100644 net/quic/quic_main.c diff --git a/include/net/quic.h b/include/net/quic.h new file mode 100644 index 000000000000..cafe01174e60 --- /dev/null +++ b/include/net/quic.h @@ -0,0 +1,53 @@ +/* SPDX-License-Identifier: GPL-2.0-or-later */ + +#ifndef INCLUDE_NET_QUIC_H +#define INCLUDE_NET_QUIC_H + +#include +#include +#include +#include + +#define QUIC_MAX_SHORT_HEADER_SIZE 25 +#define QUIC_MAX_CONNECTION_ID_SIZE 20 +#define QUIC_HDR_MASK_SIZE 16 +#define QUIC_MAX_GSO_FRAGS 16 + +// Maximum IV and nonce sizes should be in sync with supported ciphers. +#define QUIC_CIPHER_MAX_IV_SIZE 12 +#define QUIC_CIPHER_MAX_NONCE_SIZE 16 + +/* Side by side data for QUIC egress operations */ +#define QUIC_ANCILLARY_FLAGS (QUIC_BYPASS_ENCRYPTION) + +#define QUIC_MAX_IOVEC_SEGMENTS 8 +#define QUIC_MAX_SG_ALLOC_ELEMENTS 32 +#define QUIC_MAX_PLAIN_PAGES 16 +#define QUIC_MAX_CIPHER_PAGES_ORDER 4 + +struct quic_internal_crypto_context { + struct quic_connection_info conn_info; + struct crypto_skcipher *header_tfm; + struct crypto_aead *packet_aead; +}; + +struct quic_connection_rhash { + struct rhash_head node; + struct quic_internal_crypto_context crypto_ctx; + struct rcu_head rcu; +}; + +struct quic_context { + struct proto *sk_proto; + struct rhashtable tx_connections; + struct scatterlist sg_alloc[QUIC_MAX_SG_ALLOC_ELEMENTS]; + struct page *cipher_page; + /** + * To synchronize concurrent sendmsg() requests through the same socket + * and protect preallocated per-context memory. 
+ **/ + struct mutex sendmsg_mux; + struct rcu_head rcu; +}; + +#endif diff --git a/net/Kconfig b/net/Kconfig index 48c33c222199..6824d07b9e57 100644 --- a/net/Kconfig +++ b/net/Kconfig @@ -63,6 +63,7 @@ menu "Networking options" source "net/packet/Kconfig" source "net/unix/Kconfig" source "net/tls/Kconfig" +source "net/quic/Kconfig" source "net/xfrm/Kconfig" source "net/iucv/Kconfig" source "net/smc/Kconfig" diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c index 027c4513a9cd..e7cbbea9d8d9 100644 --- a/net/ipv4/udp.c +++ b/net/ipv4/udp.c @@ -113,6 +113,7 @@ #include #include #include +#include #if IS_ENABLED(CONFIG_IPV6) #include #endif @@ -1011,6 +1012,14 @@ static int __udp_cmsg_send(struct cmsghdr *cmsg, u16 *gso_size) return -EINVAL; *gso_size = *(__u16 *)CMSG_DATA(cmsg); return 0; + case UDP_QUIC_ENCRYPT: + /* This option is handled in UDP_ULP and is only checked + * here for the bypass bit + */ + if (cmsg->cmsg_len != + CMSG_LEN(sizeof(struct quic_tx_ancillary_data))) + return -EINVAL; + return 0; default: return -EINVAL; } diff --git a/net/quic/Kconfig b/net/quic/Kconfig new file mode 100644 index 000000000000..661cb989508a --- /dev/null +++ b/net/quic/Kconfig @@ -0,0 +1,16 @@ +# SPDX-License-Identifier: GPL-2.0-only +# +# QUIC configuration +# +config QUIC + tristate "QUIC encryption offload" + depends on INET + select CRYPTO + select CRYPTO_AES + select CRYPTO_GCM + help + Enable kernel support for QUIC crypto offload. Currently only TX + encryption offload is supported. The kernel will perform + copy-during-encryption. + + If unsure, say N. 
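Note on the net/ipv4/udp.c hunk above: the UDP_QUIC_ENCRYPT case accepts a control message only when cmsg_len equals CMSG_LEN(sizeof(struct quic_tx_ancillary_data)) exactly. A minimal userspace sketch of building such a control message follows. It rests on assumptions: EXAMPLE_UDP_QUIC_ENCRYPT is a placeholder value, and example_quic_tx_ancillary_data is a stand-in whose layout is assumed. The field names come from how this patch uses the struct, but the real definition and option value live in the uAPI header of this series. example_set_quic_cmsg is a hypothetical helper.

```c
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Placeholder only: the real UDP_QUIC_ENCRYPT value comes from the uAPI. */
#define EXAMPLE_UDP_QUIC_ENCRYPT 0x7fff

/* Stand-in for struct quic_tx_ancillary_data; the layout is an assumption. */
struct example_quic_tx_ancillary_data {
	uint64_t next_pkt_num;
	uint8_t flags;
	uint8_t dst_conn_id_length;
};

/* Fills msg with one control message carrying *data. buf must be suitably
 * aligned for struct cmsghdr and hold at least CMSG_SPACE(sizeof(*data))
 * bytes. Returns 0 on success, -1 if the buffer is too small.
 */
static int example_set_quic_cmsg(struct msghdr *msg, void *buf, size_t buf_len,
				 const struct example_quic_tx_ancillary_data *data)
{
	struct cmsghdr *cmsg;

	assert(msg && buf && data);

	if (buf_len < CMSG_SPACE(sizeof(*data)))
		return -1;

	memset(buf, 0, buf_len);
	msg->msg_control = buf;
	msg->msg_controllen = CMSG_SPACE(sizeof(*data));

	cmsg = CMSG_FIRSTHDR(msg);
	if (!cmsg)
		return -1;

	cmsg->cmsg_level = IPPROTO_UDP;
	cmsg->cmsg_type = EXAMPLE_UDP_QUIC_ENCRYPT;
	/* The kernel-side check compares against exactly this length. */
	cmsg->cmsg_len = CMSG_LEN(sizeof(*data));
	memcpy(CMSG_DATA(cmsg), data, sizeof(*data));
	return 0;
}
```

A caller would pass the filled msghdr to sendmsg() together with the payload iovec. Any other cmsg_len for this option is rejected with -EINVAL by the check above.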
diff --git a/net/quic/Makefile b/net/quic/Makefile new file mode 100644 index 000000000000..928239c4d08c --- /dev/null +++ b/net/quic/Makefile @@ -0,0 +1,8 @@ +# SPDX-License-Identifier: GPL-2.0-only +# +# Makefile for the QUIC subsystem +# + +obj-$(CONFIG_QUIC) += quic.o + +quic-y := quic_main.o diff --git a/net/quic/quic_main.c b/net/quic/quic_main.c new file mode 100644 index 000000000000..a43d989a1c8e --- /dev/null +++ b/net/quic/quic_main.c @@ -0,0 +1,1487 @@ +// SPDX-License-Identifier: GPL-2.0-only + +#include +#include +#include +#include +// Include header to use TLS constants for AEAD cipher. +#include +#include +#include +#include + +static unsigned long af_init_done; +static struct proto quic_v4_proto; +static struct proto quic_v6_proto; +static DEFINE_SPINLOCK(quic_proto_lock); + +static u32 quic_tx_connection_hash(const void *data, u32 len, u32 seed) +{ + return jhash(data, len, seed); +} + +static u32 quic_tx_connection_hash_obj(const void *data, u32 len, u32 seed) +{ + const struct quic_connection_rhash *connhash = data; + + return jhash(&connhash->crypto_ctx.conn_info.key, + sizeof(struct quic_connection_info_key), seed); +} + +static int quic_tx_connection_hash_cmp(struct rhashtable_compare_arg *arg, + const void *ptr) +{ + const struct quic_connection_info_key *key = arg->key; + const struct quic_connection_rhash *x = ptr; + + return !!memcmp(&x->crypto_ctx.conn_info.key, + key, + sizeof(struct quic_connection_info_key)); +} + +static const struct rhashtable_params quic_tx_connection_params = { + .key_len = sizeof(struct quic_connection_info_key), + .head_offset = offsetof(struct quic_connection_rhash, node), + .hashfn = quic_tx_connection_hash, + .obj_hashfn = quic_tx_connection_hash_obj, + .obj_cmpfn = quic_tx_connection_hash_cmp, + .automatic_shrinking = true, +}; + +static size_t quic_crypto_key_size(u16 cipher_type) +{ + switch (cipher_type) { + case TLS_CIPHER_AES_GCM_128: + return TLS_CIPHER_AES_GCM_128_KEY_SIZE; + case 
TLS_CIPHER_AES_GCM_256: + return TLS_CIPHER_AES_GCM_256_KEY_SIZE; + case TLS_CIPHER_AES_CCM_128: + return TLS_CIPHER_AES_CCM_128_KEY_SIZE; + case TLS_CIPHER_CHACHA20_POLY1305: + return TLS_CIPHER_CHACHA20_POLY1305_KEY_SIZE; + default: + break; + } + WARN_ON("Unsupported cipher type"); + return 0; +} + +static size_t quic_crypto_tag_size(u16 cipher_type) +{ + switch (cipher_type) { + case TLS_CIPHER_AES_GCM_128: + return TLS_CIPHER_AES_GCM_128_TAG_SIZE; + case TLS_CIPHER_AES_GCM_256: + return TLS_CIPHER_AES_GCM_256_TAG_SIZE; + case TLS_CIPHER_AES_CCM_128: + return TLS_CIPHER_AES_CCM_128_TAG_SIZE; + case TLS_CIPHER_CHACHA20_POLY1305: + return TLS_CIPHER_CHACHA20_POLY1305_TAG_SIZE; + default: + break; + } + WARN_ON("Unsupported cipher type"); + return 0; +} + +static size_t quic_crypto_nonce_size(u16 cipher_type) +{ + switch (cipher_type) { + case TLS_CIPHER_AES_GCM_128: + BUILD_BUG_ON(TLS_CIPHER_AES_GCM_128_IV_SIZE + + TLS_CIPHER_AES_GCM_128_SALT_SIZE > + QUIC_CIPHER_MAX_NONCE_SIZE); + return TLS_CIPHER_AES_GCM_128_IV_SIZE + + TLS_CIPHER_AES_GCM_128_SALT_SIZE; + case TLS_CIPHER_AES_GCM_256: + BUILD_BUG_ON(TLS_CIPHER_AES_GCM_256_IV_SIZE + + TLS_CIPHER_AES_GCM_256_SALT_SIZE > + QUIC_CIPHER_MAX_NONCE_SIZE); + return TLS_CIPHER_AES_GCM_256_IV_SIZE + + TLS_CIPHER_AES_GCM_256_SALT_SIZE; + case TLS_CIPHER_AES_CCM_128: + BUILD_BUG_ON(TLS_CIPHER_AES_CCM_128_IV_SIZE + + TLS_CIPHER_AES_CCM_128_SALT_SIZE > + QUIC_CIPHER_MAX_NONCE_SIZE); + return TLS_CIPHER_AES_CCM_128_IV_SIZE + + TLS_CIPHER_AES_CCM_128_SALT_SIZE; + case TLS_CIPHER_CHACHA20_POLY1305: + BUILD_BUG_ON(TLS_CIPHER_CHACHA20_POLY1305_IV_SIZE + + TLS_CIPHER_CHACHA20_POLY1305_SALT_SIZE > + QUIC_CIPHER_MAX_NONCE_SIZE); + return TLS_CIPHER_CHACHA20_POLY1305_IV_SIZE + + TLS_CIPHER_CHACHA20_POLY1305_SALT_SIZE; + default: + break; + } + WARN_ON("Unsupported cipher type"); + return 0; +} + +static u8 *quic_payload_iv(struct quic_internal_crypto_context *crypto_ctx) +{ + switch (crypto_ctx->conn_info.cipher_type) { + case 
TLS_CIPHER_AES_GCM_128: + return crypto_ctx->conn_info.aes_gcm_128.payload_iv; + case TLS_CIPHER_AES_GCM_256: + return crypto_ctx->conn_info.aes_gcm_256.payload_iv; + case TLS_CIPHER_AES_CCM_128: + return crypto_ctx->conn_info.aes_ccm_128.payload_iv; + case TLS_CIPHER_CHACHA20_POLY1305: + return crypto_ctx->conn_info.chacha20_poly1305.payload_iv; + default: + break; + } + WARN_ON("Unsupported cipher type"); + return NULL; +} + +static int +quic_config_header_crypto(struct quic_internal_crypto_context *crypto_ctx) +{ + struct crypto_skcipher *tfm; + char *header_cipher; + int rc = 0; + char *key; + + switch (crypto_ctx->conn_info.cipher_type) { + case TLS_CIPHER_AES_GCM_128: + header_cipher = "ecb(aes)"; + key = crypto_ctx->conn_info.aes_gcm_128.header_key; + break; + case TLS_CIPHER_AES_GCM_256: + header_cipher = "ecb(aes)"; + key = crypto_ctx->conn_info.aes_gcm_256.header_key; + break; + case TLS_CIPHER_AES_CCM_128: + header_cipher = "ecb(aes)"; + key = crypto_ctx->conn_info.aes_ccm_128.header_key; + break; + case TLS_CIPHER_CHACHA20_POLY1305: + header_cipher = "chacha20"; + key = crypto_ctx->conn_info.chacha20_poly1305.header_key; + break; + default: + rc = -EINVAL; + goto out; + } + + tfm = crypto_alloc_skcipher(header_cipher, 0, 0); + if (IS_ERR(tfm)) { + rc = PTR_ERR(tfm); + goto out; + } + + rc = crypto_skcipher_setkey(tfm, key, + quic_crypto_key_size(crypto_ctx->conn_info + .cipher_type)); + if (rc) { + crypto_free_skcipher(tfm); + goto out; + } + + crypto_ctx->header_tfm = tfm; + +out: + return rc; +} + +static int +quic_config_packet_crypto(struct quic_internal_crypto_context *crypto_ctx) +{ + struct crypto_aead *aead; + char *cipher_name; + int rc = 0; + char *key; + + switch (crypto_ctx->conn_info.cipher_type) { + case TLS_CIPHER_AES_GCM_128: { + key = crypto_ctx->conn_info.aes_gcm_128.payload_key; + cipher_name = "gcm(aes)"; + break; + } + case TLS_CIPHER_AES_GCM_256: { + key = crypto_ctx->conn_info.aes_gcm_256.payload_key; + cipher_name = "gcm(aes)"; + 
break; + } + case TLS_CIPHER_AES_CCM_128: { + key = crypto_ctx->conn_info.aes_ccm_128.payload_key; + cipher_name = "ccm(aes)"; + break; + } + case TLS_CIPHER_CHACHA20_POLY1305: { + key = crypto_ctx->conn_info.chacha20_poly1305.payload_key; + cipher_name = "rfc7539(chacha20,poly1305)"; + break; + } + default: + rc = -EINVAL; + goto out; + } + + aead = crypto_alloc_aead(cipher_name, 0, 0); + if (IS_ERR(aead)) { + rc = PTR_ERR(aead); + goto out; + } + + rc = crypto_aead_setkey(aead, key, + quic_crypto_key_size(crypto_ctx->conn_info + .cipher_type)); + if (rc) + goto free_aead; + + rc = crypto_aead_setauthsize(aead, + quic_crypto_tag_size(crypto_ctx->conn_info + .cipher_type)); + if (rc) + goto free_aead; + + crypto_ctx->packet_aead = aead; + goto out; + +free_aead: + crypto_free_aead(aead); + +out: + return rc; +} + +static inline struct quic_context *quic_get_ctx(struct sock *sk) +{ + struct inet_sock *inet = inet_sk(sk); + + return (__force void *)rcu_access_pointer(inet->ulp_data); +} + +static void quic_free_cipher_page(struct page *page) +{ + __free_pages(page, QUIC_MAX_CIPHER_PAGES_ORDER); +} + +static struct quic_context *quic_ctx_create(void) +{ + struct quic_context *ctx; + + ctx = kzalloc(sizeof(*ctx), GFP_KERNEL); + if (!ctx) + return NULL; + + mutex_init(&ctx->sendmsg_mux); + ctx->cipher_page = alloc_pages(GFP_KERNEL, QUIC_MAX_CIPHER_PAGES_ORDER); + if (!ctx->cipher_page) + goto out_err; + + if (rhashtable_init(&ctx->tx_connections, + &quic_tx_connection_params) < 0) { + quic_free_cipher_page(ctx->cipher_page); + goto out_err; + } + + return ctx; + +out_err: + kfree(ctx); + return NULL; +} + +static int quic_getsockopt(struct sock *sk, int level, int optname, + char __user *optval, int __user *optlen) +{ + struct quic_context *ctx = quic_get_ctx(sk); + + return ctx->sk_proto->getsockopt(sk, level, optname, optval, optlen); +} + +static void quic_update_key_if_mapped_ipv4(struct quic_connection_info_key *key) +{ + if 
(ipv6_addr_v4mapped(&key->addr.ipv6_addr)) { + key->addr.ipv6_addr.s6_addr32[0] = + key->addr.ipv6_addr.s6_addr32[3]; + key->addr.ipv6_addr.s6_addr32[1] = 0; + key->addr.ipv6_addr.s6_addr32[2] = 0; + key->addr.ipv6_addr.s6_addr32[3] = 0; + } +} + +static int do_quic_conn_add_tx(struct sock *sk, sockptr_t optval, + unsigned int optlen) +{ + struct quic_internal_crypto_context *crypto_ctx; + struct quic_context *ctx = quic_get_ctx(sk); + struct quic_connection_rhash *connhash; + int rc = 0; + + if (sockptr_is_null(optval)) + return -EINVAL; + + if (optlen != sizeof(struct quic_connection_info)) + return -EINVAL; + + connhash = kzalloc(sizeof(*connhash), GFP_KERNEL); + if (!connhash) + return -ENOMEM; + + crypto_ctx = &connhash->crypto_ctx; + rc = copy_from_sockptr(&crypto_ctx->conn_info, optval, + sizeof(crypto_ctx->conn_info)); + if (rc) { + rc = -EFAULT; + goto err_crypto_info; + } + + quic_update_key_if_mapped_ipv4(&crypto_ctx->conn_info.key); + + if (crypto_ctx->conn_info.key.dst_conn_id_length > + QUIC_MAX_CONNECTION_ID_SIZE) { + rc = -EINVAL; + goto err_crypto_info; + } + + if (crypto_ctx->conn_info.conn_payload_key_gen > 1) { + rc = -EINVAL; + goto err_crypto_info; + } + + // create all TLS materials for packet and header encryption + rc = quic_config_header_crypto(crypto_ctx); + if (rc) + goto err_crypto_info; + + rc = quic_config_packet_crypto(crypto_ctx); + if (rc) + goto err_free_skcipher; + + // insert crypto data into hash per connection ID + rc = rhashtable_insert_fast(&ctx->tx_connections, &connhash->node, + quic_tx_connection_params); + if (rc < 0) + goto err_free_ciphers; + + return 0; + +err_free_ciphers: + crypto_free_aead(crypto_ctx->packet_aead); + +err_free_skcipher: + crypto_free_skcipher(crypto_ctx->header_tfm); + +err_crypto_info: + // wipe out all crypto materials + memzero_explicit(&connhash->crypto_ctx, sizeof(connhash->crypto_ctx)); + kfree(connhash); + return rc; +} + +static int do_quic_conn_del_tx(struct sock *sk, sockptr_t optval, + 
unsigned int optlen) +{ + struct quic_internal_crypto_context *crypto_ctx; + struct quic_context *ctx = quic_get_ctx(sk); + struct quic_connection_rhash *connhash; + struct quic_connection_info conn_info; + + if (sockptr_is_null(optval)) + return -EINVAL; + + if (optlen != sizeof(struct quic_connection_info)) + return -EINVAL; + + if (copy_from_sockptr(&conn_info, optval, optlen)) + return -EFAULT; + + if (conn_info.key.dst_conn_id_length > + QUIC_MAX_CONNECTION_ID_SIZE) + return -EINVAL; + + if (conn_info.conn_payload_key_gen > 1) + return -EINVAL; + + quic_update_key_if_mapped_ipv4(&conn_info.key); + + connhash = rhashtable_lookup_fast(&ctx->tx_connections, + &conn_info.key, + quic_tx_connection_params); + if (!connhash) + return -EINVAL; + + rhashtable_remove_fast(&ctx->tx_connections, + &connhash->node, + quic_tx_connection_params); + + crypto_ctx = &connhash->crypto_ctx; + + crypto_free_skcipher(crypto_ctx->header_tfm); + crypto_free_aead(crypto_ctx->packet_aead); + memzero_explicit(crypto_ctx, sizeof(*crypto_ctx)); + kfree(connhash); + + return 0; +} + +static int do_quic_setsockopt(struct sock *sk, int optname, sockptr_t optval, + unsigned int optlen) +{ + int rc = 0; + + switch (optname) { + case UDP_QUIC_ADD_TX_CONNECTION: + lock_sock(sk); + rc = do_quic_conn_add_tx(sk, optval, optlen); + release_sock(sk); + break; + case UDP_QUIC_DEL_TX_CONNECTION: + lock_sock(sk); + rc = do_quic_conn_del_tx(sk, optval, optlen); + release_sock(sk); + break; + default: + rc = -ENOPROTOOPT; + break; + } + + return rc; +} + +static int quic_setsockopt(struct sock *sk, int level, int optname, + sockptr_t optval, unsigned int optlen) +{ + struct quic_context *ctx; + struct proto *sk_proto; + + rcu_read_lock(); + ctx = quic_get_ctx(sk); + sk_proto = ctx->sk_proto; + rcu_read_unlock(); + + if (level == SOL_UDP && + (optname == UDP_QUIC_ADD_TX_CONNECTION || + optname == UDP_QUIC_DEL_TX_CONNECTION)) + return do_quic_setsockopt(sk, optname, optval, optlen); + + return 
sk_proto->setsockopt(sk, level, optname, optval, optlen); +} + +static int +quic_extract_ancillary_data(struct msghdr *msg, + struct quic_tx_ancillary_data *ancillary_data, + u16 *udp_pkt_size) +{ + struct cmsghdr *cmsg_hdr = NULL; + void *ancillary_data_ptr = NULL; + + if (!msg->msg_controllen) + return -EINVAL; + + for_each_cmsghdr(cmsg_hdr, msg) { + if (!CMSG_OK(msg, cmsg_hdr)) + return -EINVAL; + + if (cmsg_hdr->cmsg_level != IPPROTO_UDP) + continue; + + if (cmsg_hdr->cmsg_type == UDP_QUIC_ENCRYPT) { + if (cmsg_hdr->cmsg_len != + CMSG_LEN(sizeof(struct quic_tx_ancillary_data))) + return -EINVAL; + memcpy((void *)ancillary_data, CMSG_DATA(cmsg_hdr), + sizeof(struct quic_tx_ancillary_data)); + ancillary_data_ptr = cmsg_hdr; + } else if (cmsg_hdr->cmsg_type == UDP_SEGMENT) { + if (cmsg_hdr->cmsg_len != CMSG_LEN(sizeof(u16))) + return -EINVAL; + memcpy((void *)udp_pkt_size, CMSG_DATA(cmsg_hdr), + sizeof(u16)); + } + } + + if (!ancillary_data_ptr) + return -EINVAL; + + return 0; +} + +static int quic_sendmsg_validate(struct msghdr *msg) +{ + if (!iter_is_iovec(&msg->msg_iter)) + return -EINVAL; + + if (!msg->msg_controllen) + return -EINVAL; + + return 0; +} + +static struct quic_connection_rhash +*quic_lookup_connection(struct quic_context *ctx, + u8 *conn_id, + struct quic_tx_ancillary_data *ancillary_data, + sa_family_t sa_family, + void *addr, + __be16 port) +{ + struct quic_connection_info_key conn_key; + size_t addrlen; + + // Lookup connection information by the connection key. + memset(&conn_key, 0, sizeof(struct quic_connection_info_key)); + // fill the connection id up to the max connection ID length + if (ancillary_data->dst_conn_id_length > QUIC_MAX_CONNECTION_ID_SIZE) + return NULL; + + conn_key.dst_conn_id_length = ancillary_data->dst_conn_id_length; + if (ancillary_data->dst_conn_id_length) + memcpy(conn_key.dst_conn_id, + conn_id, + ancillary_data->dst_conn_id_length); + + addrlen = (sa_family == AF_INET) ? 
4 : 16; + memcpy(&conn_key.addr, addr, addrlen); + conn_key.udp_port = port; + + return rhashtable_lookup_fast(&ctx->tx_connections, + &conn_key, + quic_tx_connection_params); +} + +static int quic_sg_capacity_from_msg(const size_t pkt_size, + const off_t offset, + const size_t length) +{ + size_t pages = 0; + size_t pkts = 0; + + pages = DIV_ROUND_UP(offset + length, PAGE_SIZE); + pkts = DIV_ROUND_UP(length, pkt_size); + return pages + pkts + 1; +} + +static void quic_put_plain_user_pages(struct page **pages, size_t nr_pages) +{ + int i; + + for (i = 0; i < nr_pages; ++i) + if (i == 0 || pages[i] != pages[i - 1]) + put_page(pages[i]); +} + +static int quic_get_plain_user_pages(struct msghdr * const msg, + struct page **pages, + int *page_indices) +{ + void __user *data_addr; + size_t nr_mapped = 0; + size_t nr_pages = 0; + void *page_addr; + size_t count = 0; + off_t data_off; + int ret = 0; + int i; + + for (i = 0; i < msg->msg_iter.nr_segs; ++i) { + data_addr = msg->msg_iter.iov[i].iov_base; + if (!i) + data_addr += msg->msg_iter.iov_offset; + page_addr = + (void *)((unsigned long)data_addr & PAGE_MASK); + + data_off = (unsigned long)data_addr & ~PAGE_MASK; + nr_pages = + DIV_ROUND_UP(data_off + msg->msg_iter.iov[i].iov_len, + PAGE_SIZE); + if (nr_mapped + nr_pages > QUIC_MAX_PLAIN_PAGES) { + quic_put_plain_user_pages(pages, nr_mapped); + ret = -ENOMEM; + goto out; + } + + count = get_user_pages((unsigned long)page_addr, nr_pages, 1, + pages, NULL); + if (count < nr_pages) { + quic_put_plain_user_pages(pages, nr_mapped + count); + ret = -ENOMEM; + goto out; + } + + page_indices[i] = nr_mapped; + nr_mapped += count; + pages += count; + } + ret = nr_mapped; + +out: + return ret; +} + +static int quic_sg_plain_from_mapped_msg(struct msghdr * const msg, + struct page **plain_pages, + void **iov_base_ptrs, + void **iov_data_ptrs, + const size_t plain_size, + const size_t pkt_size, + struct scatterlist * const sg_alloc, + const size_t max_sg_alloc, + struct 
scatterlist ** const sg_pkts, + size_t *nr_plain_pages) +{ + int iov_page_indices[QUIC_MAX_IOVEC_SEGMENTS]; + struct scatterlist *sg; + unsigned int pkt_i = 0; + ssize_t left_on_page; + size_t pkt_left; + unsigned int i; + size_t seg_len; + off_t page_ofs; + off_t seg_ofs; + int ret = 0; + int page_i; + + if (msg->msg_iter.nr_segs >= QUIC_MAX_IOVEC_SEGMENTS) { + ret = -ENOMEM; + goto out; + } + + ret = quic_get_plain_user_pages(msg, plain_pages, iov_page_indices); + if (ret < 0) + goto out; + + *nr_plain_pages = ret; + sg = sg_alloc; + sg_pkts[pkt_i] = sg; + sg_unmark_end(sg); + pkt_left = pkt_size; + for (i = 0; i < msg->msg_iter.nr_segs; ++i) { + page_ofs = ((unsigned long)msg->msg_iter.iov[i].iov_base + & (PAGE_SIZE - 1)); + page_i = 0; + if (!i) { + page_ofs += msg->msg_iter.iov_offset; + while (page_ofs >= PAGE_SIZE) { + page_ofs -= PAGE_SIZE; + page_i++; + } + } + + seg_len = msg->msg_iter.iov[i].iov_len; + page_i += iov_page_indices[i]; + + if (page_i >= QUIC_MAX_PLAIN_PAGES) + return -EFAULT; + + seg_ofs = 0; + while (seg_ofs < seg_len) { + if (sg - sg_alloc > max_sg_alloc) + return -EFAULT; + + sg_unmark_end(sg); + left_on_page = min_t(size_t, PAGE_SIZE - page_ofs, + seg_len - seg_ofs); + if (left_on_page <= 0) + return -EFAULT; + + if (left_on_page > pkt_left) { + sg_set_page(sg, plain_pages[page_i], pkt_left, + page_ofs); + pkt_i++; + seg_ofs += pkt_left; + page_ofs += pkt_left; + sg_mark_end(sg); + sg++; + sg_pkts[pkt_i] = sg; + pkt_left = pkt_size; + continue; + } + sg_set_page(sg, plain_pages[page_i], left_on_page, + page_ofs); + page_i++; + page_ofs = 0; + seg_ofs += left_on_page; + pkt_left -= left_on_page; + if (pkt_left == 0 || + (seg_ofs == seg_len && + i == msg->msg_iter.nr_segs - 1)) { + sg_mark_end(sg); + pkt_i++; + sg++; + sg_pkts[pkt_i] = sg; + pkt_left = pkt_size; + } else { + sg++; + } + } + } + + if (pkt_left && pkt_left != pkt_size) { + pkt_i++; + sg_mark_end(sg); + } + ret = pkt_i; + +out: + return ret; +} + +/* sg_alloc: allocated 
zeroed array of scatterlists + * cipher_page: preallocated compound page + */ +static int quic_sg_cipher_from_pkts(const size_t cipher_tag_size, + const size_t plain_pkt_size, + const size_t plain_size, + struct page * const cipher_page, + struct scatterlist * const sg_alloc, + const size_t nr_sg_alloc, + struct scatterlist ** const sg_cipher) +{ + const size_t cipher_pkt_size = plain_pkt_size + cipher_tag_size; + size_t pkts = DIV_ROUND_UP(plain_size, plain_pkt_size); + struct scatterlist *sg = sg_alloc; + int pkt_i; + void *ptr; + + if (pkts > nr_sg_alloc) + return -EINVAL; + + ptr = page_address(cipher_page); + for (pkt_i = 0; pkt_i < pkts; + ++pkt_i, ptr += cipher_pkt_size, ++sg) { + sg_set_buf(sg, ptr, cipher_pkt_size); + sg_mark_end(sg); + sg_cipher[pkt_i] = sg; + } + return pkts; +} + +/* fast copy from scatterlist to a buffer assuming that all pages are + * available in kernel memory. + */ +static int quic_sg_pcopy_to_buffer_kernel(struct scatterlist *sg, + u8 *buffer, + size_t bytes_to_copy, + off_t offset_to_read) +{ + off_t sg_remain = sg->length; + size_t to_copy; + + if (!bytes_to_copy) + return 0; + + /* skip to offset first */ + while (offset_to_read > 0) { + if (!sg_remain) + return -EINVAL; + if (offset_to_read < sg_remain) { + sg_remain -= offset_to_read; + break; + } + offset_to_read -= sg_remain; + sg = sg_next(sg); + if (!sg) + return -EINVAL; + sg_remain = sg->length; + } + + /* traverse sg list from offset to offset + bytes_to_copy */ + while (bytes_to_copy) { + to_copy = min_t(size_t, bytes_to_copy, sg_remain); + if (!to_copy) + return -EINVAL; + memcpy(buffer, sg_virt(sg) + (sg->length - sg_remain), to_copy); + buffer += to_copy; + bytes_to_copy -= to_copy; + if (bytes_to_copy) { + sg = sg_next(sg); + if (!sg) + return -EINVAL; + sg_remain = sg->length; + } + } + + return 0; +} + +static int quic_copy_header(struct scatterlist *sg_plain, + u8 *buf, const size_t buf_len, + const size_t conn_id_len) +{ + u8 *pkt = sg_virt(sg_plain); + size_t 
hdr_len; + + hdr_len = 1 + conn_id_len + ((*pkt & 0x03) + 1); + if (hdr_len > QUIC_MAX_SHORT_HEADER_SIZE || hdr_len > buf_len) + return -EINVAL; + + WARN_ON_ONCE(quic_sg_pcopy_to_buffer_kernel(sg_plain, buf, hdr_len, 0)); + return hdr_len; +} + +static u64 quic_unpack_pkt_num(struct quic_tx_ancillary_data * const control, + const u8 * const hdr, + const off_t payload_crypto_off) +{ + u64 truncated_pn = 0; + u64 candidate_pn; + u64 expected_pn; + u64 pn_hwin; + u64 pn_mask; + u64 pn_len; + u64 pn_win; + int i; + + pn_len = (hdr[0] & 0x03) + 1; + expected_pn = control->next_pkt_num; + + for (i = 1 + control->dst_conn_id_length; i < payload_crypto_off; ++i) { + truncated_pn <<= 8; + truncated_pn |= hdr[i]; + } + + pn_win = 1ULL << (pn_len << 3); + pn_hwin = pn_win >> 1; + pn_mask = pn_win - 1; + candidate_pn = (expected_pn & ~pn_mask) | truncated_pn; + + if (expected_pn > pn_hwin && + candidate_pn <= expected_pn - pn_hwin && + candidate_pn < (1ULL << 62) - pn_win) + return candidate_pn + pn_win; + + if (candidate_pn > expected_pn + pn_hwin && + candidate_pn >= pn_win) + return candidate_pn - pn_win; + + return candidate_pn; +} + +static int +quic_construct_header_prot_mask(struct quic_internal_crypto_context *crypto_ctx, + struct skcipher_request *hdr_mask_req, + struct scatterlist *sg_cipher_pkt, + off_t sample_offset, + u8 *hdr_mask) +{ + u8 *sample = sg_virt(sg_cipher_pkt) + sample_offset; + u8 hdr_ctr[sizeof(u32) + QUIC_CIPHER_MAX_IV_SIZE]; + u8 chacha20_zeros[5] = {0, 0, 0, 0, 0}; + struct scatterlist sg_cipher_sample; + struct scatterlist sg_hdr_mask; + struct crypto_wait wait_header; + __le32 counter; + + BUILD_BUG_ON(QUIC_HDR_MASK_SIZE + < sizeof(u32) + QUIC_CIPHER_MAX_IV_SIZE); + + sg_init_one(&sg_hdr_mask, hdr_mask, QUIC_HDR_MASK_SIZE); + skcipher_request_set_callback(hdr_mask_req, 0, crypto_req_done, + &wait_header); + + if (crypto_ctx->conn_info.cipher_type == TLS_CIPHER_CHACHA20_POLY1305) { + sg_init_one(&sg_cipher_sample, (u8 *)chacha20_zeros, + 
sizeof(chacha20_zeros)); + counter = cpu_to_le32(*((u32 *)sample)); + memset(hdr_ctr, 0, sizeof(hdr_ctr)); + memcpy((u8 *)hdr_ctr, (u8 *)&counter, sizeof(u32)); + memcpy((u8 *)hdr_ctr + sizeof(u32), + (sample + sizeof(u32)), + QUIC_CIPHER_MAX_IV_SIZE); + skcipher_request_set_crypt(hdr_mask_req, &sg_cipher_sample, + &sg_hdr_mask, 5, hdr_ctr); + } else { + /* cipher pages are continuous, get the pointer to the sg data + directly, pages are allocated in kernel */ + sg_init_one(&sg_cipher_sample, sample, QUIC_HDR_MASK_SIZE); + skcipher_request_set_crypt(hdr_mask_req, &sg_cipher_sample, + &sg_hdr_mask, QUIC_HDR_MASK_SIZE, + NULL); + } + + return crypto_wait_req(crypto_skcipher_encrypt(hdr_mask_req), + &wait_header); +} + +static int quic_protect_header(struct quic_internal_crypto_context *crypto_ctx, + struct quic_tx_ancillary_data *control, + struct skcipher_request *hdr_mask_req, + struct scatterlist *sg_cipher_pkt, + int payload_crypto_off) +{ + u8 hdr_mask[QUIC_HDR_MASK_SIZE]; + off_t quic_pkt_num_off; + u8 quic_pkt_num_len; + u8 *cipher_hdr; + int err; + int i; + + quic_pkt_num_off = 1 + control->dst_conn_id_length; + quic_pkt_num_len = payload_crypto_off - quic_pkt_num_off; + + if (quic_pkt_num_len > 4) + return -EPERM; + + err = quic_construct_header_prot_mask(crypto_ctx, hdr_mask_req, + sg_cipher_pkt, + payload_crypto_off + + (4 - quic_pkt_num_len), + hdr_mask); + if (unlikely(err)) + return err; + + cipher_hdr = sg_virt(sg_cipher_pkt); + /* protect the public flags */ + cipher_hdr[0] ^= (hdr_mask[0] & 0x1f); + + for (i = 0; i < quic_pkt_num_len; ++i) + cipher_hdr[quic_pkt_num_off + i] ^= hdr_mask[1 + i]; + + return 0; +} + +static +void quic_construct_ietf_nonce(u8 *nonce, + struct quic_internal_crypto_context *crypto_ctx, + u64 quic_pkt_num) +{ + u8 *iv = quic_payload_iv(crypto_ctx); + int i; + + for (i = quic_crypto_nonce_size(crypto_ctx->conn_info.cipher_type) - 1; + i >= 0 && quic_pkt_num; + --i, quic_pkt_num >>= 8) + nonce[i] = iv[i] ^ (u8)quic_pkt_num; + 
+ for (; i >= 0; --i) + nonce[i] = iv[i]; +} + +static ssize_t quic_sendpage(struct quic_context *ctx, + struct sock *sk, + struct msghdr *msg, + const size_t cipher_size, + struct page * const cipher_page) +{ + struct kvec iov; + ssize_t ret; + + iov.iov_base = page_address(cipher_page); + iov.iov_len = cipher_size; + iov_iter_kvec(&msg->msg_iter, WRITE, &iov, 1, cipher_size); + ret = security_socket_sendmsg(sk->sk_socket, msg, msg_data_left(msg)); + if (ret) + return ret; + + ret = ctx->sk_proto->sendmsg(sk, msg, msg_data_left(msg)); + WARN_ON(ret == -EIOCBQUEUED); + return ret; +} + +static int quic_extract_dst_address_info(struct sock *sk, struct msghdr *msg, + sa_family_t *sa_family, void **daddr, + __be16 *dport) +{ + DECLARE_SOCKADDR(struct sockaddr_in6 *, usin6, msg->msg_name); + DECLARE_SOCKADDR(struct sockaddr_in *, usin, msg->msg_name); + struct inet_sock *inet = inet_sk(sk); + struct ipv6_pinfo *np = inet6_sk(sk); + + if (usin6) { + /* dst address is provided in msg */ + *sa_family = usin6->sin6_family; + switch (*sa_family) { + case AF_INET: + if (msg->msg_namelen < sizeof(*usin)) + return -EINVAL; + *daddr = &usin->sin_addr.s_addr; + *dport = usin->sin_port; + break; + case AF_INET6: + if (msg->msg_namelen < sizeof(*usin6)) + return -EINVAL; + *daddr = &usin6->sin6_addr; + *dport = usin6->sin6_port; + break; + default: + return -EAFNOSUPPORT; + } + } else { + /* socket should be connected */ + if (sk->sk_state != TCP_ESTABLISHED) + return -EDESTADDRREQ; + if (np) { + *sa_family = AF_INET6; + *daddr = &sk->sk_v6_daddr; + *dport = inet->inet_dport; + } else if (inet) { + *sa_family = AF_INET; + *daddr = &sk->sk_daddr; + *dport = inet->inet_dport; + } else { + return -EAFNOSUPPORT; + } + } + + if (!*dport || !*daddr) + return -EINVAL; + + if (*sa_family == AF_INET6 && + ipv6_addr_v4mapped((struct in6_addr *)(*daddr))) { + *daddr = &((struct in6_addr *)(*daddr))->s6_addr32[3]; + *sa_family = AF_INET; + } + + return 0; +} + +static int quic_sendmsg(struct 
sock *sk, struct msghdr *msg, size_t len) +{ + struct quic_internal_crypto_context *crypto_ctx = NULL; + struct scatterlist *sg_cipher_pkts[QUIC_MAX_GSO_FRAGS]; + struct scatterlist *sg_plain_pkts[QUIC_MAX_GSO_FRAGS]; + struct page *plain_pages[QUIC_MAX_PLAIN_PAGES]; + void *plain_base_ptrs[QUIC_MAX_IOVEC_SEGMENTS]; + void *plain_data_ptrs[QUIC_MAX_IOVEC_SEGMENTS]; + struct msghdr msg_cipher = { + .msg_name = msg->msg_name, + .msg_namelen = msg->msg_namelen, + .msg_flags = msg->msg_flags, + .msg_control = msg->msg_control, + .msg_controllen = msg->msg_controllen, + }; + struct quic_connection_rhash *connhash = NULL; + struct quic_context *ctx = quic_get_ctx(sk); + u8 hdr_buf[QUIC_MAX_SHORT_HEADER_SIZE]; + struct skcipher_request *hdr_mask_req; + struct quic_tx_ancillary_data control; + struct aead_request *aead_req = NULL; + u8 nonce[QUIC_CIPHER_MAX_NONCE_SIZE]; + struct scatterlist *sg_cipher = NULL; + struct udp_sock *up = udp_sk(sk); + struct scatterlist *sg_plain = NULL; + u16 gso_pkt_size = up->gso_size; + size_t last_plain_pkt_size = 0; + off_t payload_crypto_offset; + struct crypto_aead *tfm = NULL; + size_t nr_plain_pages = 0; + struct crypto_wait waiter; + size_t nr_sg_cipher_pkts; + size_t nr_sg_plain_pkts; + u8 conn_payload_key_gen; + ssize_t hdr_buf_len = 0; + size_t nr_sg_alloc = 0; + size_t plain_pkt_size; + sa_family_t sa_family; + u64 full_pkt_num; + size_t cipher_size; + size_t plain_size; + size_t pkt_size; + size_t tag_size; + __be16 dport; + int ret = 0; + void *daddr; + int pkt_i; + int err; + + memset(&hdr_buf[0], 0, QUIC_MAX_SHORT_HEADER_SIZE); + hdr_buf_len = copy_from_iter(hdr_buf, QUIC_MAX_SHORT_HEADER_SIZE, + &msg->msg_iter); + if (hdr_buf_len <= 0) { + ret = -EINVAL; + goto out; + } + iov_iter_revert(&msg->msg_iter, hdr_buf_len); + + ctx = quic_get_ctx(sk); + + // Bypass for anything that is guaranteed not QUIC. 
+ plain_size = len; + + if (plain_size < 2) + return ctx->sk_proto->sendmsg(sk, msg, len); + + // Bypass for other than short header. + if ((hdr_buf[0] & 0xc0) != 0x40) + return ctx->sk_proto->sendmsg(sk, msg, len); + + // Crypto adds a tag after the packet. Corking a payload would produce + // a crypto tag after each portion. Use GSO instead. + if ((msg->msg_flags & MSG_MORE) || up->pending) { + ret = -EINVAL; + goto out; + } + + ret = quic_sendmsg_validate(msg); + if (ret) + goto out; + + ret = quic_extract_ancillary_data(msg, &control, &gso_pkt_size); + if (ret) + goto out; + + // Reserved bits with ancillary data present are an error. + if (control.flags & ~QUIC_ANCILLARY_FLAGS) { + ret = -EINVAL; + goto out; + } + + // Bypass offload on request. First packet bypass applies to all + // packets in the GSO pack. + if (control.flags & QUIC_BYPASS_ENCRYPTION) + return ctx->sk_proto->sendmsg(sk, msg, len); + + if (hdr_buf_len < 1 + control.dst_conn_id_length) { + ret = -EINVAL; + goto out; + } + + conn_payload_key_gen = (hdr_buf[0] & 0x04) >> 2; + + ret = quic_extract_dst_address_info(sk, msg, &sa_family, &daddr, + &dport); + if (ret) + goto out; + + // Fetch the flow + connhash = quic_lookup_connection(ctx, &hdr_buf[1], &control, + sa_family, daddr, dport); + if (!connhash) { + ret = -EINVAL; + goto out; + } + + crypto_ctx = &connhash->crypto_ctx; + tag_size = quic_crypto_tag_size(crypto_ctx->conn_info.cipher_type); + + if (crypto_ctx->conn_info.conn_payload_key_gen != + conn_payload_key_gen) { + ret = -EINVAL; + goto out; + } + + // For GSO, use the GSO size minus cipher tag size as the packet size; + // for non-GSO, use the size of the whole plaintext. + // Reduce the packet size by tag size to keep the original packet size + // for the rest of the UDP path in the stack. 
+ if (!gso_pkt_size) { + plain_pkt_size = plain_size; + } else if (gso_pkt_size <= tag_size) { + ret = -EINVAL; + goto out; + } else { + plain_pkt_size = gso_pkt_size - tag_size; + } + + // Build scatterlist from the input data, split by GSO minus the + // crypto tag size. + nr_sg_alloc = quic_sg_capacity_from_msg(plain_pkt_size, + msg->msg_iter.iov_offset, + plain_size); + if ((nr_sg_alloc * 2) >= QUIC_MAX_SG_ALLOC_ELEMENTS) { + ret = -ENOMEM; + goto out; + } + + sg_plain = ctx->sg_alloc; + sg_cipher = sg_plain + nr_sg_alloc; + + ret = quic_sg_plain_from_mapped_msg(msg, plain_pages, + plain_base_ptrs, + plain_data_ptrs, plain_size, + plain_pkt_size, sg_plain, + nr_sg_alloc, sg_plain_pkts, + &nr_plain_pages); + + if (ret < 0) + goto out; + + nr_sg_plain_pkts = ret; + last_plain_pkt_size = plain_size % plain_pkt_size; + if (!last_plain_pkt_size) + last_plain_pkt_size = plain_pkt_size; + + // Build scatterlist for the ciphertext, split by GSO. + cipher_size = plain_size + nr_sg_plain_pkts * tag_size; + + if (DIV_ROUND_UP(cipher_size, PAGE_SIZE) + >= (1 << QUIC_MAX_CIPHER_PAGES_ORDER)) { + ret = -ENOMEM; + goto out_put_pages; + } + + ret = quic_sg_cipher_from_pkts(tag_size, plain_pkt_size, plain_size, + ctx->cipher_page, sg_cipher, nr_sg_alloc, + sg_cipher_pkts); + if (ret < 0) + goto out_put_pages; + + nr_sg_cipher_pkts = ret; + + if (nr_sg_plain_pkts != nr_sg_cipher_pkts) { + ret = -EPERM; + goto out_put_pages; + } + + // Encrypt and protect header for each packet individually. 
+ tfm = crypto_ctx->packet_aead; + crypto_aead_clear_flags(tfm, ~0); + crypto_init_wait(&waiter); + aead_req = aead_request_alloc(tfm, GFP_KERNEL); + if (!aead_req) { + ret = -ENOMEM; + goto out_put_pages; + } + + hdr_mask_req = skcipher_request_alloc(crypto_ctx->header_tfm, + GFP_KERNEL); + if (!hdr_mask_req) { + aead_request_free(aead_req); + ret = -ENOMEM; + goto out_put_pages; + } + + for (pkt_i = 0; pkt_i < nr_sg_plain_pkts; ++pkt_i) { + payload_crypto_offset = + quic_copy_header(sg_plain_pkts[pkt_i], + hdr_buf, + sizeof(hdr_buf), + control.dst_conn_id_length); + + full_pkt_num = quic_unpack_pkt_num(&control, hdr_buf, + payload_crypto_offset); + + pkt_size = (pkt_i + 1 < nr_sg_plain_pkts + ? plain_pkt_size + : last_plain_pkt_size) + - payload_crypto_offset; + if ((ssize_t)pkt_size < 0) { + aead_request_free(aead_req); + skcipher_request_free(hdr_mask_req); + ret = -EINVAL; + goto out_put_pages; + } + + /* Construct nonce and initialize request */ + quic_construct_ietf_nonce(nonce, crypto_ctx, full_pkt_num); + + /* Encrypt the body */ + aead_request_set_callback(aead_req, + CRYPTO_TFM_REQ_MAY_BACKLOG + | CRYPTO_TFM_REQ_MAY_SLEEP, + crypto_req_done, &waiter); + aead_request_set_crypt(aead_req, sg_plain_pkts[pkt_i], + sg_cipher_pkts[pkt_i], + pkt_size, + nonce); + aead_request_set_ad(aead_req, payload_crypto_offset); + err = crypto_wait_req(crypto_aead_encrypt(aead_req), &waiter); + if (unlikely(err)) { + ret = err; + aead_request_free(aead_req); + skcipher_request_free(hdr_mask_req); + goto out_put_pages; + } + + /* Protect the header */ + memcpy(sg_virt(sg_cipher_pkts[pkt_i]), hdr_buf, + payload_crypto_offset); + + err = quic_protect_header(crypto_ctx, &control, + hdr_mask_req, + sg_cipher_pkts[pkt_i], + payload_crypto_offset); + if (unlikely(err)) { + ret = err; + aead_request_free(aead_req); + skcipher_request_free(hdr_mask_req); + goto out_put_pages; + } + } + skcipher_request_free(hdr_mask_req); + aead_request_free(aead_req); + + // Deliver to the next layer. 
+ if (ctx->sk_proto->sendpage) { + msg_cipher.msg_flags |= MSG_MORE; + err = ctx->sk_proto->sendmsg(sk, &msg_cipher, 0); + if (err < 0) { + ret = err; + goto out_put_pages; + } + + err = ctx->sk_proto->sendpage(sk, ctx->cipher_page, 0, + cipher_size, 0); + if (err < 0) { + ret = err; + goto out_put_pages; + } + if (err != cipher_size) { + ret = -EINVAL; + goto out_put_pages; + } + ret = plain_size; + } else { + ret = quic_sendpage(ctx, sk, &msg_cipher, cipher_size, + ctx->cipher_page); + // indicate full plaintext transmission to the caller. + if (ret > 0) + ret = plain_size; + } + +out_put_pages: + quic_put_plain_user_pages(plain_pages, nr_plain_pages); + +out: + return ret; +} + +static int quic_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t len) +{ + struct quic_context *ctx; + int ret; + + rcu_read_lock(); + ctx = quic_get_ctx(sk); + rcu_read_unlock(); + if (!ctx) + return -EINVAL; + + mutex_lock(&ctx->sendmsg_mux); + ret = quic_sendmsg(sk, msg, len); + mutex_unlock(&ctx->sendmsg_mux); + return ret; +} + +static void quic_release_resources(struct sock *sk) +{ + struct quic_internal_crypto_context *crypto_ctx; + struct quic_connection_rhash *connhash; + struct inet_sock *inet = inet_sk(sk); + struct rhashtable_iter hti; + struct quic_context *ctx; + struct proto *sk_proto; + + rcu_read_lock(); + ctx = quic_get_ctx(sk); + if (!ctx) { + rcu_read_unlock(); + return; + } + + sk_proto = ctx->sk_proto; + + rhashtable_walk_enter(&ctx->tx_connections, &hti); + rhashtable_walk_start(&hti); + + while ((connhash = rhashtable_walk_next(&hti))) { + if (IS_ERR(connhash)) { + if (PTR_ERR(connhash) == -EAGAIN) + continue; + break; + } + + crypto_ctx = &connhash->crypto_ctx; + crypto_free_aead(crypto_ctx->packet_aead); + crypto_free_skcipher(crypto_ctx->header_tfm); + memzero_explicit(crypto_ctx, sizeof(*crypto_ctx)); + } + + rhashtable_walk_stop(&hti); + rhashtable_walk_exit(&hti); + rhashtable_destroy(&ctx->tx_connections); + + if (ctx->cipher_page) { + 
quic_free_cipher_page(ctx->cipher_page); + ctx->cipher_page = NULL; + } + + rcu_read_unlock(); + + write_lock_bh(&sk->sk_callback_lock); + rcu_assign_pointer(inet->ulp_data, NULL); + WRITE_ONCE(sk->sk_prot, sk_proto); + write_unlock_bh(&sk->sk_callback_lock); + + kfree_rcu(ctx, rcu); +} + +static void +quic_prep_protos(unsigned int af, struct proto *proto, const struct proto *base) +{ + if (likely(test_bit(af, &af_init_done))) + return; + + spin_lock(&quic_proto_lock); + if (test_bit(af, &af_init_done)) + goto out_unlock; + + *proto = *base; + proto->setsockopt = quic_setsockopt; + proto->getsockopt = quic_getsockopt; + proto->sendmsg = quic_sendmsg_locked; + + smp_mb__before_atomic(); /* proto calls should be visible first */ + set_bit(af, &af_init_done); + +out_unlock: + spin_unlock(&quic_proto_lock); +} + +static void quic_update_proto(struct sock *sk, struct quic_context *ctx) +{ + struct proto *udp_proto, *quic_proto; + struct inet_sock *inet = inet_sk(sk); + + udp_proto = READ_ONCE(sk->sk_prot); + ctx->sk_proto = udp_proto; + quic_proto = sk->sk_family == AF_INET ? 
&quic_v4_proto : &quic_v6_proto; + + quic_prep_protos(sk->sk_family, quic_proto, udp_proto); + + write_lock_bh(&sk->sk_callback_lock); + rcu_assign_pointer(inet->ulp_data, ctx); + WRITE_ONCE(sk->sk_prot, quic_proto); + write_unlock_bh(&sk->sk_callback_lock); +} + +static int quic_init(struct sock *sk) +{ + struct quic_context *ctx; + + ctx = quic_ctx_create(); + if (!ctx) + return -ENOMEM; + + quic_update_proto(sk, ctx); + + return 0; +} + +static void quic_release(struct sock *sk) +{ + lock_sock(sk); + quic_release_resources(sk); + release_sock(sk); +} + +static struct udp_ulp_ops quic_ulp_ops __read_mostly = { + .name = "quic-crypto", + .owner = THIS_MODULE, + .init = quic_init, + .release = quic_release, +}; + +static int __init quic_register(void) +{ + udp_register_ulp(&quic_ulp_ops); + return 0; +} + +static void __exit quic_unregister(void) +{ + udp_unregister_ulp(&quic_ulp_ops); +} + +module_init(quic_register); +module_exit(quic_unregister); + +MODULE_DESCRIPTION("QUIC crypto ULP"); +MODULE_LICENSE("GPL"); +MODULE_ALIAS_UDP_ULP("quic-crypto"); From patchwork Wed Sep 7 00:49:34 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Adel Abouchaev X-Patchwork-Id: 604107 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id ABBC2C6FA83 for ; Wed, 7 Sep 2022 00:50:20 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229634AbiIGAuT (ORCPT ); Tue, 6 Sep 2022 20:50:19 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51886 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229617AbiIGAuG (ORCPT ); Tue, 6 Sep 2022 20:50:06 -0400 Received: from mail-pj1-x1031.google.com (mail-pj1-x1031.google.com [IPv6:2607:f8b0:4864:20::1031]) by 
lindbergh.monkeyblade.net (Postfix) with ESMTPS id 40FC45F130; Tue, 6 Sep 2022 17:49:58 -0700 (PDT) Received: by mail-pj1-x1031.google.com with SMTP id x1-20020a17090ab00100b001fda21bbc90so16635656pjq.3; Tue, 06 Sep 2022 17:49:57 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date; bh=BZHP5NdKF+I7nt1UTZOGxlIGpPl7nGAcbLBGtpDvY20=; b=oi6Is+ANLSCz3uJNtb96yW6evLxQzTDAoi1D4pjQC1UrODZoRQV98gzPM1pR6x0sXn 3stjNQWlJcRox4sFgfRDW7eXS3+W2pHYjR8RVYqxI/xighCo9MxwJjTDvxFs1XZCMfYq SiNbUXa6sYbNpxA8k8lBPRH/OowMK0ZSuSS9i030e3vfeg1e4DfHCpe98tVIEXGwadSb q1Bgc1gN9tH+LCvLFuT7+kJR3LEU1auutnTzdd29R9WkjSKFSsOCaaxLqftG/YOUzE/q JIBhxVw0aLgDlNog4W+toarejVenmwEPA5b1J7X6fzRWnDZ0UmsBlDTSmdFTe2NL514U AeQA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date; bh=BZHP5NdKF+I7nt1UTZOGxlIGpPl7nGAcbLBGtpDvY20=; b=ZI4T5Q4SQRpKeeGlzc6WoI99rSnLwRHGsSiYQvtFD5Mb6SrpJ9ieUnaDWZyR7Z03td fz/niU/WyhkIZr573Ol92DmivkOXqynRNztdDH/AFtn5dnOI5V+RVRU8vAGhxKKdK9uj 4UzkrlBSrOWKRddIyex9zMWgOurB7oCu3PNapm/xBQyty6nv53vxd/XcjKH9MH/DX0/H 5G62UbtWeHGcNXg7siOLIoOVR62hoqOenW0fkg2Heidy6WtZTS/+uNlLznE94TR0ItXC PzFOkBWm9kZqqDJuF/bT+vX+ojzh/Oe8gZG/T5w3WN4b3iAKwRSmPBnPE7NcCchB3TNP e3PQ== X-Gm-Message-State: ACgBeo2sRAsjjoDcANZQxt3wAS/WhkOuVat5MIfohkFY3uyDtoXGxnpJ x+go2Z+bCKOKfiCfInVgyAU= X-Google-Smtp-Source: AA6agR6ZyBxZaEQ8XFqI4wj93utKaSBZt1T8rCadYEp+Om6vRPjj9giD0yFsswlFJtI6MMV/1/SSbQ== X-Received: by 2002:a17:902:f70d:b0:172:d1d1:9b8c with SMTP id h13-20020a170902f70d00b00172d1d19b8cmr971685plo.129.1662511796668; Tue, 06 Sep 2022 17:49:56 -0700 (PDT) Received: from localhost (fwdproxy-prn-008.fbsv.net. 
[2a03:2880:ff:8::face:b00c]) by smtp.gmail.com with ESMTPSA id a9-20020a62d409000000b005387bf85ea0sm10869165pfh.128.2022.09.06.17.49.55 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 06 Sep 2022 17:49:56 -0700 (PDT) From: Adel Abouchaev To: davem@davemloft.net, edumazet@google.com, kuba@kernel.org, pabeni@redhat.com, corbet@lwn.net, dsahern@kernel.org, shuah@kernel.org, netdev@vger.kernel.org, linux-doc@vger.kernel.org, linux-kselftest@vger.kernel.org Subject: [net-next v3 5/6] net: Add flow counters and Tx processing error counter Date: Tue, 6 Sep 2022 17:49:34 -0700 Message-Id: <20220907004935.3971173-6-adel.abushaev@gmail.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220907004935.3971173-1-adel.abushaev@gmail.com> References: <20220907004935.3971173-1-adel.abushaev@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-kselftest@vger.kernel.org Add flow counters and a Tx processing error counter. The total flow counter is cumulative, the current flow counter shows the number of flows currently in flight, and the error counter accumulates the number of errors during Tx processing. Signed-off-by: Adel Abouchaev --- Updated enum bracket to follow enum keyword. Removed extra blank lines. 
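For reference, a minimal user-space sketch of reading these counters. It assumes the proc file keeps the "name, padding, tab, value" layout printed by quic_statistics_seq_show() in this patch, one counter per line. The helper name quic_stat_lookup() is hypothetical and not part of this series; in practice the text would be read from /proc/net/quic_stat.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not part of the patch: find one counter by name in
 * text laid out like the quic_stat proc file ("name<padding>\tvalue\n").
 * Returns 0 and stores the value on success, -1 if the name is not found
 * or a line cannot be parsed.
 */
static int quic_stat_lookup(const char *text, const char *name,
			    unsigned long *value)
{
	char key[64];
	unsigned long val;
	int consumed;

	while (*text) {
		/* %s skips leading whitespace, including the previous newline. */
		if (sscanf(text, "%63s %lu%n", key, &val, &consumed) != 2)
			return -1;
		if (!strcmp(key, name)) {
			*value = val;
			return 0;
		}
		text += consumed;
	}
	return -1;
}
```

With the increments added in quic_main.c, QuicTxSw grows on every successful flow add, QuicCurrTxSw grows on add and shrinks on delete, and QuicTxSwError grows on a failed setsockopt or a failed quic_sendmsg, so QuicCurrTxSw should not exceed QuicTxSw.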
--- include/net/netns/mib.h | 3 +++ include/net/quic.h | 10 +++++++++ include/net/snmp.h | 6 +++++ include/uapi/linux/snmp.h | 9 ++++++++ net/quic/Makefile | 2 +- net/quic/quic_main.c | 46 +++++++++++++++++++++++++++++++++++++++ net/quic/quic_proc.c | 45 ++++++++++++++++++++++++++++++++++++++ 7 files changed, 120 insertions(+), 1 deletion(-) create mode 100644 net/quic/quic_proc.c diff --git a/include/net/netns/mib.h b/include/net/netns/mib.h index 7e373664b1e7..dcbba3d1ceec 100644 --- a/include/net/netns/mib.h +++ b/include/net/netns/mib.h @@ -24,6 +24,9 @@ struct netns_mib { #if IS_ENABLED(CONFIG_TLS) DEFINE_SNMP_STAT(struct linux_tls_mib, tls_statistics); #endif +#if IS_ENABLED(CONFIG_QUIC) + DEFINE_SNMP_STAT(struct linux_quic_mib, quic_statistics); +#endif #ifdef CONFIG_MPTCP DEFINE_SNMP_STAT(struct mptcp_mib, mptcp_statistics); #endif diff --git a/include/net/quic.h b/include/net/quic.h index cafe01174e60..6362d827d266 100644 --- a/include/net/quic.h +++ b/include/net/quic.h @@ -25,6 +25,16 @@ #define QUIC_MAX_PLAIN_PAGES 16 #define QUIC_MAX_CIPHER_PAGES_ORDER 4 +#define __QUIC_INC_STATS(net, field) \ + __SNMP_INC_STATS((net)->mib.quic_statistics, field) +#define QUIC_INC_STATS(net, field) \ + SNMP_INC_STATS((net)->mib.quic_statistics, field) +#define QUIC_DEC_STATS(net, field) \ + SNMP_DEC_STATS((net)->mib.quic_statistics, field) + +int __net_init quic_proc_init(struct net *net); +void __net_exit quic_proc_fini(struct net *net); + struct quic_internal_crypto_context { struct quic_connection_info conn_info; struct crypto_skcipher *header_tfm; diff --git a/include/net/snmp.h b/include/net/snmp.h index 468a67836e2f..f94680a3e9e8 100644 --- a/include/net/snmp.h +++ b/include/net/snmp.h @@ -117,6 +117,12 @@ struct linux_tls_mib { unsigned long mibs[LINUX_MIB_TLSMAX]; }; +/* Linux QUIC */ +#define LINUX_MIB_QUICMAX __LINUX_MIB_QUICMAX +struct linux_quic_mib { + unsigned long mibs[LINUX_MIB_QUICMAX]; +}; + #define DEFINE_SNMP_STAT(type, name) \ __typeof__(type) 
__percpu *name #define DEFINE_SNMP_STAT_ATOMIC(type, name) \ diff --git a/include/uapi/linux/snmp.h b/include/uapi/linux/snmp.h index 4d7470036a8b..ca1e626dbdb4 100644 --- a/include/uapi/linux/snmp.h +++ b/include/uapi/linux/snmp.h @@ -349,4 +349,13 @@ enum __LINUX_MIB_TLSMAX }; +/* linux QUIC mib definitions */ +enum { + LINUX_MIB_QUICNUM = 0, + LINUX_MIB_QUICCURRTXSW, /* QuicCurrTxSw */ + LINUX_MIB_QUICTXSW, /* QuicTxSw */ + LINUX_MIB_QUICTXSWERROR, /* QuicTxSwError */ + __LINUX_MIB_QUICMAX +}; + #endif /* _LINUX_SNMP_H */ diff --git a/net/quic/Makefile b/net/quic/Makefile index 928239c4d08c..a885cd8bc4e0 100644 --- a/net/quic/Makefile +++ b/net/quic/Makefile @@ -5,4 +5,4 @@ obj-$(CONFIG_QUIC) += quic.o -quic-y := quic_main.o +quic-y := quic_main.o quic_proc.o diff --git a/net/quic/quic_main.c b/net/quic/quic_main.c index a43d989a1c8e..1fda1083ee25 100644 --- a/net/quic/quic_main.c +++ b/net/quic/quic_main.c @@ -359,6 +359,8 @@ static int do_quic_conn_add_tx(struct sock *sk, sockptr_t optval, if (rc < 0) goto err_free_ciphers; + QUIC_INC_STATS(sock_net(sk), LINUX_MIB_QUICCURRTXSW); + QUIC_INC_STATS(sock_net(sk), LINUX_MIB_QUICTXSW); return 0; err_free_ciphers: @@ -416,6 +418,7 @@ static int do_quic_conn_del_tx(struct sock *sk, sockptr_t optval, crypto_free_aead(crypto_ctx->packet_aead); memzero_explicit(crypto_ctx, sizeof(*crypto_ctx)); kfree(connhash); + QUIC_DEC_STATS(sock_net(sk), LINUX_MIB_QUICCURRTXSW); return 0; } @@ -441,6 +444,9 @@ static int do_quic_setsockopt(struct sock *sk, int optname, sockptr_t optval, break; } + if (rc) + QUIC_INC_STATS(sock_net(sk), LINUX_MIB_QUICTXSWERROR); + return rc; } @@ -1329,6 +1335,9 @@ static int quic_sendmsg(struct sock *sk, struct msghdr *msg, size_t len) quic_put_plain_user_pages(plain_pages, nr_plain_pages); out: + if (unlikely(ret < 0)) + QUIC_INC_STATS(sock_net(sk), LINUX_MIB_QUICTXSWERROR); + return ret; } @@ -1461,6 +1470,36 @@ static void quic_release(struct sock *sk) release_sock(sk); } +static int __net_init 
quic_init_net(struct net *net) +{ + int err; + + net->mib.quic_statistics = alloc_percpu(struct linux_quic_mib); + if (!net->mib.quic_statistics) + return -ENOMEM; + + err = quic_proc_init(net); + if (err) + goto err_free_stats; + + return 0; + +err_free_stats: + free_percpu(net->mib.quic_statistics); + return err; +} + +static void __net_exit quic_exit_net(struct net *net) +{ + quic_proc_fini(net); + free_percpu(net->mib.quic_statistics); +} + +static struct pernet_operations quic_proc_ops = { + .init = quic_init_net, + .exit = quic_exit_net, +}; + static struct udp_ulp_ops quic_ulp_ops __read_mostly = { .name = "quic-crypto", .owner = THIS_MODULE, @@ -1470,6 +1509,12 @@ static struct udp_ulp_ops quic_ulp_ops __read_mostly = { static int __init quic_register(void) { + int err; + + err = register_pernet_subsys(&quic_proc_ops); + if (err) + return err; + udp_register_ulp(&quic_ulp_ops); return 0; } @@ -1477,6 +1522,7 @@ static int __init quic_register(void) static void __exit quic_unregister(void) { udp_unregister_ulp(&quic_ulp_ops); + unregister_pernet_subsys(&quic_proc_ops); } module_init(quic_register); diff --git a/net/quic/quic_proc.c b/net/quic/quic_proc.c new file mode 100644 index 000000000000..cb4fe7a589b5 --- /dev/null +++ b/net/quic/quic_proc.c @@ -0,0 +1,45 @@ +// SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) +/* Copyright (C) 2019 Meta Platforms, Inc. 
*/ + +#include +#include +#include +#include + +#ifdef CONFIG_PROC_FS +static const struct snmp_mib quic_mib_list[] = { + SNMP_MIB_ITEM("QuicCurrTxSw", LINUX_MIB_QUICCURRTXSW), + SNMP_MIB_ITEM("QuicTxSw", LINUX_MIB_QUICTXSW), + SNMP_MIB_ITEM("QuicTxSwError", LINUX_MIB_QUICTXSWERROR), + SNMP_MIB_SENTINEL +}; + +static int quic_statistics_seq_show(struct seq_file *seq, void *v) +{ + unsigned long buf[LINUX_MIB_QUICMAX] = {}; + struct net *net = seq->private; + int i; + + snmp_get_cpu_field_batch(buf, quic_mib_list, net->mib.quic_statistics); + for (i = 0; quic_mib_list[i].name; i++) + seq_printf(seq, "%-32s\t%lu\n", quic_mib_list[i].name, buf[i]); + + return 0; +} +#endif + +int __net_init quic_proc_init(struct net *net) +{ +#ifdef CONFIG_PROC_FS + if (!proc_create_net_single("quic_stat", 0444, net->proc_net, + quic_statistics_seq_show, NULL)) + return -ENOMEM; +#endif /* CONFIG_PROC_FS */ + + return 0; +} + +void __net_exit quic_proc_fini(struct net *net) +{ + remove_proc_entry("quic_stat", net->proc_net); +} From patchwork Wed Sep 7 00:49:35 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Adel Abouchaev X-Patchwork-Id: 603633 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B0BA7C6FA86 for ; Wed, 7 Sep 2022 00:50:24 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229631AbiIGAuX (ORCPT ); Tue, 6 Sep 2022 20:50:23 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51888 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229626AbiIGAuI (ORCPT ); Tue, 6 Sep 2022 20:50:08 -0400 Received: from mail-pl1-x632.google.com (mail-pl1-x632.google.com [IPv6:2607:f8b0:4864:20::632]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 
D90AE642EA; Tue, 6 Sep 2022 17:49:59 -0700 (PDT) Received: by mail-pl1-x632.google.com with SMTP id b21so24450plz.7; Tue, 06 Sep 2022 17:49:59 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date; bh=sl8Uzgrm3DCid/LWuaibxm9+1N/HnaWqjI+4AHmRzpc=; b=GN2KE2NyjgSBxFGGWtdZvL6sPWI2veZK70adsFs8NdmLtRMWaYXiY2ZbRJ2/GvdXAN E320VhRQo05loZ6hxfRn39bwXmaR45ocGuLkbtU5gzJbIUrxFM3SJATSWRtBsXr79MRa wsJ76Mr/DyAFg0Mjudwj8RNJ6BvpCPULaq/ROrvK2FfiyIfhjFC+2VlP7KqhJvZKu6aB 3gV7tfdsDoJ7PP0QkYzsIS2Ywc+puVoVWt5OA3NsTyFWNGBrc/Xt+PotRqoJ6/PICCId LJQhRNSfUnGqxsAwl/gf+MIQ7iJmZqFs+SRLiRPZ377PRB5uCLplnReruGQ2gBxYkwrf dQfQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date; bh=sl8Uzgrm3DCid/LWuaibxm9+1N/HnaWqjI+4AHmRzpc=; b=QAi7DDI7WDHn/teeO4A6SFJN9vnjDdL5HwDujhgllYPuanMZH5D00S8K57peOEDtI1 osfSeWL6HbNluXgCc6wvAXuExCccDlb2qv23zbJUUflurXQidTKEtuBLUDfokSiwFF+s eCeJTlQgz7Hnj6hwZnKaotOFmp2k+jlXv+kpVwIQssUrbhWwm9CnGnLJwc/70z9AmwTC /7XQtVxEarlKmH2rD50k1A/nwumUhs6pSEp0f2Ig6jPX7qXZDONVt2QnhtUCbLXa+/hV 6MECGYseISRcv9c6k8nRNqiIFmp6n8N5Mh7u61CrBb2qr5nEdUIJjrsYNrNMxSMK7Ccb RCZA== X-Gm-Message-State: ACgBeo0fWkwTQ5uJMhwAvG3VJQ33E4TzyHIHx8ibGhsKWFnmL9kqJTCX kKl5OGTpf/IAX1KIVEl03ec= X-Google-Smtp-Source: AA6agR7rERxEc943fu0Tlrp4mxNkSQjZ8Q5qsiZ5SK1DfEB4ohPhtxcTGSf+a78J/K3xoJxcv0IaWA== X-Received: by 2002:a17:90b:1241:b0:200:2f9e:35ac with SMTP id gx1-20020a17090b124100b002002f9e35acmr1111588pjb.182.1662511798788; Tue, 06 Sep 2022 17:49:58 -0700 (PDT) Received: from localhost (fwdproxy-prn-016.fbsv.net. 
[2a03:2880:ff:10::face:b00c]) by smtp.gmail.com with ESMTPSA id k128-20020a632486000000b00434dccacd72sm551550pgk.34.2022.09.06.17.49.58 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 06 Sep 2022 17:49:58 -0700 (PDT) From: Adel Abouchaev To: davem@davemloft.net, edumazet@google.com, kuba@kernel.org, pabeni@redhat.com, corbet@lwn.net, dsahern@kernel.org, shuah@kernel.org, netdev@vger.kernel.org, linux-doc@vger.kernel.org, linux-kselftest@vger.kernel.org Subject: [net-next v3 6/6] net: Add self tests for ULP operations, flow setup and crypto tests Date: Tue, 6 Sep 2022 17:49:35 -0700 Message-Id: <20220907004935.3971173-7-adel.abushaev@gmail.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220907004935.3971173-1-adel.abushaev@gmail.com> References: <20220907004935.3971173-1-adel.abushaev@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-kselftest@vger.kernel.org Add self tests for ULP operations, flow setup and crypto tests. Signed-off-by: Adel Abouchaev --- Restored the test build. Changed the QUIC context reference variable names for the keys and iv to match the uAPI. Updated alignment, added SPDX license line. v3: Added Chacha20-Poly1305 test. v3: Added test to fail sending with wrong key generation bit. 
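For readers of the key generation test, a small sketch of the first-byte checks that quic_sendmsg() applies before encrypting. The masks are taken from the kernel code in this series; the helper names are illustrative only and do not appear in the patches.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative helpers; the names are assumptions, not from the series. */

/* Only packets whose two top bits are 01 (short header) are offloaded. */
static bool quic_is_short_header(uint8_t first_byte)
{
	return (first_byte & 0xc0) == 0x40;
}

/* Key generation (key phase) bit, as extracted by quic_sendmsg(). */
static uint8_t quic_key_gen(uint8_t first_byte)
{
	return (first_byte & 0x04) >> 2;
}

/* A send is rejected with -EINVAL when the bit differs from the flow's. */
static bool quic_key_gen_matches(uint8_t first_byte, uint8_t flow_key_gen)
{
	return quic_key_gen(first_byte) == flow_key_gen;
}
```

A packet starting with 0x44 on a flow registered with key generation 0 would therefore be expected to fail, which is the case the new test exercises.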
--- tools/testing/selftests/net/.gitignore | 1 + tools/testing/selftests/net/Makefile | 3 +- tools/testing/selftests/net/quic.c | 1369 ++++++++++++++++++++++++ tools/testing/selftests/net/quic.sh | 46 + 4 files changed, 1418 insertions(+), 1 deletion(-) create mode 100644 tools/testing/selftests/net/quic.c create mode 100755 tools/testing/selftests/net/quic.sh diff --git a/tools/testing/selftests/net/.gitignore b/tools/testing/selftests/net/.gitignore index 3d7adee7a3e6..78970a09d73c 100644 --- a/tools/testing/selftests/net/.gitignore +++ b/tools/testing/selftests/net/.gitignore @@ -14,6 +14,7 @@ nettest psock_fanout psock_snd psock_tpacket +quic reuseaddr_conflict reuseaddr_ports_exhausted reuseport_addr_any diff --git a/tools/testing/selftests/net/Makefile b/tools/testing/selftests/net/Makefile index f5ac1433c301..b4e9586a2d03 100644 --- a/tools/testing/selftests/net/Makefile +++ b/tools/testing/selftests/net/Makefile @@ -44,6 +44,7 @@ TEST_PROGS += arp_ndisc_untracked_subnets.sh TEST_PROGS += stress_reuseport_listen.sh TEST_PROGS += l2_tos_ttl_inherit.sh TEST_PROGS += bind_bhash.sh +TEST_PROGS += quic.sh TEST_PROGS_EXTENDED := in_netns.sh setup_loopback.sh setup_veth.sh TEST_PROGS_EXTENDED += toeplitz_client.sh toeplitz.sh TEST_GEN_FILES = socket nettest @@ -59,7 +60,7 @@ TEST_GEN_FILES += ipsec TEST_GEN_FILES += ioam6_parser TEST_GEN_FILES += gro TEST_GEN_PROGS = reuseport_bpf reuseport_bpf_cpu reuseport_bpf_numa -TEST_GEN_PROGS += reuseport_dualstack reuseaddr_conflict tls tun tap +TEST_GEN_PROGS += reuseport_dualstack reuseaddr_conflict tls tun tap quic TEST_GEN_FILES += toeplitz TEST_GEN_FILES += cmsg_sender TEST_GEN_FILES += stress_reuseport_listen diff --git a/tools/testing/selftests/net/quic.c b/tools/testing/selftests/net/quic.c new file mode 100644 index 000000000000..81285a6d9601 --- /dev/null +++ b/tools/testing/selftests/net/quic.c @@ -0,0 +1,1369 @@ +// SPDX-License-Identifier: GPL-2.0 + +#define _GNU_SOURCE + +#include +#include +#include +#include 
+#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include + +#include "../kselftest_harness.h" + +#define UDP_ULP 105 + +#ifndef SOL_UDP +#define SOL_UDP 17 +#endif + +// 1. QUIC ULP Registration Test + +FIXTURE(quic_ulp) +{ + int sfd; + socklen_t len_s; + union { + struct sockaddr_in addr; + struct sockaddr_in6 addr6; + } server; + int default_net_ns_fd; + int server_net_ns_fd; +}; + +FIXTURE_VARIANT(quic_ulp) +{ + unsigned int af_server; + char *server_address; + unsigned short server_port; +}; + +FIXTURE_VARIANT_ADD(quic_ulp, ipv4) +{ + .af_server = AF_INET, + .server_address = "10.0.0.2", + .server_port = 7101, +}; + +FIXTURE_VARIANT_ADD(quic_ulp, ipv6) +{ + .af_server = AF_INET6, + .server_address = "2001::2", + .server_port = 7102, +}; + +FIXTURE_SETUP(quic_ulp) +{ + char path[PATH_MAX]; + int optval = 1; + + snprintf(path, sizeof(path), "/proc/%d/ns/net", getpid()); + self->default_net_ns_fd = open(path, O_RDONLY); + ASSERT_GE(self->default_net_ns_fd, 0); + strcpy(path, "/var/run/netns/ns2"); + self->server_net_ns_fd = open(path, O_RDONLY); + ASSERT_GE(self->server_net_ns_fd, 0); + + ASSERT_NE(setns(self->server_net_ns_fd, 0), -1); + self->sfd = socket(variant->af_server, SOCK_DGRAM, 0); + ASSERT_NE(setsockopt(self->sfd, SOL_SOCKET, SO_REUSEPORT, &optval, + sizeof(optval)), -1); + if (variant->af_server == AF_INET) { + self->len_s = sizeof(self->server.addr); + self->server.addr.sin_family = variant->af_server; + inet_pton(variant->af_server, variant->server_address, + &self->server.addr.sin_addr); + self->server.addr.sin_port = htons(variant->server_port); + ASSERT_EQ(bind(self->sfd, &self->server.addr, self->len_s), 0); + ASSERT_EQ(getsockname(self->sfd, &self->server.addr, + &self->len_s), 0); + } else { + self->len_s = sizeof(self->server.addr6); + self->server.addr6.sin6_family = variant->af_server; + inet_pton(variant->af_server, 
variant->server_address, + &self->server.addr6.sin6_addr); + self->server.addr6.sin6_port = htons(variant->server_port); + ASSERT_EQ(bind(self->sfd, &self->server.addr6, self->len_s), 0); + ASSERT_EQ(getsockname(self->sfd, &self->server.addr6, + &self->len_s), 0); + } + ASSERT_NE(setns(self->default_net_ns_fd, 0), -1); +}; + +FIXTURE_TEARDOWN(quic_ulp) +{ + ASSERT_NE(setns(self->server_net_ns_fd, 0), -1); + close(self->sfd); + ASSERT_NE(setns(self->default_net_ns_fd, 0), -1); +}; + +TEST_F(quic_ulp, request_nonexistent_udp_ulp) +{ + ASSERT_NE(setns(self->server_net_ns_fd, 0), -1); + ASSERT_EQ(setsockopt(self->sfd, SOL_UDP, UDP_ULP, + "nonexistent", sizeof("nonexistent")), -1); + // If UDP_ULP option is not present, the error would be ENOPROTOOPT. + ASSERT_EQ(errno, ENOENT); + ASSERT_NE(setns(self->default_net_ns_fd, 0), -1); +}; + +TEST_F(quic_ulp, request_quic_crypto_udp_ulp) +{ + ASSERT_NE(setns(self->server_net_ns_fd, 0), -1); + ASSERT_EQ(setsockopt(self->sfd, SOL_UDP, UDP_ULP, + "quic-crypto", sizeof("quic-crypto")), 0); + ASSERT_NE(setns(self->default_net_ns_fd, 0), -1); +}; + +// 2. 
QUIC Data Path Operation Tests + +#define DO_NOT_SETUP_FLOW 0 +#define SETUP_FLOW 1 + +#define DO_NOT_USE_CLIENT 0 +#define USE_CLIENT 1 + +FIXTURE(quic_data) +{ + int sfd, c1fd, c2fd; + socklen_t len_c1; + socklen_t len_c2; + socklen_t len_s; + + union { + struct sockaddr_in addr; + struct sockaddr_in6 addr6; + } client_1; + union { + struct sockaddr_in addr; + struct sockaddr_in6 addr6; + } client_2; + union { + struct sockaddr_in addr; + struct sockaddr_in6 addr6; + } server; + int default_net_ns_fd; + int client_1_net_ns_fd; + int client_2_net_ns_fd; + int server_net_ns_fd; +}; + +FIXTURE_VARIANT(quic_data) +{ + unsigned int af_client_1; + char *client_1_address; + unsigned short client_1_port; + uint8_t conn_id_1[8]; + uint8_t conn_1_key[16]; + uint8_t conn_1_iv[12]; + uint8_t conn_1_hdr_key[16]; + size_t conn_id_1_len; + bool setup_flow_1; + bool use_client_1; + unsigned int af_client_2; + char *client_2_address; + unsigned short client_2_port; + uint8_t conn_id_2[8]; + uint8_t conn_2_key[16]; + uint8_t conn_2_iv[12]; + uint8_t conn_2_hdr_key[16]; + size_t conn_id_2_len; + bool setup_flow_2; + bool use_client_2; + unsigned int af_server; + char *server_address; + unsigned short server_port; +}; + +FIXTURE_VARIANT_ADD(quic_data, ipv4) +{ + .af_client_1 = AF_INET, + .client_1_address = "10.0.0.1", + .client_1_port = 6667, + .conn_id_1 = {0x11, 0x12, 0x13, 0x14}, + .conn_id_1_len = 4, + .setup_flow_1 = SETUP_FLOW, + .use_client_1 = USE_CLIENT, + .af_client_2 = AF_INET, + .client_2_address = "10.0.0.3", + .client_2_port = 6668, + .conn_id_2 = {0x21, 0x22, 0x23, 0x24}, + .conn_id_2_len = 4, + .setup_flow_2 = SETUP_FLOW, + //.use_client_2 = USE_CLIENT, + .af_server = AF_INET, + .server_address = "10.0.0.2", + .server_port = 6669, +}; + +FIXTURE_VARIANT_ADD(quic_data, ipv6_mapped_ipv4_two_conns) +{ + .af_client_1 = AF_INET6, + .client_1_address = "::ffff:10.0.0.1", + .client_1_port = 6670, + .conn_id_1 = {0x11, 0x12, 0x13, 0x14}, + .conn_id_1_len = 4, + 
.setup_flow_1 = SETUP_FLOW, + .use_client_1 = USE_CLIENT, + .af_client_2 = AF_INET6, + .client_2_address = "::ffff:10.0.0.3", + .client_2_port = 6671, + .conn_id_2 = {0x21, 0x22, 0x23, 0x24}, + .conn_id_2_len = 4, + .setup_flow_2 = SETUP_FLOW, + .use_client_2 = USE_CLIENT, + .af_server = AF_INET6, + .server_address = "::ffff:10.0.0.2", + .server_port = 6672, +}; + +FIXTURE_VARIANT_ADD(quic_data, ipv6_mapped_ipv4_setup_ipv4_one_conn) +{ + .af_client_1 = AF_INET, + .client_1_address = "10.0.0.3", + .client_1_port = 6676, + .conn_id_1 = {0x11, 0x12, 0x13, 0x14}, + .conn_id_1_len = 4, + .setup_flow_1 = SETUP_FLOW, + .use_client_1 = DO_NOT_USE_CLIENT, + .af_client_2 = AF_INET6, + .client_2_address = "::ffff:10.0.0.3", + .client_2_port = 6676, + .conn_id_2 = {0x11, 0x12, 0x13, 0x14}, + .conn_id_2_len = 4, + .setup_flow_2 = DO_NOT_SETUP_FLOW, + .use_client_2 = USE_CLIENT, + .af_server = AF_INET6, + .server_address = "::ffff:10.0.0.2", + .server_port = 6677, +}; + +FIXTURE_VARIANT_ADD(quic_data, ipv6_mapped_ipv4_setup_ipv6_one_conn) +{ + .af_client_1 = AF_INET6, + .client_1_address = "::ffff:10.0.0.3", + .client_1_port = 6678, + .conn_id_1 = {0x11, 0x12, 0x13, 0x14}, + .setup_flow_1 = SETUP_FLOW, + .use_client_1 = DO_NOT_USE_CLIENT, + .af_client_2 = AF_INET, + .client_2_address = "10.0.0.3", + .client_2_port = 6678, + .conn_id_2 = {0x11, 0x12, 0x13, 0x14}, + .setup_flow_2 = DO_NOT_SETUP_FLOW, + .use_client_2 = USE_CLIENT, + .af_server = AF_INET6, + .server_address = "::ffff:10.0.0.2", + .server_port = 6679, +}; + +FIXTURE_SETUP(quic_data) +{ + char path[PATH_MAX]; + int optval = 1; + + if (variant->af_client_1 == AF_INET) { + self->len_c1 = sizeof(self->client_1.addr); + self->client_1.addr.sin_family = variant->af_client_1; + inet_pton(variant->af_client_1, variant->client_1_address, + &self->client_1.addr.sin_addr); + self->client_1.addr.sin_port = htons(variant->client_1_port); + } else { + self->len_c1 = sizeof(self->client_1.addr6); + self->client_1.addr6.sin6_family 
= variant->af_client_1; + inet_pton(variant->af_client_1, variant->client_1_address, + &self->client_1.addr6.sin6_addr); + self->client_1.addr6.sin6_port = htons(variant->client_1_port); + } + + if (variant->af_client_2 == AF_INET) { + self->len_c2 = sizeof(self->client_2.addr); + self->client_2.addr.sin_family = variant->af_client_2; + inet_pton(variant->af_client_2, variant->client_2_address, + &self->client_2.addr.sin_addr); + self->client_2.addr.sin_port = htons(variant->client_2_port); + } else { + self->len_c2 = sizeof(self->client_2.addr6); + self->client_2.addr6.sin6_family = variant->af_client_2; + inet_pton(variant->af_client_2, variant->client_2_address, + &self->client_2.addr6.sin6_addr); + self->client_2.addr6.sin6_port = htons(variant->client_2_port); + } + + if (variant->af_server == AF_INET) { + self->len_s = sizeof(self->server.addr); + self->server.addr.sin_family = variant->af_server; + inet_pton(variant->af_server, variant->server_address, + &self->server.addr.sin_addr); + self->server.addr.sin_port = htons(variant->server_port); + } else { + self->len_s = sizeof(self->server.addr6); + self->server.addr6.sin6_family = variant->af_server; + inet_pton(variant->af_server, variant->server_address, + &self->server.addr6.sin6_addr); + self->server.addr6.sin6_port = htons(variant->server_port); + } + + snprintf(path, sizeof(path), "/proc/%d/ns/net", getpid()); + self->default_net_ns_fd = open(path, O_RDONLY); + ASSERT_GE(self->default_net_ns_fd, 0); + strcpy(path, "/var/run/netns/ns11"); + self->client_1_net_ns_fd = open(path, O_RDONLY); + ASSERT_GE(self->client_1_net_ns_fd, 0); + strcpy(path, "/var/run/netns/ns12"); + self->client_2_net_ns_fd = open(path, O_RDONLY); + ASSERT_GE(self->client_2_net_ns_fd, 0); + strcpy(path, "/var/run/netns/ns2"); + self->server_net_ns_fd = open(path, O_RDONLY); + ASSERT_GE(self->server_net_ns_fd, 0); + + if (variant->use_client_1) { + ASSERT_NE(setns(self->client_1_net_ns_fd, 0), -1); + self->c1fd = 
socket(variant->af_client_1, SOCK_DGRAM, 0); + ASSERT_NE(setsockopt(self->c1fd, SOL_SOCKET, SO_REUSEPORT, + &optval, sizeof(optval)), -1); + if (variant->af_client_1 == AF_INET) { + ASSERT_EQ(bind(self->c1fd, &self->client_1.addr, + self->len_c1), 0); + ASSERT_EQ(getsockname(self->c1fd, &self->client_1.addr, + &self->len_c1), 0); + } else { + ASSERT_EQ(bind(self->c1fd, &self->client_1.addr6, + self->len_c1), 0); + ASSERT_EQ(getsockname(self->c1fd, &self->client_1.addr6, + &self->len_c1), 0); + } + } + + if (variant->use_client_2) { + ASSERT_NE(setns(self->client_2_net_ns_fd, 0), -1); + self->c2fd = socket(variant->af_client_2, SOCK_DGRAM, 0); + ASSERT_NE(setsockopt(self->c2fd, SOL_SOCKET, SO_REUSEPORT, + &optval, sizeof(optval)), -1); + if (variant->af_client_2 == AF_INET) { + ASSERT_EQ(bind(self->c2fd, &self->client_2.addr, + self->len_c2), 0); + ASSERT_EQ(getsockname(self->c2fd, &self->client_2.addr, + &self->len_c2), 0); + } else { + ASSERT_EQ(bind(self->c2fd, &self->client_2.addr6, + self->len_c2), 0); + ASSERT_EQ(getsockname(self->c2fd, &self->client_2.addr6, + &self->len_c2), 0); + } + } + + ASSERT_NE(setns(self->server_net_ns_fd, 0), -1); + self->sfd = socket(variant->af_server, SOCK_DGRAM, 0); + ASSERT_NE(setsockopt(self->sfd, SOL_SOCKET, SO_REUSEPORT, &optval, + sizeof(optval)), -1); + if (variant->af_server == AF_INET) { + ASSERT_EQ(bind(self->sfd, &self->server.addr, self->len_s), 0); + ASSERT_EQ(getsockname(self->sfd, &self->server.addr, + &self->len_s), 0); + } else { + ASSERT_EQ(bind(self->sfd, &self->server.addr6, self->len_s), 0); + ASSERT_EQ(getsockname(self->sfd, &self->server.addr6, + &self->len_s), 0); + } + + ASSERT_EQ(setsockopt(self->sfd, IPPROTO_UDP, UDP_ULP, + "quic-crypto", sizeof("quic-crypto")), 0); + + ASSERT_NE(setns(self->default_net_ns_fd, 0), -1); +} + +FIXTURE_TEARDOWN(quic_data) +{ + ASSERT_NE(setns(self->server_net_ns_fd, 0), -1); + close(self->sfd); + ASSERT_NE(setns(self->client_1_net_ns_fd, 0), -1); + close(self->c1fd); + 
ASSERT_NE(setns(self->client_2_net_ns_fd, 0), -1); + close(self->c2fd); + ASSERT_NE(setns(self->default_net_ns_fd, 0), -1); +} + +TEST_F(quic_data, send_fail_no_flow) +{ + char const *test_str = "test_read"; + int send_len = 10; + + ASSERT_EQ(strlen(test_str) + 1, send_len); + EXPECT_EQ(sendto(self->sfd, test_str, send_len, 0, + &self->client_1.addr, self->len_c1), -1); +}; + +TEST_F(quic_data, fail_wrong_key_generation_bit) +{ + size_t cmsg_tx_len = sizeof(struct quic_tx_ancillary_data); + uint8_t cmsg_buf[CMSG_SPACE(cmsg_tx_len)]; + struct quic_connection_info conn_1_info; + struct quic_connection_info conn_2_info; + struct quic_tx_ancillary_data *anc_data; + struct cmsghdr *cmsg_hdr; + int frag_size = 1200; + struct iovec iov[2]; + int msg_len = 4500; + struct msghdr msg; + char *test_str_1; + char *test_str_2; + char *buf_1; + char *buf_2; + int i; + + test_str_1 = (char *)malloc(9000); + test_str_2 = (char *)malloc(9000); + memset(test_str_1, 0, 9000); + memset(test_str_2, 0, 9000); + + buf_1 = (char *)malloc(10000); + buf_2 = (char *)malloc(10000); + for (i = 0; i < 9000; i += (1200 - 16)) { + test_str_1[i] = 0x44; + memcpy(&test_str_1[i + 1], &variant->conn_id_1, + variant->conn_id_1_len); + test_str_1[i + 1 + variant->conn_id_1_len] = 0xca; + + test_str_2[i] = 0x44; + memcpy(&test_str_2[i + 1], &variant->conn_id_2, + variant->conn_id_2_len); + test_str_2[i + 1 + variant->conn_id_2_len] = 0xca; + } + + // program the connection into the offload + conn_1_info.cipher_type = TLS_CIPHER_AES_GCM_128; + memset(&conn_1_info.key, 0, sizeof(struct quic_connection_info_key)); + conn_1_info.key.dst_conn_id_length = variant->conn_id_1_len; + memcpy(conn_1_info.key.dst_conn_id, + &variant->conn_id_1, + variant->conn_id_1_len); + conn_1_info.conn_payload_key_gen = 0; + + if (self->client_1.addr.sin_family == AF_INET) { + memcpy(&conn_1_info.key.addr.ipv4_addr, + &self->client_1.addr.sin_addr, sizeof(struct in_addr)); + conn_1_info.key.udp_port = 
self->client_1.addr.sin_port; + } else { + memcpy(&conn_1_info.key.addr.ipv6_addr, + &self->client_1.addr6.sin6_addr, + sizeof(struct in6_addr)); + conn_1_info.key.udp_port = self->client_1.addr6.sin6_port; + } + + conn_2_info.cipher_type = TLS_CIPHER_AES_GCM_128; + memset(&conn_2_info.key, 0, sizeof(struct quic_connection_info_key)); + conn_2_info.key.dst_conn_id_length = variant->conn_id_2_len; + memcpy(conn_2_info.key.dst_conn_id, + &variant->conn_id_2, + variant->conn_id_2_len); + conn_2_info.conn_payload_key_gen = 0; + + if (self->client_2.addr.sin_family == AF_INET) { + memcpy(&conn_2_info.key.addr.ipv4_addr, + &self->client_2.addr.sin_addr, sizeof(struct in_addr)); + conn_2_info.key.udp_port = self->client_2.addr.sin_port; + } else { + memcpy(&conn_2_info.key.addr.ipv6_addr, + &self->client_2.addr6.sin6_addr, + sizeof(struct in6_addr)); + conn_2_info.key.udp_port = self->client_2.addr6.sin6_port; + } + + memcpy(&conn_1_info.aes_gcm_128.payload_key, + &variant->conn_1_key, 16); + memcpy(&conn_1_info.aes_gcm_128.payload_iv, + &variant->conn_1_iv, 12); + memcpy(&conn_1_info.aes_gcm_128.header_key, + &variant->conn_1_hdr_key, 16); + memcpy(&conn_2_info.aes_gcm_128.payload_key, + &variant->conn_2_key, 16); + memcpy(&conn_2_info.aes_gcm_128.payload_iv, + &variant->conn_2_iv, 12); + memcpy(&conn_2_info.aes_gcm_128.header_key, + &variant->conn_2_hdr_key, + 16); + + ASSERT_NE(setns(self->server_net_ns_fd, 0), -1); + + ASSERT_EQ(setsockopt(self->sfd, SOL_UDP, UDP_SEGMENT, &frag_size, + sizeof(frag_size)), 0); + + if (variant->setup_flow_1) + ASSERT_EQ(setsockopt(self->sfd, SOL_UDP, + UDP_QUIC_ADD_TX_CONNECTION, + &conn_1_info, sizeof(conn_1_info)), 0); + + if (variant->setup_flow_2) + ASSERT_EQ(setsockopt(self->sfd, SOL_UDP, + UDP_QUIC_ADD_TX_CONNECTION, + &conn_2_info, sizeof(conn_2_info)), 0); + + iov[0].iov_base = test_str_1; + iov[0].iov_len = msg_len; + iov[1].iov_base = (void *)test_str_1 + 4500; + iov[1].iov_len = msg_len; + + msg.msg_name = 
(self->client_1.addr.sin_family == AF_INET) + ? (void *)&self->client_1.addr + : (void *)&self->client_1.addr6; + msg.msg_namelen = self->len_c1; + msg.msg_iov = iov; + msg.msg_iovlen = 2; + msg.msg_control = cmsg_buf; + msg.msg_controllen = sizeof(cmsg_buf); + cmsg_hdr = CMSG_FIRSTHDR(&msg); + cmsg_hdr->cmsg_level = IPPROTO_UDP; + cmsg_hdr->cmsg_type = UDP_QUIC_ENCRYPT; + cmsg_hdr->cmsg_len = CMSG_LEN(cmsg_tx_len); + anc_data = (struct quic_tx_ancillary_data *)CMSG_DATA(cmsg_hdr); + anc_data->next_pkt_num = 0x0d65c9; + anc_data->flags = 0; + anc_data->dst_conn_id_length = variant->conn_id_1_len; + + if (variant->use_client_1) + EXPECT_EQ(sendmsg(self->sfd, &msg, 0), -1); + + iov[0].iov_base = test_str_2; + iov[0].iov_len = msg_len; + iov[1].iov_base = (void *)test_str_2 + 4500; + iov[1].iov_len = msg_len; + msg.msg_name = (self->client_2.addr.sin_family == AF_INET) + ? (void *)&self->client_2.addr + : (void *)&self->client_2.addr6; + msg.msg_namelen = self->len_c2; + cmsg_hdr = CMSG_FIRSTHDR(&msg); + anc_data = (struct quic_tx_ancillary_data *)CMSG_DATA(cmsg_hdr); + anc_data->next_pkt_num = 0x0d65c9; + anc_data->dst_conn_id_length = variant->conn_id_2_len; + anc_data->flags = 0; + + if (variant->use_client_2) + EXPECT_EQ(sendmsg(self->sfd, &msg, 0), -1); + + ASSERT_NE(setns(self->server_net_ns_fd, 0), -1); + if (variant->setup_flow_1) { + ASSERT_EQ(setsockopt(self->sfd, SOL_UDP, + UDP_QUIC_DEL_TX_CONNECTION, + &conn_1_info, sizeof(conn_1_info)), + 0); + } + if (variant->setup_flow_2) { + ASSERT_EQ(setsockopt(self->sfd, SOL_UDP, + UDP_QUIC_DEL_TX_CONNECTION, + &conn_2_info, sizeof(conn_2_info)), + 0); + } + free(test_str_1); + free(test_str_2); + free(buf_1); + free(buf_2); + ASSERT_NE(setns(self->default_net_ns_fd, 0), -1); +} + +TEST_F(quic_data, encrypt_two_conn_gso_1200_iov_2_size_9000_aesgcm128) +{ + size_t cmsg_tx_len = sizeof(struct quic_tx_ancillary_data); + uint8_t cmsg_buf[CMSG_SPACE(cmsg_tx_len)]; + struct quic_connection_info conn_1_info; + struct 
quic_connection_info conn_2_info; + struct quic_tx_ancillary_data *anc_data; + socklen_t recv_addr_len_1; + socklen_t recv_addr_len_2; + struct cmsghdr *cmsg_hdr; + int frag_size = 1200; + int send_len = 9000; + struct iovec iov[2]; + int msg_len = 4500; + struct msghdr msg; + char *test_str_1; + char *test_str_2; + char *buf_1; + char *buf_2; + int i; + + test_str_1 = (char *)malloc(9000); + test_str_2 = (char *)malloc(9000); + memset(test_str_1, 0, 9000); + memset(test_str_2, 0, 9000); + + buf_1 = (char *)malloc(10000); + buf_2 = (char *)malloc(10000); + for (i = 0; i < 9000; i += (1200 - 16)) { + test_str_1[i] = 0x40; + memcpy(&test_str_1[i + 1], &variant->conn_id_1, + variant->conn_id_1_len); + test_str_1[i + 1 + variant->conn_id_1_len] = 0xca; + + test_str_2[i] = 0x40; + memcpy(&test_str_2[i + 1], &variant->conn_id_2, + variant->conn_id_2_len); + test_str_2[i + 1 + variant->conn_id_2_len] = 0xca; + } + + // program the connection into the offload + conn_1_info.cipher_type = TLS_CIPHER_AES_GCM_128; + memset(&conn_1_info.key, 0, sizeof(struct quic_connection_info_key)); + conn_1_info.key.dst_conn_id_length = variant->conn_id_1_len; + memcpy(conn_1_info.key.dst_conn_id, + &variant->conn_id_1, + variant->conn_id_1_len); + conn_1_info.conn_payload_key_gen = 0; + + if (self->client_1.addr.sin_family == AF_INET) { + memcpy(&conn_1_info.key.addr.ipv4_addr, + &self->client_1.addr.sin_addr, sizeof(struct in_addr)); + conn_1_info.key.udp_port = self->client_1.addr.sin_port; + } else { + memcpy(&conn_1_info.key.addr.ipv6_addr, + &self->client_1.addr6.sin6_addr, + sizeof(struct in6_addr)); + conn_1_info.key.udp_port = self->client_1.addr6.sin6_port; + } + + conn_2_info.cipher_type = TLS_CIPHER_AES_GCM_128; + memset(&conn_2_info.key, 0, sizeof(struct quic_connection_info_key)); + conn_2_info.key.dst_conn_id_length = variant->conn_id_2_len; + memcpy(conn_2_info.key.dst_conn_id, + &variant->conn_id_2, + variant->conn_id_2_len); + conn_2_info.conn_payload_key_gen = 0; + + if 
(self->client_2.addr.sin_family == AF_INET) { + memcpy(&conn_2_info.key.addr.ipv4_addr, + &self->client_2.addr.sin_addr, sizeof(struct in_addr)); + conn_2_info.key.udp_port = self->client_2.addr.sin_port; + } else { + memcpy(&conn_2_info.key.addr.ipv6_addr, + &self->client_2.addr6.sin6_addr, + sizeof(struct in6_addr)); + conn_2_info.key.udp_port = self->client_2.addr6.sin6_port; + } + + memcpy(&conn_1_info.aes_gcm_128.payload_key, + &variant->conn_1_key, 16); + memcpy(&conn_1_info.aes_gcm_128.payload_iv, + &variant->conn_1_iv, 12); + memcpy(&conn_1_info.aes_gcm_128.header_key, + &variant->conn_1_hdr_key, 16); + memcpy(&conn_2_info.aes_gcm_128.payload_key, + &variant->conn_2_key, 16); + memcpy(&conn_2_info.aes_gcm_128.payload_iv, + &variant->conn_2_iv, 12); + memcpy(&conn_2_info.aes_gcm_128.header_key, + &variant->conn_2_hdr_key, + 16); + + ASSERT_NE(setns(self->server_net_ns_fd, 0), -1); + + ASSERT_EQ(setsockopt(self->sfd, SOL_UDP, UDP_SEGMENT, &frag_size, + sizeof(frag_size)), 0); + + if (variant->setup_flow_1) + ASSERT_EQ(setsockopt(self->sfd, SOL_UDP, + UDP_QUIC_ADD_TX_CONNECTION, + &conn_1_info, sizeof(conn_1_info)), 0); + + if (variant->setup_flow_2) + ASSERT_EQ(setsockopt(self->sfd, SOL_UDP, + UDP_QUIC_ADD_TX_CONNECTION, + &conn_2_info, sizeof(conn_2_info)), 0); + + recv_addr_len_1 = self->len_c1; + recv_addr_len_2 = self->len_c2; + + iov[0].iov_base = test_str_1; + iov[0].iov_len = msg_len; + iov[1].iov_base = (void *)test_str_1 + 4500; + iov[1].iov_len = msg_len; + + msg.msg_name = (self->client_1.addr.sin_family == AF_INET) + ? 
(void *)&self->client_1.addr + : (void *)&self->client_1.addr6; + msg.msg_namelen = self->len_c1; + msg.msg_iov = iov; + msg.msg_iovlen = 2; + msg.msg_control = cmsg_buf; + msg.msg_controllen = sizeof(cmsg_buf); + cmsg_hdr = CMSG_FIRSTHDR(&msg); + cmsg_hdr->cmsg_level = IPPROTO_UDP; + cmsg_hdr->cmsg_type = UDP_QUIC_ENCRYPT; + cmsg_hdr->cmsg_len = CMSG_LEN(cmsg_tx_len); + anc_data = (struct quic_tx_ancillary_data *)CMSG_DATA(cmsg_hdr); + anc_data->next_pkt_num = 0x0d65c9; + anc_data->flags = 0; + anc_data->dst_conn_id_length = variant->conn_id_1_len; + + if (variant->use_client_1) + EXPECT_EQ(sendmsg(self->sfd, &msg, 0), send_len); + + iov[0].iov_base = test_str_2; + iov[0].iov_len = msg_len; + iov[1].iov_base = (void *)test_str_2 + 4500; + iov[1].iov_len = msg_len; + msg.msg_name = (self->client_2.addr.sin_family == AF_INET) + ? (void *)&self->client_2.addr + : (void *)&self->client_2.addr6; + msg.msg_namelen = self->len_c2; + cmsg_hdr = CMSG_FIRSTHDR(&msg); + anc_data = (struct quic_tx_ancillary_data *)CMSG_DATA(cmsg_hdr); + anc_data->next_pkt_num = 0x0d65c9; + anc_data->dst_conn_id_length = variant->conn_id_2_len; + anc_data->flags = 0; + + if (variant->use_client_2) + EXPECT_EQ(sendmsg(self->sfd, &msg, 0), send_len); + + if (variant->use_client_1) { + ASSERT_NE(setns(self->client_1_net_ns_fd, 0), -1); + if (variant->af_client_1 == AF_INET) { + for (i = 0; i < 7; ++i) { + EXPECT_EQ(recvfrom(self->c1fd, buf_1, 9000, 0, + &self->client_1.addr, + &recv_addr_len_1), + 1200); + // Validate framing is intact. 
+ EXPECT_EQ(memcmp((void *)buf_1 + 1, + &variant->conn_id_1, + variant->conn_id_1_len), 0); + } + EXPECT_EQ(recvfrom(self->c1fd, buf_1, 9000, 0, + &self->client_1.addr, + &recv_addr_len_1), + 728); + EXPECT_EQ(memcmp((void *)buf_1 + 1, + &variant->conn_id_1, + variant->conn_id_1_len), 0); + } else { + for (i = 0; i < 7; ++i) { + EXPECT_EQ(recvfrom(self->c1fd, buf_1, 9000, 0, + &self->client_1.addr6, + &recv_addr_len_1), + 1200); + } + EXPECT_EQ(recvfrom(self->c1fd, buf_1, 9000, 0, + &self->client_1.addr6, + &recv_addr_len_1), + 728); + EXPECT_EQ(memcmp((void *)buf_1 + 1, + &variant->conn_id_1, + variant->conn_id_1_len), 0); + } + EXPECT_NE(memcmp(buf_1, test_str_1, send_len), 0); + } + + if (variant->use_client_2) { + ASSERT_NE(setns(self->client_2_net_ns_fd, 0), -1); + if (variant->af_client_2 == AF_INET) { + for (i = 0; i < 7; ++i) { + EXPECT_EQ(recvfrom(self->c2fd, buf_2, 9000, 0, + &self->client_2.addr, + &recv_addr_len_2), + 1200); + EXPECT_EQ(memcmp((void *)buf_2 + 1, + &variant->conn_id_2, + variant->conn_id_2_len), 0); + } + EXPECT_EQ(recvfrom(self->c2fd, buf_2, 9000, 0, + &self->client_2.addr, + &recv_addr_len_2), + 728); + EXPECT_EQ(memcmp((void *)buf_2 + 1, + &variant->conn_id_2, + variant->conn_id_2_len), 0); + } else { + for (i = 0; i < 7; ++i) { + EXPECT_EQ(recvfrom(self->c2fd, buf_2, 9000, 0, + &self->client_2.addr6, + &recv_addr_len_2), + 1200); + EXPECT_EQ(memcmp((void *)buf_2 + 1, + &variant->conn_id_2, + variant->conn_id_2_len), 0); + } + EXPECT_EQ(recvfrom(self->c2fd, buf_2, 9000, 0, + &self->client_2.addr6, + &recv_addr_len_2), + 728); + EXPECT_EQ(memcmp((void *)buf_2 + 1, + &variant->conn_id_2, + variant->conn_id_2_len), 0); + } + EXPECT_NE(memcmp(buf_2, test_str_2, send_len), 0); + } + + if (variant->use_client_1 && variant->use_client_2) + EXPECT_NE(memcmp(buf_1, buf_2, send_len), 0); + + ASSERT_NE(setns(self->server_net_ns_fd, 0), -1); + if (variant->setup_flow_1) { + ASSERT_EQ(setsockopt(self->sfd, SOL_UDP, + UDP_QUIC_DEL_TX_CONNECTION, + 
&conn_1_info, sizeof(conn_1_info)), + 0); + } + if (variant->setup_flow_2) { + ASSERT_EQ(setsockopt(self->sfd, SOL_UDP, + UDP_QUIC_DEL_TX_CONNECTION, + &conn_2_info, sizeof(conn_2_info)), + 0); + } + free(test_str_1); + free(test_str_2); + free(buf_1); + free(buf_2); + ASSERT_NE(setns(self->default_net_ns_fd, 0), -1); +} + +// 3. QUIC Encryption Tests + +FIXTURE(quic_crypto) +{ + int sfd, cfd; + socklen_t len_c; + socklen_t len_s; + union { + struct sockaddr_in addr; + struct sockaddr_in6 addr6; + } client; + union { + struct sockaddr_in addr; + struct sockaddr_in6 addr6; + } server; + int default_net_ns_fd; + int client_net_ns_fd; + int server_net_ns_fd; +}; + +FIXTURE_VARIANT(quic_crypto) +{ + unsigned int af_client; + char *client_address; + unsigned short client_port; + uint32_t algo; + size_t conn_key_len; + uint8_t conn_id[8]; + union { + uint8_t conn_key_16[16]; + uint8_t conn_key_32[32]; + } conn_key; + uint8_t conn_iv[12]; + union { + uint8_t conn_hdr_key_16[16]; + uint8_t conn_hdr_key_32[32]; + } conn_hdr_key; + size_t conn_id_len; + bool setup_flow; + bool use_client; + unsigned int af_server; + char *server_address; + unsigned short server_port; + char plain[128]; + size_t plain_len; + char match[128]; + size_t match_len; + uint32_t next_pkt_num; +}; + +FIXTURE_SETUP(quic_crypto) +{ + char path[PATH_MAX]; + int optval = 1; + + if (variant->af_client == AF_INET) { + self->len_c = sizeof(self->client.addr); + self->client.addr.sin_family = variant->af_client; + inet_pton(variant->af_client, variant->client_address, + &self->client.addr.sin_addr); + self->client.addr.sin_port = htons(variant->client_port); + } else { + self->len_c = sizeof(self->client.addr6); + self->client.addr6.sin6_family = variant->af_client; + inet_pton(variant->af_client, variant->client_address, + &self->client.addr6.sin6_addr); + self->client.addr6.sin6_port = htons(variant->client_port); + } + + if (variant->af_server == AF_INET) { + self->len_s = sizeof(self->server.addr); + 
self->server.addr.sin_family = variant->af_server; + inet_pton(variant->af_server, variant->server_address, + &self->server.addr.sin_addr); + self->server.addr.sin_port = htons(variant->server_port); + } else { + self->len_s = sizeof(self->server.addr6); + self->server.addr6.sin6_family = variant->af_server; + inet_pton(variant->af_server, variant->server_address, + &self->server.addr6.sin6_addr); + self->server.addr6.sin6_port = htons(variant->server_port); + } + + snprintf(path, sizeof(path), "/proc/%d/ns/net", getpid()); + self->default_net_ns_fd = open(path, O_RDONLY); + ASSERT_GE(self->default_net_ns_fd, 0); + strcpy(path, "/var/run/netns/ns11"); + self->client_net_ns_fd = open(path, O_RDONLY); + ASSERT_GE(self->client_net_ns_fd, 0); + strcpy(path, "/var/run/netns/ns2"); + self->server_net_ns_fd = open(path, O_RDONLY); + ASSERT_GE(self->server_net_ns_fd, 0); + + if (variant->use_client) { + ASSERT_NE(setns(self->client_net_ns_fd, 0), -1); + self->cfd = socket(variant->af_client, SOCK_DGRAM, 0); + ASSERT_NE(setsockopt(self->cfd, SOL_SOCKET, SO_REUSEPORT, + &optval, sizeof(optval)), -1); + if (variant->af_client == AF_INET) { + ASSERT_EQ(bind(self->cfd, &self->client.addr, + self->len_c), 0); + ASSERT_EQ(getsockname(self->cfd, &self->client.addr, + &self->len_c), 0); + } else { + ASSERT_EQ(bind(self->cfd, &self->client.addr6, + self->len_c), 0); + ASSERT_EQ(getsockname(self->cfd, &self->client.addr6, + &self->len_c), 0); + } + } + + ASSERT_NE(setns(self->server_net_ns_fd, 0), -1); + self->sfd = socket(variant->af_server, SOCK_DGRAM, 0); + ASSERT_NE(setsockopt(self->sfd, SOL_SOCKET, SO_REUSEPORT, &optval, + sizeof(optval)), -1); + if (variant->af_server == AF_INET) { + ASSERT_EQ(bind(self->sfd, &self->server.addr, self->len_s), 0); + ASSERT_EQ(getsockname(self->sfd, &self->server.addr, + &self->len_s), + 0); + } else { + ASSERT_EQ(bind(self->sfd, &self->server.addr6, self->len_s), 0); + ASSERT_EQ(getsockname(self->sfd, &self->server.addr6, + &self->len_s), + 0); 
+ } + + ASSERT_EQ(setsockopt(self->sfd, IPPROTO_UDP, UDP_ULP, + "quic-crypto", sizeof("quic-crypto")), 0); + + ASSERT_NE(setns(self->default_net_ns_fd, 0), -1); +} + +FIXTURE_TEARDOWN(quic_crypto) +{ + ASSERT_NE(setns(self->server_net_ns_fd, 0), -1); + close(self->sfd); + ASSERT_NE(setns(self->client_net_ns_fd, 0), -1); + close(self->cfd); + ASSERT_NE(setns(self->default_net_ns_fd, 0), -1); +} + +FIXTURE_VARIANT_ADD(quic_crypto, ipv4_aes_gcm_128) +{ + .af_client = AF_INET, + .client_address = "10.0.0.1", + .client_port = 7667, + .algo = TLS_CIPHER_AES_GCM_128, + .conn_key_len = 16, + .conn_id = {0x08, 0x6b, 0xbf, 0x88, 0x82, 0xb9, 0x12, 0x49}, + .conn_key = { + .conn_key_16 = {0x87, 0x71, 0xea, 0x1d, + 0xfb, 0xbe, 0x7a, 0x45, + 0xbb, 0xe2, 0x7e, 0xbc, + 0x0b, 0x53, 0x94, 0x99 + }, + }, + .conn_iv = {0x3A, 0xA7, 0x46, 0x72, 0xE9, 0x83, 0x6B, 0x55, 0xDA, + 0x66, 0x7B, 0xDA}, + .conn_hdr_key = { + .conn_hdr_key_16 = {0xc9, 0x8e, 0xfd, 0xf2, + 0x0b, 0x64, 0x8c, 0x57, + 0xb5, 0x0a, 0xb2, 0xd2, + 0x21, 0xd3, 0x66, 0xa5}, + }, + .conn_id_len = 8, + .setup_flow = SETUP_FLOW, + .use_client = USE_CLIENT, + .af_server = AF_INET, + .server_address = "10.0.0.2", + .server_port = 7669, + .plain = { 0x40, 0x08, 0x6b, 0xbf, 0x88, 0x82, 0xb9, 0x12, + 0x49, 0xca, + // payload + 0x02, 0x80, 0xde, 0x40, 0x39, 0x40, 0xf6, 0x00, + 0x01, 0x0b, 0x00, 0x0f, 0x65, 0x63, 0x68, 0x6f, + 0x20, 0x30, 0x31, 0x32, 0x33, 0x34, 0x35, 0x36, + 0x37, 0x38, 0x39 + }, + .plain_len = 37, + .match = { + 0x46, 0x08, 0x6b, 0xbf, 0x88, 0x82, 0xb9, 0x12, + 0x49, 0x1c, 0x44, 0xb8, 0x41, 0xbb, 0xcf, 0x6e, + 0x0a, 0x2a, 0x24, 0xfb, 0xb4, 0x79, 0x62, 0xea, + 0x59, 0x38, 0x1a, 0x0e, 0x50, 0x1e, 0x59, 0xed, + 0x3f, 0x8e, 0x7e, 0x5a, 0x70, 0xe4, 0x2a, 0xbc, + 0x2a, 0xfa, 0x2b, 0x54, 0xeb, 0x89, 0xc3, 0x2c, + 0xb6, 0x8c, 0x1e, 0xab, 0x2d + }, + .match_len = 53, + .next_pkt_num = 0x0d65c9, +}; + +FIXTURE_VARIANT_ADD(quic_crypto, ipv4_chacha20_poly1305) +{ + .af_client = AF_INET, + .client_address = "10.0.0.1", + 
.client_port = 7801, + .algo = TLS_CIPHER_CHACHA20_POLY1305, + .conn_key_len = 32, + .conn_id = {}, + .conn_id_len = 0, + .conn_key = { + .conn_key_32 = { + 0x3b, 0xfc, 0xdd, 0xd7, 0x2b, 0xcf, 0x02, 0x54, + 0x1d, 0x7f, 0xa0, 0xdd, 0x1f, 0x5f, 0x9e, 0xee, + 0xa8, 0x17, 0xe0, 0x9a, 0x69, 0x63, 0xa0, 0xe6, + 0xc7, 0xdf, 0x0f, 0x9a, 0x1b, 0xab, 0x90, 0xf2, + }, + }, + .conn_iv = { + 0xa6, 0xb5, 0xbc, 0x6a, 0xb7, 0xda, 0xfc, 0xe3, + 0x0f, 0xff, 0xf5, 0xdd, + }, + .conn_hdr_key = { + .conn_hdr_key_32 = { + 0xd6, 0x59, 0x76, 0x0d, 0x2b, 0xa4, 0x34, 0xa2, + 0x26, 0xfd, 0x37, 0xb3, 0x5c, 0x69, 0xe2, 0xda, + 0x82, 0x11, 0xd1, 0x0c, 0x4f, 0x12, 0x53, 0x87, + 0x87, 0xd6, 0x56, 0x45, 0xd5, 0xd1, 0xb8, 0xe2, + }, + }, + .setup_flow = SETUP_FLOW, + .use_client = USE_CLIENT, + .af_server = AF_INET, + .server_address = "10.0.0.2", + .server_port = 7802, + .plain = { 0x42, 0x00, 0xbf, 0xf4, 0x01 }, + .plain_len = 5, + .match = { 0x55, 0x58, 0xb1, 0xc6, 0x0a, 0xe7, 0xb6, 0xb9, + 0x32, 0xbc, 0x27, 0xd7, 0x86, 0xf4, 0xbc, 0x2b, + 0xb2, 0x0f, 0x21, 0x62, 0xba }, + .match_len = 21, + .next_pkt_num = 0x2700bff5, +}; + +FIXTURE_VARIANT_ADD(quic_crypto, ipv6_aes_gcm_128) +{ + .af_client = AF_INET6, + .client_address = "2001::1", + .client_port = 7673, + .algo = TLS_CIPHER_AES_GCM_128, + .conn_key_len = 16, + .conn_id = {0x08, 0x6b, 0xbf, 0x88, 0x82, 0xb9, 0x12, 0x49}, + .conn_key = { + .conn_key_16 = {0x87, 0x71, 0xea, 0x1d, + 0xfb, 0xbe, 0x7a, 0x45, + 0xbb, 0xe2, 0x7e, 0xbc, + 0x0b, 0x53, 0x94, 0x99 + }, + }, + .conn_iv = {0x3a, 0xa7, 0x46, 0x72, 0xe9, 0x83, 0x6b, 0x55, 0xda, + 0x66, 0x7b, 0xda}, + .conn_hdr_key = { + .conn_hdr_key_16 = {0xc9, 0x8e, 0xfd, 0xf2, + 0x0b, 0x64, 0x8c, 0x57, + 0xb5, 0x0a, 0xb2, 0xd2, + 0x21, 0xd3, 0x66, 0xa5}, + }, + .conn_id_len = 8, + .setup_flow = SETUP_FLOW, + .use_client = USE_CLIENT, + .af_server = AF_INET6, + .server_address = "2001::2", + .server_port = 7675, + .plain = { 0x40, 0x08, 0x6b, 0xbf, 0x88, 0x82, 0xb9, 0x12, + 0x49, 0xca, + // Payload + 0x02, 
0x80, 0xde, 0x40, 0x39, 0x40, 0xf6, 0x00, + 0x01, 0x0b, 0x00, 0x0f, 0x65, 0x63, 0x68, 0x6f, + 0x20, 0x30, 0x31, 0x32, 0x33, 0x34, 0x35, 0x36, + 0x37, 0x38, 0x39 + }, + .plain_len = 37, + .match = { + 0x46, 0x08, 0x6b, 0xbf, 0x88, 0x82, 0xb9, 0x12, + 0x49, 0x1c, 0x44, 0xb8, 0x41, 0xbb, 0xcf, 0x6e, + 0x0a, 0x2a, 0x24, 0xfb, 0xb4, 0x79, 0x62, 0xea, + 0x59, 0x38, 0x1a, 0x0e, 0x50, 0x1e, 0x59, 0xed, + 0x3f, 0x8e, 0x7e, 0x5a, 0x70, 0xe4, 0x2a, 0xbc, + 0x2a, 0xfa, 0x2b, 0x54, 0xeb, 0x89, 0xc3, 0x2c, + 0xb6, 0x8c, 0x1e, 0xab, 0x2d + }, + .match_len = 53, + .next_pkt_num = 0x0d65c9, +}; + +FIXTURE_VARIANT_ADD(quic_crypto, ipv6_chacha20_poly1305) +{ + .af_client = AF_INET6, + .client_address = "2001::1", + .client_port = 7803, + .algo = TLS_CIPHER_CHACHA20_POLY1305, + .conn_key_len = 32, + .conn_id = {}, + .conn_id_len = 0, + .conn_key = { + .conn_key_32 = { + 0x3b, 0xfc, 0xdd, 0xd7, 0x2b, 0xcf, 0x02, 0x54, + 0x1d, 0x7f, 0xa0, 0xdd, 0x1f, 0x5f, 0x9e, 0xee, + 0xa8, 0x17, 0xe0, 0x9a, 0x69, 0x63, 0xa0, 0xe6, + 0xc7, 0xdf, 0x0f, 0x9a, 0x1b, 0xab, 0x90, 0xf2, + }, + }, + .conn_iv = { + 0xa6, 0xb5, 0xbc, 0x6a, 0xb7, 0xda, 0xfc, 0xe3, + 0x0f, 0xff, 0xf5, 0xdd, + }, + .conn_hdr_key = { + .conn_hdr_key_32 = { + 0xd6, 0x59, 0x76, 0x0d, 0x2b, 0xa4, 0x34, 0xa2, + 0x26, 0xfd, 0x37, 0xb3, 0x5c, 0x69, 0xe2, 0xda, + 0x82, 0x11, 0xd1, 0x0c, 0x4f, 0x12, 0x53, 0x87, + 0x87, 0xd6, 0x56, 0x45, 0xd5, 0xd1, 0xb8, 0xe2, + }, + }, + .setup_flow = SETUP_FLOW, + .use_client = USE_CLIENT, + .af_server = AF_INET6, + .server_address = "2001::2", + .server_port = 7804, + .plain = { 0x42, 0x00, 0xbf, 0xf4, 0x01 }, + .plain_len = 5, + .match = { 0x55, 0x58, 0xb1, 0xc6, 0x0a, 0xe7, 0xb6, 0xb9, + 0x32, 0xbc, 0x27, 0xd7, 0x86, 0xf4, 0xbc, 0x2b, + 0xb2, 0x0f, 0x21, 0x62, 0xba }, + .match_len = 21, + .next_pkt_num = 0x2700bff5, +}; + +TEST_F(quic_crypto, encrypt_test_vector_single_flow_gso_in_control) +{ + uint8_t cmsg_buf[CMSG_SPACE(sizeof(struct quic_tx_ancillary_data)) + + CMSG_SPACE(sizeof(uint16_t))]; + 
struct quic_tx_ancillary_data *anc_data; + struct quic_connection_info conn_info; + uint16_t frag_size = 1200; + struct cmsghdr *cmsg_hdr; + int wrong_frag_size = 26; + socklen_t recv_addr_len; + struct iovec iov; + struct msghdr msg; + char *buf; + + buf = (char *)malloc(9000); + conn_info.cipher_type = variant->algo; + memset(&conn_info.key, 0, sizeof(struct quic_connection_info_key)); + conn_info.key.dst_conn_id_length = variant->conn_id_len; + memcpy(conn_info.key.dst_conn_id, + &variant->conn_id, + variant->conn_id_len); + conn_info.conn_payload_key_gen = 0; + + if (self->client.addr.sin_family == AF_INET) { + memcpy(&conn_info.key.addr.ipv4_addr, + &self->client.addr.sin_addr, sizeof(struct in_addr)); + conn_info.key.udp_port = self->client.addr.sin_port; + } else { + memcpy(&conn_info.key.addr.ipv6_addr, + &self->client.addr6.sin6_addr, + sizeof(struct in6_addr)); + conn_info.key.udp_port = self->client.addr6.sin6_port; + } + + ASSERT_TRUE(variant->algo == TLS_CIPHER_AES_GCM_128 || + variant->algo == TLS_CIPHER_CHACHA20_POLY1305); + switch (variant->algo) { + case TLS_CIPHER_AES_GCM_128: + memcpy(&conn_info.aes_gcm_128.payload_key, + &variant->conn_key, 16); + memcpy(&conn_info.aes_gcm_128.payload_iv, + &variant->conn_iv, 12); + memcpy(&conn_info.aes_gcm_128.header_key, + &variant->conn_hdr_key, 16); + break; + case TLS_CIPHER_CHACHA20_POLY1305: + memcpy(&conn_info.chacha20_poly1305.payload_key, + &variant->conn_key, 32); + memcpy(&conn_info.chacha20_poly1305.payload_iv, + &variant->conn_iv, 12); + memcpy(&conn_info.chacha20_poly1305.header_key, + &variant->conn_hdr_key, 32); + break; + } + + ASSERT_NE(setns(self->server_net_ns_fd, 0), -1); + ASSERT_EQ(setsockopt(self->sfd, SOL_UDP, UDP_SEGMENT, &wrong_frag_size, + sizeof(wrong_frag_size)), 0); + ASSERT_EQ(setsockopt(self->sfd, SOL_UDP, UDP_QUIC_ADD_TX_CONNECTION, + &conn_info, sizeof(conn_info)), 0); + + recv_addr_len = self->len_c; + iov.iov_base = (void *)variant->plain; + iov.iov_len = 
variant->plain_len; + memset(cmsg_buf, 0, sizeof(cmsg_buf)); + msg.msg_name = (self->client.addr.sin_family == AF_INET) + ? (void *)&self->client.addr + : (void *)&self->client.addr6; + msg.msg_namelen = self->len_c; + msg.msg_iov = &iov; + msg.msg_iovlen = 1; + msg.msg_control = cmsg_buf; + msg.msg_controllen = sizeof(cmsg_buf); + cmsg_hdr = CMSG_FIRSTHDR(&msg); + cmsg_hdr->cmsg_level = IPPROTO_UDP; + cmsg_hdr->cmsg_type = UDP_QUIC_ENCRYPT; + cmsg_hdr->cmsg_len = CMSG_LEN(sizeof(struct quic_tx_ancillary_data)); + anc_data = (struct quic_tx_ancillary_data *)CMSG_DATA(cmsg_hdr); + anc_data->flags = 0; + anc_data->next_pkt_num = variant->next_pkt_num; + anc_data->dst_conn_id_length = variant->conn_id_len; + cmsg_hdr = CMSG_NXTHDR(&msg, cmsg_hdr); + cmsg_hdr->cmsg_level = IPPROTO_UDP; + cmsg_hdr->cmsg_type = UDP_SEGMENT; + cmsg_hdr->cmsg_len = CMSG_LEN(sizeof(uint16_t)); + memcpy(CMSG_DATA(cmsg_hdr), (void *)&frag_size, sizeof(frag_size)); + + EXPECT_EQ(sendmsg(self->sfd, &msg, 0), variant->plain_len); + ASSERT_NE(setns(self->client_net_ns_fd, 0), -1); + if (variant->af_client == AF_INET) { + EXPECT_EQ(recvfrom(self->cfd, buf, 9000, 0, + &self->client.addr, &recv_addr_len), + variant->match_len); + } else { + EXPECT_EQ(recvfrom(self->cfd, buf, 9000, 0, + &self->client.addr6, &recv_addr_len), + variant->match_len); + } + EXPECT_EQ(memcmp(buf, variant->match, variant->match_len), 0); + ASSERT_NE(setns(self->server_net_ns_fd, 0), -1); + ASSERT_EQ(setsockopt(self->sfd, SOL_UDP, UDP_QUIC_DEL_TX_CONNECTION, + &conn_info, sizeof(conn_info)), 0); + free(buf); + ASSERT_NE(setns(self->default_net_ns_fd, 0), -1); +} + +TEST_F(quic_crypto, encrypt_test_vector_single_flow_gso_in_setsockopt) +{ + uint8_t cmsg_buf[CMSG_SPACE(sizeof(struct quic_tx_ancillary_data))]; + struct quic_tx_ancillary_data *anc_data; + struct quic_connection_info conn_info; + int frag_size = 1200; + struct cmsghdr *cmsg_hdr; + socklen_t recv_addr_len; + struct iovec iov; + struct msghdr msg; + char *buf; + + buf = (char *)malloc(9000); 
+	conn_info.cipher_type = variant->algo;
+	memset(&conn_info.key, 0, sizeof(struct quic_connection_info_key));
+	conn_info.key.dst_conn_id_length = variant->conn_id_len;
+	memcpy(conn_info.key.dst_conn_id,
+	       &variant->conn_id,
+	       variant->conn_id_len);
+	conn_info.conn_payload_key_gen = 0;
+
+	if (self->client.addr.sin_family == AF_INET) {
+		memcpy(&conn_info.key.addr.ipv4_addr,
+		       &self->client.addr.sin_addr, sizeof(struct in_addr));
+		conn_info.key.udp_port = self->client.addr.sin_port;
+	} else {
+		memcpy(&conn_info.key.addr.ipv6_addr,
+		       &self->client.addr6.sin6_addr,
+		       sizeof(struct in6_addr));
+		conn_info.key.udp_port = self->client.addr6.sin6_port;
+	}
+	ASSERT_TRUE(variant->algo == TLS_CIPHER_AES_GCM_128 ||
+		    variant->algo == TLS_CIPHER_CHACHA20_POLY1305);
+	switch (variant->algo) {
+	case TLS_CIPHER_AES_GCM_128:
+		memcpy(&conn_info.aes_gcm_128.payload_key,
+		       &variant->conn_key, 16);
+		memcpy(&conn_info.aes_gcm_128.payload_iv,
+		       &variant->conn_iv, 12);
+		memcpy(&conn_info.aes_gcm_128.header_key,
+		       &variant->conn_hdr_key, 16);
+		break;
+	case TLS_CIPHER_CHACHA20_POLY1305:
+		memcpy(&conn_info.chacha20_poly1305.payload_key,
+		       &variant->conn_key, 32);
+		memcpy(&conn_info.chacha20_poly1305.payload_iv,
+		       &variant->conn_iv, 12);
+		memcpy(&conn_info.chacha20_poly1305.header_key,
+		       &variant->conn_hdr_key, 32);
+		break;
+	}
+
+	ASSERT_NE(setns(self->server_net_ns_fd, 0), -1);
+	ASSERT_EQ(setsockopt(self->sfd, SOL_UDP, UDP_SEGMENT, &frag_size,
+			     sizeof(frag_size)), 0);
+	ASSERT_EQ(setsockopt(self->sfd, SOL_UDP, UDP_QUIC_ADD_TX_CONNECTION,
+			     &conn_info, sizeof(conn_info)), 0);
+
+	recv_addr_len = self->len_c;
+	iov.iov_base = (void *)variant->plain;
+	iov.iov_len = variant->plain_len;
+	memset(cmsg_buf, 0, sizeof(cmsg_buf));
+	msg.msg_name = (self->client.addr.sin_family == AF_INET)
+		       ? (void *)&self->client.addr
+		       : (void *)&self->client.addr6;
+	msg.msg_namelen = self->len_c;
+	msg.msg_iov = &iov;
+	msg.msg_iovlen = 1;
+	msg.msg_control = cmsg_buf;
+	msg.msg_controllen = sizeof(cmsg_buf);
+	cmsg_hdr = CMSG_FIRSTHDR(&msg);
+	cmsg_hdr->cmsg_level = IPPROTO_UDP;
+	cmsg_hdr->cmsg_type = UDP_QUIC_ENCRYPT;
+	cmsg_hdr->cmsg_len = CMSG_LEN(sizeof(struct quic_tx_ancillary_data));
+	anc_data = (struct quic_tx_ancillary_data *)CMSG_DATA(cmsg_hdr);
+	anc_data->flags = 0;
+	anc_data->next_pkt_num = variant->next_pkt_num;
+	anc_data->dst_conn_id_length = variant->conn_id_len;
+
+	EXPECT_EQ(sendmsg(self->sfd, &msg, 0), variant->plain_len);
+	ASSERT_NE(setns(self->client_net_ns_fd, 0), -1);
+	if (variant->af_client == AF_INET) {
+		EXPECT_EQ(recvfrom(self->cfd, buf, 9000, 0,
+				   &self->client.addr, &recv_addr_len),
+			  variant->match_len);
+	} else {
+		EXPECT_EQ(recvfrom(self->cfd, buf, 9000, 0,
+				   &self->client.addr6, &recv_addr_len),
+			  variant->match_len);
+	}
+	EXPECT_STREQ(buf, variant->match);
+	ASSERT_NE(setns(self->server_net_ns_fd, 0), -1);
+	ASSERT_EQ(setsockopt(self->sfd, SOL_UDP, UDP_QUIC_DEL_TX_CONNECTION,
+			     &conn_info, sizeof(conn_info)), 0);
+	free(buf);
+	ASSERT_NE(setns(self->default_net_ns_fd, 0), -1);
+}
+
+TEST_HARNESS_MAIN
diff --git a/tools/testing/selftests/net/quic.sh b/tools/testing/selftests/net/quic.sh
new file mode 100755
index 000000000000..8ff8bc494671
--- /dev/null
+++ b/tools/testing/selftests/net/quic.sh
@@ -0,0 +1,46 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+
+sudo ip netns add ns11
+sudo ip netns add ns12
+sudo ip netns add ns2
+sudo ip link add veth11 type veth peer name br-veth11
+sudo ip link add veth12 type veth peer name br-veth12
+sudo ip link add veth2 type veth peer name br-veth2
+sudo ip link set veth11 netns ns11
+sudo ip link set veth12 netns ns12
+sudo ip link set veth2 netns ns2
+sudo ip netns exec ns11 ip addr add 10.0.0.1/24 dev veth11
+sudo ip netns exec ns11 ip addr add ::ffff:10.0.0.1/96 dev veth11
+sudo ip netns exec ns11 ip addr add 2001::1/64 dev veth11
+sudo ip netns exec ns12 ip addr add 10.0.0.3/24 dev veth12
+sudo ip netns exec ns12 ip addr add ::ffff:10.0.0.3/96 dev veth12
+sudo ip netns exec ns12 ip addr add 2001::3/64 dev veth12
+sudo ip netns exec ns2 ip addr add 10.0.0.2/24 dev veth2
+sudo ip netns exec ns2 ip addr add ::ffff:10.0.0.2/96 dev veth2
+sudo ip netns exec ns2 ip addr add 2001::2/64 dev veth2
+sudo ip link add name br1 type bridge forward_delay 0
+sudo ip link set br1 up
+sudo ip link set br-veth11 up
+sudo ip link set br-veth12 up
+sudo ip link set br-veth2 up
+sudo ip netns exec ns11 ip link set veth11 up
+sudo ip netns exec ns12 ip link set veth12 up
+sudo ip netns exec ns2 ip link set veth2 up
+sudo ip link set br-veth11 master br1
+sudo ip link set br-veth12 master br1
+sudo ip link set br-veth2 master br1
+sudo ip netns exec ns2 cat /proc/net/quic_stat
+
+printf "%s" "Waiting for bridge to start forwarding ..."
+while ! timeout 0.5 sudo ip netns exec ns2 ping -c 1 -n 2001::1 &> /dev/null
+do
+	printf "%c" "."
+done
+printf "\n%s\n" "Bridge is operational"
+
+sudo ./quic
+sudo ip netns exec ns2 cat /proc/net/quic_stat
+sudo ip netns delete ns2
+sudo ip netns delete ns12
+sudo ip netns delete ns11