From patchwork Thu Sep 11 16:42:29 2014 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Bill Fischofer X-Patchwork-Id: 37268 From: Bill Fischofer To: lng-odp@lists.linaro.org Date: Thu, 11 Sep 2014 11:42:29 -0500 Message-Id: <1410453749-13505-1-git-send-email-bill.fischofer@linaro.org> X-Mailer: git-send-email 1.8.3.2 X-Topics: Architecture patch Subject: [lng-odp] [ARCH/PATCH] Fix formatting issues associated with previous committed patch Signed-off-by: Bill Fischofer --- classification_design.dox | 1810 +++++++++++++++++++++++---------------------- 1 file changed, 909 insertions(+), 901 deletions(-) diff --git a/classification_design.dox b/classification_design.dox index 58ffb51..dee12a5 100644 --- a/classification_design.dox +++ b/classification_design.dox @@ -1,901 +1,909 @@ -/* Copyright (c) 2014, Linaro Limited - * All rights reserved - * - * SPDX-License-Identifier: BSD-3-Clause - */ - -/** -@page classification_design ODP Design - Classification API -For the implementation of the ODP classification API please see @ref odp_classify.h - -@tableofcontents - -@section introduction Introduction -This document defines the Classification APIs supported by ODP.
-Classification is logically composed of two stages: Parsing and Rule Matching. -Parsing takes a raw packet, validates its structure, and identifies fields of interest in the various headers that comprise the layers of the packet. -Rule Matching, in turn, takes the result of parsing and sorts packets into Classes of Service (CoS) based on application-defined rule sets. -@subsection use_of_terms Use of Terms -The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC 2119](https://tools.ietf.org/html/rfc2119). -@subsection purpose Purpose -ODP is a framework for software-based packet forwarding/filtering applications, and the purpose of the Packet Classifier API is to enable applications to program the platform hardware or software implementation to assist in prioritization, classification and scheduling of each packet, so that the software application can run faster, scale better and adhere to QoS requirements. -The following API abstractions are not modelled after any existing product implementation, but are instead defined in terms of what a typical data-plane application may require from such a platform, without sacrificing simplicity and while avoiding ambiguity. -Certain terms used in the context of existing products in relation to packet parsing and classification, such as “access lists”, are avoided so as not to suggest any relationship between the abstractions used within this API and any particular manner in which they may be implemented in hardware. -These are the key objects presently defined in ODP that the parser needs to employ: -@subsubsection odp_pktio odp_pktio -odp_pktio specifies an individual packet I/O channel instance.
-In other words, it would translate to a physical interface or a logical port, or in the case of channelized protocols (e.g., [Interlaken](https://www.google.com/url?q=https%3A%2F%2Fwww.cortina-systems.com%2Fimages%2Fdocuments%2F400023_Interlaken_Technology_White_Paper.pdf&sa=D&sntz=1&usg=AFQjCNEBdJTBmA1XaNGY3pmumQTfgSi1oA)) it would map to a logical channel on that interface. -Since the classifier API deals exclusively with ingress, this object represents the source of packets into the classifier. -In order to support any non-trivial use case, the classifier API needs to be able to assign multiple odp_queue instances to any single odp_pktio object, and may also assign any odp_queue instance to more than one odp_pktio object. -@subsubsection odp_queue odp_queue -odp_queue specifies a logical queue for packets, and in the case of ingress, this would represent a stream of packets that share several attributes and are delivered to the ODP application for processing. -The per-queue attributes currently defined are: queue type; sync (ordering); priority; and schedule group (set of processor cores). -@subsubsection odp_buffer_pool odp_buffer_pool -odp_buffer_pool specifies a collection of buffers of the same size and alignment, as well as a set of policies such as flow control and processor affinity. -The classifier API refers to those pools that are designated for storing ingress packets. -@section functional_description Functional Description -Following is the functionality that is required of the classifier API, and its underlying implementation. -The details and order of the following steps are informative, and are only intended to help convey the functional scope of a classifier and provide context for the API.
-In reality, implementations may execute many of these steps concurrently, or in a different order while maintaining the evident dependencies: - --# Apply a set of \e classification \e rules to the header of an incoming packet, identify the header fields, e.g., \e ethertype, IP version, IP protocol, transport layer port numbers, IP DiffServ, VLAN id, 802.1p priority. - --# Store these fields as packet meta data for application use, and for the remainder of parser operations. -The \e odp_pktio is also stored as one of the meta data fields for subsequent use. - --# Compute an \e odp_cos (Class of Service) value from a subset of supported fields from 1) above. - - --# Based on the \e odp_cos from 3) above, select the \e odp_queue through which the packet is delivered to the application. - --# Validate the packet data integrity (checksums, FCS) and correctness (e.g., length fields) and store the validation result, along with optional error layer and type indicator, in packet meta data. -Optionally, if a packet fails validation, override the \e odp_cos selection in step 3 to a class of service designated for errored packets. - --# Since the selected \e odp_queue may require preservation of packet order, i.e., SYNC_ATOMIC or SYNC_ORDERED, optionally select the packet header fields from which the parser calculates an \e odp_flow_signature, which may be a unique flow identifier or a hash, such that the packets which are assigned the same \e odp_flow_signature are scheduled in the same order they are received. - --# Based on the \e odp_cos from 3) above, select the \e odp_buffer_pool that should be used to acquire a buffer to store the packet data and meta data. - --# Allocate a buffer from the \e odp_buffer_pool selected in 7) above and logically store the packet data and meta data to the allocated buffer, or in accordance with class-of-service drop policy and subject to pool buffer availability, optionally discard the packet.
- --# Enqueue the buffer into the \e odp_queue selected in 4) above. - -The above is an abstract description of the classifier functionality, and may be applied to a variety of applications in many different ways. -The ultimate meaning of how this functionality applies to an application also depends on other ODP modules, so the above may not be a complete depiction. -For instance, the exact meaning of \e priority, which is a per-queue attribute, is influenced by the ODP scheduler semantics, and the system behavior under stress depends on the ODP buffer pool module behavior. - -For the sole purpose of illustrating the above abstract functionality, here is an example of a Layer-2 (IEEE 802.1D) bridge application: -Such a forwarding application that also adheres to IEEE 802.1p/q priority, which has 8 traffic priority levels, might create 8 \e odp_buffer_pool instances, one for each PCP priority level, and 8 \e odp_queue instances, one per priority level. -Incoming packets will be inspected for a VLAN header; the PCP field will be extracted, and used to select both the pool and the queue. -Because each queue will be assigned a priority value, packets with the highest PCP values will be scheduled before any packet with a lower PCP value. -Also, in case of congestion, buffer pools for lower priority packets will be depleted earlier than the pools containing higher priority packets, and hence the lower priority packets will be dropped (assuming that is the only flow control method that is supported in the platform) while higher priority packets will continue to be received into buffers and processed.
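The bridge example above can be sketched in plain C. This is a minimal illustration under stated assumptions, not ODP code: `struct pcp_class`, `bridge_init_classes()`, `frame_pcp()` and `bridge_classify()` are hypothetical names, plain integers stand in for odp_queue_t and odp_buffer_pool_t handles, only a single 802.1Q tag (TPID 0x8100) is recognized, and untagged frames are assumed to map to priority 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_PCP_LEVELS 8      /* 802.1p defines a 3-bit priority field */
#define ETH_TPID_VLAN  0x8100 /* TPID of a single 802.1Q VLAN tag */

/* Hypothetical per-priority resources (not part of the ODP API). */
struct pcp_class {
    int queue_id;   /* stand-in for an odp_queue_t handle */
    int pool_id;    /* stand-in for an odp_buffer_pool_t handle */
    int sched_prio; /* assumed: a larger value is scheduled first */
};

/* Create one queue/pool pair per PCP level. */
static void bridge_init_classes(struct pcp_class table[NUM_PCP_LEVELS])
{
    for (int pcp = 0; pcp < NUM_PCP_LEVELS; pcp++) {
        table[pcp].queue_id = pcp;
        table[pcp].pool_id = pcp;
        table[pcp].sched_prio = pcp;
    }
}

/* Return the PCP of a frame, or 0 when no VLAN tag is present. */
static unsigned frame_pcp(const uint8_t *frame, size_t len)
{
    /* 6 bytes DMAC + 6 bytes SMAC + 2 bytes TPID + 2 bytes TCI */
    if (len < 16)
        return 0;
    unsigned tpid = ((unsigned)frame[12] << 8) | frame[13];
    if (tpid != ETH_TPID_VLAN)
        return 0;
    return frame[14] >> 5; /* PCP is the top 3 bits of the TCI */
}

/* Select the pool and queue for an incoming frame. */
static const struct pcp_class *
bridge_classify(const struct pcp_class table[NUM_PCP_LEVELS],
                const uint8_t *frame, size_t len)
{
    unsigned pcp = frame_pcp(frame, len);
    assert(pcp < NUM_PCP_LEVELS);
    return &table[pcp];
}
```

Keeping a separate pool per priority level is what lets low-priority traffic exhaust its own buffers under congestion without starving the higher levels.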
-@subsection flow_diagram Classification Processing Flow Diagram -@image html classification_flow.png "Figure 1: Classification Flow Diagram" width=\textwidth -@image latex classification_flow.eps "Figure 1: Classification Flow Diagram" width=\textwidth - -@section api_elements API Elements -While the above description refers to the abstracted packet classifier, the following is the description of the API designed to program the packet classifier, and is intended to add clarity to the functions provided further below. -@subsection cos_creation Class of Service Creation and Binding -To program the classifier, a class-of-service instance must be created, which will contain the packet filtering resources that it may require. -All subsequent calls refer to one or more of these resources. -Each class of service instance must be associated with a single queue or queue group, which will be the destination of all packets matching that particular filter. -The queue assignment is implemented as a separate function call such that the queue may be modified at any time, without tearing down the filters that define the class of service. -In other words, it is possible to change the destination queue for a class of service defined by its filters quickly and dynamically. -Optionally, on platforms that support multiple packet buffer pools, each class of service may be assigned a different pool such that when buffers are exhausted for one class of service, other classes are not negatively impacted and continue to be processed. - -@subsection default_packet_handling Default packet handling -There SHOULD be one \b odp_cos assigned to each port with the \c odp_cos_pktio_set() function, which will function as the default class-of-service for all packets received from an ingress port, that do not match any of the filters defined subsequently. 
-At a minimum this default class-of-service MUST have a queue and a buffer pool assigned to it on platforms that support multiple packet buffer pools. -Multiple odp_pktio instances (i.e., multiple ports) MAY each have their own default odp_cos, or MAY share an odp_cos with other ports, based on application requirements. - -@subsection packet_classification Packet Classification -For each odp_pktio port, the API allows the assignment of a class-of-service to a packet using one of three methods: - --# The packet may be assigned a specific class-of-service based on its Layer-2 (802.1p/802.1Q VLAN tag) priority field. -Since the standard field defines 8 discrete priority levels, the API allows an odp_cos to be assigned to each of these priority levels with the \c odp_cos_with_l2_priority() function. - --# Similarly, a class-of-service may be assigned using the Layer-3 (IP DiffServ) header field. -The application supplies an array of \e odp_cos values that covers the entire range of the standard protocol header field, where array elements do not need to contain unique values. -The application must also specify whether Layer-3 priority takes precedence over Layer-2 priority in a packet with both headers present. - --# Additionally, the application may also program a number of \e pattern \e matching \e rules that assign a class-of-service for packets with header fields matching specified values. -The field-matching rules take precedence over the previously described priority-based assignment of a class-of-service. -Using these matching rules the application should be able, for example, to identify all packets containing VoIP traffic based on the protocol being UDP and on specific destination or source port numbers, and assign these packets a class-of-service that maps to a higher priority queue, assuring voice packets a lower, bounded latency.
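The precedence among the three methods above, together with the VoIP rule, can be sketched as follows. This is a minimal illustration in plain C, not ODP code: `cos_t`, `COS_NONE`, `ternary_match()`, `is_voip()` and `resolve_cos()` are hypothetical names, the port number is an arbitrary choice for the sketch, and a default class-of-service is assumed to have been assigned to the port.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t cos_t;          /* stand-in for odp_cos_t */
#define COS_NONE ((cos_t)~0u)    /* hypothetical: method made no assignment */

#define IPPROTO_UDP_NUM 17u      /* IANA protocol number for UDP */
#define VOIP_DPORT      5004u    /* illustrative port chosen for this sketch */

/* Ternary match: only the bits set in 'mask' are compared. */
static bool ternary_match(uint32_t field, uint32_t val, uint32_t mask)
{
    return (field & mask) == (val & mask);
}

/* Pattern rule: protocol is UDP and the destination port matches. */
static bool is_voip(uint8_t ip_proto, uint16_t udp_dport)
{
    return ternary_match(ip_proto, IPPROTO_UDP_NUM, 0xffu) &&
           ternary_match(udp_dport, VOIP_DPORT, 0xffffu);
}

/*
 * Pick the final class-of-service for a packet:
 *   1. a pattern matching rule wins over both priority-based methods;
 *   2. Layer-3 wins over Layer-2 only when 'l3_preference' is set
 *      (or when no Layer-2 assignment exists);
 *   3. otherwise the per-port default applies.
 */
static cos_t resolve_cos(cos_t pmr_cos, cos_t l3_cos, cos_t l2_cos,
                         bool l3_preference, cos_t default_cos)
{
    assert(default_cos != COS_NONE); /* assumed precondition of this sketch */
    if (pmr_cos != COS_NONE)
        return pmr_cos;
    if (l3_cos != COS_NONE && (l3_preference || l2_cos == COS_NONE))
        return l3_cos;
    if (l2_cos != COS_NONE)
        return l2_cos;
    return default_cos;
}
```

A caller would pass `COS_NONE` for every method that did not apply to the packet, for instance `resolve_cos(is_voip(proto, dport) ? voip_cos : COS_NONE, l3, l2, true, port_default)`, where `voip_cos` and `port_default` are likewise hypothetical.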
- -@subsection scaling_and_flow Scaling and Flow Discrimination -In addition to classifying packets and routing them to those queues with the appropriate priority, and optionally limiting their memory consumption by designating certain classes of packets to specific buffer pools, the classifier API also facilitates the scaling of data-plane application on multi-core systems by creating a mechanism to define which packet headers need to be combined to result in a value representing a specific packet flow. -The classifier generates a signature, which can be a checksum or hash of arbitrary strength that covers those packet header fields that are identified by the application as identifying flows. - -The \e flow \e signatures that result from hashing are then stored with the packet meta data (along with its class-of-service and its ingress \e odp_pktio port), and subsequently may be utilized by the implementation of a scheduler queue to maintain the order of packets with the same flow signature, while allowing packets with different signatures to be processed concurrently and independently on different processing cores. - -@subsection packet_meta_data Packet meta data Elements -Here are the specific information elements that SHOULD be stored within the packet meta data structure: -- Protocol fields that are decoded and extracted by the parsing phase -- Flow-signature calculated from a prescribed collection of protocol fields -- The class-of-service identifier that is selected for the packet -- The ingress port identifier -- The result of packet validation, including an indication of the type of error detected, if any - -The ODP packet API module SHALL provide accessors for retrieving the above meta data fields from the container buffer in an implementation-independent manner. - -@section api_definitions API Definitions -@subsection data_types Data Types -The following data types are referenced in the API descriptions described below. 
-The names are part of the ODP API and MUST be present in any conforming implementation, however the type values shown here are illustrative and implementations SHOULD either use these or substitute their own type values that are appropriate to the underlying platform. -*/ - -@verbatim -/** - * 'odp_pktio_t' value to indicate any port - */ -#define ODP_PKTIO_ANY ((odp_pktio_t)~0) - - -/** - * 'odp_pktio_t' value to indicate an error - */ -#define ODP_PKTIO_INVALID ((odp_pktio_t)0) - - -/** - * Class of service instance type - */ -typedef uint32_t odp_cos_t; - - -/** - * flow signature type, only used for packet meta data field. - */ -typedef uint32_t odp_flowsig_t; - - -/** - * This value is returned from odp_cos_create() on failure, - * May also be used as a “sink” class of service that - * results in packets being discarded. - */ -#define ODP_COS_INVALID ((odp_cos_t)~0) -@endverbatim - -@subsection cos_routines Class of Service Routines -Conforming ODP implementations MUST provide the following Classification APIs: -@subsubsection cos_create odp_cos_create -@verbatim -/** - * Create a class-of-service - * - * @param name is a string intended for debugging purposes. - * - * @return Class of service instance identifier, - * or ODP_COS_INVALID on error. - */ - -odp_cos_t odp_cos_create(const char *name); -@endverbatim - -This routine is used to create a class of service that can be the target of classifier rules. -The number of such classes supported is implementation-defined. -Attempts to create more than are supported by the implementation will result in an \c ODP_COS_INVALID return and errno being set to \c ODP_IMPLEMENTATION_LIMIT. - -@subsubsection cos_destroy odp_cos_destroy -@verbatim -/** - * Discard a class-of-service along with all its associated resources - * - * @param cos_id class-of-service instance. - * - * @return 0 on success, -1 on error. 
- */ - -int odp_cos_destroy(odp_cos_t cos_id); -@endverbatim - -This routine is the bracketing routine for odp_cos_create(). -It is used to destroy an existing CoS. -It is the caller’s responsibility to ensure that no active pattern matching rules refer to the CoS prior to calling this routine. -Results are unpredictable if this restriction is not met. -@subsubsection cos_set_queue odp_cos_set_queue -@verbatim -/** - * Assign a queue for a class-of-service - * - * @param cos_id class-of-service instance. - * - * @param queue_id is the identifier of a queue where all packets - * of this specific class of service will be enqueued. - * - * @return 0 on success, negative error code on failure. - */ - -int odp_cos_set_queue(odp_cos_t cos_id, odp_queue_t queue_id); -@endverbatim - -This routine associates a target queue with a CoS such that all packets assigned to this CoS will be enqueued to the specified queue_id at the end of classification processing. -@subsubsection cos_set_queue_group odp_cos_set_queue_group -@verbatim -/** - * Assign a homogenous queue-group to a class-of-service. - * - * @param cos_id identifier of class-of-service instance - * @param queue_group_id identifier of the queue group to receive packets - * associated with this class of service. - * - * @return 0 on success, negative error code on failure. - */ - -int odp_cos_set_queue_group(odp_cos_t cos_id, odp_queue_group_t queue_group_id); -@endverbatim - -This routine associates a target queue group with a CoS such that all packets assigned to this CoS will be distributed to the specified queue_group_id at the end of classification processing. -@subsubsection cos_set_pool odp_cos_set_pool -@verbatim -/** - * Assign packet buffer pool for specific class-of-service - * - * @param cos_id class-of-service instance. - * @param pool_id is a buffer pool identifier where all packet buffers - * will be sourced to store packet that belong to this - * class of service. 
- * - * @return 0 on success, negative error code on failure. - * - * - */ - -int odp_cos_set_pool(odp_cos_t cos_id, odp_buffer_pool_t pool_id); -@endverbatim - -This OPTIONAL routine associates a target buffer pool with a CoS such that all packets assigned to this CoS will be stored in packet buffers allocated from the designated pool_id. - - -@subsection cos_drop_policy Class of Service Drop Policy Routines -These routines control how drop policies are to be observed for a given class of service. -@subsubsection drop_data_types Data types -~~~~~{.c} -enum odp_cos_drop_e { - ODP_COS_DROP_POOL, /**< Follow buffer pool drop policy */ - ODP_COS_DROP_NEVER, /**< Never drop, ignoring buffer pool policy */ -}; -typedef enum odp_cos_drop_e odp_drop_t; -~~~~~ - -@subsubsection cos_set_drop odp_cos_set_drop -@verbatim -/** - * Assign packet drop policy for specific class-of-service - * - * @param cos_id class-of-service instance. - * @param drop_policy is the desired packet drop policy for this class. - * - * @return 0 on success, negative error code on failure. - */ - -int odp_cos_set_drop(odp_cos_t cos_id, odp_drop_t drop_policy); -@endverbatim - -This routine sets the drop policy for a class of service. -It is an OPTIONAL routine. -If an implementation does not provide this function it MUST supply a definition of it that simply returns ODP_FUNCTION_NOT_AVAILABLE. -@subsubsection pktio_set_default_cos odp_pktio_set_default_cos -@verbatim -/** - * Set up per-port default class-of-service - * - * @param pktio_in ingress port identifier. - * @param default_cos class-of-service set to all packets arriving - * at the 'pktio_in' ingress port, unless overridden by subsequent - * header-based filters. - * - * @return 0 on success, negative error code on failure. - * - * - * @note This may replace the default queue per pktio.
- */ - -int odp_pktio_set_default_cos(odp_pktio_t pktio_in, odp_cos_t default_cos); -@endverbatim - -This routine specifies a default class of service for a given pktio instance. -Incoming packets on the specified pktio are assigned to this class of service if no other pattern matching rule applies. -@subsubsection pktio_set_error_cos odp_pktio_set_error_cos -@verbatim -/** - * Set up per-port error class-of-service - * - * @param pktio_in ingress port identifier. - * @param error_cos class-of-service set to all packets arriving - * at the 'pktio_in' ingress port that contain an error. - * - * @return 0 on success, negative error code on failure. - */ - -int odp_pktio_set_error_cos(odp_pktio_t pktio_in, odp_cos_t error_cos); -@endverbatim - -This OPTIONAL function assigns a class-of-service used to handle packets containing various types of errors. -The specific error types include L2 FCS and optionally L3/L4 checksum errors, malformed headers, etc., depending on platform capabilities. -The specified error_cos MAY simply discard these packets or deliver them via a queue to the application for further processing. -@subsubsection pktio_set_skip odp_pktio_set_skip -@verbatim -/** - * Set up per-port header offset - * - * @param pktio_in ingress port identifier. - * @param offset is the number of bytes the classifier must skip. - * - * @return Success or ODP_FUNCTION_NOT_AVAILABLE - */ - -int odp_pktio_set_skip(odp_pktio_t pktio_in, size_t offset); -@endverbatim - -This OPTIONAL function applies to ports that carry additional headers preceding the standard Ethernet header. -Such headers are typically vendor-specific and thus the classifier is not required to parse such headers, but the size of a custom header is critical for the classifier to be able to parse standard protocol headers that normally follow. -@subsubsection cos_set_headroom odp_cos_set_headroom -@verbatim -/** - * Specify per-class-of-service buffer headroom - * - * @param cos_id class-of-service instance.
- * @param req_room number of bytes of space preceding packet data to reserve - * for use as headroom. Must not exceed the implementation - * defined ODP_PACKET_MAX_HEADROOM. - * - * @return Success or ODP_PARAMETER_ERROR, - * or ODP_FUNCTION_NOT_AVAILABLE - */ - -int odp_cos_set_headroom(odp_cos_t cos_id, size_t req_room); -@endverbatim - -This OPTIONAL routine specifies the number of bytes of headroom that should be reserved for each packet assigned to this class of service. -Each implementation defines an ODP_PACKET_MAX_HEADROOM limit that sets an upper bound on the size of the headroom that can be reserved for a packet. -@subsubsection cos_with_l2_priority odp_cos_with_l2_priority -@verbatim -/** - * Request to override per-port class of service - * based on Layer-2 priority field if present. - * - * @param pktio_in ingress port identifier. - * @param num_qos is the number of QoS levels, typically 8. - * @param qos_table are the values of the Layer-2 QoS header field. - * @param cos_table is the class-of-service assigned to each of the - * allowed Layer-2 QoS levels. - * @return 0 on success, negative error code on failure. - */ - -int odp_cos_with_l2_priority(odp_pktio_t pktio_in, - size_t num_qos, - uint8_t qos_table[], /**< 'num_qos' elements */ - odp_cos_t cos_table[]); /**< 'num_qos' elements */ -@endverbatim - -This routine is used to assign classes of service based on the layer 2 (L2) priority associated with input packets received on the specified pktio_in. -For each of the values in qos_table[], the corresponding value in cos_table[] will be assigned. -@subsubsection cos_with_l3_dscp odp_cos_with_l3_dscp -@verbatim -/** - * Request to override per-port class of service - * based on Layer-3 priority field if present. - * - * @param pktio_in ingress port identifier. - * @param num_qos is the number of allowed Layer-3 QoS levels. - * @param qos_table are the values of the Layer-3 QoS header field. - * @param cos_table is the class-of-service assigned to each of the - * allowed Layer-3 QoS levels.
- * @param l3_preference when true, Layer-3 QoS overrides L2 QoS when present. - * - * @return 0 on success, negative error code on failure. - */ - -int odp_cos_with_l3_qos(odp_pktio_t pktio_in, - size_t num_qos, - uint8_t qos_table[], /**< 'num_qos' elements */ - odp_cos_t cos_table[], /**< 'num_qos' elements */ - odp_bool_t l3_preference); -@endverbatim - -This OPTIONAL routine is used to assign classes of service based on the layer 3 (L3) Differentiated Services (DS) designation. -This is the DSCP field of an IPv4 header or the first six bits of the Traffic Class of an IPv6 header. -For each of the values in qos_table[], the corresponding value in cos_table[] will be assigned. -The l3_preference flag is used to control whether the CoS assigned by this routine takes precedence over the CoS assigned by odp_cos_with_l2_priority() in the event that both apply to the same packet. - -@subsection pmrs Pattern Matching Rules -While the above routines permit class of service assignments to be made based on static criteria, the real power of classification is the ability to identify flows based on the variable contents of packet headers. -To do this ODP provides support for defining pattern matching rules (PMRs) that operate based on values contained in specified header fields. - -Associated with PMRs are enums that are used to specify standard packet header fields: -@subsubsection cos_hdr_flow_fields odp_cos_hdr_flow_fields_e -@verbatim -/** - * Packet header field enumeration - * for fields that may be used to calculate - * the flow signature, if present in a packet.
- */ - -enum odp_cos_hdr_flow_fields_e { - ODP_COS_FHDR_IN_PKTIO, /**< Ingress port number */ - ODP_COS_FHDR_L2_SAP, /**< Ethernet Source MAC address */ - ODP_COS_FHDR_L2_DAP, /**< Ethernet Destination MAC address */ - ODP_COS_FHDR_L2_VID, /**< Ethernet VLAN ID */ - ODP_COS_FHDR_L3_FLOW, /**< IPv6 flow_id */ - ODP_COS_FHDR_L3_SAP, /**< IP source address */ - ODP_COS_FHDR_L3_DAP, /**< IP destination address */ - ODP_COS_FHDR_L4_PROTO, /**< IP protocol (e.g. TCP/UDP/ICMP) */ - ODP_COS_FHDR_L4_SAP, /**< Transport source port */ - ODP_COS_FHDR_L4_DAP, /**< Transport destination port */ - ODP_COS_FHDR_IPSEC_SPI, /**< IPsec session identifier */ - ODP_COS_FHDR_LD_VNI, /**< NVGRE/VXLAN network identifier */ - ODP_COS_FHDR_USER /**< Application-specific header field(s) */ -}; -@endverbatim - -Conforming ODP implementations SHOULD implement efficient flow set management routines such as these: - -~~~~~{.c} -/** - * Set of header fields that take part in flow signature hash calculation: - * bit positions per 'odp_cos_hdr_flow_fields_e' enumeration. - */ -typedef uint16_t odp_cos_flow_set_t; - - -/** - * Set a member of the flow signature fields data set - */ -static inline odp_cos_flow_set_t -odp_cos_flow_set( odp_cos_flow_set_t set, - enum odp_cos_hdr_flow_fields_e field) -{ - return set | (1U << field); -} - - -/** - * Test a member of the flow signature fields data set - */ -static inline bool -odp_cos_flow_is_set( odp_cos_flow_set_t set, - enum odp_cos_hdr_flow_fields_e field) -{ - return (set & (1U << field)) != 0; -} -~~~~~ - -These routines are intended to be used in support of the following flow signature APIs: - -@subsubsection cos_class_flow_sig odp_cos_class_flow_signature -@verbatim -/** - * Set up set of headers used to calculate a flow signature - * based on class-of-service. - * - * @param cos_id class of service instance identifier - * @param req_data_set requested data-set for flow signature calculation - * - * @return data-set that was successfully applied.
An all-zeros data-set - * indicates a failure to assign any of the requested fields, or other - * error. - */ - -odp_cos_flow_set_t -odp_cos_class_flow_signature(odp_cos_t cos_id, - odp_cos_flow_set_t req_data_set); -@endverbatim - -This OPTIONAL routine associates a flow set with a class of service for flow signature calculation. - -@subsubsection cos_port_flow_sig odp_cos_port_flow_signature -@verbatim -/** - * Set up set of headers used to calculate a flow signature - * based on ingress port. - * - * @param pktio_in ingress port identifier. - * @param req_data_set requested data-set for flow signature calculation - * - * @return data-set that was successfully applied. An all-zeros data-set - * indicates a failure to assign any of the requested fields, or other - * error. - */ - -odp_cos_flow_set_t -odp_cos_port_flow_signature(odp_pktio_t pktio_in, - odp_cos_flow_set_t req_data_set); -@endverbatim - -@subsection pmr_routines Pattern Matching Rules Routines -The following data structures SHOULD be implemented to support the definition of pattern matching routines by conforming ODP implementations: - -~~~~~{.c} -/** - * PMR - Packet Matching Rule - * Up to 32 bits of ternary matching of one of the available header fields - */ - - -#define ODP_PMR_INVAL ((odp_pmr_t)NULL) -typedef struct odp_pmr_s *odp_pmr_t; -~~~~~ - -@subsection terms Terms -Terms are the elements of a PMR and are identified by the following enum: - -@verbatim -enum odp_pmr_term_e { - ODP_PMR_LEN, /**< Total length of received packet */ - ODP_PMR_ETHTYPE_0, /**< Initial (outer) Ethertype only (*val=uint16_t)*/ - ODP_PMR_ETHTYPE_X, /**< Ethertype of most inner VLAN tag (*val=uint16_t)*/ - ODP_PMR_VLAN_ID_0, /**< First VLAN ID (outer) (*val=uint16_t) */ - ODP_PMR_VLAN_ID_X, /**< Last VLAN ID (inner) (*val=uint16_t) */ - ODP_PMR_DMAC, /**< destination MAC address (*val=uint64_t) */ - ODP_PMR_IPPROTO, /**< IP Protocol or IPv6 Next Header (*val=uint8_t) */ - ODP_PMR_UDP_DPORT, /**< Destination UDP port,
implies IPPROTO=17 */ - ODP_PMR_TCP_DPORT, /**< Destination TCP port implies IPPROTO=6 */ - ODP_PMR_UDP_SPORT, /**< Source UDP Port (*val=uint16_t) */ - ODP_PMR_TCP_SPORT, /**< Source TCP port (*val=uint16_t) */ - ODP_PMR_SIP_ADDR, /**< Source IP address (uint32_t) */ - ODP_PMR_DIP_ADDR, /**< Destination IP address (uint32_t) */ - ODP_PMR_SIP6_ADDR, /**< Source IP address (uint8_t[16]) */ - ODP_PMR_DIP6_ADDR, /**< Destination IP address (uint8_t[16]) */ - ODP_PMR_IPSEC_SPI, /**< IPsec session identifier(*val=uint32_t) */ - ODP_PMR_LD_VNI, /**< NVGRE/VXLAN network identifier (*val=uint32_t) */ - - - /** Inner header may repeat above values with this offset */ - ODP_PMR_INNER_HDR_OFF=32 -}; -@endverbatim - -@subsubsection tunnel_considerations Tunnel Considerations -Note that PMRs may be extended to support tunnels and tenants (NVGRE, VXLAN) via the ODP_PMR_INNER_HDR_OFF enum. -This enum is intended to be used as an “adder” to a PMR to indicate that the term refers to an inner header. -For example, the term ODP_PMR_DMAC would refer to the destination MAC address of the packet if the packet is not a tunnel, or of the outer header (the tunnel) if the packet is a tunnel. -To refer to the inner (tenant) destination MAC, the term would be specified as ODP_PMR_INNER_HDR_OFF+ODP_PMR_DMAC. - -@subsection pmr_apis PMR APIs -The following APIs are provided to enable an ODP application to specify PMRs as a series of individual or cascaded terms: -@subsubsection pmr_create_match odp_pmr_create_match -@verbatim -/** - * Create a packet match rule with mask and value - * - * @param term is one value of the enumerated values supported - * @param val is the value to match against the packet header - * in native byte order. - * @param mask is the mask to indicate which bits of the header - * should be matched ('1') and which should be ignored ('0') - * @param val_sz size of the ‘val’ and ‘mask’ arguments, - * that must match the value size requirement of the - * specific ‘term’. 
- * - * @return a handle of the matching rule or ODP_PMR_INVAL on error - */ - -odp_pmr_t odp_pmr_create_match(enum odp_pmr_term_e term, - const void *val, const void *mask, size_t val_sz); -@endverbatim - -This routine creates a PMR that matches a single value to a term. - -@subsubsection pmr_create_range odp_pmr_create_range -@verbatim -/** - * Create a packet match rule with value range - * - * @param term is one value of the enumerated values supported - * @param val1 is the lower bound of the header field range. - * @param val2 is the upper bound of the header field range. - * @param val_sz size of the ‘val1’ and ‘val2’ arguments, - * that must match the value size requirement of the - * specific ‘term’. - * - * @return a handle of the matching rule or ODP_PMR_INVAL on error - * @note: Range is inclusive [val1..val2]. - */ - -odp_pmr_t odp_pmr_create_range(enum odp_pmr_term_e term, - const void *val1, const void *val2, size_t val_sz); -@endverbatim - -This routine creates a PMR that matches an inclusive range of values to a term. - -@subsubsection pmr_destroy odp_pmr_destroy -@verbatim -/** - * Invalidate a packet match rule and vacate its resources - * - * @param pmr_id the identifier of the PMR to be destroyed - * - * @return Success or ODP_PMR_INVALID if the specified pmr_id not found. - */ - -int odp_pmr_destroy(odp_omr_t pmr_id); -@endverbatim - -This routine destroys a previously created PMR. -If the PMR is currently associated with an active class of service it is unpredictable at which point the match defined by the PMR is deactivated in terms of packet flow. -However, implementations MUST ensure that a PMR is either matched or not matched in its entirety such that dynamic changes to PMRs do not result in partial matches. - -@subsubsection pktio_pmr_cos odp_pktio_pmr_cos -@verbatim -/** - * Apply a PMR to a pktio to assign a CoS. 
- * - * @param pmr_id the id of the PMR to be activated - * @param src_pktio the pktio to which this PMR is to be applied - * @param dst_cos the CoS to be assigned by this PMR - * - * @return Success or ODP_PARAMETER_ERROR - */ - -int odp_pktio_pmr_cos(odp_pmr_t pmr_id, odp_pktio_t src_pktio, odp_cos_t dst_cos); -@endverbatim - -This routine links a pktio to a corresponding class of service via a specified PMR. -Any packet received on the specified src_pktio that matches the specified pmr_id will be assigned to the specified dst_cos. -If multiple PMRs match the implementation MAY define an inherent precedence or it MAY be unpredictable as to which PMR will determine the assigned CoS. -For this reason applications SHOULD NOT be written to use conflicting or ambiguous PMR definitions. - -@subsubsection cos_pmr_cos odp_cos_pmr_cos -@verbatim -/** - * Cascade a PMR to refine packets from one CoS to another. - * - * @param pmr_id the id of the PMR to be activated - * @param src_cos the id of the CoS to be filtered - * @param dst_cos the id of the CoS to be assigned to packets filtered - * from src_cos that match pmr_id. - * - * @return Success or ODP_PARAMETER_ERROR if an input is in error - * or ODP_IMPLEMENTATION_LIMIT if cascade depth is exceeded - */ - -int odp_cos_pmr_cos(odp_pmr_t pmr_id, odp_cos_t src_cos, odp_cos_t dst_cos); -@endverbatim - -This routine is used to cascade PMRs by passing packets assigned to the src_cos through another PMR. -Those matching are reassigned to the specified dst_cos. -Note that this process can be repeated to an implementation-defined maximum supported cascade depth. -When cascades are defined, the actual class of service assigned to a packet is the result of the longest chain of PMRs that can be matched against the packet. 
- -For example, suppose the following sequence of PMRs is in effect: - -@verbatim -odp_pktio_pmr_cos(pmr_idA, pktio_id, cos_idA); -odp_cos_pmr_cos(pmr_idB, cos_idA, cos_idB); -odp_cos_pmr_cos(pmr_idC, cos_idB, cos_idC); -odp_cos_pmr_cos(pmr_idD, cos_idC, cos_idD); -@endverbatim - -If a packet arrives on pktio_id that matches pmr_idA it is assigned to cos_idA. -But since it is now on cos_idA it is further filtered by pmr_idB and if it matches is reassigned to cos_idB. -This process continues until no further more specific match is found to determine the final CoS that the packet receives. - -Note that given this rule set a packet that matched pmr_idA and pmr_idC it would be assigned to cos_idA because the rule that can assign packets to pmr_idC is only applicable to packets that are assigned to cos_idB, not cos_idA. - -Using cascaded PMRs it is possible to build quite sophisticated filters (up to the implementation limits supported by a given platform). -For example, one could add additional rules to the above set: - -@verbatim -odp_cos_pmr_cos(pmr_idAC, cos_idA, cos_idC); -odp_cos_pmr_cos(pmr_idAD, cos_idA, cos_idD); -@endverbatim - -To cover cases where some packets on cos_idA should be further sorted to cos_idB while others should be sorted directly to cos_idC or cos_idD. -Again it is the application’s responsibility to ensure that the cascades remain unambiguous and that loops be avoided (e.g., having rules that bounce packets between cos_idA and cos_idB endlessly). - -@subsection pmr_stats PMR Statistics -Conforming ODP implementations SHOULD maintain statistics regarding PMRs and provide the following routines for retrieving them: - -@subsubsection pmr_match_count odp_pmr_match_count -@verbatim -/** - * Retrieve packet matcher statistics - * - * @param pmr_id the id of the PMR from which to retrieve the count - * - * @return The current number of matches for a given matcher instance. 
- */ - -signed long odp_pmr_match_count(odp_pmr_t pmr_id); -@endverbatim - -@subsubsection pmr_terms_cap odp_pmr_terms_cap -@verbatim -/** - * Inquire about matching terms supported by the classifier - * - * @return A mask one bit per enumerated term, one for each of op_pmr_term_e - */ - -unsigned long long odp_pmr_terms_cap(void); -@endverbatim - -@subsubsection pmr_terms_avail odp_pmr_terms_avail -@verbatim -/** - * Return the number of packet matching terms available for use - * - * @return A number of packet matcher resources available for use. - */ - -unsigned odp_pmr_terms_avail(void); -@endverbatim - -@subsection pmr_composite_rules Pattern Matching Composite Routines -As a shorthand, applications MAY express pattern matching rules using a table rather than constructing them term-by-term. -ODP implementations MUST support both methods of rule specification but MAY have implementation-specific restrictions on the complexity of table-based rules they support. -Note that some implementations MAY be able to implement tables directly while others MAY choose to implement tables by internally generating the equivalent set of term generating calls. - -@subsubsection pmr_table_structure PMR Table Structure -@verbatim -/** - * Following structure is used to define composite packet matching rules - * in the form of an array of individual match or range rules. - * The underlying platform may not support all or any specific combination - * of value match or range rules, and the application should take care - * of inspecting the return value when installing such rules, and perform - * appropriate fallback action. 
- */ - -typedef struct odp_pmr_match_t { - enum odp_pmr_match_type_e { - ODP_PMR_MASK, /**< Match a masked set of bits */ - ODP_PMR_RANGE, /**< Match an integer range */ - } match_type; - union { - struct { - enum odp_pmr_term_e term; - const void *val; - const void *mask; - unsigned int val_sz; - } mask; /**< Match a masked set of bits */ - struct { - enum odp_pmr_term_e term; - const void *val1; - const void *val2; - unsigned int val_sz; - } range; /**< Match an integer range */ - }; -} odp_pmr_match_t; - - -/** An opaque handle to a composite packet match rule-set */ -typedef struct odp_pmr_set_s *odp_pmr_set_t; -@endverbatim; - -The above structure is used with the following APIs to implement table-based PMRs: - -@subsubsection pmr_match_set_create odp_pmr_match_set_create -@verbatim -/** - * Create a composite packet match rule - * - * @param num_terms is the number of terms in the match rule. - * @param terms is an array of num_terms entries, one entry per - * term desired. - * @param dst_cos is the class-of-service to be assigned to packets - * that match the compound rule-set, or a subset thereof, - * if partly applied. - * @param pmr_set_id is the returned handle to the composite rule set. - * - * @return The return value may be a negative number indicating a general - * error, or a positive number indicating the number of ‘terms’ elements that - * have been successfully mapped to the underlying platform classification engine, - * and may be in the range from 1 to ‘num_terms’. - */ - -int odp_pmr_match_set_create(int num_terms, odp_pmr_match_t *terms, - odp_pmr_set_t *pmr_set_id); -@endverbatim - -This routine is used to create a PMR match set. - It is the equivalent to a cascade of PMRs except that there are no “intermediate” classes of service defined. -Instead, the entire match set either matches or does not match as a single entity. 
- -@subsubsection pmr_match_set_destroy odp_pmr_match_set_destroy -@verbatim -/** - * Function to delete a composite packet match rule set - * - * All of the resources pertaining to the match set associated with the - * class-of-service will be released, but the class-of-service will - * remain intact. - * - * @param pmr_set_id a composite rule-set handle returned when created. - * - * @note Depending on the implementation details, destroying a rule-set - * may not guarantee the availability of hardware resources to create the - * same or essentially similar rule-set. - */ - -int odp_pmr_match_set_destroy(odp_pmr_set_t pmr_set_id); -@endverbatim - -This routine destroys a PMR match set previously created by odp_pmr_match_set_create(). - -@subsubsection pktio_pmr_match_set_cos odp_pktio_pmr_match_set_cos -@verbatim -/** - * Apply a PMR Match Set to a pktio to assign a CoS. - * - * @param pmr_set_id the id of the PMR match set to be activated - * @param src_pktio the pktio to which this PMR match set is to be applied - * @param dst_cos the CoS to be assigned by this PMR match set - * - * @return Success or ODP_PARAMETER_ERROR - */ - -int odp_pktio_pmr_match_set_cos(odp_pmr_t pmr_id, odp_pktio_t src_pktio, - odp_cos_t dst_cos); -@endverbatim - -This routine is the same as odp_pktio_pmr_cos() except that it operates on PMR match sets rather than individual PMRs. - -@section items_pending Items pending resolution -- Revise ‘odp_packet_io.h’ API with respect of default input queue per ‘pktio’ instance. -- Revise ‘odp_queue.h’ API to support an arbitrary priority range, typically 8 priority levels with numeric priority values are platform-specific. -- Add specific packet meta data fields to go into packet buffer which contain all meta data fields parsed and generated by the classifier, for later application use. 
- -@section implementation_notes Implementation Notes -The following sections are not part of the specification, but shed light into the intent of the specification in several areas, describing some specific implementation approaches of these aspects. - -@subsection supporting_multi_pools Supporting multiple buffer pools -The support of multiple buffer pools for containing packet buffers is optional, and may not be supported by some platforms. -The importance of this feature stems from the need of protecting a networking application in the event of a congestion, or an attempted denial of service attack. -Separating different classes of service to dedicated buffer pools allows the system to limit the memory resources that may be consumed by a particular type of traffic, thereby reserving buffer resources for other classes of traffic. - -In a software implementation, a packet would already be stored in memory when the classifier is invoked, and so it seems the classifier is unable to insert itself into the process of selecting a buffer pool. -For obvious reasons the copying of a packet into a new buffer allocated from a different pool by the classifier is not a desirable solution. - -The recommended solution is to implement buffer pools in the form of buffer counters, while the actual buffers all belong to a single free list when not used to store a packet. -In such an implementation, the classifier will be able to associate a packet already occupying a buffer to a different pool than the default by incrementing the buffer counter of the newly selected pool, and decrementing the counter representing the default pool. -If however the selected pool counter has already reached a certain limit, the classifier would be able to e.g discard the packet instead of incrementing the destination pool counter, and thereby enforce the desirable semantics of distinct buffer pools per class of service. 
- -Other possible action that may be taken in response to running out of buffers or coming too low on buffers include back-pressure and random-early-detect with a discard probability inversely proportional to the number of free buffers in a pool. -A related implementation topic is the ability to begin dropping some packets before a buffer pool is entirely exhausted. -This is typically referred to as Random Early Detect (or “RED”). -This is deemed to be a feature of the buffer pool implementation on a given platform, where in addition to a hard limit on the number of buffers that can be allocated to a pool, there can also be an option discard packets with a probability the increases as the number of outstanding buffers approaches that hard limit. - -@subsection resolving_gaps Resolving gaps between the API and hardware capabilities -On platforms that support hardware packet accelerators, it is possible that the packet parsing and classification functionality is sufficient to address only a portion of the functionality specified within this document. -This gap may be potentially bridged by augmenting the hardware classification capabilities with a software logic implemented as part of the platform. -In that case, the platform will have to curve out a fraction of the processing resources and dedicate those to the software classification logic, which would be invoked for packets that the hardware platform was unable to classify completely. -At the time of this writing, it is believed however that the performance penalty that will be incurred as a result of software augmentation is unjustified for most application, i.e. -it is preferred to lose the precision of packet prioritization while maintaining full hardware packet processing speed. - -@subsection loopback_case The case for loopback ports, and some of their uses -In some applications, it may be desirable to be able to run a single packet through the classifier more than once. 
-For example, an encrypted IPsec packet is received from a physical port. -The encrypted packet is assigned a class of service based on its outer unencrypted header fields. -Later, processing the packet entails decrypting the payload of the packet, authenticating it, and removing the original outer headers, which reveals a new set of protocol headers which need to be used to re-classify the packet, and assign it a new priority and buffer pool. -An elegant solution for this use case would be to take advantage of “loopback” logical ports that may be implemented in certain platforms, by transmitting decapsulated packet into a loop-back port. -The same packet then is received from a loop-back port and is examined by the classifier in accordance to the rules assigned to the loopback odp_pktio logical port instance. -Similar mechanism may be applied to tunnel termination processing, fragment reassembly et al. - -@section related_topics Related Topics -The following section discusses aspects of the ODP API that are not integral to the classifier, which only applies to ingress preprocessing. -This section covers miscellaneous aspects of the API that need to be addressed, and are related to packet buffer processing and egress post-processing. -Additional packet buffer manipulation APIs -The need for these following calls are made evident by the need to encapsulate, i.e., remove some headers and add other, thereby changing the size of the headers of a packet during processing. - -@subsection initial_headroom Configuring initial packet buffer headroom -The following function is provided to configure the pktio receive mechanism to (optionally)reserve some headroom between start of the first buffer to the first byte of the first packet data byte, which subsequently could be used to increase the header size “in-place”, without allocating additional gather list elements. 
-If the request is granted, at least bytes will be reserved in the front of the packet data: -@verbatim -int odp_pktio_set_headroom(odp_pktio_t port_id, unsigned req_bytes); -@endverbatim -The return value should be negative if the request can not be satisfied, or positive otherwise indicating the actual minimum headroom reserved. -Note that the implementation may reserve more than the requested amount of headroom, and hence on platforms that are unable to support per-port (or per CoS) headroom configuration, a system-wide headroom configuration may be set to the largest of all such requests, and thus satisfy the requirement. -In addition to the above per-port headroom configuration call, there should be an optional, per-CoS call that allows the reservation of different amounts of packet buffer headroom for packets that match certain criteria: for example, the following call allows the application to request that only packets that are expected to be encapsulated in a tunnel, be augmented with a large headroom amount, while packets that are received from a tunnel, and are IP fragments, be assigned a different headroom requirement (see definition for odp_cos_set_headroom() above). - -@subsection open_issues Open Issues -- Egress packet scheduling, prioritization, and ordering -- Parallel matching rules relative precedence. -- Specify application-defined header field declaration APIs. -- Review RFC 4301 for match requirements for IPsec SA, consider the use of L4 port ranges instead of or in addition to value & mask matching criteria. -- Consider the type of packet checks should route a packet through the error CoS: L2 is a safe choice, but L3/L4 checksum or other exceptions deserve consideration. - -@subsection usage_examples Usage Examples -Following is a simple sample configuration using the API elements described above. -TBD. 
- -*/ +/* Copyright (c) 2014, Linaro Limited + * All rights reserved + * + * SPDX-License-Identifier: BSD-3-Clause + */ + +/** +@page classification_design ODP Design - Classification API +For the implementation of the ODP classification API please see @ref odp_classify.h + +@tableofcontents + +@section introduction Introduction +This document defines the Classification APIs supported by ODP. +Classification is logically composed of two stages: Parsing and Rule Matching. +Parsing takes a raw packet, validates its structure, and identifies fields of interest in the various headers that comprise the layers of the packet. +Rule Matching, in turn, takes the result of parsing and sorts packets into Classes of Service (CoS) based on application-defined rule sets. +@subsection use_of_terms Use of Terms +The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC 2119](https://tools.ietf.org/html/rfc2119). +@subsection purpose Purpose +ODP is a framework for software-based packet forwarding/filtering applications, and the purpose of the Packet Classifier API is to enable applications to program the platform hardware or software implementation to assist in prioritization, classification and scheduling of each packet, so that the software application can run faster, scale better and adhere to QoS requirements. + +The following API abstraction is not modelled after any existing product implementation, but is instead defined in terms of what a typical data-plane application may require from such a platform, without sacrificing simplicity and while avoiding ambiguity. 
+Certain terms that are used within the context of existing products in relation to packet parsing and classification, such as “access lists”, are avoided so as not to suggest any relationship between the abstractions used within this API and any particular manner in which they may be implemented in hardware. + +These are the key ODP objects, presently defined in ODP, that the parser needs to employ: +@subsubsection odp_pktio odp_pktio +odp_pktio specifies an individual packet I/O channel instance. +In other words, it would translate to a physical interface or a logical port, or in the case of channelized protocols (e.g., [Interlaken](https://www.google.com/url?q=https%3A%2F%2Fwww.cortina-systems.com%2Fimages%2Fdocuments%2F400023_Interlaken_Technology_White_Paper.pdf&sa=D&sntz=1&usg=AFQjCNEBdJTBmA1XaNGY3pmumQTfgSi1oA)) it would map to a logical channel on that interface. + +Since the classifier API deals exclusively with ingress, this object represents the source of packets into the classifier. +In order to support any non-trivial use case, the classifier API needs to be able to assign multiple odp_queue instances for any single odp_pktio object, and may also assign any odp_queue instance to more than one odp_pktio object. +@subsubsection odp_queue odp_queue +odp_queue specifies a logical queue for packets, and in the case of ingress, this would represent a stream of packets that share several attributes and are delivered to the ODP application for processing. +The per-queue attributes currently defined are: queue type; sync (ordering); priority; and schedule group (set of processor cores). +@subsubsection odp_buffer_pool odp_buffer_pool +odp_buffer_pool specifies a collection of buffers of the same size and alignment, as well as a set of policies such as flow control and processor affinity. +The classifier API refers to such pools that are designated for storing ingress packets. 
+@section functional_description Functional Description +Following is the functionality that is required of the classifier API, and its underlying implementation. +The details and order of the following paragraph is informative, and is only intended to help convey the functional scope of a classifier and provide context for the API. +In reality, implementations may execute many of these steps concurrently, or in different order while maintaining the evident dependencies: + +-# Apply a set of \e classification \e rules to the header of an incoming packet, identify the header fields, e.g., \e ethertype, IP version, IP protocol, transport layer port numbers, IP DiffServ, VLAN id, 802.1p priority. + +-# Store these fields as packet meta data for application use, and for the remainder of parser operations. +The \e odp_pktio is also stored as one of the meta data fields for subsequent use. + +-# Compute an \e odp_cos (Class of Service) value from a subset of supported fields from 1) above. + + +-# Based on the \e odp_cos from 3) above, select the \e odp_queue through which the packet is delivered to the application. + +-# Validate the packet data integrity (checksums, FCS) and correctness (e.g., length fields) and store the validation result, along with optional error layer and type indicator, in packet meta data. +Optionally, if a packet fails validation, override the \e odp_cos selection in step 3 to a class of service designated for errored packets. + +-# Since the selected \e odp_queue may require preservation of packet order, i.e., SYNC_ATOMIC or SYNC_ORDERED, optionally select the packet header fields from which the parser calculates a \e odp_flow_signature, which may be a unique flow identifier or a hash, such that the packets which are assigned the same \e odp_flow_signature are scheduled in the same order they are received. 
+ +-# Based on the \e odp_cos from 3) above, select the \e odp_buffer_pool that should be used to acquire a buffer to store the packet data and meta data. + +-# Allocate a buffer from the \e odp_buffer_pool selected in 7) above and logically store the packet data and meta data to the allocated buffer, or in accordance with class-of-service drop policy and subject to pool buffer availability, optionally discard the packet. + +-# Enqueue the buffer into the \e odp_queue selected in 4) above. + +The above is an abstract description of the classifier functionality, and may be applied to a variety of applications in many different ways. +The ultimate meaning of how this functionality applies to an application also depends on other ODP modules, so the above may not give a complete picture. +For instance, the exact meaning of \e priority, which is a per-queue attribute, is influenced by the ODP scheduler semantics, and the system behavior under stress depends on the ODP buffer pool module behavior. + +For the sole purpose of illustrating the above abstract functionality, here is an example of a Layer-2 (IEEE 802.1D) bridge application: +Such a forwarding application that also adheres to IEEE 802.1p/q priority, which has 8 traffic priority levels, might create 8 \e odp_buffer_pool instances, one for each PCP priority level, and 8 \e odp_queue instances, one per priority level. +Incoming packets will be inspected for a VLAN header; the PCP field will be extracted, and used to select both the pool and the queue. +Because each queue will be assigned a priority value, the packets with the highest PCP values will be scheduled before any packet with a lower PCP value. 
+Also, in a case of congestion, buffer pools for lower priority packets will be depleted earlier than the pools containing packets of the high priority, and hence the lower priority packets will be dropped (assuming that is the only flow control method that is supported in the platform) while higher priority packets will continue to be received into buffers and processed. +@subsection flow_diagram Classification Processing Flow Diagram +@image html classification_flow.png "Figure 1: Classification Flow Diagram" width=\textwidth +@image latex classification_flow.eps "Figure 1: Classification Flow Diagram" width=\textwidth + +@section api_elements API Elements +While the above description refers to the abstracted packet classifier, the following is the description of the API designed to program the packet classifier, and is intended to add clarity to the functions provided further below. +@subsection cos_creation Class of Service Creation and Binding +To program the classifier, a class-of-service instance must be created, which will contain the packet filtering resources that it may require. +All subsequent calls refer to one or more of these resources. + +Each class of service instance must be associated with a single queue or queue group, which will be the destination of all packets matching that particular filter. +The queue assignment is implemented as a separate function call such that the queue may be modified at any time, without tearing down the filters that define the class of service. +In other words, it is possible to change the destination queue for a class of service defined by its filters quickly and dynamically. + +Optionally, on platforms that support multiple packet buffer pools, each class of service may be assigned a different pool such that when buffers are exhausted for one class of service, other classes are not negatively impacted and continue to be processed. 
+ +@subsection default_packet_handling Default packet handling +There SHOULD be one \b odp_cos assigned to each port with the \c odp_cos_pktio_set() function, which will function as the default class-of-service for all packets received from an ingress port that do not match any of the filters defined subsequently. +At minimum this default class-of-service MUST have a queue and a buffer pool assigned to it on platforms that support multiple packet buffer pools. +Multiple odp_pktio instances (i.e., multiple ports) MAY each have their own default odp_cos, or MAY share an odp_cos with other ports, based on application requirements. + +@subsection packet_classification Packet Classification +For each odp_pktio port, the API allows the assignment of a class-of-service to a packet using one of three methods: + +-# The packet may be assigned a specific class-of-service based on its Layer-2 (802.1p/802.1Q VLAN tag) priority field. +Since the standard field defines 8 discrete priority levels, the API allows an odp_cos to be assigned to each of these priority levels with the \c odp_cos_with_l2_priority() function. + +-# Similarly, a class-of-service may be assigned using the Layer-3 (IP DiffServ) header field. +The application supplies an array of \e odp_cos values that covers the entire range of the standard protocol header field, where array elements do not need to contain unique values. +There is also a need to specify if Layer-3 priority takes precedence over Layer-2 priority in a packet with both headers present. + +-# Additionally, the application may also program a number of \e pattern \e matching \e rules that assign a class-of-service for packets with header fields matching specified values. +The field-matching rules take precedence over the previously described priority-based assignment of a class-of-service. 
+Using these matching rules the application should be able, for example, to identify all packets containing VoIP traffic based on the protocol being UDP and on specific destination or source port numbers, and appropriately assign these packets a class-of-service that maps to a higher priority queue, assuring voice packets a lower, bounded latency. + +@subsection scaling_and_flow Scaling and Flow Discrimination +In addition to classifying packets and routing them to those queues with the appropriate priority, and optionally limiting their memory consumption by designating certain classes of packets to specific buffer pools, the classifier API also facilitates the scaling of data-plane applications on multi-core systems by creating a mechanism to define which packet headers need to be combined to result in a value representing a specific packet flow. +The classifier generates a signature, which can be a checksum or hash of arbitrary strength that covers those packet header fields that are identified by the application as identifying flows. + +The \e flow \e signatures that result from hashing are then stored with the packet meta data (along with its class-of-service and its ingress \e odp_pktio port), and subsequently may be utilized by the implementation of a scheduler queue to maintain the order of packets with the same flow signature, while allowing packets with different signatures to be processed concurrently and independently on different processing cores. 
+ +@subsection packet_meta_data Packet meta data Elements +Here are the specific information elements that SHOULD be stored within the packet meta data structure: +- Protocol fields that are decoded and extracted by the parsing phase +- Flow-signature calculated from a prescribed collection of protocol fields +- The class-of-service identifier that is selected for the packet +- The ingress port identifier +- The result of packet validation, including an indication of the type of error detected, if any + +The ODP packet API module SHALL provide accessors for retrieving the above meta data fields from the container buffer in an implementation-independent manner. + +@section api_definitions API Definitions +@subsection data_types Data Types +The following data types are referenced in the API descriptions described below. +The names are part of the ODP API and MUST be present in any conforming implementation, however the type values shown here are illustrative and implementations SHOULD either use these or substitute their own type values that are appropriate to the underlying platform. + +@verbatim +/** + * 'odp_pktio_t' value to indicate any port + */ +#define ODP_PKTIO_ANY ((odp_pktio_t)~0) + + +/** + * 'odp_pktio_t' value to indicate an error + */ +#define ODP_PKTIO_INVALID ((odp_pktio_t)0) + + +/** + * Class of service instance type + */ +typedef uint32_t odp_cos_t; + + +/** + * flow signature type, only used for packet meta data field. + */ +typedef uint32_t odp_flowsig_t; + + +/** + * This value is returned from odp_cos_create() on failure, + * May also be used as a “sink” class of service that + * results in packets being discarded. 
+ */ +#define ODP_COS_INVALID ((odp_cos_t)~0) +@endverbatim + +@subsection cos_routines Class of Service Routines +Conforming ODP implementations MUST provide the following Classification APIs: +@subsubsection cos_create odp_cos_create +@verbatim +/** + * Create a class-of-service + * + * @param name is a string intended for debugging purposes. + * + * @return Class of service instance identifier, + * or ODP_COS_INVALID on error. + */ + +odp_cos_t odp_cos_create(const char *name); +@endverbatim + +This routine is used to create a class of service that can be the target of classifier rules. +The number of such classes supported is implementation-defined. +Attempts to create more than are supported by the implementation will result in an \c ODP_COS_INVALID return and errno being set to \c ODP_IMPLEMENTATION_LIMIT. + +@subsubsection cos_destroy odp_cos_destroy +@verbatim +/** + * Discard a class-of-service along with all its associated resources + * + * @param cos_id class-of-service instance. + * + * @return 0 on success, -1 on error. + */ + +int odp_cos_destroy(odp_cos_t cos_id); +@endverbatim + +This routine is the bracketing routine for odp_cos_create(). +It is used to destroy an existing CoS. +It is the caller’s responsibility to ensure that no active pattern matching rules refer to the CoS prior to calling this routine. +Results are unpredictable if this restriction is not met. +@subsubsection cos_set_queue odp_cos_set_queue +@verbatim +/** + * Assign a queue for a class-of-service + * + * @param cos_id class-of-service instance. + * + * @param queue_id is the identifier of a queue where all packets + * of this specific class of service will be enqueued. + * + * @return 0 on success, negative error code on failure. 
+ */ + +int odp_cos_set_queue(odp_cos_t cos_id, odp_queue_t queue_id); +@endverbatim + +This routine associates a target queue with a CoS such that all packets assigned to this CoS will be enqueued to the specified queue_id at the end of classification processing. +@subsubsection cos_set_queue_group odp_cos_set_queue_group +@verbatim +/** + * Assign a homogenous queue-group to a class-of-service. + * + * @param cos_id identifier of class-of-service instance + * @param queue_group_id identifier of the queue group to receive packets + * associated with this class of service. + * + * @return 0 on success, negative error code on failure. + */ + +int odp_cos_set_queue_group(odp_cos_t cos_id, odp_queue_group_t queue_group_id); +@endverbatim + +This routine associates a target queue group with a CoS such that all packets assigned to this CoS will be distributed to the specified queue_group_id at the end of classification processing. +@subsubsection cos_set_pool odp_cos_set_pool +@verbatim +/** + * Assign packet buffer pool for specific class-of-service + * + * @param cos_id class-of-service instance. + * @param pool_id is a buffer pool identifier where all packet buffers + * will be sourced to store packet that belong to this + * class of service. + * + * @return 0 on success negative error code on failure. + * + * + */ + +int odp_cos_set_pool(odp_cos_t cos_id, odp_buffer_pool_t pool_id); +@endverbatim + +This routine associates a target buffer pool with a CoS such that all packets assigned to this CoS will be stored in packet buffers allocated from the designated pool_id. + + +@subsection cos_drop_policy Class of Service Drop Policy Routines +These routines control how drop policies are to be observed for a given class of service. 
+@subsubsection drop_data_types Data types +~~~~~{.c} +enum odp_cos_drop_e { + ODP_COS_DROP_POOL, /**< Follow buffer pool drop policy */ + ODP_COS_DROP_NEVER, /**< Never drop, ignoring buffer pool policy */ +}; +typedef enum odp_cos_drop_e odp_drop_t; +~~~~~ + +@subsubsection cos_set_drop odp_cos_set_drop +@verbatim +/** + * Assign packet drop policy for specific class-of-service + * + * @param cos_id class-of-service instance. + * @param drop_policy is the desired packet drop policy for this class. + * + * @return 0 on success, negative error code on failure. + */ + +int odp_cos_set_drop(odp_cos_t cos_id, odp_drop_t drop_policy); +@endverbatim + +This routine sets the drop policy for a class of service. +It is an OPTIONAL routine. +If an implementation does not provide this function it MUST supply a definition of it that simply returns ODP_FUNCTION_NOT_AVAILABLE. +@subsubsection pktio_set_default_cos odp_pktio_set_default_cos +@verbatim +/** + * Set up per-port default class-of-service + * + * @param pktio_in ingress port identifier. + * @param default_cos class-of-service set to all packets arriving + * at the 'pktio_in' ingress port, unless overridden by subsequent + * header-based filters. + * + * @return 0 on success, negative error code on failure. + * + * + * @note This may replace the default queue per pktio. + */ + +int odp_pktio_set_default_cos(odp_pktio_t pktio_in, odp_cos_t default_cos); +@endverbatim + +This routine specifies a default class of service for a given pktio instance. +Incoming packets on the specified pktio are assigned to this class of service if no other pattern matching rule applies. +@subsubsection pktio_set_error_cos odp_pktio_set_error_cos +@verbatim +/** + * Set up per-port error class-of-service + * + * @param pktio_in ingress port identifier. + * @param error_cos class-of-service set to all packets arriving + * at the 'pktio_in' ingress port that contain an error. + * + * @return 0 on success, negative error code on failure.
+ */ + +int odp_pktio_set_error_cos(odp_pktio_t pktio_in, odp_cos_t error_cos); +@endverbatim + +This function assigns a class-of-service used to handle packets containing various types of errors. +The specific error types include L2 FCS and optionally L3/L4 checksum errors, malformed headers, etc., depending on platform capabilities. +The specified error_cos MAY simply discard these packets or deliver them via a queue to the application for further processing. +@subsubsection pktio_set_skip odp_pktio_set_skip +@verbatim +/** + * Set up per-port header offset + * + * @param pktio_in ingress port identifier. + * @param offset is the number of bytes the classifier must skip. + * + * @return Success or ODP_FUNCTION_NOT_AVAILABLE + */ + +int odp_pktio_set_skip(odp_pktio_t pktio_in, size_t offset); +@endverbatim + +This function applies to ports that carry additional headers preceding the standard Ethernet header. +Such headers are typically vendor-specific and thus the classifier is not required to parse them, but the size of a custom header is critical for the classifier to be able to parse the standard protocol headers that normally follow. +@subsubsection cos_set_headroom odp_cos_set_headroom +@verbatim +/** + * Specify per-CoS buffer headroom + * + * @param cos_id class-of-service instance. + * @param req_room number of bytes of space preceding packet data to reserve + * for use as headroom. Must not exceed the implementation + * defined ODP_PACKET_MAX_HEADROOM. + * + * @return Success or ODP_PARAMETER_ERROR, + * or ODP_FUNCTION_NOT_AVAILABLE + */ + +int odp_cos_set_headroom(odp_cos_t cos_id, size_t req_room); +@endverbatim + +This routine specifies the number of bytes of headroom that should be reserved for each packet assigned to this class of service. +Each implementation defines an ODP_PACKET_MAX_HEADROOM limit that sets an upper bound on the size of the headroom that can be reserved for a packet.
+@subsubsection cos_with_l2_priority odp_cos_with_l2_priority +@verbatim +/** + * Request to override per-port class of service + * based on Layer-2 priority field if present. + * + * @param pktio_in ingress port identifier. + * @param num_qos is the number of QoS levels, typically 8. + * @param qos_table are the values of the Layer-2 QoS header field. + * @param cos_table is the class-of-service assigned to each of the + * allowed Layer-2 QOS levels. + * @return 0 on success negative error code on failure. + */ + +int odp_cos_with_l2_priority(odp_pktio_t pktio_in, + size_t num_qos, + uint8_t qos_table[], /**< 'num_qos' elements */ + odp_cos_t cos_table[]); /**< 'num_qos' elements */ +@endverbatim + +This routine is used to assign classes of service based on the layer 2 (L2) priority associated with input packets received on the specified pktio_in. +For each of the values in qos_table[], the corresponding value in cos_table[] will be assigned. +@subsubsection cos_with_l3_dscp odp_cos_with_l3_dscp +@verbatim +/** + * + * @param pktio_in ingress port identifier. + * @param num_qos is the number of allowed Layer-3 QoS levels. + * @param qos_table are the values of the Layer-3 QoS header field. + * @param cos_table is the class-of-service assigned to each of the + * allowed Layer-3 QOS levels. + * @param l3_preference when true, Layer-3 QoS overrides L2 QoS when present. + * + * @return 0 on success negative error code on failure. + */ + +int odp_cos_with_l3_dscp(odp_pktio_t pktio_in, + size_t num_qos, + uint8_t qos_table[], /**< 'num_qos' elements */ + odp_cos_t cos_table[], /**< 'num_qos' elements */ + odp_bool_t l3_preference); +@endverbatim + +This OPTIONAL routine is used to assign classes of service based on the layer 3 (L3) Differentiated Services (DS) designation. +This is the DSCP field of an IPv4 header or the first six bits of the Traffic Class of an IPv6 header. +For each of the values in qos_table[], the corresponding value in cos_table[] will be assigned. 
+The l3_preference flag is used to control whether the CoS assigned by this routine takes precedence over the CoS assigned by odp_cos_with_l2_priority() in the event that both apply to the same packet. + +@subsection pmrs Pattern Matching Rules +While the above routines permit class of service assignments to be made based on static criteria, the real power of classification is the ability to identify flows based on the variable contents of packet headers. +To do this ODP provides support for defining pattern matching rules (PMRs) that operate based on values contained in specified header fields. + +Associated with PMRs are enums that are used to specify standard packet header fields: +@subsubsection cos_hdr_flow_fields odp_cos_hdr_flow_fields_e +@verbatim +/** + * Packet header field enumeration + * for fields that may be used to calculate + * the flow signature, if present in a packet. + */ + +enum odp_cos_hdr_flow_fields_e { + ODP_COS_FHDR_IN_PKTIO, /**< Ingress port number */ + ODP_COS_FHDR_L2_SAP, /**< Ethernet Source MAC address */ + ODP_COS_FHDR_L2_DAP, /**< Ethernet Destination MAC address */ + ODP_COS_FHDR_L2_VID, /**< Ethernet VLAN ID */ + ODP_COS_FHDR_L3_FLOW, /**< IPv6 flow_id */ + ODP_COS_FHDR_L3_SAP, /**< IP source address */ + ODP_COS_FHDR_L3_DAP, /**< IP destination address */ + ODP_COS_FHDR_L4_PROTO, /**< IP protocol (e.g. TCP/UDP/ICMP) */ + ODP_COS_FHDR_L4_SAP, /**< Transport source port */ + ODP_COS_FHDR_L4_DAP, /**< Transport destination port */ + ODP_COS_FHDR_IPSEC_SPI, /**< IPsec session identifier */ + ODP_COS_FHDR_LD_VNI, /**< NVGRE/VXLAN network identifier */ + ODP_COS_FHDR_USER /**< Application-specific header field(s) */ +}; +@endverbatim + +Conforming ODP implementations SHOULD implement efficient flow set management routines such as these: + +~~~~~{.c} +/** + * Set of header fields that take part in flow signature hash calculation: + * bit positions per 'odp_cos_hdr_flow_fields_e' enumeration.
+ */ +typedef uint16_t odp_cos_flow_set_t; + + +/** + * Set a member of the flow signature fields data set + */ +static inline odp_cos_flow_set_t +odp_cos_flow_set(odp_cos_flow_set_t set, + enum odp_cos_hdr_flow_fields_e field) +{ + return (odp_cos_flow_set_t)(set | (1U << field)); +} + + +/** + * Test a member of the flow signature fields data set + */ +static inline bool +odp_cos_flow_is_set(odp_cos_flow_set_t set, + enum odp_cos_hdr_flow_fields_e field) +{ + return (set & (1U << field)) != 0; +} +~~~~~ + +These routines are intended to be used in support of the following flow signature APIs: + +@subsubsection cos_class_flow_sig odp_cos_class_flow_signature +@verbatim +/** + * Set up the set of headers used to calculate a flow signature + * based on class-of-service. + * + * @param cos_id class of service instance identifier + * @param req_data_set requested data-set for flow signature calculation + * + * @return data-set that was successfully applied. An all-zeros data-set + * indicates a failure to assign any of the requested fields, or other + * error. + */ + +odp_cos_flow_set_t +odp_cos_class_flow_signature(odp_cos_t cos_id, + odp_cos_flow_set_t req_data_set); +@endverbatim + +This OPTIONAL routine associates a flow set with a class of service for flow signature calculation. + +@subsubsection cos_port_flow_sig odp_cos_port_flow_signature +@verbatim +/** + * Set up the set of headers used to calculate a flow signature + * based on ingress port. + * + * @param pktio_in ingress port identifier. + * @param req_data_set requested data-set for flow signature calculation + * + * @return data-set that was successfully applied. An all-zeros data-set + * indicates a failure to assign any of the requested fields, or other + * error. + */ + +odp_cos_flow_set_t +odp_cos_port_flow_signature(odp_pktio_t pktio_in, + odp_cos_flow_set_t req_data_set); +@endverbatim + +This routine associates a flow set with an input port for flow signature calculation.
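The flow set helpers and the return convention of the flow signature routines can be exercised as in the sketch below. It restates the enumeration and the two helpers from this section so that it compiles on its own; `sketch_five_tuple_set()` and `sketch_missing_fields()` are hypothetical names introduced for illustration and are not part of the ODP API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Restated from the enumeration in this section. */
enum odp_cos_hdr_flow_fields_e {
	ODP_COS_FHDR_IN_PKTIO,
	ODP_COS_FHDR_L2_SAP,
	ODP_COS_FHDR_L2_DAP,
	ODP_COS_FHDR_L2_VID,
	ODP_COS_FHDR_L3_FLOW,
	ODP_COS_FHDR_L3_SAP,
	ODP_COS_FHDR_L3_DAP,
	ODP_COS_FHDR_L4_PROTO,
	ODP_COS_FHDR_L4_SAP,
	ODP_COS_FHDR_L4_DAP,
	ODP_COS_FHDR_IPSEC_SPI,
	ODP_COS_FHDR_LD_VNI,
	ODP_COS_FHDR_USER
};

typedef uint16_t odp_cos_flow_set_t;

static inline odp_cos_flow_set_t
odp_cos_flow_set(odp_cos_flow_set_t set, enum odp_cos_hdr_flow_fields_e field)
{
	return (odp_cos_flow_set_t)(set | (1U << field));
}

static inline bool
odp_cos_flow_is_set(odp_cos_flow_set_t set, enum odp_cos_hdr_flow_fields_e field)
{
	return (set & (1U << field)) != 0;
}

/* Hypothetical helper: request the classic IP 5-tuple as the flow key. */
static odp_cos_flow_set_t sketch_five_tuple_set(void)
{
	odp_cos_flow_set_t set = 0;

	set = odp_cos_flow_set(set, ODP_COS_FHDR_L3_SAP);
	set = odp_cos_flow_set(set, ODP_COS_FHDR_L3_DAP);
	set = odp_cos_flow_set(set, ODP_COS_FHDR_L4_PROTO);
	set = odp_cos_flow_set(set, ODP_COS_FHDR_L4_SAP);
	set = odp_cos_flow_set(set, ODP_COS_FHDR_L4_DAP);
	return set;
}

/*
 * Hypothetical helper: given the requested set and the set returned by a
 * flow signature routine, report the fields the platform did not apply.
 */
static odp_cos_flow_set_t
sketch_missing_fields(odp_cos_flow_set_t requested, odp_cos_flow_set_t applied)
{
	return (odp_cos_flow_set_t)(requested & ~applied);
}
```

An application would pass the result of `sketch_five_tuple_set()` as `req_data_set` and compare it with the returned data set; a non-zero result from `sketch_missing_fields()` indicates that the platform hashes over fewer fields than requested.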
+ +@subsection pmr_routines Pattern Matching Rules Routines +The following data structures SHOULD be implemented to support the definition of pattern matching routines by conforming ODP implementations: + +~~~~~{.c} +/** + * PMR - Packet Matching Rule + * Up to 32 bits of ternary matching of one of the available header fields + */ + + +#define ODP_PMR_INVAL ((odp_pmr_t)NULL) +typedef struct odp_pmr_s *odp_pmr_t; +~~~~~ + +@subsection terms Terms +Terms are the elements of a PMR and are identified by the following enum: + +@verbatim +enum odp_pmr_term_e { + ODP_PMR_LEN, /**< Total length of received packet */ + ODP_PMR_ETHTYPE_0, /**< Initial (outer) Ethertype only (*val=uint16_t) */ + ODP_PMR_ETHTYPE_X, /**< Ethertype of innermost VLAN tag (*val=uint16_t) */ + ODP_PMR_VLAN_ID_0, /**< First VLAN ID (outer) (*val=uint16_t) */ + ODP_PMR_VLAN_ID_X, /**< Last VLAN ID (inner) (*val=uint16_t) */ + ODP_PMR_DMAC, /**< Destination MAC address (*val=uint64_t) */ + ODP_PMR_IPPROTO, /**< IP Protocol or IPv6 Next Header (*val=uint8_t) */ + ODP_PMR_UDP_DPORT, /**< Destination UDP port, implies IPPROTO=17 */ + ODP_PMR_TCP_DPORT, /**< Destination TCP port, implies IPPROTO=6 */ + ODP_PMR_UDP_SPORT, /**< Source UDP port (*val=uint16_t) */ + ODP_PMR_TCP_SPORT, /**< Source TCP port (*val=uint16_t) */ + ODP_PMR_SIP_ADDR, /**< Source IP address (uint32_t) */ + ODP_PMR_DIP_ADDR, /**< Destination IP address (uint32_t) */ + ODP_PMR_SIP6_ADDR, /**< Source IPv6 address (uint8_t[16]) */ + ODP_PMR_DIP6_ADDR, /**< Destination IPv6 address (uint8_t[16]) */ + ODP_PMR_IPSEC_SPI, /**< IPsec session identifier (*val=uint32_t) */ + ODP_PMR_LD_VNI, /**< NVGRE/VXLAN network identifier (*val=uint32_t) */ + + + /** Inner header may repeat above values with this offset */ + ODP_PMR_INNER_HDR_OFF=32 +}; +@endverbatim + +@subsubsection tunnel_considerations Tunnel Considerations +Note that PMRs may be extended to support tunnels and tenants (NVGRE, VXLAN) via the ODP_PMR_INNER_HDR_OFF enum.
+This enum is intended to be used as an “adder” to a PMR to indicate that the term refers to an inner header. +For example, the term ODP_PMR_DMAC would refer to the destination MAC address of the packet if the packet is not a tunnel, or of the outer header (the tunnel) if the packet is a tunnel. +To refer to the inner (tenant) destination MAC, the term would be specified as ODP_PMR_INNER_HDR_OFF+ODP_PMR_DMAC. + +@subsection pmr_apis PMR APIs +The following APIs are provided to enable an ODP application to specify PMRs as a series of individual or cascaded terms: +@subsubsection pmr_create_match odp_pmr_create_match +@verbatim +/** + * Create a packet match rule with mask and value + * + * @param term is one value of the enumerated values supported + * @param val is the value to match against the packet header + * in native byte order. + * @param mask is the mask to indicate which bits of the header + * should be matched ('1') and which should be ignored ('0') + * @param val_sz size of the ‘val’ and ‘mask’ arguments, + * that must match the value size requirement of the + * specific ‘term’. + * + * @return a handle of the matching rule or ODP_PMR_INVAL on error + */ + +odp_pmr_t odp_pmr_create_match(enum odp_pmr_term_e term, + const void *val, const void *mask, size_t val_sz); +@endverbatim + +This routine creates a PMR that matches a single value to a term. + +@subsubsection pmr_create_range odp_pmr_create_range +@verbatim +/** + * Create a packet match rule with value range + * + * @param term is one value of the enumerated values supported + * @param val1 is the lower bound of the header field range. + * @param val2 is the upper bound of the header field range. + * @param val_sz size of the ‘val1’ and ‘val2’ arguments, + * that must match the value size requirement of the + * specific ‘term’. + * + * @return a handle of the matching rule or ODP_PMR_INVAL on error + * @note: Range is inclusive [val1..val2]. 
+ */ + +odp_pmr_t odp_pmr_create_range(enum odp_pmr_term_e term, + const void *val1, const void *val2, size_t val_sz); +@endverbatim + +This routine creates a PMR that matches an inclusive range of values to a term. + +@subsubsection pmr_destroy odp_pmr_destroy +@verbatim +/** + * Invalidate a packet match rule and vacate its resources + * + * @param pmr_id the identifier of the PMR to be destroyed + * + * @return Success or ODP_PMR_INVALID if the specified pmr_id is not found. + */ + +int odp_pmr_destroy(odp_pmr_t pmr_id); +@endverbatim + +This routine destroys a previously created PMR. +If the PMR is currently associated with an active class of service, it is unpredictable at which point in the packet flow the match defined by the PMR is deactivated. +However, implementations MUST ensure that a PMR is either matched or not matched in its entirety such that dynamic changes to PMRs do not result in partial matches. + +@subsubsection pktio_pmr_cos odp_pktio_pmr_cos +@verbatim +/** + * Apply a PMR to a pktio to assign a CoS. + * + * @param pmr_id the id of the PMR to be activated + * @param src_pktio the pktio to which this PMR is to be applied + * @param dst_cos the CoS to be assigned by this PMR + * + * @return Success or ODP_PARAMETER_ERROR + */ + +int odp_pktio_pmr_cos(odp_pmr_t pmr_id, odp_pktio_t src_pktio, odp_cos_t dst_cos); +@endverbatim + +This routine links a pktio to a corresponding class of service via a specified PMR. +Any packet received on the specified src_pktio that matches the specified pmr_id will be assigned to the specified dst_cos. +If multiple PMRs match, the implementation MAY define an inherent precedence or it MAY be unpredictable as to which PMR will determine the assigned CoS. +For this reason applications SHOULD NOT be written to use conflicting or ambiguous PMR definitions. + +@subsubsection cos_pmr_cos odp_cos_pmr_cos +@verbatim +/** + * Cascade a PMR to refine packets from one CoS to another.
+ * + * @param pmr_id the id of the PMR to be activated + * @param src_cos the id of the CoS to be filtered + * @param dst_cos the id of the CoS to be assigned to packets filtered + * from src_cos that match pmr_id. + * + * @return Success or ODP_PARAMETER_ERROR if an input is in error + * or ODP_IMPLEMENTATION_LIMIT if cascade depth is exceeded + */ + +int odp_cos_pmr_cos(odp_pmr_t pmr_id, odp_cos_t src_cos, odp_cos_t dst_cos); +@endverbatim + +This routine is used to cascade PMRs by passing packets assigned to the src_cos through another PMR. +Those matching are reassigned to the specified dst_cos. +Note that this process can be repeated to an implementation-defined maximum supported cascade depth. +When cascades are defined, the actual class of service assigned to a packet is the result of the longest chain of PMRs that can be matched against the packet. + +For example, suppose the following sequence of PMRs is in effect: + +@verbatim +odp_pktio_pmr_cos(pmr_idA, pktio_id, cos_idA); +odp_cos_pmr_cos(pmr_idB, cos_idA, cos_idB); +odp_cos_pmr_cos(pmr_idC, cos_idB, cos_idC); +odp_cos_pmr_cos(pmr_idD, cos_idC, cos_idD); +@endverbatim + +If a packet arrives on pktio_id that matches pmr_idA it is assigned to cos_idA. +But since it is now on cos_idA it is further filtered by pmr_idB and, if it matches, is reassigned to cos_idB. +This process continues until no further, more specific match is found; the last CoS assigned is the final CoS that the packet receives. + +Note that, given this rule set, a packet that matches pmr_idA and pmr_idC but not pmr_idB is assigned to cos_idA, because the rule that can assign packets to cos_idC applies only to packets already assigned to cos_idB, not cos_idA. + +Using cascaded PMRs it is possible to build quite sophisticated filters (up to the implementation limits supported by a given platform).
+For example, one could add additional rules to the above set: + +@verbatim +odp_cos_pmr_cos(pmr_idAC, cos_idA, cos_idC); +odp_cos_pmr_cos(pmr_idAD, cos_idA, cos_idD); +@endverbatim + +These rules cover cases where some packets on cos_idA should be further sorted to cos_idB while others should be sorted directly to cos_idC or cos_idD. +Again it is the application’s responsibility to ensure that the cascades remain unambiguous and that loops are avoided (e.g., having rules that bounce packets between cos_idA and cos_idB endlessly). + +@subsection pmr_stats PMR Statistics +Conforming ODP implementations SHOULD maintain statistics regarding PMRs and provide the following routines for retrieving them: + +@subsubsection pmr_match_count odp_pmr_match_count +@verbatim +/** + * Retrieve packet matcher statistics + * + * @param pmr_id the id of the PMR from which to retrieve the count + * + * @return The current number of matches for a given matcher instance. + */ + +signed long odp_pmr_match_count(odp_pmr_t pmr_id); +@endverbatim + +@subsubsection pmr_terms_cap odp_pmr_terms_cap +@verbatim +/** + * Inquire about matching terms supported by the classifier + * + * @return A mask with one bit per enumerated term of odp_pmr_term_e + */ + +unsigned long long odp_pmr_terms_cap(void); +@endverbatim + +@subsubsection pmr_terms_avail odp_pmr_terms_avail +@verbatim +/** + * Return the number of packet matching terms available for use + * + * @return The number of packet matcher resources available for use. + */ + +unsigned odp_pmr_terms_avail(void); +@endverbatim + +@subsection pmr_composite_rules Pattern Matching Composite Routines +As a shorthand, applications MAY express pattern matching rules using a table rather than constructing them term-by-term. +ODP implementations MUST support both methods of rule specification but MAY have implementation-specific restrictions on the complexity of table-based rules they support.
+Note that some implementations MAY be able to implement tables directly while others MAY choose to implement tables by internally generating the equivalent set of term generating calls. + +@subsubsection pmr_table_structure PMR Table Structure +@verbatim +/** + * Following structure is used to define composite packet matching rules + * in the form of an array of individual match or range rules. + * The underlying platform may not support all or any specific combination + * of value match or range rules, and the application should take care + * of inspecting the return value when installing such rules, and perform + * appropriate fallback action. + */ + +typedef struct odp_pmr_match_t { + enum odp_pmr_match_type_e { + ODP_PMR_MASK, /**< Match a masked set of bits */ + ODP_PMR_RANGE, /**< Match an integer range */ + } match_type; + union { + struct { + enum odp_pmr_term_e term; + const void *val; + const void *mask; + unsigned int val_sz; + } mask; /**< Match a masked set of bits */ + struct { + enum odp_pmr_term_e term; + const void *val1; + const void *val2; + unsigned int val_sz; + } range; /**< Match an integer range */ + }; +} odp_pmr_match_t; + + +/** An opaque handle to a composite packet match rule-set */ +typedef struct odp_pmr_set_s *odp_pmr_set_t; +@endverbatim; + +The above structure is used with the following APIs to implement table-based PMRs: + +@subsubsection pmr_match_set_create odp_pmr_match_set_create +@verbatim +/** + * Create a composite packet match rule + * + * @param num_terms is the number of terms in the match rule. + * @param terms is an array of num_terms entries, one entry per + * term desired. + * @param dst_cos is the class-of-service to be assigned to packets + * that match the compound rule-set, or a subset thereof, + * if partly applied. + * @param pmr_set_id is the returned handle to the composite rule set. 
+ * + * @return The return value may be a negative number indicating a general + * error, or a positive number indicating the number of ‘terms’ elements that + * have been successfully mapped to the underlying platform classification engine, + * and may be in the range from 1 to ‘num_terms’. + */ + +int odp_pmr_match_set_create(int num_terms, odp_pmr_match_t *terms, + odp_pmr_set_t *pmr_set_id); +@endverbatim + +This routine is used to create a PMR match set. + It is the equivalent to a cascade of PMRs except that there are no “intermediate” classes of service defined. +Instead, the entire match set either matches or does not match as a single entity. + +@subsubsection pmr_match_set_destroy odp_pmr_match_set_destroy +@verbatim +/** + * Function to delete a composite packet match rule set + * + * All of the resources pertaining to the match set associated with the + * class-of-service will be released, but the class-of-service will + * remain intact. + * + * @param pmr_set_id a composite rule-set handle returned when created. + * + * @note Depending on the implementation details, destroying a rule-set + * may not guarantee the availability of hardware resources to create the + * same or essentially similar rule-set. + */ + +int odp_pmr_match_set_destroy(odp_pmr_set_t pmr_set_id); +@endverbatim + +This routine destroys a PMR match set previously created by odp_pmr_match_set_create(). + +@subsubsection pktio_pmr_match_set_cos odp_pktio_pmr_match_set_cos +@verbatim +/** + * Apply a PMR Match Set to a pktio to assign a CoS. 
+ * + * @param pmr_set_id the id of the PMR match set to be activated + * @param src_pktio the pktio to which this PMR match set is to be applied + * @param dst_cos the CoS to be assigned by this PMR match set + * + * @return Success or ODP_PARAMETER_ERROR + */ + +int odp_pktio_pmr_match_set_cos(odp_pmr_set_t pmr_set_id, odp_pktio_t src_pktio, + odp_cos_t dst_cos); +@endverbatim + +This routine is the same as odp_pktio_pmr_cos() except that it operates on PMR match sets rather than individual PMRs. + +@section items_pending Items pending resolution +- Revise ‘odp_packet_io.h’ API with respect to the default input queue per ‘pktio’ instance. +- Revise ‘odp_queue.h’ API to support an arbitrary priority range, typically 8 priority levels, with platform-specific numeric priority values. +- Add specific packet meta data fields to the packet buffer, containing all meta data fields parsed and generated by the classifier, for later application use. + +@section implementation_notes Implementation Notes +The following sections are not part of the specification, but shed light on the intent of the specification in several areas, describing some specific implementation approaches to these aspects. + +@subsection supporting_multi_pools Supporting multiple buffer pools +The support of multiple buffer pools for containing packet buffers is optional, and may not be supported by some platforms. +The importance of this feature stems from the need to protect a networking application in the event of congestion or an attempted denial of service attack. +Separating different classes of service into dedicated buffer pools allows the system to limit the memory resources that may be consumed by a particular type of traffic, thereby reserving buffer resources for other classes of traffic.
+ +In a software implementation, a packet would already be stored in memory when the classifier is invoked, and so it seems the classifier is unable to insert itself into the process of selecting a buffer pool. +Having the classifier copy the packet into a new buffer allocated from a different pool is not a desirable solution, because of the per-packet cost of the copy. + +The recommended solution is to implement buffer pools in the form of buffer counters, while the actual buffers all belong to a single free list when not used to store a packet. +In such an implementation, the classifier will be able to associate a packet already occupying a buffer with a pool other than the default by incrementing the buffer counter of the newly selected pool and decrementing the counter representing the default pool. +If, however, the selected pool counter has already reached a certain limit, the classifier would be able to, e.g., discard the packet instead of incrementing the destination pool counter, and thereby enforce the desirable semantics of distinct buffer pools per class of service. + +Other possible actions that may be taken in response to running out of buffers or running low on buffers include back-pressure and random-early-detect with a discard probability inversely proportional to the number of free buffers in a pool. + +A related implementation topic is the ability to begin dropping some packets before a buffer pool is entirely exhausted. +This is typically referred to as Random Early Detection (or “RED”). +This is deemed to be a feature of the buffer pool implementation on a given platform, where in addition to a hard limit on the number of buffers that can be allocated to a pool, there can also be an option to discard packets with a probability that increases as the number of outstanding buffers approaches that hard limit.
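The counter-based pool scheme and the early-drop behavior described above can be sketched as follows. All names here (`sketch_pool`, `sketch_pool_move()`, `sketch_red_drop_percent()`) are hypothetical, and the linear probability ramp is only one possible choice; a real platform would tie this logic to its own buffer pool implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical counter-based pool: buffers are only counted, not owned. */
struct sketch_pool {
	uint32_t in_use;	/* buffers currently charged to this pool */
	uint32_t limit;		/* hard limit on charged buffers */
	uint32_t red_start;	/* occupancy at which early drop begins */
};

/*
 * Re-charge a packet's buffer from pool 'from' to pool 'to'.
 * Returns false when 'to' is already at its limit; the caller would
 * then discard the packet instead of moving it.
 */
static bool sketch_pool_move(struct sketch_pool *from, struct sketch_pool *to)
{
	assert(from != NULL && to != NULL && from->in_use > 0);
	if (to->in_use >= to->limit)
		return false;
	to->in_use++;
	from->in_use--;
	return true;
}

/*
 * Drop probability in percent: 0 up to red_start, then rising
 * linearly to 100 at the hard limit.
 */
static uint32_t sketch_red_drop_percent(const struct sketch_pool *pool)
{
	assert(pool != NULL);
	if (pool->in_use <= pool->red_start)
		return 0;
	if (pool->in_use >= pool->limit)
		return 100;
	return (uint32_t)((uint64_t)(pool->in_use - pool->red_start) * 100u /
			  (pool->limit - pool->red_start));
}
```

Because only counters change, no packet data is copied when a packet is associated with a different pool, which is the point of the recommended approach.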
+ +@subsection resolving_gaps Resolving gaps between the API and hardware capabilities +On platforms that support hardware packet accelerators, it is possible that the packet parsing and classification functionality is sufficient to address only a portion of the functionality specified within this document. +This gap may potentially be bridged by augmenting the hardware classification capabilities with software logic implemented as part of the platform. +In that case, the platform will have to carve out a fraction of the processing resources and dedicate those to the software classification logic, which would be invoked for packets that the hardware platform was unable to classify completely. +At the time of this writing, however, it is believed that the performance penalty incurred as a result of software augmentation is unjustified for most applications, i.e., it is preferable to lose some precision of packet prioritization while maintaining full hardware packet processing speed. + +@subsection loopback_case The case for loopback ports, and some of their uses +In some applications, it may be desirable to be able to run a single packet through the classifier more than once. +For example, an encrypted IPsec packet is received from a physical port. +The encrypted packet is assigned a class of service based on its outer unencrypted header fields. +Later, processing the packet entails decrypting the payload of the packet, authenticating it, and removing the original outer headers, which reveals a new set of protocol headers that need to be used to re-classify the packet and assign it a new priority and buffer pool. +An elegant solution for this use case would be to take advantage of “loopback” logical ports that may be implemented in certain platforms, by transmitting the decapsulated packet into a loopback port.
+The same packet is then received from the loopback port and is examined by the classifier in accordance with the rules assigned to the loopback odp_pktio logical port instance. +A similar mechanism may be applied to tunnel termination processing, fragment reassembly, and the like. + +@section related_topics Related Topics +The following section discusses aspects of the ODP API that are not integral to the classifier, which only applies to ingress preprocessing. +This section covers miscellaneous aspects of the API that need to be addressed, and are related to packet buffer processing and egress post-processing. +Additional packet buffer manipulation APIs +The need for the following calls is made evident by the need to change a packet's encapsulation, i.e., remove some headers and add others, thereby changing the size of the headers of a packet during processing. + +@subsection initial_headroom Configuring initial packet buffer headroom +The following function is provided to configure the pktio receive mechanism to (optionally) reserve some headroom between the start of the first buffer and the first byte of packet data, which could subsequently be used to increase the header size “in-place”, without allocating additional gather list elements. +If the request is granted, at least req_bytes bytes will be reserved in front of the packet data: +@verbatim +int odp_pktio_set_headroom(odp_pktio_t port_id, unsigned req_bytes); +@endverbatim +The return value should be negative if the request cannot be satisfied, or positive otherwise, indicating the actual minimum headroom reserved. +Note that the implementation may reserve more than the requested amount of headroom, and hence on platforms that are unable to support per-port (or per-CoS) headroom configuration, a system-wide headroom configuration may be set to the largest of all such requests, and thus satisfy the requirement.
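The system-wide fallback mentioned in the note above could look roughly like the following sketch. All names here are hypothetical stand-ins: `sketch_pktio_t` substitutes for odp_pktio_t, `sketch_pktio_set_headroom()` mirrors the proposed signature without being the ODP implementation, and `SKETCH_MAX_HEADROOM` is an assumed platform limit.

```c
#include <assert.h>

/* Hypothetical stand-in for odp_pktio_t; not the real ODP type. */
typedef int sketch_pktio_t;

/* Assumed platform limit on headroom, chosen only for illustration. */
#define SKETCH_MAX_HEADROOM 256u

/* Single system-wide headroom: the largest request granted so far. */
static unsigned sys_headroom;

/*
 * Sketch of a platform that cannot configure headroom per port.
 * Every granted request raises the one system-wide value if needed,
 * so each port receives at least the headroom it asked for.
 * Returns a negative value if the request cannot be satisfied,
 * otherwise the headroom actually reserved (0 only if no non-zero
 * headroom has ever been requested).
 */
static int sketch_pktio_set_headroom(sketch_pktio_t port_id,
                                     unsigned req_bytes)
{
    (void)port_id; /* this platform cannot distinguish between ports */

    if (req_bytes > SKETCH_MAX_HEADROOM)
        return -1;

    if (req_bytes > sys_headroom)
        sys_headroom = req_bytes;

    assert(sys_headroom <= SKETCH_MAX_HEADROOM);
    return (int)sys_headroom;
}
```

A caller that requests 32 bytes after another port has already been granted 64 would therefore see 64 returned, which is consistent with the statement that the implementation may reserve more than requested.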
+ +In addition to the above per-port headroom configuration call, there should be an optional, per-CoS call that allows the reservation of different amounts of packet buffer headroom for packets that match certain criteria: for example, such a call allows the application to request that only packets that are expected to be encapsulated in a tunnel be given a large amount of headroom, while packets that are received from a tunnel and are IP fragments be assigned a different headroom requirement (see the definition of odp_cos_set_headroom() above). + +@subsection open_issues Open Issues +- Egress packet scheduling, prioritization, and ordering. +- Relative precedence of parallel matching rules. +- Specify application-defined header field declaration APIs. +- Review RFC 4301 for match requirements for IPsec SA, and consider the use of L4 port ranges instead of or in addition to value & mask matching criteria. +- Consider which types of packet checks should route a packet through the error CoS: L2 is a safe choice, but L3/L4 checksum or other exceptions deserve consideration. + +@subsection usage_examples Usage Examples +Following is a simple sample configuration using the API elements described above. +TBD. + +*/