









P4 is a high-level language for programming protocol-independent packet processors. P4 works in conjunction with SDN control protocols like OpenFlow. In its current form, OpenFlow explicitly specifies protocol headers on which it operates. This set has grown from 12 to 41 fields in a few years, increasing the complexity of the specification while still not providing the flexibility to add new headers. In this paper we propose P4 as a strawman proposal for how OpenFlow should evolve in the future. We have three goals: (1) Reconfigurability in the field: Programmers should be able to change the way switches process packets once they are deployed. (2) Protocol independence: Switches should not be tied to any specific network protocols. (3) Target independence: Programmers should be able to describe packet-processing functionality independently of the specifics of the underlying hardware. As an example, we describe how to use P4 to configure a switch to add a new hierarchical label.
Software-Defined Networking (SDN) gives operators programmatic control over their networks. In SDN, the control plane is physically separate from the forwarding plane, and one control plane controls multiple forwarding devices. While forwarding devices could be programmed in many ways, having a common, open, vendor-agnostic interface (like OpenFlow) enables a control plane to control forwarding devices from different hardware and software vendors.
Version   Date       Header Fields
OF 1.0    Dec 2009   12 fields (Ethernet, TCP/IPv4)
OF 1.1    Feb 2011   15 fields (MPLS, inter-table metadata)
OF 1.2    Dec 2011   36 fields (ARP, ICMP, IPv6, etc.)
OF 1.3    Jun 2012   40 fields
OF 1.4    Oct 2013   41 fields
Table 1: Fields recognized by the OpenFlow standard
The OpenFlow interface started simple, with the abstraction of a single table of rules that could match packets on a dozen header fields (e.g., MAC addresses, IP addresses, protocol, TCP/UDP port numbers, etc.). Over the past five years, the specification has grown increasingly complicated (see Table 1), with many more header fields and multiple stages of rule tables, to allow switches to expose more of their capabilities to the controller.

The proliferation of new header fields shows no signs of stopping. For example, data-center network operators increasingly want to apply new forms of packet encapsulation (e.g., NVGRE, VXLAN, and STT), for which they resort to deploying software switches that are easier to extend with new functionality. Rather than repeatedly extending the OpenFlow specification, we argue that future switches should support flexible mechanisms for parsing packets and matching header fields, allowing controller applications to leverage these capabilities through a common, open interface (i.e., a new “OpenFlow 2.0” API). Such a general, extensible approach would be simpler, more elegant, and more future-proof than today’s OpenFlow 1.x standard.
Figure 1: P4 is a language to configure switches.
Recent chip designs demonstrate that such flexibility can be achieved in custom ASICs at terabit speeds [1, 2, 3]. Programming this new generation of switch chips is far from easy. Each chip has its own low-level interface, akin to microcode programming. In this paper, we sketch the design of a higher-level language for Programming Protocol-independent Packet Processors (P4). Figure 1 shows the relationship between P4 (used to configure a switch, telling it how packets are to be processed) and existing APIs (such as OpenFlow) that are designed to populate the forwarding tables in fixed-function switches. P4 raises the level of abstraction for programming the network, and can serve as a general interface between the controller and the switches. That is, we believe that future generations of OpenFlow should allow the controller to tell the switch how to operate, rather than be constrained by a fixed switch design.

The key challenge is to find a “sweet spot” that balances the need for expressiveness with the ease of implementation across a wide range of hardware and software switches. In designing P4, we have three main goals: reconfigurability in the field, protocol independence, and target independence.
In our abstract model (Fig. 2), switches forward packets via a programmable parser followed by multiple stages of match+action, arranged in series, parallel, or a combination of both. Derived from OpenFlow, our model makes three generalizations. First, OpenFlow assumes a fixed parser, whereas our model supports a programmable parser to allow new headers to be defined. Second, OpenFlow assumes the match+action stages are in series, whereas in our model they can be in parallel or in series. Third, our model assumes that actions are composed from protocol-independent primitives supported by the switch.

Our abstract model generalizes how packets are processed in different forwarding devices (e.g., Ethernet switches, load-balancers, routers) and by different technologies (e.g., fixed-function switch ASICs, NPUs, reconfigurable switches, software switches, FPGAs). This allows us to devise a common language (P4) to represent how packets are processed in terms of our common abstract model. Hence, programmers can create target-independent programs that a compiler can map to a variety of different forwarding devices, ranging from relatively slow software switches to the fastest ASIC-based switches.
Figure 2: The abstract forwarding model.
The forwarding model is controlled by two types of operations: Configure and Populate. Configure operations program the parser, set the order of match+action stages, and specify the header fields processed by each stage. Configuration determines which protocols are supported and how the switch may process packets. Populate operations add (and remove) entries to the match+action tables that were specified during configuration. Population determines the policy applied to packets at any given time.

For the purposes of this paper, we assume that configuration and population are two distinct phases. In particular, the switch need not process packets during configuration. However, we expect implementations will allow packet processing during partial or full reconfiguration, enabling upgrades with no downtime. Our model deliberately allows for, and encourages, reconfiguration that does not interrupt forwarding.

Clearly, the configuration phase has little meaning in fixed-function ASIC switches; for this type of switch, the com-
and then maps the TDG to a specific switch target. P4 is designed to make it easy to translate a P4 program into a TDG.

In summary, P4 can be considered a sweet spot between the generality of, say, Click (which makes it difficult to infer dependencies and map to hardware) and the inflexibility of OpenFlow 1.0 (which makes it impossible to reconfigure protocol processing).
We explore P4 by examining a simple example in depth. Many network deployments differentiate between an edge and a core; end-hosts are directly connected to edge devices, which are in turn interconnected by a high-bandwidth core. Entire protocols have been designed to support this architecture (such as MPLS [11] and PortLand [12]), aimed primarily at simplifying forwarding in the core.

Consider an example L2 network deployment with top-of-rack (ToR) switches at the edge connected by a two-tier core. We will assume the number of end-hosts is growing and the core L2 tables are overflowing. MPLS is an option to simplify the core, but implementing a label distribution protocol with multiple tags is a daunting task. PortLand looks interesting but requires rewriting MAC addresses (possibly breaking existing network debugging tools) and requires new agents to respond to ARP requests.

P4 lets us express a custom solution with minimal changes to the network architecture. We call our toy example mTag: it combines the hierarchical routing of PortLand with simple MPLS-like tags. The routes through the core are encoded by a 32-bit tag composed of four single-byte fields. The 32-bit tag can carry a “source route” or a destination locator (like PortLand’s Pseudo MAC). Each core switch need only examine one byte of the tag and switch on that information. In our example, the tag is added by the first ToR switch, although it could also be added by the end-host NIC.

The mTag example is intentionally very simple to focus our attention on the P4 language. The P4 program for an entire switch would be many times more complex in practice.
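The tag layout described above can be illustrated with a short Python sketch. It is not part of the paper: the function names are hypothetical, and the byte order (up1, up2, down1, down2, following the field names used in the mTag header declaration in §4.2) is an assumption.

```python
import struct


def pack_mtag(up1: int, up2: int, down1: int, down2: int) -> bytes:
    """Pack four single-byte routing fields into one 32-bit tag.

    struct raises struct.error if any value does not fit in one byte.
    """
    return struct.pack("!BBBB", up1, up2, down1, down2)


def byte_for_switch(tag: bytes, position: int) -> int:
    """Return the one byte that a core switch at `position` examines.

    `position` (0..3) is a hypothetical stand-in for the switch's place
    in the hierarchy combined with the direction of travel.
    """
    return tag[position]


tag = pack_mtag(up1=1, up2=2, down1=3, down2=4)
next_hop_selector = byte_for_switch(tag, 2)  # a switch on the way down
```

Each core switch therefore needs only a table with at most 256 entries for the byte it inspects, which is why the core tables stay small in this scheme.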
4.1 P4 Concepts
A P4 program contains definitions of the following key components:
4.2 Header Formats

A design begins with the specification of header formats. Several domain-specific languages have been proposed for this [13, 14, 15]; P4 borrows a number of ideas from them. In general, each header is specified by declaring an ordered list of field names together with their widths. Optional field annotations allow constraints on value ranges or maximum lengths for variable-sized fields. For example, standard Ethernet and VLAN headers are specified as follows:

header ethernet {
    fields {
        dst_addr : 48; // width in bits
        src_addr : 48;
        ethertype : 16;
    }
}

header vlan {
    fields {
        pcp : 3;
        cfi : 1;
        vid : 12;
        ethertype : 16;
    }
}
The mTag header can be added without altering existing declarations. The field names indicate that the core has two layers of aggregation. Each core switch is programmed with rules to examine one of these bytes determined by its location in the hierarchy and the direction of travel (up or down).
header mTag {
    fields {
        up1 : 8;
        up2 : 8;
        down1 : 8;
        down2 : 8;
        ethertype : 16;
    }
}
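To make the "ordered list of field names together with their widths" concrete, here is a minimal Python sketch that extracts fields from raw bytes using only such a list. The `extract` helper and the list-of-tuples representation are illustrative assumptions, not how a P4 compiler or target actually represents headers.

```python
# (field name, width in bits), in declaration order, mirroring the
# header declarations in this section.
ETHERNET = [("dst_addr", 48), ("src_addr", 48), ("ethertype", 16)]
VLAN = [("pcp", 3), ("cfi", 1), ("vid", 12), ("ethertype", 16)]
MTAG = [("up1", 8), ("up2", 8), ("down1", 8), ("down2", 8), ("ethertype", 16)]


def header_bits(fields):
    """Total width of a header in bits."""
    return sum(width for _, width in fields)


def extract(fields, data, bit_offset=0):
    """Read each field from `data` (big-endian), starting at `bit_offset`.

    Returns (dict of field values, bit offset just past the header).
    """
    value = int.from_bytes(data, "big")
    total_bits = len(data) * 8
    result = {}
    for name, width in fields:
        shift = total_bits - bit_offset - width
        if shift < 0:
            raise ValueError("data too short for header")
        result[name] = (value >> shift) & ((1 << width) - 1)
        bit_offset += width
    return result, bit_offset


# A VLAN header with pcp=5, cfi=0, vid=100, followed by ethertype 0x0800.
vlan_fields, end = extract(VLAN, bytes.fromhex("a0640800"))
```

Because the extraction logic is driven entirely by the field list, adding the mTag header requires no change to `extract`, which mirrors the point that new headers can be added without altering existing declarations.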
4.3 The Packet Parser

P4 assumes the underlying switch can implement a state machine that traverses packet headers from start to finish, extracting field values as it goes. The extracted field values are sent to the match+action tables for processing.

P4 describes this state machine directly as the set of transitions from one header to the next. Each transition may be triggered by values in the current header. For example, we describe the mTag state machine as follows.

parser start {
    ethernet;
}

parser ethernet {
    switch(ethertype) {
        case 0x8100: vlan;
        case 0x9100: vlan;
        case 0x800: ipv4;
        // Other cases
    }
}

parser vlan {
    switch(ethertype) {
        case 0xaaaa: mTag;
        case 0x800: ipv4;
        // Other cases
    }
}

parser mTag {
    switch(ethertype) {
        case 0x800: ipv4;
        // Other cases
    }
}
Parsing starts in the start state and proceeds until an explicit stop state is reached or an unhandled case is encountered (which may be marked as an error). Upon reaching a state for a new header, the state machine extracts the header using its specification and proceeds to identify its next transition. The extracted headers are forwarded to match+action processing in the back half of the switch pipeline.

The parser for mTag is very simple: it has only four states. Parsers in real networks require many more states; for example, the parser defined by Gibb et al. [16, Figure 3(e)] expands to over one hundred states.
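The parser above can be mimicked in a few lines of Python to show how the transitions drive header extraction. This is a sketch under stated assumptions: the `parse` function and both dictionaries are hypothetical, `ipv4` is treated as a terminal state because its header is not defined in this section, and an unhandled ethertype is reported as an error.

```python
# Header lengths in bytes, derived from the field widths in section 4.2.
HEADER_LEN = {"ethernet": 14, "vlan": 4, "mTag": 6}

# One entry per parser state: ethertype value -> next state.
TRANSITIONS = {
    "ethernet": {0x8100: "vlan", 0x9100: "vlan", 0x0800: "ipv4"},
    "vlan": {0xAAAA: "mTag", 0x0800: "ipv4"},
    "mTag": {0x0800: "ipv4"},
}


def parse(packet):
    """Walk the headers of `packet` and return the list of states visited."""
    state = "ethernet"  # corresponds to: parser start { ethernet; }
    offset = 0
    visited = []
    while state in TRANSITIONS:
        length = HEADER_LEN[state]
        header = packet[offset:offset + length]
        if len(header) < length:
            raise ValueError(f"truncated {state} header")
        visited.append(state)
        # In all three headers, ethertype is the final 16-bit field.
        ethertype = int.from_bytes(header[-2:], "big")
        offset += length
        next_state = TRANSITIONS[state].get(ethertype)
        if next_state is None:
            raise ValueError(f"unhandled ethertype {ethertype:#06x} in {state}")
        state = next_state
    visited.append(state)  # terminal state; not parsed further in this sketch
    return visited


tagged = (
    bytes(12) + bytes.fromhex("8100")       # ethernet, ethertype = VLAN
    + bytes.fromhex("000a") + bytes.fromhex("aaaa")  # vlan, ethertype = mTag
    + bytes([1, 2, 3, 4]) + bytes.fromhex("0800")    # mTag, ethertype = IPv4
)
path = parse(tagged)
```

The transition table is plain data, so supporting a new header means adding one entry to each dictionary rather than changing the parsing loop.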
4.4 Table Specification
Next, the programmer describes how the defined header fields are to be matched in the match+action stages (e.g., should they be exact matches, ranges, or wildcards?) and what actions should be performed when a match occurs.

In our simple mTag example, the edge switch matches on the L2 destination and VLAN ID, and selects an mTag to add to the header. The programmer defines a table to match on these fields and apply an action to add the mTag header (see below). The reads attribute declares which fields to match, qualified by the match type (exact, ternary, etc.). The actions attribute lists the possible actions which may be applied to a packet by the table. Actions are explained in the following section. The max_size attribute specifies how many entries the table should support.

The table specification allows a compiler to decide how much memory it needs, and the memory type (e.g., TCAM or SRAM) to implement the table.
table mTag_table {
    reads {
        ethernet.dst_addr : exact;
        vlan.vid : exact;
    }
    actions {
        // At runtime, entries are programmed with params
        // for the mTag action. See below.
        add_mTag;
    }
    max_size : 20000;
}
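As a rough illustration of how the reads, actions, and max_size attributes constrain a table at runtime, the following Python sketch models an exact-match table as a dictionary. The `ExactMatchTable` class, the example MAC address, and the parameter values are all hypothetical; a real target would use hashing in SRAM or a TCAM rather than a Python dict.

```python
class ExactMatchTable:
    """Toy model of a table whose `reads` fields are all exact matches."""

    def __init__(self, reads, actions, max_size):
        self.reads = tuple(reads)          # field names forming the key
        self.actions = frozenset(actions)  # actions entries may invoke
        self.max_size = max_size           # capacity declared in the program
        self.entries = {}

    def add_entry(self, key, action, params):
        """Populate: install one entry, enforcing the declared constraints."""
        key = tuple(key)
        if len(key) != len(self.reads):
            raise ValueError("key must supply one value per field in reads")
        if action not in self.actions:
            raise ValueError(f"action {action!r} not declared for this table")
        if key not in self.entries and len(self.entries) >= self.max_size:
            raise OverflowError("table is full")
        self.entries[key] = (action, dict(params))

    def lookup(self, fields):
        """Return (action, params) on a hit, or None on a miss."""
        key = tuple(fields[name] for name in self.reads)
        return self.entries.get(key)


mtag_table = ExactMatchTable(
    reads=("ethernet.dst_addr", "vlan.vid"),
    actions=("add_mTag",),
    max_size=20000,
)
mtag_table.add_entry(
    key=(0x0000AABBCCDD, 10),
    action="add_mTag",
    params={"up1": 1, "up2": 2, "down1": 3, "down2": 4, "egr_spec": 7},
)
hit = mtag_table.lookup({"ethernet.dst_addr": 0x0000AABBCCDD, "vlan.vid": 10})
miss = mtag_table.lookup({"ethernet.dst_addr": 0x0000AABBCCDD, "vlan.vid": 11})
```

The constructor corresponds to the Configure step and `add_entry` to the Populate step: the shape of the table is fixed once, while entries come and go at runtime.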
For completeness and for later discussion, we present brief definitions of other tables that are referenced by the Control Program (§4.6).
table source_check {
    // Verify mtag only on ports to the core
    reads {
        mtag : valid; // Was mtag parsed?
        metadata.ingress_port : exact;
    }
    actions { // Each table entry specifies one action

        // If inappropriate mTag, send to CPU
        fault_to_cpu;

        // If mtag found, strip and record in metadata
        strip_mtag;

        // Otherwise, allow the packet to continue
        pass;
    }
    max_size : 64; // One rule per port
}
table local_switching {
    // Reads destination and checks if local
    // If miss occurs, goto mtag table.
}

table egress_check {
    // Verify egress is resolved
    // Do not retag packets received with tag
    // Reads egress and whether packet was mTagged
}
4.5 Action Specifications

P4 defines a collection of primitive actions from which more complicated actions are built. Each P4 program declares a set of action functions that are composed of action primitives; these action functions simplify table specification and population. P4 assumes parallel execution of primitives within an action function. (Switches incapable of parallel execution may emulate the semantics.)

The add_mTag action referred to above is implemented as follows:
action add_mTag(up1, up2, down1, down2, egr_spec) {
    add_header(mTag);
    // Copy VLAN ethertype to mTag
5.2 Compiling Control Programs
The imperative control-flow representation in §4.6 is a convenient way to specify the logical forwarding behavior of a switch, but does not explicitly call out dependencies between tables or opportunities for concurrency. We therefore employ a compiler to analyze the control program to identify dependencies and look for opportunities to process header fields in parallel. Finally, the compiler generates the target configuration for the switch. There are many potential targets: for example, a software switch [17], a multicore software switch [18], an NPU [19], a fixed-function switch [20], or a reconfigurable match table (RMT) pipeline [2].

As discussed in §3, the compiler follows a two-stage compilation process. It first converts the P4 control program into an intermediate table dependency graph representation, which it analyzes to determine dependencies between tables. A target-specific back-end then maps this graph onto the switch’s specific resources.

We briefly examine how the mTag example would be implemented in different kinds of switches:

Software switches: A software switch provides complete flexibility: the table count, table configuration, and parsing are under software control. The compiler directly maps the mTag table graph to switch tables. The compiler uses table type information to constrain table widths, heights, and matching criteria (e.g., exact, prefix, or wildcard) of each table. The compiler might also optimize ternary or prefix matching with software data structures.

Hardware switches with RAM and TCAM: A compiler can configure hashing to perform efficient exact-matching using RAM, for the mTag_table in edge switches. In contrast, the core mTag forwarding table that matches on a subset of tag bits would be mapped to TCAM.

Switches supporting parallel tables: The compiler can detect data dependencies and arrange tables in parallel or in series. In the mTag example, the tables mTag_table and local_switching can execute in parallel up to the execution of the action of setting an mTag.

Switches that apply actions at the end of the pipeline: For switches with action processing only at the end of a pipeline, the compiler can tell intermediate stages to generate metadata that is used to perform the final writes. In the mTag example, whether the mTag is added or removed could be represented in metadata.

Switches with a few tables: The compiler can map a large number of P4 tables to a smaller number of physical tables. In the mTag example, the local_switching table could be combined with the mTag_table. When the controller installs new rules at runtime, the compiler’s rule translator can “compose” the rules in the two P4 tables to generate the rules for the single physical table.
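The idea of detecting data dependencies and arranging tables in parallel or in series can be sketched in Python as follows. This is an illustrative simplification, not the compiler described in the paper: the `schedule` function is hypothetical, the read and write sets assigned to each table are invented for the example, and control dependencies (such as "on a miss, go to the mTag table") are ignored.

```python
def schedule(tables):
    """Assign each table to the earliest stage its data dependencies allow.

    `tables` is a list of (name, reads, writes) in control-program order,
    where reads and writes are sets of field names. Two tables conflict if
    one writes a field the other reads or writes. Tables that share a stage
    number could run in parallel.
    """
    stage = {}
    for i, (name, reads, writes) in enumerate(tables):
        level = 0
        for prev_name, prev_reads, prev_writes in tables[:i]:
            conflict = (
                (prev_writes & reads)      # read after write
                or (prev_writes & writes)  # write after write
                or (prev_reads & writes)   # write after read
            )
            if conflict:
                level = max(level, stage[prev_name] + 1)
        stage[name] = level
    return stage


# Hypothetical read/write sets, chosen only to illustrate the analysis.
tables = [
    ("source_check", {"mtag.valid", "ingress_port"}, {"was_mtagged"}),
    ("local_switching", {"ethernet.dst_addr"}, {"egress_spec"}),
    ("mTag_table", {"ethernet.dst_addr", "vlan.vid"}, {"mTag"}),
    ("egress_check", {"egress_spec", "was_mtagged"}, {"drop"}),
]
stages = schedule(tables)
```

Under these assumed field sets, local_switching and mTag_table land in the same stage because neither writes a field the other touches, while egress_check must wait for the tables that produce the fields it reads.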
The promise of SDN is that a single control plane can directly control a whole network of switches. OpenFlow supports this goal by providing a single, vendor-agnostic API. However, today’s OpenFlow targets fixed-function switches that recognize a predetermined set of header fields and that process packets using a small set of predefined actions. The control plane cannot express how packets should be processed to best meet the needs of control applications.

We propose a step towards more flexible switches whose functionality is specified, and may be changed, in the field. The programmer decides how the forwarding plane processes packets without worrying about implementation details. A compiler transforms an imperative program into a table dependency graph that can be mapped to many specific target switches, including optimized hardware implementations.

We emphasize that this is only a first step, designed as a straw-man proposal for OpenFlow 2.0 to contribute to the debate. In this proposal, several aspects of a switch remain undefined (e.g., congestion-control primitives, queuing disciplines, traffic monitoring). However, we believe the approach of having a configuration language (and compilers that generate low-level configurations for specific targets) will lead to future switches that provide greater flexibility, and unlock the potential of software-defined networks.
the data plane,” in ACM SIGCOMM HotNets Workshop, Nov. 2013.
[10] E. Kohler, R. Morris, B. Chen, J. Jannotti, and M. F. Kaashoek, “The Click modular router,” ACM Transactions on Computer Systems, vol. 18, pp. 263–297, Aug. 2000.
[11] “Multiprotocol Label Switching Charter.” http://datatracker.ietf.org/wg/mpls/charter/.
[12] R. Niranjan Mysore, A. Pamboris, N. Farrington, N. Huang, P. Miri, S. Radhakrishnan, V. Subramanya, and A. Vahdat, “PortLand: A scalable fault-tolerant layer 2 data center network fabric,” in ACM SIGCOMM, pp. 39–50, Aug. 2009.
[13] P. McCann and S. Chandra, “PacketTypes: Abstract specification of network protocol messages,” in ACM SIGCOMM, pp. 321–333, Aug. 2000.
[14] G. Back, “DataScript - A specification and scripting language for binary data,” in Generative Programming and Component Engineering, vol. 2487, pp. 66–77, Lecture Notes in Computer Science, 2002.
[15] K. Fisher and R. Gruber, “PADS: A domain specific language for processing ad hoc data,” in ACM Conference on Programming Language Design and Implementation, pp. 295–304, June 2005.
[16] G. Gibb, G. Varghese, M. Horowitz, and N. McKeown, “Design principles for packet parsers,” in ANCS, pp. 13–24, 2013.
[17] “Open vSwitch website.” http://www.openvswitch.org.
[18] D. Zhou, B. Fan, H. Lim, M. Kaminsky, and D. G. Andersen, “Scalable, high performance ethernet forwarding with CuckooSwitch,” in CoNext, pp. 97–108, 2013.
[19] “EZChip 240-Gigabit Network Processor for Carrier Ethernet Applications.” http://www.ezchip.com/p_np5.htm.
[20] “Broadcom BCM56850 Series.” https://www.broadcom.com/products/Switching/Data-Center/BCM56850-Series.