Buffer Metadata
Each vlib_buffer_t (packet buffer) carries buffer metadata which describes the current packet-processing state. The underlying techniques have been used for decades, across multiple packet processing environments.
We will examine vpp buffer metadata in some detail, but folks who need to manipulate and/or extend the scheme should expect to do a certain level of code inspection.
Vlib (Vector library) primary buffer metadata
The first 64 octets of each vlib_buffer_t carry the primary buffer metadata. See …/src/vlib/buffer.h for full details.
Important fields:
i16 current_data: the signed offset in data[], pre_data[] that we are currently processing. If negative, the current header points into the pre-data (rewrite space) area.
u16 current_length: number of bytes between current_data and the end of this buffer.
u32 flags: Buffer flag bits. Heavily used, not many bits left.
src/vlib/buffer.h flag bits
VLIB_BUFFER_IS_TRACED: buffer is traced
VLIB_BUFFER_NEXT_PRESENT: buffer has multiple chunks
VLIB_BUFFER_TOTAL_LENGTH_VALID: total_length_not_including_first_buffer is valid (see below)
src/vnet/buffer.h flag bits
VNET_BUFFER_F_L4_CHECKSUM_COMPUTED: tcp/udp checksum has been computed
VNET_BUFFER_F_L4_CHECKSUM_CORRECT: tcp/udp checksum is correct
VNET_BUFFER_F_VLAN_2_DEEP: two vlan tags present
VNET_BUFFER_F_VLAN_1_DEEP: one vlan tag present
VNET_BUFFER_F_SPAN_CLONE: packet has already been cloned (span feature)
VNET_BUFFER_F_LOOP_COUNTER_VALID: packet look-up loop count valid
VNET_BUFFER_F_LOCALLY_ORIGINATED: packet built by vpp
VNET_BUFFER_F_IS_IP4: packet is ipv4, for checksum offload
VNET_BUFFER_F_IS_IP6: packet is ipv6, for checksum offload
VNET_BUFFER_F_OFFLOAD_IP_CKSUM: hardware ip checksum offload requested
VNET_BUFFER_F_OFFLOAD_TCP_CKSUM: hardware tcp checksum offload requested
VNET_BUFFER_F_OFFLOAD_UDP_CKSUM: hardware udp checksum offload requested
VNET_BUFFER_F_IS_NATED: natted packet, skip input checks
VNET_BUFFER_F_L2_HDR_OFFSET_VALID: L2 header offset valid
VNET_BUFFER_F_L3_HDR_OFFSET_VALID: L3 header offset valid
VNET_BUFFER_F_L4_HDR_OFFSET_VALID: L4 header offset valid
VNET_BUFFER_F_FLOW_REPORT: packet is an ipfix packet
VNET_BUFFER_F_IS_DVR: packet to be reinjected into the l2 output path
VNET_BUFFER_F_QOS_DATA_VALID: QoS data valid in vnet_buffer_opaque2
VNET_BUFFER_F_GSO: generic segmentation offload requested
VNET_BUFFER_F_AVAIL1: available bit
VNET_BUFFER_F_AVAIL2: available bit
VNET_BUFFER_F_AVAIL3: available bit
VNET_BUFFER_F_AVAIL4: available bit
VNET_BUFFER_F_AVAIL5: available bit
VNET_BUFFER_F_AVAIL6: available bit
VNET_BUFFER_F_AVAIL7: available bit
u32 flow_id: generic flow identifier
u8 ref_count: buffer reference / clone count (e.g. for span replication)
u8 buffer_pool_index: buffer pool index which owns this buffer
vlib_error_t (u16) error: error code for buffers enqueued to error handler
u32 next_buffer: buffer index of next buffer in chain. Only valid if VLIB_BUFFER_NEXT_PRESENT is set
union
u32 current_config_index: current index on feature arc
u32 punt_reason: reason code once packet punted. Mutually exclusive with current_config_index
u32 opaque[10]: primary vnet-layer opaque data (see below)
END of first cache line / data initialized by the buffer allocator
u32 trace_index: buffer’s index in the packet trace subsystem
u32 total_length_not_including_first_buffer: see VLIB_BUFFER_TOTAL_LENGTH_VALID above
u32 opaque2[14]: secondary vnet-layer opaque data (see below)
u8 pre_data[VLIB_BUFFER_PRE_DATA_SIZE]: rewrite space, often used to prepend tunnel encapsulations
u8 data[0]: buffer data received from the wire. Ordinarily, hardware devices use b->data[0] as the DMA target but there are exceptions. Do not write code which blindly assumes that packet data starts in b->data[0]. Use vlib_buffer_get_current(…).
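The interplay between current_data, current_length and the rewrite space is easier to see in a small stand-alone sketch. The types and names below (sketch_buffer_t, sketch_get_current, sketch_advance, and the two size constants) are hypothetical stand-ins rather than the vlib definitions. The two helpers are only modeled on vlib_buffer_get_current and vlib_buffer_advance, so treat this as an illustration under those assumptions and read …/src/vlib/buffer.h for the real thing.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sizes chosen for the sketch; not the real VPP values. */
#define SKETCH_PRE_DATA_SIZE 128
#define SKETCH_DATA_SIZE 256

/* Simplified stand-in for vlib_buffer_t. The rewrite space and the packet
   data share one array so that negative offsets stay inside the object. */
typedef struct
{
  int16_t current_data;    /* signed offset relative to data[0] */
  uint16_t current_length; /* bytes from current_data to end of buffer */
  uint8_t bytes[SKETCH_PRE_DATA_SIZE + SKETCH_DATA_SIZE];
} sketch_buffer_t;

/* Address that plays the role of b->data[0]. */
static inline uint8_t *
sketch_data (sketch_buffer_t *b)
{
  return b->bytes + SKETCH_PRE_DATA_SIZE;
}

/* Modeled on vlib_buffer_get_current: pointer to the current header. */
static inline void *
sketch_get_current (sketch_buffer_t *b)
{
  return sketch_data (b) + b->current_data;
}

/* Modeled on vlib_buffer_advance: positive l strips bytes from the front,
   negative l exposes bytes in front of the current header. */
static inline void
sketch_advance (sketch_buffer_t *b, int16_t l)
{
  b->current_data = (int16_t) (b->current_data + l);
  b->current_length = (uint16_t) (b->current_length - l);
}

/* Strip a 14-byte Ethernet header, then expose it again together with
   8 bytes of rewrite space. Returns the resulting current_data. */
static inline int
sketch_demo (void)
{
  sketch_buffer_t b;
  memset (&b, 0, sizeof (b));
  b.current_length = 64;    /* pretend a 64-byte frame sits at data[0] */
  sketch_advance (&b, 14);  /* current_data = 14, current_length = 50 */
  sketch_advance (&b, -22); /* current_data = -8, current_length = 72 */
  assert (sketch_get_current (&b) == (void *) (sketch_data (&b) - 8));
  return b.current_data;
}
```

After the second call the current header sits 8 bytes before data[0], inside the rewrite space. Code that assumes the packet starts at b->data[0] would miss those 8 bytes.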
Vnet (network stack) primary buffer metadata
Vnet primary buffer metadata occupies space reserved in the vlib opaque field shown above, and has the type name vnet_buffer_opaque_t. It is ordinarily accessed using the vnet_buffer(b) macro. See ../src/vnet/buffer.h for full details.
Important fields:
u32 sw_if_index[2]: RX and TX interface handles. At the ip lookup stage, vnet_buffer(b)->sw_if_index[VLIB_TX] is interpreted as a FIB index.
i16 l2_hdr_offset: offset from b->data[0] of the packet L2 header. Valid only if b->flags & VNET_BUFFER_F_L2_HDR_OFFSET_VALID is set
i16 l3_hdr_offset: offset from b->data[0] of the packet L3 header. Valid only if b->flags & VNET_BUFFER_F_L3_HDR_OFFSET_VALID is set
i16 l4_hdr_offset: offset from b->data[0] of the packet L4 header. Valid only if b->flags & VNET_BUFFER_F_L4_HDR_OFFSET_VALID is set
u8 feature_arc_index: feature arc that the packet is currently traversing
union
ip
u32 adj_index[2]: adjacency from the destination ip lookup in [VLIB_TX]; adjacency from the source ip lookup in [VLIB_RX], which is set to ~0 until the source lookup is done
union
generic fields
ICMP fields
reassembly fields
mpls fields
l2 bridging fields, only valid in the L2 path
l2tpv3 fields
l2 classify fields
vnet policer fields
MAP fields
MAP-T fields
ip fragmentation fields
COP (whitelist/blacklist filter) fields
LISP fields
TCP fields
connection index
sequence numbers
header and data offsets
data length
flags
SCTP fields
NAT fields
u32 unused[6]
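The overlay idea behind vnet_buffer(b) can be sketched without the vpp headers: a typed struct is laid over the generic opaque words, and a compile-time check keeps it from outgrowing them. Everything named vnet_sketch_* below is a hypothetical stand-in, the flag bit position is invented for the example, and the field list is heavily trimmed. The cast mirrors the style of the real macro and, like it, relies on the compiler tolerating this kind of type punning.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical bit position; the real flag lives in src/vnet/buffer.h. */
#define VNET_SKETCH_F_L3_HDR_OFFSET_VALID (1u << 0)

/* Stand-in for the parts of vlib_buffer_t that matter here. */
typedef struct
{
  uint32_t flags;
  uint32_t opaque[10]; /* generic space owned by the vnet layer */
  uint8_t data[64];
} vnet_sketch_buffer_t;

/* Trimmed stand-in for vnet_buffer_opaque_t. */
typedef struct
{
  uint32_t sw_if_index[2]; /* RX and TX interface handles */
  int16_t l2_hdr_offset;
  int16_t l3_hdr_offset;
  int16_t l4_hdr_offset;
  uint8_t feature_arc_index;
  uint8_t pad; /* hypothetical filler to keep the union aligned */
  union
  {
    struct
    {
      uint32_t adj_index[2];
    } ip;
    uint32_t unused[6];
  };
} vnet_sketch_opaque_t;

/* The typed view must fit inside the generic opaque area. */
_Static_assert (sizeof (vnet_sketch_opaque_t)
		  <= sizeof (((vnet_sketch_buffer_t *) 0)->opaque),
		"typed metadata does not fit in opaque[]");

/* Same shape as the vnet_buffer(b) accessor: a cast of b->opaque. */
#define vnet_sketch_buffer(b) ((vnet_sketch_opaque_t *) (b)->opaque)

/* Return the L3 header, or NULL when the offset has not been marked valid.
   This sketch only handles non-negative offsets. */
static inline uint8_t *
vnet_sketch_l3_header (vnet_sketch_buffer_t *b)
{
  if (!(b->flags & VNET_SKETCH_F_L3_HDR_OFFSET_VALID))
    return NULL;
  return b->data + vnet_sketch_buffer (b)->l3_hdr_offset;
}
```

A node that parses the Ethernet header would store the offset and then set the flag; downstream nodes check the flag before trusting the offset.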
Vnet (network stack) secondary buffer metadata
Vnet secondary buffer metadata occupies space reserved in the vlib opaque2 field shown above, and has the type name vnet_buffer_opaque2_t. It is ordinarily accessed using the vnet_buffer2(b) macro. See ../src/vnet/buffer.h for full details.
Important fields:
qos fields
u8 bits
u8 source
u8 loop_counter: used to detect and report internal forwarding loops
group-based policy fields
u8 flags
u16 sclass: the packet’s source class
u16 gso_size: L4 payload size, persists all the way to interface-output in case GSO is not enabled
u16 gso_l4_hdr_sz: size of the L4 protocol header
union
packet trajectory tracer (largely deprecated)
u16 *trajectory_trace: present only #if VLIB_BUFFER_TRACE_TRAJECTORY > 0
packet generator
u64 pg_replay_timestamp: timestamp for replayed pcap trace packets
u32 unused[8]
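As an illustration of a flag-guarded secondary field, here is one way loop_counter and VNET_BUFFER_F_LOOP_COUNTER_VALID could work together. This is a sketch under assumptions, not a copy of the vpp code: the loop_sketch_* names, the flag bit position, and the limit of 4 lookups are all hypothetical. The point it shows is that the counter is only meaningful once the flag is set, so the first user has to initialize it.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define LOOP_SKETCH_F_COUNTER_VALID (1u << 0) /* hypothetical bit */
#define LOOP_SKETCH_MAX_LOOKUPS 4	      /* hypothetical limit */

/* Stand-in for the parts of vlib_buffer_t that matter here. */
typedef struct
{
  uint32_t flags;
  uint32_t opaque2[14]; /* secondary vnet-layer opaque data */
} loop_sketch_buffer_t;

/* Trimmed stand-in for vnet_buffer_opaque2_t. */
typedef struct
{
  uint8_t qos_bits;
  uint8_t qos_source;
  uint8_t loop_counter;
} loop_sketch_opaque2_t;

/* Same shape as the vnet_buffer2(b) accessor: a cast of b->opaque2. */
#define loop_sketch_buffer2(b) ((loop_sketch_opaque2_t *) (b)->opaque2)

/* Record one more lookup for this packet.
   Returns 1 if forwarding may continue, 0 if the packet should be dropped. */
static inline int
loop_sketch_note_lookup (loop_sketch_buffer_t *b)
{
  loop_sketch_opaque2_t *o = loop_sketch_buffer2 (b);

  /* The opaque area may hold stale bytes until the flag says otherwise. */
  if (!(b->flags & LOOP_SKETCH_F_COUNTER_VALID))
    {
      o->loop_counter = 0;
      b->flags |= LOOP_SKETCH_F_COUNTER_VALID;
    }

  o->loop_counter++;
  return o->loop_counter <= LOOP_SKETCH_MAX_LOOKUPS;
}
```

With the limit assumed above, the fifth lookup of the same packet is the first one reported as a loop.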
Buffer Metadata Extensions
Plugin developers may wish to extend either the primary or secondary vnet buffer opaque unions. Please perform a manual live variable analysis first. Otherwise, nodes which use shared buffer metadata space may break things.
It’s not OK to add plugin or proprietary metadata to the core vpp engine header files named above. Instead, proceed as follows. The example concerns the vnet primary buffer opaque union vnet_buffer_opaque_t. Using the vnet secondary buffer opaque union vnet_buffer_opaque2_t instead is a very simple variation.
In a plugin header file:
/* Add arbitrary buffer metadata */
#include <vnet/buffer.h>

typedef struct
{
  u32 my_stuff[6];
} my_buffer_opaque_t;

STATIC_ASSERT (sizeof (my_buffer_opaque_t) <=
               STRUCT_SIZE_OF (vnet_buffer_opaque_t, unused),
               "Custom meta-data too large for vnet_buffer_opaque_t");

#define my_buffer_opaque(b) \
  ((my_buffer_opaque_t *)((u8 *)((b)->opaque) + STRUCT_OFFSET_OF (vnet_buffer_opaque_t, unused)))
To set data in the custom buffer opaque type given a vlib_buffer_t *b:
my_buffer_opaque (b)->my_stuff[2] = 123;
To read data from the custom buffer opaque type:
stuff0 = my_buffer_opaque (b)->my_stuff[2];
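For the secondary opaque union the pattern is the same, with opaque2 and the unused[8] tail in place of opaque and unused[6]. The sketch below uses hypothetical stand-in types (everything named ext_sketch_* or my_sketch_*) so that it compiles on its own. In a real plugin you would presumably include vnet/buffer.h and use vnet_buffer_opaque2_t, STATIC_ASSERT, STRUCT_SIZE_OF and STRUCT_OFFSET_OF where the sketch uses its stand-ins, _Static_assert, sizeof and offsetof.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for vlib_buffer_t: only the secondary opaque area. */
typedef struct
{
  uint32_t opaque2[14];
} ext_sketch_buffer_t;

/* Stand-in for vnet_buffer_opaque2_t: only the unused tail matters here. */
typedef struct
{
  uint32_t other_fields[6]; /* placeholder for the qos, gbp, gso fields */
  uint32_t unused[8];	    /* space a plugin may claim */
} ext_sketch_opaque2_t;

/* Plugin-private metadata. */
typedef struct
{
  uint32_t my_stuff[8];
} my_sketch_opaque2_t;

/* Fail the build if the plugin metadata outgrows the unused space. */
_Static_assert (sizeof (my_sketch_opaque2_t)
		  <= sizeof (((ext_sketch_opaque2_t *) 0)->unused),
		"Custom meta-data too large for the secondary opaque union");

/* Typed view of the unused tail of b->opaque2. */
#define my_sketch_opaque2(b)                                                  \
  ((my_sketch_opaque2_t *) ((uint8_t *) ((b)->opaque2)                        \
			    + offsetof (ext_sketch_opaque2_t, unused)))

/* Write then read one word of plugin metadata. */
static inline uint32_t
my_sketch_roundtrip (ext_sketch_buffer_t *b, uint32_t value)
{
  my_sketch_opaque2 (b)->my_stuff[2] = value;
  return my_sketch_opaque2 (b)->my_stuff[2];
}
```

The static assertion is the important part: if a later vpp release shrinks the unused tail, the plugin stops compiling instead of silently overwriting live metadata.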