# **CSIT REPORT**

Release rls2001

## **CONTENTS**

| Chapter | Title                                |
|---------|--------------------------------------|
| 1       | Introduction                         |
|         | 1.1 Report History                   |
|         | 1.2 Report Structure                 |
|         | 1.3 Test Scenarios                   |
|         | 1.4 Physical Testbeds                |
|         | 1.5 Test Methodology                 |
| 2       | VPP Performance                      |
|         | 2.1 Overview                         |
|         | 2.2 Release Notes                    |
|         | 2.3 Packet Throughput                |
|         | 2.4 Speedup Multi-Core               |
|         | 2.5 Packet Latency                   |
|         | 2.6 Soak Tests                       |
|         | 2.7 Reconfiguration Tests            |
|         | 2.8 NFV Service Density              |
|         | 2.9 Hoststack Testing                |
|         | 2.10 Comparisons                     |
|         | 2.11 Throughput Trending             |
|         | 2.12 Test Environment                |
|         | 2.13 Documentation                   |
| 3       | DPDK Performance                     |
|         | 3.1 Overview                         |
|         | 3.2 Release Notes                    |
|         | 3.3 Packet Throughput                |
|         | 3.4 Packet Latency                   |
|         | 3.5 Comparisons                      |
|         | 3.6 Throughput Trending              |
|         | 3.7 Test Environment                 |
|         | 3.8 Documentation                    |
| 4       | VPP Device                           |
|         | 4.1 Overview                         |
|         | 4.2 Release Notes                    |
|         | 4.3 Integration Tests                |
|         | 4.4 Documentation                    |
| 5       | CSIT Framework                       |
|         | 5.1 Design                           |
|         | 5.2 Test Naming                      |
|         | 5.3 Presentation and Analytics Layer |
|         | 5.4 CSIT RF Tags Descriptions        |
|         | Bibliography                         |

## **CHAPTER ONE: INTRODUCTION**

## 1.1 Report History

FD.io CSIT-2001 report history and the changes made in each .[ww] revision are listed below.

| .[ww] Revision | Changes                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 |
|----------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| .11            | <ol> <li>Added Soak tests</li> <li>Added data: <ul> <li>VPP performance tests 3n-tsh</li> <li>all sets selected for analysis and graphs</li> </ul> </li> </ol> |
| .10            | <ol> <li>Added data:         <ul> <li>VPP performance tests 3n-hsw</li> <li>test sets IP4, IPSec</li> <li>sets selected for analysis and graphs</li> </ul> </li> <li>VPP performance tests 2n-clx         <ul> <li>test sets IP4, IP6, L2, Memif, load-balancer, VHost VTS</li> <li>all sets selected for analysis and graphs</li> </ul> </li> <li>Added reconfiguration tests:         <ul> <li>2n-clx</li> </ul> </li> <li>Added Hoststack tests</li> <li>Edited Test Methodology -&gt; Hoststack Testing</li> <li>Added TCP/IP tests</li> </ol>                                                                                                                                                      |
| .09            | <ol> <li>Added data: <ul> <li>VPP performance tests 3n-hsw</li> <li>test sets IP4, L2, VHost</li> </ul> </li> <li>VPP performance tests 2n-clx <ul> <li>test sets IP4, IP6, L2</li> <li>all sets selected for analysis and graphs</li> </ul> </li> <li>VPP performance tests 3n-dnv <ul> <li>test sets IPSec, IP4 Tunnels</li> </ul> </li> <li>VPP performance tests 3n-tsh <ul> <li>all sets selected for analysis and graphs</li> </ul> </li> <li>VPP MRR tests 2n-clx <ul> <li>all sets selected for analysis and graphs</li> </ul> </li> <li>DPDK performance tests 2n-clx <ul> <li>all sets selected for analysis and graphs</li> </ul> </li> </ol> |
| .08            | <ol> <li>Added PDF version</li> <li>Added data:         <ul> <li>VPP performance tests 3n-hsw</li> <li>VPP performance tests 3n-tsh</li> <li>VPP MRR tests 3n-tsh</li> <li>DPDK performance tests 3n-tsh</li> </ul> </li> <li>Chapters "Detailed Results", "Test Configuration" and "Test Operational Data" split into sub-chapters.</li> </ol>                                                                                                                                                                                                                                                                                                                                                         |
| .07            | Initial version                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |

FD.io CSIT reports follow the CSIT-[yy][mm].[ww] numbering format: the version is the concatenation of the two-digit year [yy] and the two-digit month [mm], and the maintenance revision is the two-digit calendar week number [ww].
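As an illustration of the numbering format, the following minimal sketch splits a report label into its year, month and week fields. The function and class names are hypothetical and are not part of the CSIT code base; the range checks on month and week are an added assumption.

```python
import re
from typing import NamedTuple


class CsitVersion(NamedTuple):
    year: int   # two-digit year [yy]
    month: int  # two-digit month [mm]
    week: int   # two-digit calendar week [ww]


# Pattern mirrors the CSIT-[yy][mm].[ww] format described above.
_VERSION_RE = re.compile(r"^CSIT-(\d{2})(\d{2})\.(\d{2})$")


def parse_csit_version(label: str) -> CsitVersion:
    """Split a report label such as 'CSIT-2001.11' into its fields."""
    match = _VERSION_RE.match(label)
    if match is None:
        raise ValueError(f"not a CSIT-[yy][mm].[ww] label: {label!r}")
    year, month, week = (int(part) for part in match.groups())
    # Sanity checks (assumed, not stated in the report).
    if not 1 <= month <= 12:
        raise ValueError(f"month out of range in {label!r}")
    if not 1 <= week <= 53:
        raise ValueError(f"calendar week out of range in {label!r}")
    return CsitVersion(year, month, week)


print(parse_csit_version("CSIT-2001.11"))
# prints: CsitVersion(year=20, month=1, week=11)
```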

## 1.2 Report Structure

FD.io CSIT-2001 report contains system performance and functional testing data of VPP-20.01 release. PDF version of this report<sup>1</sup> is available for download.

CSIT-2001 report is structured as follows:

- 1. INTRODUCTION: General introduction to FD.io CSIT-2001.
  - Introduction: This section.
  - Test Scenarios Overview: A brief overview of test scenarios covered in this report.
  - Physical Testbeds: Description of physical testbeds.
  - Test Methodology: Performance benchmarking and functional test methodologies.
- 2. VPP PERFORMANCE: VPP performance tests executed in physical FD.io testbeds.
  - Overview: Tested logical topologies, test coverage and naming specifics.
  - Release Notes: Changes in CSIT-2001, added tests, environment or methodology changes, known issues.
  - Packet Throughput: NDR, PDR throughput graphs based on results from repeated executions of the same test job, to verify repeatability of measurements.
  - **Speedup Multi-Core**: NDR, PDR throughput multi-core speedup graphs based on results from test job executions.
  - Packet Latency: Latency graphs based on results from test job executions.
  - Soak Tests: Long duration soak tests are executed using PLRsearch algorithm.
  - NFV Service Density: Network Function Virtualization (NFV) service density tests focus on measuring total per server throughput at varied NFV service "packing" densities with vswitch providing host dataplane.
  - **Comparisons**: Performance comparisons between VPP releases and between different testbed types.
  - Throughput Trending: References to continuous VPP performance trending.
  - Test Environment: Performance test environment configuration.
  - **Documentation**: Pointers to CSIT source code documentation for VPP performance tests.
- 3. DPDK PERFORMANCE: DPDK performance tests executed in physical FD.io testbeds.
  - Overview: Tested logical topologies, test coverage.
  - Release Notes: Changes in CSIT-2001, known issues.
  - Packet Throughput: NDR, PDR throughput graphs based on results from repeated executions of the same test job, to verify repeatability of measurements.
  - Packet Latency: Latency graphs based on results from test job executions.
  - **Comparisons**: Performance comparisons between DPDK releases and between different testbed types.
  - Throughput Trending: References to regular DPDK performance trending.
  - **Test Environment**: Performance test environment configuration.


<sup>1</sup> https://docs.fd.io/csit/rls2001/report/\_static/archive/csit\_rls2001.11.pdf

  - **Documentation**: Pointers to CSIT source code documentation for DPDK performance tests.
- 4. VPP DEVICE: VPP functional tests executed in physical FD.io testbeds using containers.
  - Overview: Tested virtual topologies, test coverage and naming specifics.
  - Release Notes: Changes in CSIT-2001, added tests, environment or methodology changes, known issues.
  - Integration Tests: Functional test environment configuration.
  - Documentation: Pointers to CSIT source code documentation for VPP functional tests.
- 5. DETAILED RESULTS: Detailed result tables auto-generated from CSIT test job executions using RF (Robot Framework) output files as sources.
  - VPP Performance NDR/PDR: VPP NDR/PDR throughput and latency.
  - VPP Performance MRR: VPP MRR throughput.
  - DPDK Performance: DPDK Testpmd and L3fwd NDR/PDR throughput and latency.
- 6. TEST CONFIGURATION: VPP DUT configuration data based on VPP API Test (VAT) Commands History auto-generated from CSIT test job executions using RF output files as sources.
  - VPP Performance NDR/PDR: Configuration data.
  - VPP Performance MRR: Configuration data.
- 7. TEST OPERATIONAL DATA: VPP DUT operational data auto-generated from CSIT test job executions using RF output files as sources.
  - VPP Performance NDR/PDR: VPP show run outputs under test load.
- 8. CSIT FRAMEWORK DOCUMENTATION: Description of the overall FD.io CSIT framework.
  - **Design**: Framework modular design hierarchy.
  - Test naming: Test naming convention.
  - Presentation and Analytics Layer: Description of PAL CSIT analytics module.
  - CSIT RF Tags Descriptions: CSIT RF Tags used for test suite and test case grouping and selection.

## 1.3 Test Scenarios

FD.io CSIT-2001 report includes multiple test scenarios of VPP centric applications, topologies and use cases. In addition it also covers baseline tests of DPDK sample applications. Tests are executed in physical (performance tests) and virtual environments (functional tests).

Brief overview of test scenarios covered in this report:

- 1. VPP Performance: VPP performance tests are executed in physical FD.io testbeds, focusing on VPP network data plane performance in NIC-to-NIC switching topologies. Tested across Intel Xeon Haswell and Skylake servers, ARM, Denverton, a range of NICs (10GE, 25GE, 40GE) and multi-thread/multi-core configurations. VPP application runs in bare-metal host user-mode handling NICs. TRex is used as a traffic generator.
- 2. VPP Vhostuser Performance with KVM VMs: VPP VM service switching performance tests using vhostuser virtual interface for interconnecting multiple NF-in-VM instances. VPP vswitch instance runs in bare-metal user-mode handling NICs and connecting over vhost-user interfaces to VM instances each running VPP with virtio virtual interfaces. Similarly to VPP Performance, tests are run across a range of configurations. TRex is used as a traffic generator.
- 3. VPP Memif Performance with LXC and Docker Containers: VPP Container service switching performance tests using memif virtual interface for interconnecting multiple VPP-in-container instances. VPP vswitch instance runs in bare-metal user-mode handling NICs and connecting over memif (Slave side) interfaces to more instances of VPP running in LXC or in Docker Containers, both with memif interfaces (Master side). Similarly to VPP Performance, tests are run across a range of configurations. TRex is used as a traffic generator.

- 4. **DPDK Performance**: VPP uses DPDK to drive the NICs and physical interfaces. DPDK performance tests are used as a baseline to profile performance of the DPDK sub-system. Two DPDK applications are tested: Testpmd and L3fwd. DPDK tests are executed in the same testing environment as VPP tests. DPDK Testpmd and L3fwd applications run in host user-mode. TRex is used as a traffic generator.
- 5. **VPP Functional**: VPP functional tests are executed in virtual FD.io testbeds, focusing on VPP packet processing functionality, including both network data plane and in-line control plane. Tests cover vNIC-to-vNIC and vNIC-to-nestedVM-to-vNIC forwarding topologies. Scapy is used as a traffic generator.

All CSIT test data included in this report is auto-generated from RF (Robot Framework) output.xml files produced by LF (Linux Foundation) FD.io Jenkins jobs executed against VPP-20.01 release artifacts. References are provided to the original FD.io Jenkins job results and all archived source files.

FD.io CSIT system is developed using two main coding platforms: RF and Python 2.7. CSIT-2001 source code for the executed test suites is available in CSIT branch rls2001 in the directory ./tests/<name\_of\_the\_test\_suite>. A local copy of CSIT source code can be obtained by cloning the CSIT git repository: git clone https://gerrit.fd.io/r/csit.

## 1.4 Physical Testbeds

All FD.io (Fast Data Input/Output) CSIT (Continuous System Integration and Testing) performance test results included in this report are executed on the physical testbeds hosted by LF FD.io project, unless otherwise noted.

Two physical server topology types are used:

- 2-Node Topology: Consists of one server acting as a System Under Test (SUT) and one server acting as a Traffic Generator (TG), with both servers connected into a ring topology. Used for executing tests that require frame encapsulations supported by TG.
- **3-Node Topology**: Consists of two servers acting as Systems Under Test (SUTs) and one server acting as a Traffic Generator (TG), with all servers connected into a ring topology. Used for executing tests that require frame encapsulations not supported by TG, e.g. certain overlay tunnel encapsulations and IPsec. A number of native Ethernet, IPv4 and IPv6 encapsulation tests are also executed on these testbeds, for comparison with 2-Node Topology.

Current FD.io production testbeds are built with SUT servers based on the following processor architectures:

- Intel Xeon: Skylake Platinum 8180, Haswell-SP E5-2699v3, Cascade Lake Platinum 8280, Cascade Lake 6252N.
- Intel Atom: Denverton C3858.
- ARM: TaiShan 2280, hip07-d05.

Server SUT performance depends on server and processor type, hence results for testbeds based on different servers must be reported separately, and compared if appropriate.

Complete technical specifications of compute servers used in CSIT physical testbeds are maintained in FD.io CSIT repository: https://git.fd.io/csit/tree/docs/lab/testbed\_specifications.md.

The existing production testbeds are described below.

## 1.4.1 2-Node Xeon Cascade Lake (2n-clx)

Three 2n-clx testbeds are in operation in FD.io labs. Each 2n-clx testbed is built with two SuperMicro SYS-7049GP-TRT servers. SUTs are equipped with two Intel Xeon Gold 6252N processors (35.75 MB Cache, 2.30 GHz, 24 cores). TGs are equipped with Intel Xeon Cascade Lake Platinum 8280 processors (38.5 MB Cache, 2.70 GHz, 28 cores). 2n-clx physical topology is shown below.



SUT servers are populated with the following NIC models:

- 1. NIC-1: x710-DA4 4p10GE Intel.
- 2. NIC-2: xxv710-DA2 2p25GE Intel.
- 3. NIC-3: cx556a-edat ConnectX5 2p100GE Mellanox. (Only testbed t27, t28)
- 4. NIC-4: empty, future expansion.
- 5. NIC-5: empty, future expansion.
- 6. NIC-6: empty, future expansion.

TG servers run T-Rex application and are populated with the following NIC models:

- 1. NIC-1: x710-DA4 4p10GE Intel.
- 2. NIC-2: xxv710-DA2 2p25GE Intel.
- 3. NIC-3: cx556a-edat ConnectX5 2p100GE Mellanox. (Only testbed t27, t28)
- 4. NIC-4: empty, future expansion.
- 5. NIC-5: empty, future expansion.
- 6. NIC-6: x710-DA4 4p10GE Intel. (For self-tests.)

All Intel Xeon Cascade Lake servers run with Intel Hyper-Threading enabled, doubling the number of logical cores exposed to Linux.

## 1.4.2 2-Node Xeon Skylake (2n-skx)

Four 2n-skx testbeds are in operation in FD.io labs. Each 2n-skx testbed is built with two SuperMicro SYS-7049GP-TRT servers, each in turn equipped with two Intel Xeon Skylake Platinum 8180 processors (38.5 MB Cache, 2.50 GHz, 28 cores). 2n-skx physical topology is shown below.




SUT servers are populated with the following NIC models:

- 1. NIC-1: x710-DA4 4p10GE Intel.
- 2. NIC-2: xxv710-DA2 2p25GE Intel.
- 3. NIC-3: cx556a-edat ConnectX5 2p100GE Mellanox. (Not used yet.)
- 4. NIC-4: empty, future expansion.
- 5. NIC-5: empty, future expansion.

- 6. NIC-6: empty, future expansion.

TG servers run T-Rex application and are populated with the following NIC models:

- 1. NIC-1: x710-DA4 4p10GE Intel.
- 2. NIC-2: xxv710-DA2 2p25GE Intel.
- 3. NIC-3: cx556a-edat ConnectX5 2p100GE Mellanox. (Not used yet.)
- 4. NIC-4: empty, future expansion.
- 5. NIC-5: empty, future expansion.
- 6. NIC-6: x710-DA4 4p10GE Intel. (For self-tests.)

All Intel Xeon Skylake servers run with Intel Hyper-Threading enabled, doubling the number of logical cores exposed to Linux, with 56 logical cores and 28 physical cores per processor socket.

## 1.4.3 3-Node Xeon Skylake (3n-skx)

Two 3n-skx testbeds are in operation in FD.io labs. Each 3n-skx testbed is built with three SuperMicro SYS-7049GP-TRT servers, each in turn equipped with two Intel Xeon Skylake Platinum 8180 processors (38.5 MB Cache, 2.50 GHz, 28 cores). 3n-skx physical topology is shown below.



SUT1 and SUT2 servers are populated with the following NIC models:

- 1. NIC-1: x710-DA4 4p10GE Intel.
- 2. NIC-2: xxv710-DA2 2p25GE Intel.
- 3. NIC-3: empty, future expansion.
- 4. NIC-4: empty, future expansion.
- 5. NIC-5: empty, future expansion.
- 6. NIC-6: empty, future expansion.

TG servers run T-Rex application and are populated with the following NIC models:

- 1. NIC-1: x710-DA4 4p10GE Intel.
- 2. NIC-2: xxv710-DA2 2p25GE Intel.
- 3. NIC-3: empty, future expansion.
- 4. NIC-4: empty, future expansion.
- 5. NIC-5: empty, future expansion.
- 6. NIC-6: x710-DA4 4p10GE Intel. (For self-tests.)

All Intel Xeon Skylake servers run with Intel Hyper-Threading enabled, doubling the number of logical cores exposed to Linux, with 56 logical cores and 28 physical cores per processor socket.

## 1.4.4 3-Node Xeon Haswell (3n-hsw)

Three 3n-hsw testbeds are in operation in FD.io labs. Each 3n-hsw testbed is built with three Cisco UCS-c240m3 servers, each in turn equipped with two Intel Xeon Haswell-SP E5-2699v3 processors (45 MB Cache, 2.3 GHz, 18 cores). 3n-hsw physical topology is shown below.



SUT1 and SUT2 servers are populated with the following NIC models:

- 1. NIC-1: VIC 1385 2p40GE Cisco.
- 2. NIC-2: NIC x520 2p10GE Intel.
- 3. NIC-3: empty.
- 4. NIC-4: NIC xl710-QDA2 2p40GE Intel.
- 5. NIC-5: NIC x710-DA2 2p10GE Intel.
- 6. NIC-6: QAT 8950 50G (Walnut Hill) Intel.

TG servers run T-Rex application and are populated with the following NIC models:

- 1. NIC-1: NIC xl710-QDA2 2p40GE Intel.
- 2. NIC-2: NIC x710-DA2 2p10GE Intel.
- 3. NIC-3: empty.
- 4. NIC-4: NIC xl710-QDA2 2p40GE Intel.
- 5. NIC-5: NIC x710-DA2 2p10GE Intel.
- 6. NIC-6: NIC x710-DA2 2p10GE Intel. (For self-tests.)

All Intel Xeon Haswell servers run with Intel Hyper-Threading disabled, so the number of logical cores exposed to Linux equals the 18 physical cores per processor socket.

## 1.4.5 2-Node Atom Denverton (2n-dnv)

2n-dnv testbed is built with: i) one Intel S2600WFT server acting as TG and equipped with two Intel Xeon Skylake Platinum 8180 processors (38.5 MB Cache, 2.50 GHz, 28 cores), and ii) one SuperMicro SYS-E300-9A server acting as SUT and equipped with one Intel Atom C3858 processor (12 MB Cache, 2.00 GHz, 12 cores). 2n-dnv physical topology is shown below.



SUT server has four internal 10G NIC ports:

- 1. P-1: x553 copper port.
- 2. P-2: x553 copper port.
- 3. P-3: x553 fiber port.
- 4. P-4: x553 fiber port.

TG server runs the T-Rex software traffic generator and is populated with the following NIC models:

- 1. NIC-1: x550-T2 2p10GE Intel.
- 2. NIC-2: x550-T2 2p10GE Intel.
- 3. NIC-3: x520-DA2 2p10GE Intel.
- 4. NIC-4: x520-DA2 2p10GE Intel.

The 2n-dnv testbed is in operation in Intel SH labs.

## 1.4.6 3-Node Atom Denverton (3n-dnv)

One 3n-dnv testbed is built with: i) one SuperMicro SYS-7049GP-TRT server acting as TG and equipped with two Intel Xeon Skylake Platinum 8180 processors (38.5 MB Cache, 2.50 GHz, 28 cores), and ii) two SuperMicro SYS-E300-9A servers acting as SUTs, each equipped with one Intel Atom C3858 processor (12 MB Cache, 2.00 GHz, 12 cores). 3n-dnv physical topology is shown below.



SUT1 and SUT2 servers are populated with the following NIC models:

- 1. NIC-1: x553 2p10GE fiber Intel.
- 2. NIC-2: x553 2p10GE copper Intel.

TG servers run T-Rex application and are populated with the following NIC models:

- 1. NIC-1: x710-DA4 4p10GE Intel.

## 1.4.7 3-Node ARM TaiShan (3n-tsh)

One 3n-tsh testbed is built with: i) one SuperMicro SYS-7049GP-TRT server acting as TG and equipped with two Intel Xeon Skylake Platinum 8180 processors (38.5 MB Cache, 2.50 GHz, 28 cores), and ii) two Huawei TaiShan 2280 servers acting as SUTs, each equipped with one hip07-d05 processor (64x ARM Cortex-A72). 3n-tsh physical topology is shown below.



SUT1 and SUT2 servers are populated with the following NIC models:

- 1. NIC-1: connectx4 2p25GE Mellanox.
- 2. NIC-2: x520 2p10GE Intel.

TG servers run T-Rex application and are populated with the following NIC models:

- 1. NIC-1: x710-DA4 4p10GE Intel.
- 2. NIC-2: xxv710-DA2 2p25GE Intel.

## 1.5 Test Methodology

## 1.5.1 Terminology

- Frame size: size of an Ethernet Layer-2 frame on the wire, including any VLAN tags (dot1q, dot1ad) and Ethernet FCS, but excluding Ethernet preamble and inter-frame gap. Measured in Bytes.
- Packet size: same as frame size, both terms used interchangeably.
- Inner L2 size: for tunneled L2 frames only, size of an encapsulated Ethernet Layer-2 frame, preceded with tunnel header, and followed by tunnel trailer. Measured in Bytes.
- Inner IP size: for tunneled IP packets only, size of an encapsulated IPv4 or IPv6 packet, preceded with tunnel header, and followed by tunnel trailer. Measured in Bytes.
- Device Under Test (DUT): In software networking, "device" denotes a specific piece of software tasked with packet processing. Such a device is surrounded by other software components (such as the operating system kernel). It is not possible to run devices without also running the other components, and hardware resources are shared between both. For purposes of testing, the whole set of hardware and software components is called "System Under Test" (SUT). As SUT is the part of the whole test setup whose performance can be measured with RFC 2544<sup>2</sup>, this document uses SUT instead of the RFC 2544<sup>3</sup> DUT. Device under test (DUT) can be re-introduced when analyzing test results using whitebox techniques, but this document sticks to blackbox testing.

- System Under Test (SUT): System under test (SUT) is a part of the whole test setup whose performance is to be benchmarked. The complete methodology contains other parts, whose performance is either already established, or not affecting the benchmarking result.
- **Bi-directional throughput tests**: involve packets/frames flowing in both east-west and west-east directions over every tested interface of SUT/DUT. Packet flow metrics are measured per direction, and can be reported as aggregate for both directions (i.e. throughput) and/or separately for each measured direction (i.e. latency). In most cases bi-directional tests use the same (symmetric) load in both directions.
- Uni-directional throughput tests: involve packets/frames flowing in only one direction, i.e. either east-west or west-east direction, over every tested interface of SUT/DUT. Packet flow metrics are measured and are reported for measured direction.
- Packet Loss Ratio (PLR): ratio of packets lost relative to packets transmitted over the test trial duration, calculated using formula: PLR = ( pkts\_transmitted - pkts\_received ) / pkts\_transmitted. For bi-directional throughput tests aggregate PLR is calculated based on the aggregate number of packets transmitted and received.
- Packet Throughput Rate: maximum packet offered load DUT/SUT forwards within the specified Packet Loss Ratio (PLR). In many cases the rate depends on the frame size processed by DUT/SUT. Hence packet throughput rate MUST be quoted with specific frame size as received by DUT/SUT during the measurement. For bi-directional tests, packet throughput rate should be reported as aggregate for both directions. Measured in packets-per-second (pps) or frames-per-second (fps), equivalent metrics.
- Bandwidth Throughput Rate: a secondary metric calculated from packet throughput rate using formula: bw\_rate = pkt\_rate \* (frame\_size + L1\_overhead) \* 8, where L1\_overhead for Ethernet includes preamble (8 Bytes) and inter-frame gap (12 Bytes). For bi-directional tests, bandwidth throughput rate should be reported as aggregate for both directions. Expressed in bits-per-second (bps).
- Non Drop Rate (NDR): maximum packet/bandwidth throughput rate sustained by DUT/SUT at PLR equal zero (zero packet loss) specific to tested frame size(s). MUST be quoted with specific packet size as received by DUT/SUT during the measurement. Packet NDR measured in packets-per-second (or fps), bandwidth NDR expressed in bits-per-second (bps).
- Partial Drop Rate (PDR): maximum packet/bandwidth throughput rate sustained by DUT/SUT at PLR greater than zero (non-zero packet loss) specific to tested frame size(s). MUST be quoted with specific packet size as received by DUT/SUT during the measurement. Packet PDR measured in packets-per-second (or fps), bandwidth PDR expressed in bits-per-second (bps).
- Maximum Receive Rate (MRR): packet/bandwidth rate regardless of PLR sustained by DUT/SUT under specified Maximum Transmit Rate (MTR) packet load offered by traffic generator. MUST be quoted with both specific packet size and MTR as received by DUT/SUT during the measurement. Packet MRR measured in packets-per-second (or fps), bandwidth MRR expressed in bits-per-second (bps).
- Trial: a single measurement step.
- Trial duration: amount of time over which packets are transmitted and received in a single measurement step.
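The PLR and bandwidth throughput rate definitions above can be sketched in a few lines. This is a minimal illustration, not CSIT code: the function names are hypothetical, PLR is computed as lost packets over transmitted packets, and the trial counters and packet rate in the usage lines are made-up values.

```python
# Ethernet L1 overhead per frame: preamble (8 Bytes) + inter-frame gap (12 Bytes).
L1_OVERHEAD_BYTES = 8 + 12


def packet_loss_ratio(pkts_transmitted: int, pkts_received: int) -> float:
    """PLR = (pkts_transmitted - pkts_received) / pkts_transmitted."""
    if pkts_transmitted <= 0:
        raise ValueError("pkts_transmitted must be positive")
    return (pkts_transmitted - pkts_received) / pkts_transmitted


def bandwidth_rate_bps(pkt_rate_pps: float, frame_size_bytes: int) -> float:
    """bw_rate = pkt_rate * (frame_size + L1_overhead) * 8, in bits per second."""
    return pkt_rate_pps * (frame_size_bytes + L1_OVERHEAD_BYTES) * 8


# Hypothetical trial: 1,000,000 frames sent, 995,000 received.
plr = packet_loss_ratio(1_000_000, 995_000)
# Hypothetical packet rate of 10 Mpps measured with 64-Byte frames.
bw = bandwidth_rate_bps(10_000_000, 64)
print(f"PLR = {plr:.4f}, bandwidth = {bw / 1e9:.2f} Gbps")
# prints: PLR = 0.0050, bandwidth = 6.72 Gbps
```

For bi-directional tests the same functions would be applied to the aggregate counters and rates of both directions.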

<sup>2</sup> https://tools.ietf.org/html/rfc2544.html

<sup>3</sup> https://tools.ietf.org/html/rfc2544.html

## 1.5.2 VPP Forwarding Modes

VPP is tested in a number of L2 and IP packet lookup and forwarding modes. Within each mode baseline and scale tests are executed, the latter with varying number of lookup entries.

#### **L2 Ethernet Switching**

VPP is tested in three L2 forwarding modes:

- *l2patch*: L2 patch, the fastest point-to-point L2 path that loops packets between two interfaces without any Ethernet frame checks or lookups.
- *l2xc*: L2 cross-connect, point-to-point L2 path with all Ethernet frame checks, but no MAC learning and no MAC lookup.
- *l2bd*: L2 bridge-domain, multipoint-to-multipoint L2 path with all Ethernet frame checks, with MAC learning (unless static MACs are used) and MAC lookup.

l2bd tests are executed in baseline and scale configurations:

- *l2bdbase*: low number of L2 flows (254 per direction) is switched by VPP. They drive the content of MAC FIB size (508 total MAC entries). Both source and destination MAC addresses are incremented on a packet by packet basis.
- *l2bdscale*: high number of L2 flows is switched by VPP. Tested MAC FIB sizes include: i) 10k (5k unique flows per direction), ii) 100k (2x 50k flows) and iii) 1M (2x 500k). Both source and destination MAC addresses are incremented on a packet by packet basis, ensuring new entries are learned, refreshed and looked up at every packet, making it the worst case scenario.
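The relation between flows per direction and MAC FIB size can be illustrated with a small sketch. It assumes that each direction uses its own block of incrementing source MAC addresses and that the destination MACs of one direction are the source MACs of the other. The base addresses and function names are hypothetical and are not taken from the CSIT traffic profiles.

```python
from typing import List


def format_mac(value: int) -> str:
    """Render a 48-bit integer as a colon-separated MAC address."""
    return ":".join(f"{(value >> shift) & 0xFF:02x}" for shift in range(40, -8, -8))


def source_macs(base: int, flows: int) -> List[str]:
    """Source MACs of one direction, incremented by one per packet."""
    return [format_mac(base + offset) for offset in range(flows)]


# Hypothetical, non-overlapping base addresses, one block per direction.
WEST_BASE = 0x02_00_00_00_00_00
EAST_BASE = 0x02_00_00_10_00_00

for name, flows in [("baseline", 254), ("scale 10k", 5_000)]:
    # The bridge-domain learns every source MAC seen in either direction.
    learned = set(source_macs(WEST_BASE, flows)) | set(source_macs(EAST_BASE, flows))
    print(f"{name}: {flows} flows per direction -> {len(learned)} MAC entries")
```

With 254 flows per direction the set holds 508 entries, and with 5k flows per direction it holds 10k entries, matching the FIB sizes quoted above.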

Ethernet wire encapsulations tested include: untagged, dot1q, dot1ad.

#### **IPv4 Routing**

IPv4 routing tests are executed in baseline and scale configurations:

- *ip4base*: low number of IPv4 flows (253 or 254 per direction) is routed by VPP. They drive the content of IPv4 FIB size (506 or 508 total /32 prefixes). Destination IPv4 addresses are incremented on a packet by packet basis.
- *ip4scale*: high number of IPv4 flows is routed by VPP. Tested IPv4 FIB sizes of /32 prefixes include: i) 20k (10k unique flows per direction), ii) 200k (2x 100k flows) and iii) 2M (2x 1M). Destination IPv4 addresses are incremented on a packet by packet basis, ensuring new FIB entries are looked up at every packet, making it the worst case scenario.
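A minimal sketch of the per-packet destination address increment and the resulting FIB size, using Python's standard `ipaddress` module. The starting addresses and function names are hypothetical; the same helper also works for IPv6 addresses, where it yields /128 prefixes.

```python
import ipaddress
from typing import Iterator, List, Union

IPAddress = Union[ipaddress.IPv4Address, ipaddress.IPv6Address]
IPNetwork = Union[ipaddress.IPv4Network, ipaddress.IPv6Network]


def destination_addresses(first: str, flows: int) -> Iterator[IPAddress]:
    """Yield `flows` consecutive destination addresses starting at `first`."""
    start = ipaddress.ip_address(first)
    for offset in range(flows):
        yield start + offset


def host_prefixes(first: str, flows: int) -> List[IPNetwork]:
    """Host routes (/32 for IPv4, /128 for IPv6) needed for one direction."""
    return [
        ipaddress.ip_network(f"{addr}/{addr.max_prefixlen}")
        for addr in destination_addresses(first, flows)
    ]


# Hypothetical starting addresses, one range per direction.
west = host_prefixes("10.0.0.1", 253)
east = host_prefixes("20.0.0.1", 253)
fib = set(west) | set(east)
print(len(fib), west[0], west[-1])
# prints: 506 10.0.0.1/32 10.0.0.253/32
```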

#### **IPv6 Routing**

IPv6 routing tests are executed in baseline and scale configurations:

- *ip6base*: low number of IPv6 flows (253 or 254 per direction) is routed by VPP. They drive the content of IPv6 FIB size (506 or 508 total /128 prefixes). Destination IPv6 addresses are incremented on a packet by packet basis.
- *ip6scale*: high number of IPv6 flows is routed by VPP. Tested IPv6 FIB sizes of /128 prefixes include: i) 20k (10k unique flows per direction), ii) 200k (2x 100k flows) and iii) 2M (2x 1M). Destination IPv6 addresses are incremented on a packet by packet basis, ensuring new FIB entries are looked up at every packet, making it the worst case scenario.

#### **SRv6 Routing**

SRv6 routing tests are executed in a number of baseline configurations. In each case, SR policy and steering policy are configured for one direction, and one (or two) SR behaviours (functions) for the other direction:

- srv6enc1sid: One SID (no SRH present), one SR function End.
- srv6enc2sids: Two SIDs (SRH present), two SR functions End and End.DX6.
- srv6enc2sids-nodecaps: Two SIDs (SRH present) without decapsulation, one SR function End.
- srv6proxy-dyn: Dynamic SRv6 proxy, one SR function End.AD.
- srv6proxy-masq: Masquerading SRv6 proxy, one SR function End.AM.
- srv6proxy-stat: Static SRv6 proxy, one SR function End.AS.

In all listed cases low number of IPv6 flows (253 per direction) is routed by VPP.

## 1.5.3 Tunnel Encapsulations

Tunnel encapsulations testing is grouped based on the type of outer header: IPv4 or IPv6.

#### **IPv4 Tunnels**

VPP is tested in the following IPv4 tunnel baseline configurations:

- ip4vxlan-l2bdbase: VXLAN over IPv4 tunnels with L2 bridge-domain MAC switching.
- ip4vxlan-l2xcbase: VXLAN over IPv4 tunnels with L2 cross-connect.
- ip4lispip4-ip4base: LISP over IPv4 tunnels with IPv4 routing.
- ip4lispip6-ip6base: LISP over IPv4 tunnels with IPv6 routing.

In all cases listed above low number of MAC, IPv4, IPv6 flows (253 or 254 per direction) is switched or routed by VPP.

In addition selected IPv4 tunnels are tested at scale:

- dot1q-ip4vxlanscale-l2bd: VXLAN over IPv4 tunnels with L2 bridge-domain MAC switching, with scaled up dot1q VLANs (10, 100, 1k), mapped to scaled up L2 bridge-domains (10, 100, 1k), that are in turn mapped to (10, 100, 1k) VXLAN tunnels. 64.5k flows are transmitted per direction.

#### **IPv6 Tunnels**

VPP is tested in the following IPv6 tunnel baseline configurations:

- ip6lispip4-ip4base: LISP over IPv6 tunnels with IPv4 routing.
- ip6lispip6-ip6base: LISP over IPv6 tunnels with IPv6 routing.

In all cases listed above low number of IPv4, IPv6 flows (253 or 254 per direction) is routed by VPP.

## 1.5.4 VPP Features

VPP is tested in a number of data plane feature configurations across different forwarding modes. Following sections list features tested.

#### **ACL Security-Groups**

Both stateless and stateful access control lists (ACL), also known as security-groups, are supported by VPP.

Following ACL configurations are tested for MAC switching with L2 bridge-domains:

- l2bdbasemaclrn-iacl{E}sl-{F}flows: Input stateless ACL, with {E} entries and {F} flows.
- l2bdbasemaclrn-oacl{E}sl-{F}flows: Output stateless ACL, with {E} entries and {F} flows.
- l2bdbasemaclrn-iacl{E}sf-{F}flows: Input stateful ACL, with {E} entries and {F} flows.
- l2bdbasemaclrn-oacl{E}sf-{F}flows: Output stateful ACL, with {E} entries and {F} flows.

Following ACL configurations are tested with IPv4 routing:

- ip4base-iacl{E}sl-{F}flows: Input stateless ACL, with {E} entries and {F} flows.
- ip4base-oacl{E}sl-{F}flows: Output stateless ACL, with {E} entries and {F} flows.
- ip4base-iacl{E}sf-{F}flows: Input stateful ACL, with {E} entries and {F} flows.
- ip4base-oacl{E}sf-{F}flows: Output stateful ACL, with {E} entries and {F} flows.

ACL tests are executed with the following combinations of ACL entries and number of flows:

- ACL entry definitions
  - flow non-matching deny entry: (src-ip4, dst-ip4, src-port, dst-port).
  - flow matching permit ACL entry: (src-ip4, dst-ip4).
- {E} number of non-matching deny ACL entries, {E} = [1, 10, 50].
- {F} number of UDP flows with different tuple (src-ip4, dst-ip4, src-port, dst-port), {F} = [100, 10k, 100k].
- All {E}x{F} combinations are tested per ACL type, total of 9.
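The {E}x{F} expansion above can be sketched in a few lines of Python. This is only an illustration: the helper name `acl_test_names` is hypothetical, and the way {E} and {F} values are spelled inside real CSIT test names is assumed to match the values quoted in the list.

```python
from itertools import product

ENTRIES = ["1", "10", "50"]       # {E}: number of non-matching deny ACL entries
FLOWS = ["100", "10k", "100k"]    # {F}: number of UDP flows


def acl_test_names(template):
    """Expand the {E} and {F} placeholders into every tested combination."""
    return [
        template.replace("{E}", entries).replace("{F}", flows)
        for entries, flows in product(ENTRIES, FLOWS)
    ]


names = acl_test_names("ip4base-iacl{E}sl-{F}flows")
print(len(names))
for name in names:
    print(name)
```

The same helper applies to any of the ACL templates listed in this section, since each of them uses the same two placeholders.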

#### **ACL MAC-IP**

MAC-IP binding ACLs are tested for MAC switching with L2 bridge-domains:

- l2bdbasemaclrn-macip-iacl{E}sl-{F}flows: Input stateless ACL, with {E} entries and {F} flows.

MAC-IP ACL tests are executed with the following combinations of ACL entries and number of flows:

- ACL entry definitions
  - flow non-matching deny entry: (dst-ip4, dst-mac, bit-mask)
  - flow matching permit ACL entry: (dst-ip4, dst-mac, bit-mask)
- {E} number of non-matching deny ACL entries, {E} = [1, 10, 50]
- {F} number of UDP flows with different tuple (dst-ip4, dst-mac), {F} = [100, 10k, 100k]
- All {E}x{F} combinations are tested per ACL type, total of 9.

#### NAT44

NAT44 is tested in baseline and scale configurations with IPv4 routing:

- ip4base-nat44: baseline test with single NAT entry (addr, port), single UDP flow.
- ip4base-udpsrcscale{U}-nat44: baseline test with {U} NAT entries (addr, {U}ports), {U}=15.
- ip4scale{R}-udpsrcscale{U}-nat44: scale tests with {R}\*{U} NAT entries ({R}addr, {U}ports), {R}=[100, 1k, 2k, 4k], {U}=15.

## 1.5.5 Data Plane Throughput

#### **Data Plane Throughput Tests**

Network data plane throughput is measured using multiple test methods in order to obtain representative and repeatable results across the large set of performance test cases implemented and executed within CSIT.

Following throughput test methods are used:

- MLRsearch - Multiple Loss Ratio search
- MRR - Maximum Receive Rate
- PLRsearch - Probabilistic Loss Ratio search

Description of each test method is followed by generic test properties shared by all methods.

#### **MLRsearch Tests**

#### **Description**

Multiple Loss Ratio search (MLRsearch) tests discover multiple packet throughput rates in a single search, reducing the overall test execution time compared to a binary search. Each rate is associated with a distinct Packet Loss Ratio (PLR) criterion. In FD.io CSIT two throughput rates are discovered: Non-Drop Rate (NDR, with zero packet loss, PLR=0) and Partial Drop Rate (PDR, with PLR<0.5%). MLRsearch is compliant with RFC 2544<sup>4</sup>.

#### **Usage**

MLRsearch tests are run to discover NDR and PDR rates for each VPP and DPDK release covered by CSIT report. Results for small frame sizes (64B/78B, IMIX) are presented in packet throughput graphs (Box-and-Whisker Plots) with NDR and PDR rates plotted against the test cases covering popular VPP packet paths.

Each test is executed at least 10 times to verify measurement repeatability, and results are compared between releases and test environments. NDR and PDR packet and bandwidth throughput results for all frame sizes and for all tests are presented in detailed results tables.

#### **Details**

See *MLRsearch Tests* (page 21) section for more detail. MLRsearch is being standardized in IETF in draft-vpolak-mkonstan-mlrsearch<sup>5</sup>.

#### **MRR Tests**

#### Description

Maximum Receive Rate (MRR) tests are complementary to MLRsearch tests, as they provide a maximum "raw" throughput benchmark for development and testing community.

MRR tests measure the packet forwarding rate under the maximum load offered by traffic generator (dependent on link type and NIC model) over a set trial duration, regardless of packet loss. Maximum load for specified Ethernet frame size is set to the bi-directional link rate.

<sup>4</sup> https://tools.ietf.org/html/rfc2544.html

<sup>5</sup> https://tools.ietf.org/html/draft-vpolak-mkonstan-bmwg-mlrsearch

#### **Usage**

MRR tests are much faster than MLRsearch as they rely on a single trial or a small set of trials with very short duration. It is this property that makes them suitable for continuous execution in daily performance trending jobs enabling detection of performance anomalies (regressions, progressions) resulting from data plane code changes.

MRR tests are also used for VPP per patch performance jobs verifying patch performance vs. parent. CSIT reports include MRR throughput comparisons between releases and test environments, for small frame sizes only (64B/78B, IMIX).

#### **Details**

See MRR Throughput (page 21) section for more detail about MRR tests configuration.

FD.io CSIT performance dashboard includes complete description of daily performance trending tests<sup>6</sup> and VPP per patch tests<sup>7</sup>.

#### **PLRsearch Tests**

#### **Description**

Probabilistic Loss Ratio search (PLRsearch) tests discover a packet throughput rate associated with a configured Packet Loss Ratio (PLR) criterion for tests run over an extended period of time, a.k.a. soak testing. PLRsearch assumes that the system under test is probabilistic in nature, and not deterministic.

#### **Usage**

PLRsearch tests are run to discover a sustained throughput for PLR=10^-7 (close to NDR) for each VPP release covered by CSIT report. Results for small frame sizes (64B/78B) are presented in packet throughput graphs (Box Plots) for a small subset of baseline tests.

Each soak test lasts 30 minutes and is executed at least twice. Results are compared against NDR and PDR rates discovered with MLRsearch.

#### **Details**

See *PLRsearch* (page 22) methodology section for more detail. PLRsearch is being standardized in IETF in draft-vpolak-bmwg-plrsearch<sup>8</sup>.

#### **Generic Test Properties**

All data plane throughput test methodologies share following generic properties:

- Tested L2 frame sizes (untagged Ethernet):
  - IPv4 payload: 64B, IMIX (28x64B, 16x570B, 4x1518B), 1518B, 9000B.
  - IPv6 payload: 78B, IMIX (28x78B, 16x570B, 4x1518B), 1518B, 9000B.
  - All quoted sizes include frame CRC, but exclude per frame transmission overhead of 20B (preamble, inter frame gap).

- Offered packet load is always bi-directional and symmetric.
- All measured and reported packet and bandwidth rates are aggregate bi-directional rates reported from external Traffic Generator perspective.

<sup>6</sup> https://docs.fd.io/csit/master/trending/methodology/performance\_tests.html

<sup>7</sup> https://docs.fd.io/csit/master/trending/methodology/perpatch\_performance\_tests.html

<sup>8</sup> https://tools.ietf.org/html/draft-vpolak-bmwg-plrsearch
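As a worked example of the frame size properties listed above, the following sketch computes the average frame size of each IMIX profile and converts a packet rate into bandwidth with the 20B per frame overhead added. The helper names are hypothetical, and whether a given CSIT table reports bandwidth with or without that overhead is not stated here, so the conversion is only an illustration.

```python
L1_OVERHEAD_BYTES = 20  # preamble and inter frame gap, per frame

# (frame count, frame size in bytes including CRC) for each IMIX profile
IMIX_IPV4 = [(28, 64), (16, 570), (4, 1518)]
IMIX_IPV6 = [(28, 78), (16, 570), (4, 1518)]


def average_frame_size(mix):
    """Return the average frame size in bytes for a weighted frame mix."""
    total_frames = sum(count for count, _ in mix)
    total_bytes = sum(count * size for count, size in mix)
    return total_bytes / total_frames


def l1_bandwidth_bps(packet_rate_pps, frame_size):
    """Convert a packet rate to bits per second on the wire."""
    return packet_rate_pps * (frame_size + L1_OVERHEAD_BYTES) * 8


avg_ipv4 = average_frame_size(IMIX_IPV4)
avg_ipv6 = average_frame_size(IMIX_IPV6)
print(f"IMIX IPv4 average frame size: {avg_ipv4:.2f} B")
print(f"IMIX IPv6 average frame size: {avg_ipv6:.2f} B")
print(f"1 Mpps of IPv4 IMIX: {l1_bandwidth_bps(1e6, avg_ipv4) / 1e9:.3f} Gbps")
```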

#### **MLRsearch Tests**

#### Overview

Multiple Loss Ratio search (MLRsearch) tests use a new search algorithm implemented in FD.io CSIT project. MLRsearch discovers multiple packet throughput rates in a single search, with each rate associated with a different Packet Loss Ratio (PLR) criterion.

Two throughput measurements used in FD.io CSIT are Non-Drop Rate (NDR, with zero packet loss, PLR=0) and Partial Drop Rate (PDR, with packet loss rate not greater than the configured non-zero PLR).
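The two criteria can be expressed as simple predicates over the counters of a single trial. This is a minimal sketch, not the MLRsearch implementation: the function names are hypothetical, and the default PLR of 0.5% is taken from the PDR description earlier in this document.

```python
def loss_ratio(packets_sent, packets_received):
    """Return the packet loss ratio of one trial."""
    return (packets_sent - packets_received) / packets_sent


def meets_ndr(packets_sent, packets_received):
    """NDR criterion: zero packet loss (PLR=0)."""
    return packets_sent == packets_received


def meets_pdr(packets_sent, packets_received, plr=0.005):
    """PDR criterion: loss ratio not greater than the configured PLR."""
    return loss_ratio(packets_sent, packets_received) <= plr


print(meets_ndr(1_000_000, 1_000_000))
print(meets_pdr(1_000_000, 996_000))
print(meets_pdr(1_000_000, 990_000))
```

A search algorithm then looks for the highest offered load whose trials still satisfy each predicate.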

MLRsearch discovers NDR and PDR in a single pass, reducing the required time compared to separate binary searches for NDR and PDR. Overall search time is reduced even further by relying on shorter trial durations of intermediate steps, with only the final measurements conducted at the specified final trial duration. This results in a shorter overall execution time when compared to standard NDR/PDR binary search, while guaranteeing similar results.

If needed, the next version of MLRsearch can be easily adapted to discover more throughput rates with different pre-defined PLRs.

**Note:** All throughput rates are *always* bi-directional aggregates of two equal (symmetric) uni-directional packet rates received and reported by an external traffic generator.

#### **Search Implementation**

Detailed description of the MLRsearch algorithm is included in the IETF draft draft-vpolak-mkonstan-mlrsearch<sup>9</sup> that is in the process of being standardized in the IETF Benchmarking Methodology Working Group (BMWG).

MLRsearch is also available as a PyPI (Python Package Index) library<sup>10</sup>.

#### **Implementation Deviations**

FD.io CSIT implementation of MLRsearch is so far fully based on the -02 version of the draft, draft-vpolak-mkonstan-mlrsearch-02<sup>11</sup>.

## **MRR Throughput**

Maximum Receive Rate (MRR) tests are complementary to MLRsearch tests, as they provide a maximum "raw" throughput benchmark for development and testing community. MRR tests measure the packet forwarding rate under the maximum load offered by traffic generator over a set trial duration, regardless of packet loss.

MRR tests are currently used for following test jobs:

- Report performance comparison: 64B, IMIX for vhost, memif.
- Daily performance trending: 64B, IMIX for vhost, memif.
- Per-patch performance verification: 64B.
- Initial iterations of MLRsearch and PLRsearch: 64B.

<sup>9</sup> https://tools.ietf.org/html/draft-vpolak-mkonstan-bmwg-mlrsearch

<sup>10</sup> https://pypi.org/project/MLRsearch/

<sup>11</sup> https://tools.ietf.org/html/draft-vpolak-mkonstan-bmwg-mlrsearch-02

Maximum offered load for specific L2 Ethernet frame size is set to either the maximum bi-directional link rate or tested NIC model capacity, as follows:

- For 10GE NICs the maximum packet rate load is 2x14.88 Mpps for 64B, a 10GE bi-directional link rate.
- For 25GE NICs the maximum packet rate load is 2x18.75 Mpps for 64B, a 25GE bi-directional link sub-rate limited by the 25GE NIC used on TRex TG, XXV710.
- For 40GE NICs the maximum packet rate load is 2x18.75 Mpps for 64B, a 40GE bi-directional link sub-rate limited by the 40GE NIC used on TRex TG, XL710. Packet rate for other tested frame sizes is limited by the PCIe Gen3 x8 bandwidth limit of ~50Gbps.
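The 10GE figure follows directly from the link speed and the per frame overhead of 20B quoted in the generic test properties. The sketch below reproduces that arithmetic; the function name is hypothetical. It also shows that the theoretical 25GE rate for 64B frames is well above 18.75 Mpps, which is consistent with the 25GE and 40GE loads being NIC-limited rather than link-limited.

```python
L1_OVERHEAD_BYTES = 20  # preamble and inter frame gap, per frame


def link_rate_pps(link_bps, frame_size_bytes):
    """Return the maximum uni-directional packet rate of an Ethernet link."""
    bits_per_frame = (frame_size_bytes + L1_OVERHEAD_BYTES) * 8
    return link_bps / bits_per_frame


rate_10ge = link_rate_pps(10e9, 64)
rate_25ge = link_rate_pps(25e9, 64)
print(f"10GE, 64B: {rate_10ge / 1e6:.2f} Mpps per direction")
print(f"25GE, 64B: {rate_25ge / 1e6:.2f} Mpps per direction (theoretical)")
```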

MRR test code implements multiple bursts of offered packet load and has two configurable burst parameters: individual trial duration and number of trials in a single burst. This enables more precise performance trending by providing more results data for analysis.

Burst parameter settings vary between different tests using MRR:

- MRR individual trial duration:
  - Report performance comparison: 1 sec.
  - Daily performance trending: 1 sec.
  - Per-patch performance verification: 10 sec.
  - Initial iteration for MLRsearch: 1 sec.
  - Initial iteration for PLRsearch: 5.2 sec.
- Number of MRR trials per burst:
  - Report performance comparison: 10.
  - Daily performance trending: 10.
  - Per-patch performance verification: 5.
  - Initial iteration for MLRsearch: 1.
  - Initial iteration for PLRsearch: 1.

#### **PLRsearch**

## **Motivation for PLRsearch**

Network providers are interested in throughput a system can sustain.

RFC 2544<sup>12</sup> assumes loss ratio is given by a deterministic function of offered load. But NFV software systems are not deterministic enough. As a result, deterministic algorithms (such as binary search<sup>13</sup> per RFC 2544 and MLRsearch with a single trial) return results which, when repeated, show relatively high standard deviation, making it harder to tell what "the throughput" actually is.

We need another algorithm, which takes this indeterminism into account.

#### **Generic Algorithm**

Detailed description of the PLRsearch algorithm is included in the IETF draft draft-vpolak-bmwg-plrsearch-02<sup>14</sup> that is in the process of being standardized in the IETF Benchmarking Methodology Working Group (BMWG).

<sup>12</sup> https://tools.ietf.org/html/rfc2544

<sup>13</sup> https://en.wikipedia.org/wiki/Binary\_search\_algorithm

<sup>14</sup> https://tools.ietf.org/html/draft-vpolak-bmwg-plrsearch-02

#### **Terms**

The rest of this page assumes the reader is familiar with the following terms defined in the IETF draft:

- Trial Order Independent System
- Duration Independent System
- Target Loss Ratio
- Critical Load
- Offered Load regions
  - Zero Loss Region
  - Non-Deterministic Region
  - Guaranteed Loss Region
- Fitting Function
  - Stretch Function
  - Erf Function
- Bayesian Inference
  - Prior distribution
  - Posterior Distribution
- Numeric Integration
  - Monte Carlo
  - Importance Sampling

#### **FD.io CSIT Implementation Specifics**

The search receives min\_rate and max\_rate values, to avoid measurements at offered loads not supported by the traffic generator.

The implemented test cases use bidirectional traffic. The algorithm stores each rate as bidirectional rate (internally, the algorithm is agnostic to flows and directions, it only cares about aggregate counts of packets sent and packets lost), but debug output from traffic generator lists unidirectional values.

In a sample implementation in FD.io CSIT project, there is roughly 0.5 second delay between trials due to restrictions imposed by the packet traffic generator in use (T-Rex).

As measurement results come in, posterior distribution computation takes more time (per sample), although there is a considerable constant part (mostly for inverting the fitting functions).

Also, the integrator needs a fair amount of samples to reach the region the posterior distribution is concentrated at.

And of course, the speed of the integrator depends on computing power of the CPU the algorithm is able to use.

All those timing related effects are addressed by arithmetically increasing trial durations with configurable coefficients (currently 5.1 seconds for the first trial, each subsequent trial being 0.1 second longer).
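The arithmetic progression of trial durations can be sketched as follows. The coefficients (5.1 seconds, 0.1 second step) and the roughly 0.5 second delay between trials come from the text above; the helper names are hypothetical, and the count ignores any time spent outside trials and delays, so it is only an upper-bound style estimate for a 30 minute soak test.

```python
def trial_duration(index, first=5.1, increment=0.1):
    """Return the duration in seconds of the trial with a zero-based index."""
    return first + increment * index


def trials_within(budget_seconds, delay_seconds=0.5):
    """Count whole trials (each followed by a delay) fitting into the budget."""
    elapsed = 0.0
    count = 0
    while elapsed + trial_duration(count) + delay_seconds <= budget_seconds:
        elapsed += trial_duration(count) + delay_seconds
        count += 1
    return count


print([round(trial_duration(index), 1) for index in range(3)])
print(trials_within(30 * 60))
```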

In order to avoid rounding errors and underflows, the current implementation tracks natural logarithm (instead of the original quantity) for any quantity which is never negative. Logarithm of zero is minus infinity (not supported by Python), so special value "None" is used instead. Specific functions for frequent operations (such as "logarithm of sum of exponentials") are defined to handle None correctly.
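A "logarithm of sum of exponentials" helper that treats None as the logarithm of zero could look like the sketch below. The function name and exact structure are illustrative assumptions, not necessarily those of the CSIT code.

```python
import math


def log_plus(first, second):
    """Return log(exp(first) + exp(second)), where None stands for log(0)."""
    if first is None:
        return second
    if second is None:
        return first
    # Factor out the larger argument so the exponential cannot overflow.
    if second > first:
        first, second = second, first
    return first + math.log1p(math.exp(second - first))


print(log_plus(None, None))
print(log_plus(None, -3.0))
print(log_plus(math.log(2.0), math.log(3.0)), math.log(5.0))
print(log_plus(-1000.0, -1000.0))
```

The last call shows the benefit: exp(-1000) underflows to zero in floating point, yet the sum is still tracked accurately in the logarithmic domain.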

Current implementation uses two fitting functions, called "stretch" and "erf". In general, their estimates for critical rate differ, which adds a simple source of systematic error, on top of randomness error reported by integrator. Otherwise the reported stdev of critical rate estimate is unrealistically low.

Both functions are not only increasing, but also convex (meaning the rate of increase is also increasing).

Both fitting functions have several mathematically equivalent formulas, each of which can lead to an arithmetic overflow or underflow in different sub-terms. Overflows can be eliminated by using different exact formulas for different argument ranges. Underflows can be avoided by using approximate formulas in affected argument ranges; such ranges have their own formulas to compute. In the end, both fitting function implementations contain multiple "if" branches, so discontinuities are a possibility at range boundaries.

The numeric integrator expects all the parameters to be distributed (independently and) uniformly on an interval (-1, 1).

As both "mrr" and "spread" parameters are positive and not dimensionless, a transformation is needed. Dimensionality is inherited from max\_rate value.

The "mrr" parameter follows a Lomax distribution<sup>15</sup> with alpha equal to one, but shifted so that mrr is always greater than 1 packet per second.

The "stretch" parameter is generated simply as the "mrr" value raised to a random power between zero and one; thus it follows a reciprocal distribution<sup>16</sup>.
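One possible transformation from integrator coordinates in (-1, 1) to the two positive parameters is sketched below. The function names are hypothetical, and the Lomax scale of max\_rate / 2 is an assumption chosen for illustration; the text above only fixes alpha equal to one, the shift above 1 packet per second, and the random power between zero and one.

```python
import math


def mrr_from_uniform(x_mrr, max_rate):
    """Map x in (-1, 1) to a shifted Lomax(alpha=1) value above 1 pps.

    With u = (x + 1) / 2 uniform on (0, 1), the unshifted value
    max_rate * (1 / (2 * u) - 1 / 2) exceeds y with probability
    1 / (1 + 2 * y / max_rate), i.e. Lomax with alpha=1 and an
    assumed scale of max_rate / 2.
    """
    return max_rate * (1.0 / (x_mrr + 1.0) - 0.5) + 1.0


def spread_from_uniform(x_spread, mrr):
    """Raise mrr to a power uniform on (0, 1): a reciprocal distribution."""
    power = (x_spread + 1.0) / 2.0
    return math.exp(power * math.log(mrr))


for x in (-0.9, 0.0, 0.9):
    mrr = mrr_from_uniform(x, 1e7)
    print(x, mrr, spread_from_uniform(x, mrr))
```

Computing the second parameter through the logarithm of "mrr" fits the log-tracking approach described above.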

After a few measurements, the posterior distribution of fitting function arguments gets quite concentrated into a small area. The integrator is using Monte Carlo<sup>17</sup> with importance sampling<sup>18</sup> where the biased distribution is bivariate Gaussian<sup>19</sup> distribution, with deliberately larger variance. If the generated sample falls outside (-1, 1) interval, another sample is generated.

The center and the covariance matrix for the biased distribution are based on the first and second moments of samples seen so far (within the computation). The center is used directly, the covariance matrix is scaled up by a heuristic constant (8.0 by default). The following additional features are designed to avoid hyper-focused distributions.

Each computation starts with the biased distribution inherited from the previous computation (zero point and unit covariance matrix is used in the first computation), but the overall weight of the data is set to the weight of the first sample of the computation. Also, the center is set to the first sample point. When additional samples come, their weight (including the importance correction) is compared to the sum of the weights of data seen so far (within the iteration). If the new sample is more than one e-fold more impactful, both weight values (for data so far and for the new sample) are set to the (geometric) average of the two weights.

This combination showed the best behavior, as the integrator usually follows two phases. First phase (where inherited biased distribution or single big sample are dominating) is mainly important for locating the new area the posterior distribution is concentrated at. The second phase (dominated by whole sample population) is actually relevant for the critical rate estimation.
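The core idea of the integrator, Monte Carlo with importance sampling from a widened Gaussian that is regenerated when it falls outside (-1, 1), can be illustrated with a much simplified sketch. Everything here is an assumption made for illustration: the covariance is diagonal, the center and width are fixed rather than adapted from sample moments, the weight-averaging features are omitted, and the `peak` function merely stands in for a concentrated posterior.

```python
import math
import random


def gauss_cdf(value, center, sigma):
    return 0.5 * (1.0 + math.erf((value - center) / (sigma * math.sqrt(2.0))))


def gauss_pdf(value, center, sigma):
    z = (value - center) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def sample_inside(center, sigma, rng):
    """Draw from a Gaussian, generating another sample if outside (-1, 1)."""
    while True:
        value = rng.gauss(center, sigma)
        if -1.0 < value < 1.0:
            return value


def integrate(func, center, sigma, samples, rng):
    """Estimate the integral of func over (-1, 1)^2 by importance sampling."""
    # Regenerating outside samples truncates the Gaussian, so its density
    # has to be renormalized by the probability mass inside the square.
    mass = 1.0
    for c, s in zip(center, sigma):
        mass *= gauss_cdf(1.0, c, s) - gauss_cdf(-1.0, c, s)
    total = 0.0
    for _ in range(samples):
        point = [sample_inside(c, s, rng) for c, s in zip(center, sigma)]
        density = 1.0 / mass
        for v, c, s in zip(point, center, sigma):
            density *= gauss_pdf(v, c, s)
        total += func(*point) / density  # importance correction
    return total / samples


def peak(x, y, width=0.05):
    """Narrow unnormalized bump standing in for a concentrated posterior."""
    return math.exp(-((x - 0.3) ** 2 + (y + 0.2) ** 2) / (2.0 * width ** 2))


rng = random.Random(42)
wide = 0.05 * math.sqrt(8.0)  # variance scaled up by 8.0
estimate = integrate(peak, (0.3, -0.2), (wide, wide), 20000, rng)
print(estimate, 2.0 * math.pi * 0.05 ** 2)
```

The second printed value is the analytic integral of the bump over the whole plane, which is practically equal to the integral over the square because the bump is narrow.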

The first two measurements are hardcoded to happen at the middle of the rate interval and at max\_rate. The next two measurements follow MRR-like logic: the offered load is decreased so that it would reach the target loss ratio if an offered load decrease led to an equal decrease of loss rate.
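The MRR-like step can be derived as follows. If lowering the offered load by some amount lowers the loss rate by the same amount, the receive rate r stays constant, and the new load L must satisfy (L - r) / L = target loss ratio, which gives L = r / (1 - target). The sketch below applies that formula; the function name is hypothetical, and clamping to the min\_rate and max\_rate bounds is an assumption based on the search receiving those bounds.

```python
def mrr_like_next_load(offered_load, loss_rate, target_loss_ratio,
                       min_rate, max_rate):
    """Return the next offered load assuming a constant receive rate."""
    receive_rate = offered_load - loss_rate
    candidate = receive_rate / (1.0 - target_loss_ratio)
    return max(min_rate, min(max_rate, candidate))


next_load = mrr_like_next_load(
    offered_load=10e6, loss_rate=1e6, target_loss_ratio=1e-7,
    min_rate=1e4, max_rate=20e6,
)
# Under the stated assumption, the loss ratio at the new load hits the target.
predicted_ratio = (next_load - 9e6) / next_load
print(next_load, predicted_ratio)
```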

The rest of the measurements start directly in between the erf and stretch estimate averages. There is one workaround implemented, aimed at reducing the number of consecutive zero loss measurements (per fitting function). The workaround first stores every measurement result whose loss ratio was the target loss ratio or higher. A sorted list (called lossy loads) of such results is maintained.

When a sequence of one or more zero loss measurement results is encountered, the smallest of lossy loads is drained from the list. If the estimate average is smaller than the drained value, a weighted average of this estimate and the drained value is used as the next offered load. The weight of the estimate decreases exponentially with the length of consecutive zero loss results.

<sup>15</sup> https://en.wikipedia.org/wiki/Lomax\_distribution

<sup>16</sup> https://en.wikipedia.org/wiki/Reciprocal\_distribution

<sup>17</sup> https://en.wikipedia.org/wiki/Monte\_Carlo\_integration

<sup>18</sup> https://en.wikipedia.org/wiki/Importance\_sampling

<sup>19</sup> https://en.wikipedia.org/wiki/Multivariate\_normal\_distribution

This behavior helps the algorithm with convergence speed, as it does not need so many zero loss results to get near the critical region. Using the smallest (not drained yet) of lossy loads ensures the new offered load is unlikely to land in the big loss region. Draining even if the estimate is large enough helps to discard early measurements when loss happened at too low offered load. Current implementation adds 4 copies of lossy loads and drains 3 of them, which leads to fairly stable behavior even for somewhat inconsistent SUTs.
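The lossy loads workaround can be sketched as below. This is a simplified illustration under stated assumptions: the function name is hypothetical, the estimate weight is assumed to halve with each consecutive zero loss result (the text only says it decreases exponentially), and the detail of adding 4 copies and draining 3 is left out.

```python
def select_offered_load(estimate_average, lossy_loads, zero_loss_streak):
    """Pick the next offered load, mutating the sorted lossy_loads list.

    lossy_loads is sorted ascending and holds loads whose measured loss
    ratio was at or above the target loss ratio.
    """
    if zero_loss_streak <= 0 or not lossy_loads:
        return estimate_average
    # Drain the smallest lossy load even when the estimate is large enough.
    drained = lossy_loads.pop(0)
    if estimate_average >= drained:
        return estimate_average
    weight = 0.5 ** zero_loss_streak  # assumed exponential decay base
    return weight * estimate_average + (1.0 - weight) * drained


lossy = [12.0, 15.0]
print(select_offered_load(10.0, lossy, 0), lossy)
print(select_offered_load(10.0, lossy, 1), lossy)
print(select_offered_load(10.0, lossy, 2), lossy)
```

Longer zero loss streaks pull the next offered load closer to the drained lossy load, which is how the search climbs toward the critical region faster.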

As high loss count measurements add many bits of information, they need a large amount of small loss count measurements to balance them, making the algorithm converge quite slowly. Typically, this happens when a few initial measurements suggest a spread much bigger than later measurements do. The workaround in offered load selection helps, but more intelligent workarounds could get faster convergence still.

Some systems evidently do not follow the assumption of repeated measurements having the same average loss rate (when the offered load is the same). The idea of estimating the trend is not implemented at all, as the observed trends have varied characteristics.

Probably, using more realistic fitting functions will give better estimates than trend analysis.

#### **Bottom Line**

The notion of Throughput is easy to grasp, but it is harder to measure with any accuracy for non-deterministic systems.

Even though the notion of critical rate is harder to grasp than the notion of throughput, it is easier to measure using probabilistic methods.

In testing, the difference between throughput measurements and critical rate measurements is usually small, see *Soak Tests vs. NDR Tests* (page 521).

In practice, rules of thumb such as "send at max 95% of purported throughput" are common. The correct benchmarking analysis should ask "Which notion is 95% of throughput an approximation to?" before attempting to answer "Is 95% of critical rate safe enough?".

#### **Algorithmic Analysis**

While the estimation computation is based on hard probability science, the offered load selection part of PLRsearch logic is pure heuristics, motivated by what a human would do based on measurement and computation results.

The quality of any heuristic is not affected by soundness of its motivation, just by its ability to achieve the intended goals. In case of offered load selection, the goal is to help the search to converge to the long duration estimates sooner.

But even those long duration estimates could still be of poor quality. Even though the estimate computation is Bayesian (so it is the best it could be within the applied assumptions), it can still be of poor quality when compared to what a human would estimate.

One possible source of poor quality is the randomness inherently present in Monte Carlo numeric integration, but that can be suppressed by tweaking the time related input parameters.

The most likely source of poor quality then are the assumptions. Most importantly, the number and the shape of fitting functions; but also others, such as trial order independence and duration independence.

The result can have poor quality in basically two ways. One way is related to location. Both upper and lower bounds can be overestimates or underestimates, meaning the entire estimated interval between lower bound and upper bound lies above or below (respectively) the human-estimated interval. The other way is related to the estimation interval width. The interval can be too wide or too narrow, compared to human estimation.

An estimate from a particular fitting function can be classified as an overestimate (or underestimate) just by looking at time evolution (without a human examining measurement results). Overestimates decrease over time, underestimates increase over time (assuming the system performance stays constant).

Quality of the width of the estimation interval needs human evaluation, and is unrelated to both rate of narrowing (both good and bad estimate intervals get narrower at approximately the same relative rate) and relative width (depends heavily on the system being tested).

The following pictures show the upper (red) and lower (blue) bound, as well as average of Stretch (pink) and Erf (light green) estimate, and offered load chosen (grey), as computed by PLRsearch, after each trial measurement within the 30 minute duration of a test run.

Both graphs are focusing on later estimates. Estimates computed from few initial measurements are wildly off the y-axis range shown.

The following analysis will rely on frequency of zero loss measurements and magnitude of loss ratio if nonzero.

The offered load selection strategy used implies zero loss measurements can be gleaned from the graph by looking at offered load points. When the points move up farther from lower estimate, it means the previous measurement had zero loss. After non-zero loss, the offered load starts again right between (the previous values of) the estimate curves.

The very big loss ratio results are visible as noticeable jumps of both estimates downwards. Medium and small loss ratios are much harder to distinguish just by looking at the estimate curves, so the analysis is based on raw loss ratio measurement results.

The following descriptions should explain why the graphs seem to signal low quality estimate at first sight, but a more detailed look reveals the quality is good (considering the measurement results).

#### L2 patch

Both fitting functions give similar estimates, the graph shows "stochasticity" of measurements (estimates increase and decrease within small time regions), and an overall trend of decreasing estimates.

At first look, the final interval looks fairly narrow, especially compared to the region the estimates have travelled during the search. But a look at the frequency of zero loss results shows this is not a case of overestimation. Measurements at around the same offered load have higher probability of zero loss earlier (when performed farther from upper bound), but smaller probability later (when performed closer to upper bound). That means it is the performance of the system under test that decreases (slightly) over time.

With that in mind, the apparent narrowness of the interval is not a sign of low quality, just a consequence of PLRsearch assuming the performance stays constant.



#### **Vhost**

This test case shows what looks like a quite broad estimation interval, compared to other test cases with similarly looking zero loss frequencies. Notable features are infrequent high-loss measurement results causing big drops of estimates, and lack of long-term convergence.

Any convergence in medium-sized intervals (during zero loss results) is reverted by the big loss results, as they happen quite far from the critical load estimates, and the two fitting functions extrapolate differently.

In other words, a human seeing only estimates from one fitting function would expect a narrower end interval, but a human seeing the measured loss ratios agrees that the interval should be wider than that.



#### **Summary**

The two graphs show the behavior of PLRsearch algorithm applied to soaking test when some of PLRsearch assumptions do not hold:

- L2 patch measurement results violate the assumption of performance not changing over time.
- Vhost measurement results violate the assumption of Poisson distribution matching the loss counts.

The reported upper and lower bounds can have distance larger or smaller than a first look by a human would expect, but a closer look reveals the quality is good, considering the circumstances.

The usefulness of the critical load estimate is questionable when the assumptions are violated.

Some improvements can be made via more specific workarounds, for example the long term limit of L2 patch performance could be estimated by some heuristic.

Other improvements can be achieved only by asking users whether loss patterns matter. Is it better to have single digit losses distributed fairly evenly over time (as Poisson distribution would suggest), or is it better to have short periods of medium losses mixed with long periods of zero losses (as happens in Vhost test) with the same overall loss ratio?

## 1.5.6 Packet Latency

TRex Traffic Generator (TG) is used for measuring latency across 2-Node and 3-Node SUT server topologies. TRex integrates A High Dynamic Range Histogram (HDRH)<sup>20</sup> code providing per packet latency distribution for latency streams sent in parallel to the main load packet streams. Packet latency is measured using following methodology:

- Latency tests are performed at following packet load levels:
  - No-Load: latency streams only.
  - Low-Load: at 10% PDR.
  - Mid-Load: at 50% PDR.
  - High-Load: at 90% PDR.
  - NDR-Load: at 100% NDR.
  - PDR-Load: at 100% PDR.

- Latency is measured for all tested packet sizes except IMIX due to TG restriction.
- TG sends dedicated latency streams, one per direction, each at the rate of 9 kpps at the prescribed packet size; these are sent in addition to the main load streams.
- TG reports Min/Avg/Max and HDRH latency values distribution per stream direction, hence two sets of latency values are reported per test case.
- Reported latency values are aggregate across tested topology.
- +/- 1 usec is the measurement accuracy advertised by TRex TG for the setup used.
- TG setup introduces an always-on Tx/Rx interface latency of about 2 \* 2 usec per direction induced by TRex SW writing and reading packet timestamps on CPU cores.

## 1.5.7 Multi-Core Speedup

All performance tests are executed with single physical core and with multiple cores scenarios.

#### Intel Hyper-Threading (HT)

Intel Xeon processors used in FD.io CSIT can operate either in HT Disabled mode (single logical core per each physical core) or in HT Enabled mode (two logical cores per each physical core). HT setting is applied in BIOS and requires server SUT reload for it to take effect, making it impractical for continuous changes of HT mode of operation.

CSIT-2001 performance tests are executed with server SUTs' Intel Xeon processors configured with Intel Hyper-Threading Disabled for all Xeon Haswell testbeds (3n-hsw) and with Intel Hyper-Threading Enabled for all Xeon Skylake and Xeon Cascadelake testbeds.

More information about physical testbeds is provided in *Physical Testbeds* (page 5).

#### **Multi-core Tests**

CSIT-2001 multi-core tests are executed in the following VPP worker thread and physical core configurations:

- 1. Intel Xeon Haswell testbeds (3n-hsw) with Intel HT disabled (1 logical CPU core per each physical core):
  - 1. 1t1c - 1 VPP worker thread on 1 physical core.
  - 2. 2t2c - 2 VPP worker threads on 2 physical cores.
  - 3. 4t4c - 4 VPP worker threads on 4 physical cores.
- 2. Intel Xeon Skylake and Cascadelake testbeds (2n-skx, 3n-skx, 2n-clx) with Intel HT enabled (2 logical CPU cores per each physical core):
  - 1. 2t1c - 2 VPP worker threads on 1 physical core.
  - 2. 4t2c - 4 VPP worker threads on 2 physical cores.
  - 3. 8t4c - 8 VPP worker threads on 4 physical cores.

<sup>20</sup> http://hdrhistogram.org/

VPP worker threads are the data plane threads running on isolated logical cores. With Intel HT enabled VPP workers are placed as sibling threads on each used physical core. VPP control threads (main, stats) are running on a separate non-isolated core together with other Linux processes.
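
The thread/core tags used above (for example 2t1c) follow directly from the number of physical cores and the HT setting. The sketch below illustrates that relationship; the helper names are hypothetical and are not taken from CSIT libraries.

```python
import re


def worker_tag(physical_cores, ht_enabled):
    """Build a tag such as '4t2c': worker threads 't' on physical cores 'c'."""
    # With HT enabled, two sibling worker threads are placed on each core.
    threads = physical_cores * (2 if ht_enabled else 1)
    return f"{threads}t{physical_cores}c"


def parse_worker_tag(tag):
    """Split a tag such as '8t4c' into (worker_threads, physical_cores)."""
    match = re.fullmatch(r"(\d+)t(\d+)c", tag)
    if match is None:
        raise ValueError(f"not a thread/core tag: {tag!r}")
    return int(match.group(1)), int(match.group(2))


haswell_tags = [worker_tag(c, ht_enabled=False) for c in (1, 2, 4)]
skylake_tags = [worker_tag(c, ht_enabled=True) for c in (1, 2, 4)]
```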

In all CSIT tests care is taken to ensure that each VPP worker handles the same amount of received packet load and does the same amount of packet processing work. This is achieved by evenly distributing per interface type (e.g. physical, virtual) receive queues over VPP workers using default VPP round-robin mapping and by loading these queues with the same amount of packet flows.

If the number of VPP workers is higher than the number of physical or virtual interfaces, multiple receive queues are configured on each interface. NIC Receive Side Scaling (RSS) for physical interfaces and multi-queue for virtual interfaces are used for this purpose.

Section *Speedup Multi-Core* (page 167) includes a set of graphs illustrating packet throughput speedup when running VPP worker threads on multiple cores. Note that in quite a few test cases running VPP workers on 2 or 4 physical cores hits the I/O bandwidth or packets-per-second limit of the tested NIC.

## 1.5.8 Hoststack Testing

#### HTTP/TCP with WRK

WRK HTTP benchmarking tool<sup>21</sup> is used for TCP/IP and HTTP tests of VPP Host Stack and built-in static HTTP server. WRK has been chosen as it is capable of generating significant TCP/IP and HTTP loads by scaling number of threads across multi-core processors.

This in turn enables high scale benchmarking of the VPP Host Stack TCP/IP and HTTP service including HTTP TCP/IP Connections-Per-Second (CPS) and HTTP Requests-Per-Second.

The initial tests are designed as follows:

- HTTP and TCP/IP Connections-Per-Second (CPS)
  - WRK configured to use 8 threads across 8 cores, 1 thread per core.
  - Maximum of 50 concurrent connections across all WRK threads.
  - Timeout for server responses set to 5 seconds.
  - Test duration is 30 seconds.
  - Expected HTTP test sequence:
    - Single HTTP GET Request sent per open connection.
    - Connection close after valid HTTP reply.
    - Resulting flow sequence of 8 packets: >Syn, <Syn-Ack, >Ack, >Req, <Rep, >Fin, <Fin, >Ack.
- HTTP Requests-Per-Second
  - WRK configured to use 8 threads across 8 cores, 1 thread per core.
  - Maximum of 50 concurrent connections across all WRK threads.
  - Timeout for server responses set to 5 seconds.
  - Test duration is 30 seconds.
  - Expected HTTP test sequence:
    - Multiple HTTP GET Requests sent in sequence per open connection.
    - Connection close after set test duration time.
    - Resulting flow sequence: >Syn, <Syn-Ack, >Ack, >Req[1], <Rep[1], ..., >Req[n], <Rep[n], >Fin, <Fin, >Ack.

<sup>21</sup> https://github.com/wg/wrk
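
As an illustration of the parameters above, the following sketch builds a WRK command line. It is not the CSIT implementation: the URL is a placeholder, and using a `Connection: close` request header to force one request per connection in the CPS case is an assumption. Pinning WRK threads to cores is not covered here.

```python
def wrk_command(url, threads=8, connections=50, duration_s=30,
                timeout_s=5, close_per_request=False):
    """Return a wrk argv list for the test parameters described above."""
    cmd = [
        "wrk",
        "-t", str(threads),        # number of WRK threads
        "-c", str(connections),    # concurrent connections across all threads
        "-d", f"{duration_s}s",    # test duration
        "--timeout", f"{timeout_s}s",  # server response timeout
    ]
    if close_per_request:
        # Assumption: ask the server to close after each reply (CPS test).
        cmd += ["-H", "Connection: close"]
    cmd.append(url)
    return cmd


# Placeholder URL; the real target is the VPP built-in HTTP server.
cps_cmd = wrk_command("http://192.0.2.1/", close_per_request=True)
rps_cmd = wrk_command("http://192.0.2.1/")
```

The resulting list can be passed to `subprocess.run` on a host where WRK is installed.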

### TCP/IP with iperf3

iperf3 goodput measurement tool<sup>22</sup> is used for measuring the maximum attainable goodput of the VPP Host Stack connection across two instances of VPP running on separate DUT nodes. iperf3 is a popular open source tool for active measurements of the maximum achievable goodput on IP networks.

Because iperf3 utilizes the POSIX socket interface APIs, the current test configuration utilizes the LD\_PRELOAD mechanism of the Linux dynamic linker to connect iperf3 to the VPP Host Stack using the VPP Communications Library (VCL) LD\_PRELOAD library (libvcl\_ldpreload.so).

In the future, a forked version of iperf3 which has been modified to directly use the VCL application APIs may be added to determine the difference in performance of 'VCL Native' applications versus utilizing LD\_PRELOAD which inherently has more overhead and other limitations.

The test configuration is as follows:

```
DUT1 Network DUT2
[ iperf3-client -> VPP1 ]======[ VPP2 -> iperf3-server]
```

where,

- 1. iperf3 server attaches to VPP2 and LISTENs on VPP2:TCP port 5201.
- 2. iperf3 client attaches to VPP1 and opens one or more stream connections to VPP2:TCP port 5201.
- 3. iperf3 client transmits a uni-directional stream as fast as the VPP Host Stack allows to the iperf3 server for the test duration.
- 4. At the end of the test the iperf3 client emits the goodput measurements for all streams and the sum of all streams.

Test cases include 1 and 10 Streams with a 20 second test duration with the VPP Host Stack configured to utilize the Cubic TCP congestion algorithm.

Note: iperf3 is single threaded, so it is expected that the 10 stream test does not show any performance improvement due to multi-thread/multi-core execution.
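
The client side of this setup can be sketched as an environment plus an argument list. This is an illustration only: both file paths are hypothetical, and the use of the `VCL_CONFIG` environment variable to point VCL at its configuration file is an assumption about the VCL setup, not something stated in this report.

```python
def iperf3_client_invocation(server_ip, streams, duration_s=20, port=5201,
                             ldpreload_lib="/path/to/libvcl_ldpreload.so",
                             vcl_config="/path/to/vcl.conf"):
    """Return (extra_env, argv) for an iperf3 client running over VCL."""
    extra_env = {
        "LD_PRELOAD": ldpreload_lib,  # intercept POSIX socket calls
        "VCL_CONFIG": vcl_config,     # assumed VCL configuration variable
    }
    argv = [
        "iperf3",
        "-c", server_ip,        # client mode, connect to the server on VPP2
        "-p", str(port),        # TCP port the server LISTENs on
        "-P", str(streams),     # number of parallel streams (1 or 10)
        "-t", str(duration_s),  # test duration in seconds
    ]
    return extra_env, argv


env_10, argv_10 = iperf3_client_invocation("192.0.2.2", streams=10)
```

To run it, merge `env_10` into a copy of `os.environ` and pass both to `subprocess.run`.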

There are also variations of these test cases which use the VPP Network Simulator (NSIM) plugin to test the VPP Hoststack goodput with 1 percent of the traffic being dropped at the output interface of VPP1 thereby simulating a lossy network.

#### QUIC/UDP/IP with vpp\_echo

vpp\_echo performance testing tool<sup>23</sup> is a bespoke performance test application which utilizes the 'native HostStack APIs' to verify performance and correct handling of connection/stream events with unidirectional and bi-directional streams of data.

Because iperf3 does not support the QUIC transport protocol, vpp\_echo is used for measuring the maximum attainable goodput of the VPP Host Stack connection utilizing the QUIC transport protocol across two instances of VPP running on separate DUT nodes. The QUIC transport protocol supports multiple streams per connection and test cases utilize different combinations of QUIC connections and number of streams per connection.

<sup>22</sup> https://github.com/esnet/iperf

<sup>23</sup> https://wiki.fd.io/view/VPP/HostStack#External\_Echo\_Server.2FClient\_.28vpp\_echo.29

The test configuration is as follows:

where,

- 1. vpp\_echo server attaches to VPP2 and LISTENs on VPP2:TCP port 1234.
- 2. vpp\_echo client creates one or more connections to VPP1 and opens one or more streams per connection to VPP2:TCP port 1234.
- 3. vpp\_echo client transmits a uni-directional stream as fast as the VPP Host Stack allows to the vpp\_echo server for the test duration.
- 4. At the end of the test the vpp\_echo client emits the goodput measurements for all streams and the sum of all streams.

Test cases include:

- 1. 1 QUIC connection with 1 stream
- 2. 1 QUIC connection with 10 streams
- 3. 10 QUIC connections with 1 stream
- 4. 10 QUIC connections with 10 streams

Stream sizes are chosen to provide reasonable test durations. The VPP Host Stack QUIC transport is configured to utilize the picotls encryption library. In the future, tests utilizing additional encryption algorithms will be added.

## 1.5.9 Reconfiguration Tests

**Important:** DISCLAIMER: Described reconf test methodology is experimental, and subject to change following consultation within csit-dev, vpp-dev and user communities. Current test results should be treated as indicative.

#### **Overview**

Reconf tests are designed to measure the impact of VPP re-configuration on data plane traffic. While VPP takes some measures against the traffic being entirely stopped for a prolonged time, the immediate forwarding rate varies during the re-configuration, as some configuration steps need the active dataplane worker threads to be stopped temporarily.

As the usual methods of measuring throughput need multiple trial measurements with somewhat long durations, and the re-configuration process can also be long, finding an offered load which would result in zero loss during the re-configuration process would be time-consuming.

Instead, reconf tests first find a throughput value (lower bound for NDR) without re-configuration, and then maintain that offered load during re-configuration. The measured loss count is then assumed to be caused by the re-configuration process. The result published by reconf tests is the effective blocked time, that is the loss count divided by the offered load.
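
The effective blocked time is a simple ratio, sketched below. The function name is hypothetical, and the example assumes that loss count and offered load are both aggregated the same way (for instance across both traffic directions).

```python
def effective_blocked_time(loss_count, offered_load_pps):
    """Return the effective blocked time in seconds.

    loss_count: packets lost while the re-configuration was in progress.
    offered_load_pps: constant offered load (lower bound for NDR), in pps.
    """
    if offered_load_pps <= 0:
        raise ValueError("offered load must be positive")
    return loss_count / offered_load_pps


# Example: 250,000 packets lost at an offered load of 5 Mpps.
blocked_s = effective_blocked_time(250_000, 5_000_000)
```

In this example the traffic behaved as if forwarding had been fully blocked for 0.05 seconds, even though the actual loss may have been spread over several shorter worker pauses.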

#### **Current Implementation**

Each reconf suite is based on a similar MLRsearch performance suite.

MLRsearch parameters are changed to speed up the throughput discovery. For example, PDR is not searched for, and the final trial duration is shorter.

The MLRsearch suite has to contain a configuration parameter that can be scaled up, e.g. number of tunnels or number of service chains. Currently, only increasing the scale is supported as the re-configuration operation. In future, scale decrease or other operations can be implemented.

The traffic profile is not changed, so the traffic present is processed only by the smaller scale configuration. The added tunnels / chains are not targeted by the traffic.

For the re-configuration, the same Robot Framework and Python libraries are used as in the initial configuration, with the exception of the final calls that do not interact with VPP (e.g. starting virtual machines), which are skipped to reduce the overall test duration.

#### **Discussion**

Robot Framework introduces a certain overhead, which may affect timing of individual VPP API calls, which in turn may affect the number of packets lost.

The exact calls executed may contain unnecessary info dumps, repeated commands, or commands which change a value that does not need to be changed (e.g. MTU). Thus, implementation details affect the results, even if their effect on the corresponding MLRsearch suite is negligible.

The lower bound for NDR is the only value safe to be used when zero packets lost are expected without reconfiguration. But different suites show different "jitter" in that value. For some suites, the lower bound is not tight, allowing full NIC buffers to drain quickly between worker pauses. For other suites, lower bound for NDR still has quite a large probability of non-zero packet loss even without re-configuration.

### 1.5.10 VPP Startup Settings

CSIT code manipulates a number of VPP settings in startup.conf for optimized performance. List of common settings applied to all tests and test dependent settings follows.

See VPP startup.conf<sup>24</sup> for a complete set and description of listed settings.

#### **Common Settings**

List of VPP startup.conf settings applied to all tests:

- 1. heap-size <value> set separately for ip4, ip6, stats, main depending on scale tested.
- 2. no-tx-checksum-offload disables UDP / TCP TX checksum offload in DPDK. Typically needed to use faster vector PMDs (together with no-multi-seg).
- 3. buffers-per-numa <value> sets a number of memory buffers allocated to VPP per CPU socket. VPP default is 16384. Needs to be increased for scenarios with large number of interfaces and worker threads. To accommodate for scale tests, CSIT is setting it to the maximum possible value corresponding to the limit of DPDK memory mappings (currently 256). For Xeon Skylake platforms configured with 2MB hugepages and VPP data-size and buffer-size defaults (2048B and 2496B respectively), this results in value of 215040 (256 \* 840 = 215040, 840 \* 2496B buffers fit in 2MB hugepage). For Xeon Haswell nodes value of 107520 is used.
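
The buffers-per-numa arithmetic for Xeon Skylake quoted above can be reproduced as follows. This is a worked example using only the numbers given in the text; the constant and function names are chosen for illustration.

```python
HUGEPAGE_BYTES = 2 * 1024 * 1024  # 2MB hugepage
BUFFER_BYTES = 2496               # VPP buffer-size default
MAX_DPDK_MAPPINGS = 256           # current limit of DPDK memory mappings


def buffers_per_hugepage(hugepage_bytes, buffer_bytes):
    """Number of whole buffers that fit into one hugepage."""
    return hugepage_bytes // buffer_bytes


def max_buffers_per_numa(mappings, hugepage_bytes, buffer_bytes):
    """Maximum buffers-per-numa value when each mapping is one hugepage."""
    return mappings * buffers_per_hugepage(hugepage_bytes, buffer_bytes)


per_page = buffers_per_hugepage(HUGEPAGE_BYTES, BUFFER_BYTES)
skylake_value = max_buffers_per_numa(MAX_DPDK_MAPPINGS, HUGEPAGE_BYTES, BUFFER_BYTES)
```

Here `per_page` evaluates to 840 and `skylake_value` to 215040, matching the values in the list above.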

<sup>24</sup> https://git.fd.io/vpp/tree/src/vpp/conf/startup.conf?h=stable/2001&id=fce396738f865293f0a023bc7f172086f81da456

#### **Per Test Settings**

List of vpp startup.conf settings applied dynamically per test:

- 1. corelist-workers <list\_of\_cores> list of logical cores to run VPP worker data plane threads. Depends on HyperThreading and core per test configuration.
- 2. num-rx-queues <value> depends on the number of VPP threads and NIC interfaces.
- 3. no-multi-seg disables multi-segment buffers in DPDK, improves packet throughput, but disables Jumbo MTU support. Multi-segment buffers are disabled for all tests apart from the ones that require Jumbo 9000B frame support.
- 4. UIO driver depends on topology file definition.
- 5. QAT VFs depends on NRThreads, each thread = 1 QAT VF.

#### 1.5.11 KVM VMs vhost-user

QEMU is used for KVM VM vhost-user testing environment. By default, standard QEMU version is used, preinstalled from OS repositories (qemu-2.11.1 for Ubuntu 18.04). The path to the QEMU binary can be adjusted in *Constants.py*.

FD.io CSIT performance lab is testing VPP vhost-user with KVM VMs using following environment settings:

CSIT supports two types of VMs:

- Image-VM: used for all functional, VPP\_device, and regular performance tests except NFV density tests.
- **Kernel-VM**: new VM type introduced for NFV density tests to provide greater in-VM application install flexibility and to further reduce test execution time by simpler VM lifecycle management.

#### Image-VM

CSIT can use a pre-created VM image. The path to the image can be adjusted in *Constants.py*. For convenience and full compatibility, the CSIT repository contains a set of scripts to prepare a Buildroot<sup>25</sup> based embedded Linux image with all the dependencies needed to run DPDK Testpmd, DPDK L3Fwd, Linux bridge or Linux IPv4 forwarding.

Buildroot was chosen for the VM image to make it lightweight and fast to boot, limiting the impact on test duration.

In order to execute CSIT tests, the VM image must have the following software installed: qemu-guest-agent, sshd, bridge-utils, VirtIO support and DPDK Testpmd/L3fwd applications. Username/password for the VM must be cisco/cisco with NOPASSWD sudo access. The interface naming is based on the driver (management interface type is Intel E1000): all E1000 interfaces will be named mgmt<n> and all VirtIO interfaces will be named virtio<n>. In the VM, /etc/init.d/qemu-guest-agent must be set to TRANSPORT=isa-serial:/dev/ttyS1 because ttyS0 is used by the serial console and ttyS1 is dedicated to qemu-guest-agent in the QEMU setup.

#### Kernel-VM

CSIT can use a kernel KVM image as a boot kernel, as an alternative to image VM. This option allows better configurability of what application is running in VM userspace. Using root9p filesystem allows mapping the host-OS filesystem as read only guest-OS filesystem.

Example of custom init script for the kernel-VM:

<sup>25</sup> https://buildroot.org/

```
#!/bin/bash
# Mount the pseudo-filesystems needed inside the guest.
mount -t sysfs -o "nodev,noexec,nosuid" sysfs /sys
mount -t proc -o "nodev,noexec,nosuid" proc /proc
mkdir /dev/pts
mkdir /dev/hugepages
mount -t devpts -o "rw,noexec,nosuid,gid=5,mode=0620" devpts /dev/pts || true
mount -t tmpfs -o "rw,noexec,nosuid,size=10%,mode=0755" tmpfs /run
mount -t tmpfs -o "rw,noexec,nosuid,size=10%,mode=0755" tmpfs /tmp
mount -t hugetlbfs -o "rw,relatime,pagesize=2M" hugetlbfs /dev/hugepages
# Rebind PCI devices 00:06.0 and 00:07.0 from their current driver to vfio-pci.
echo 0000:00:06.0 > /sys/bus/pci/devices/0000:00:06.0/driver/unbind
echo 0000:00:07.0 > /sys/bus/pci/devices/0000:00:07.0/driver/unbind
echo vfio-pci > /sys/bus/pci/devices/0000:00:06.0/driver_override
echo vfio-pci > /sys/bus/pci/devices/0000:00:07.0/driver_override
echo 0000:00:06.0 > /sys/bus/pci/drivers/vfio-pci/bind
echo 0000:00:07.0 > /sys/bus/pci/drivers/vfio-pci/bind
# Run the NF application, then power off the VM when it exits.
$vnf_bin
poweroff -f
```

QemuUtils library during runtime replaces the \$vnf\_bin variable by the path to NF binary and its parameters. This allows CSIT to run any application installed on host OS, for example the same version of VPP as running on the host-OS.

Kernel-VM image must be available in the host filesystem as a prerequisite. The path to kernel-VM image is defined in *Constants.py*.

#### 1.5.12 LXC/DRC Container Memif

CSIT includes tests taking advantage of VPP memif virtual interface (shared memory interface) to interconnect VPP running in Containers. VPP vswitch instance runs in bare-metal user-mode handling NIC interfaces and connecting over memif (Slave side) to VPPs running in Linux Container (LXC) or in Docker Container (DRC) configured with memif (Master side). LXCs and DRCs run in a privileged mode with VPP data plane worker threads pinned to dedicated physical CPU cores per usual CSIT practice. All VPP instances run the same version of software. This test topology is equivalent to existing tests with vhost-user and VMs as described earlier in *Logical Topologies* (page 38).

In addition to above vswitch tests, a single memif interface test is executed. It runs in a simple topology of two VPP container instances connected over memif interface in order to verify standalone memif interface performance.

More information about CSIT LXC and DRC setup and control is available in *Container Orchestration in CSIT* (page 560).

#### 1.5.13 NFV Service Density

Network Function Virtualization (NFV) service density tests focus on measuring total per server throughput at varied NFV service "packing" densities with vswitch providing host dataplane. The goal is to compare and contrast performance of a shared vswitch for different network topologies and virtualization technologies, and their impact on vswitch performance and efficiency in a range of NFV service configurations.

Each NFV service instance consists of a set of Network Functions (NFs), running in VMs (VNFs) or in Containers (CNFs), that are connected into a virtual network topology using VPP vswitch running in Linux user-mode. Multiple service instances share the vswitch that in turn provides per service chain forwarding context(s). In order to provide the most complete picture, each network topology and service configuration is tested in different service density setups by varying two parameters:

- Number of service instances (e.g. 1, 2, 4, 6, 8, 10).
- Number of NFs per service instance (e.g. 1, 2, 4, 6, 8, 10).

Implementation of NFV service density tests in CSIT-2001 is using two NF applications:

- VNF: VPP of the same version as vswitch running in KVM VM, configured with /8 IPv4 prefix routing.
- CNF: VPP of the same version as vswitch running in Docker Container, configured with /8 IPv4 prefix routing.

Tests are designed such that in all tested cases VPP vswitch is the most stressed application, as for each flow vswitch is processing each packet multiple times, whereas VNFs and CNFs process each packet only once. To that end, all VNFs and CNFs are allocated enough resources to not become a bottleneck.

#### **Service Configurations**

Following NFV network topologies and configurations are tested:

- VNF Service Chains (VSC) with L2 vswitch
  - Network Topology: Sets of VNFs dual-homed to VPP vswitch over virtio-vhost links. Each set belongs to separate service instance.
  - Network Configuration: VPP L2 bridge-domain contexts form logical service chains of VNF sets and connect each chain to physical interfaces.
- CNF Service Chains (CSC) with L2 vswitch
  - *Network Topology*: Sets of CNFs dual-homed to VPP vswitch over memif links. Each set belongs to separate service instance.
  - *Network Configuration*: VPP L2 bridge-domain contexts form logical service chains of CNF sets and connect each chain to physical interfaces.
- CNF Service Pipelines (CSP) with L2 vswitch
  - Network Topology: Sets of CNFs connected into pipelines over a series of memif links, with edge CNFs single-homed to VPP vswitch over memif links. Each set belongs to separate service instance.
  - Network Configuration: VPP L2 bridge-domain contexts connect each CNF pipeline to physical interfaces.

### **Thread-to-Core Mapping**

CSIT defines specific ratios for mapping software threads of vswitch and VNFs/CNFs to physical cores, with separate ratios defined for main control threads and data-plane threads.

In CSIT-2001 NFV service density tests run on Intel Xeon testbeds with Intel Hyper-Threading enabled, so each physical core is associated with a pair of sibling logical cores corresponding to the hyper-threads.

CSIT-2001 executes tests with the following software thread to physical core mapping ratios:

- vSwitch
  - Data-plane on single core
    - (main:core) = (1:1) => 1mt1c - 1 Main Thread on 1 Core.
    - (data:core) = (1:1) => 2dt1c - 2 Data-plane Threads on 1 Core.
  - Data-plane on two cores
    - (main:core) = (1:1) => 1mt1c - 1 Main Thread on 1 Core.
    - (data:core) = (1:2) => 4dt2c - 4 Data-plane Threads on 2 Cores.
- VNF and CNF
  - Data-plane on single core
    - (main:core) = (2:1) => 2mt1c - 2 Main Threads on 1 Core, 1 Thread per NF, core shared between two NFs.
    - (data:core) = (1:1) => 2dt1c - 2 Data-plane Threads on 1 Core per NF.
  - Data-plane on single logical core (Two NFs per physical core)
    - (main:core) = (2:1) => 2mt1c - 2 Main Threads on 1 Core, 1 Thread per NF, core shared between two NFs.
    - (data:core) = (2:1) => 2dt1c - 2 Data-plane Threads on 1 Core, 1 Thread per NF, core shared between two NFs.

Maximum tested service densities are limited by the number of physical cores per NUMA node. CSIT-2001 allocates cores within NUMA0. Support for multi-NUMA tests is to be added in a future release.

## 1.5.14 VPP\_Device Functional

CSIT-2001 includes VPP\_Device test environment for functional VPP device tests integrated into LFN CI/CD infrastructure. VPP\_Device tests run on 1-Node testbeds (1n-skx, 1n-arm) and rely on Linux SRIOV Virtual Function (VF), dot1q VLAN tagging and external loopback cables to facilitate packet passing over external physical links. Initial focus is on a few baseline tests. New device tests can be added by small edits to existing CSIT Performance (2-node) tests. RF test definition code stays unchanged with the exception of traffic generator related L2 KWs.

### 1.5.15 IPSec on Intel QAT

VPP IPSec performance tests are using DPDK cryptodev device driver in combination with HW cryptodev devices - Intel QAT 8950 50G - present in LF FD.io physical testbeds. DPDK cryptodev can be used for all IPSec data plane functions supported by VPP.

Currently CSIT-2001 implements following IPSec test cases:

- AES-GCM, CBC-SHA1 ciphers, in combination with IPv4 routed-forwarding with Intel xl710 NIC.
- CBC-SHA1 ciphers, in combination with LISP-GPE overlay tunneling for IPv4-over-IPv4 with Intel xl710 NIC.

#### 1.5.16 TRex Traffic Generator

### **Usage**

TRex traffic generator<sup>26</sup> is used for all CSIT performance tests. TRex stateless mode is used to measure NDR and PDR throughputs using MLRsearch and to measure maximum transfer rate in MRR tests.

TRex is installed and run on the TG compute node. The typical procedure is:

- If TRex is not already installed on TG, it is installed in the suite setup phase, see TRex installation<sup>27</sup>.
- TRex configuration is set in its configuration file

```
/etc/trex_cfg.yaml
```

- TRex is started in the background mode.
- There are traffic streams dynamically prepared for each test, based on traffic profiles. The traffic is sent and the statistics obtained using trex.stl.api.STLClient.

<sup>26</sup> https://trex-tgn.cisco.com

<sup>27</sup> https://git.fd.io/csit/tree/resources/tools/trex/trex_installer.sh?h=rls2001

### **Measuring Packet Loss**

The following sequence is used to measure packet loss:

- Create an instance of STLClient.
- Connect the client to the TRex server.
- Add all streams.
- Clear statistics.
- Send the traffic for defined time.
- Get the statistics.

If there is a warm-up phase required, the traffic is sent also before test and the statistics are ignored.
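
The sequence above can be sketched as a function operating on an already created client object. This is a minimal illustration, not the CSIT traffic script. In real use `client` would be a `trex.stl.api.STLClient` instance; the method names (`connect`, `reset`, `add_streams`, `clear_stats`, `start`, `wait_on_traffic`, `get_stats`, `disconnect`) and the per-port `opackets` / `ipackets` counters are assumptions about that API and should be checked against the installed TRex version. The `reset` call is an extra step, assumed here to acquire the ports.

```python
def measure_packet_loss(client, streams_per_port, duration_s, rate,
                        warmup_s=0):
    """Send traffic for duration_s seconds and return the lost packet count.

    streams_per_port: dict mapping TG port number to a list of streams.
    rate: multiplier string passed to the client, e.g. "100%" or "10kpps".
    """
    ports = sorted(streams_per_port)
    client.connect()
    try:
        client.reset(ports=ports)  # assumed to acquire and clean the ports
        for port in ports:
            client.add_streams(streams_per_port[port], ports=[port])
        if warmup_s > 0:
            # Warm-up traffic; its statistics are discarded by clear_stats().
            client.start(ports=ports, mult=rate, duration=warmup_s)
            client.wait_on_traffic(ports=ports)
        client.clear_stats()
        client.start(ports=ports, mult=rate, duration=duration_s)
        client.wait_on_traffic(ports=ports)
        stats = client.get_stats()
    finally:
        client.disconnect()
    sent = sum(stats[port]["opackets"] for port in ports)
    received = sum(stats[port]["ipackets"] for port in ports)
    return sent - received
```

Passing the client in as a parameter keeps the sketch independent of the TRex installation, so the call order can be checked with a stand-in object.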

### **Measuring Latency**

If measurement of latency is requested, two more packet streams are created (one for each direction) with TRex flow\_stats parameter set to STLFlowLatencyStats. In that case, returned statistics will also include min/avg/max latency values and encoded HDRHistogram data.

**CHAPTER** 

**TWO** 

# **VPP PERFORMANCE**

## 2.1 Overview

VPP performance test results are reported for all three physical testbed types present in FD.io labs: 3-Node Xeon Haswell (3n-hsw), 3-Node Xeon Skylake (3n-skx), 2-Node Xeon Skylake (2n-skx) and installed NIC models. For description of physical testbeds used for VPP performance tests please refer to *Physical Testbeds* (page 5).

## 2.1.1 Logical Topologies

CSIT VPP performance tests are executed on physical testbeds described in *Physical Testbeds* (page 5). Based on the packet path thru server SUTs, three distinct logical topology types are used for VPP DUT data plane testing:

- 1. NIC-to-NIC switching topologies.
- 2. VM service switching topologies.
- 3. Container service switching topologies.

### **NIC-to-NIC Switching**

The simplest logical topology for software data plane application like VPP is NIC-to-NIC switching. Tested topologies for 2-Node and 3-Node testbeds are shown in figures below.





Server Systems Under Test (SUT) run VPP application in Linux user-mode as a Device Under Test (DUT). Server Traffic Generator (TG) runs TRex application. Physical connectivity between SUTs and TG is provided using different drivers and NIC models that need to be tested for performance (packet/bandwidth throughput and latency).

From SUT and DUT perspectives, all performance tests involve forwarding packets between two (or more) physical Ethernet ports (10GE, 25GE, 40GE, 100GE). In most cases both physical ports on SUT are located on the same NIC. The only exceptions are link bonding and 100GE tests. In the latter case only one port per NIC can be driven at linerate due to PCIe Gen3 x16 slot bandwidth limitations. 100GE NICs are not supported in PCIe Gen3 x8 slots.


Note that reported VPP DUT performance results are specific to the SUTs tested. SUTs with other processors than the ones used in FD.io lab are likely to yield different results. A good rule of thumb, that can be applied to estimate VPP packet throughput for NIC-to-NIC switching topology, is to expect the forwarding performance to be proportional to processor core frequency for the same processor architecture, assuming processor is the only limiting factor and all other SUT parameters are equivalent to FD.io CSIT environment.

#### **VM Service Switching**

VM service switching topology test cases require VPP DUT to communicate with Virtual Machines (VMs) over vhost-user virtual interfaces.

Two types of VM service topologies are tested in CSIT-2001:

- 1. "Parallel" topology with packets flowing within SUT from NIC(s) via VPP DUT to VM, back to VPP DUT, then out thru NIC(s).
- 2. "Chained" topology (a.k.a. "Snake") with packets flowing within SUT from NIC(s) via VPP DUT to VM, back to VPP DUT, then to the next VM, back to VPP DUT and so on and so forth until the last VM in a chain, then back to VPP DUT and out thru NIC(s).

For each of the above topologies, VPP DUT is tested in a range of L2 or IPv4/IPv6 configurations depending on the test suite. Sample VPP DUT "Chained" VM service topologies for 2-Node and 3-Node testbeds with each SUT running N VM instances are shown in the figures below.





In "Chained" VM topologies, packets are switched by VPP DUT multiple times: twice for a single VM, three times for two VMs, N+1 times for N VMs. Hence the external throughput rates measured by TG and listed in this report must be multiplied by N+1 to represent the actual VPP DUT aggregate packet forwarding rate.

For "Parallel" service topology packets are always switched twice by VPP DUT per service chain.
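
The multipliers described above can be applied to a reported external rate as in the following sketch. The function name and the topology labels are illustrative only.

```python
def vpp_aggregate_rate(external_rate_pps, n_instances, topology="chained"):
    """Convert a TG-measured external rate to the VPP DUT aggregate rate.

    chained:  each packet is switched N+1 times for N VMs in the chain.
    parallel: each packet is switched twice per service chain.
    """
    if n_instances < 1:
        raise ValueError("at least one VM instance is required")
    if topology == "chained":
        factor = n_instances + 1
    elif topology == "parallel":
        factor = 2
    else:
        raise ValueError(f"unknown topology: {topology!r}")
    return external_rate_pps * factor


# Example: 1 Mpps measured by TG through a chain of 2 VMs.
chained_rate = vpp_aggregate_rate(1_000_000, n_instances=2)
```

In this example VPP DUT forwards an aggregate of 3 Mpps, since each packet crosses it three times.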

Note that reported VPP DUT performance results are specific to the SUTs tested. SUTs with other processors than the ones used in FD.io lab are likely to yield different results. Similarly to NIC-to-NIC switching topology, here one can also expect the forwarding performance to be proportional to processor core frequency for the same processor architecture, assuming processor is the only limiting factor. However, due to much higher dependency on intensive memory operations in VM service chained topologies and sensitivity to Linux scheduler settings and behaviour, this estimation may not always yield good enough accuracy.

#### **Container Service Switching**

Container service switching topology test cases require VPP DUT to communicate with Containers (Ctrs) over memif virtual interfaces.

Three types of Container service topologies are tested in CSIT-2001:

- 1. "Parallel" topology with packets flowing within SUT from NIC(s) via VPP DUT to Container, back to VPP DUT, then out thru NIC(s).
- 2. "Chained" topology (a.k.a. "Snake") with packets flowing within SUT from NIC(s) via VPP DUT to Container, back to VPP DUT, then to the next Container, back to VPP DUT and so on and so forth until the last Container in a chain, then back to VPP DUT and out thru NIC(s).
- 3. "Horizontal" topology with packets flowing within SUT from NIC(s) via VPP DUT to Container, then via "horizontal" memif to the next Container, and so on and so forth until the last Container, then back to VPP DUT and out thru NIC(s).

For each of the above topologies, VPP DUT is tested in a range of L2 or IPv4/IPv6 configurations depending on the test suite. Sample VPP DUT "Chained" Container service topologies for 2-Node and 3-Node testbeds with each SUT running N Container instances are shown in the figures below.






In "Chained" Container topologies, packets are switched by VPP DUT multiple times: twice for a single Container, three times for two Containers, N+1 times for N Containers. Hence the external throughput rates measured by TG and listed in this report must be multiplied by N+1 to represent the actual VPP DUT aggregate packet forwarding rate.

For "Parallel" and "Horizontal" service topologies packets are always switched by VPP DUT twice per service chain.

Note that reported VPP DUT performance results are specific to the SUTs tested. SUTs with other processors than the ones used in FD.io lab are likely to yield different results. Similarly to NIC-to-NIC switching topology, here one can also expect the forwarding performance to be proportional to processor core frequency for the same processor architecture, assuming processor is the only limiting factor. However, due to much higher dependency on intensive memory operations in Container service chained topologies and sensitivity to Linux scheduler settings and behaviour, this estimation may not always yield good enough accuracy.

## 2.1.2 Performance Tests Coverage

Performance tests measure following metrics for tested VPP DUT topologies and configurations:

- Packet Throughput: measured in accordance with RFC 2544<sup>28</sup>, using FD.io CSIT Multiple Loss Ratio search (MLRsearch), an optimized binary search algorithm, producing throughput at different Packet Loss Ratio (PLR) values:
  - Non Drop Rate (NDR): packet throughput at PLR=0%.
  - Partial Drop Rate (PDR): packet throughput at PLR=0.5%.
- One-Way Packet Latency: measured at different offered packet loads:
  - 100% of discovered NDR throughput.
  - 100% of discovered PDR throughput.
- Maximum Receive Rate (MRR): measure packet forwarding rate under the maximum load offered by traffic generator over a set trial duration, regardless of packet loss. Maximum load for specified Ethernet frame size is set to the bi-directional link rate.
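
For reference, the bi-directional link rate used as the MRR maximum load can be expressed in packets per second as sketched below. The 20 bytes of per-frame Layer 1 overhead (preamble, start-of-frame delimiter and inter-frame gap) are standard Ethernet values and not taken from this report; the function name is illustrative.

```python
ETH_L1_OVERHEAD_BYTES = 20  # 7B preamble + 1B SFD + 12B inter-frame gap


def link_rate_pps(link_bps, frame_bytes, bidirectional=True):
    """Return the Ethernet link rate in packets per second.

    frame_bytes is the Ethernet frame size as used in tests (e.g. 64).
    """
    per_direction = link_bps / ((frame_bytes + ETH_L1_OVERHEAD_BYTES) * 8)
    return 2 * per_direction if bidirectional else per_direction


# Example: 10GE link with 64B frames.
one_way = link_rate_pps(10e9, 64, bidirectional=False)
both_ways = link_rate_pps(10e9, 64)
```

For a 10GE link and 64B frames this gives about 14.88 Mpps per direction, or about 29.76 Mpps bi-directionally.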

CSIT-2001 includes following VPP data plane functionality performance tested across a range of NIC drivers and NIC models:


<sup>28</sup> https://tools.ietf.org/html/rfc2544.html

| Functionality | Description |
|---------------|---------------------------------------------------------------------------------------|
| ACL | L2 Bridge-Domain switching and IPv4 and IPv6 routing with iACL and oACL IP address, MAC address and L4 port security. |
| COP | IPv4 and IPv6 routing with COP address security. |
| IPv4 | IPv4 routing. |
| IPv6 | IPv6 routing. |
| IPv4 Scale | IPv4 routing with 20k, 200k and 2M FIB entries. |
| IPv6 Scale | IPv6 routing with 20k, 200k and 2M FIB entries. |
| IPSecHW | IPSec encryption with AES-GCM, CBC-SHA-256 ciphers, in combination with IPv4 routing. Intel QAT HW acceleration. |
| IPSec+LISP | IPSec encryption with CBC-SHA1 ciphers, in combination with LISP-GPE overlay tunneling for IPv4-over-IPv4. |
| IPSecSW | IPSec encryption with AES-GCM, CBC-SHA-256 ciphers, in combination with IPv4 routing. |
| KVM VMs vhost-user | Virtual topologies with service chains of 1 VM using vhost-user interfaces, with different VPP forwarding modes incl. L2XC, L2BD, VXLAN with L2BD, IPv4 routing. |
| L2BD | L2 Bridge-Domain switching of untagged Ethernet frames with MAC learning; disabled MAC learning i.e. static MAC tests to be added. |
| L2BD Scale | L2 Bridge-Domain switching of untagged Ethernet frames with MAC learning; disabled MAC learning i.e. static MAC tests to be added with 20k, 200k and 2M FIB entries. |
| L2XC          | L2 Cross-Connect switching of untagged, dot1q, dot1ad VLAN tagged Ethernet            |
|               | frames.                                                                               |
| LISP          | LISP overlay tunneling for IPv4-over-IPv4, IPv6-over-IPv4, IPv6-over-IPv6, IPv4-      |
|               | over-IPv6 in IPv4 and IPv6 routing modes.                                             |
| LXC/DRC       | Container VPP memif virtual interface tests with different VPP forwarding modes       |
| Containers    | incl. L2XC, L2BD.                                                                     |
| Memif         |                                                                                       |
| NAT           | (Source) Network Address Translation tests with varying number of users and ports     |
|               | per user.                                                                             |
| QoS Policer   | Ingress packet rate measuring, marking and limiting (IPv4).                           |
| SRv6 Routing  | Segment Routing IPv6 tests.                                                           |
| VPP TCP/IP    | Tests of VPP TCP/IP stack used with VPP built-in HTTP server.                         |
| stack         |                                                                                       |
| VTS           | Virtual Topology System use case tests combining VXLAN overlay tunneling with         |
| 2001.421      | L2BD, ACL and KVM VM vhost-user features.                                             |
| VXLAN         | VXLAN overlay tunnelling integration with L2XC and L2BD.                              |

Execution of performance tests takes time, especially the throughput tests. Due to limited HW testbed resources available within FD.io labs hosted by LF, the number of tests for some NIC models has been limited to a few baseline tests.

## 2.1.3 Performance Tests Naming

FD.io CSIT-2001 follows a common structured naming convention for all performance and system functional tests, introduced in CSIT-17.01.

The naming should be intuitive for the majority of the tests. A complete description of the FD.io CSIT test naming convention is provided in *Test Naming* (page 673).

## 2.2 Release Notes

### 2.2.1 Changes in CSIT-2001

#### 1. VPP PERFORMANCE TESTS

- Intel Xeon 2n-skx, 3n-skx testbeds: VPP performance test data is not included in this report version. This is due to the lower performance and behaviour inconsistency of these systems following the upgrade of processor microcode packages (skx ucode 0x2000064), done as part of updating Ubuntu 18.04 LTS kernel version. Tested VPP and DPDK applications (L3fwd) are affected. Skx test data will be added in subsequent maintenance report version(s) once the issue is resolved. See *Known Issues* (page 47).
- Intel Xeon 2n-clx testbeds: VPP performance test data is now included in this report, after resolving the issue of lower performance and behaviour inconsistency of these systems caused by the Linux kernel driven upgrade of processor microcode packages to 0x500002c. The resolution is to use the latest SuperMicro BIOS 3.2 (for the X11DPG-QT motherboards used), which upgrades processor microcode to 0x500002c, and not the kernel-provided ucode package, which puts the system into a sub-optimal state. A subset of 2n-clx VPP tests is failing due to the clx system behaviour change: i) all ip4 tests with xxv710 and avf driver and ii) some cx556a rdma tests. See *Known Issues* (page 47).
- Service density 2n-skx tests: Added new NF density tests with IPsec encryption between DUTs.
- AVF tests: Full test coverage based on code changes in CSIT core layer (driver/interface awareness) and generated by suite generator (Intel Fortville NICs only).
- Hoststack tests: Major refactor of VPP Hoststack TCP/IP performance tests using WRK generator talking to the VPP HTTP static server plugin measuring connections per second and requests per second. Added new iperf3 with LDPreload tests, iperf3/LDPreload tests with packet loss induced via the VPP NSIM (Network Simulator) plugin, and QUIC/UDP/IP transport tests. All of the new tests measure goodput through the VPP Hoststack from client to server.
- Latency HDRHistogram: Added High Dynamic Range Histogram latency measurements based on the new capability in TRex traffic generator. HDRH latency data presented in latency packet percentile graphs and in detailed results tables.
- Mellanox CX556A-EDAT tests: Added tests with Mellanox ConnectX5-2p100GE NICs in 2n-clx testbeds using VPP native rdma driver.
- **IPsec reconfiguration tests**: Added tests measuring the impact of IPsec tunnels creations and removals.
- Load Balancer tests: Added VPP performance tests for Maglev, L3DSR (Direct Server Return), Layer 4 Load Balancing NAT Mode.

#### 2. TEST FRAMEWORK

- CSIT Python3 support: Full migration of CSIT from Python2.7 to Python3.6. This change includes library migration, PIP dependency upgrade, CSIT container images, infrastructure packages upgrade/installation.
- CSIT PAPI support: Finished conversion of CSIT VAT L1 keywords to PAPI L1 KWs in CSIT using VPP Python bindings (VPP PAPI). Redesign of key components of PAPI Socket Executor and PAPI history. Due to issues with PAPI performance, VAT is still used in CSIT for all VPP scale tests. See known issues below.
- **Test Suite Generator**: Added capability to generate suites for different drivers per NIC model including DPDK, AVF, RDMA. Extended coverage for all tests.
- **General Code Housekeeping**: Ongoing RF keywords optimizations, removal of redundant RF keywords and aligning of suite/test setup/teardowns.


#### 3. TEST ENVIRONMENT

- TRex Fortville NIC Performance: Received FVL fix from Intel resolving TRex low throughput issue. TRex per FVL NIC throughput increased from ~27 Mpps to the nominal ~37 Mpps. For details see CSIT-1503<sup>29</sup> and TRex-519<sup>30</sup>.
- New Intel Xeon Cascadelake Testbeds: Added performance tests for 2-Node-Cascadelake (2n-clx) testbeds with x710, xxv710 and cx556a-edat NIC cards.

#### 4. PRESENTATION AND ANALYTICS LAYER

- **Graphs layout improvements**: Improved performance graphs layout for better readability and maintenance: test grouping, axis labels, descriptions, other informative decoration.
- Latency graphs: Min/Avg/Max group bar latency graphs are replaced with packet latency percentile distribution at different background packet loads based on TRex latency hdrhistogram measurements.

<sup>29</sup> https://jira.fd.io/browse/CSIT-1503

<sup>30</sup> https://trex-tgn.cisco.com/youtrack/issue/trex-519

## 2.2.2 Known Issues

List of known issues in CSIT-2001 for VPP performance tests:

| #  | JiraID                                             | Issue Description                                                                                                                                                                                                                                                                                        |
|----|----------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| 1  | CSIT-570<sup>31</sup>                              | Sporadic (1 in 200) NDR discovery test failures on x520. DPDK reporting rx-errors, indicating L1 issue. Suspected issue with HW combination of X710-X520 in LF testbeds. Not observed outside of LF testbeds.                                                                                            |
| 2  | VPP-662<sup>32</sup>                               | 9000B packets not supported by NICs VIC1227 and VIC1387.                                                                                                                                                                                                                                                 |
| 3  | CSIT-1498<sup>33</sup>                             | Memif tests are sporadically failing on initialization of memif connection.                                                                                                                                                                                                                              |
| 4  | VPP-1677<sup>34</sup>                              | 9000B ip4 nat44: VPP crash + coredump. VPP crashes very often in case that NAT44 is configured and it has to process IP4 jumbo frames (9000B).                                                                                                                                                           |
| 5  | CSIT-1591<sup>35</sup>, VPP-1763<sup>36</sup>      | All CSIT scale tests can not use PAPI due to much slower performance compared to VAT/CLI (it takes much longer to program VPP). This needs to be addressed on the PAPI side.                                                                                                                             |
| 6  | VPP-1675<sup>37</sup>                              | IPv4 IPSEC 9000B packet tests are failing as no packet is forwarded. Reason: chained buffers are not supported.                                                                                                                                                                                          |
| 7  | CSIT-1593<sup>38</sup>                             | IPv4 AVF 9000B packet tests are failing on 3n-skx while passing on 2n-skx.                                                                                                                                                                                                                               |
| 8  | CSIT-1675<sup>39</sup>                             | Intel Xeon 2n-skx, 3n-skx and 2n-clx testbeds behaviour and performance became inconsistent following the upgrade to the latest Ubuntu 18.04 LTS kernel version (4.15.0-72-generic) and associated microcode packages (skx ucode 0x2000064, clx ucode 0x500002c). VPP as well as DPDK L3fwd tests are affected. |
| 9  | CSIT-1679<sup>40</sup>                             | All 2n-clx VPP ip4 tests with xxv710 and avf driver are failing.                                                                                                                                                                                                                                         |
| 10 | CSIT-1680<sup>41</sup>                             | Some 2n-clx cx556a rdma tests are failing.                                                                                                                                                                                                                                                               |


<sup>31</sup> https://jira.fd.io/browse/CSIT-570

<sup>32</sup> https://jira.fd.io/browse/VPP-662

<sup>33</sup> https://jira.fd.io/browse/CSIT-1498

<sup>34</sup> https://jira.fd.io/browse/VPP-1677

<sup>35</sup> https://jira.fd.io/browse/CSIT-1499

<sup>36</sup> https://jira.fd.io/browse/VPP-1763

<sup>37</sup> https://jira.fd.io/browse/VPP-1675

<sup>38</sup> https://jira.fd.io/browse/CSIT-1593

<sup>39</sup> https://jira.fd.io/browse/CSIT-1675

<sup>40</sup> https://jira.fd.io/browse/CSIT-1679

<sup>41</sup> https://jira.fd.io/browse/CSIT-1680

# 2.3 Packet Throughput

Throughput graphs are generated based on the results data obtained from the CSIT-2001 test jobs. In order to verify benchmark results repeatability, selected CSIT performance tests are executed multiple times (target: 10 times) on each physical testbed type. Box-and-Whisker plots are used to display variations in measured throughput values.

Lists of tests selected for multiple execution and graphing are captured per testbed type in test\_select\_list\_{testbed\_type}.md<sup>42</sup> files.

Graphs are split into sections as follows:

- 1. Header 1: VPP packet path and lookup types
  - L2 Ethernet Switching: L2 bridge-domain, L2 cross-connect and L2 patch
  - IPv4 Routing: IPv4 routing with /32 prefixes
  - IPv6 Routing: IPv6 routing with /128 prefixes
  - SRv6 Routing: SRv6 with IPv6 routing
  - IPv4 Tunnels: IPv4 overlay tunnels
  - KVM VMs vhost-user: KVM VMs connected over virtio and vhost-user interfaces
  - LXC/DRC Container Memif: Linux containers and Docker containers connected over Memif interfaces
  - IPsec IPv4 Routing: IPsec encryption/decryption with IPv4 routing
  - Virtual Topology System: VXLAN configurations with L2 bridge-domains
- 2. Header 2: testbeds and NIC models
  - section name format:
    - {testbed\_type}-{nic\_model}
  - testbed\_type:
    - 2n-skx: 2-node Xeon Skylake
    - 3n-skx: 3-node Xeon Skylake
    - 2n-clx: 2-node Xeon Cascade Lake
    - 3n-hsw: 3-node Xeon Haswell
    - 3n-tsh: 3-node Arm TaiShan
    - 2n-dnv: 2-node Atom Denverton
    - 3n-dnv: 3-node Atom Denverton
  - nic\_model:
    - xxv710: xxv710 2p25GE Intel (Fortville)
    - x710: x710 4p10GE Intel (Fortville)
    - xl710: xl710 2p40GE Intel (Fortville)
    - x520: x520 2p10GE Intel (Niantic)
    - x553: x553 2p10GE Intel (Niantic)
- 3. **Header 3**: test group names
  - section name format:
    - {frame\_size}-{worker\_thread\_core\_cfg}-{vpp\_functionality}-{vpp\_lookup\_type}-{baseline\_scale}-{nic\_driver}
  - frame\_size:
    - 64b: 64 byte frames, smallest frame size for untagged IPv4 packets
    - 78b: 78 byte frames, smallest frame size for untagged IPv6 packets
    - 114b: VXLAN encapsulated L2 frames
    - imix: a sequence of (7x64B, 4x570, 1x1518) byte frames
  - worker\_thread\_core\_cfg:
    - 1t1c: 1 worker thread on 1 core, hyper-threading not used
    - 2t1c: 2 worker threads on 1 core, hyper-threading used
  - vpp\_functionality (optional):
    - features: including input-acl, output-acl, macip-iacl, nat44
    - srv6: srv6 encap/decap, proxy
    - link-bonding: L2 link aggregation with 1 or 2 bonded links
    - ipsec: IPsec encryption/decryption with different ciphers
    - vts: Virtual Topology System specific tests
  - vpp\_lookup\_type:
    - l2switching, ip4routing, ip6routing, ip4tunnel, vhost, memif
  - baseline\_scale:
    - base: baseline tests with less than 10 forwarding entries
    - scale: scale tests with up to 2 million forwarding entries
    - base-scale: both baseline and scale tests grouped together
  - nic\_driver:
    - avf: VPP native avf driver for Intel Fortville NICs
    - i40e: dpdk poll mode driver for Intel Fortville NICs
    - ixgbe: dpdk poll mode driver for Intel Niantic NICs

<sup>42</sup> https://git.fd.io/csit/tree/docs/job\_specs
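The test group naming format listed above can be decomposed mechanically. The sketch below is one possible parser written for illustration only; `SectionName`, `parse_section_name` and the regular expression are hypothetical and are not part of CSIT. It accepts only the lookup types and baseline/scale values listed above and treats the functionality field as optional, so names that carry extra fields are rejected.

```python
import re
from typing import NamedTuple, Optional

# Values taken from the lists above; longer alternatives come first.
LOOKUP_TYPES = ("l2switching", "ip4routing", "ip6routing",
                "ip4tunnel", "vhost", "memif")
BASELINE_SCALES = ("base-scale", "base", "scale")


class SectionName(NamedTuple):
    frame_size: str
    threads: int
    cores: int
    functionality: Optional[str]
    lookup_type: str
    baseline_scale: str
    nic_driver: str


_PATTERN = re.compile(
    r"^(?P<frame>\d+b|imix)"
    r"-(?P<threads>\d+)t(?P<cores>\d+)c"
    r"(?:-(?P<func>[a-z0-9-]+?))?"                    # optional vpp_functionality
    r"-(?P<lookup>" + "|".join(LOOKUP_TYPES) + r")"
    r"-(?P<scale>" + "|".join(BASELINE_SCALES) + r")"
    r"-(?P<driver>[a-z0-9]+)$"
)


def parse_section_name(name: str) -> SectionName:
    """Split a test group section name into its documented fields."""
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognised section name: {name!r}")
    return SectionName(
        frame_size=match["frame"],
        threads=int(match["threads"]),
        cores=int(match["cores"]),
        functionality=match["func"],
        lookup_type=match["lookup"],
        baseline_scale=match["scale"],
        nic_driver=match["driver"],
    )


plain = parse_section_name("64b-1t1c-l2switching-base-scale-ixgbe")
with_features = parse_section_name("64b-1t1c-features-l2switching-base-ixgbe")
print(plain)
print(with_features)
```

Here `2t1c` would parse as two worker threads on one core, which matches the hyper-threading case described in the list.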

For each test case, Box-and-Whisker plots show the quartiles (Min, 1st quartile / 25th percentile, 2nd quartile / 50th percentile / median, 3rd quartile / 75th percentile, Max) across the collected data set. Outliers are plotted as individual points.

Additional information about graph data:

- 1. **Graph Title**: describes tested packet path, testbed topology, processor model, NIC model, packet size, number of cores and threads used by data plane workers and indication of VPP DUT configuration.
- 2. X-axis Labels: indices of individual test suites as listed in Graph Legend.
- 3. Y-axis Labels: measured Packets Per Second [pps] throughput values.
- 4. **Graph Legend**: lists X-axis indices with associated CSIT test suites executed to generate graphed test results.
- 5. **Hover Information**: lists minimum, first quartile, median, third quartile, and maximum. If either type of outlier is present the whisker on the appropriate side is taken to 1.5×IQR from the quartile (the "inner fence") rather than the max or min, and individual outlying data points are displayed as unfilled circles (for suspected outliers) or filled circles (for outliers). (The "outer fence" is 3×IQR from the quartile.)
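The quartile and fence values described above can be reproduced with a few lines of Python. This is a sketch under stated assumptions, not the code used to render the report graphs: `box_plot_summary` is a hypothetical helper, the sample values are invented, and the "inclusive" quartile method of `statistics.quantiles` (Python 3.8+) is only one of several conventions, so a plotting library may report slightly different quartiles.

```python
import statistics
from typing import Dict, List, Sequence, Union

Summary = Dict[str, Union[float, List[float]]]


def box_plot_summary(samples: Sequence[float]) -> Summary:
    """Five-number summary plus fence-based outlier classification."""
    data = sorted(samples)
    if len(data) < 2:
        raise ValueError("need at least two samples")
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    inner_lo, inner_hi = q1 - 1.5 * iqr, q3 + 1.5 * iqr   # inner fences
    outer_lo, outer_hi = q1 - 3.0 * iqr, q3 + 3.0 * iqr   # outer fences
    # Suspected outliers lie between the inner and outer fences.
    suspected = [x for x in data
                 if outer_lo <= x < inner_lo or inner_hi < x <= outer_hi]
    # Outliers lie beyond the outer fences.
    outliers = [x for x in data if x < outer_lo or x > outer_hi]
    return {
        "min": data[0], "q1": q1, "median": median, "q3": q3, "max": data[-1],
        "inner_fences": [inner_lo, inner_hi],
        "outer_fences": [outer_lo, outer_hi],
        "suspected_outliers": suspected,
        "outliers": outliers,
    }


# Invented sample values, for illustration only.
summary = box_plot_summary([1, 2, 3, 4, 5, 6, 7, 8, 9, 20, 40])
for key, value in summary.items():
    print(f"{key}: {value}")
```

With these samples the interquartile range is 5, so the value 20 falls between the upper inner fence (16) and the upper outer fence (23.5), while 40 lies beyond the outer fence.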

**Note:** Test results have been generated by FD.io test executor vpp performance job 2n-skx<sup>43</sup>, FD.io test executor vpp performance job 3n-skx<sup>44</sup>, FD.io test executor vpp performance job 2n-clx<sup>45</sup>, FD.io test executor vpp performance job 3n-hsw<sup>46</sup>, FD.io test executor vpp performance job 3n-tsh<sup>47</sup>, FD.io test executor vpp performance job 2n-dnv<sup>48</sup>, FD.io test executor vpp performance job 3n-dnv<sup>49</sup> with RF result files csit-vpp-perf-2001-\*.zip archived here. Required per test case data set size is **10**, but for VPP tests the actual size varies per test case and is <=10.

<sup>43</sup> https://jenkins.fd.io/view/csit/job/csit-vpp-perf-verify-2001-2n-skx

<sup>44</sup> https://jenkins.fd.io/view/csit/job/csit-vpp-perf-verify-2001-3n-skx

<sup>45</sup> https://jenkins.fd.io/view/csit/job/csit-vpp-perf-verify-2001-2n-clx

<sup>46</sup> https://jenkins.fd.io/view/csit/job/csit-vpp-perf-verify-2001-3n-hsw

<sup>47</sup> https://jenkins.fd.io/view/csit/job/csit-vpp-perf-verify-2001-3n-tsh

<sup>48</sup> https://jenkins.fd.io/view/csit/job/csit-vpp-perf-verify-2001-2n-dnv

<sup>49</sup> https://jenkins.fd.io/view/csit/job/csit-vpp-perf-verify-2001-3n-dnv

## 2.3.1 L2 Ethernet Switching

Following sections include summary graphs of VPP Phy-to-Phy performance with L2 Ethernet switching, including NDR throughput (zero packet loss) and PDR throughput (<0.5% packet loss). Performance is reported for VPP running in multiple configurations of VPP worker thread(s), a.k.a. VPP data plane thread(s), and their physical CPU core(s) placement.

CSIT source code for the test cases used for plots can be found in CSIT git repository<sup>50</sup>.

<sup>50</sup> https://git.fd.io/csit/tree/tests/vpp/perf/l2?h=rls2001

#### 3n-hsw-xl710

## 64b-1t1c-l2switching-base-scale-dpdk

#### 3n-tsh-x520

## 64b-1t1c-l2switching-base-ixgbe





# 64b-1t1c-l2switching-base-scale-ixgbe





## 64b-1t1c-features-l2switching-base-ixgbe





### 2n-dnv-x553

## 64b-1t1c-l2switching-base-ixgbe





# 64b-1t1c-l2switching-base-scale-ixgbe





#### 3n-dnv-x553

## 64b-1t1c-l2switching-base-ixgbe





# 64b-1t1c-l2switching-base-scale-ixgbe





## 64b-1t1c-features-l2switching-base-ixgbe





#### 2n-clx-xxv710

## 64b-2t1c-l2switching-base-avf





# 64b-2t1c-l2switching-base-scale-avf





# 64b-2t1c-l2switching-base-dpdk





# 64b-2t1c-l2switching-base-scale-dpdk





### 2n-clx-x710

## 64b-2t1c-l2switching-base-scale-[avf,dpdk]





### 2n-clx-cx556a

## 64b-2t1c-l2switching-base-rdma-core





# 64b-2t1c-l2switching-scale-rdma-core





## 2.3.2 IPv4 Routing

Following sections include summary graphs of VPP Phy-to-Phy performance with IPv4 Routed-Forwarding, including NDR throughput (zero packet loss) and PDR throughput (<0.5% packet loss). Performance is reported for VPP running in multiple configurations of VPP worker thread(s), a.k.a. VPP data plane thread(s), and their physical CPU core(s) placement.

CSIT source code for the test cases used for plots can be found in CSIT git repository<sup>51</sup>.

<sup>51</sup> https://git.fd.io/csit/tree/tests/vpp/perf/ip4?h=rls2001

### 3n-hsw-xl710

## 64b-1t1c-ip4routing-base-scale-dpdk





### 3n-tsh-x520

## 64b-1t1c-ip4routing-base-scale-ixgbe





## 64b-1t1c-features-ip4routing-base-ixgbe





### 2n-dnv-x553

## 64b-1t1c-ip4routing-base-scale-ixgbe





### 3n-dnv-x553

## 64b-1t1c-ip4routing-base-scale-ixgbe





### 2n-clx-xxv710

## 64b-2t1c-ip4routing-base-scale-avf





# 64b-2t1c-ip4routing-base-scale-dpdk





# 64b-2t1c-features-ip4routing-base-dpdk





### 2n-clx-x710

## 64b-2t1c-ip4routing-base-scale-[avf,dpdk]





### 2n-clx-cx556a

## 64b-2t1c-ip4routing-base-rdma-core





# 64b-2t1c-ip4routing-scale-rdma-core





## 64b-2t1c-ip4routing-features





## 2.3.3 IPv6 Routing

Following sections include summary graphs of VPP Phy-to-Phy performance with IPv6 Routed-Forwarding, including NDR throughput (zero packet loss) and PDR throughput (<0.5% packet loss). Performance is reported for VPP running in multiple configurations of VPP worker thread(s), a.k.a. VPP data plane thread(s), and their physical CPU core(s) placement.

CSIT source code for the test cases used for plots can be found in CSIT git repository<sup>52</sup>.

<sup>52</sup> https://git.fd.io/csit/tree/tests/vpp/perf/ip6?h=rls2001

## 78b-1t1c-ip6routing-base-scale-dpdk

## 78b-1t1c-ip6routing-base-scale-ixgbe





#### 2n-dnv-x553

## 78b-1t1c-ip6routing-base-scale-ixgbe





#### 3n-dnv-x553

## 78b-1t1c-ip6routing-base-scale-ixgbe





#### 2n-clx-xxv710

## 78b-2t1c-ip6routing-base-scale-dpdk





#### 2n-clx-x710

## 78b-2t1c-ip6routing-base-scale-dpdk





#### 2n-clx-cx556a

## 78b-2t1c-ip6routing-base-scale-rdma-core





## 2.3.4 SRv6 Routing

Following sections include summary graphs of VPP Phy-to-Phy performance with SRv6, including NDR throughput (zero packet loss) and PDR throughput (<0.5% packet loss). Performance is reported for VPP running in multiple configurations of VPP worker thread(s), a.k.a. VPP data plane thread(s), and their physical CPU core(s) placement.

CSIT source code for the test cases used for plots can be found in CSIT git repository<sup>53</sup>.

<sup>53</sup> https://git.fd.io/csit/tree/tests/vpp/perf/srv6?h=rls2001

## 78b-1t1c-srv6-ip6routing-base-dpdk





## 78b-1t1c-srv6-ip6routing-base-ixgbe





### 2.3.5 IPv4 Tunnels

Following sections include summary graphs of VPP Phy-to-Phy performance with IPv4 Overlay Tunnels, including NDR throughput (zero packet loss) and PDR throughput (<0.5% packet loss). Performance is reported for VPP running in multiple configurations of VPP worker thread(s), a.k.a. VPP data plane thread(s), and their physical CPU core(s) placement.

CSIT source code for the test cases used for plots can be found in CSIT git repository<sup>54</sup>.

<sup>54</sup> https://git.fd.io/csit/tree/tests/vpp/perf/ip4\_tunnels?h=rls2001

## 64b-1t1c-ip4tunnel-base-dpdk





## 64b-1t1c-ip4tunnel-base-scale-ixgbe





#### 3n-dnv-x553

## 64b-1t1c-ip4tunnel-base-scale-ixgbe





### 2.3.6 KVM VMs vhost-user

Following sections include summary graphs of VPP Phy-to-VM(s)-to-Phy performance with VM virtio and VPP vhost-user virtual interfaces, including NDR throughput (zero packet loss) and PDR throughput (<0.5% packet loss). Performance is reported for VPP running in multiple configurations of VPP worker thread(s), a.k.a. VPP data plane thread(s), and their physical CPU core(s) placement.

CSIT source code for the test cases used for plots can be found in CSIT git repository<sup>55</sup>.

<sup>55</sup> https://git.fd.io/csit/tree/tests/vpp/perf/vm\_vhost?h=rls2001

## 64b-1t1c-vhost-base-dpdk-testpmd





# 64b-1t1c-vhost-base-dpdk-vpp





## 64b-1t1c-vhost-base-ixgbe-vppl2xc





#### 2n-clx-xxv710

## 64b-2t1c-vhost-base-dpdk-testpmd





# 64b-2t1c-vhost-base-dpdk-vpp





#### 2n-clx-cx556a

## 64b-2t1c-vhost-base-rdma-core-testpmd





## 64b-2t1c-vhost-base-rdma-core-vpp





## 2.3.7 LXC/DRC Container Memif

Following sections include summary graphs of VPP Phy-to-Phy performance with Container memif Connections, including NDR throughput (zero packet loss) and PDR throughput (<0.5% packet loss). Performance is reported for VPP running in multiple configurations of VPP worker thread(s), a.k.a. VPP data plane thread(s), and their physical CPU core(s) placement.

CSIT source code for the test cases used for plots can be found in CSIT git repository<sup>56</sup>.

<sup>56</sup> https://git.fd.io/csit/tree/tests/vpp/perf/container\_memif?h=rls2001

#### 3n-tsh-x520

## 64b-1t1c-memif-base-ixgbe





#### 2n-clx-xxv710

## 64b-2t1c-memif-base-dpdk





#### 2n-clx-cx556a

#### 64b-2t1c-memif-base-rdma-core





## 2.3.8 IPSec IPv4 Routing

Following sections include summary graphs of VPP Phy-to-Phy performance with IPSec encryption used in combination with IPv4 routed-forwarding, including NDR throughput (zero packet loss) and PDR throughput (<0.5% packet loss). VPP IPSec encryption is accelerated using DPDK cryptodev library driving Intel Quick Assist (QAT) crypto PCIe hardware cards. Performance is reported for VPP running in multiple configurations of VPP worker thread(s), a.k.a. VPP data plane thread(s), and their physical CPU core(s) placement.

CSIT source code for the test cases used for plots can be found in CSIT git repository<sup>57</sup>.

<sup>57</sup> https://git.fd.io/csit/tree/tests/vpp/perf/crypto?h=rls2001

#### 3n-hsw-xl710

## imix-1t1c-ipsec-ip4routing-base-scale-sw-dpdk





# imix-1t1c-ipsec-ip4routing-base-scale-hw-dpdk





#### 3n-tsh-x520

## imix-1t1c-ipsec-ip4routing-base-scale-sw-ixgbe





#### 3n-dnv-x553

## imix-1t1c-ipsec-ip4routing-base-scale-sw-ixgbe





# 2.4 Speedup Multi-Core

Speedup Multi-Core throughput graphs are generated by multiple executions of the same performance tests across physical testbeds hosted in LF FD.io labs: 3n-hsw, 2n-skx, 3n-skx, 2n-clx, 3n-tsh, 2n-dnv, 3n-dnv. Grouped bars illustrate the 64B/78B packet throughput speedup ratio for 2- and 4-core multi-threaded VPP configurations relative to 1-core configurations.

Additional information about graph data:

- 1. **Graph Title**: describes tested packet path, testbed topology, processor model, NIC model, packet size used by data plane workers and indication of VPP DUT configuration.
- 2. X-axis Labels: number of cores.
- 3. Y-axis Labels: measured Packets Per Second [pps] throughput values.
- 4. Graph Legend: lists CSIT test suites executed to generate graphed test results.
- 5. **Hover Information**: lists number of runs executed, specific test substring, mean value of the measured packet throughput, calculated perfect throughput value, difference between measured and perfect values and relative speedup value.
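The values listed in the hover information can be derived from per-core-count throughput samples as sketched below. This is an illustrative calculation only: `speedup_rows` is a hypothetical helper, the sample numbers are invented, and the "perfect" throughput is assumed here to be the 1-core mean multiplied by the number of cores, without any cap from link or NIC limits that the real graphs may apply.

```python
from statistics import mean
from typing import Dict, List, Sequence


def speedup_rows(throughput_by_cores: Dict[int, Sequence[float]]
                 ) -> List[Dict[str, float]]:
    """Compute mean, perfect (linear) throughput, difference and speedup."""
    if 1 not in throughput_by_cores:
        raise ValueError("1-core results are required as the reference")
    base = mean(throughput_by_cores[1])
    rows = []
    for cores in sorted(throughput_by_cores):
        samples = throughput_by_cores[cores]
        measured = mean(samples)
        perfect = base * cores            # assumed ideal linear scaling
        rows.append({
            "cores": cores,
            "runs": len(samples),
            "mean": measured,
            "perfect": perfect,
            "diff_pct": 100.0 * (measured - perfect) / perfect,
            "speedup": measured / base,
        })
    return rows


# Invented throughput samples in Mpps for 1, 2 and 4 cores.
rows = speedup_rows({1: [10.0, 10.0], 2: [19.0, 19.0], 4: [36.0, 36.0]})
for row in rows:
    print(row)
```

In this invented data set the 4-core configuration reaches a speedup of 3.6, which is 10% below the assumed perfect value.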

**Note:** Test results have been generated by FD.io test executor vpp performance job 2n-skx<sup>58</sup>, FD.io test executor vpp performance job 3n-skx<sup>59</sup>, FD.io test executor vpp performance job 2n-clx<sup>60</sup>, FD.io test executor vpp performance job 3n-hsw<sup>61</sup>, FD.io test executor vpp performance job 3n-tsh<sup>62</sup>, FD.io test executor vpp performance job 2n-dnv<sup>63</sup>, FD.io test executor vpp performance job 3n-dnv<sup>64</sup> with RF result files csit-vpp-perf-2001-\*.zip archived here. Required per test case data set size is **10**, but for VPP tests the actual size varies per test case and is <=10.

<sup>58</sup> https://jenkins.fd.io/view/csit/job/csit-vpp-perf-verify-2001-2n-skx

<sup>59</sup> https://jenkins.fd.io/view/csit/job/csit-vpp-perf-verify-2001-3n-skx

<sup>60</sup> https://jenkins.fd.io/view/csit/job/csit-vpp-perf-verify-2001-2n-clx

<sup>61</sup> https://jenkins.fd.io/view/csit/job/csit-vpp-perf-verify-2001-3n-hsw

<sup>62</sup> https://jenkins.fd.io/view/csit/job/csit-vpp-perf-verify-2001-3n-tsh

<sup>63</sup> https://jenkins.fd.io/view/csit/job/csit-vpp-perf-verify-2001-2n-dnv

<sup>64</sup> https://jenkins.fd.io/view/csit/job/csit-vpp-perf-verify-2001-3n-dnv

## 2.4.1 L2 Ethernet Switching

Following sections include Throughput Speedup Analysis for VPP multi-core multi-thread configurations with no Hyper-Threading, specifically for tested 2t2c (2 threads, 2 cores) and 4t4c scenarios. 1t1c throughput results are used as a reference for the reported speedup ratio. Input data used for the graphs comes from Phy-to-Phy 64B performance tests with VPP L2 Ethernet switching, including NDR throughput (zero packet loss) and PDR throughput (<0.5% packet loss).

CSIT source code for the test cases used for plots can be found in CSIT git repository<sup>65</sup>.

<sup>65</sup> https://git.fd.io/csit/tree/tests/vpp/perf/l2?h=rls2001

#### 3n-hsw-xl710

## 64b-l2switching-base-scale-dpdk

#### 3n-tsh-x520

## 64b-l2switching-base-ixgbe





# 64b-l2switching-base-scale-ixgbe





## 64b-features-l2switching-base-ixgbe





#### 2n-dnv-x553

## 64b-l2switching-base-ixgbe





# 64b-l2switching-base-scale-ixgbe





#### 3n-dnv-x553

## 64b-l2switching-base-ixgbe





# 64b-l2switching-base-scale-ixgbe





## 64b-features-l2switching-base-ixgbe





#### 2n-clx-xxv710

## 64b-l2switching-base-avf





# 64b-l2switching-base-scale-avf





# 64b-l2switching-base-dpdk





# 64b-l2switching-base-scale-dpdk





#### 2n-clx-x710

## 64b-l2switching-base-scale-[avf,dpdk]





#### 2n-clx-cx556a

## 64b-l2switching-base-rdma-core





# 64b-l2switching-scale





## 2.4.2 IPv4 Routing

Following sections include Throughput Speedup Analysis for VPP multi-core multi-thread configurations with no Hyper-Threading, specifically for tested 2t2c (2 threads, 2 cores) and 4t4c scenarios. 1t1c throughput results are used as a reference for the reported speedup ratio. Input data used for the graphs comes from Phy-to-Phy 64B performance tests with VPP IPv4 Routed-Forwarding, including NDR throughput (zero packet loss) and PDR throughput (<0.5% packet loss).

CSIT source code for the test cases used for plots can be found in CSIT git repository<sup>66</sup>.

<sup>66</sup> https://git.fd.io/csit/tree/tests/vpp/perf/ip4?h=rls2001

#### 3n-hsw-xl710

## 64b-ip4routing-base-scale-dpdk





#### 3n-tsh-x520

## 64b-ip4routing-base-scale-ixgbe





## 64b-features-ip4routing-base-ixgbe





#### 2n-dnv-x553

## 64b-ip4routing-base-scale-ixgbe





# 64b-features-ip4routing-base-ixgbe





#### 3n-dnv-x553

## 64b-ip4routing-base-scale-ixgbe





# 64b-features-ip4routing-base-ixgbe





### 2n-clx-xxv710

## 64b-ip4routing-base-scale-avf





## 64b-ip4routing-base-scale-dpdk





# 64b-features-ip4routing-base-dpdk





### 2n-clx-x710

## 64b-ip4routing-base-scale-[avf,dpdk]





### 2n-clx-cx556a

## 64b-ip4routing-base-rdma-core





## 64b-ip4routing-scale





## 64b-ip4routing-features





## 2.4.3 IPv6 Routing

Following sections include Throughput Speedup Analysis for VPP multi-core multi-thread configurations with no Hyper-Threading, specifically for tested 2t2c (2 threads, 2 cores) and 4t4c scenarios. 1t1c throughput results are used as a reference for the reported speedup ratio. Input data used for the graphs comes from Phy-to-Phy 78B performance tests with VPP IPv6 Routed-Forwarding, including NDR throughput (zero packet loss) and PDR throughput (<0.5% packet loss).

CSIT source code for the test cases used for plots can be found in CSIT git repository<sup>67</sup>.

<sup>67</sup> https://git.fd.io/csit/tree/tests/vpp/perf/ip6?h=rls2001

### 3n-hsw-xl710

## 78b-ip6routing-base-scale-dpdk

### 3n-tsh-x520

## 78b-ip6routing-base-scale-ixgbe





### 2n-dnv-x553

## 78b-ip6routing-base-scale-ixgbe





### 3n-dnv-x553

## 78b-ip6routing-base-scale-ixgbe





### 2n-clx-xxv710

## 78b-ip6routing-base-scale-dpdk





### 2n-clx-x710

## 78b-ip6routing-base-scale-dpdk





### 2n-clx-cx556a

## 78b-ip6routing-base-scale-rdma-core





## 2.4.4 SRv6 Routing

The following sections include Throughput Speedup Analysis for VPP multi-core multi-thread configurations with no Hyper-Threading, specifically for tested 2t2c (2 threads, 2 cores) and 4t4c scenarios. 1t1c throughput results are used as the reference for the reported speedup ratio. Input data used for the graphs comes from Phy-to-Phy 78B performance tests with VPP SRv6, including NDR throughput (zero packet loss) and PDR throughput (<0.5% packet loss).

CSIT source code for the test cases used for plots can be found in CSIT git repository<sup>68</sup>.

<sup>68</sup> https://git.fd.io/csit/tree/tests/vpp/perf/srv6?h=rls2001

### 3n-hsw-xl710

## 78b-srv6-ip6routing-base-dpdk





### 3n-tsh-x520

## 78b-srv6-ip6routing-base-ixgbe





### 2.4.5 IPv4 Tunnels

The following sections include Throughput Speedup Analysis for VPP multi-core multi-thread configurations with no Hyper-Threading, specifically for tested 2t2c (2 threads, 2 cores) and 4t4c scenarios. 1t1c throughput results are used as the reference for the reported speedup ratio. Performance is reported for VPP running in multiple configurations of VPP worker thread(s), a.k.a. VPP data plane thread(s), and their physical CPU core(s) placement.

CSIT source code for the test cases used for plots can be found in CSIT git repository<sup>69</sup>.

<sup>69</sup> https://git.fd.io/csit/tree/tests/vpp/perf/ip4\_tunnels?h=rls2001

### 3n-hsw-xl710

## 64b-ip4tunnel-base-dpdk





## 64b-ip4tunnel-base-scale-ixgbe





### 3n-dnv-x553

## 64b-ip4tunnel-base-scale-ixgbe





### 2.4.6 KVM VMs vhost-user

The following sections include Throughput Speedup Analysis for VPP multi-core multi-thread configurations with no Hyper-Threading, specifically for tested 2t2c (2 threads, 2 cores) and 4t4c scenarios. 1t1c throughput results are used as the reference for the reported speedup ratio. Input data used for the graphs comes from Phy-to-Phy 64B performance tests with VM vhost-user, including NDR throughput (zero packet loss) and PDR throughput (<0.5% packet loss).

CSIT source code for the test cases used for plots can be found in CSIT git repository<sup>70</sup>.

<sup>70</sup> https://git.fd.io/csit/tree/tests/vpp/perf/vm\_vhost?h=rls2001

### 3n-hsw-xl710

## 64b-vhost-base-dpdk-testpmd





## 64b-vhost-base-dpdk-vpp





## 64b-vhost-base-ixgbe-vppl2xc





### 2n-clx-xxv710

## 64b-vhost-base-dpdk-testpmd





## 64b-vhost-base-dpdk-vpp





### 2n-clx-cx556a

## 64b-vhost-base-rdma-core-testpmd





## 64b-vhost-base-rdma-core-vpp



### 2.4.7 LXC/DRC Container Memif

The following sections include Throughput Speedup Analysis for VPP multi-core multi-thread configurations with no Hyper-Threading, specifically for tested 2t2c (2 threads, 2 cores) and 4t4c scenarios. 1t1c throughput results are used as the reference for the reported speedup ratio. Performance is reported for VPP running in multiple configurations of VPP worker thread(s), a.k.a. VPP data plane thread(s), and their physical CPU core(s) placement.

CSIT source code for the test cases used for plots can be found in CSIT git repository<sup>71</sup>.

<sup>71</sup> https://git.fd.io/csit/tree/tests/vpp/perf/container\_memif?h=rls2001

## 64b-memif-base-ixgbe





## 2n-clx-xxv710

## 64b-memif-base-dpdk





### 2n-clx-cx556a

### 64b-memif-base-rdma-core





## 2.4.8 IPSec IPv4 Routing

The following sections include Throughput Speedup Analysis for VPP multi-core multi-thread configurations with no Hyper-Threading, specifically for tested 2t2c (2 threads, 2 cores) and 4t4c scenarios. 1t1c throughput results are used as the reference for the reported speedup ratio. VPP IPSec encryption is accelerated using the DPDK cryptodev library driving Intel Quick Assist (QAT) crypto PCIe hardware cards. Performance is reported for VPP running in multiple configurations of VPP worker thread(s), a.k.a. VPP data plane thread(s), and their physical CPU core(s) placement.

CSIT source code for the test cases used for plots can be found in CSIT git repository<sup>72</sup>.

<sup>72</sup> https://git.fd.io/csit/tree/tests/vpp/perf/crypto?h=rls2001

### 3n-hsw-xl710

## imix-ipsec-ip4routing-base-scale-sw-dpdk





# imix-ipsec-ip4routing-base-scale-hw-dpdk





## imix-ipsec-ip4routing-base-scale-sw-ixgbe





### 3n-dnv-x553

## imix-ipsec-ip4routing-base-scale-sw-ixgbe





## 2.5 Packet Latency

VPP latency results are generated based on the test data obtained from CSIT-2001 NDR-PDR throughput tests executed across physical testbeds hosted in LF FD.io labs: 3n-hsw, 3n-skx, 2n-skx, 2n-clx, 3n-dnv, 2n-dnv, 3n-tsh.

Latency by percentile distribution plots are used to show packet latency percentiles at different packet rate load levels: i) No-Load latency streams only, ii) Low-Load at 10% PDR, iii) Mid-Load at 50% PDR and iv) High-Load at 90% PDR.

Additional information about graph data:

- 1. Graph Title: describes tested DUT packet path.
- 2. X-axis Labels: percentile of packets.
- 3. Y-axis Labels: measured one-way packet latency values in [uSec].
- 4. Graph Legend: list of latency tests at different packet rate load level.
- 5. **Hover Information**: packet rate load level, stream direction (East-West, West-East), percentile, one-way latency.

**Note:** Test results have been generated by FD.io test executor vpp performance job 3n-hsw<sup>73</sup> and FD.io test executor vpp performance job 3n-tsh<sup>74</sup> with RF result files csit-vpp-perf-2001-\*.zip archived here.

<sup>73</sup> https://jenkins.fd.io/view/csit/job/csit-vpp-perf-verify-2001-3n-hsw

<sup>74</sup> https://jenkins.fd.io/view/csit/job/csit-vpp-perf-verify-2001-3n-tsh
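The load levels used for the latency percentile plots can be derived from a measured PDR value as in the sketch below. The function name `latency_load_rates` and the sample PDR are hypothetical. The No-Load level is left out because it carries latency streams only and is not expressed as a fraction of PDR.

```python
# Load levels named in the text, as a percentage of the PDR rate.
LOAD_LEVELS_PCT = {"Low-Load": 10, "Mid-Load": 50, "High-Load": 90}


def latency_load_rates(pdr_pps):
    """Map each load level to an offered packet rate in packets per second."""
    # Integer arithmetic keeps the rates exact for whole-number inputs.
    return {name: pdr_pps * pct // 100 for name, pct in LOAD_LEVELS_PCT.items()}


# Hypothetical PDR of 10 Mpps, for illustration only.
rates = latency_load_rates(10_000_000)
for name, pps in rates.items():
    print(f"{name}: {pps} pps")
```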

# 2.5.1 L2 Ethernet Switching

CSIT source code for the test cases used for plots can be found in CSIT git repository<sup>75</sup>.

<sup>75</sup> https://git.fd.io/csit/tree/tests/vpp/perf/l2?h=rls2001

### 3n-hsw-xl710

## 64b-1t1c-l2switching-base-scale-dpdk













#### 3n-tsh-x520

### 64b-1t1c-l2switching-base-scale-ixgbe

















### 64b-1t1c-features-l2switching-base-ixgbe











### 2n-clx-xxv710

## 64b-2t1c-l2switching-base-scale-avf



















### 64b-2t1c-l2switching-base-scale-dpdk



















# 2.5.2 IPv4 Routing

CSIT source code for the test cases used for plots can be found in CSIT git repository<sup>76</sup>.

<sup>76</sup> https://git.fd.io/csit/tree/tests/vpp/perf/ip4?h=rls2001

### 3n-hsw-xl710

### 64b-1t1c-ip4routing-base-scale-dpdk







#### 3n-tsh-x520

#### 64b-1t1c-ip4routing-base-scale-ixgbe









## 64b-1t1c-ip4routing-features-ixgbe













#### 2n-clx-xxv710

#### 64b-2t1c-ip4routing-base-scale-avf





























## 64b-2t1c-ip4routing-base-scale-dpdk



















# 2.5.3 IPv6 Routing

CSIT source code for the test cases used for plots can be found in CSIT git repository<sup>77</sup>.

<sup>77</sup> https://git.fd.io/csit/tree/tests/vpp/perf/ip6?h=rls2001

#### 3n-hsw-xl710

## 78b-1t1c-ip6routing-base-scale-dpdk







#### 3n-tsh-x520

### 78b-1t1c-ip6routing-base-scale-ixgbe











#### 2n-clx-xxv710

### 78b-2t1c-ip6routing-base-scale-avf











## 78b-2t1c-ip6routing-base-scale-dpdk











# 2.5.4 SRv6 Routing

CSIT source code for the test cases used for plots can be found in CSIT git repository<sup>78</sup>.

<sup>78</sup> https://git.fd.io/csit/tree/tests/vpp/perf/srv6?h=rls2001

#### 3n-hsw-xl710

## 78b-1t1c-srv6-ip6routing-base-dpdk











#### 3n-tsh-x520

### 78b-1t1c-srv6-ip6routing-base-ixgbe











## 2.5.5 IPv4 Tunnels

CSIT source code for the test cases used for plots can be found in CSIT git repository<sup>79</sup>.

<sup>79</sup> https://git.fd.io/csit/tree/tests/vpp/perf/ip4\_tunnels?h=rls2001

#### 3n-hsw-xl710

## 64b-1t1c-ip4tunnel-base-dpdk





#### 3n-tsh-x520

# 64b-1t1c-ip4tunnel-base-scale-ixgbe





# 2.5.6 KVM VMs vhost-user

CSIT source code for the test cases used for plots can be found in CSIT git repository<sup>80</sup>.

<sup>80</sup> https://git.fd.io/csit/tree/tests/vpp/perf/vm\_vhost?h=rls2001

#### 3n-hsw-xl710

## 64b-1t1c-vhost-base-dpdk



















#### 3n-tsh-x520

## 64b-1t1c-vhost-base-ixgbe













#### 2n-clx-xxv710

## 64b-2t1c-vhost-base-avf-testpmd









# 64b-2t1c-vhost-base-dpdk-testpmd









# 64b-2t1c-vhost-base-avf-vpp









# 64b-2t1c-vhost-base-dpdk-vpp









## 2.5.7 LXC/DRC Container Memif

CSIT source code for the test cases used for plots can be found in CSIT git repository<sup>81</sup>.

<sup>81</sup> https://git.fd.io/csit/tree/tests/vpp/perf/container\_memif?h=rls2001

### 3n-tsh-x520

## 64b-1t1c-memif-base-ixgbe











#### 2n-clx-xxv710

## 64b-2t1c-memif-base-avf









## 64b-2t1c-memif-base-dpdk









## 2.5.8 IPSec IPv4 Routing

CSIT source code for the test cases used for plots can be found in CSIT git repository<sup>82</sup>.

<sup>82</sup> https://git.fd.io/csit/tree/tests/vpp/perf/crypto?h=rls2001

#### 3n-hsw-xl710

## 1518b-1t1c-ipsec-ip4routing-base-scale-sw-dpdk













## 1518b-1t1c-ipsec-ip4routing-base-scale-hw-dpdk









#### 3n-tsh-x520

## 1518b-1t1c-ipsec-ip4routing-base-scale-sw-ixgbe







## 2.6 Soak Tests

Long duration (30 minutes per test) soak tests are executed using the *PLRsearch* (page 22) algorithm. As the tests take a long time, only 10 tests were executed, two runs each.

Additional information about graph data:

- 1. Graph Title: describes type of tests and soak test duration.
- 2. X-axis Labels: indices of test suites.
- 3. Y-axis Labels: estimated lower bounds for critical rate value in [Mpps].
- 4. Graph Legend: list of X-axis indices with CSIT test suites.
- 5. **Hover Information**: in general lists minimum, first quartile, median, third quartile, and maximum. If either type of outlier is present the whisker on the appropriate side is taken to 1.5×IQR from the quartile (the "inner fence") rather than the max or min, and individual outlying data points are displayed as unfilled circles (for suspected outliers) or filled circles (for outliers). (The "outer fence" is 3×IQR from the quartile.) When number of samples is low, some values are not displayed.
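The box plot statistics listed under Hover Information can be reproduced with a short sketch. It is an illustration under stated assumptions: the function name `box_summary` is hypothetical, and the quartile interpolation (`method="inclusive"`) is a choice made here, so the plotting library used for the report may compute slightly different quartiles.

```python
import statistics


def box_summary(samples):
    """Five-number summary plus inner/outer fences and outlier classes."""
    data = sorted(samples)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    inner_low, inner_high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outer_low, outer_high = q1 - 3.0 * iqr, q3 + 3.0 * iqr
    # Beyond the outer fence: outliers (filled circles in the graphs).
    outliers = [x for x in data if x < outer_low or x > outer_high]
    # Between the fences: suspected outliers (unfilled circles).
    suspected = [
        x for x in data
        if (x < inner_low or x > inner_high) and outer_low <= x <= outer_high
    ]
    return {
        "min": data[0], "q1": q1, "median": median, "q3": q3, "max": data[-1],
        "inner_fence": (inner_low, inner_high),
        "outer_fence": (outer_low, outer_high),
        "suspected_outliers": suspected,
        "outliers": outliers,
    }


# Hypothetical sample values, for illustration only.
summary = box_summary([10, 11, 12, 13, 14, 15, 16, 17, 18, 30, 60])
print(summary)
```

With only two runs per soak test, the quartiles and fences carry little information, which matches the remark that some values are not displayed when the number of samples is low.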

**Note:** Test results have been generated by FD.io test executor vpp performance job 2n-skx<sup>83</sup> with RF result files csit-vpp-perf-2001-\*.zip archived here.


<sup>83</sup> https://jenkins.fd.io/view/csit/job/csit-vpp-perf-verify-2001-2n-skx






# 2.7 Reconfiguration Tests

See Reconfiguration Tests (page 31) for methodology description of this test type.

#### 2.7.1 VNF Service Chains

In each test, a single service chain is added. The reconfiguration repeats all the steps used to set up the initial chains, except that the last step (starting VMs) is skipped.

Additional information about graph data:

- 1. Graph Title: describes tested VPP packet path. Format:
  - wire encapsulation dot1qip4vxlan,
  - VPP forwarding mode l2bd,
  - total number {Y} of service chains {Y}ch,
  - total number of chains being reconfigured 1ach,
  - total number of vhost-user interfaces forwarding packets on VPP with {Y} chains and {X} VMs per chain {2XY}vh (2 interfaces per {X} VMs per {Y} chains),
  - total number {XY} of VNF VMs forwarding packets {XY}vm and finally
  - VNF workload in VM testpmd.
- 2. X-axis Labels: indices of individual test suites as listed in Graph Legend.
- 3. Y-axis Labels: measured Implied time loss [s] values.
- 4. **Graph Legend**: lists X-axis indices with associated CSIT test suites executed to generate graphed test results and the average value of measured packet loss.
- 5. **Hover Information**: lists minimum, first quartile, median, third quartile, and maximum. If either type of outlier is present the whisker on the appropriate side is taken to 1.5×IQR from the quartile (the "inner fence") rather than the max or min, and individual outlying data points are displayed as unfilled circles (for suspected outliers) or filled circles (for outliers). (The "outer fence" is 3×IQR from the quartile.)
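The counts that appear in the graph title follow directly from the number of chains {Y} and the number of VMs per chain {X}. The sketch below only illustrates that arithmetic. The function name `reconf_title_tag` and the exact way the fields are joined are assumptions, and actual CSIT test names may contain additional fields.

```python
def reconf_title_tag(chains, vms_per_chain, reconfigured_chains=1):
    """Build an illustrative title substring from chain and VM counts."""
    # Each VM has 2 vhost-user interfaces: 2 * X * Y in total.
    vhost_interfaces = 2 * vms_per_chain * chains
    # Total number of VNF VMs forwarding packets: X * Y.
    vms = vms_per_chain * chains
    return (
        f"dot1qip4vxlan-l2bd-{chains}ch-{reconfigured_chains}ach-"
        f"{vhost_interfaces}vh-{vms}vm-testpmd"
    )


# Two chains with two VMs each: 8 vhost-user interfaces, 4 VMs.
tag = reconf_title_tag(chains=2, vms_per_chain=2)
print(tag)
```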

**Note:** Test results have been generated by FD.io test executor vpp performance job 2n-skx<sup>84</sup>, FD.io test executor vpp performance job 2n-clx<sup>85</sup> with RF result files csit-vpp-perf-2001-\*.zip archived here.

<sup>84</sup> https://jenkins.fd.io/view/csit/job/csit-vpp-perf-verify-2001-2n-skx

<sup>85</sup> https://jenkins.fd.io/view/csit/job/csit-vpp-perf-verify-2001-2n-clx

#### 2n-clx-xxv710

## imix-2t1c-dot1qip4vxlan-l2bd



## imix-4t2c-dot1qip4vxlan-l2bd



## imix-8t4c-dot1qip4vxlan-l2bd



# 2.8 NFV Service Density

NFV Service Density is benchmarked in five distinct NF service configurations:

- VNF Service Chains Routing
- CNF Service Chains Routing
- CNF Service Pipelines Routing
- VNF Service Chains Tunnels
- CNF Service Chains IPSEC

Each configuration is tested in a number of service density combinations [Number of Service Instances] x [Number of NFs per Service Instance]. The actual tested range is based on available CPU physical core resources.

# 2.8.1 VNF Service Chains Routing

Throughput graphs for VNF service chains are generated by multiple executions of tests covering a range of VNF service densities defined as [Number of Service Chains] x [Number of VNFs per Service Chain]. The results are presented in the service density graph. Each graph includes the results of both configurations: one NF per physical core and two NFs per physical core and their relative difference.

Additional information about graph data:

- 1. Graph Title: describes tested packet path including VNF workload running in each VM.
- 2. **X-axis Labels**: VNFs per service chain.
- 3. Y-axis Labels: number of service chains.
- 4. **Z-axis Color Scale**: lists 64B/IMIX Packet Throughput (mean MRR/NDR/PDR value) in Mpps or the Relative Difference.
- 5. **Hover Information**: specific test substring listing vhost-chain-vm combinations, number of runs executed, mean MRR/NDR/PDR throughput in Mpps, standard deviation for both configurations and their relative difference.
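The service density grid and its relative difference can be sketched as below. The report does not give the formula for the relative difference, so the one used here (change of the two-NFs-per-core result relative to the one-NF-per-core result, in percent) is an assumption, as are the function name `relative_difference_grid` and the sample throughput values.

```python
def relative_difference_grid(one_nf_per_core, two_nf_per_core):
    """Return {(chains, nfs_per_chain): relative difference in percent}.

    Both inputs map (number of service chains, NFs per chain) to mean
    throughput in Mpps. Only combinations present in both are compared.
    """
    grid = {}
    for density, base in one_nf_per_core.items():
        if density in two_nf_per_core:
            grid[density] = (two_nf_per_core[density] - base) / base * 100.0
    return grid


# Hypothetical mean throughput values in Mpps, for illustration only.
one_per_core = {(1, 1): 4.0, (2, 1): 8.0, (1, 2): 2.0}
two_per_core = {(1, 1): 3.0, (2, 1): 8.0}
diff = relative_difference_grid(one_per_core, two_per_core)
for (chains, nfs), pct in sorted(diff.items()):
    print(f"{chains} chain(s) x {nfs} NF(s): {pct:+.1f}%")
```

Densities tested in only one configuration are skipped, which mirrors the fact that the tested range depends on the available physical cores.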

**Note:** Test results have been generated by FD.io test executor vpp performance job 2n-skx<sup>86</sup> and FD.io test executor vpp performance job 2n-clx<sup>87</sup> with RF result files csit-vpp-perf-2001-\*.zip archived here.

<sup>86</sup> https://jenkins.fd.io/view/csit/job/csit-vpp-perf-verify-2001-2n-skx

<sup>87</sup> https://jenkins.fd.io/view/csit/job/csit-vpp-perf-verify-2001-2n-clx

## 2n-clx-xxv710-mrr







## 2n-clx-xxv710-ndr







# 2n-clx-xxv710-pdr







## 2.8.2 CNF Service Chains Routing

Throughput graphs for CNF service chains are generated by multiple executions of tests covering a range of CNF service densities defined as [Number of Service Chains] x [Number of CNFs per Service Chain]. The results are presented in the service density graph. Each graph includes the results of both configurations: one NF per physical core and two NFs per physical core and their relative difference.

Additional information about graph data:

- 1. **Graph Title**: describes tested packet path including CNF workload running in each Docker Container.
- 2. X-axis Labels: CNFs per service chain.
- 3. Y-axis Labels: number of service chains.
- 4. **Z-axis Color Scale**: lists 64B/IMIX Packet Throughput (mean MRR/NDR/PDR value) in Mpps or the Relative Difference.
- 5. **Hover Information**: specific test substring listing memif-chain-docker\_container combinations, number of runs executed, mean MRR/NDR/PDR throughput in Mpps, standard deviation for both configurations and their relative difference.

**Note:** Test results have been generated by FD.io test executor vpp performance job 2n-skx<sup>88</sup> and FD.io test executor vpp performance job 2n-clx<sup>89</sup> with RF result files csit-vpp-perf-2001-\*.zip archived here.

<sup>88</sup> https://jenkins.fd.io/view/csit/job/csit-vpp-perf-verify-2001-2n-skx

<sup>89</sup> https://jenkins.fd.io/view/csit/job/csit-vpp-perf-verify-2001-2n-clx

## 2n-clx-xxv710-mrr







## 2n-clx-xxv710-ndr







# 2n-clx-xxv710-pdr







## 2.8.3 CNF Service Pipelines Routing

Throughput graphs for CNF service pipelines are generated by multiple executions of tests covering a range of CNF service densities defined as [Number of Service Pipelines] x [Number of CNFs per Service Pipeline]. The results are presented in the service density graph. Each graph includes the results of both configurations: one NF per physical core and two NFs per physical core and their relative difference.

Additional information about graph data:

- 1. **Graph Title**: describes tested packet path including CNF workload running in each Docker Container.
- 2. X-axis Labels: CNFs per service pipeline.
- 3. Y-axis Labels: number of service pipelines.
- 4. **Z-axis Color Scale**: lists 64B/IMIX Packet Throughput (mean MRR/NDR/PDR value) in Mpps or the Relative Difference.
- 5. **Hover Information**: specific test substring listing memif-pipeline-docker\_container combinations, number of runs executed, mean MRR/NDR/PDR throughput in Mpps, standard deviation for both configurations and their relative difference.

**Note:** Test results have been generated by FD.io test executor vpp performance job 2n-skx<sup>90</sup> and FD.io test executor vpp performance job 2n-clx<sup>91</sup> with RF result files csit-vpp-perf-2001-\*.zip archived here.

<sup>90</sup> https://jenkins.fd.io/view/csit/job/csit-vpp-perf-verify-2001-2n-skx

<sup>91</sup> https://jenkins.fd.io/view/csit/job/csit-vpp-perf-verify-2001-2n-clx

## 2n-clx-xxv710-mrr







## 2n-clx-xxv710-ndr







## 2n-clx-xxv710-pdr







## 2.8.4 VNF Service Chains Tunnels

Additional information about graph data:

- 1. Graph Title: describes tested packet path including VNF workload running in each VM.
- 2. X-axis Labels: VNFs per service chain.
- 3. Y-axis Labels: number of service chains.
- 4. **Z-axis Color Scale**: lists 64B/IMIX Packet Throughput (mean MRR/NDR/PDR value) in Mpps or the Relative Difference.
- 5. **Hover Information**: specific test substring listing vhost-chain-vm combinations, number of runs executed, mean MRR/NDR/PDR throughput in Mpps, standard deviation for both configurations and their relative difference.

**Note:** Test results have been generated by FD.io test executor vpp performance job 2n-skx<sup>92</sup> and FD.io test executor vpp performance job 2n-clx<sup>93</sup> with RF result files csit-vpp-perf-2001-\*.zip archived here.

<sup>92</sup> https://jenkins.fd.io/view/csit/job/csit-vpp-perf-verify-2001-2n-skx

<sup>93</sup> https://jenkins.fd.io/view/csit/job/csit-vpp-perf-verify-2001-2n-clx

## 2n-clx-xxv710-mrr







## 2n-clx-xxv710-ndr

# imix-4t2c-eth-l2bd



# imix-8t4c-eth-l2bd



# 2n-clx-xxv710-pdr

# imix-2t1c-eth-l2bd



# imix-4t2c-eth-l2bd



# imix-8t4c-eth-l2bd



# 2.9 Hoststack Testing

# 2.9.1 HTTP/TCP with WRK

Performance graphs are generated by multiple executions of the same performance tests across physical testbeds hosted in LF FD.io labs: 3n-hsw. Box-and-Whisker plots are used to display variations in measured throughput values, without making any assumptions about the underlying statistical distribution.

For each test case, Box-and-Whisker plots show the quartiles (Min, 1st quartile / 25th percentile, 2nd quartile / 50th percentile / median, 3rd quartile / 75th percentile, Max) across the collected data set. Outliers are plotted as individual points.

Additional information about graph data:

- 1. X-axis Labels: indices of individual test suites as listed in Graph Legend.
- 2. **Y-axis Labels**: measured Connections Per Second [cps] or Requests Per Second [rps] throughput values.
- 3. **Graph Legend**: lists X-axis indices with associated CSIT test suites executed to generate graphed test results.
- 4. **Hover Information**: lists minimum, first quartile, median, third quartile, and maximum. If either type of outlier is present the whisker on the appropriate side is taken to 1.5×IQR from the quartile (the "inner fence") rather than the max or min, and individual outlying data points are displayed as unfilled circles (for suspected outliers) or filled circles (for outliers). (The "outer fence" is 3×IQR from the quartile.)

**Note:** Data sources for reported test results: i) FD.io test executor vpp performance job 2n-clx<sup>94</sup>, ii) archived FD.io jobs test result output files.

CSIT source code for the test cases used for plots can be found in CSIT git repository<sup>95</sup>.

<sup>94</sup> https://jenkins.fd.io/view/csit/job/csit-vpp-perf-verify-2001-2n-clx

<sup>95</sup> https://git.fd.io/csit/tree/tests/vpp/perf/tcp?h=rls2001

# Connections per second



# Requests per second



# 2.9.2 TCP/IP with iperf3

9000b-1t1c-xl710-base-scale

9000b-1t1c-xl710-nsim-base-scale

# 2.9.3 QUIC/UDP/IP with vpp\_echo

9000b-1t1c-xl710-base-scale

# 2.10 Comparisons

#### 2.10.1 Current vs. Previous Release

Relative comparison of VPP packet throughput (NDR, PDR and MRR) between VPP-20.01 release and VPP-19.08 release (measured for CSIT-2001 and CSIT-1908 respectively) is calculated from results of tests running on 2-node Intel Xeon Skylake (2n-skx), 3-node Intel Xeon Skylake (3n-skx), 3-Node Intel Xeon Haswell (3n-hsw), 2-node Intel Atom Denverton (2n-dnv), 3-node Intel Atom Denverton (3n-dnv), 3-node Arm TaiShan (3n-tsh) testbeds, in 1-core, 2-core and 4-core (MRR only) configurations.

Listed mean and standard deviation values are computed based on a series of the same tests executed against the respective VPP releases to verify test results repeatability, with percentage change calculated for mean values. Note that the standard deviation is quite high for a small number of packet throughput tests, which indicates poor test results repeatability and makes the relative change of mean throughput value not fully representative for these tests. The root causes behind poor results repeatability vary between the test cases.
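The values listed in the comparison tables can be derived as in the following sketch. It is an illustration only: the function name `compare_releases` and the sample values are hypothetical, and the use of the sample standard deviation (rather than the population one) is an assumption.

```python
import statistics


def compare_releases(previous_runs, current_runs):
    """Return mean and standard deviation per release and the change in percent."""
    prev_mean = statistics.mean(previous_runs)
    curr_mean = statistics.mean(current_runs)
    return {
        "previous": (prev_mean, statistics.stdev(previous_runs)),
        "current": (curr_mean, statistics.stdev(current_runs)),
        # Percentage change is calculated on the mean values.
        "change_pct": (curr_mean - prev_mean) / prev_mean * 100.0,
    }


# Hypothetical throughput samples in Mpps, for illustration only.
comparison = compare_releases([9.8, 10.0, 10.2], [10.9, 11.0, 11.1])
print(comparison)
```

When the standard deviation of either series is large compared with the difference of the means, the percentage change should be read with caution, as the paragraph above notes.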

**Note:** Test results have been generated by

- FD.io test executor vpp performance job 2n-skx<sup>96</sup>,
- FD.io test executor vpp performance job 3n-skx<sup>97</sup>,
- FD.io test executor vpp performance job 3n-hsw<sup>98</sup>,
- FD.io test executor vpp performance job 2n-dnv<sup>99</sup>,
- FD.io test executor vpp performance job 3n-dnv<sup>100</sup>,
- FD.io test executor vpp performance job 3n-tsh<sup>101</sup>

with RF result files csit-vpp-perf-2001-\*.zip archived here.

#### 3n-hsw

# **NDR Comparison**

Comparison tables in HTML, ASCII and CSV formats:

- HTML 1t1c NDR comparison
- HTML 2t2c NDR comparison
- ASCII 1t1c NDR comparison
- ASCII 2t2c NDR comparison
- CSV 1t1c NDR comparison
- CSV 2t2c NDR comparison

# **PDR Comparison**

Comparison tables in HTML, ASCII and CSV formats:

<sup>96</sup> https://jenkins.fd.io/view/csit/job/csit-vpp-perf-verify-2001-2n-skx

<sup>97</sup> https://jenkins.fd.io/view/csit/job/csit-vpp-perf-verify-2001-3n-skx

<sup>98</sup> https://jenkins.fd.io/view/csit/job/csit-vpp-perf-verify-2001-3n-hsw

<sup>99</sup> https://jenkins.fd.io/view/csit/job/csit-vpp-perf-verify-2001-2n-dnv

<sup>100</sup> https://jenkins.fd.io/view/csit/job/csit-vpp-perf-verify-2001-3n-dnv

<sup>101</sup> https://jenkins.fd.io/view/csit/job/csit-vpp-perf-verify-2001-3n-tsh

- HTML 1t1c PDR comparison
- HTML 2t2c PDR comparison
- ASCII 1t1c PDR comparison
- ASCII 2t2c PDR comparison
- CSV 1t1c PDR comparison
- CSV 2t2c PDR comparison

# **MRR Comparison**

Comparison tables in HTML, ASCII and CSV formats:

- HTML 1t1c MRR comparison
- HTML 2t2c MRR comparison
- HTML 4t4c MRR comparison
- ASCII 1t1c MRR comparison
- ASCII 2t2c MRR comparison
- ASCII 4t4c MRR comparison
- CSV 1t1c MRR comparison
- CSV 2t2c MRR comparison
- CSV 4t4c MRR comparison

### 2n-dnv

## **NDR Comparison**

Comparison tables in HTML, ASCII and CSV formats:

- HTML 1t1c NDR comparison
- HTML 2t2c NDR comparison
- ASCII 1t1c NDR comparison
- ASCII 2t2c NDR comparison
- CSV 1t1c NDR comparison
- CSV 2t2c NDR comparison

# **PDR Comparison**

Comparison tables in HTML, ASCII and CSV formats:

- HTML 1t1c PDR comparison
- HTML 2t2c PDR comparison
- ASCII 1t1c PDR comparison
- ASCII 2t2c PDR comparison
- CSV 1t1c PDR comparison
- CSV 2t2c PDR comparison


#### **MRR Comparison**

Comparison tables in HTML, ASCII and CSV formats:

- HTML 1t1c MRR comparison
- HTML 2t2c MRR comparison
- HTML 4t4c MRR comparison
- ASCII 1t1c MRR comparison
- ASCII 2t2c MRR comparison
- ASCII 4t4c MRR comparison
- CSV 1t1c MRR comparison
- CSV 2t2c MRR comparison
- CSV 4t4c MRR comparison

# 3n-dnv

#### **NDR Comparison**

Comparison tables in HTML, ASCII and CSV formats:

- HTML 1t1c NDR comparison
- HTML 2t2c NDR comparison
- ASCII 1t1c NDR comparison
- ASCII 2t2c NDR comparison
- CSV 1t1c NDR comparison
- CSV 2t2c NDR comparison

#### **PDR Comparison**

Comparison tables in HTML, ASCII and CSV formats:

- HTML 1t1c PDR comparison
- HTML 2t2c PDR comparison
- ASCII 1t1c PDR comparison
- ASCII 2t2c PDR comparison
- CSV 1t1c PDR comparison
- CSV 2t2c PDR comparison

### **MRR Comparison**

Comparison tables in HTML, ASCII and CSV formats:

- HTML 1t1c MRR comparison
- HTML 2t2c MRR comparison
- HTML 4t4c MRR comparison
- ASCII 1t1c MRR comparison
- ASCII 2t2c MRR comparison
- ASCII 4t4c MRR comparison
- CSV 1t1c MRR comparison
- CSV 2t2c MRR comparison
- CSV 4t4c MRR comparison

#### 3n-tsh

# **NDR Comparison**

Comparison tables in HTML, ASCII and CSV formats:

- HTML 1t1c NDR comparison
- HTML 2t2c NDR comparison
- ASCII 1t1c NDR comparison
- ASCII 2t2c NDR comparison
- CSV 1t1c NDR comparison
- CSV 2t2c NDR comparison

# **PDR Comparison**

Comparison tables in HTML, ASCII and CSV formats:

- HTML 1t1c PDR comparison
- HTML 2t2c PDR comparison
- ASCII 1t1c PDR comparison
- ASCII 2t2c PDR comparison
- CSV 1t1c PDR comparison
- CSV 2t2c PDR comparison

# **MRR Comparison**

Comparison tables in HTML, ASCII and CSV formats:

- HTML 1t1c MRR comparison
- HTML 2t2c MRR comparison
- HTML 4t4c MRR comparison
- ASCII 1t1c MRR comparison
- ASCII 2t2c MRR comparison
- ASCII 4t4c MRR comparison
- CSV 1t1c MRR comparison
- CSV 2t2c MRR comparison
- CSV 4t4c MRR comparison


# 2.10.2 2n-Clx vs. 3n-Hsw Testbeds

Relative comparison of VPP-20.01 release packet throughput (NDR, PDR and MRR) is calculated for the same tests executed on 2-Node Cascade Lake (2n-clx) and 3-Node Haswell (3n-hsw) physical testbed types, in 1-core, 2-core and 4-core configurations.

**Note:** Test results have been generated by FD.io test executor vpp performance job 3n-hsw $^{102}$  and FD.io test executor vpp performance job 2n-clx $^{103}$  with RF result files csit-vpp-perf-2001-\*.zip archived here.

# **NDR Comparison**

Comparison tables in HTML, ASCII and CSV formats:

- HTML 1c NDR comparison
- HTML 2c NDR comparison
- ASCII 1c NDR comparison
- ASCII 2c NDR comparison
- CSV 1c NDR comparison
- CSV 2c NDR comparison

# **PDR Comparison**

Comparison tables in HTML, ASCII and CSV formats:

- HTML 1c PDR comparison
- HTML 2c PDR comparison
- ASCII 1c PDR comparison
- ASCII 2c PDR comparison
- CSV 1c PDR comparison
- CSV 2c PDR comparison

# **MRR Comparison**

Comparison tables in HTML, ASCII and CSV formats:

- HTML 1c MRR comparison
- HTML 2c MRR comparison
- HTML 4c MRR comparison
- ASCII 1c MRR comparison
- ASCII 2c MRR comparison
- ASCII 4c MRR comparison
- CSV 1c MRR comparison
- CSV 2c MRR comparison
- CSV 4c MRR comparison

<sup>102</sup> https://jenkins.fd.io/view/csit/job/csit-vpp-perf-verify-2001-3n-hsw

<sup>103</sup> https://jenkins.fd.io/view/csit/job/csit-vpp-perf-verify-2001-2n-clx

# 2.10.3 Soak Tests vs. NDR Tests

Relative comparison of VPP-20.01 release Soak PLRSearch vs. NDR packet throughput is calculated for the tests executed on 2-Node Skylake physical testbed types, in 1-core configurations.

**Note:** Test results have been generated by FD.io test executor vpp performance job 2n-skx<sup>104</sup>, FD.io test executor vpp performance job 2n-clx<sup>105</sup> with RF result files csit-vpp-perf-2001-\*.zip archived here.

Comparison tables in ASCII and CSV formats:

#### 2n-clx

- ASCII Soak vs. NDR comparison
- CSV Soak vs. NDR comparison

# 2.11 Throughput Trending

In addition to reporting throughput comparison between VPP releases, CSIT provides continuous performance trending for VPP master branch:

- 1. Performance Dashboard<sup>106</sup>: per VPP test case throughput trend, trend compliance and summary of detected anomalies.
- 2. Trending Methodology<sup>107</sup>: throughput test metrics, trend calculations and anomaly classification (progression, regression).
- 3. VPP Trendline Graphs<sup>108</sup>: per VPP build MRR throughput measurements against the trendline with anomaly highlights and associated CSIT test jobs.

<sup>104</sup> https://jenkins.fd.io/view/csit/job/csit-vpp-perf-verify-2001-2n-skx

<sup>105</sup> https://jenkins.fd.io/view/csit/job/csit-vpp-perf-verify-2001-2n-clx

<sup>106</sup> https://docs.fd.io/csit/master/trending/introduction/index.html

<sup>107</sup> https://docs.fd.io/csit/master/trending/methodology/index.html

<sup>108</sup> https://docs.fd.io/csit/master/trending/trending/index.html

# 2.12 Test Environment

# 2.12.1 Physical Testbeds

FD.io CSIT performance tests are executed in physical testbeds hosted by LF for the FD.io project. Two physical testbed topology types are used:

- **3-Node Topology**: Consisting of two servers acting as SUTs (Systems Under Test) and one server as TG (Traffic Generator), all connected in ring topology.
- **2-Node Topology**: Consisting of one server acting as SUT and one server as TG, both connected in ring topology.

Tested SUT servers are based on a range of processors including Intel Xeon Haswell-SP, Intel Xeon Skylake-SP, Intel Xeon Cascade Lake-SP, Arm, Intel Atom. More detailed description is provided in *Physical Testbeds* (page 5). Tested logical topologies are described in *Logical Topologies* (page 38).

# 2.12.2 Server Specifications

Complete technical specifications of compute servers used in CSIT physical testbeds are maintained in FD.io CSIT repository: FD.io CSIT testbeds - Xeon Cascade Lake<sup>109</sup>, FD.io CSIT testbeds - Xeon Skylake, Arm, Atom<sup>110</sup> and FD.io CSIT Testbeds - Xeon Haswell<sup>111</sup>.

# 2.12.3 Pre-Test Server Calibration

A number of SUT server sub-system runtime parameters have been identified as impacting data plane performance tests. Calibrating those parameters is part of FD.io CSIT pre-test activities, and includes measuring and reporting the following:

- 1. System level core jitter: measure the duration of core interrupts by Linux in clock cycles and how often interrupts happen, using the CPU core jitter tool<sup>112</sup>.
- 2. Memory bandwidth: measure bandwidth with the Intel MLC tool<sup>113</sup>.
- 3. Memory latency: measure memory latency with the Intel MLC tool.
- 4. Cache latency at all levels (L1, L2, and Last Level Cache): measure cache latency with the Intel MLC tool.

Measured values of the listed parameters are especially important for repeatable zero packet loss throughput measurements across multiple system instances. More generally, they are useful as background data for comparing data plane performance results across disparate servers.

Following sections include measured calibration data for testbeds.

# 2.12.4 Calibration Data - Skylake

Following sections include sample calibration data measured on s11-t31-sut1 server running in one of the Intel Xeon Skylake testbeds as specified in FD.io CSIT testbeds - Xeon Skylake, Arm, Atom<sup>114</sup>.

Calibration data obtained from all other servers in Skylake testbeds shows the same or similar values.

<sup>109</sup> https://git.fd.io/csit/tree/docs/lab/testbeds\_sm\_clx\_hw\_bios\_cfg.md?h=rls2001

<sup>110</sup> https://git.fd.io/csit/tree/docs/lab/testbeds\_sm\_skx\_hw\_bios\_cfg.md?h=rls2001

<sup>111</sup> https://git.fd.io/csit/tree/docs/lab/testbeds\_ucs\_hsw\_hw\_bios\_cfg.md?h=rls2001

<sup>112</sup> https://git.fd.io/pma\_tools/tree/jitter

<sup>113</sup> https://software.intel.com/en-us/articles/intelr-memory-latency-checker

<sup>114</sup> https://git.fd.io/csit/tree/docs/lab/testbeds\_sm\_skx\_hw\_bios\_cfg.md?h=rls2001

#### Linux cmdline

```
$ cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-4.15.0-72-generic root=UUID=e05120bb-7127-43db-b1e3-a66edd4c43bd ro isolcpus=1-27,29-55,57-83,85-111 nohz_full=1-27,29-55,57-83,85-111 rcu_nocbs=1-27,29-55,57-83,85-111 numa_balancing=disable intel_pstate=disable intel_iommu=on iommu=pt nmi_watchdog=0 audit=0 nosoftlockup processor.max_cstate=1 intel_idle.max_cstate=1 hpet=disable tsc=reliable mce=off console=tty0 console=ttyS0,115200n8
```
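The CPU lists in the boot parameters above can be expanded to see which logical CPUs are isolated from the general scheduler. The following is a minimal sketch, not part of the CSIT tooling: the helper names `expand_cpu_list` and `parse_cmdline` are hypothetical, and the total of 112 logical CPUs is an assumption inferred from the highest CPU index (111) in the lists.

```python
def expand_cpu_list(spec):
    """Expand a kernel CPU list such as '1-3,5' into [1, 2, 3, 5]."""
    cpus = []
    for part in spec.split(","):
        if "-" in part:
            first, last = part.split("-")
            cpus.extend(range(int(first), int(last) + 1))
        else:
            cpus.append(int(part))
    return cpus


def parse_cmdline(cmdline):
    """Map key=value boot parameters to their values (bare flags map to '')."""
    params = {}
    for token in cmdline.split():
        key, _, value = token.partition("=")
        params[key] = value
    return params


# Subset of the boot parameters shown in the listing above.
CMDLINE = (
    "isolcpus=1-27,29-55,57-83,85-111 nohz_full=1-27,29-55,57-83,85-111 "
    "rcu_nocbs=1-27,29-55,57-83,85-111"
)

# Assumption: logical CPUs are numbered 0..111 on this server.
TOTAL_CPUS = 112

params = parse_cmdline(CMDLINE)
isolated = expand_cpu_list(params["isolcpus"])
# CPUs not listed in isolcpus remain available to the OS scheduler.
housekeeping = sorted(set(range(TOTAL_CPUS)) - set(isolated))
print(len(isolated), housekeeping)
```

Under these assumptions the same CPU list is used for `isolcpus`, `nohz_full` and `rcu_nocbs`, so only the CPUs left over in `housekeeping` handle general kernel work.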

#### Linux uname

### **System-level Core Jitter**

```
$ sudo taskset -c 3 /home/testuser/pma_tools/jitter/jitter -i 20
Linux Jitter testing program version 1.8
Iterations=20
The pragram will execute a dummy function 80000 times
Display is updated every 20000 displayUpdate intervals
Timings are in CPU Core cycles
Inst_Min:    Minimum Excution time during the display update interval(default is ~1 second)
Inst_Max:    Maximum Excution time during the display update interval(default is ~1 second)
Inst_jitter: Jitter in the Excution time during rhe display update interval. This is the value of interest
last_Exec:   The Excution time of last iteration just before the display update
Abs_Min:     Absolute Minimum Excution time since the program started or statistics were reset
Abs_Max:     Absolute Maximum Excution time since the program started or statistics were reset
tmp:         Cumulative value calcualted by the dummy function
Interval:    Time interval between the display updates in Core Cycles
Sample No:   Sample number
Inst_Min  Inst_Max  Inst_jitter  last_Exec  Abs_min  Abs_max         tmp    Interval  Sample No
  160022    171330        11308     160022   160022   171330  2538733568  3204142750          1
  160022    167294         7272     160026   160022   171330   328335360  3203873548          2
  160022    167560         7538     160026   160022   171330  2412904448  3203878736          3
  160022    169000         8978     160024   160022   171330   202506240  3203864588          4
  160022    166572         6550     160026   160022   171330  2287075328  3203866224          5
  160022    167460         7438     160026   160022   171330    76677120  3203854632          6
  160022    168134         8112     160024   160022   171330  2161246208  3203874674          7
  160022    169094         9072     160022   160022   171330  4245815296  3203878798          8
  160022    172460        12438     160024   160022   172460  2035417088  3204112010          9
  160022    167862         7840     160030   160022   172460  4119986176  3203856800         10
  160022    168398         8376     160024   160022   172460  1909587968  3203854192         11
  160022    167548         7526     160024   160022   172460  3994157056  3203847442         12
  160022    167562         7540     160026   160022   172460  1783758848  3203862936         13
  160022    167604         7582     160024   160022   172460  3868327936  3203859346         14
  160022    168262         8240     160024   160022   172460  1657929728  3203851120         15
  160022    169700         9678     160024   160022   172460  3742498816  3203877690         16
  160022    170476        10454     160026   160022   172460  1532100608  3204088480         17
  160022    167798         7776     160024   160022   172460  3616669696  3203862072         18
  160022    166540         6518     160024   160022   172460  1406271488  3203836904         19
  160022    167516         7494     160024   160022   172460  3490840576  3203848120         20
```
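In the samples above, the reported `Inst_jitter` equals `Inst_Max` minus `Inst_Min` for each display interval. The sketch below checks that relation on a few rows copied from the output and converts the worst value to microseconds. The conversion assumes the core cycle counter runs at the nominal 2.50 GHz given in the CPU model string reported later in this section; that frequency is an assumption, not a measured value, and the helper name `cycles_to_us` is hypothetical.

```python
# (Inst_Min, Inst_Max, Inst_jitter) in CPU core cycles, copied from
# samples 1, 2, 9 and 19 of the jitter output above.
ROWS = [
    (160022, 171330, 11308),
    (160022, 167294, 7272),
    (160022, 172460, 12438),
    (160022, 166540, 6518),
]

# Assumption: nominal core frequency from the CPU model string (2.50 GHz).
CORE_HZ = 2.5e9


def cycles_to_us(cycles, hz=CORE_HZ):
    """Convert a cycle count to microseconds at the assumed clock rate."""
    return cycles / hz * 1e6


# Check that the reported jitter is the spread between max and min.
for inst_min, inst_max, inst_jitter in ROWS:
    assert inst_jitter == inst_max - inst_min

worst = max(jitter for _, _, jitter in ROWS)
print(worst, round(cycles_to_us(worst), 2))
```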

#### **Memory Bandwidth**

```
$ sudo /home/testuser/mlc --bandwidth_matrix
Intel(R) Memory Latency Checker - v3.5
Command line parameters: --bandwidth_matrix
Using buffer size of 100.000MB/thread for reads and an additional 100.000MB/thread for writes
Measuring Memory Bandwidths between nodes within system
Bandwidths are in MB/sec (1 MB/sec = 1,000,000 Bytes/sec)
Using all the threads from each core if Hyper-threading is enabled
Using Read-only traffic type
                Numa node
Numa node            0         1
       0      107947.7   50951.5
       1       50834.6  108183.4
```
```
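The bandwidth matrix above can be summarized as a remote-to-local ratio per NUMA node, which is a convenient single figure when comparing servers. This is an illustrative sketch only: the function name `remote_to_local_ratio` is hypothetical, and the matrix values are copied from the output above, with diagonal entries taken as local and off-diagonal entries as remote.

```python
# Read-only bandwidth in MB/sec, rows and columns indexed by NUMA node.
# Diagonal entries are node-local, off-diagonal entries cross the socket link.
BW = [
    [107947.7, 50951.5],
    [50834.6, 108183.4],
]


def remote_to_local_ratio(matrix, node):
    """Lowest remote bandwidth of a node divided by its local bandwidth."""
    local = matrix[node][node]
    remote = [bw for other, bw in enumerate(matrix[node]) if other != node]
    return min(remote) / local


ratios = [remote_to_local_ratio(BW, node) for node in range(len(BW))]
for node, ratio in enumerate(ratios):
    print(node, round(ratio, 3))
```

For the values shown, remote read bandwidth is a little under half of the local bandwidth on both nodes.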

```
$ sudo /home/testuser/mlc --peak_injection_bandwidth
Intel(R) Memory Latency Checker - v3.5
Command line parameters: --peak_injection_bandwidth

Using buffer size of 100.000MB/thread for reads and an additional 100.000MB/thread for writes

Measuring Peak Injection Memory Bandwidths for the system
Bandwidths are in MB/sec (1 MB/sec = 1,000,000 Bytes/sec)
Using all the threads from each core if Hyper-threading is enabled
Using traffic with the following read-write ratios

ALL Reads : 215733.9

3:1 Reads-Writes : 182141.9

2:1 Reads-Writes : 178615.7

1:1 Reads-Writes : 149911.3

Stream-triad like: 159533.6
```

```
$ sudo /home/testuser/mlc --max_bandwidth
Intel(R) Memory Latency Checker - v3.5
Command line parameters: --max_bandwidth

Using buffer size of 100.000MB/thread for reads and an additional 100.000MB/thread for writes

Measuring Maximum Memory Bandwidths for the system
Will take several minutes to complete as multiple injection rates will be tried to get the best bandwidth
Bandwidths are in MB/sec (1 MB/sec = 1,000,000 Bytes/sec)
Using all the threads from each core if Hyper-threading is enabled
Using traffic with the following read-write ratios
ALL Reads : 216875.73
3:1 Reads-Writes : 182615.14
2:1 Reads-Writes : 178745.67
1:1 Reads-Writes : 149485.27
Stream-triad like: 180057.87
```

#### **Memory Latency**

```
$ sudo /home/testuser/mlc --idle_latency
Intel(R) Memory Latency Checker - v3.5
Command line parameters: --idle_latency
Using buffer size of 2000.000MB
Each iteration took 202.0 core clocks ( 80.8 ns)
```
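The last line of the idle latency output reports the same interval in core clocks and in nanoseconds, so dividing one by the other gives the clock rate the tool used for the conversion. A minimal sketch of that consistency check follows; the function name `implied_ghz` is hypothetical.

```python
def implied_ghz(core_clocks, nanoseconds):
    """Clock rate implied by a (cycles, ns) pair; cycles per ns equals GHz."""
    return core_clocks / nanoseconds


# Values from "Each iteration took 202.0 core clocks ( 80.8 ns)" above.
print(round(implied_ghz(202.0, 80.8), 2))
```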

```
$ sudo /home/testuser/mlc --loaded_latency
Intel(R) Memory Latency Checker - v3.5
Command line parameters: --loaded_latency
Using buffer size of 100.000MB/thread for reads and an additional 100.000MB/thread for writes
Measuring Loaded Latencies for the system
Using all the threads from each core if Hyper-threading is enabled
Using Read-only traffic type
Inject  Latency  Bandwidth
Delay   (ns)     MB/sec
00000   282.66   215712.8
00002   282.14   215757.4
00008   280.21   215868.1
00015   279.20   216313.2
00050   275.25   216643.0
00100   227.05   215075.0
00200   121.92   160242.9
00300   101.21   111587.4
00400    95.48    85019.7
00500    94.46    68717.3
00700    92.27    49742.2
01000    91.03    35264.8
01300    90.11    27396.3
01700    89.34    21178.7
02500    90.15    14672.8
03500    89.00    10715.7
05000    82.00     7788.2
09000    81.46     4684.0
20000    81.40     2541.9
```
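One way to read the loaded latency table is to ask how much bandwidth the memory system sustains before latency rises noticeably above the idle value. The sketch below picks the highest bandwidth whose latency stays within a chosen margin of the idle latency. The 25% margin and the function name `max_bandwidth_within` are illustrative assumptions, not CSIT acceptance criteria; the rows are a subset of the output above.

```python
# Idle latency in ns, from the `mlc --idle_latency` output above.
IDLE_NS = 80.8

# (inject delay, latency in ns, bandwidth in MB/sec), subset of the rows above.
LOADED = [
    (0, 282.66, 215712.8),
    (100, 227.05, 215075.0),
    (200, 121.92, 160242.9),
    (300, 101.21, 111587.4),
    (400, 95.48, 85019.7),
    (1000, 91.03, 35264.8),
    (20000, 81.40, 2541.9),
]


def max_bandwidth_within(rows, idle_ns, slack):
    """Highest bandwidth whose latency is at most idle_ns * (1 + slack)."""
    limit = idle_ns * (1.0 + slack)
    candidates = [bw for _, latency, bw in rows if latency <= limit]
    return max(candidates) if candidates else None


# Assumed margin: latency may exceed the idle value by at most 25%.
print(max_bandwidth_within(LOADED, IDLE_NS, 0.25))
```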

#### L1/L2/LLC Latency

```
$ sudo /home/testuser/mlc --c2c_latency
Intel(R) Memory Latency Checker - v3.5
Command line parameters: --c2c_latency
Measuring cache-to-cache transfer latency (in ns)...
Local Socket L2->L2 HIT  latency        53.7
Local Socket L2->L2 HITM latency        53.7
Remote Socket L2->L2 HITM latency (data address homed in writer socket)
                     Reader Numa Node
Writer Numa Node         0       1
            0            -   113.9
            1        113.9       -
Remote Socket L2->L2 HITM latency (data address homed in reader socket)
                     Reader Numa Node
Writer Numa Node         0       1
            0            -   177.9
            1        177.6       -
```

### **Spectre and Meltdown Checks**

The following section shows the output of a shell script that reports whether the system is vulnerable to the several "speculative execution" CVEs that were made public in 2018. The script is available on the Spectre & Meltdown Checker GitHub<sup>115</sup>.

```
Spectre and Meltdown mitigation detection tool v0.43
awk: cannot open bash (No such file or directory)
Checking for vulnerabilities on current system
Kernel is Linux 4.15.0-72-generic #81-Ubuntu SMP Tue Nov 26 12:20:02 UTC 2019 x86_64
CPU is Intel(R) Xeon(R) Platinum 8180 CPU @ 2.50GHz
Hardware check
* Hardware support (CPU microcode) for mitigation techniques
* Indirect Branch Restricted Speculation (IBRS)
  * SPEC_CTRL MSR is available: YES
   * CPU indicates IBRS capability: YES (SPEC_CTRL feature bit)
* Indirect Branch Prediction Barrier (IBPB)
  * PRED_CMD MSR is available: YES
  * CPU indicates IBPB capability: YES (SPEC_CTRL feature bit)
* Single Thread Indirect Branch Predictors (STIBP)
  * SPEC_CTRL MSR is available: YES
   * CPU indicates STIBP capability: YES (Intel STIBP feature bit)
 * Speculative Store Bypass Disable (SSBD)
  * CPU indicates SSBD capability: YES (Intel SSBD)
* L1 data cache invalidation
  * FLUSH_CMD MSR is available: YES
   * CPU indicates L1D flush capability: YES (L1D flush feature bit)
 * Microarchitectural Data Sampling
   * VERW instruction is available: YES (MD_CLEAR feature bit)
```

<sup>115</sup> https://github.com/speed47/spectre-meltdown-checker

```
* Enhanced IBRS (IBRS_ALL)
  * CPU indicates ARCH_CAPABILITIES MSR availability: NO
   * ARCH_CAPABILITIES MSR advertises IBRS_ALL capability: NO
* CPU explicitly indicates not being vulnerable to Meltdown/L1TF (RDCL_NO): NO
* CPU explicitly indicates not being vulnerable to Variant 4 (SSB_NO): NO
* CPU/Hypervisor indicates L1D flushing is not necessary on this system: NO
* Hypervisor indicates host CPU might be vulnerable to RSB underflow (RSBA): NO
* CPU explicitly indicates not being vulnerable to Microarchitectural Data Sampling (MDS_NO): NO
* CPU explicitly indicates not being vulnerable to TSX Asynchronous Abort (TAA_NO): NO
* CPU explicitly indicates not being vulnerable to iTLB Multihit (PSCHANGE_MSC_NO): NO
* CPU explicitly indicates having MSR for TSX control (TSX_CTRL_MSR): NO
* CPU supports Transactional Synchronization Extensions (TSX): YES (RTM feature bit)
* CPU supports Software Guard Extensions (SGX): NO
* CPU microcode is known to cause stability problems: NO (model 0x55 family 0x6 stepping 0x4 ucode 0x2000064 cpuid 0x50654)
* CPU microcode is the latest known available version: awk: cannot open bash (No such file or directory)
UNKNOWN (latest microcode version for your CPU model is unknown)
* CPU vulnerability to the speculative execution attack variants
* Vulnerable to CVE-2017-5753 (Spectre Variant 1, bounds check bypass): YES
* Vulnerable to CVE-2017-5715 (Spectre Variant 2, branch target injection): YES
* Vulnerable to CVE-2017-5754 (Variant 3, Meltdown, rogue data cache load): YES
* Vulnerable to CVE-2018-3640 (Variant 3a, rogue system register read): YES
* Vulnerable to CVE-2018-3639 (Variant 4, speculative store bypass): YES
* Vulnerable to CVE-2018-3615 (Foreshadow (SGX), L1 terminal fault): NO
* Vulnerable to CVE-2018-3620 (Foreshadow-NG (OS), L1 terminal fault): YES
* Vulnerable to CVE-2018-3646 (Foreshadow-NG (VMM), L1 terminal fault): YES
* Vulnerable to CVE-2018-12126 (Fallout, microarchitectural store buffer data sampling (MSBDS)): YES
* Vulnerable to CVE-2018-12130 (ZombieLoad, microarchitectural fill buffer data sampling (MFBDS)):_
* Vulnerable to CVE-2018-12127 (RIDL, microarchitectural load port data sampling (MLPDS)): YES
* Vulnerable to CVE-2019-11091 (RIDL, microarchitectural data sampling uncacheable memory (MDSUM)): YES
* Vulnerable to CVE-2019-11135 (ZombieLoad V2, TSX Asynchronous Abort (TAA)): YES
* Vulnerable to CVE-2018-12207 (No eXcuses, iTLB Multihit, machine check exception on page size_
CVE-2017-5753 aka Spectre Variant 1, bounds check bypass
* Mitigated according to the /sys interface: YES (Mitigation: usercopy/swapgs barriers and __user pointer sanitization)
* Kernel has array_index_mask_nospec: YES (1 occurrence(s) found of x86 64 bits array_index_mask_
* Kernel has the Red Hat/Ubuntu patch: NO
* Kernel has mask_nospec64 (arm64): NO
> STATUS: NOT VULNERABLE (Mitigation: usercopy/swapgs barriers and __user pointer sanitization)
CVE-2017-5715 aka Spectre Variant 2, branch target injection
* Mitigated according to the /sys interface: YES (Mitigation: Full generic retpoline, IBPB:_
* Mitigation 1
* Kernel is compiled with IBRS support: YES
  * IBRS enabled and active: YES (for firmware code only)
* Kernel is compiled with IBPB support: YES
   * IBPB enabled and active: YES
* Mitigation 2
* Kernel has branch predictor hardening (arm): NO
* Kernel compiled with retpoline option: YES
   * Kernel compiled with a retpoline-aware compiler: YES (kernel reports full retpoline_
* Kernel supports RSB filling: YES
```

```
> STATUS: NOT VULNERABLE (Full retpoline + IBPB are mitigating the vulnerability)
CVE-2017-5754 aka Variant 3, Meltdown, rogue data cache load
* Mitigated according to the /sys interface: YES (Mitigation: PTI)
* Kernel supports Page Table Isolation (PTI): YES
* PTI enabled and active: YES
* Reduced performance impact of PTI: YES (CPU supports INVPCID, performance impact of PTI will be greatly reduced)
* Running as a Xen PV DomU: NO
> STATUS: NOT VULNERABLE (Mitigation: PTI)
CVE-2018-3640 aka Variant 3a, rogue system register read
* CPU microcode mitigates the vulnerability: YES
> STATUS: NOT VULNERABLE (your CPU microcode mitigates the vulnerability)
CVE-2018-3639 aka Variant 4, speculative store bypass
* Mitigated according to the /sys interface: YES (Mitigation: Speculative Store Bypass disabled via prctl and seccomp)
* Kernel supports disabling speculative store bypass (SSB): YES (found in /proc/self/status)
* SSB mitigation is enabled and active: YES (per-thread through prctl)
* SSB mitigation currently active for selected processes: YES (systemd-journald systemd-logind_
> STATUS: NOT VULNERABLE (Mitigation: Speculative Store Bypass disabled via prctl and seccomp)
CVE-2018-3615 aka Foreshadow (SGX), L1 terminal fault
* CPU microcode mitigates the vulnerability: N/A
> STATUS: NOT VULNERABLE (your CPU vendor reported your CPU model as not vulnerable)
CVE-2018-3620 aka Foreshadow-NG (OS), L1 terminal fault
* Mitigated according to the /sys interface: YES (Mitigation: PTE Inversion; VMX: conditional cache flushes, SMT vulnerable)
* Kernel supports PTE inversion: YES (found in kernel image)
* PTE inversion enabled and active: YES
> STATUS: NOT VULNERABLE (Mitigation: PTE Inversion; VMX: conditional cache flushes, SMT vulnerable)
CVE-2018-3646 aka Foreshadow-NG (VMM), L1 terminal fault
* Information from the /sys interface: Mitigation: PTE Inversion; VMX: conditional cache flushes, SMT vulnerable
* This system is a host running a hypervisor: NO
* Mitigation 1 (KVM)
* EPT is disabled: NO
* Mitigation 2
* L1D flush is supported by kernel: YES (found flush_l1d in /proc/cpuinfo)
* L1D flush enabled: YES (conditional flushes)
* Hardware-backed L1D flush supported: YES (performance impact of the mitigation will be greatly reduced)
* Hyper-Threading (SMT) is enabled: YES
> STATUS: NOT VULNERABLE (this system is not running a hypervisor)
CVE-2018-12126 aka Fallout, microarchitectural store buffer data sampling (MSBDS)
* Mitigated according to the /sys interface: YES (Mitigation: Clear CPU buffers; SMT vulnerable)
* Kernel supports using MD_CLEAR mitigation: YES (md_clear found in /proc/cpuinfo)
* Kernel mitigation is enabled and active: YES
* SMT is either mitigated or disabled: NO
> STATUS: NOT VULNERABLE (Your microcode and kernel are both up to date for this mitigation, and mitigation is enabled)
CVE-2018-12130 aka ZombieLoad, microarchitectural fill buffer data sampling (MFBDS)
* Mitigated according to the /sys interface: YES (Mitigation: Clear CPU buffers; SMT vulnerable)
* Kernel supports using MD_CLEAR mitigation: YES (md_clear found in /proc/cpuinfo)
* Kernel mitigation is enabled and active: YES
```

```
* SMT is either mitigated or disabled: NO
> STATUS: NOT VULNERABLE (Your microcode and kernel are both up to date for this mitigation, and mitigation is enabled)
CVE-2018-12127 aka RIDL, microarchitectural load port data sampling (MLPDS)
* Mitigated according to the /sys interface: YES (Mitigation: Clear CPU buffers; SMT vulnerable)
* Kernel supports using MD_CLEAR mitigation: YES (md_clear found in /proc/cpuinfo)
* Kernel mitigation is enabled and active: YES
* SMT is either mitigated or disabled: NO
> STATUS: NOT VULNERABLE (Your microcode and kernel are both up to date for this mitigation, and mitigation is enabled)
CVE-2019-11091 aka RIDL, microarchitectural data sampling uncacheable memory (MDSUM)
* Mitigated according to the /sys interface: YES (Mitigation: Clear CPU buffers; SMT vulnerable)
* Kernel supports using MD_CLEAR mitigation: YES (md_clear found in /proc/cpuinfo)
* Kernel mitigation is enabled and active: YES
* SMT is either mitigated or disabled: NO
> STATUS: NOT VULNERABLE (Your microcode and kernel are both up to date for this mitigation, and mitigation is enabled)
CVE-2019-11135 aka ZombieLoad V2, TSX Asynchronous Abort (TAA)
* Mitigated according to the /sys interface: YES (Mitigation: Clear CPU buffers; SMT vulnerable)
* TAA mitigation is supported by kernel: YES (found tsx_async_abort in kernel image)
* TAA mitigation enabled and active: YES (Mitigation: Clear CPU buffers; SMT vulnerable)
> STATUS: NOT VULNERABLE (Mitigation: Clear CPU buffers; SMT vulnerable)
CVE-2018-12207 aka No eXcuses, iTLB Multihit, machine check exception on page size changes (MCEPSC)
* Mitigated according to the /sys interface: YES (KVM: Mitigation: Split huge pages)
* This system is a host running a hypervisor: NO
* iTLB Multihit mitigation is supported by kernel: YES (found itlb_multihit in kernel image)
* iTLB Multihit mitigation enabled and active: YES (KVM: Mitigation: Split huge pages)
> STATUS: NOT VULNERABLE (this system is not running a hypervisor)
> SUMMARY: CVE-2017-5753:OK CVE-2017-5715:OK CVE-2017-5754:OK CVE-2018-3640:OK CVE-2018-3639:OK CVE-2018-3615:OK CVE-2018-3620:OK CVE-2018-3646:OK CVE-2018-12126:OK CVE-2018-12130:OK CVE-2018-12127:OK CVE-2019-11091:OK CVE-2019-11135:OK CVE-2018-12207:OK
```
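The final summary line of the checker output lends itself to an automated check when calibration data is collected from several servers. The following is a minimal sketch under stated assumptions: the function name `parse_summary` is hypothetical, the regular expression is an assumption about the `CVE-<year>-<id>:<verdict>` layout seen in the output above, and it is not part of the checker script or of CSIT.

```python
import re

# Summary line as printed at the end of the checker output above.
SUMMARY = (
    "> SUMMARY: CVE-2017-5753:OK CVE-2017-5715:OK CVE-2017-5754:OK "
    "CVE-2018-3640:OK CVE-2018-3639:OK CVE-2018-3615:OK CVE-2018-3620:OK "
    "CVE-2018-3646:OK CVE-2018-12126:OK CVE-2018-12130:OK CVE-2018-12127:OK "
    "CVE-2019-11091:OK CVE-2019-11135:OK CVE-2018-12207:OK"
)


def parse_summary(line):
    """Return a {CVE id: verdict} mapping extracted from a summary line."""
    return dict(re.findall(r"(CVE-\d{4}-\d+):(\w+)", line))


status = parse_summary(SUMMARY)
# Collect every CVE whose verdict is anything other than OK.
not_ok = sorted(cve for cve, verdict in status.items() if verdict != "OK")
print(len(status), not_ok)
```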

# 2.12.5 Calibration Data - Cascade Lake

The following sections include sample calibration data measured on the s32-t27-sut1 server running in one of the Intel Xeon Cascade Lake testbeds, as specified in FD.io CSIT testbeds - Xeon Cascade Lake<sup>116</sup>.

Calibration data obtained from all other servers in Cascade Lake testbeds shows the same or similar values.

#### Linux cmdline

```
$ cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-4.15.0-72-generic root=UUID=1d03969e-a2a0-41b2-a97e-1cc171b07e88 ro isolcpus=1-23,25-47,49-71,73-95 nohz_full=1-23,25-47,49-71,73-95 rcu_nocbs=1-23,25-47,49-71,73-95 numa_balancing=disable intel_pstate=disable intel_iommu=on iommu=pt nmi_watchdog=0 audit=0 nosoftlockup processor.max_cstate=1 intel_idle.max_cstate=1 hpet=disable tsc=reliable mce=off console=tty0 console=ttyS0,115200n8
```

<sup>116</sup> https://git.fd.io/csit/tree/docs/lab/testbeds\_sm\_clx\_hw\_bios\_cfg.md?h=rls2001

#### Linux uname

```
$ uname -a
Linux s32-t27-sut1 4.15.0-72-generic #81-Ubuntu SMP Tue Nov 26 12:20:02 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
```

### **System-level Core Jitter**

```
$ sudo taskset -c 3 /home/testuser/pma_tools/jitter/jitter -i 30
Linux Jitter testing program version 1.9
Iterations=30
The pragram will execute a dummy function 80000 times
Display is updated every 20000 displayUpdate intervals
Thread affinity will be set to core_id:7
Timings are in CPU Core cycles
Inst_Min:    Minimum Excution time during the display update interval(default is ~1 second)
Inst_Max:    Maximum Excution time during the display update interval(default is ~1 second)
Inst_jitter: Jitter in the Excution time during rhe display update interval. This is the value of interest
last_Exec:   The Excution time of last iteration just before the display update
Abs_Min:     Absolute Minimum Excution time since the program started or statistics were reset
Abs_Max:     Absolute Maximum Excution time since the program started or statistics were reset
tmp:         Cumulative value calcualted by the dummy function
Interval:    Time interval between the display updates in Core Cycles
Sample No:   Sample number
Inst_Min,Inst_Max,Inst_jitter,last_Exec,Abs_min,Abs_max,tmp,Interval,Sample No
160022,167590,7568,160026,160022,167590,2057568256,3203711852,1
160022,170628,10606,160024,160022,170628,4079222784,3204010824,2
160022,169824,9802,160024,160022,170628,1805910016,3203812064,3
160022,168832,8810,160030,160022,170628,3827564544,3203792594,4
160022,168248,8226,160026,160022,170628,1554251776,3203765920,5
160022,167834,7812,160028,160022,170628,3575906304,3203761114,6
160022,167442,7420,160024,160022,170628,1302593536,3203769250,7
160022,169120,9098,160028,160022,170628,3324248064,3203853340,8
160022,170710,10688,160024,160022,170710,1050935296,3203985878,9
160022,167952,7930,160024,160022,170710,3072589824,3203733756,10
160022,168314,8292,160030,160022,170710,799277056,3203741152,11
160022,169672,9650,160024,160022,170710,2820931584,3203739910,12
160022,168684,8662,160024,160022,170710,547618816,3203727336,13
160022,168246,8224,160024,160022,170710,2569273344,3203739052,14
160022,168134,8112,160030,160022,170710,295960576,3203735874,15
160022,170230,10208,160024,160022,170710,2317615104,3203996356,16
160022,167190,7168,160024,160022,170710,44302336,3203713628,17
160022,167304,7282,160024,160022,170710,2065956864,3203717954,18
160022,167500,7478,160024,160022,170710,4087611392,3203706674,19
160022,167302,7280,160024,160022,170710,1814298624,3203726452,20
160022,167266,7244,160024,160022,170710,3835953152,3203702804,21
160022,167820,7798,160022,160022,170710,1562640384,3203719138,22
160022,168100,8078,160024,160022,170710,3584294912,3203716636,23
160022,170408,10386,160024,160022,170710,1310982144,3203946958,24
160022,167276,7254,160024,160022,170710,3332636672,3203706236,25
160022,167052,7030,160024,160022,170710,1059323904,3203696444,26
160022,170322,10300,160024,160022,170710,3080978432,3203747514,27
160022,167332,7310,160024,160022,170710,807665664,3203716210,28
160022,167426,7404,160026,160022,170710,2829320192,3203700630,29
160022,168840,8818,160024,160022,170710,556007424,3203727658,30
```

#### **Memory Bandwidth**

```
$ sudo /home/testuser/mlc --bandwidth_matrix
Intel(R) Memory Latency Checker - v3.7
Command line parameters: --bandwidth_matrix
Using buffer size of 100.000MiB/thread for reads and an additional 100.000MiB/thread for writes
Measuring Memory Bandwidths between nodes within system
Bandwidths are in MB/sec (1 MB/sec = 1,000,000 Bytes/sec)
Using all the threads from each core if Hyper-threading is enabled
Using Read-only traffic type
                Numa node
Numa node            0         1
       0      122097.7   51327.9
       1       51309.2  122005.5
```

```
$ sudo /home/testuser/mlc --peak_injection_bandwidth
Intel(R) Memory Latency Checker - v3.7
Command line parameters: --peak_injection_bandwidth
Using buffer size of 100.000MiB/thread for reads and an additional 100.000MiB/thread for writes
Measuring Peak Injection Memory Bandwidths for the system
Bandwidths are in MB/sec (1 MB/sec = 1,000,000 Bytes/sec)
Using all the threads from each core if Hyper-threading is enabled
Using traffic with the following read-write ratios
ALL Reads        :      243159.4
3:1 Reads-Writes :      219132.5
2:1 Reads-Writes :      216603.1
1:1 Reads-Writes :      203713.0
Stream-triad like:      193790.8
```

```
$ sudo /home/testuser/mlc --max_bandwidth
Intel(R) Memory Latency Checker - v3.7
Command line parameters: --max_bandwidth
Using buffer size of 100.000MiB/thread for reads and an additional 100.000MiB/thread for writes
Measuring Maximum Memory Bandwidths for the system
Will take several minutes to complete as multiple injection rates will be tried to get the best bandwidth
Bandwidths are in MB/sec (1 MB/sec = 1,000,000 Bytes/sec)
Using all the threads from each core if Hyper-threading is enabled
Using traffic with the following read-write ratios
ALL Reads        :      244114.27
3:1 Reads-Writes :      219441.97
2:1 Reads-Writes :      216603.72
1:1 Reads-Writes :      203679.09
Stream-triad like:      214902.80
```

#### **Memory Latency**

```
Numa node 0 1
0 81.2 130.2
1 130.2 81.1
```

```
$ sudo /home/testuser/mlc --idle_latency
Intel(R) Memory Latency Checker - v3.7
Command line parameters: --idle_latency

Using buffer size of 2000.000MiB
Each iteration took 186.1 core clocks ( 80.9 ns)
```

```
$ sudo /home/testuser/mlc --loaded_latency
Intel(R) Memory Latency Checker - v3.7
Command line parameters: --loaded_latency
Using buffer size of 100.000MiB/thread for reads and an additional 100.000MiB/thread for writes
Measuring Loaded Latencies for the system
Using all the threads from each core if Hyper-threading is enabled
Using Read-only traffic type
Inject  Latency  Bandwidth
Delay   (ns)     MB/sec
00000   233.86   243421.9
00002   230.61   243544.1
00008   232.56   243394.5
00015   229.52   244076.6
00050   225.82   244290.6
00100   161.65   236744.8
00200   100.63   133844.0
00300    96.84    90548.2
00400    95.71    68504.3
00500    95.68    55139.0
00700    88.77    39798.4
01000    84.74    28200.1
01300    83.08    21915.5
01700    82.27    16969.3
02500    81.66    11810.6
03500    81.98     8662.9
05000    81.48     6306.8
09000    81.17     3857.8
20000    80.19     2179.9
```

#### L1/L2/LLC Latency

```
$ sudo /home/testuser/mlc --c2c_latency
Intel(R) Memory Latency Checker - v3.7
Command line parameters: --c2c_latency
Measuring cache-to-cache transfer latency (in ns)...
Local Socket L2->L2 HIT  latency        55.5
Local Socket L2->L2 HITM latency        55.6
Remote Socket L2->L2 HITM latency (data address homed in writer socket)
                     Reader Numa Node
Writer Numa Node         0       1
            0            -   115.6
            1        115.6       -
Remote Socket L2->L2 HITM latency (data address homed in reader socket)
                     Reader Numa Node
Writer Numa Node         0       1
            0            -   178.2
            1        178.4       -
```

#### **Spectre and Meltdown Checks**

The following section shows the output of a shell script that reports whether the system is vulnerable to the several "speculative execution" CVEs that were made public in 2018. The script is available on the Spectre & Meltdown Checker GitHub<sup>117</sup>.

```
Spectre and Meltdown mitigation detection tool v0.43
awk: fatal: cannot open file `bash for reading (No such file or directory)
Checking for vulnerabilities on current system
Kernel is Linux 4.15.0-72-generic #81-Ubuntu SMP Tue Nov 26 12:20:02 UTC 2019 x86_64
CPU is Intel(R) Xeon(R) Platinum 8280 CPU @ 2.70GHz
Hardware check
* Hardware support (CPU microcode) for mitigation techniques
 * Indirect Branch Restricted Speculation (IBRS)
    * SPEC_CTRL MSR is available: YES
   * CPU indicates IBRS capability: YES (SPEC_CTRL feature bit)
 * Indirect Branch Prediction Barrier (IBPB)
   * PRED_CMD MSR is available: YES
   * CPU indicates IBPB capability: YES (SPEC_CTRL feature bit)
 * Single Thread Indirect Branch Predictors (STIBP)
   * SPEC_CTRL MSR is available: YES
   * CPU indicates STIBP capability: YES (Intel STIBP feature bit)
 * Speculative Store Bypass Disable (SSBD)
   * CPU indicates SSBD capability: YES (Intel SSBD)
 * L1 data cache invalidation
   * FLUSH_CMD MSR is available: YES
    * CPU indicates L1D flush capability: YES (L1D flush feature bit)
 * Microarchitectural Data Sampling
   * VERW instruction is available: YES (MD_CLEAR feature bit)
 * Enhanced IBRS (IBRS_ALL)
   * CPU indicates ARCH_CAPABILITIES MSR availability: YES
   * ARCH_CAPABILITIES MSR advertises IBRS_ALL capability: YES
 * CPU explicitly indicates not being vulnerable to Meltdown/L1TF (RDCL_NO): YES
 * CPU explicitly indicates not being vulnerable to Variant 4 (SSB_NO): NO
 * CPU/Hypervisor indicates L1D flushing is not necessary on this system: YES
 * Hypervisor indicates host CPU might be vulnerable to RSB underflow (RSBA): NO
 * CPU explicitly indicates not being vulnerable to Microarchitectural Data Sampling (MDS_NO): YES
 * CPU explicitly indicates not being vulnerable to TSX Asynchronous Abort (TAA_NO): NO
 * CPU explicitly indicates not being vulnerable to iTLB Multihit (PSCHANGE_MSC_NO): NO
 * CPU explicitly indicates having MSR for TSX control (TSX_CTRL_MSR): YES
   * TSX_CTRL MSR indicates TSX RTM is disabled: YES
   * TSX_CTRL MSR indicates TSX CPUID bit is cleared: YES
 * CPU supports Transactional Synchronization Extensions (TSX): NO
 * CPU supports Software Guard Extensions (SGX): NO
 * CPU microcode is known to cause stability problems: NO (model 0x55 family 0x6 stepping 0x7_
 * CPU microcode is the latest known available version: awk: fatal: cannot open file `bash for reading (No such file or directory)
UNKNOWN (latest microcode version for your CPU model is unknown)
* CPU vulnerability to the speculative execution attack variants
 * Vulnerable to CVE-2017-5753 (Spectre Variant 1, bounds check bypass): YES
```


<sup>117</sup> https://github.com/speed47/spectre-meltdown-checker

```
* Vulnerable to CVE-2017-5715 (Spectre Variant 2, branch target injection): YES
  * Vulnerable to CVE-2017-5754 (Variant 3, Meltdown, rogue data cache load): NO
 * Vulnerable to CVE-2018-3640 (Variant 3a, rogue system register read): YES
 * Vulnerable to CVE-2018-3639 (Variant 4, speculative store bypass): YES
 * Vulnerable to CVE-2018-3615 (Foreshadow (SGX), L1 terminal fault): NO
 * Vulnerable to CVE-2018-3620 (Foreshadow-NG (OS), L1 terminal fault): YES
 * Vulnerable to CVE-2018-3646 (Foreshadow-NG (VMM), L1 terminal fault): YES
 * Vulnerable to CVE-2018-12126 (Fallout, microarchitectural store buffer data sampling (MSBDS)):_
 * Vulnerable to CVE-2018-12130 (ZombieLoad, microarchitectural fill buffer data sampling (MFBDS)): NO
 * Vulnerable to CVE-2018-12127 (RIDL, microarchitectural load port data sampling (MLPDS)): NO
 * Vulnerable to CVE-2019-11091 (RIDL, microarchitectural data sampling uncacheable memory_
 * Vulnerable to CVE-2019-11135 (ZombieLoad V2, TSX Asynchronous Abort (TAA)): NO
 * Vulnerable to CVE-2018-12207 (No eXcuses, iTLB Multihit, machine check exception on page size_
CVE-2017-5753 aka Spectre Variant 1, bounds check bypass
* Mitigated according to the /sys interface: YES (Mitigation: usercopy/swapgs barriers and __user pointer sanitization)
* Kernel has array_index_mask_nospec: YES (1 occurrence(s) found of x86 64 bits array_index_mask_nospec())
* Kernel has the Red Hat/Ubuntu patch: NO
* Kernel has mask_nospec64 (arm64): NO
> STATUS: NOT VULNERABLE (Mitigation: usercopy/swapgs barriers and __user pointer sanitization)
CVE-2017-5715 aka Spectre Variant 2, branch target injection
* Mitigated according to the /sys interface: YES (Mitigation: Enhanced IBRS, IBPB: conditional, RSB filling)
* Mitigation 1
 * Kernel is compiled with IBRS support: YES
   * IBRS enabled and active: YES (Enhanced flavor, performance impact will be greatly reduced)
 * Kernel is compiled with IBPB support: YES
   * IBPB enabled and active: YES
* Mitigation 2
 * Kernel has branch predictor hardening (arm): NO
 * Kernel compiled with retpoline option: YES
 * Kernel supports RSB filling: YES
> STATUS: NOT VULNERABLE (Enhanced IBRS + IBPB are mitigating the vulnerability)
CVE-2017-5754 aka Variant 3, Meltdown, rogue data cache load
* Mitigated according to the /sys interface: YES (Not affected)
* Kernel supports Page Table Isolation (PTI): YES
 * PTI enabled and active: UNKNOWN (dmesg truncated, please reboot and relaunch this script)
 * Reduced performance impact of PTI: YES (CPU supports INVPCID, performance impact of PTI will be greatly reduced)
* Running as a Xen PV DomU: NO
> STATUS: NOT VULNERABLE (your CPU vendor reported your CPU model as not vulnerable)
CVE-2018-3640 aka Variant 3a, rogue system register read
* CPU microcode mitigates the vulnerability: YES
> STATUS: NOT VULNERABLE (your CPU microcode mitigates the vulnerability)
CVE-2018-3639 aka Variant 4, speculative store bypass
* Mitigated according to the /sys interface: YES (Mitigation: Speculative Store Bypass disabled via prctl and seccomp)
* Kernel supports disabling speculative store bypass (SSB): YES (found in /proc/self/status)
* SSB mitigation is enabled and active: YES (per-thread through prctl)
* SSB mitigation currently active for selected processes: YES (systemd-journald systemd-logind_
> STATUS: NOT VULNERABLE (Mitigation: Speculative Store Bypass disabled via prctl and seccomp)
CVE-2018-3615 aka Foreshadow (SGX), L1 terminal fault
* CPU microcode mitigates the vulnerability: N/A
> STATUS: NOT VULNERABLE (your CPU vendor reported your CPU model as not vulnerable)
CVE-2018-3620 aka Foreshadow-NG (OS), L1 terminal fault
* Mitigated according to the /sys interface: YES (Not affected)
* Kernel supports PTE inversion: YES (found in kernel image)
* PTE inversion enabled and active: NO
> STATUS: NOT VULNERABLE (Not affected)
CVE-2018-3646 aka Foreshadow-NG (VMM), L1 terminal fault
* Information from the /sys interface: Not affected
* This system is a host running a hypervisor: NO
* Mitigation 1 (KVM)
 * EPT is disabled: NO
* Mitigation 2
 * L1D flush is supported by kernel: YES (found flush_l1d in /proc/cpuinfo)
 * L1D flush enabled: NO
 * Hardware-backed L1D flush supported: YES (performance impact of the mitigation will be greatly reduced)
 * Hyper-Threading (SMT) is enabled: YES
> STATUS: NOT VULNERABLE (your kernel reported your CPU model as not vulnerable)
CVE-2018-12126 aka Fallout, microarchitectural store buffer data sampling (MSBDS)
* Mitigated according to the /sys interface: YES (Not affected)
* Kernel supports using MD_CLEAR mitigation: YES (md_clear found in /proc/cpuinfo)
* Kernel mitigation is enabled and active: NO
* SMT is either mitigated or disabled: NO
> STATUS: NOT VULNERABLE (your CPU vendor reported your CPU model as not vulnerable)
CVE-2018-12130 aka ZombieLoad, microarchitectural fill buffer data sampling (MFBDS)
* Mitigated according to the /sys interface: YES (Not affected)
* Kernel supports using MD_CLEAR mitigation: YES (md_clear found in /proc/cpuinfo)
* Kernel mitigation is enabled and active: NO
* SMT is either mitigated or disabled: NO
> STATUS: NOT VULNERABLE (your CPU vendor reported your CPU model as not vulnerable)
CVE-2018-12127 aka RIDL, microarchitectural load port data sampling (MLPDS)
* Mitigated according to the /sys interface: YES (Not affected)
* Kernel supports using MD_CLEAR mitigation: YES (md_clear found in /proc/cpuinfo)
* Kernel mitigation is enabled and active: NO
* SMT is either mitigated or disabled: NO
> STATUS: NOT VULNERABLE (your CPU vendor reported your CPU model as not vulnerable)
CVE-2019-11091 aka RIDL, microarchitectural data sampling uncacheable memory (MDSUM)
* Mitigated according to the /sys interface: YES (Not affected)
* Kernel supports using MD_CLEAR mitigation: YES (md_clear found in /proc/cpuinfo)
* Kernel mitigation is enabled and active: NO
* SMT is either mitigated or disabled: NO
> STATUS: NOT VULNERABLE (your CPU vendor reported your CPU model as not vulnerable)
CVE-2019-11135 aka ZombieLoad V2, TSX Asynchronous Abort (TAA)
* Mitigated according to the /sys interface: YES (Mitigation: TSX disabled)
* TAA mitigation is supported by kernel: YES (found tsx_async_abort in kernel image)
* TAA mitigation enabled and active: YES (Mitigation: TSX disabled)
> STATUS: NOT VULNERABLE (your CPU vendor reported your CPU model as not vulnerable)
CVE-2018-12207 aka No eXcuses, iTLB Multihit, machine check exception on page size changes (MCEPSC)
* Mitigated according to the /sys interface: YES (KVM: Mitigation: Split huge pages)
* This system is a host running a hypervisor: NO
* iTLB Multihit mitigation is supported by kernel: YES (found itlb_multihit in kernel image)
* iTLB Multihit mitigation enabled and active: YES (KVM: Mitigation: Split huge pages)
> STATUS: NOT VULNERABLE (this system is not running a hypervisor)
> SUMMARY: CVE-2017-5753:OK CVE-2017-5715:OK CVE-2017-5754:OK CVE-2018-3640:OK CVE-2018-3639:OK CVE-2018-3615:OK CVE-2018-3620:OK CVE-2018-3646:OK CVE-2018-12126:OK CVE-2018-12130:OK CVE-2018-12127:OK CVE-2019-11091:OK CVE-2019-11135:OK CVE-2018-12207:OK
awk: fatal: cannot open file `bash for reading (No such file or directory)
Checking for vulnerabilities on current system
Kernel is Linux 4.15.0-72-generic #81-Ubuntu SMP Tue Nov 26 12:20:02 UTC 2019 x86_64
CPU is Intel(R) Xeon(R) Gold 6252N CPU @ 2.30GHz
Hardware check
* Hardware support (CPU microcode) for mitigation techniques
 * Indirect Branch Restricted Speculation (IBRS)
   * SPEC_CTRL MSR is available: YES
    * CPU indicates IBRS capability: YES (SPEC_CTRL feature bit)
 * Indirect Branch Prediction Barrier (IBPB)
   * PRED_CMD MSR is available: YES
    * CPU indicates IBPB capability: YES (SPEC_CTRL feature bit)
 * Single Thread Indirect Branch Predictors (STIBP)
   * SPEC_CTRL MSR is available: YES
   * CPU indicates STIBP capability: YES (Intel STIBP feature bit)
 * Speculative Store Bypass Disable (SSBD)
   * CPU indicates SSBD capability: YES (Intel SSBD)
 * L1 data cache invalidation
   * FLUSH_CMD MSR is available: YES
    * CPU indicates L1D flush capability: YES (L1D flush feature bit)
 * Microarchitectural Data Sampling
   * VERW instruction is available: YES (MD_CLEAR feature bit)
 * Enhanced IBRS (IBRS_ALL)
   * CPU indicates ARCH_CAPABILITIES MSR availability: YES
    * ARCH_CAPABILITIES MSR advertises IBRS_ALL capability: YES
 * CPU explicitly indicates not being vulnerable to Meltdown/L1TF (RDCL_NO): YES
 * CPU explicitly indicates not being vulnerable to Variant 4 (SSB_NO): NO
 * CPU/Hypervisor indicates L1D flushing is not necessary on this system: YES
 * Hypervisor indicates host CPU might be vulnerable to RSB underflow (RSBA): NO
 * CPU explicitly indicates not being vulnerable to Microarchitectural Data Sampling (MDS_NO): YES
 * CPU explicitly indicates not being vulnerable to TSX Asynchronous Abort (TAA_NO): NO
 * CPU explicitly indicates not being vulnerable to iTLB Multihit (PSCHANGE_MSC_NO): NO
 * CPU explicitly indicates having MSR for TSX control (TSX_CTRL_MSR): YES
   * TSX_CTRL MSR indicates TSX RTM is disabled: YES
   * TSX_CTRL MSR indicates TSX CPUID bit is cleared: YES
 * CPU supports Transactional Synchronization Extensions (TSX): NO
 * CPU supports Software Guard Extensions (SGX): NO
 * CPU microcode is known to cause stability problems: NO (family 0x6 model 0x55 stepping 0x7 ucode 0x500002c cpuid 0x50657)
 * CPU microcode is the latest known available version: awk: fatal: cannot open file `bash for reading (No such file or directory)
UNKNOWN (latest microcode version for your CPU model is unknown)
* CPU vulnerability to the speculative execution attack variants
 * Vulnerable to CVE-2017-5753 (Spectre Variant 1, bounds check bypass): YES
 * Vulnerable to CVE-2017-5715 (Spectre Variant 2, branch target injection): YES
 * Vulnerable to CVE-2017-5754 (Variant 3, Meltdown, rogue data cache load): NO
 * Vulnerable to CVE-2018-3640 (Variant 3a, rogue system register read): YES
 * Vulnerable to CVE-2018-3639 (Variant 4, speculative store bypass): YES
 * Vulnerable to CVE-2018-3615 (Foreshadow (SGX), L1 terminal fault): NO
  * Vulnerable to CVE-2018-3620 (Foreshadow-NG (OS), L1 terminal fault): YES
  * Vulnerable to CVE-2018-3646 (Foreshadow-NG (VMM), L1 terminal fault): YES
* Vulnerable to CVE-2018-12126 (Fallout, microarchitectural store buffer data sampling (MSBDS)):_
 * Vulnerable to CVE-2018-12130 (ZombieLoad, microarchitectural fill buffer data sampling (MFBDS)): NO
 * Vulnerable to CVE-2018-12127 (RIDL, microarchitectural load port data sampling (MLPDS)): NO
 * Vulnerable to CVE-2019-11091 (RIDL, microarchitectural data sampling uncacheable memory (MDSUM)): NO
 * Vulnerable to CVE-2019-11135 (ZombieLoad V2, TSX Asynchronous Abort (TAA)): NO
 * Vulnerable to CVE-2018-12207 (No eXcuses, iTLB Multihit, machine check exception on page size_
CVE-2017-5753 aka Spectre Variant 1, bounds check bypass
* Mitigated according to the /sys interface: YES (Mitigation: usercopy/swapgs barriers and __user pointer sanitization)
* Kernel has array_index_mask_nospec: YES (1 occurrence(s) found of x86 64 bits array_index_mask_nospec())
* Kernel has the Red Hat/Ubuntu patch: NO
* Kernel has mask_nospec64 (arm64): NO
> STATUS: NOT VULNERABLE (Mitigation: usercopy/swapgs barriers and __user pointer sanitization)
CVE-2017-5715 aka Spectre Variant 2, branch target injection
* Mitigated according to the /sys interface: YES (Mitigation: Enhanced IBRS, IBPB: conditional, RSB filling)
* Mitigation 1
 * Kernel is compiled with IBRS support: YES
   * IBRS enabled and active: YES (Enhanced flavor, performance impact will be greatly reduced)
 * Kernel is compiled with IBPB support: YES
   * IBPB enabled and active: YES
* Mitigation 2
 * Kernel has branch predictor hardening (arm): NO
 * Kernel compiled with retpoline option: YES
 * Kernel supports RSB filling: YES
> STATUS: NOT VULNERABLE (Enhanced IBRS + IBPB are mitigating the vulnerability)
CVE-2017-5754 aka Variant 3, Meltdown, rogue data cache load
* Mitigated according to the /sys interface: YES (Not affected)
* Kernel supports Page Table Isolation (PTI): YES
 * PTI enabled and active: UNKNOWN (dmesg truncated, please reboot and relaunch this script)
 * Reduced performance impact of PTI: YES (CPU supports INVPCID, performance impact of PTI will be greatly reduced)
* Running as a Xen PV DomU: NO
> STATUS: NOT VULNERABLE (your CPU vendor reported your CPU model as not vulnerable)
CVE-2018-3640 aka Variant 3a, rogue system register read
* CPU microcode mitigates the vulnerability: YES
> STATUS: NOT VULNERABLE (your CPU microcode mitigates the vulnerability)
CVE-2018-3639 aka Variant 4, speculative store bypass
* Mitigated according to the /sys interface: YES (Mitigation: Speculative Store Bypass disabled via prctl and seccomp)
* Kernel supports disabling speculative store bypass (SSB): YES (found in /proc/self/status)
* SSB mitigation is enabled and active: YES (per-thread through prctl)
* SSB mitigation currently active for selected processes: YES (systemd-journald systemd-logind_
> STATUS: NOT VULNERABLE (Mitigation: Speculative Store Bypass disabled via prctl and seccomp)
CVE-2018-3615 aka Foreshadow (SGX), L1 terminal fault
* CPU microcode mitigates the vulnerability: N/A
> STATUS: NOT VULNERABLE (your CPU vendor reported your CPU model as not vulnerable)
CVE-2018-3620 aka Foreshadow-NG (OS), L1 terminal fault
* Mitigated according to the /sys interface: YES (Not affected)
* Kernel supports PTE inversion: YES (found in kernel image)
* PTE inversion enabled and active: NO
> STATUS: NOT VULNERABLE (Not affected)
CVE-2018-3646 aka Foreshadow-NG (VMM), L1 terminal fault
* Information from the /sys interface: Not affected
* This system is a host running a hypervisor: NO
* Mitigation 1 (KVM)
 * EPT is disabled: NO
* Mitigation 2
 * L1D flush is supported by kernel: YES (found flush_l1d in /proc/cpuinfo)
 * L1D flush enabled: NO
 * Hardware-backed L1D flush supported: YES (performance impact of the mitigation will be greatly_
 * Hyper-Threading (SMT) is enabled: YES
> STATUS: NOT VULNERABLE (your kernel reported your CPU model as not vulnerable)
CVE-2018-12126 aka Fallout, microarchitectural store buffer data sampling (MSBDS)
* Mitigated according to the /sys interface: YES (Not affected)
* Kernel supports using MD_CLEAR mitigation: YES (md_clear found in /proc/cpuinfo)
* Kernel mitigation is enabled and active: NO
* SMT is either mitigated or disabled: NO
> STATUS: NOT VULNERABLE (your CPU vendor reported your CPU model as not vulnerable)
CVE-2018-12130 aka ZombieLoad, microarchitectural fill buffer data sampling (MFBDS)
* Mitigated according to the /sys interface: YES (Not affected)
* Kernel supports using MD_CLEAR mitigation: YES (md_clear found in /proc/cpuinfo)
* Kernel mitigation is enabled and active: NO
* SMT is either mitigated or disabled: NO
> STATUS: NOT VULNERABLE (your CPU vendor reported your CPU model as not vulnerable)
CVE-2018-12127 aka RIDL, microarchitectural load port data sampling (MLPDS)
* Mitigated according to the /sys interface: YES (Not affected)
* Kernel supports using MD_CLEAR mitigation: YES (md_clear found in /proc/cpuinfo)
* Kernel mitigation is enabled and active: NO
* SMT is either mitigated or disabled: NO
> STATUS: NOT VULNERABLE (your CPU vendor reported your CPU model as not vulnerable)
CVE-2019-11091 aka RIDL, microarchitectural data sampling uncacheable memory (MDSUM)
* Mitigated according to the /sys interface: YES (Not affected)
* Kernel supports using MD_CLEAR mitigation: YES (md_clear found in /proc/cpuinfo)
* Kernel mitigation is enabled and active: NO
* SMT is either mitigated or disabled: NO
> STATUS: NOT VULNERABLE (your CPU vendor reported your CPU model as not vulnerable)
CVE-2019-11135 aka ZombieLoad V2, TSX Asynchronous Abort (TAA)
* Mitigated according to the /sys interface: YES (Mitigation: TSX disabled)
* TAA mitigation is supported by kernel: YES (found tsx_async_abort in kernel image)
* TAA mitigation enabled and active: YES (Mitigation: TSX disabled)
> STATUS: NOT VULNERABLE (your CPU vendor reported your CPU model as not vulnerable)
CVE-2018-12207 aka No eXcuses, iTLB Multihit, machine check exception on page size changes (MCEPSC)
* Mitigated according to the /sys interface: YES (KVM: Mitigation: Split huge pages)
* This system is a host running a hypervisor: NO
* iTLB Multihit mitigation is supported by kernel: YES (found itlb_multihit in kernel image)
* iTLB Multihit mitigation enabled and active: YES (KVM: Mitigation: Split huge pages)
> STATUS: NOT VULNERABLE (this system is not running a hypervisor)
> SUMMARY: CVE-2017-5753:OK CVE-2017-5715:OK CVE-2017-5754:OK CVE-2018-3640:OK CVE-2018-3639:OK CVE-2018-3615:OK CVE-2018-3620:OK CVE-2018-3646:OK CVE-2018-12126:OK CVE-2018-12130:OK CVE-2018-12127:OK CVE-2019-11091:OK CVE-2019-11135:OK CVE-2018-12207:OK
```
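Many entries in the checker output above start with "Mitigated according to the /sys interface". On Linux that interface is the directory `/sys/devices/system/cpu/vulnerabilities`, which holds one small text file per vulnerability. The following is a minimal Python sketch for spot-checking those files directly. It is not part of CSIT or of the checker script: the function names `read_vulnerability_status` and `is_flagged` are hypothetical, and treating any status that contains the word "Vulnerable" as flagged is an assumed heuristic.

```python
from pathlib import Path

# Location where Linux exposes per-vulnerability status files.
SYSFS_VULN_DIR = Path("/sys/devices/system/cpu/vulnerabilities")


def read_vulnerability_status(base=SYSFS_VULN_DIR):
    """Return {file name: status string} for every file under `base`.

    Hypothetical helper: returns an empty dict when the directory is
    missing (older kernels, non-Linux hosts).
    """
    base = Path(base)
    if not base.is_dir():
        return {}
    return {
        entry.name: entry.read_text().strip()
        for entry in sorted(base.iterdir())
        if entry.is_file()
    }


def is_flagged(status):
    """Assumed heuristic: the kernel reports the entry as vulnerable."""
    return "Vulnerable" in status


if __name__ == "__main__":
    for name, status in read_vulnerability_status().items():
        marker = "!!" if is_flagged(status) else "ok"
        print(f"[{marker}] {name}: {status}")
```

Status strings such as "Not affected" or "Mitigation: TSX disabled", as quoted in the log above, would be printed with the `ok` marker by this sketch.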

### 2.12.6 Calibration Data - Haswell

The following sections include sample calibration data measured on the t1-sut1 server running in one of the Intel Xeon Haswell testbeds, as specified in FD.io CSIT Testbeds - Xeon Haswell<sup>118</sup>.

Calibration data obtained from all other servers in Haswell testbeds shows the same or similar values.

#### Linux cmdline

```
$ cat /proc/cmdline
BOOT_IMAGE=/vmlinuz-4.15.0-72-generic root=UUID=c59ae603-8076-41f4-bb5d-bc3fc8dd3ea1 ro isolcpus=1-17,19-35 nohz_full=1-17,19-35 rcu_nocbs=1-17,19-35 numa_balancing=disable intel_pstate=disable intel_iommu=on iommu=pt nmi_watchdog=0 audit=0 nosoftlockup processor.max_cstate=1 intel_idle.max_cstate=1 hpet=disable tsc=reliable mce=off console=tty0 console=ttyS0,115200n8
```
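The `isolcpus`, `nohz_full` and `rcu_nocbs` parameters in the command line above all use the kernel CPU-list syntax (comma-separated IDs and inclusive ranges). As a minimal sketch of how such a command line can be inspected, the Python snippet below splits it into parameters and expands a CPU list. The helper names `parse_cmdline` and `expand_cpu_list` are hypothetical, and the sketch assumes plain lists such as `1-17,19-35`; optional flag prefixes that newer kernels accept for `isolcpus` are not handled.

```python
def expand_cpu_list(spec):
    """Expand a CPU list such as '1-17,19-35' into a sorted list of ints."""
    cpus = set()
    for part in spec.split(","):
        if "-" in part:
            first, last = part.split("-", 1)
            cpus.update(range(int(first), int(last) + 1))  # ranges are inclusive
        else:
            cpus.add(int(part))
    return sorted(cpus)


def parse_cmdline(text):
    """Split a kernel command line into {parameter: value}.

    Bare flags (e.g. 'nosoftlockup') map to None. A repeated parameter
    keeps only its last value, which is a simplification of this sketch.
    """
    params = {}
    for token in text.split():
        key, sep, value = token.partition("=")  # split on the first '=' only
        params[key] = value if sep else None
    return params


# Excerpt of the command line shown above.
SAMPLE = "isolcpus=1-17,19-35 nohz_full=1-17,19-35 rcu_nocbs=1-17,19-35 nosoftlockup"

if __name__ == "__main__":
    params = parse_cmdline(SAMPLE)
    isolated = expand_cpu_list(params["isolcpus"])
    print(len(isolated), "isolated CPUs:", isolated)
```

For the value `1-17,19-35` the expansion contains 34 CPU IDs and leaves out CPUs 0 and 18.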

#### Linux uname

```
$ uname -a
Linux t1-tg1 4.15.0-72-generic #81-Ubuntu SMP Tue Nov 26 12:20:02 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
```

#### System-level Core Jitter

```
$ sudo taskset -c 3 /home/testuser/pma_tools/jitter/jitter -i 30
Linux Jitter testing program version 1.8
Iterations=30
The pragram will execute a dummy function 80000 times
Display is updated every 20000 displayUpdate intervals
Timings are in CPU Core cycles
Inst_Min:    Minimum Excution time during the display update interval(default is ~1 second)
Inst_Max:    Maximum Excution time during the display update interval(default is ~1 second)
Inst_jitter: Jitter in the Excution time during rhe display update interval. This is the value of interest
last_Exec:   The Excution time of last iteration just before the display update
Abs_Min:     Absolute Minimum Excution time since the program started or statistics were reset
Abs_Max:     Absolute Maximum Excution time since the program started or statistics were reset
tmp:         Cumulative value calcualted by the dummy function
Interval:    Time interval between the display updates in Core Cycles
Sample No:   Sample number

  Inst_Min   Inst_Max   Inst_jitter   last_Exec   Abs_min   Abs_max          tmp     Interval   Sample No
    160024     172636         12612      160028    160024    172636   1573060608   3205463144           1
    160024     188236         28212      160028    160024    188236    958595072   3205500844           2
    160024     185676         25652      160028    160024    188236    344129536   3205485976           3
    160024     172608         12584      160024    160024    188236   4024631296   3205472740           4
    160024     179260         19236      160028    160024    188236   3410165760   3205502164           5
```


<sup>118</sup> https://git.fd.io/csit/tree/docs/lab/testbeds\_ucs\_hsw\_hw\_bios\_cfg.md?h=rls2001


| Inst_Min | Inst_Max | Inst_jitter | last_Exec | Abs_min | Abs_max | tmp | Interval | Sample No |
|----------|----------|-------------|-----------|---------|---------|-----|----------|-----------|
| 160024 | 172432 | 12408 | 160024 | 160024 | 188236 | 2795700224 | 3205452036 | 6 |
| 160024 | 178820 | 18796 | 160024 | 160024 | 188236 | 2181234688 | 3205455408 | 7 |
| 160024 | 172512 | 12488 | 160028 | 160024 | 188236 | 1566769152 | 3205461528 | 8 |
| 160024 | 172636 | 12612 | 160028 | 160024 | 188236 | 952303616 | 3205478820 | 9 |
| 160024 | 173676 | 13652 | 160028 | 160024 | 188236 | 337838080 | 3205470412 | 10 |
| 160024 | 178776 | 18752 | 160028 | 160024 | 188236 | 4018339840 | 3205481472 | 11 |
| 160024   172788   12764   160028   160024   188236   3403874304   3205492336                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |             | 178776 | 18752 | 160028 | 160024 | 188236 | 4018339840 3205481472  |           |
| □12 □160024 174616 14592 160028 160024 188236 2789408768 3205474904 □ □13 □160024 174440 14416 160028 160024 188236 2174943232 3205479448 □ □14 □160024 178748 18724 160024 160024 188236 1560477696 3205482668 □ □15 □160024 172588 12564 169404 160024 188236 946012160 3205510496 □ □16 □160024 172636 12612 160024 160024 188236 331546624 3205472204 □ □17 □160024 172480 12456 160024 160024 188236 4012048384 3205455864 □ □18 □160024 172740 12716 160028 160024 188236 3397582848 3205464932 □ □19 □160024 179200 19176 160028 160024 188236 2783117312 3205476012 □ □20 □160024 172480 12456 160028 160024 188236 2168651776 3205465632 □ □21 □160024 172728 12704 160028 160024 188236 939720704 3205466972 □ □22 □160024 172620 12596 160028 160024 188236 939720704 3205466972 □ □23 □160024 172640 12616 160028 160024 188236 32555168 3205471216 □ □24 □160024 172640 12616 160028 160024 188236 32555168 3205471216 □ □25 □160024 172640 12616 160028 160024 188236 325255168 3205471216 □ □26 □160024 172640 12616 160028 160024 188236 325255168 3205471216 □ □26 □160024 172640 12616 160028 160024 188236 325255168 3205471216 □ □26 □160024 172640 12616 160028 160024 188236 3391291392 3205487388 □ □26 □160024 172636 12612 160028 160024 188236 3391291392 320548748 □ □26 □160024 172636 12612 160028 160024 188236 3391291392 320548748 □ □26 □160024 172636 12612 160028 160024 188236 2776825856 3205467152 □ □27 □160024 172636 12612 160028 160024 188236 2776825856 3205467152 □ □26 □160024 172636 12612 160028 160024 188236 2776825856 3205467152 □ □27 □160024 172672 12648 160024 160024 188236 1547894784 3205488536 □ □28 □160024 176932 16008 160024 180024 188236 933429248 3205488536 □ □29 □160024 172452 12428 160028 160024 188236 933429248 3205486536 □ |             | 172788 | 12764 | 160028 | 160024 | 188236 | 3403874304 3205492336  | <u>.</u>  |
| 13                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |             | 174616 | 14592 | 160028 | 160024 | 188236 | 2789408768 3205474904  |           |
| 14                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |             |        |       |        |        |        |                        |           |
| 15160024 172588 12564 169404 160024 188236 946012160 320551049616160024 172636 12612 160024 160024 188236 331546624 320547220417160024 172480 12456 160024 160024 188236 4012048384 32054558641818160024 179200 19176 160028 160024 188236 3397582848 32054649321919160024 179200 19176 160028 160024 188236 2783117312 320547601220202020202020 -                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     | <b>→14</b>  |        |       |        |        |        |                        |           |
| 16                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     | <b>⇔</b> 15 |        |       |        |        |        |                        | J         |
| →17                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    | <b>⇔</b> 16 |        |       |        |        |        |                        | J         |
| →18                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    |             | 172636 | 12612 | 160024 | 160024 | 188236 | 331546624 3205472204   | J         |
| →19                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    |             | 172480 | 12456 | 160024 | 160024 | 188236 | 4012048384 3205455864  | _         |
| →20     160024 172480 12456 160028 160024 188236 2168651776 3205465632 □     →21     160024 172728 12704 160024 160024 188236 1554186240 3205497204 □     →22     160024 172620 12596 160028 160024 188236 939720704 3205466972 □     →23     160024 172640 12616 160028 160024 188236 325255168 3205471216 □     →24     160024 172484 12460 160028 160024 188236 4005756928 3205467388 □     →25     160024 172636 12612 160028 160024 188236 3391291392 3205482748 □     →26     160024 179056 19032 160024 160024 188236 2776825856 3205467152 □     →27     160024 172672 12648 160024 160024 188236 2162360320 3205483268 □     →28     160024 176932 16908 160024 160024 188236 1547894784 3205488536 □     →29     160024 172452 12428 160028 160024 188236 933429248 3205440636 □                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             |             | 172740 | 12716 | 160028 | 160024 | 188236 | 3397582848 3205464932  | u         |
| 160024 172480 12456 160028 160024 188236 2168651776 3205465632 □  →21  160024 172728 12704 160024 160024 188236 1554186240 3205497204 □  →22  160024 172620 12596 160028 160024 188236 939720704 3205466972 □  →23  160024 172640 12616 160028 160024 188236 325255168 3205471216 □  →24  160024 172484 12460 160028 160024 188236 4005756928 3205467388 □  →25  160024 172636 12612 160028 160024 188236 3391291392 320548748 □  →26  160024 179056 19032 160024 160024 188236 2776825856 3205467152 □  →27  160024 172672 12648 160024 160024 188236 2162360320 3205483268 □  →28  160024 176932 16908 160024 160024 188236 1547894784 3205488536 □  →29  160024 172452 12428 160028 160024 188236 933429248 3205440636 □                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |             | 179200 | 19176 | 160028 | 160024 | 188236 | 2783117312 3205476012  | u u       |
| 160024 172728 12704 160024 160024 188236 1554186240 3205497204                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         | 160024      | 172480 | 12456 | 160028 | 160024 | 188236 | 2168651776 3205465632  | _         |
| 160024 172620 12596 160028 160024 188236 939720704 3205466972 □ □23 □160024 172640 12616 160028 160024 188236 325255168 3205471216 □ □24 □160024 172484 12460 160028 160024 188236 4005756928 3205467388 □ □25 □160024 172636 12612 160028 160024 188236 3391291392 3205482748 □ □26 □160024 179056 19032 160024 160024 188236 2776825856 3205467152 □ □27 □160024 172672 12648 160024 160024 188236 2162360320 3205483268 □ □28 □160024 176932 16908 160024 160024 188236 1547894784 3205488536 □ □29 □160024 172452 12428 160028 160024 188236 933429248 3205440636 □                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                | 160024      | 172728 | 12704 | 160024 | 160024 | 188236 | 1554186240 3205497204  | u         |
| 160024 172640 12616 160028 160024 188236 325255168 3205471216                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          | 160024      | 172620 | 12596 | 160028 | 160024 | 188236 | 939720704 3205466972   | u         |
|  | 160024 | 172640 | 12616 | 160028 | 160024 | 188236 | 325255168 3205471216 | 24 |
|  | 160024 | 172484 | 12460 | 160028 | 160024 | 188236 | 4005756928 3205467388 | 25 |
|  | 160024 | 172636 | 12612 | 160028 | 160024 | 188236 | 3391291392 3205482748 | 26 |
|  | 160024 | 179056 | 19032 | 160024 | 160024 | 188236 | 2776825856 3205467152 | 27 |
|  | 160024 | 172672 | 12648 | 160024 | 160024 | 188236 | 2162360320 3205483268 | 28 |
|  | 160024 | 176932 | 16908 | 160024 | 160024 | 188236 | 1547894784 3205488536 | 29 |
|  | 160024 | 172452 | 12428 | 160028 | 160024 | 188236 | 933429248 3205440636 | 30 |

#### **Memory Bandwidth**

```
$ sudo /home/testuser/mlc --bandwidth_matrix
Intel(R) Memory Latency Checker - v3.5
Command line parameters: --bandwidth_matrix

Using buffer size of 100.000MB/thread for reads and an additional 100.000MB/thread for writes
Measuring Memory Bandwidths between nodes within system
Bandwidths are in MB/sec (1 MB/sec = 1,000,000 Bytes/sec)
Using all the threads from each core if Hyper-threading is enabled
Using Read-only traffic type
                Numa node
Numa node            0        1
       0       57935.5  30265.2
       1       30284.6  58409.9
```
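The bandwidth matrix is easier to compare across testbeds as a remote-to-local ratio per NUMA node. The following is a minimal sketch, not part of the CSIT tooling: the function name `remote_to_local_ratio` is hypothetical, the values are copied from the MLC output above, and it assumes that rows are the requesting node and columns the memory node.

```python
# Bandwidth matrix copied from the MLC output above (MB/sec).
# Assumption: rows are the requesting NUMA node, columns the memory node.
bandwidth = [
    [57935.5, 30265.2],
    [30284.6, 58409.9],
]


def remote_to_local_ratio(matrix, node):
    """Return mean remote bandwidth divided by local bandwidth for one node."""
    local = matrix[node][node]
    remote = [bw for other, bw in enumerate(matrix[node]) if other != node]
    return (sum(remote) / len(remote)) / local


for node in range(len(bandwidth)):
    ratio = remote_to_local_ratio(bandwidth, node)
    print(f"node {node}: remote/local = {ratio:.2f}")
```

On these numbers, cross-socket reads reach roughly half of the local read bandwidth on both nodes.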

```
$ sudo /home/testuser/mlc --peak_injection_bandwidth
Intel(R) Memory Latency Checker - v3.5
Command line parameters: --peak_injection_bandwidth

Using buffer size of 100.000MB/thread for reads and an additional 100.000MB/thread for writes

Measuring Peak Injection Memory Bandwidths for the system
Bandwidths are in MB/sec (1 MB/sec = 1,000,000 Bytes/sec)
Using all the threads from each core if Hyper-threading is enabled
Using traffic with the following read-write ratios
ALL Reads : 115762.2
3:1 Reads-Writes : 106242.2
2:1 Reads-Writes : 103031.8
1:1 Reads-Writes : 87943.7
Stream-triad like: 100048.4
```

```
$ sudo /home/testuser/mlc --max_bandwidth
Intel(R) Memory Latency Checker - v3.5
Command line parameters: --max_bandwidth
Using buffer size of 100.000MB/thread for reads and an additional 100.000MB/thread for writes
Measuring Maximum Memory Bandwidths for the system
Will take several minutes to complete as multiple injection rates will be tried to get the best bandwidth
Bandwidths are in MB/sec (1 MB/sec = 1,000,000 Bytes/sec)
Using all the threads from each core if Hyper-threading is enabled
Using traffic with the following read-write ratios
ALL Reads        : 115782.41
3:1 Reads-Writes : 105965.78
2:1 Reads-Writes : 103162.38
1:1 Reads-Writes : 88255.82
Stream-triad like: 105608.10
```

#### **Memory Latency**

```
$ sudo /home/testuser/mlc --idle_latency
Intel(R) Memory Latency Checker - v3.5
Command line parameters: --idle_latency
Using buffer size of 200.000MB
Each iteration took 227.2 core clocks ( 99.0 ns)
```
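MLC reports idle latency in both core clocks and nanoseconds, so dividing one by the other gives the core clock the measurement implies. The sketch below is illustrative only; `implied_core_clock_ghz` is a hypothetical helper and the inputs are the two figures printed above.

```python
def implied_core_clock_ghz(core_clocks, nanoseconds):
    """Core clocks per nanosecond is numerically the clock frequency in GHz."""
    return core_clocks / nanoseconds


# Values copied from the --idle_latency output above.
clock_ghz = implied_core_clock_ghz(227.2, 99.0)
print(f"implied core clock: {clock_ghz:.3f} GHz")
```

The result is close to the nominal 2.30 GHz of the Xeon E5-2699 v3 in this testbed; the small difference is consistent with the rounding of the printed values.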

```
$ sudo /home/testuser/mlc --loaded_latency
Intel(R) Memory Latency Checker - v3.5
Command line parameters: --loaded_latency
Using buffer size of 100.000MB/thread for reads and an additional 100.000MB/thread for writes
Measuring Loaded Latencies for the system
Using all the threads from each core if Hyper-threading is enabled
Using Read-only traffic type
Inject  Latency  Bandwidth
Delay   (ns)     MB/sec
00000   294.08   115841.6
00002   294.27   115851.5
00008   293.67   115821.8
00015   278.92   115587.5
00050   246.80   113991.2
00100   206.86   104508.1
00200   123.72    72873.6
00300   113.35    52641.1
00400   108.89    41078.9
00500   108.11    33699.1
00700   106.19    24878.0
01000   104.75    17948.1
01300   103.72    14089.0
01700   102.95    11013.6
02500   102.25     7756.3
03500   101.81     5749.3
05000   101.46     4230.4
09000   101.05     2641.4
20000   100.77     1542.5
```
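One way to summarize the loaded-latency sweep is to look for the highest bandwidth that can be sustained before latency rises sharply. The sketch below is an assumption-laden illustration, not a CSIT procedure: the 25% tolerance is an arbitrary threshold chosen for the example, the helper names are hypothetical, and the rows are copied from the output above.

```python
# Rows copied from the --loaded_latency output above:
# inject delay, latency (ns), bandwidth (MB/sec).
LOADED_LATENCY = """\
00000 294.08 115841.6
00002 294.27 115851.5
00008 293.67 115821.8
00015 278.92 115587.5
00050 246.80 113991.2
00100 206.86 104508.1
00200 123.72 72873.6
00300 113.35 52641.1
00400 108.89 41078.9
00500 108.11 33699.1
00700 106.19 24878.0
01000 104.75 17948.1
01300 103.72 14089.0
01700 102.95 11013.6
02500 102.25 7756.3
03500 101.81 5749.3
05000 101.46 4230.4
09000 101.05 2641.4
20000 100.77 1542.5
"""


def parse_rows(text):
    """Parse 'delay latency bandwidth' rows into (int, float, float) tuples."""
    rows = []
    for line in text.splitlines():
        delay, latency_ns, bandwidth = line.split()
        rows.append((int(delay), float(latency_ns), float(bandwidth)))
    return rows


def best_bandwidth_within(rows, tolerance):
    """Return the row with the highest bandwidth whose latency stays within
    `tolerance` (a fraction) of the lowest latency in the sweep."""
    floor = min(latency for _, latency, _ in rows)
    eligible = [row for row in rows if row[1] <= floor * (1 + tolerance)]
    return max(eligible, key=lambda row: row[2])


rows = parse_rows(LOADED_LATENCY)
delay, latency_ns, bandwidth = best_bandwidth_within(rows, tolerance=0.25)
print(delay, latency_ns, bandwidth)
```

With this threshold the selected point is the row at inject delay 200; a different tolerance would move the selection along the curve.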

#### **L1/L2/LLC Latency**

```
$ sudo /home/testuser/mlc --c2c_latency
Intel(R) Memory Latency Checker - v3.5
Command line parameters: --c2c_latency
Measuring cache-to-cache transfer latency (in ns)...
Local Socket L2->L2 HIT latency
Local Socket L2->L2 HITM latency        47.0
Remote Socket L2->L2 HITM latency (data address homed in writer socket)
                      Reader Numa Node
Writer Numa Node          0        1
            0             -    108.0
            1         106.9        -
Remote Socket L2->L2 HITM latency (data address homed in reader socket)
                      Reader Numa Node
Writer Numa Node          0        1
            0             -    107.7
            1         106.6        -
```

#### **Spectre and Meltdown Checks**

The following section shows the output of a shell script that checks whether the system is vulnerable to the several "speculative execution" CVEs made public in 2018. The script is available on the Spectre & Meltdown Checker GitHub<sup>119</sup>.

```
Spectre and Meltdown mitigation detection tool v0.43
awk: cannot open bash (No such file or directory)
Checking for vulnerabilities on current system
Kernel is Linux 4.15.0-72-generic #81-Ubuntu SMP Tue Nov 26 12:20:02 UTC 2019 x86_64
CPU is Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz
* Hardware support (CPU microcode) for mitigation techniques
  * Indirect Branch Restricted Speculation (IBRS)
    * SPEC_CTRL MSR is available: YES
    * CPU indicates IBRS capability: YES (SPEC_CTRL feature bit)
  * Indirect Branch Prediction Barrier (IBPB)
    * PRED_CMD MSR is available: YES
    * CPU indicates IBPB capability: YES (SPEC_CTRL feature bit)
  * Single Thread Indirect Branch Predictors (STIBP)
    * SPEC_CTRL MSR is available: YES
    * CPU indicates STIBP capability: YES (Intel STIBP feature bit)
  * Speculative Store Bypass Disable (SSBD)
    * CPU indicates SSBD capability: YES (Intel SSBD)
  * L1 data cache invalidation
    * FLUSH_CMD MSR is available: YES
    * CPU indicates L1D flush capability: YES (L1D flush feature bit)
  * Microarchitectural Data Sampling
    * VERW instruction is available: YES (MD_CLEAR feature bit)
  * Enhanced IBRS (IBRS_ALL)
    * CPU indicates ARCH_CAPABILITIES MSR availability: NO
    * ARCH_CAPABILITIES MSR advertises IBRS_ALL capability: NO
  * CPU explicitly indicates not being vulnerable to Meltdown/L1TF (RDCL_NO): NO
  * CPU explicitly indicates not being vulnerable to Variant 4 (SSB_NO): NO
  * CPU/Hypervisor indicates L1D flushing is not necessary on this system: NO
  * Hypervisor indicates host CPU might be vulnerable to RSB underflow (RSBA): NO
  * CPU explicitly indicates not being vulnerable to Microarchitectural Data Sampling (MDS_NO): NO
  * CPU explicitly indicates not being vulnerable to TSX Asynchronous Abort (TAA_NO): NO
  * CPU explicitly indicates not being vulnerable to iTLB Multihit (PSCHANGE_MSC_NO): NO
  * CPU explicitly indicates having MSR for TSX control (TSX_CTRL_MSR): NO
  * CPU supports Transactional Synchronization Extensions (TSX): NO
  * CPU supports Software Guard Extensions (SGX): NO
  * CPU microcode is known to cause stability problems: NO (model 0x3f family 0x6 stepping 0x2 ucode 0x43 cpuid 0x306f2)
  * CPU microcode is the latest known available version: awk: cannot open bash (No such file or directory)
UNKNOWN (latest microcode version for your CPU model is unknown)
* CPU vulnerability to the speculative execution attack variants
  * Vulnerable to CVE-2017-5753 (Spectre Variant 1, bounds check bypass): YES
  * Vulnerable to CVE-2017-5715 (Spectre Variant 2, branch target injection): YES
  * Vulnerable to CVE-2017-5754 (Variant 3, Meltdown, rogue data cache load): YES
  * Vulnerable to CVE-2018-3640 (Variant 3a, rogue system register read): YES
  * Vulnerable to CVE-2018-3639 (Variant 4, speculative store bypass): YES
  * Vulnerable to CVE-2018-3615 (Foreshadow (SGX), L1 terminal fault): NO
  * Vulnerable to CVE-2018-3620 (Foreshadow-NG (OS), L1 terminal fault): YES
  * Vulnerable to CVE-2018-3646 (Foreshadow-NG (VMM), L1 terminal fault): YES
  * Vulnerable to CVE-2018-12126 (Fallout, microarchitectural store buffer data sampling (MSBDS)):_
  * Vulnerable to CVE-2018-12130 (ZombieLoad, microarchitectural fill buffer data sampling (MFBDS)): YES
```


<sup>119</sup> https://github.com/speed47/spectre-meltdown-checker

```
  * Vulnerable to CVE-2018-12127 (RIDL, microarchitectural load port data sampling (MLPDS)): YES
  * Vulnerable to CVE-2019-11091 (RIDL, microarchitectural data sampling uncacheable memory_
  * Vulnerable to CVE-2019-11135 (ZombieLoad V2, TSX Asynchronous Abort (TAA)): NO
  * Vulnerable to CVE-2018-12207 (No eXcuses, iTLB Multihit, machine check exception on page size changes (MCEPSC)): YES
CVE-2017-5753 aka Spectre Variant 1, bounds check bypass
* Mitigated according to the /sys interface: YES (Mitigation: usercopy/swapgs barriers and __user pointer sanitization)
* Kernel has array_index_mask_nospec: YES (1 occurrence(s) found of x86 64 bits array_index_mask_nospec())
* Kernel has the Red Hat/Ubuntu patch: NO
* Kernel has mask_nospec64 (arm64): NO
> STATUS: NOT VULNERABLE (Mitigation: usercopy/swapgs barriers and __user pointer sanitization)
CVE-2017-5715 aka Spectre Variant 2, branch target injection
* Mitigated according to the /sys interface: YES (Mitigation: Full generic retpoline, IBPB: conditional, IBRS_FW, RSB filling)
* Mitigation 1
  * Kernel is compiled with IBRS support: YES
    * IBRS enabled and active: YES (for firmware code only)
  * Kernel is compiled with IBPB support: YES
    * IBPB enabled and active: YES
* Mitigation 2
  * Kernel has branch predictor hardening (arm): NO
  * Kernel compiled with retpoline option: YES
    * Kernel compiled with a retpoline-aware compiler: YES (kernel reports full retpoline_
> STATUS: NOT VULNERABLE (Full retpoline + IBPB are mitigating the vulnerability)
CVE-2017-5754 aka Variant 3, Meltdown, rogue data cache load
* Mitigated according to the /sys interface: YES (Mitigation: PTI)
* Kernel supports Page Table Isolation (PTI): YES
  * PTI enabled and active: YES
  * Reduced performance impact of PTI: YES (CPU supports INVPCID, performance impact of PTI will be greatly reduced)
* Running as a Xen PV DomU: NO
> STATUS: NOT VULNERABLE (Mitigation: PTI)
CVE-2018-3640 aka Variant 3a, rogue system register read
* CPU microcode mitigates the vulnerability: YES
> STATUS: NOT VULNERABLE (your CPU microcode mitigates the vulnerability)
CVE-2018-3639 aka Variant 4, speculative store bypass
* Mitigated according to the /sys interface: YES (Mitigation: Speculative Store Bypass disabled via prctl and seccomp)
* Kernel supports disabling speculative store bypass (SSB): YES (found in /proc/self/status)
* SSB mitigation is enabled and active: YES (per-thread through prctl)
* SSB mitigation currently active for selected processes: YES (systemd-journald systemd-logind_
> STATUS: NOT VULNERABLE (Mitigation: Speculative Store Bypass disabled via prctl and seccomp)
CVE-2018-3615 aka Foreshadow (SGX), L1 terminal fault
* CPU microcode mitigates the vulnerability: N/A
> STATUS: NOT VULNERABLE (your CPU vendor reported your CPU model as not vulnerable)
CVE-2018-3620 aka Foreshadow-NG (OS), L1 terminal fault
* Mitigated according to the /sys interface: YES (Mitigation: PTE Inversion; VMX: conditional cache flushes, SMT disabled)
* Kernel supports PTE inversion: YES (found in kernel image)
* PTE inversion enabled and active: YES
> STATUS: NOT VULNERABLE (Mitigation: PTE Inversion; VMX: conditional cache flushes, SMT disabled)
CVE-2018-3646 aka Foreshadow-NG (VMM), L1 terminal fault
* Information from the /sys interface: Mitigation: PTE Inversion; VMX: conditional cache flushes, SMT disabled
* This system is a host running a hypervisor: NO
* Mitigation 1 (KVM)
  * EPT is disabled: NO
* Mitigation 2
  * L1D flush is supported by kernel: YES (found flush_l1d in /proc/cpuinfo)
  * L1D flush enabled: YES (conditional flushes)
  * Hardware-backed L1D flush supported: YES (performance impact of the mitigation will be greatly reduced)
  * Hyper-Threading (SMT) is enabled: NO
> STATUS: NOT VULNERABLE (this system is not running a hypervisor)
CVE-2018-12126 aka Fallout, microarchitectural store buffer data sampling (MSBDS)
* Mitigated according to the /sys interface: YES (Mitigation: Clear CPU buffers; SMT disabled)
* Kernel supports using MD_CLEAR mitigation: YES (md_clear found in /proc/cpuinfo)
* Kernel mitigation is enabled and active: YES
* SMT is either mitigated or disabled: YES
> STATUS: NOT VULNERABLE (Your microcode and kernel are both up to date for this mitigation, and mitigation is enabled)
CVE-2018-12130 aka ZombieLoad, microarchitectural fill buffer data sampling (MFBDS)
* Mitigated according to the /sys interface: YES (Mitigation: Clear CPU buffers; SMT disabled)
* Kernel supports using MD_CLEAR mitigation: YES (md_clear found in /proc/cpuinfo)
* Kernel mitigation is enabled and active: YES
* SMT is either mitigated or disabled: YES
> STATUS: NOT VULNERABLE (Your microcode and kernel are both up to date for this mitigation, and mitigation is enabled)
CVE-2018-12127 aka RIDL, microarchitectural load port data sampling (MLPDS)
* Mitigated according to the /sys interface: YES (Mitigation: Clear CPU buffers; SMT disabled)
* Kernel supports using MD_CLEAR mitigation: YES (md_clear found in /proc/cpuinfo)
* Kernel mitigation is enabled and active: YES
* SMT is either mitigated or disabled: YES
> STATUS: NOT VULNERABLE (Your microcode and kernel are both up to date for this mitigation, and mitigation is enabled)
CVE-2019-11091 aka RIDL, microarchitectural data sampling uncacheable memory (MDSUM)
* Mitigated according to the /sys interface: YES (Mitigation: Clear CPU buffers; SMT disabled)
* Kernel supports using MD_CLEAR mitigation: YES (md_clear found in /proc/cpuinfo)
* Kernel mitigation is enabled and active: YES
* SMT is either mitigated or disabled: YES
> STATUS: NOT VULNERABLE (Your microcode and kernel are both up to date for this mitigation, and mitigation is enabled)
CVE-2019-11135 aka ZombieLoad V2, TSX Asynchronous Abort (TAA)
* Mitigated according to the /sys interface: YES (Not affected)
* TAA mitigation is supported by kernel: YES (found tsx_async_abort in kernel image)
* TAA mitigation enabled and active: NO
> STATUS: NOT VULNERABLE (your CPU vendor reported your CPU model as not vulnerable)
CVE-2018-12207 aka No eXcuses, iTLB Multihit, machine check exception on page size changes (MCEPSC)
* Mitigated according to the /sys interface: YES (KVM: Mitigation: Split huge pages)
* This system is a host running a hypervisor: NO
* iTLB Multihit mitigation is supported by kernel: YES (found itlb_multihit in kernel image)
* iTLB Multihit mitigation enabled and active: YES (KVM: Mitigation: Split huge pages)
> STATUS: NOT VULNERABLE (this system is not running a hypervisor)
> SUMMARY: CVE-2017-5753:OK CVE-2017-5715:OK CVE-2017-5754:OK CVE-2018-3640:OK CVE-2018-3639:OK CVE-2018-3615:OK CVE-2018-3620:OK CVE-2018-3646:OK CVE-2018-12126:OK CVE-2018-12130:OK CVE-2018-12127:OK CVE-2019-11091:OK CVE-2019-11135:OK CVE-2018-12207:OK
```
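The final `> SUMMARY:` line is convenient to check automatically when collecting calibration data from several testbeds. The sketch below is a hypothetical helper, not part of the checker script or of CSIT; it assumes the summary is a single line of space-separated `CVE-id:STATUS` pairs, as in the output above.

```python
# Summary line copied from the checker output above, joined into one string.
SUMMARY = (
    "> SUMMARY: CVE-2017-5753:OK CVE-2017-5715:OK CVE-2017-5754:OK "
    "CVE-2018-3640:OK CVE-2018-3639:OK CVE-2018-3615:OK CVE-2018-3620:OK "
    "CVE-2018-3646:OK CVE-2018-12126:OK CVE-2018-12130:OK CVE-2018-12127:OK "
    "CVE-2019-11091:OK CVE-2019-11135:OK CVE-2018-12207:OK"
)


def parse_summary(line):
    """Map each CVE identifier in a '> SUMMARY:' line to its status string."""
    _, _, body = line.partition("SUMMARY:")
    return dict(item.rsplit(":", 1) for item in body.split())


statuses = parse_summary(SUMMARY)
not_ok = [cve for cve, status in statuses.items() if status != "OK"]
print(f"{len(statuses)} CVEs checked, not OK: {not_ok}")
```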

#### 2.12.7 Calibration Data - Denverton

The following sections include sample calibration data measured on a Denverton server at Intel SH labs.

Testing on the 2-Node Atom Denverton testbed took place at Intel Corporation, carefully adhering to FD.io CSIT best practices.

#### Linux cmdline

#### Linux uname

```
$ uname -a
Linux 4.15.0-36-generic #39~16.04.1-Ubuntu SMP Tue Sep 25 08:59:23 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
```

#### **System-level Core Jitter**

```
$ sudo taskset -c 2 /home/testuser/pma_tools/jitter/jitter -c 2 -i 20
Linux Jitter testing program version 1.9
Iterations=20
The pragram will execute a dummy function 80000 times
Display is updated every 20000 displayUpdate intervals
Thread affinity will be set to core_id:2
Timings are in CPU Core cycles
Inst_Min:    Minimum Excution time during the display update interval(default is ~1 second)
Inst_Max:    Maximum Excution time during the display update interval(default is ~1 second)
Inst_jitter: Jitter in the Excution time during rhe display update interval. This is the value of interest
last_Exec:   The Excution time of last iteration just before the display update
Abs_Min:     Absolute Minimum Excution time since the program started or statistics were reset
Abs_Max:     Absolute Maximum Excution time since the program started or statistics were reset
tmp:         Cumulative value calcualted by the dummy function
Interval:    Time interval between the display updates in Core Cycles
Sample No:   Sample number

Inst_Min  Inst_Max  Inst_jitter  last_Exec  Abs_min  Abs_max         tmp    Interval  Sample No
  177530    196100        18570     177530   177530   196100  4156751872  3556820054          1
  177530    200784        23254     177530   177530   200784   321060864  3556897644          2
  177530    196346        18816     177530   177530   200784   780337152  3556918674          3
  177530    195962        18432     177530   177530   200784  1239613440  3556847928          4
  177530    195960        18430     177530   177530   200784  1698889728  3556860214          5
  177530    198824        21294     177530   177530   200784  2158166016  3556854934          6
  177530    198522        20992     177530   177530   200784  2617442304  3556862410          7
  177530    196362        18832     177530   177530   200784  3076718592  3556851636          8
  177530    199114        21584     177530   177530   200784  3535994880  3556870846          9
  177530    197194        19664     177530   177530   200784  3995271168  3556933584         10
  177530    198272        20742     177536   177530   200784   159580160  3556869044         11
  177530    197586        20056     177530   177530   200784   618856448  3556903482         12
  177530    196072        18542     177530   177530   200784  1078132736  3556825540         13
  177530    196354        18824     177530   177530   200784  1537409024  3556881664         14
  177530    195906        18376     177530   177530   200784  1996685312  3556839924         15
  177530    199066        21536     177530   177530   200784  2455961600  3556860220         16
  177530    196968        19438     177530   177530   200784  2915237888  3556871890         17
  177530    195896        18366     177530   177530   200784  3374514176  3556855338         18
  177530    196020        18490     177530   177530   200784  3833790464  3556839820         19
  177530    196030        18500     177530   177530   200784  4293066752  3556889196         20
```
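
The jitter columns are related by simple arithmetic, which gives a quick sanity check on a captured log. The sketch below uses the first three samples from the output above; it is an illustration rather than part of the jitter tool, and the reading of `tmp` as a counter that wraps at 32 bits is an assumption inferred from the printed values.

```python
# First three samples copied from the jitter output above:
# (Inst_Min, Inst_Max, Inst_jitter, tmp).
samples = [
    (177530, 196100, 18570, 4156751872),
    (177530, 200784, 23254, 321060864),
    (177530, 196346, 18816, 780337152),
]

# Inst_jitter is the spread between the slowest and fastest execution
# observed during one display interval.
for inst_min, inst_max, inst_jitter, _ in samples:
    assert inst_max - inst_min == inst_jitter

# Assumption: tmp behaves like an unsigned 32-bit counter, so consecutive
# differences are compared modulo 2**32.
tmp_steps = {
    (later[3] - earlier[3]) % 2**32
    for earlier, later in zip(samples, samples[1:])
}
print(tmp_steps)
```

A single value in `tmp_steps` means the dummy function accumulated the same amount in every interval, which is what an undisturbed run should show.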

#### **Memory Bandwidth**

```
$ sudo /home/testuser/mlc --bandwidth_matrix
Intel(R) Memory Latency Checker - v3.5
Command line parameters: --bandwidth_matrix
Using buffer size of 100.000MB/thread for reads and an additional 100.000MB/thread for writes
Measuring Memory Bandwidths between nodes within system
Bandwidths are in MB/sec (1 MB/sec = 1,000,000 Bytes/sec)
Using all the threads from each core if Hyper-threading is enabled
Using Read-only traffic type
       Memory node
Socket
    0 28157.2
```

```
$ sudo /home/testuser/mlc --peak_injection_bandwidth
Intel(R) Memory Latency Checker - v3.5
Command line parameters: --peak_injection_bandwidth
Using buffer size of 100.000MB/thread for reads and an additional 100.000MB/thread for writes
Measuring Peak Injection Memory Bandwidths for the system
Bandwidths are in MB/sec (1 MB/sec = 1,000,000 Bytes/sec)
Using all the threads from each core if Hyper-threading is enabled
Using traffic with the following read-write ratios
ALL Reads : 28150.0
3:1 Reads-Writes : 27425.0
2:1 Reads-Writes : 27565.4
1:1 Reads-Writes : 27489.3
Stream-triad like: 26878.2
```

```
$ sudo /home/testuser/mlc --max_bandwidth
Intel(R) Memory Latency Checker - v3.5
Command line parameters: --max_bandwidth
Using buffer size of 100.000MB/thread for reads and an additional 100.000MB/thread for writes
Measuring Maximum Memory Bandwidths for the system
Will take several minutes to complete as multiple injection rates will be tried to get the best bandwidth
Bandwidths are in MB/sec (1 MB/sec = 1,000,000 Bytes/sec)
Using all the threads from each core if Hyper-threading is enabled
Using traffic with the following read-write ratios
ALL Reads        : 30032.40
3:1 Reads-Writes : 27450.88
2:1 Reads-Writes : 27567.46
1:1 Reads-Writes : 27501.90
Stream-triad like: 27124.82
```

#### **Memory Latency**

```
$ sudo /home/testuser/mlc --idle_latency
Intel(R) Memory Latency Checker - v3.5
Command line parameters: --idle_latency
Using buffer size of 200.000MB
Each iteration took 186.7 core clocks ( 93.4 ns)
```
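
The relation between the two figures on the last output line (core clocks and nanoseconds) can be checked with a short calculation. The sketch below is only an illustration: the helper names are made up for this example, and the 2.0 GHz nominal clock is an assumption taken from the CPU model string reported elsewhere in this section.

```python
def clocks_to_ns(core_clocks: float, freq_ghz: float) -> float:
    """Convert core clock cycles to nanoseconds (1 GHz = 1 cycle per ns)."""
    return core_clocks / freq_ghz


def implied_freq_ghz(core_clocks: float, nanoseconds: float) -> float:
    """Clock frequency implied by a (cycles, nanoseconds) pair."""
    return core_clocks / nanoseconds


# Values copied from the mlc --idle_latency output above.
CLOCKS = 186.7
LATENCY_NS = 93.4
# Assumption: nominal core clock of 2.0 GHz.
NOMINAL_GHZ = 2.0

freq = implied_freq_ghz(CLOCKS, LATENCY_NS)
print(f"implied core clock: {freq:.3f} GHz")
print(f"latency at nominal clock: {clocks_to_ns(CLOCKS, NOMINAL_GHZ):.2f} ns")
```

The implied frequency comes out within rounding of the nominal clock, which suggests the core was running at its nominal frequency during the measurement.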

```
00002   135.47   27176.9
00008   134.97   27063.3
00015   134.41   26825.6
00050   139.83   28419.1
00100   124.28   22616.4
00200   109.40   14139.8
00300   104.56   10275.1
00400   102.02    8120.0
00500   100.38    6751.4
00700    98.30    5124.9
01000    96.56    3852.7
01300    95.65    3149.0
01700    95.06    2585.4
02500    94.43    1988.8
03500    94.16    1621.1
05000    93.95    1343.1
09000    93.65    1052.6
20000    93.43     851.7
```

#### **L1/L2/LLC Latency**

```
$ sudo /home/testuser/mlc --c2c_latency
Intel(R) Memory Latency Checker - v3.5
Command line parameters: --c2c_latency

Measuring cache-to-cache transfer latency (in ns)...
Local Socket L2->L2 HIT latency 8.8
Local Socket L2->L2 HITM latency 8.8
```

#### **Spectre and Meltdown Checks**

The following section shows the output of a shell script that checks whether the system is vulnerable to the several "speculative execution" CVEs that were made public in 2018. The script is available on the Spectre & Meltdown Checker GitHub<sup>120</sup>.

```
Spectre and Meltdown mitigation detection tool v0.42
Checking for vulnerabilities on current system
Kernel is Linux 4.15.0-51-generic #55-Ubuntu SMP Wed May 15 14:27:21 UTC 2019 x86_64
CPU is Intel(R) Atom(TM) CPU C3858 @ 2.00GHz
Hardware check
* Hardware support (CPU microcode) for mitigation techniques
 * Indirect Branch Restricted Speculation (IBRS)
   * SPEC_CTRL MSR is available: YES
    * CPU indicates IBRS capability: YES (SPEC_CTRL feature bit)
 * Indirect Branch Prediction Barrier (IBPB)
   * PRED_CMD MSR is available: YES
    * CPU indicates IBPB capability: YES (SPEC_CTRL feature bit)
 * Single Thread Indirect Branch Predictors (STIBP)
   * SPEC_CTRL MSR is available: YES
    * CPU indicates STIBP capability: YES (Intel STIBP feature bit)
 * Speculative Store Bypass Disable (SSBD)
   * CPU indicates SSBD capability: YES (Intel SSBD)
 * L1 data cache invalidation
   * FLUSH_CMD MSR is available: NO
    * CPU indicates L1D flush capability: NO
```

<sup>120</sup> https://github.com/speed47/spectre-meltdown-checker

```
* Microarchitecture Data Sampling
   * VERW instruction is available: YES (MD_CLEAR feature bit)
 * Enhanced IBRS (IBRS_ALL)
   * CPU indicates ARCH_CAPABILITIES MSR availability: YES
   * ARCH_CAPABILITIES MSR advertises IBRS_ALL capability: NO
 * CPU explicitly indicates not being vulnerable to Meltdown/L1TF (RDCL_NO): YES
 * CPU explicitly indicates not being vulnerable to Variant 4 (SSB_NO): NO
 * CPU/Hypervisor indicates L1D flushing is not necessary on this system: YES
 * Hypervisor indicates host CPU might be vulnerable to RSB underflow (RSBA): NO
 * CPU explicitly indicates not being vulnerable to Microarchitectural Data Sampling (MDS_NO): YES
 * CPU supports Software Guard Extensions (SGX): NO
 * CPU microcode is known to cause stability problems: NO (model 0x5f family 0x6 stepping 0x1 ucode 0x2e cpuid 0x506f1)
 * CPU microcode is the latest known available version: awk: fatal: cannot open file `bash for reading (No such file or directory)
UNKNOWN (latest microcode version for your CPU model is unknown)
* CPU vulnerability to the speculative execution attack variants
 * Vulnerable to CVE-2017-5753 (Spectre Variant 1, bounds check bypass): YES
 * Vulnerable to CVE-2017-5715 (Spectre Variant 2, branch target injection): YES
 * Vulnerable to CVE-2017-5754 (Variant 3, Meltdown, rogue data cache load): NO
 * Vulnerable to CVE-2018-3640 (Variant 3a, rogue system register read): YES
 * Vulnerable to CVE-2018-3639 (Variant 4, speculative store bypass): YES
 * Vulnerable to CVE-2018-3615 (Foreshadow (SGX), L1 terminal fault): NO
 * Vulnerable to CVE-2018-3620 (Foreshadow-NG (OS), L1 terminal fault): NO
 * Vulnerable to CVE-2018-3646 (Foreshadow-NG (VMM), L1 terminal fault): NO
 * Vulnerable to CVE-2018-12126 (Fallout, microarchitectural store buffer data sampling (MSBDS)): NO
 * Vulnerable to CVE-2018-12130 (ZombieLoad, microarchitectural fill buffer data sampling (MFBDS)): NO
 * Vulnerable to CVE-2018-12127 (RIDL, microarchitectural load port data sampling (MLPDS)): NO
 * Vulnerable to CVE-2019-11091 (RIDL, microarchitectural data sampling uncacheable memory (MDSUM)): NO
CVE-2017-5753 aka Spectre Variant 1, bounds check bypass
* Mitigated according to the /sys interface: YES (Mitigation: __user pointer sanitization)
* Kernel has array_index_mask_nospec: YES (1 occurrence(s) found of x86 64 bits array_index_mask_nospec())
* Kernel has the Red Hat/Ubuntu patch: NO
* Kernel has mask_nospec64 (arm64): NO
> STATUS: NOT VULNERABLE (Mitigation: __user pointer sanitization)
CVE-2017-5715 aka Spectre Variant 2, branch target injection
* Mitigated according to the /sys interface: YES (Mitigation: Full generic retpoline, IBPB:_
* Kernel is compiled with IBRS support: YES
   * IBRS enabled and active: YES (for firmware code only)
 * Kernel is compiled with IBPB support: YES
   * IBPB enabled and active: YES
* Mitigation 2
 * Kernel has branch predictor hardening (arm): NO
 * Kernel compiled with retpoline option: YES
   * Kernel compiled with a retpoline-aware compiler: YES (kernel reports full retpoline_
> STATUS: NOT VULNERABLE (Full retpoline + IBPB are mitigating the vulnerability)
CVE-2017-5754 aka Variant 3, Meltdown, rogue data cache load
* Mitigated according to the /sys interface: YES (Not affected)
* Kernel supports Page Table Isolation (PTI): YES
 * PTI enabled and active: UNKNOWN (dmesg truncated, please reboot and relaunch this script)
 * Reduced performance impact of PTI: NO (PCID/INVPCID not supported, performance impact of PTI will be significant)
* Running as a Xen PV DomU: NO
> STATUS: NOT VULNERABLE (your CPU vendor reported your CPU model as not vulnerable)
CVE-2018-3640 aka Variant 3a, rogue system register read
* CPU microcode mitigates the vulnerability: YES
> STATUS: NOT VULNERABLE (your CPU microcode mitigates the vulnerability)
CVE-2018-3639 aka Variant 4, speculative store bypass
* Mitigated according to the /sys interface: YES (Mitigation: Speculative Store Bypass disabled via prctl and seccomp)
* Kernel supports disabling speculative store bypass (SSB): YES (found in /proc/self/status)
* SSB mitigation is enabled and active: YES (per-thread through prctl)
* SSB mitigation currently active for selected processes: YES (systemd-journald systemd-logind systemd-networkd systemd-resolved systemd-timesyncd systemd-udevd)
> STATUS: NOT VULNERABLE (Mitigation: Speculative Store Bypass disabled via prctl and seccomp)
CVE-2018-3615 aka Foreshadow (SGX), L1 terminal fault
* CPU microcode mitigates the vulnerability: N/A
> STATUS: NOT VULNERABLE (your CPU vendor reported your CPU model as not vulnerable)
CVE-2018-3620 aka Foreshadow-NG (OS), L1 terminal fault
* Mitigated according to the /sys interface: YES (Not affected)
* Kernel supports PTE inversion: YES (found in kernel image)
* PTE inversion enabled and active: NO
> STATUS: NOT VULNERABLE (your CPU vendor reported your CPU model as not vulnerable)
CVE-2018-3646 aka Foreshadow-NG (VMM), L1 terminal fault
* Information from the /sys interface: Not affected
* This system is a host running a hypervisor: NO
* Mitigation 1 (KVM)
 * EPT is disabled: NO
* Mitigation 2
 * L1D flush is supported by kernel: YES (found flush_l1d in kernel image)
 * L1D flush enabled: NO
 * Hardware-backed L1D flush supported: NO (flush will be done in software, this is slower)
 * Hyper-Threading (SMT) is enabled: NO
> STATUS: NOT VULNERABLE (your CPU vendor reported your CPU model as not vulnerable)
CVE-2018-12126 aka Fallout, microarchitectural store buffer data sampling (MSBDS)
* Mitigated according to the /sys interface: YES (Not affected)
* Kernel supports using MD_CLEAR mitigation: YES (md_clear found in /proc/cpuinfo)
* Kernel mitigation is enabled and active: NO
* SMT is either mitigated or disabled: NO
> STATUS: NOT VULNERABLE (your CPU vendor reported your CPU model as not vulnerable)
CVE-2018-12130 aka ZombieLoad, microarchitectural fill buffer data sampling (MFBDS)
* Mitigated according to the /sys interface: YES (Not affected)
* Kernel supports using MD_CLEAR mitigation: YES (md_clear found in /proc/cpuinfo)
* Kernel mitigation is enabled and active: NO
* SMT is either mitigated or disabled: NO
> STATUS: NOT VULNERABLE (your CPU vendor reported your CPU model as not vulnerable)
CVE-2018-12127 aka RIDL, microarchitectural load port data sampling (MLPDS)
* Mitigated according to the /sys interface: YES (Not affected)
* Kernel supports using MD_CLEAR mitigation: YES (md_clear found in /proc/cpuinfo)
* Kernel mitigation is enabled and active: NO
* SMT is either mitigated or disabled: NO
> STATUS: NOT VULNERABLE (your CPU vendor reported your CPU model as not vulnerable)
CVE-2019-11091 aka RIDL, microarchitectural data sampling uncacheable memory (MDSUM)
* Mitigated according to the /sys interface: YES (Not affected)
* Kernel supports using MD_CLEAR mitigation: YES (md_clear found in /proc/cpuinfo)
* Kernel mitigation is enabled and active: NO
* SMT is either mitigated or disabled: NO
> STATUS: NOT VULNERABLE (your CPU vendor reported your CPU model as not vulnerable)
> SUMMARY: CVE-2017-5753:OK CVE-2017-5715:OK CVE-2017-5754:OK CVE-2018-3640:OK CVE-2018-3639:OK CVE-2018-3615:OK CVE-2018-3620:OK CVE-2018-3646:OK CVE-2018-12126:OK CVE-2018-12130:OK CVE-2018-12127:OK CVE-2019-11091:OK
```

#### 2.12.8 Calibration Data - TaiShan

The following sections include sample calibration data measured on the s17-t33-sut1 server, which runs in one of the Cortex-A72 testbeds.

Calibration data obtained from all other servers in the TaiShan testbeds shows the same or similar values.

#### Linux cmdline

```
$ cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-4.15.0-54-generic root=/dev/mapper/huawei--1--vg-root ro isolcpus=1-15,17-31,33-47,49-63 nohz_full=1-15 17-31,33-47,49-63 rcu_nocbs=1-15 17-31,33-47,49-63 intel_iommu=on nmi_watchdog=0 audit=0 nosoftlockup processor.max_cstate=1 console=ttyAMA0,115200n8
```

#### Linux uname

```
$ uname -a
Linux s17-t33-sut1 4.15.0-54-generic #58-Ubuntu SMP Mon Jun 24 10:56:40 UTC 2019 aarch64 aarch64 aarch64 GNU/Linux
```

#### **System-level Core Jitter**

```
$ sudo taskset -c 3 /home/testuser/pma_tools/jitter/jitter -i 20
Linux Jitter testing program version 1.9
Iterations=30
The pragram will execute a dummy function 80000 times
Display is updated every 20000 displayUpdate intervals
Thread affinity will be set to core_id:7
Timings are in CPU Core cycles
Inst_Min:    Minimum Excution time during the display update interval(default is ~1 second)
Inst_Max:    Maximum Excution time during the display update interval(default is ~1 second)
Inst_jitter: Jitter in the Excution time during rhe display update interval. This is the value of interest
last_Exec:   The Excution time of last iteration just before the display update
Abs_Min:     Absolute Minimum Excution time since the program started or statistics were reset
Abs_Max:     Absolute Maximum Excution time since the program started or statistics were reset
tmp:         Cumulative value calcualted by the dummy function
Interval:    Time interval between the display updates in Core Cycles
Sample No:   Sample number

  Inst_Min   Inst_Max   Inst_jitter  last_Exec  Abs_min    Abs_max    tmp         Interval     Sample No
    160022     172254     12232      160042     160022     172254     1903230976  3204401362   1
    160022     173148     13126      160044     160022     173148      814809088  3204619316   2
```

| Inst_Min | Inst_Max | Inst_jitter | last_Exec | Abs_min | Abs_max | tmp        | Interval   | Sample No |
|----------|----------|-------------|-----------|---------|---------|------------|------------|-----------|
| 160022   | 169460   | 9438        | 160044    | 160022  | 173148  | 4021354496 | 3204391306 | 3         |
| 160024   | 170270   | 10246       | 160044    | 160022  | 173148  | 2932932608 | 3204385830 | 4         |
| 160022   | 169660   | 9638        | 160044    | 160022  | 173148  | 1844510720 | 3204387290 | 5         |
| 160022   | 169410   | 9388        | 160040    | 160022  | 173148  | 756088832  | 3204375832 | 6         |
| 160022   | 169012   | 8990        | 160042    | 160022  | 173148  | 3962634240 | 3204378924 | 7         |
| 160022   | 169556   | 9534        | 160044    | 160022  | 173148  | 2874212352 | 3204374882 | 8         |
| 160022   | 171684   | 11662       | 160042    | 160022  | 173148  | 1785790464 | 3204394596 | 9         |
| 160022   | 171546   | 11524       | 160024    | 160022  | 173148  | 697368576  | 3204602774 | 10        |
| 160022   | 169248   | 9226        | 160042    | 160022  | 173148  | 3903913984 | 3204401676 | 11        |
| 160022   | 168458   | 8436        | 160042    | 160022  | 173148  | 2815492096 | 3204256350 | 12        |
| 160022   | 169574   | 9552        | 160044    | 160022  | 173148  | 1727070208 | 3204278116 | 13        |
| 160022   | 169352   | 9330        | 160044    | 160022  | 173148  | 638648320  | 3204327234 | 14        |
| 160022   | 169100   | 9078        | 160044    | 160022  | 173148  | 3845193728 | 3204388132 | 15        |
| 160022   | 169338   | 9316        | 160042    | 160022  | 173148  | 2756771840 | 3204380724 | 16        |
| 160022   | 170828   | 10806       | 160046    | 160022  | 173148  | 1668349952 | 3204430452 | 17        |
| 160022   | 173162   | 13140       | 160026    | 160022  | 173162  | 579928064  | 3204611318 | 18        |
| 160022   | 170482   | 10460       | 160042    | 160022  | 173162  | 3786473472 | 3204389896 | 19        |
| 160024   | 170704   | 10680       | 160042    | 160022  | 173162  | 2698051584 | 3204422126 | 20        |
| 160024   | 169302   | 9278        | 160044    | 160022  | 173162  | 1609629696 | 3204397334 | 21        |
| 160022   | 171848   | 11826       | 160044    | 160022  | 173162  | 521207808  | 3204389818 | 22        |
| 160022   | 169438   | 9416        | 160042    | 160022  | 173162  | 3727753216 | 3204395382 | 23        |
| 160022   | 169312   | 9290        | 160042    | 160022  | 173162  | 2639331328 | 3204371202 | 24        |
| 160022   | 171368   | 11346       | 160044    | 160022  | 173162  | 1550909440 | 3204440464 | 25        |
| 160022   | 171998   | 11976       | 160042    | 160022  | 173162  | 462487552  | 3204609440 | 26        |
| 160022   | 169740   | 9718        | 160046    | 160022  | 173162  | 3669032960 | 3204405826 | 27        |
| 160022   | 169610   | 9588        | 160044    | 160022  | 173162  | 2580611072 | 3204390608 | 28        |
| 160022   | 169254   | 9232        | 160044    | 160022  | 173162  | 1492189184 | 3204399760 | 29        |
| 160022   | 169386   | 9364        | 160046    | 160022  | 173162  | 403767296  | 3204417762 | 30        |
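
In the samples above, the Inst_jitter column equals the difference between Inst_Max and Inst_Min for the same display interval. The following sketch checks that relation on three of the samples and reports the worst jitter among them. The function and variable names are illustrative only and are not part of the jitter tool.

```python
from typing import List, Tuple

# (Inst_Min, Inst_Max, Inst_jitter) in CPU core cycles, copied from
# samples 3, 4 and 18 of the output above.
SAMPLES: List[Tuple[int, int, int]] = [
    (160022, 169460, 9438),
    (160024, 170270, 10246),
    (160022, 173162, 13140),
]


def jitter(inst_min: int, inst_max: int) -> int:
    """Jitter within one display interval, in core cycles."""
    return inst_max - inst_min


# Compare the recomputed jitter with the value reported by the tool.
consistent = all(jitter(lo, hi) == reported for lo, hi, reported in SAMPLES)
worst = max(jitter(lo, hi) for lo, hi, _ in SAMPLES)

print(f"reported jitter matches Inst_Max - Inst_Min: {consistent}")
print(f"worst jitter in these samples: {worst} cycles")
```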

#### **Spectre and Meltdown Checks**

The following section shows the output of a shell script that checks whether the system is vulnerable to the several "speculative execution" CVEs that were made public in 2018. The script is available on the Spectre & Meltdown Checker GitHub<sup>121</sup>.

```
Spectre and Meltdown mitigation detection tool v0.43
awk: cannot open bash (No such file or directory)
Checking for vulnerabilities on current system
Kernel is Linux 4.15.0-23-generic #25-Ubuntu SMP Wed May 23 18:02:16 UTC 2018 x86_64
CPU is Intel(R) Xeon(R) Platinum 8180 CPU @ 2.50GHz
Hardware check
* Hardware support (CPU microcode) for mitigation techniques
  * Indirect Branch Restricted Speculation (IBRS)
    * SPEC_CTRL MSR is available: YES
    * CPU indicates IBRS capability: YES (SPEC_CTRL feature bit)
  * Indirect Branch Prediction Barrier (IBPB)
    * PRED_CMD MSR is available: YES
    * CPU indicates IBPB capability: YES (SPEC_CTRL feature bit)
  * Single Thread Indirect Branch Predictors (STIBP)
    * SPEC_CTRL MSR is available: YES
    * CPU indicates STIBP capability: YES (Intel STIBP feature bit)
  * Speculative Store Bypass Disable (SSBD)
    * CPU indicates SSBD capability: NO
 * L1 data cache invalidation
   * FLUSH_CMD MSR is available: NO
    * CPU indicates L1D flush capability: NO
  * Microarchitectural Data Sampling
   * VERW instruction is available: NO
 * Enhanced IBRS (IBRS_ALL)
   * CPU indicates ARCH_CAPABILITIES MSR availability: NO
    * ARCH_CAPABILITIES MSR advertises IBRS_ALL capability: NO
  * CPU explicitly indicates not being vulnerable to Meltdown/L1TF (RDCL_NO): NO
 * CPU explicitly indicates not being vulnerable to Variant 4 (SSB_NO): NO
 * CPU/Hypervisor indicates L1D flushing is not necessary on this system: NO
 * Hypervisor indicates host CPU might be vulnerable to RSB underflow (RSBA): NO
 * CPU explicitly indicates not being vulnerable to Microarchitectural Data Sampling (MDS_NO): NO
 * CPU explicitly indicates not being vulnerable to TSX Asynchronous Abort (TAA_NO): NO
 * CPU explicitly indicates not being vulnerable to iTLB Multihit (PSCHANGE_MSC_NO): NO
 * CPU explicitly indicates having MSR for TSX control (TSX_CTRL_MSR): NO
 * CPU supports Transactional Synchronization Extensions (TSX): YES (RTM feature bit)
 * CPU supports Software Guard Extensions (SGX): NO
 * CPU microcode is known to cause stability problems: NO (model 0x55 family 0x6 stepping 0x4 ucode 0x2000043 cpuid 0x50654)
 * CPU microcode is the latest known available version: awk: cannot open bash (No such file or directory)
UNKNOWN (latest microcode version for your CPU model is unknown)
* CPU vulnerability to the speculative execution attack variants
  * Vulnerable to CVE-2017-5753 (Spectre Variant 1, bounds check bypass): YES
  * Vulnerable to CVE-2017-5715 (Spectre Variant 2, branch target injection): YES
  * Vulnerable to CVE-2017-5754 (Variant 3, Meltdown, rogue data cache load): YES
  * Vulnerable to CVE-2018-3640 (Variant 3a, rogue system register read): YES
  * Vulnerable to CVE-2018-3639 (Variant 4, speculative store bypass): YES
  * Vulnerable to CVE-2018-3615 (Foreshadow (SGX), L1 terminal fault): NO
  * Vulnerable to CVE-2018-3620 (Foreshadow-NG (OS), L1 terminal fault): YES
  * Vulnerable to CVE-2018-3646 (Foreshadow-NG (VMM), L1 terminal fault): YES
  * Vulnerable to CVE-2018-12126 (Fallout, microarchitectural store buffer data sampling (MSBDS)): YES
  * Vulnerable to CVE-2018-12130 (ZombieLoad, microarchitectural fill buffer data sampling (MFBDS)): YES
```

<sup>121</sup> https://github.com/speed47/spectre-meltdown-checker

```
* Vulnerable to CVE-2018-12127 (RIDL, microarchitectural load port data sampling (MLPDS)): YES
  * Vulnerable to CVE-2019-11091 (RIDL, microarchitectural data sampling uncacheable memory_
 * Vulnerable to CVE-2019-11135 (ZombieLoad V2, TSX Asynchronous Abort (TAA)): YES
 * Vulnerable to CVE-2018-12207 (No eXcuses, iTLB Multihit, machine check exception on page size changes (MCEPSC)): YES
CVE-2017-5753 aka Spectre Variant 1, bounds check bypass
* Mitigated according to the /sys interface: YES (Mitigation: __user pointer sanitization)
* Kernel has array_index_mask_nospec: YES (1 occurrence(s) found of x86 64 bits array_index_mask_nospec())
* Kernel has the Red Hat/Ubuntu patch: NO
* Kernel has mask_nospec64 (arm64): NO
> STATUS: NOT VULNERABLE (Mitigation: __user pointer sanitization)
CVE-2017-5715 aka Spectre Variant 2, branch target injection
* Mitigated according to the /sys interface: YES (Mitigation: Full generic retpoline, IBPB, IBRS_FW)
* Mitigation 1
 * Kernel is compiled with IBRS support: YES
    * IBRS enabled and active: YES (for firmware code only)
 * Kernel is compiled with IBPB support: YES
    * IBPB enabled and active: YES
* Mitigation 2
 * Kernel has branch predictor hardening (arm): NO
 * Kernel compiled with retpoline option: YES
    * Kernel compiled with a retpoline-aware compiler: YES (kernel reports full retpoline compilation)
 * Kernel supports RSB filling: YES
> STATUS: NOT VULNERABLE (Full retpoline + IBPB are mitigating the vulnerability)
CVE-2017-5754 aka Variant 3, Meltdown, rogue data cache load
* Mitigated according to the /sys interface: YES (Mitigation: PTI)
* Kernel supports Page Table Isolation (PTI): YES
 * PTI enabled and active: YES
 * Reduced performance impact of PTI: YES (CPU supports INVPCID, performance impact of PTI will be greatly reduced)
* Running as a Xen PV DomU: NO
> STATUS: NOT VULNERABLE (Mitigation: PTI)
CVE-2018-3640 aka Variant 3a, rogue system register read
* CPU microcode mitigates the vulnerability: NO
> STATUS: VULNERABLE (an up-to-date CPU microcode is needed to mitigate this vulnerability)
CVE-2018-3639 aka Variant 4, speculative store bypass
* Mitigated according to the /sys interface: NO (Vulnerable)
* Kernel supports disabling speculative store bypass (SSB): YES (found in /proc/self/status)
* SSB mitigation is enabled and active: NO
> STATUS: VULNERABLE (Your CPU doesnt support SSBD)
CVE-2018-3615 aka Foreshadow (SGX), L1 terminal fault
* CPU microcode mitigates the vulnerability: N/A
> STATUS: NOT VULNERABLE (your CPU vendor reported your CPU model as not vulnerable)
CVE-2018-3620 aka Foreshadow-NG (OS), L1 terminal fault
* Kernel supports PTE inversion: NO
* PTE inversion enabled and active: UNKNOWN (sysfs interface not available)
> STATUS: VULNERABLE (Your kernel doesnt support PTE inversion, update it)
CVE-2018-3646 aka Foreshadow-NG (VMM), L1 terminal fault
* This system is a host running a hypervisor: NO
* Mitigation 1 (KVM)
 * EPT is disabled: NO
* Mitigation 2
 * L1D flush is supported by kernel: NO
 * L1D flush enabled: UNKNOWN (cant find or read /sys/devices/system/cpu/vulnerabilities/l1tf)
 * Hardware-backed L1D flush supported: NO (flush will be done in software, this is slower)
 * Hyper-Threading (SMT) is enabled: YES
> STATUS: NOT VULNERABLE (this system is not running a hypervisor)
CVE-2018-12126 aka Fallout, microarchitectural store buffer data sampling (MSBDS)
* Kernel supports using MD_CLEAR mitigation: NO
> STATUS: VULNERABLE (Neither your kernel or your microcode support mitigation, upgrade both to mitigate the vulnerability)
CVE-2018-12130 aka ZombieLoad, microarchitectural fill buffer data sampling (MFBDS)
* Kernel supports using MD_CLEAR mitigation: NO
> STATUS: VULNERABLE (Neither your kernel or your microcode support mitigation, upgrade both to mitigate the vulnerability)
CVE-2018-12127 aka RIDL, microarchitectural load port data sampling (MLPDS)
* Kernel supports using MD_CLEAR mitigation: NO
> STATUS: VULNERABLE (Neither your kernel or your microcode support mitigation, upgrade both to mitigate the vulnerability)
CVE-2019-11091 aka RIDL, microarchitectural data sampling uncacheable memory (MDSUM)
* Kernel supports using MD_CLEAR mitigation: NO
> STATUS: VULNERABLE (Neither your kernel or your microcode support mitigation, upgrade both to mitigate the vulnerability)
CVE-2019-11135 aka ZombieLoad V2, TSX Asynchronous Abort (TAA)
* TAA mitigation is supported by kernel: NO
* TAA mitigation enabled and active: NO (tsx_async_abort not found in sysfs hierarchy)
> STATUS: VULNERABLE (Your kernel doesnt support TAA mitigation, update it)
CVE-2018-12207 aka No eXcuses, iTLB Multihit, machine check exception on page size changes (MCEPSC)
* This system is a host running a hypervisor: NO
* iTLB Multihit mitigation is supported by kernel: NO
* iTLB Multihit mitigation enabled and active: NO (itlb_multihit not found in sysfs hierarchy)
> STATUS: NOT VULNERABLE (this system is not running a hypervisor)
> SUMMARY: CVE-2017-5753:OK CVE-2017-5715:OK CVE-2017-5754:OK CVE-2018-3640:KO CVE-2018-3639:KO CVE-2018-3615:OK CVE-2018-3620:KO CVE-2018-3646:OK CVE-2018-12126:KO CVE-2018-12130:KO CVE-2018-12127:KO CVE-2019-11091:KO CVE-2019-11135:KO CVE-2018-12207:OK
```
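
The final SUMMARY line of the checker output is a flat list of `CVE:status` pairs, which makes it easy to post-process. The sketch below is one possible way to extract the CVEs that are not reported as OK. The helper names are hypothetical, and the sample line is shortened to the first five entries of the summary above.

```python
from typing import Dict, List


def parse_summary(line: str) -> Dict[str, str]:
    """Map each CVE identifier in a SUMMARY line to its status string."""
    _, _, payload = line.partition("SUMMARY:")
    result: Dict[str, str] = {}
    for token in payload.split():
        # Split on the last colon so the CVE identifier stays intact.
        cve, _, status = token.rpartition(":")
        result[cve] = status
    return result


def not_ok(summary: Dict[str, str]) -> List[str]:
    """Return the CVEs whose status is anything other than OK."""
    return [cve for cve, status in summary.items() if status != "OK"]


# Shortened sample: the first five entries of the SUMMARY line above.
LINE = ("> SUMMARY: CVE-2017-5753:OK CVE-2017-5715:OK CVE-2017-5754:OK "
        "CVE-2018-3640:KO CVE-2018-3639:KO")

summary = parse_summary(LINE)
print(not_ok(summary))
```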

## 2.12.9 SUT Settings - Linux

System provisioning is done by a combination of PXE boot unattended install and Ansible<sup>122</sup>, as described in CSIT Testbed Setup<sup>123</sup>.

Below is a subset of the running configuration:

#### 1. Ubuntu 18.04.x LTS

```
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 18.04.3 LTS
Release: 18.04
Codename: bionic
```

<sup>122</sup> https://www.ansible.com

<sup>123</sup> https://git.fd.io/csit/tree/resources/tools/testbed-setup/README.md?h=rls2001

#### **Linux Boot Parameters**

- isolcpus=<cpu number>-<cpu number> isolates all CPU cores apart from the first core of each socket; the isolated cores are used for running VPP worker threads and Qemu/LXC processes. https://www.kernel.org/doc/Documentation/admin-guide/kernel-parameters.txt
- intel\_pstate=disable [X86] Do not enable intel\_pstate as the default scaling driver for the supported processors. The Intel P-State driver decides what P-state (CPU core power state) to use based on the policy requested from the cpufreq core. [X86 Either 32-bit or 64-bit x86] https://www.kernel.org/doc/Documentation/cpu-freq/intel-pstate.txt
- nohz\_full=<cpu number>-<cpu number> [KNL,BOOT] In kernels built with CONFIG\_NO\_HZ\_FULL=y, set the specified list of CPUs whose tick will be stopped whenever possible. The boot CPU will be forced outside the range to maintain the timekeeping. The CPUs in this range must also be included in the rcu\_nocbs= set. Specifies the adaptive-ticks CPU cores, causing the kernel to avoid sending scheduling-clock interrupts to the listed cores as long as they have a single runnable task. [KNL Is a kernel start-up parameter, SMP The kernel is an SMP kernel]. https://www.kernel.org/doc/Documentation/timers/NO_HZ.txt
- rcu\_nocbs [KNL] In kernels built with CONFIG\_RCU\_NOCB\_CPU=y, set the specified list of CPUs to be no-callback CPUs, that never queue RCU callbacks (read-copy update). https://www.kernel.org/doc/Documentation/admin-guide/kernel-parameters.txt
- numa\_balancing=disable [KNL,X86] Disable automatic NUMA balancing.
- intel\_iommu=on [DMAR] Enable Intel IOMMU driver (DMAR) option.
- iommu=on, iommu=pt [x86, IA-64] Disable IOMMU bypass, using IOMMU for PCI devices.
- nmi\_watchdog=0 [KNL,BUGS=X86] Debugging features for SMP kernels. Turn hardlockup detector in nmi\_watchdog off.
- nosoftlockup [KNL] Disable the soft-lockup detector.
- tsc=reliable Disable clocksource stability checks for TSC. [x86] reliable: mark tsc clocksource as reliable, this disables clocksource verification at runtime, as well as the stability checks done at bootup. Used to enable high-resolution timer mode on older hardware, and in virtualized environment.
- hpet=disable [X86-32,HPET] Disable HPET and use PIT instead.
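
Whether a running system actually booted with these parameters can be verified by parsing the contents of /proc/cmdline. The sketch below splits a command line into parameters and expands a CPU list such as the one given to isolcpus. The function names are illustrative, and the sample string is a shortened, hypothetical command line rather than one taken from a testbed.

```python
from typing import Dict, List, Optional


def parse_cmdline(cmdline: str) -> Dict[str, Optional[str]]:
    """Split a kernel command line into {name: value}; bare flags map to None."""
    params: Dict[str, Optional[str]] = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


def expand_cpu_list(spec: str) -> List[int]:
    """Expand a CPU list such as '1-15,17-31' into individual CPU numbers."""
    cpus: List[int] = []
    for part in spec.split(","):
        first, sep, last = part.partition("-")
        cpus.extend(range(int(first), int(last if sep else first) + 1))
    return cpus


# Hypothetical, shortened command line used only for illustration.
SAMPLE = "isolcpus=1-15,17-31 nmi_watchdog=0 nosoftlockup intel_iommu=on"

params = parse_cmdline(SAMPLE)
isolated = expand_cpu_list(params["isolcpus"])
print(f"{len(isolated)} isolated CPUs; CPU 0 isolated: {0 in isolated}")
```

On a live system the same functions can be applied to the text read from /proc/cmdline.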

## **Hugepages Configuration**

Huge pages are managed via sysctl configuration located in /etc/sysctl.d/90-csit.conf on each testbed. The default huge page size is 2M. The exact number of huge pages depends on the testbed. All the values are defined in the Ansible inventory hosts files.
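
As a rough illustration of how a huge page count relates to a memory budget, the sketch below computes the number of 2M pages needed to back a given amount of memory and formats it as a sysctl setting. This is an assumption-laden example: `vm.nr_hugepages` is the standard Linux sysctl key for the huge page pool size, but the actual contents of 90-csit.conf are not shown in this report, and the 8 GiB figure is hypothetical.

```python
HUGEPAGE_SIZE = 2 * 1024 * 1024  # default huge page size of 2M, in bytes


def hugepages_needed(memory_bytes: int, page_size: int = HUGEPAGE_SIZE) -> int:
    """Number of huge pages needed to cover memory_bytes (rounded up)."""
    return -(-memory_bytes // page_size)


def sysctl_line(page_count: int) -> str:
    """Format the page count as a sysctl.d style setting."""
    return f"vm.nr_hugepages={page_count}"


# Hypothetical budget: 8 GiB of huge page backed memory.
pages = hugepages_needed(8 * 1024 ** 3)
print(sysctl_line(pages))
```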

## 2.12.10 DUT Settings - VPP

**VPP Version** 

VPP-20.01 release

**VPP Compile Parameters** 

FD.io VPP compile job<sup>124</sup>

<sup>124</sup> https://jenkins.fd.io/view/vpp/job/vpp-merge-2001-ubuntu1804/

#### **VPP Install Parameters**

```
$ dpkg -i --force-all *vpp*
```

## **VPP Startup Configuration**

VPP startup configuration varies per test case, with different settings for the \$\$CORELIST\_WORKERS, \$\$NUM\_RX\_QUEUES, \$\$UIO\_DRIVER, \$\$NUM-MBUFS and \$\$NO\_MULTI\_SEG parameters. The default template is provided below:

```
ip
{
  heap-size 4G
}
statseg
{
  size 4G
}
unix
{
  cli-listen /run/vpp/cli.sock
  log /tmp/vpe.log
  nodaemon
}
socksvr
{
  socket-name /run/vpp/api.sock
}
ip6
{
  heap-size 4G
  hash-buckets 2000000
}
heapsize 4G
plugins
{
  plugin default
  {
    disable
  }
  plugin dpdk_plugin.so
  {
    enable
  }
}
cpu
{
  corelist-workers $$CORELIST_WORKERS
  main-core 1
}
dpdk
{
  num-mbufs $$NUM-MBUFS
  uio-driver $$UIO_DRIVER
  $$NO_MULTI_SEG
  log-level debug
  dev default
  {
    num-rx-queues $$NUM_RX_QUEUES
  }
  no-tx-checksum-offload
  dev $$DEV_1
  dev $$DEV_2
}
```

A description of the VPP startup settings used in CSIT is provided in Test Methodology.
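
The `$$NAME` placeholders in the template are filled in per test case. The sketch below shows one simple way such a substitution could be done; it is not the CSIT implementation. The `render` helper, the shortened template and the values (`2,3`, `vfio-pci`, `1`) are all assumptions chosen for illustration.

```python
from typing import Dict

# Shortened excerpt of the startup template, for illustration only.
TEMPLATE = """cpu
{
  corelist-workers $$CORELIST_WORKERS
  main-core 1
}
dpdk
{
  uio-driver $$UIO_DRIVER
  dev default
  {
    num-rx-queues $$NUM_RX_QUEUES
  }
}
"""


def render(template: str, values: Dict[str, object]) -> str:
    """Replace every $$NAME placeholder; fail if any placeholder is left."""
    out = template
    # Longest names first, so a name that is a prefix of another
    # placeholder cannot clobber it.
    for name in sorted(values, key=len, reverse=True):
        out = out.replace("$$" + name, str(values[name]))
    if "$$" in out:
        raise ValueError("unresolved placeholder in template")
    return out


# Hypothetical per-test values.
config = render(TEMPLATE, {
    "CORELIST_WORKERS": "2,3",
    "UIO_DRIVER": "vfio-pci",
    "NUM_RX_QUEUES": 1,
})
print(config)
```

Raising an error on a leftover `$$` is a deliberate choice in this sketch: it avoids writing out a configuration file with an unresolved placeholder.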

## 2.12.11 TG Settings - TRex

## **TG Version**

TRex v2.73

#### **DPDK Version**

DPDK v19.05

### **TG Build Script Used**

TRex installation<sup>125</sup>

## **TG Startup Configuration**

#### **TG Startup Command**

```
$ sh -c 'cd <t-rex-install-dir>/scripts/ && sudo nohup ./t-rex-64 -i -c 7 --prefix $(hostname) --hdrh > /tmp/trex.log 2>&1 &' > /dev/null
```

#### **TG API Driver**

TRex driver<sup>126</sup>

<sup>125</sup> https://git.fd.io/csit/tree/resources/tools/trex/trex\_installer.sh?h=rls2001

<sup>126</sup> https://git.fd.io/csit/tree/resources/tools/trex/trex\_stateless\_profile.py?h=rls2001

## 2.13 Documentation

#### 2.13.1 Container Orchestration in CSIT

#### Overview

#### **Linux Containers**

Linux Containers is an OS-level virtualization method for running multiple isolated Linux systems (containers) on a compute host using a single Linux kernel. Containers rely on Linux kernel cgroups functionality for controlling usage of shared system resources (e.g. CPU, memory, block I/O, network) and on kernel namespaces for isolation. The latter enables complete isolation of an application's view of the operating environment, including process trees, networking, user IDs and mounted file systems.

LXC (Linux Containers) combines the kernel's cgroups and support for isolated namespaces to provide an isolated environment for applications. Docker has used LXC as one of its execution drivers, adding image management and deployment services. More information in [lxc], [lxcnamespace] and [stgraber].

Linux containers can be of two kinds: privileged containers and unprivileged containers.

#### **Unprivileged Containers**

Running unprivileged containers is the safest way to run containers in a production environment. Since LXC 1.0, one can start a full system container entirely as a user by mapping a range of UIDs on the host into a namespace, inside of which a user with UID 0 can exist again. In other words, an unprivileged container masks the user ID from the host, making it impossible to gain root access on the host even if a user gets root in a container. With unprivileged containers, non-root users can create containers and will appear in the container as root, but will appear as userid <non-zero> on the host. Unprivileged containers are also better suited to supporting multi-tenancy operating environments. More information in [lxcsecurity] and [stgraber].
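
The UID mapping described above can be illustrated with a small lookup. The sketch below follows the three-field layout of the Linux `uid_map` file (ID inside the namespace, ID outside the namespace, length of the range). The specific range (host UIDs starting at 100000, 65536 entries) is an assumed example value, and the function name is hypothetical.

```python
from typing import List, Optional, Tuple

# Each entry: (first UID inside the container, first UID on the host, length).
UidMap = List[Tuple[int, int, int]]


def host_uid(container_uid: int, uid_map: UidMap) -> Optional[int]:
    """Translate a container UID to the host UID, or None if unmapped."""
    for inside, outside, length in uid_map:
        if inside <= container_uid < inside + length:
            return outside + (container_uid - inside)
    return None


# Assumed example mapping for an unprivileged container.
UNPRIVILEGED: UidMap = [(0, 100000, 65536)]
# Identity mapping, as in a privileged container.
PRIVILEGED: UidMap = [(0, 0, 4294967295)]

print(host_uid(0, UNPRIVILEGED))  # container root maps to a non-zero host UID
print(host_uid(0, PRIVILEGED))    # container root is host root
```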

#### **Privileged Containers**

Privileged containers do not mask UIDs, and container UID 0 is mapped to the host UID 0. Security and isolation are controlled by careful configuration of cgroup access, an extensive AppArmor profile preventing the known attacks, container capabilities and SELinux. Here is a list of applicable security control mechanisms:

- Capabilities: keep (whitelist) or drop (blacklist) Linux capabilities, [capabilities].
- Control groups (cgroups): resource bean counting, resource quotas, access restrictions, [cgroup1], [cgroup2].
- AppArmor: AppArmor profiles aim to prevent any of the known ways of escaping a container or causing harm to the host, [apparmor].
- SELinux: Security Enhanced Linux is a Linux kernel security module that provides similar function to AppArmor, supporting access control security policies including United States Department of Defense-style mandatory access controls. Mandatory access controls allow an administrator of a system to define how applications and users can access different resources such as files, devices, networks and inter-process communication, [selinux].
- Seccomp: secure computing mode, enables filtering of system calls, [seccomp].

More information in [lxcsecurity] and [lxcsecfeatures].

**Linux Containers in CSIT** 

CSIT uses privileged containers because sysfs must be mounted with read-write (RW) access: VPP needs to access /sys/bus/pci/drivers/uio\_pci\_generic/unbind. Unprivileged containers cannot provide this, as they mount sysfs read-only.

#### **Orchestrating Container Lifecycle Events**

The following Linux container lifecycle events need to be addressed by an orchestration system:

1. Acquire: acquiring/downloading existing container images via **docker pull** or **lxc-create -t download**.
2. Build: building a container image from scratch or from another container image via **docker build <dockerfile/composefile>** or by customizing LXC templates in **GitHub**<sup>127</sup>.
3. (Re-)Create: creating a running instance of a container application anew, or re-creating one that failed. Also known as (re-)deploy, via **docker run** or **lxc-start**.
4. Execute: executing system operations within the container by attaching to the running container. This is done by **lxc-attach** or **docker exec**.
5. Distribute: distributing pre-built container images to the compute nodes. Currently not implemented in CSIT.
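The mapping from lifecycle events to command lines can be sketched as a small lookup. This is an illustration only, not the CSIT implementation: the helper name `lifecycle_command` and the container and image names are hypothetical, and the `download` template normally takes further distribution arguments after `--`, which are omitted here.

```python
def lifecycle_command(engine, event, name, image=None, command=None):
    """Return the argv list for one lifecycle event (hypothetical helper)."""
    if engine == "docker":
        table = {
            "acquire": ["docker", "pull", image],
            "create": ["docker", "run", "--detach", "--name", name, image],
            "execute": ["docker", "exec", name, "/bin/sh", "-c", command],
        }
    elif engine == "lxc":
        table = {
            "acquire": ["lxc-create", "-t", "download", "-n", name],
            "create": ["lxc-start", "-n", name],
            "execute": ["lxc-attach", "-n", name, "--", "/bin/sh", "-c", command],
        }
    else:
        raise ValueError(f"unknown engine: {engine}")
    return table[event]


# Example: the argv a caller would hand to subprocess.run() or an SSH session.
argv = lifecycle_command("lxc", "execute", "DUT1_VNF", command="uname -r")
```

Returning an argument list rather than running the command keeps the sketch free of side effects; a real orchestrator would execute it on the remote node.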

### **Container Orchestration Systems Used in CSIT**

Current CSIT testing framework integrates following Linux container orchestration mechanisms:

• LXC/Docker for complete VPP container lifecycle control.

#### **LXC**

LXC is the well-known and heavily tested low-level Linux container runtime [lxcsource], that provides a userspace interface for the Linux kernel containment features. With a powerful API and simple tools, LXC enables Linux users to easily create and manage system or application containers. LXC uses following kernel features to contain processes:

- Kernel namespaces: ipc, uts, mount, pid, network and user.
- AppArmor and SELinux security profiles.
- Seccomp policies.
- Chroot.
- Cgroups.

CSIT uses LXC runtime and LXC usertools to test VPP data plane performance in a range of virtual networking topologies.

#### **Known Issues**

- Current CSIT restriction: only a single instance of the LXC runtime is supported, due to the cgroup policies used in CSIT. There is a plan to add the capability to create cgroups per container instance to address this issue. This sort of functionality is better supported in LXC 2.1 but can be done in the current version as well.
- CSIT code currently uses cgroups to control the range of CPU cores the LXC container runs on. VPP thread pinning is defined in the VPP startup.conf.


<sup>127</sup> https://github.com/lxc/lxc/tree/master/templates

#### **Docker**

Docker builds on top of Linux kernel containment features, and offers a high-level tool for wrapping processes, and for maintaining and executing them in containers [docker]. Currently it uses *runc*, a CLI tool for spawning and running containers according to the OCI specification<sup>128</sup>.

A Docker container image is a lightweight, stand-alone, executable package of a piece of software that includes everything needed to run it: code, runtime, system tools, system libraries, settings.

CSIT uses Docker to manage the maintenance and execution of containerized applications used in CSIT performance tests.

• Data plane thread pinning to CPU cores: the Docker CLI and/or the Docker configuration file controls the range of CPU cores the Docker container must run on. VPP thread pinning is defined in the VPP startup.conf.

#### Implementation

CSIT container orchestration is implemented in CSIT Level-1 keyword Python libraries following the Builder design pattern. Builder design pattern separates the construction of a complex object from its representation, so that the same construction process can create different representations e.g. LXC, Docker, other.

CSIT Robot Framework keywords are then responsible for higher-level lifecycle control of the named container groups. One can have multiple named groups, with 1..N containers in a group performing different roles/functions, e.g. NFs, Switch, Kafka bus, ETCD datastore, etc. The ContainerManager class acts as a Director and uses the ContainerEngine class, which encapsulates container control.

Current CSIT implementation is illustrated using UML Class diagram:

- 1. Acquire
- 2. Build
- 3. (Re-)Create
- 4. Execute

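```
+--------------------------------------------+
| RF Keywords (high level lifecycle control) |
+--------------------------------------------+
| Construct VNF containers on all DUTs       |
| Acquire all '${group}' containers          |
| Create all '${group}' containers           |
| Install all '${group}' containers          |
| Configure all '${group}' containers        |
| Stop all '${group}' containers             |
| Destroy all '${group}' containers          |
+--------------------------------------------+
             | 1..N
+------------------------+    +------------------+
| ContainerManager       |    | ContainerEngine  |
+------------------------+    +------------------+
| __init()__             |    | __init(node)__   |
| construct_container()  |    | acquire(force)   |
| construct_containers() |    | create()         |
+------------------------+    | execute(command) |
                              +------------------+
```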
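The Director/Builder split described in this section can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the CSIT library: the subclasses `LXCEngine` and `DockerEngine`, the `initialize` method, the `log` attribute and the one-engine-per-container layout are all hypothetical, and the engines only record the commands they would run.

```python
from abc import ABC, abstractmethod
from types import SimpleNamespace


class ContainerEngine(ABC):
    """Builder interface: one concrete subclass per container technology."""

    def __init__(self):
        self.container = None
        self.log = []  # commands a real engine would run on the node

    def initialize(self, **kwargs):
        # Attach the container description passed down from the keyword.
        self.container = SimpleNamespace(**kwargs)

    @abstractmethod
    def acquire(self, force=False): ...

    @abstractmethod
    def create(self): ...


class LXCEngine(ContainerEngine):
    def acquire(self, force=False):
        self.log.append(f"lxc-create -t download -n {self.container.name}")

    def create(self):
        self.log.append(f"lxc-start -n {self.container.name}")


class DockerEngine(ContainerEngine):
    def acquire(self, force=False):
        self.log.append(f"docker pull {self.container.image}")

    def create(self):
        self.log.append(
            f"docker run --name {self.container.name} {self.container.image}")


class ContainerManager:
    """Director: drives the same construction steps for any engine."""

    ENGINES = {"LXC": LXCEngine, "Docker": DockerEngine}

    def __init__(self, engine):
        self.engine_cls = self.ENGINES[engine]
        self.engines = []

    def construct_container(self, **kwargs):
        engine = self.engine_cls()
        engine.initialize(**kwargs)
        self.engines.append(engine)

    def acquire_all_containers(self, force=False):
        for engine in self.engines:
            engine.acquire(force)

    def create_all_containers(self):
        for engine in self.engines:
            engine.create()
```

The caller selects the technology once, when constructing the manager; the sequence of manager calls is then identical for LXC and Docker, which is the point of the Builder pattern.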

<sup>128</sup> https://www.opencontainers.org/



Sequence diagram illustrating the creation of a single container:

```
Legend:
  e  = engine [Docker|LXC]
  .. = kwargs (variable number of keyword arguments)

Participants: RF KW, ContainerManager, ContainerEngine, Container A

 1: RF KW            -> ContainerManager : new ContainerManager(e)
 2: ContainerManager -> ContainerEngine  : new ContainerEngine
 3: RF KW            -> ContainerManager : construct_container(..)
 4: ContainerManager -> ContainerEngine  : init()
 5: ContainerEngine  -> Container A      : new
 6: RF KW            -> ContainerManager : acquire_all_containers()
 7: ContainerManager -> ContainerEngine  : acquire()
    ALT [isRunning & force]
 9: RF KW            -> ContainerManager : create_all_containers()
10: ContainerManager -> ContainerEngine  : create()
    ALT (install_vpp, configure_vpp)
12: RF KW            -> ContainerManager : destroy_all_containers()
13: ContainerManager -> ContainerEngine  : destroy()
```

#### **Container Data Structure**

A container is represented in the Python L1 library as a separate class with instance variables and no methods except the overridden \_\_getattr\_\_ and \_\_setattr\_\_. Instance variables are assigned to the container dynamically during the construct\_container(\*\*kwargs) call and are passed down from the RF keyword.
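A data structure of this kind can be sketched as follows. This is a minimal illustration, not the CSIT class itself: returning `None` for attributes that were never set is an assumption, and the free function `construct_container` stands in for the manager method of the same name.

```python
class Container:
    """Attribute bag with no methods beyond __getattr__ and __setattr__."""

    def __getattr__(self, attr):
        # Called only when normal lookup fails, i.e. the attribute was
        # never assigned. Assumption: report it as None instead of raising.
        try:
            return self.__dict__[attr]
        except KeyError:
            return None

    def __setattr__(self, attr, value):
        # Store every assignment in the instance dictionary.
        self.__dict__[attr] = value


def construct_container(**kwargs):
    """Create a Container and assign each keyword argument as an attribute."""
    container = Container()
    for key, value in kwargs.items():
        setattr(container, key, value)
    return container


vnf = construct_container(name="DUT1_VNF", cpu_count=1)
```

Because attributes are created from keyword arguments, new container parameters can be introduced in the RF keyword without changing the class.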

#### Usage example:

```
| Construct VNF containers on all DUTs
| [Arguments] | ${technology} | ${image} | ${cpu_count}=${1} | ${count}=${1} |
| ...
| | ${group}= | Set Variable | VNF
| | ${skip_cpus}= | Evaluate | ${vpp_cpus}+${system_cpus} |
| Import Library | resources.libraries.python.ContainerUtils.ContainerManager
| | ... | engine=${container_engine} | WITH NAME | ${group} |
| | ${duts}= | Get Matches | ${nodes} | DUT*
| | :FOR | ${dut} | IN | @{duts}
| | ${env}= | Create List | DEBIAN_FRONTEND=noninteractive
| | ${mnt}= | Create List | /tmp:/mnt/host | /dev:/dev
| | ${cpu_node}= | Get interfaces numa node | ${nodes['${dut}']}
| | | ... | ${dut1_if1} | ${dut1_if2}
| | Run Keyword | ${group}.Construct containers
| | | ... | name=${dut}_${group} | node=${nodes['${dut}']} | mnt=${mnt}
| | | ... | image=${container_image} | cpu_count=${container_cpus}
| | | ... | cpu_skip=${skip_cpus} | cpuset_mems=${cpu_node}
| | | ... | install_dkms=${container_install_dkms}
| | Append To List | ${container_groups} | ${group}
```

Mandatory parameters to create a standalone container are: node, name, image [imagevar], cpu\_count, cpu\_skip, cpuset\_mems, cpu\_shared.

There is no parameter-checking functionality; passing the required arguments is the coder's responsibility. All the above parameters are required to calculate the correct CPU placement. See documentation for the full reference.
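Since the library performs no such check, a caller could guard against missing arguments before constructing a container. The helper below is a hypothetical sketch and is not part of CSIT; only the parameter names are taken from the list above.

```python
MANDATORY = ("node", "name", "image", "cpu_count",
             "cpu_skip", "cpuset_mems", "cpu_shared")


def missing_parameters(**kwargs):
    """Return the mandatory parameter names absent from kwargs."""
    return [key for key in MANDATORY if key not in kwargs]


# Example: three of the seven mandatory parameters were forgotten.
missing = missing_parameters(node="DUT1", name="DUT1_VNF", image="ubuntu",
                             cpu_count=1)
```

An empty result means all mandatory parameters were supplied; it does not validate their values.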

#### **Kubernetes**

For future use, Kubernetes [k8sdoc] is implemented as a separate library, KubernetesUtils.py, with a class of the same name. This utility provides an API for L2 Robot Keywords to control kubectl installed on each of the DUTs. A one-time initialization script, resources/libraries/bash/k8s\_setup.sh, resets and initializes kubectl and initializes the csit namespace. The csit namespace is required so as not to interfere with existing setups, and it further simplifies apply/get/delete Pod/ConfigMap operations on SUTs.

The Kubernetes utility is based on YAML templates to avoid crafting huge YAML configuration files, which would lower the readability of the code and require complicated algorithms.

Two types of YAML templates are defined:

- Static do not change between deployments, that is infrastructure containers like Kafka, Calico, ETCD.
- Dynamic per test suite/case topology YAML files.
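A dynamic template of this kind can be filled in with the standard library alone. The sketch below is illustrative only: the Pod manifest, the placeholder names `$name` and `$image`, and the helper `render_pod` are assumptions, not the templates shipped with CSIT.

```python
from string import Template

# Hypothetical per-test Pod template; placeholders are filled per topology.
POD_TEMPLATE = Template("""\
apiVersion: v1
kind: Pod
metadata:
  name: $name
  namespace: csit
spec:
  containers:
  - name: $name
    image: $image
""")


def render_pod(name, image):
    """Substitute test-specific values into the YAML template."""
    return POD_TEMPLATE.substitute(name=name, image=image)


manifest = render_pod("vnf1", "ubuntu:18.04")
```

`substitute` raises `KeyError` when a placeholder is left unfilled, so an incomplete manifest is caught before it is handed to kubectl.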

Writing a custom Python wrapper library around kubectl, instead of using the official Python package, makes it possible to control and deploy the environment over the SSH library without running an isolated driver on each of the DUTs.

### **Tested Topologies**

The listed CSIT container networking test topologies are defined with a DUT containerized VPP switch forwarding packets between NF containers. Each NF container runs its own instance of VPP in L2XC configuration.

The following container networking topologies are tested in CSIT-2001:

- LXC topologies:
  - eth-l2xcbase-eth-2memif-1lxc.
  - eth-l2bdbasemaclrn-eth-2memif-1lxc.
- Docker topologies:
  - eth-l2xcbase-eth-2memif-1docker.
  - eth-l2xcbase-eth-1memif-1docker.


### References

#### 2.13.2 Test Code Documentation

CSIT VPP Performance Tests Documentation<sup>143</sup> contains detailed functional description and input parameters for each test case.

<sup>143</sup> https://docs.fd.io/csit/rls2001/doc/tests.vpp.perf.html

# **CHAPTER THREE: DPDK PERFORMANCE**

# 3.1 Overview

DPDK performance test results are reported for all three physical testbed types present in FD.io labs: 3-Node Xeon Haswell (3n-hsw), 3-Node Xeon Skylake (3n-skx), 2-Node Xeon Skylake (2n-skx) and installed NIC models. For description of physical testbeds used for DPDK performance tests please refer to *Physical Testbeds* (page 5).

# 3.1.1 Logical Topologies

CSIT DPDK performance tests are executed on physical testbeds described in *Physical Testbeds* (page 5). Based on the packet path through server SUTs, one distinct logical topology type is used for DPDK DUT data plane testing: NIC-to-NIC switching topology.

### **NIC-to-NIC Switching**

The simplest logical topology for software data plane application like DPDK is NIC-to-NIC switching. Tested topologies for 2-Node and 3-Node testbeds are shown in figures below.





Server Systems Under Test (SUT) run the DPDK Testpmd or L3fwd application in Linux user-mode as a Device Under Test (DUT). The server Traffic Generator (TG) runs the TRex application. Physical connectivity between SUTs and TG is provided using different drivers and NIC models that need to be tested for performance (packet/bandwidth throughput and latency).

From SUT and DUT perspectives, all performance tests involve forwarding packets between two physical Ethernet ports (10GE, 25GE, 40GE, 100GE). In most cases both physical ports on the SUT are located on the same NIC. The only exceptions are link bonding and 100GE tests. In the latter case only one port per NIC can be driven at line rate due to PCIe Gen3 x16 slot bandwidth limitations. 100GE NICs are not supported in PCIe Gen3 x8 slots.

Note that reported DPDK DUT performance results are specific to the SUTs tested. SUTs with processors other than the ones used in the FD.io lab are likely to yield different results. A good rule of thumb for estimating DPDK packet throughput in the NIC-to-NIC switching topology is to expect the forwarding performance to be proportional to processor core frequency for the same processor architecture, assuming the processor is the only limiting factor and all other SUT parameters are equivalent to the FD.io CSIT environment.
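The rule of thumb amounts to a linear scaling by core frequency. The following sketch shows the arithmetic; the function name and the example numbers (10 Mpps measured at 2.5 GHz, target core at 3.0 GHz) are hypothetical and are not CSIT results.

```python
def estimate_throughput(measured_pps, measured_ghz, target_ghz):
    """Scale a measured packet rate linearly with core frequency.

    Valid only under the stated assumptions: same processor architecture,
    processor as the only bottleneck, otherwise equivalent SUT setup.
    """
    return measured_pps * target_ghz / measured_ghz


# Hypothetical example: 10 Mpps at 2.5 GHz scaled to a 3.0 GHz core.
estimate = estimate_throughput(10e6, 2.5, 3.0)
```

The estimate is an upper-level approximation only; NIC, PCIe or memory limits are not modelled and would cap the real result.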

## 3.1.2 Performance Tests Coverage

Performance tests measure following metrics for tested DPDK DUT topologies and configurations:

- Packet Throughput: measured in accordance with RFC 2544<sup>144</sup>, using FD.io CSIT Multiple Loss Ratio search (MLRsearch), an optimized binary search algorithm, producing throughput at different Packet Loss Ratio (PLR) values:
  - Non Drop Rate (NDR): packet throughput at PLR=0%.
  - Partial Drop Rate (PDR): packet throughput at PLR=0.5%.
- One-Way Packet Latency: measured at different offered packet loads:
  - 100% of discovered NDR throughput.
  - 100% of discovered PDR throughput.
- Maximum Receive Rate (MRR): measured packet forwarding rate under the maximum load offered by traffic generator over a set trial duration, regardless of packet loss. Maximum load for specified Ethernet frame size is set to the bi-directional link rate.
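The quantities behind these metrics can be worked through in a few lines. This is a sketch under stated assumptions: the helper names are hypothetical, the 20-byte per-frame overhead is the standard Ethernet preamble, start-of-frame delimiter and inter-frame gap, and treating a trial as meeting PDR when its loss ratio does not exceed 0.5% is an assumption about the comparison used.

```python
ETHERNET_OVERHEAD_BYTES = 20  # preamble + SFD (8 B) + inter-frame gap (12 B)


def max_frame_rate(link_bps, frame_bytes):
    """Maximum frames per second one direction of a link can carry."""
    return link_bps / ((frame_bytes + ETHERNET_OVERHEAD_BYTES) * 8)


def packet_loss_ratio(sent, received):
    """Fraction of offered packets that were not forwarded."""
    return (sent - received) / sent


def classify_trial(sent, received, pdr_plr=0.005):
    """Check one trial against the NDR (PLR=0%) and PDR (PLR=0.5%) criteria."""
    plr = packet_loss_ratio(sent, received)
    return {"plr": plr, "ndr_ok": plr == 0, "pdr_ok": plr <= pdr_plr}


# 64B frames on 10GE: about 14.88 Mpps per direction, twice that bi-directional.
per_direction = max_frame_rate(10e9, 64)
bidirectional = 2 * per_direction
```

The bi-directional value is the maximum load referred to in the MRR definition for a two-port test at that frame size.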

CSIT-2001 includes following DPDK Testpmd and L3fwd data plane functionality performance tested across a range of NIC drivers and NIC models:

| Functionality          | Description                                                                                                                    |
|------------------------|--------------------------------------------------------------------------------------------------------------------------------|
| L2IntLoop              | L2 Interface Loop forwarding all Ethernet frames between two Interfaces.                                                       |
| IPv4 Routed Forwarding | Longest Prefix Match (LPM) L3 IPv4 forwarding of Ethernet frames between two Interfaces, with two /8 prefixes in lookup table. |

## 3.2 Release Notes

## **3.2.1 Changes in CSIT-2001**

### 1. DPDK PERFORMANCE TESTS

- Intel Xeon 2n-skx, 3n-skx testbeds: Testpmd and L3fwd performance test data is not included in this report version. This is due to the lower performance and behaviour inconsistency of these systems following the upgrade of processor microcode packages (skx ucode 0x2000064) as part of updating the Ubuntu 18.04 LTS kernel version. Tested VPP and DPDK applications (L3fwd) are affected. Skx test data will be added in subsequent maintenance report version(s) once the issue is resolved. See *Known Issues* (page 569).
- Intel Xeon 2n-clx testbeds: DPDK performance test data is now included in this report, after resolving the issue of lower performance and behaviour inconsistency of these systems due to the Linux kernel driven upgrade of processor microcode packages to 0x500002c. The resolution is to use the latest SuperMicro BIOS 3.2 (for the X11DPG-QT motherboards used), which upgrades processor microcode to 0x500002c, and not the kernel-provided ucode package, as the latter puts the system into a sub-optimal state.

### 2. DPDK RELEASE VERSION CHANGE


<sup>144</sup> https://tools.ietf.org/html/rfc2544.html

• CSIT-2001 tested DPDK-19.08, as used by VPP-20.01 release.

### 3. TEST ENVIRONMENT

- TRex Fortville NIC Performance: Received FVL fix from Intel resolving TRex low throughput issue. TRex per FVL NIC throughput increased from ~27 Mpps to the nominal ~37 Mpps. For details see CSIT-1503<sup>145</sup> and TRex-519<sup>146</sup>.
- New Intel Xeon Cascadelake Testbeds: Added performance tests for 2-Node-Cascadelake (2n-clx) testbeds with x710, xxv710 and cx556a-edat NIC cards.

## 3.2.2 Known Issues

List of known issues in CSIT-2001 for DPDK performance tests:

| # | JiraID                  | Issue Description |
|---|-------------------------|-------------------|
| 1 | CSIT-1675<sup>147</sup> | Intel Xeon 2n-skx, 3n-skx and 2n-clx testbeds behaviour and performance became inconsistent following the upgrade to the latest Ubuntu 18.04 LTS kernel version (4.15.0-72-generic) and associated microcode packages (skx ucode 0x2000064, clx ucode 0x500002c). VPP as well as DPDK L3fwd tests are affected. |

<sup>145</sup> https://jira.fd.io/browse/CSIT-1503

<sup>146</sup> https://trex-tgn.cisco.com/youtrack/issue/trex-519

<sup>147</sup> https://jira.fd.io/browse/CSIT-1675

# 3.3 Packet Throughput

Throughput graphs are generated by multiple executions of the same performance tests across physical testbeds hosted in LF FD.io labs: 3n-hsw, 2n-skx, 3n-skx, 2n-clx. Box-and-Whisker plots are used to display variations in measured throughput values, without making any assumptions about the underlying statistical distribution.

For each test case, Box-and-Whisker plots show the quartiles (Min, 1st quartile / 25th percentile, 2nd quartile / 50th percentile / median, 3rd quartile / 75th percentile, Max) across the collected data set. Outliers are plotted as individual points.

Additional information about graph data:

1. **Graph Title**: describes tested packet path, testbed topology, processor model, NIC model, packet size, number of cores and threads used by data plane workers and indication of DPDK DUT configuration.
2. **X-axis Labels**: indices of individual test suites as listed in Graph Legend.
3. **Y-axis Labels**: measured Packets Per Second [pps] throughput values.
4. **Graph Legend**: lists X-axis indices with associated CSIT test suites executed to generate graphed test results.
5. **Hover Information**: lists minimum, first quartile, median, third quartile, and maximum. If either type of outlier is present, the whisker on the appropriate side is taken to 1.5×IQR from the quartile (the "inner fence") rather than the max or min, and individual outlying data points are displayed as unfilled circles (for suspected outliers) or filled circles (for outliers). (The "outer fence" is 3×IQR from the quartile.)
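The fence rules in the hover information can be reproduced with the standard library. This is a sketch, not the plotting code used for the report: the helper names are hypothetical, and the `inclusive` quartile method is an assumption, since plotting libraries differ in how they interpolate quartiles.

```python
from statistics import quantiles


def box_plot_fences(values):
    """Compute quartiles plus inner (1.5 x IQR) and outer (3 x IQR) fences."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "inner": (q1 - 1.5 * iqr, q3 + 1.5 * iqr),
        "outer": (q1 - 3 * iqr, q3 + 3 * iqr),
    }


def classify(value, fences):
    """Label a data point relative to the inner and outer fences."""
    low_outer, high_outer = fences["outer"]
    low_inner, high_inner = fences["inner"]
    if value < low_outer or value > high_outer:
        return "outlier"
    if value < low_inner or value > high_inner:
        return "suspected outlier"
    return "inside fences"


fences = box_plot_fences([1, 2, 3, 4, 5, 6, 7, 8, 9])
```

Points beyond the inner fence but within the outer fence are the suspected outliers (unfilled circles); points beyond the outer fence are the outliers (filled circles).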

**Note:** Test results have been generated by FD.io test executor dpdk performance job 2n-skx<sup>148</sup>, FD.io test executor dpdk performance job 3n-skx<sup>149</sup>, FD.io test executor dpdk performance job 2n-clx<sup>150</sup>, FD.io test executor dpdk performance job 3n-hsw<sup>151</sup>, FD.io test executor dpdk performance job 3n-tsh and FD.io test executor dpdk performance job 3n-dnv with RF result files csit-dpdk-perf-2001-\*.zip archived here. The required per test case data set size is **10**, and for DPDK tests this is the actual size, as all scheduled test executions completed successfully.

<sup>148</sup> https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-verify-2001-2n-skx

<sup>149</sup> https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-verify-2001-3n-skx

<sup>150</sup> https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-verify-2001-2n-clx

<sup>151</sup> https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-verify-2001-3n-hsw

# 3.3.1 3n-hsw-xl710

Following sections include summary graphs of Phy-to-Phy performance with packet routed forwarding, including NDR throughput (zero packet loss) and PDR throughput (<0.5% packet loss).

CSIT source code for the test cases used for plots can be found in CSIT git repository<sup>152</sup>.

<sup>152</sup> https://git.fd.io/csit/tree/tests/dpdk/perf?h=rls2001

### 64b-1t1c-base









## 3.3.2 3n-hsw-x710

Following sections include summary graphs of Phy-to-Phy performance with packet routed forwarding, including NDR throughput (zero packet loss) and PDR throughput (<0.5% packet loss).

CSIT source code for the test cases used for plots can be found in CSIT git repository<sup>153</sup>.

<sup>153</sup> https://git.fd.io/csit/tree/tests/dpdk/perf?h=rls2001









## 3.3.3 2n-dnv-x553

Following sections include summary graphs of Phy-to-Phy performance with packet routed forwarding, including NDR throughput (zero packet loss) and PDR throughput (<0.5% packet loss).

CSIT source code for the test cases used for plots can be found in CSIT git repository<sup>154</sup>.

<sup>154</sup> https://git.fd.io/csit/tree/tests/dpdk/perf?h=rls2001









## 3.3.4 3n-dnv-x553

Following sections include summary graphs of Phy-to-Phy performance with packet routed forwarding, including NDR throughput (zero packet loss) and PDR throughput (<0.5% packet loss).

CSIT source code for the test cases used for plots can be found in CSIT git repository<sup>155</sup>.

<sup>155</sup> https://git.fd.io/csit/tree/tests/dpdk/perf?h=rls2001









## 3.3.5 3n-tsh-x520

Following sections include summary graphs of Phy-to-Phy performance with packet routed forwarding, including NDR throughput (zero packet loss) and PDR throughput (<0.5% packet loss).

CSIT source code for the test cases used for plots can be found in CSIT git repository<sup>156</sup>.

<sup>156</sup> https://git.fd.io/csit/tree/tests/dpdk/perf?h=rls2001









## 3.3.6 2n-clx-xxv710

Following sections include summary graphs of Phy-to-Phy performance with packet routed forwarding, including NDR throughput (zero packet loss) and PDR throughput (<0.5% packet loss).

CSIT source code for the test cases used for plots can be found in CSIT git repository<sup>157</sup>.

<sup>157</sup> https://git.fd.io/csit/tree/tests/dpdk/perf?h=rls2001









## 3.3.7 2n-clx-x710

Following sections include summary graphs of Phy-to-Phy performance with packet routed forwarding, including NDR throughput (zero packet loss) and PDR throughput (<0.5% packet loss).

CSIT source code for the test cases used for plots can be found in CSIT git repository<sup>158</sup>.

<sup>158</sup> https://git.fd.io/csit/tree/tests/dpdk/perf?h=rls2001









# 3.4 Packet Latency

DPDK Testpmd and L3fwd latency results are generated based on the test data obtained from CSIT-2001 NDR-PDR throughput tests executed across physical testbeds hosted in LF FD.io labs: 3n-hsw, 3n-skx, 2n-skx, 2n-clx, 3n-dnv, 2n-dnv, 3n-tsh.

Latency by percentile distribution plots are used to show packet latency percentiles at different packet rate load levels: i) No-Load latency streams only, ii) Low-Load at 10% PDR, iii) Mid-Load at 50% PDR and iv) High-Load at 90% PDR.
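The four load levels translate into offered packet rates as fractions of the discovered PDR throughput. The sketch below only illustrates that arithmetic; the names `LOAD_LEVELS` and `offered_loads` and the 10 Mpps PDR value are hypothetical.

```python
# Fraction of PDR throughput offered as background load at each level.
# No-Load carries the latency streams only, hence zero background load.
LOAD_LEVELS = {
    "No-Load": 0.0,
    "Low-Load": 0.10,
    "Mid-Load": 0.50,
    "High-Load": 0.90,
}


def offered_loads(pdr_pps):
    """Return the offered packet rate for each latency load level."""
    return {name: fraction * pdr_pps for name, fraction in LOAD_LEVELS.items()}


# Hypothetical example: a test case with a discovered PDR of 10 Mpps.
loads = offered_loads(10e6)
```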

Additional information about graph data:

1. **Graph Title**: describes tested DUT packet path.
2. **X-axis Labels**: percentile of packets.
3. **Y-axis Labels**: measured one-way packet latency values in [uSec].
4. **Graph Legend**: list of latency tests at different packet rate load levels.
5. **Hover Information**: packet rate load level, stream direction (East-West, West-East), percentile, one-way latency.

**Note:** Test results have been generated by FD.io test executor dpdk performance job 3n-hsw<sup>159</sup> and FD.io test executor dpdk performance job 3n-tsh with RF result files csit-dpdk-perf-2001-\*.zip archived here.


<sup>159</sup> https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-verify-2001-3n-hsw

## 3.4.1 3n-hsw-xl710

CSIT source code for the test cases used for plots can be found in CSIT git repository<sup>160</sup>.

<sup>160</sup> https://git.fd.io/csit/tree/tests/dpdk/perf?h=rls2001






# 3.4.2 3n-tsh-x520

CSIT source code for the test cases used for plots can be found in CSIT git repository<sup>161</sup>.


<sup>161</sup> https://git.fd.io/csit/tree/tests/dpdk/perf?h=rls2001

### 64b-1t1c-base






# 3.4.3 2n-clx-xxv710

CSIT source code for the test cases used for plots can be found in CSIT git repository<sup>162</sup>.

<sup>162</sup> https://git.fd.io/csit/tree/tests/dpdk/perf?h=rls2001

### 64b-2t1c-base






# 3.5 Comparisons

#### 3.5.1 Current vs. Previous Release

Relative comparison of DPDK Testpmd and L3fwd packet throughput (NDR, PDR and MRR) between DPDK-19.08 and DPDK-19.05 (measured for CSIT-2001 and CSIT-1908 respectively) is calculated from results of tests running on 3-Node Intel Xeon Haswell testbeds (3n-hsw) in 1-core and 2-core configurations.

Listed mean and standard deviation values are computed based on a series of the same tests executed against respective DPDK releases to verify test results repeatability, with percentage change calculated for mean values.
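The comparison described above reduces to a mean, a standard deviation per release and a relative change of the means. The following is a sketch only: the helper name `compare`, the use of the sample standard deviation and the sample values are assumptions, not the report's tooling or data.

```python
from statistics import mean, stdev


def compare(previous, current):
    """Summarise two result series and the relative change of their means."""
    previous_mean = mean(previous)
    current_mean = mean(current)
    return {
        "previous_mean": previous_mean,
        "previous_stdev": stdev(previous),
        "current_mean": current_mean,
        "current_stdev": stdev(current),
        # Percentage change of the current mean relative to the previous one.
        "change_pct": (current_mean - previous_mean) / previous_mean * 100,
    }


# Hypothetical throughput samples in Mpps for two releases.
result = compare([10.0, 10.0, 10.0], [11.0, 11.0, 11.0])
```

A positive `change_pct` indicates higher throughput in the current release; the standard deviations show how repeatable each series is.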

**Note:** Test results have been generated by FD.io test executor dpdk performance job 3n-hsw<sup>163</sup> with RF result files csit-dpdk-perf-2001-\*.zip archived here.

### 3n-hsw

#### **NDR Comparison**

Comparison tables in ASCII and CSV formats:

- ASCII 1t1c NDR comparison
- ASCII 2t2c NDR comparison
- CSV 1t1c NDR comparison
- CSV 2t2c NDR comparison

#### **PDR Comparison**

Comparison tables in ASCII and CSV formats:

- ASCII 1t1c PDR comparison
- ASCII 2t2c PDR comparison
- CSV 1t1c PDR comparison
- CSV 2t2c PDR comparison

# 3.6 Throughput Trending

In addition to reporting throughput comparison between DPDK releases, CSIT provides regular performance trending for DPDK release branches:

1. Performance Dashboard<sup>164</sup>: per DPDK test case throughput trend, trend compliance and summary of detected anomalies.
2. Trending Methodology<sup>165</sup>: throughput test metrics, trend calculations and anomaly classification (progression, regression).
3. DPDK Trendline Graphs<sup>166</sup>: weekly DPDK Testpmd and L3fwd MRR throughput measurements against the trendline with anomaly highlights and associated CSIT test jobs.

<sup>163</sup> https://jenkins.fd.io/view/csit/job/csit-dpdk-perf-verify-2001-3n-hsw

<sup>164</sup> https://docs.fd.io/csit/master/trending/introduction/index.html

<sup>165</sup> https://docs.fd.io/csit/master/trending/methodology/index.html

<sup>166</sup> https://docs.fd.io/csit/master/trending/trending/dpdk.html

### 3.7 Test Environment

### 3.7.1 Physical Testbeds

FD.io CSIT performance tests are executed in physical testbeds hosted by LF for FD.io project. Two physical testbed topology types are used:

- **3-Node Topology**: Consisting of two servers acting as SUTs (Systems Under Test) and one server as TG (Traffic Generator), all connected in ring topology.
- **2-Node Topology**: Consisting of one server acting as SUT and one server as TG, both connected in ring topology.

Tested SUT servers are based on a range of processors including Intel Xeon Haswell-SP, Intel Xeon Skylake-SP, Intel Xeon Cascade Lake-SP, Arm, and Intel Atom. A more detailed description is provided in *Physical Testbeds* (page 5). Tested logical topologies are described in *Logical Topologies* (page 38).

### 3.7.2 Server Specifications

Complete technical specifications of compute servers used in CSIT physical testbeds are maintained in the FD.io CSIT repository: FD.io CSIT testbeds - Xeon Cascade Lake<sup>167</sup>, FD.io CSIT testbeds - Xeon Skylake, Arm, Atom<sup>168</sup> and FD.io CSIT Testbeds - Xeon Haswell<sup>169</sup>.

### 3.7.3 Pre-Test Server Calibration

A number of SUT server sub-system runtime parameters have been identified as impacting data plane performance tests. Calibrating those parameters is part of FD.io CSIT pre-test activities, and includes measuring and reporting the following:

1. System-level core jitter: measure the duration of core interrupts by Linux in clock cycles and how often interrupts happen, using the CPU core jitter tool<sup>170</sup>.
2. Memory bandwidth: measure bandwidth with the Intel MLC tool<sup>171</sup>.
3. Memory latency: measure memory latency with the Intel MLC tool.
4. Cache latency at all levels (L1, L2, and Last Level Cache): measure cache latency with the Intel MLC tool.

Measured values of the listed parameters are especially important for repeatable zero packet loss throughput measurements across multiple system instances. More generally, they are useful as background data for comparing data plane performance results across disparate servers.

The following sections include measured calibration data for the testbeds.

### 3.7.4 Calibration Data - Skylake

The following sections include sample calibration data measured on the s11-t31-sut1 server running in one of the Intel Xeon Skylake testbeds, as specified in FD.io CSIT testbeds - Xeon Skylake, Arm, Atom<sup>172</sup>.

Calibration data obtained from all other servers in Skylake testbeds shows the same or similar values.

<sup>167</sup> https://git.fd.io/csit/tree/docs/lab/testbeds\_sm\_clx\_hw\_bios\_cfg.md?h=rls2001

<sup>168</sup> https://git.fd.io/csit/tree/docs/lab/testbeds\_sm\_skx\_hw\_bios\_cfg.md?h=rls2001

<sup>169</sup> https://git.fd.io/csit/tree/docs/lab/testbeds\_ucs\_hsw\_hw\_bios\_cfg.md?h=rls2001

<sup>170</sup> https://git.fd.io/pma\_tools/tree/jitter

<sup>171</sup> https://software.intel.com/en-us/articles/intelr-memory-latency-checker

<sup>172</sup> https://git.fd.io/csit/tree/docs/lab/testbeds\_sm\_skx\_hw\_bios\_cfg.md?h=rls2001

#### Linux cmdline

```
$ cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-4.15.0-72-generic root=UUID=e05120bb-7127-43db-b1e3-a66edd4c43bd ro isolcpus=1-27,29-55,57-83,85-111 nohz_full=1-27,29-55,57-83,85-111 rcu_nocbs=1-27,29-55,57-83,85-111 numa_balancing=disable intel_pstate=disable intel_iommu=on iommu=pt nmi_watchdog=0 audit=0 nosoftlockup processor.max_cstate=1 intel_idle.max_cstate=1 hpet=disable tsc=reliable mce=off console=tty0 console=ttyS0,115200n8
```
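The CPU lists on this command line are easier to reason about once expanded. The following is a minimal Python sketch, not part of CSIT, that parses an abbreviated copy of the command line and lists which logical CPUs are isolated and which are left for housekeeping. The helper names and the total of 112 logical CPUs are assumptions inferred from the highest CPU id in the list.

```python
def expand_cpu_list(spec):
    """Expand a kernel CPU list such as '1-27,29-55' into individual CPU ids."""
    cpus = []
    for part in spec.split(","):
        if "-" in part:
            first, last = part.split("-")
            cpus.extend(range(int(first), int(last) + 1))
        else:
            cpus.append(int(part))
    return cpus


def parse_cmdline(cmdline):
    """Split a kernel command line into key/value pairs; bare flags map to ''."""
    params = {}
    for token in cmdline.split():
        key, _, value = token.partition("=")
        params[key] = value
    return params


# Abbreviated copy of the Skylake command line shown above.
CMDLINE = (
    "ro isolcpus=1-27,29-55,57-83,85-111 nohz_full=1-27,29-55,57-83,85-111 "
    "rcu_nocbs=1-27,29-55,57-83,85-111 intel_pstate=disable nosoftlockup"
)

TOTAL_CPUS = 112  # assumption: inferred from the highest CPU id (111) in the list

params = parse_cmdline(CMDLINE)
isolated = expand_cpu_list(params["isolcpus"])
housekeeping = sorted(set(range(TOTAL_CPUS)) - set(isolated))
print(f"isolated: {len(isolated)} CPUs, housekeeping: {housekeeping}")
```

Under these assumptions, the CPUs not covered by `isolcpus` are the ones left to the kernel scheduler and general system work.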

#### Linux uname

#### **System-level Core Jitter**

```
$ sudo taskset -c 3 /home/testuser/pma_tools/jitter/jitter -i 20
Linux Jitter testing program version 1.8
Iterations=20
The pragram will execute a dummy function 80000 times
Display is updated every 20000 displayUpdate intervals
Timings are in CPU Core cycles
Inst_Min:    Minimum Excution time during the display update interval(default is ~1 second)
Inst_Max:    Maximum Excution time during the display update interval(default is ~1 second)
Inst_jitter: Jitter in the Excution time during rhe display update interval. This is the value of interest
last_Exec:   The Excution time of last iteration just before the display update
Abs_Min:     Absolute Minimum Excution time since the program started or statistics were reset
Abs_Max:     Absolute Maximum Excution time since the program started or statistics were reset
tmp:         Cumulative value calcualted by the dummy function
Interval:    Time interval between the display updates in Core Cycles
Sample No:   Sample number

   Inst_Min   Inst_Max   Inst_jitter last_Exec  Abs_min    Abs_max      tmp       Interval     Sample No
    160022     171330      11308      160022     160022     171330   2538733568 3204142750          1
    160022     167294       7272      160026     160022     171330    328335360 3203873548          2
    160022     167560       7538      160026     160022     171330   2412904448 3203878736          3
    160022     169000       8978      160024     160022     171330    202506240 3203864588          4
    160022     166572       6550      160026     160022     171330   2287075328 3203866224          5
    160022     167460       7438      160026     160022     171330     76677120 3203854632          6
    160022     168134       8112      160024     160022     171330   2161246208 3203874674          7
    160022     169094       9072      160022     160022     171330   4245815296 3203878798          8
    160022     172460      12438      160024     160022     172460   2035417088 3204112010          9
    160022     167862       7840      160030     160022     172460   4119986176 3203856800         10
    160022     168398       8376      160024     160022     172460   1909587968 3203854192         11
    160022     167548       7526      160024     160022     172460   3994157056 3203847442         12
    160022     167562       7540      160026     160022     172460   1783758848 3203862936         13
    160022     167604       7582      160024     160022     172460   3868327936 3203859346         14
    160022     168262       8240      160024     160022     172460   1657929728 3203851120         15
    160022     169700       9678      160024     160022     172460   3742498816 3203877690         16
    160022     170476      10454      160026     160022     172460   1532100608 3204088480         17
    160022     167798       7776      160024     160022     172460   3616669696 3203862072         18
    160022     166540       6518      160024     160022     172460   1406271488 3203836904         19
    160022     167516       7494      160024     160022     172460   3490840576 3203848120         20
```
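The jitter tool describes `Inst_jitter` as the value of interest, and in every sample above it equals `Inst_Max - Inst_Min`. The following is a minimal Python sketch, not part of CSIT, that parses a few of the Skylake samples, checks that relation, and reports the worst and mean jitter. The function name and the underscore in `Sample_No` are illustrative, and the conversion to microseconds assumes the cycles tick at the nominal 2.5 GHz reported for this CPU.

```python
# Column order follows the jitter tool header shown above.
COLUMNS = ["Inst_Min", "Inst_Max", "Inst_jitter", "last_Exec",
           "Abs_min", "Abs_max", "tmp", "Interval", "Sample_No"]

# First three samples from the Skylake run above.
SAMPLES = """\
160022 171330 11308 160022 160022 171330 2538733568 3204142750 1
160022 167294 7272 160026 160022 171330 328335360 3203873548 2
160022 167560 7538 160026 160022 171330 2412904448 3203878736 3
"""


def parse_samples(text):
    """Parse whitespace- or comma-separated sample rows into dicts of ints."""
    rows = []
    for line in text.splitlines():
        fields = line.replace(",", " ").split()
        if len(fields) == len(COLUMNS):
            rows.append(dict(zip(COLUMNS, map(int, fields))))
    return rows


rows = parse_samples(SAMPLES)

# Sanity check: the reported jitter is the spread between max and min.
for row in rows:
    assert row["Inst_jitter"] == row["Inst_Max"] - row["Inst_Min"]

worst = max(row["Inst_jitter"] for row in rows)
mean = sum(row["Inst_jitter"] for row in rows) / len(rows)

CLOCK_GHZ = 2.5  # assumption: cycles tick at the nominal CPU frequency
print(f"worst jitter: {worst} cycles ({worst / (CLOCK_GHZ * 1000):.2f} us)")
print(f"mean jitter:  {mean:.1f} cycles")
```

Because the parser accepts commas as well as whitespace, the same sketch should also handle the comma-separated format printed by version 1.9 of the tool.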

#### **Memory Bandwidth**

```
$ sudo /home/testuser/mlc --bandwidth_matrix
Intel(R) Memory Latency Checker - v3.5
Command line parameters: --bandwidth_matrix
Using buffer size of 100.000MB/thread for reads and an additional 100.000MB/thread for writes
Measuring Memory Bandwidths between nodes within system
Bandwidths are in MB/sec (1 MB/sec = 1,000,000 Bytes/sec)
Using all the threads from each core if Hyper-threading is enabled
Using Read-only traffic type
                Numa node
Numa node            0         1
       0        107947.7   50951.5
       1         50834.6  108183.4
```
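The matrix shows that reads from the local NUMA node are roughly twice as fast as reads from the remote node. The following is a minimal Python sketch, not part of CSIT, that computes the ratio per node from the Skylake values above. The function name is illustrative, and reading rows as the requesting node and columns as the memory node is an assumption about the MLC layout.

```python
# Read bandwidth in MB/sec, indexed [requesting node][memory node] (assumed layout);
# values copied from the Skylake bandwidth matrix above.
BANDWIDTH = [
    [107947.7, 50951.5],
    [50834.6, 108183.4],
]


def local_to_remote_ratio(matrix):
    """Return the local/remote bandwidth ratio for each NUMA node."""
    ratios = {}
    for node, row in enumerate(matrix):
        local = row[node]
        remote = [bw for other, bw in enumerate(row) if other != node]
        ratios[node] = local / (sum(remote) / len(remote))
    return ratios


ratios = local_to_remote_ratio(BANDWIDTH)
for node, ratio in ratios.items():
    print(f"node {node}: local bandwidth is {ratio:.2f}x remote")
```

A ratio well above 1 is one reason data plane workloads are usually kept on the same NUMA node as the memory and devices they use.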

```
$ sudo /home/testuser/mlc --peak_injection_bandwidth
Intel(R) Memory Latency Checker - v3.5
Command line parameters: --peak_injection_bandwidth
Using buffer size of 100.000MB/thread for reads and an additional 100.000MB/thread for writes
Measuring Peak Injection Memory Bandwidths for the system
Bandwidths are in MB/sec (1 MB/sec = 1,000,000 Bytes/sec)
Using all the threads from each core if Hyper-threading is enabled
Using traffic with the following read-write ratios
ALL Reads        :      215733.9
3:1 Reads-Writes :      182141.9
2:1 Reads-Writes :      178615.7
1:1 Reads-Writes :      149911.3
Stream-triad like:      159533.6
```

```
$ sudo /home/testuser/mlc --max_bandwidth
Intel(R) Memory Latency Checker - v3.5
Command line parameters: --max_bandwidth
Using buffer size of 100.000MB/thread for reads and an additional 100.000MB/thread for writes
Measuring Maximum Memory Bandwidths for the system
Will take several minutes to complete as multiple injection rates will be tried to get the best bandwidth
Bandwidths are in MB/sec (1 MB/sec = 1,000,000 Bytes/sec)
Using all the threads from each core if Hyper-threading is enabled
Using traffic with the following read-write ratios
ALL Reads        :      216875.73
3:1 Reads-Writes :      182615.14
2:1 Reads-Writes :      178745.67
1:1 Reads-Writes :      149485.27
Stream-triad like:      180057.87
```

#### **Memory Latency**

```
$ sudo /home/testuser/mlc --idle_latency
Intel(R) Memory Latency Checker - v3.5
Command line parameters: --idle_latency
Using buffer size of 2000.000MB
Each iteration took 202.0 core clocks ( 80.8 ns)
```
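MLC reports the idle latency in both core clocks and nanoseconds, so the two figures together imply the core clock frequency during the measurement. The following is a minimal Python sketch of that arithmetic using the Skylake values above; the function names are illustrative and not part of MLC or CSIT.

```python
def implied_core_clock_ghz(core_clocks, nanoseconds):
    """Clock frequency implied by one iteration reported in both clocks and ns."""
    return core_clocks / nanoseconds  # clocks per nanosecond equals GHz


def clocks_to_ns(core_clocks, clock_ghz):
    """Convert a duration in core clocks to nanoseconds at the given frequency."""
    return core_clocks / clock_ghz


# Values from the Skylake --idle_latency output above.
ghz = implied_core_clock_ghz(202.0, 80.8)
print(f"implied core clock: {ghz:.2f} GHz")
print(f"202.0 clocks at that clock: {clocks_to_ns(202.0, ghz):.1f} ns")
```

The implied 2.5 GHz matches the nominal frequency of the Xeon Platinum 8180 reported later in this section, and the same conversion can be applied to cycle counts from the jitter tool under the assumption that it counts at the same rate.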

```
$ sudo /home/testuser/mlc --loaded_latency
Intel(R) Memory Latency Checker - v3.5
Command line parameters: --loaded_latency
Using buffer size of 100.000MB/thread for reads and an additional 100.000MB/thread for writes
Measuring Loaded Latencies for the system
Using all the threads from each core if Hyper-threading is enabled
Using Read-only traffic type
Inject  Latency  Bandwidth
Delay   (ns)     MB/sec
 00000  282.66   215712.8
 00002  282.14   215757.4
 00008  280.21   215868.1
 00015  279.20   216313.2
 00050  275.25   216643.0
 00100  227.05   215075.0
 00200  121.92   160242.9
 00300  101.21   111587.4
 00400   95.48    85019.7
 00500   94.46    68717.3
 00700   92.27    49742.2
 01000   91.03    35264.8
 01300   90.11    27396.3
 01700   89.34    21178.7
 02500   90.15    14672.8
 03500   89.00    10715.7
 05000   82.00     7788.2
 09000   81.46     4684.0
 20000   81.40     2541.9
```
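The loaded-latency sweep shows latency rising sharply once bandwidth demand approaches the peak. One way to summarize such a sweep is to ask how much bandwidth is available while latency stays within a chosen budget. The following is a minimal Python sketch, not part of CSIT, using the Skylake values above; the function name and the budget of 1.5 times the idle latency are arbitrary assumptions for illustration.

```python
# (inject delay, latency in ns, bandwidth in MB/sec) from the Skylake run above.
LOADED = [
    (0, 282.66, 215712.8), (2, 282.14, 215757.4), (8, 280.21, 215868.1),
    (15, 279.20, 216313.2), (50, 275.25, 216643.0), (100, 227.05, 215075.0),
    (200, 121.92, 160242.9), (300, 101.21, 111587.4), (400, 95.48, 85019.7),
    (500, 94.46, 68717.3), (700, 92.27, 49742.2), (1000, 91.03, 35264.8),
    (1300, 90.11, 27396.3), (1700, 89.34, 21178.7), (2500, 90.15, 14672.8),
    (3500, 89.00, 10715.7), (5000, 82.00, 7788.2), (9000, 81.46, 4684.0),
    (20000, 81.40, 2541.9),
]

IDLE_NS = 80.8        # from the --idle_latency output above
BUDGET_FACTOR = 1.5   # assumption: arbitrary latency budget for illustration


def max_bandwidth_within(samples, latency_budget_ns):
    """Return the sample with the highest bandwidth whose latency fits the budget."""
    eligible = [s for s in samples if s[1] <= latency_budget_ns]
    return max(eligible, key=lambda s: s[2]) if eligible else None


best = max_bandwidth_within(LOADED, BUDGET_FACTOR * IDLE_NS)
if best is not None:
    delay, latency, bandwidth = best
    print(f"inject delay {delay}: {bandwidth} MB/sec at {latency} ns")
```

Changing `BUDGET_FACTOR` moves the operating point along the curve, which makes the sketch usable for comparing servers at an equal latency budget.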

#### L1/L2/LLC Latency

```
$ sudo /home/testuser/mlc --c2c_latency
Intel(R) Memory Latency Checker - v3.5
Command line parameters: --c2c_latency
Measuring cache-to-cache transfer latency (in ns)...
Local Socket L2->L2 HIT  latency    53.7
Local Socket L2->L2 HITM latency    53.7
Remote Socket L2->L2 HITM latency (data address homed in writer socket)
                  Reader Numa Node
Writer Numa Node      0       1
            0         -   113.9
            1     113.9       -
Remote Socket L2->L2 HITM latency (data address homed in reader socket)
                  Reader Numa Node
Writer Numa Node      0       1
            0         -   177.9
            1     177.6       -
```

#### **Spectre and Meltdown Checks**

The following section displays the output of a shell script that checks whether the system is vulnerable to the several "speculative execution" CVEs made public since 2018. The script is available on Spectre & Meltdown Checker Github<sup>173</sup>.

```
Spectre and Meltdown mitigation detection tool v0.43
awk: cannot open bash (No such file or directory)
Checking for vulnerabilities on current system
Kernel is Linux 4.15.0-72-generic #81-Ubuntu SMP Tue Nov 26 12:20:02 UTC 2019 x86_64
CPU is Intel(R) Xeon(R) Platinum 8180 CPU @ 2.50GHz
Hardware check
* Hardware support (CPU microcode) for mitigation techniques
* Indirect Branch Restricted Speculation (IBRS)
   * SPEC_CTRL MSR is available: YES
   * CPU indicates IBRS capability: YES (SPEC_CTRL feature bit)
* Indirect Branch Prediction Barrier (IBPB)
  * PRED_CMD MSR is available: YES
  * CPU indicates IBPB capability: YES (SPEC_CTRL feature bit)
* Single Thread Indirect Branch Predictors (STIBP)
  * SPEC_CTRL MSR is available: YES
   * CPU indicates STIBP capability: YES (Intel STIBP feature bit)
 * Speculative Store Bypass Disable (SSBD)
  * CPU indicates SSBD capability: YES (Intel SSBD)
* L1 data cache invalidation
  * FLUSH_CMD MSR is available: YES
   * CPU indicates L1D flush capability: YES (L1D flush feature bit)
 * Microarchitectural Data Sampling
   * VERW instruction is available: YES (MD_CLEAR feature bit)
```

<sup>173</sup> https://github.com/speed47/spectre-meltdown-checker

```
* Enhanced IBRS (IBRS_ALL)
  * CPU indicates ARCH_CAPABILITIES MSR availability: NO
   * ARCH_CAPABILITIES MSR advertises IBRS_ALL capability: NO
* CPU explicitly indicates not being vulnerable to Meltdown/L1TF (RDCL_NO): NO
* CPU explicitly indicates not being vulnerable to Variant 4 (SSB_NO): NO
* CPU/Hypervisor indicates L1D flushing is not necessary on this system: NO
* Hypervisor indicates host CPU might be vulnerable to RSB underflow (RSBA): NO
* CPU explicitly indicates not being vulnerable to Microarchitectural Data Sampling (MDS_NO): NO
* CPU explicitly indicates not being vulnerable to TSX Asynchronous Abort (TAA_NO): NO
* CPU explicitly indicates not being vulnerable to iTLB Multihit (PSCHANGE_MSC_NO): NO
* CPU explicitly indicates having MSR for TSX control (TSX_CTRL_MSR): NO
* CPU supports Transactional Synchronization Extensions (TSX): YES (RTM feature bit)
* CPU supports Software Guard Extensions (SGX): NO
* CPU microcode is known to cause stability problems: NO (model 0x55 family 0x6 stepping 0x4 ucode 0x2000064 cpuid 0x50654)
* CPU microcode is the latest known available version: awk: cannot open bash (No such file or directory)
UNKNOWN (latest microcode version for your CPU model is unknown)
* CPU vulnerability to the speculative execution attack variants
* Vulnerable to CVE-2017-5753 (Spectre Variant 1, bounds check bypass): YES
* Vulnerable to CVE-2017-5715 (Spectre Variant 2, branch target injection): YES
* Vulnerable to CVE-2017-5754 (Variant 3, Meltdown, rogue data cache load): YES
* Vulnerable to CVE-2018-3640 (Variant 3a, rogue system register read): YES
* Vulnerable to CVE-2018-3639 (Variant 4, speculative store bypass): YES
* Vulnerable to CVE-2018-3615 (Foreshadow (SGX), L1 terminal fault): NO
* Vulnerable to CVE-2018-3620 (Foreshadow-NG (OS), L1 terminal fault): YES
* Vulnerable to CVE-2018-3646 (Foreshadow-NG (VMM), L1 terminal fault): YES
* Vulnerable to CVE-2018-12126 (Fallout, microarchitectural store buffer data sampling (MSBDS)):
* Vulnerable to CVE-2018-12130 (ZombieLoad, microarchitectural fill buffer data sampling (MFBDS)):_
* Vulnerable to CVE-2018-12127 (RIDL, microarchitectural load port data sampling (MLPDS)): YES
* Vulnerable to CVE-2019-11091 (RIDL, microarchitectural data sampling uncacheable memory (MDSUM)): YES
* Vulnerable to CVE-2019-11135 (ZombieLoad V2, TSX Asynchronous Abort (TAA)): YES
* Vulnerable to CVE-2018-12207 (No eXcuses, iTLB Multihit, machine check exception on page size_
CVE-2017-5753 aka Spectre Variant 1, bounds check bypass
* Mitigated according to the /sys interface: YES (Mitigation: usercopy/swapgs barriers and __user pointer sanitization)
* Kernel has array_index_mask_nospec: YES (1 occurrence(s) found of x86 64 bits array_index_mask_
* Kernel has the Red Hat/Ubuntu patch: NO
* Kernel has mask_nospec64 (arm64): NO
> STATUS: NOT VULNERABLE (Mitigation: usercopy/swapgs barriers and __user pointer sanitization)
CVE-2017-5715 aka Spectre Variant 2, branch target injection
* Mitigated according to the /sys interface: YES (Mitigation: Full generic retpoline, IBPB:_
* Mitigation 1
* Kernel is compiled with IBRS support: YES
  * IBRS enabled and active: YES (for firmware code only)
* Kernel is compiled with IBPB support: YES
   * IBPB enabled and active: YES
* Mitigation 2
* Kernel has branch predictor hardening (arm): NO
* Kernel compiled with retpoline option: YES
   * Kernel compiled with a retpoline-aware compiler: YES (kernel reports full retpoline_
* Kernel supports RSB filling: YES
```

```
> STATUS: NOT VULNERABLE (Full retpoline + IBPB are mitigating the vulnerability)
CVE-2017-5754 aka Variant 3, Meltdown, rogue data cache load
* Mitigated according to the /sys interface: YES (Mitigation: PTI)
* Kernel supports Page Table Isolation (PTI): YES
* PTI enabled and active: YES
* Reduced performance impact of PTI: YES (CPU supports INVPCID, performance impact of PTI will be greatly reduced)
* Running as a Xen PV DomU: NO
> STATUS: NOT VULNERABLE (Mitigation: PTI)
CVE-2018-3640 aka Variant 3a, rogue system register read
* CPU microcode mitigates the vulnerability: YES
> STATUS: NOT VULNERABLE (your CPU microcode mitigates the vulnerability)
CVE-2018-3639 aka Variant 4, speculative store bypass
* Mitigated according to the /sys interface: YES (Mitigation: Speculative Store Bypass disabled via prctl and seccomp)
* Kernel supports disabling speculative store bypass (SSB): YES (found in /proc/self/status)
* SSB mitigation is enabled and active: YES (per-thread through prctl)
* SSB mitigation currently active for selected processes: YES (systemd-journald systemd-logind_
> STATUS: NOT VULNERABLE (Mitigation: Speculative Store Bypass disabled via prctl and seccomp)
CVE-2018-3615 aka Foreshadow (SGX), L1 terminal fault
* CPU microcode mitigates the vulnerability: N/A
> STATUS: NOT VULNERABLE (your CPU vendor reported your CPU model as not vulnerable)
CVE-2018-3620 aka Foreshadow-NG (OS), L1 terminal fault
* Mitigated according to the /sys interface: YES (Mitigation: PTE Inversion; VMX: conditional cache flushes, SMT vulnerable)
* Kernel supports PTE inversion: YES (found in kernel image)
* PTE inversion enabled and active: YES
> STATUS: NOT VULNERABLE (Mitigation: PTE Inversion; VMX: conditional cache flushes, SMT vulnerable)
CVE-2018-3646 aka Foreshadow-NG (VMM), L1 terminal fault
* Information from the /sys interface: Mitigation: PTE Inversion; VMX: conditional cache flushes, SMT vulnerable
* This system is a host running a hypervisor: NO
* Mitigation 1 (KVM)
* EPT is disabled: NO
* Mitigation 2
* L1D flush is supported by kernel: YES (found flush_l1d in /proc/cpuinfo)
* L1D flush enabled: YES (conditional flushes)
* Hardware-backed L1D flush supported: YES (performance impact of the mitigation will be greatly reduced)
* Hyper-Threading (SMT) is enabled: YES
> STATUS: NOT VULNERABLE (this system is not running a hypervisor)
CVE-2018-12126 aka Fallout, microarchitectural store buffer data sampling (MSBDS)
* Mitigated according to the /sys interface: YES (Mitigation: Clear CPU buffers; SMT vulnerable)
* Kernel supports using MD_CLEAR mitigation: YES (md_clear found in /proc/cpuinfo)
* Kernel mitigation is enabled and active: YES
* SMT is either mitigated or disabled: NO
> STATUS: NOT VULNERABLE (Your microcode and kernel are both up to date for this mitigation, and mitigation is enabled)
CVE-2018-12130 aka ZombieLoad, microarchitectural fill buffer data sampling (MFBDS)
* Mitigated according to the /sys interface: YES (Mitigation: Clear CPU buffers; SMT vulnerable)
* Kernel supports using MD_CLEAR mitigation: YES (md_clear found in /proc/cpuinfo)
* Kernel mitigation is enabled and active: YES
```

```
* SMT is either mitigated or disabled: NO
> STATUS: NOT VULNERABLE (Your microcode and kernel are both up to date for this mitigation, and mitigation is enabled)
CVE-2018-12127 aka RIDL, microarchitectural load port data sampling (MLPDS)
* Mitigated according to the /sys interface: YES (Mitigation: Clear CPU buffers; SMT vulnerable)
* Kernel supports using MD_CLEAR mitigation: YES (md_clear found in /proc/cpuinfo)
* Kernel mitigation is enabled and active: YES
* SMT is either mitigated or disabled: NO
> STATUS: NOT VULNERABLE (Your microcode and kernel are both up to date for this mitigation, and mitigation is enabled)
CVE-2019-11091 aka RIDL, microarchitectural data sampling uncacheable memory (MDSUM)
* Mitigated according to the /sys interface: YES (Mitigation: Clear CPU buffers; SMT vulnerable)
* Kernel supports using MD_CLEAR mitigation: YES (md_clear found in /proc/cpuinfo)
* Kernel mitigation is enabled and active: YES
* SMT is either mitigated or disabled: NO
> STATUS: NOT VULNERABLE (Your microcode and kernel are both up to date for this mitigation, and mitigation is enabled)
CVE-2019-11135 aka ZombieLoad V2, TSX Asynchronous Abort (TAA)
* Mitigated according to the /sys interface: YES (Mitigation: Clear CPU buffers; SMT vulnerable)
* TAA mitigation is supported by kernel: YES (found tsx_async_abort in kernel image)
* TAA mitigation enabled and active: YES (Mitigation: Clear CPU buffers; SMT vulnerable)
> STATUS: NOT VULNERABLE (Mitigation: Clear CPU buffers; SMT vulnerable)
CVE-2018-12207 aka No eXcuses, iTLB Multihit, machine check exception on page size changes (MCEPSC)
* Mitigated according to the /sys interface: YES (KVM: Mitigation: Split huge pages)
* This system is a host running a hypervisor: NO
* iTLB Multihit mitigation is supported by kernel: YES (found itlb_multihit in kernel image)
* iTLB Multihit mitigation enabled and active: YES (KVM: Mitigation: Split huge pages)
> STATUS: NOT VULNERABLE (this system is not running a hypervisor)
> SUMMARY: CVE-2017-5753:OK CVE-2017-5715:OK CVE-2017-5754:OK CVE-2018-3640:OK CVE-2018-3639:OK CVE-2018-3615:OK CVE-2018-3620:OK CVE-2018-3646:OK CVE-2018-12126:OK CVE-2018-12130:OK CVE-2018-12127:OK CVE-2019-11091:OK CVE-2019-11135:OK CVE-2018-12207:OK
```
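The final `SUMMARY` line condenses the whole report into one status per CVE, which makes it convenient to check automatically. The following is a minimal Python sketch, not part of CSIT or of the checker script, that parses the summary line shown above and lists any CVE whose status is not `OK`. The function name is illustrative.

```python
import re

# Summary line copied from the Skylake checker output above.
SUMMARY = (
    "> SUMMARY: CVE-2017-5753:OK CVE-2017-5715:OK CVE-2017-5754:OK "
    "CVE-2018-3640:OK CVE-2018-3639:OK CVE-2018-3615:OK CVE-2018-3620:OK "
    "CVE-2018-3646:OK CVE-2018-12126:OK CVE-2018-12130:OK CVE-2018-12127:OK "
    "CVE-2019-11091:OK CVE-2019-11135:OK CVE-2018-12207:OK"
)


def parse_summary(line):
    """Map each CVE id in a summary line to its reported status string."""
    return dict(re.findall(r"(CVE-\d{4}-\d+):(\w+)", line))


status = parse_summary(SUMMARY)
not_ok = sorted(cve for cve, state in status.items() if state != "OK")
print(f"{len(status)} CVEs checked, {len(not_ok)} not OK: {not_ok}")
```

A check like this could be run against each testbed server to confirm that all of them report the same mitigation state before performance results are compared.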

### 3.7.5 Calibration Data - Cascade Lake

The following sections include sample calibration data measured on the s32-t27-sut1 server running in one of the Intel Xeon Cascade Lake testbeds, as specified in FD.io CSIT testbeds - Xeon Cascade Lake<sup>174</sup>.

Calibration data obtained from all other servers in Cascade Lake testbeds shows the same or similar values.

#### Linux cmdline

```
$ cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-4.15.0-72-generic root=UUID=1d03969e-a2a0-41b2-a97e-1cc171b07e88 ro isolcpus=1-23,25-47,49-71,73-95 nohz_full=1-23,25-47,49-71,73-95 rcu_nocbs=1-23,25-47,49-71,73-95 numa_balancing=disable intel_pstate=disable intel_iommu=on iommu=pt nmi_watchdog=0 audit=0 nosoftlockup processor.max_cstate=1 intel_idle.max_cstate=1 hpet=disable tsc=reliable mce=off console=tty0 console=ttyS0,115200n8
```

<sup>174</sup> https://git.fd.io/csit/tree/docs/lab/testbeds\_sm\_clx\_hw\_bios\_cfg.md?h=rls2001

#### Linux uname

```
$ uname -a
Linux s32-t27-sut1 4.15.0-72-generic #81-Ubuntu SMP Tue Nov 26 12:20:02 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
```

#### **System-level Core Jitter**

```
$ sudo taskset -c 3 /home/testuser/pma_tools/jitter/jitter -i 30
Linux Jitter testing program version 1.9
Iterations=30
The pragram will execute a dummy function 80000 times
Display is updated every 20000 displayUpdate intervals
Thread affinity will be set to core_id:7
Timings are in CPU Core cycles
Inst_Min:    Minimum Excution time during the display update interval(default is ~1 second)
Inst_Max:    Maximum Excution time during the display update interval(default is ~1 second)
Inst_jitter: Jitter in the Excution time during rhe display update interval. This is the value of interest
last_Exec:   The Excution time of last iteration just before the display update
Abs_Min:     Absolute Minimum Excution time since the program started or statistics were reset
Abs_Max:     Absolute Maximum Excution time since the program started or statistics were reset
tmp:         Cumulative value calcualted by the dummy function
Interval:    Time interval between the display updates in Core Cycles
Sample No:   Sample number

Inst_Min,Inst_Max,Inst_jitter,last_Exec,Abs_min,Abs_max,tmp,Interval,Sample No
160022,167590,7568,160026,160022,167590,2057568256,3203711852,1
160022,170628,10606,160024,160022,170628,4079222784,3204010824,2
160022,169824,9802,160024,160022,170628,1805910016,3203812064,3
160022,168832,8810,160030,160022,170628,3827564544,3203792594,4
160022,168248,8226,160026,160022,170628,1554251776,3203765920,5
160022,167834,7812,160028,160022,170628,3575906304,3203761114,6
160022,167442,7420,160024,160022,170628,1302593536,3203769250,7
160022,169120,9098,160028,160022,170628,3324248064,3203853340,8
160022,170710,10688,160024,160022,170710,1050935296,3203985878,9
160022,167952,7930,160024,160022,170710,3072589824,3203733756,10
160022,168314,8292,160030,160022,170710,799277056,3203741152,11
160022,169672,9650,160024,160022,170710,2820931584,3203739910,12
160022,168684,8662,160024,160022,170710,547618816,3203727336,13
160022,168246,8224,160024,160022,170710,2569273344,3203739052,14
160022,168134,8112,160030,160022,170710,295960576,3203735874,15
160022,170230,10208,160024,160022,170710,2317615104,3203996356,16
160022,167190,7168,160024,160022,170710,44302336,3203713628,17
160022,167304,7282,160024,160022,170710,2065956864,3203717954,18
160022,167500,7478,160024,160022,170710,4087611392,3203706674,19
160022,167302,7280,160024,160022,170710,1814298624,3203726452,20
160022,167266,7244,160024,160022,170710,3835953152,3203702804,21
160022,167820,7798,160022,160022,170710,1562640384,3203719138,22
160022,168100,8078,160024,160022,170710,3584294912,3203716636,23
160022,170408,10386,160024,160022,170710,1310982144,3203946958,24
160022,167276,7254,160024,160022,170710,3332636672,3203706236,25
160022,167052,7030,160024,160022,170710,1059323904,3203696444,26
160022,170322,10300,160024,160022,170710,3080978432,3203747514,27
160022,167332,7310,160024,160022,170710,807665664,3203716210,28
160022,167426,7404,160026,160022,170710,2829320192,3203700630,29
160022,168840,8818,160024,160022,170710,556007424,3203727658,30
```

#### **Memory Bandwidth**

```
$ sudo /home/testuser/mlc --bandwidth_matrix
Intel(R) Memory Latency Checker - v3.7
Command line parameters: --bandwidth_matrix
Using buffer size of 100.000MiB/thread for reads and an additional 100.000MiB/thread for writes
Measuring Memory Bandwidths between nodes within system
Bandwidths are in MB/sec (1 MB/sec = 1,000,000 Bytes/sec)
Using all the threads from each core if Hyper-threading is enabled
Using Read-only traffic type
                Numa node
Numa node            0         1
       0        122097.7   51327.9
       1         51309.2  122005.5
```

```
$ sudo /home/testuser/mlc --peak_injection_bandwidth
Intel(R) Memory Latency Checker - v3.7
Command line parameters: --peak_injection_bandwidth
Using buffer size of 100.000MiB/thread for reads and an additional 100.000MiB/thread for writes
Measuring Peak Injection Memory Bandwidths for the system
Bandwidths are in MB/sec (1 MB/sec = 1,000,000 Bytes/sec)
Using all the threads from each core if Hyper-threading is enabled
Using traffic with the following read-write ratios
ALL Reads        :      243159.4
3:1 Reads-Writes :      219132.5
2:1 Reads-Writes :      216603.1
1:1 Reads-Writes :      203713.0
Stream-triad like:      193790.8
```

```
$ sudo /home/testuser/mlc --max_bandwidth
Intel(R) Memory Latency Checker - v3.7
Command line parameters: --max_bandwidth
Using buffer size of 100.000MiB/thread for reads and an additional 100.000MiB/thread for writes
Measuring Maximum Memory Bandwidths for the system
Will take several minutes to complete as multiple injection rates will be tried to get the best bandwidth
Bandwidths are in MB/sec (1 MB/sec = 1,000,000 Bytes/sec)
Using all the threads from each core if Hyper-threading is enabled
Using traffic with the following read-write ratios
ALL Reads        :      244114.27
3:1 Reads-Writes :      219441.97
2:1 Reads-Writes :      216603.72
1:1 Reads-Writes :      203679.09
Stream-triad like:      214902.80
```

#### **Memory Latency**

```
$ sudo /home/testuser/mlc --latency_matrix
Intel(R) Memory Latency Checker - v3.7
Command line parameters: --latency_matrix

Using buffer size of 2000.000MiB
Measuring idle latencies (in ns)...
                Numa node
Numa node            0       1
       0          81.2   130.2
       1         130.2    81.1
```

```
$ sudo /home/testuser/mlc --idle_latency
Intel(R) Memory Latency Checker - v3.7
Command line parameters: --idle_latency

Using buffer size of 2000.000MiB
Each iteration took 186.1 core clocks ( 80.9 ns)
```

```
$ sudo /home/testuser/mlc --loaded_latency
Intel(R) Memory Latency Checker - v3.7
Command line parameters: --loaded_latency
Using buffer size of 100.000MiB/thread for reads and an additional 100.000MiB/thread for writes
Measuring Loaded Latencies for the system
Using all the threads from each core if Hyper-threading is enabled
Using Read-only traffic type
Inject  Latency Bandwidth
Delay   (ns)    MB/sec
 00000  233.86  243421.9
 00002  230.61  243544.1
 00008  232.56  243394.5
 00015  229.52  244076.6
 00050  225.82  244290.6
 00100  161.65  236744.8
 00200  100.63  133844.0
 00300   96.84   90548.2
 00400   95.71   68504.3
 00500   95.68   55139.0
 00700   88.77   39798.4
 01000   84.74   28200.1
 01300   83.08   21915.5
 01700   82.27   16969.3
 02500   81.66   11810.6
 03500   81.98    8662.9
 05000   81.48    6306.8
 09000   81.17    3857.8
 20000   80.19    2179.9
```

#### L1/L2/LLC Latency

```
$ sudo /home/testuser/mlc --c2c_latency
Intel(R) Memory Latency Checker - v3.7
Command line parameters: --c2c_latency
Measuring cache-to-cache transfer latency (in ns)...
Local Socket L2->L2 HIT  latency        55.5
Local Socket L2->L2 HITM latency        55.6
Remote Socket L2->L2 HITM latency (data address homed in writer socket)
                      Reader Numa Node
Writer Numa Node     0       1
            0        -   115.6
            1    115.6       -
Remote Socket L2->L2 HITM latency (data address homed in reader socket)
                      Reader Numa Node
Writer Numa Node     0       1
            0        -   178.2
            1    178.4       -
```

#### **Spectre and Meltdown Checks**

The following section shows the output of a shell script that reports whether the system is vulnerable to the several speculative execution CVEs made public since 2018. The script is available on the Spectre & Meltdown Checker GitHub<sup>175</sup>.

```
Spectre and Meltdown mitigation detection tool v0.43
awk: fatal: cannot open file `bash for reading (No such file or directory)
Checking for vulnerabilities on current system
Kernel is Linux 4.15.0-72-generic #81-Ubuntu SMP Tue Nov 26 12:20:02 UTC 2019 x86_64
CPU is Intel(R) Xeon(R) Platinum 8280 CPU @ 2.70GHz
Hardware check
* Hardware support (CPU microcode) for mitigation techniques
 * Indirect Branch Restricted Speculation (IBRS)
   * SPEC_CTRL MSR is available: YES
   * CPU indicates IBRS capability: YES (SPEC_CTRL feature bit)
 * Indirect Branch Prediction Barrier (IBPB)
   * PRED_CMD MSR is available: YES
   * CPU indicates IBPB capability: YES (SPEC_CTRL feature bit)
 * Single Thread Indirect Branch Predictors (STIBP)
   * SPEC_CTRL MSR is available: YES
   * CPU indicates STIBP capability: YES (Intel STIBP feature bit)
 * Speculative Store Bypass Disable (SSBD)
   * CPU indicates SSBD capability: YES (Intel SSBD)
 * L1 data cache invalidation
   * FLUSH_CMD MSR is available: YES
    * CPU indicates L1D flush capability: YES (L1D flush feature bit)
 * Microarchitectural Data Sampling
   * VERW instruction is available: YES (MD_CLEAR feature bit)
 * Enhanced IBRS (IBRS_ALL)
   * CPU indicates ARCH_CAPABILITIES MSR availability: YES
   * ARCH_CAPABILITIES MSR advertises IBRS_ALL capability: YES
 * CPU explicitly indicates not being vulnerable to Meltdown/L1TF (RDCL_NO): YES
 * CPU explicitly indicates not being vulnerable to Variant 4 (SSB_NO): NO
 * CPU/Hypervisor indicates L1D flushing is not necessary on this system: YES
 * Hypervisor indicates host CPU might be vulnerable to RSB underflow (RSBA): NO
 * CPU explicitly indicates not being vulnerable to Microarchitectural Data Sampling (MDS_NO): YES
 * CPU explicitly indicates not being vulnerable to TSX Asynchronous Abort (TAA_NO): NO
 * CPU explicitly indicates not being vulnerable to iTLB Multihit (PSCHANGE_MSC_NO): NO
 * CPU explicitly indicates having MSR for TSX control (TSX_CTRL_MSR): YES
   * TSX_CTRL MSR indicates TSX RTM is disabled: YES
   * TSX_CTRL MSR indicates TSX CPUID bit is cleared: YES
 * CPU supports Transactional Synchronization Extensions (TSX): NO
 * CPU supports Software Guard Extensions (SGX): NO
 * CPU microcode is known to cause stability problems: NO (model 0x55 family 0x6 stepping 0x7_
 * CPU microcode is the latest known available version: awk: fatal: cannot open file `bash for reading (No such file or directory)
UNKNOWN (latest microcode version for your CPU model is unknown)
* CPU vulnerability to the speculative execution attack variants
 * Vulnerable to CVE-2017-5753 (Spectre Variant 1, bounds check bypass): YES
```

<sup>175</sup> https://github.com/speed47/spectre-meltdown-checker

```
* Vulnerable to CVE-2017-5715 (Spectre Variant 2, branch target injection): YES
  * Vulnerable to CVE-2017-5754 (Variant 3, Meltdown, rogue data cache load): NO
 * Vulnerable to CVE-2018-3640 (Variant 3a, rogue system register read): YES
 * Vulnerable to CVE-2018-3639 (Variant 4, speculative store bypass): YES
 * Vulnerable to CVE-2018-3615 (Foreshadow (SGX), L1 terminal fault): NO
 * Vulnerable to CVE-2018-3620 (Foreshadow-NG (OS), L1 terminal fault): YES
 * Vulnerable to CVE-2018-3646 (Foreshadow-NG (VMM), L1 terminal fault): YES
 * Vulnerable to CVE-2018-12126 (Fallout, microarchitectural store buffer data sampling (MSBDS)):_
 * Vulnerable to CVE-2018-12130 (ZombieLoad, microarchitectural fill buffer data sampling (MFBDS)): NO
 * Vulnerable to CVE-2018-12127 (RIDL, microarchitectural load port data sampling (MLPDS)): NO
 * Vulnerable to CVE-2019-11091 (RIDL, microarchitectural data sampling uncacheable memory_
 * Vulnerable to CVE-2019-11135 (ZombieLoad V2, TSX Asynchronous Abort (TAA)): NO
 * Vulnerable to CVE-2018-12207 (No eXcuses, iTLB Multihit, machine check exception on page size_
CVE-2017-5753 aka Spectre Variant 1, bounds check bypass
* Mitigated according to the /sys interface: YES (Mitigation: usercopy/swapgs barriers and __user pointer sanitization)
* Kernel has array_index_mask_nospec: YES (1 occurrence(s) found of x86 64 bits array_index_mask_nospec())
* Kernel has the Red Hat/Ubuntu patch: NO
* Kernel has mask_nospec64 (arm64): NO
> STATUS: NOT VULNERABLE (Mitigation: usercopy/swapgs barriers and __user pointer sanitization)
CVE-2017-5715 aka Spectre Variant 2, branch target injection
* Mitigated according to the /sys interface: YES (Mitigation: Enhanced IBRS, IBPB: conditional, RSB filling)
* Mitigation 1
 * Kernel is compiled with IBRS support: YES
   * IBRS enabled and active: YES (Enhanced flavor, performance impact will be greatly reduced)
 * Kernel is compiled with IBPB support: YES
   * IBPB enabled and active: YES
* Mitigation 2
 * Kernel has branch predictor hardening (arm): NO
 * Kernel compiled with retpoline option: YES
 * Kernel supports RSB filling: YES
> STATUS: NOT VULNERABLE (Enhanced IBRS + IBPB are mitigating the vulnerability)
CVE-2017-5754 aka Variant 3, Meltdown, rogue data cache load
* Mitigated according to the /sys interface: YES (Not affected)
* Kernel supports Page Table Isolation (PTI): YES
 * PTI enabled and active: UNKNOWN (dmesg truncated, please reboot and relaunch this script)
 * Reduced performance impact of PTI: YES (CPU supports INVPCID, performance impact of PTI will be greatly reduced)
* Running as a Xen PV DomU: NO
> STATUS: NOT VULNERABLE (your CPU vendor reported your CPU model as not vulnerable)
CVE-2018-3640 aka Variant 3a, rogue system register read
* CPU microcode mitigates the vulnerability: YES
> STATUS: NOT VULNERABLE (your CPU microcode mitigates the vulnerability)
CVE-2018-3639 aka Variant 4, speculative store bypass
* Mitigated according to the /sys interface: YES (Mitigation: Speculative Store Bypass disabled via prctl and seccomp)
* Kernel supports disabling speculative store bypass (SSB): YES (found in /proc/self/status)
* SSB mitigation is enabled and active: YES (per-thread through prctl)
* SSB mitigation currently active for selected processes: YES (systemd-journald systemd-logind_
> STATUS: NOT VULNERABLE (Mitigation: Speculative Store Bypass disabled via prctl and seccomp)
CVE-2018-3615 aka Foreshadow (SGX), L1 terminal fault
* CPU microcode mitigates the vulnerability: N/A
> STATUS: NOT VULNERABLE (your CPU vendor reported your CPU model as not vulnerable)
CVE-2018-3620 aka Foreshadow-NG (OS), L1 terminal fault
* Mitigated according to the /sys interface: YES (Not affected)
* Kernel supports PTE inversion: YES (found in kernel image)
* PTE inversion enabled and active: NO
> STATUS: NOT VULNERABLE (Not affected)
CVE-2018-3646 aka Foreshadow-NG (VMM), L1 terminal fault
* Information from the /sys interface: Not affected
* This system is a host running a hypervisor: NO
* Mitigation 1 (KVM)
 * EPT is disabled: NO
* Mitigation 2
 * L1D flush is supported by kernel: YES (found flush_l1d in /proc/cpuinfo)
 * L1D flush enabled: NO
 * Hardware-backed L1D flush supported: YES (performance impact of the mitigation will be greatly reduced)
 * Hyper-Threading (SMT) is enabled: YES
> STATUS: NOT VULNERABLE (your kernel reported your CPU model as not vulnerable)
CVE-2018-12126 aka Fallout, microarchitectural store buffer data sampling (MSBDS)
* Mitigated according to the /sys interface: YES (Not affected)
* Kernel supports using MD_CLEAR mitigation: YES (md_clear found in /proc/cpuinfo)
* Kernel mitigation is enabled and active: NO
* SMT is either mitigated or disabled: NO
> STATUS: NOT VULNERABLE (your CPU vendor reported your CPU model as not vulnerable)
CVE-2018-12130 aka ZombieLoad, microarchitectural fill buffer data sampling (MFBDS)
* Mitigated according to the /sys interface: YES (Not affected)
* Kernel supports using MD_CLEAR mitigation: YES (md_clear found in /proc/cpuinfo)
* Kernel mitigation is enabled and active: NO
* SMT is either mitigated or disabled: NO
> STATUS: NOT VULNERABLE (your CPU vendor reported your CPU model as not vulnerable)
CVE-2018-12127 aka RIDL, microarchitectural load port data sampling (MLPDS)
* Mitigated according to the /sys interface: YES (Not affected)
* Kernel supports using MD_CLEAR mitigation: YES (md_clear found in /proc/cpuinfo)
* Kernel mitigation is enabled and active: NO
* SMT is either mitigated or disabled: NO
> STATUS: NOT VULNERABLE (your CPU vendor reported your CPU model as not vulnerable)
CVE-2019-11091 aka RIDL, microarchitectural data sampling uncacheable memory (MDSUM)
* Mitigated according to the /sys interface: YES (Not affected)
* Kernel supports using MD_CLEAR mitigation: YES (md_clear found in /proc/cpuinfo)
* Kernel mitigation is enabled and active: NO
* SMT is either mitigated or disabled: NO
> STATUS: NOT VULNERABLE (your CPU vendor reported your CPU model as not vulnerable)
CVE-2019-11135 aka ZombieLoad V2, TSX Asynchronous Abort (TAA)
* Mitigated according to the /sys interface: YES (Mitigation: TSX disabled)
* TAA mitigation is supported by kernel: YES (found tsx_async_abort in kernel image)
* TAA mitigation enabled and active: YES (Mitigation: TSX disabled)
> STATUS: NOT VULNERABLE (your CPU vendor reported your CPU model as not vulnerable)
CVE-2018-12207 aka No eXcuses, iTLB Multihit, machine check exception on page size changes (MCEPSC)
* Mitigated according to the /sys interface: YES (KVM: Mitigation: Split huge pages)
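* This system is a host running a hypervisor: NO
* iTLB Multihit mitigation is supported by kernel: YES (found itlb_multihit in kernel image)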
* iTLB Multihit mitigation enabled and active: YES (KVM: Mitigation: Split huge pages)
> STATUS: NOT VULNERABLE (this system is not running a hypervisor)
> SUMMARY: CVE-2017-5753:OK CVE-2017-5715:OK CVE-2017-5754:OK CVE-2018-3640:OK CVE-2018-3639:OK CVE-2018-3615:OK CVE-2018-3620:OK CVE-2018-3646:OK CVE-2018-12126:OK CVE-2018-12130:OK CVE-2018-12127:OK CVE-2019-11091:OK CVE-2019-11135:OK CVE-2018-12207:OK
awk: fatal: cannot open file `bash for reading (No such file or directory)
Checking for vulnerabilities on current system
Kernel is Linux 4.15.0-72-generic #81-Ubuntu SMP Tue Nov 26 12:20:02 UTC 2019 x86_64
CPU is Intel(R) Xeon(R) Gold 6252N CPU @ 2.30GHz
Hardware check
* Hardware support (CPU microcode) for mitigation techniques
 * Indirect Branch Restricted Speculation (IBRS)
   * SPEC_CTRL MSR is available: YES
    * CPU indicates IBRS capability: YES (SPEC_CTRL feature bit)
 * Indirect Branch Prediction Barrier (IBPB)
   * PRED_CMD MSR is available: YES
    * CPU indicates IBPB capability: YES (SPEC_CTRL feature bit)
 * Single Thread Indirect Branch Predictors (STIBP)
   * SPEC_CTRL MSR is available: YES
   * CPU indicates STIBP capability: YES (Intel STIBP feature bit)
 * Speculative Store Bypass Disable (SSBD)
   * CPU indicates SSBD capability: YES (Intel SSBD)
 * L1 data cache invalidation
   * FLUSH_CMD MSR is available: YES
    * CPU indicates L1D flush capability: YES (L1D flush feature bit)
 * Microarchitectural Data Sampling
   * VERW instruction is available: YES (MD_CLEAR feature bit)
 * Enhanced IBRS (IBRS_ALL)
   * CPU indicates ARCH_CAPABILITIES MSR availability: YES
    * ARCH_CAPABILITIES MSR advertises IBRS_ALL capability: YES
 * CPU explicitly indicates not being vulnerable to Meltdown/L1TF (RDCL_NO): YES
 * CPU explicitly indicates not being vulnerable to Variant 4 (SSB_NO): NO
 * CPU/Hypervisor indicates L1D flushing is not necessary on this system: YES
 * Hypervisor indicates host CPU might be vulnerable to RSB underflow (RSBA): NO
 * CPU explicitly indicates not being vulnerable to Microarchitectural Data Sampling (MDS_NO): YES
 * CPU explicitly indicates not being vulnerable to TSX Asynchronous Abort (TAA_NO): NO
 * CPU explicitly indicates not being vulnerable to iTLB Multihit (PSCHANGE_MSC_NO): NO
 * CPU explicitly indicates having MSR for TSX control (TSX_CTRL_MSR): YES
   * TSX_CTRL MSR indicates TSX RTM is disabled: YES
   * TSX_CTRL MSR indicates TSX CPUID bit is cleared: YES
 * CPU supports Transactional Synchronization Extensions (TSX): NO
 * CPU supports Software Guard Extensions (SGX): NO
 * CPU microcode is known to cause stability problems: NO (family 0x6 model 0x55 stepping 0x7 ucode 0x500002c cpuid 0x50657)
 * CPU microcode is the latest known available version: awk: fatal: cannot open file `bash for reading (No such file or directory)
UNKNOWN (latest microcode version for your CPU model is unknown)
* CPU vulnerability to the speculative execution attack variants
 * Vulnerable to CVE-2017-5753 (Spectre Variant 1, bounds check bypass): YES
 * Vulnerable to CVE-2017-5715 (Spectre Variant 2, branch target injection): YES
 * Vulnerable to CVE-2017-5754 (Variant 3, Meltdown, rogue data cache load): NO
 * Vulnerable to CVE-2018-3640 (Variant 3a, rogue system register read): YES
 * Vulnerable to CVE-2018-3639 (Variant 4, speculative store bypass): YES
 * Vulnerable to CVE-2018-3615 (Foreshadow (SGX), L1 terminal fault): NO
  * Vulnerable to CVE-2018-3620 (Foreshadow-NG (OS), L1 terminal fault): YES
  * Vulnerable to CVE-2018-3646 (Foreshadow-NG (VMM), L1 terminal fault): YES
* Vulnerable to CVE-2018-12126 (Fallout, microarchitectural store buffer data sampling (MSBDS)):_
 * Vulnerable to CVE-2018-12130 (ZombieLoad, microarchitectural fill buffer data sampling (MFBDS)): NO
 * Vulnerable to CVE-2018-12127 (RIDL, microarchitectural load port data sampling (MLPDS)): NO
 * Vulnerable to CVE-2019-11091 (RIDL, microarchitectural data sampling uncacheable memory (MDSUM)): NO
 * Vulnerable to CVE-2019-11135 (ZombieLoad V2, TSX Asynchronous Abort (TAA)): NO
 * Vulnerable to CVE-2018-12207 (No eXcuses, iTLB Multihit, machine check exception on page size_
CVE-2017-5753 aka Spectre Variant 1, bounds check bypass
* Mitigated according to the /sys interface: YES (Mitigation: usercopy/swapgs barriers and __user pointer sanitization)
* Kernel has array_index_mask_nospec: YES (1 occurrence(s) found of x86 64 bits array_index_mask_nospec())
* Kernel has the Red Hat/Ubuntu patch: NO
* Kernel has mask_nospec64 (arm64): NO
> STATUS: NOT VULNERABLE (Mitigation: usercopy/swapgs barriers and __user pointer sanitization)
CVE-2017-5715 aka Spectre Variant 2, branch target injection
* Mitigated according to the /sys interface: YES (Mitigation: Enhanced IBRS, IBPB: conditional, RSB filling)
* Mitigation 1
 * Kernel is compiled with IBRS support: YES
   * IBRS enabled and active: YES (Enhanced flavor, performance impact will be greatly reduced)
 * Kernel is compiled with IBPB support: YES
   * IBPB enabled and active: YES
* Mitigation 2
 * Kernel has branch predictor hardening (arm): NO
 * Kernel compiled with retpoline option: YES
 * Kernel supports RSB filling: YES
> STATUS: NOT VULNERABLE (Enhanced IBRS + IBPB are mitigating the vulnerability)
CVE-2017-5754 aka Variant 3, Meltdown, rogue data cache load
* Mitigated according to the /sys interface: YES (Not affected)
* Kernel supports Page Table Isolation (PTI): YES
 * PTI enabled and active: UNKNOWN (dmesg truncated, please reboot and relaunch this script)
 * Reduced performance impact of PTI: YES (CPU supports INVPCID, performance impact of PTI will be greatly reduced)
* Running as a Xen PV DomU: NO
> STATUS: NOT VULNERABLE (your CPU vendor reported your CPU model as not vulnerable)
CVE-2018-3640 aka Variant 3a, rogue system register read
* CPU microcode mitigates the vulnerability: YES
> STATUS: NOT VULNERABLE (your CPU microcode mitigates the vulnerability)
CVE-2018-3639 aka Variant 4, speculative store bypass
* Mitigated according to the /sys interface: YES (Mitigation: Speculative Store Bypass disabled via prctl and seccomp)
* Kernel supports disabling speculative store bypass (SSB): YES (found in /proc/self/status)
* SSB mitigation is enabled and active: YES (per-thread through prctl)
* SSB mitigation currently active for selected processes: YES (systemd-journald systemd-logind_
> STATUS: NOT VULNERABLE (Mitigation: Speculative Store Bypass disabled via prctl and seccomp)
CVE-2018-3615 aka Foreshadow (SGX), L1 terminal fault
* CPU microcode mitigates the vulnerability: N/A
> STATUS: NOT VULNERABLE (your CPU vendor reported your CPU model as not vulnerable)
CVE-2018-3620 aka Foreshadow-NG (OS), L1 terminal fault
* Mitigated according to the /sys interface: YES (Not affected)
* Kernel supports PTE inversion: YES (found in kernel image)
* PTE inversion enabled and active: NO
> STATUS: NOT VULNERABLE (Not affected)
CVE-2018-3646 aka Foreshadow-NG (VMM), L1 terminal fault
* Information from the /sys interface: Not affected
* This system is a host running a hypervisor: NO
* Mitigation 1 (KVM)
 * EPT is disabled: NO
* Mitigation 2
 * L1D flush is supported by kernel: YES (found flush_l1d in /proc/cpuinfo)
 * L1D flush enabled: NO
 * Hardware-backed L1D flush supported: YES (performance impact of the mitigation will be greatly reduced)
 * Hyper-Threading (SMT) is enabled: YES
> STATUS: NOT VULNERABLE (your kernel reported your CPU model as not vulnerable)
CVE-2018-12126 aka Fallout, microarchitectural store buffer data sampling (MSBDS)
* Mitigated according to the /sys interface: YES (Not affected)
* Kernel supports using MD_CLEAR mitigation: YES (md_clear found in /proc/cpuinfo)
* Kernel mitigation is enabled and active: NO
* SMT is either mitigated or disabled: NO
> STATUS: NOT VULNERABLE (your CPU vendor reported your CPU model as not vulnerable)
CVE-2018-12130 aka ZombieLoad, microarchitectural fill buffer data sampling (MFBDS)
* Mitigated according to the /sys interface: YES (Not affected)
* Kernel supports using MD_CLEAR mitigation: YES (md_clear found in /proc/cpuinfo)
* Kernel mitigation is enabled and active: NO
* SMT is either mitigated or disabled: NO
> STATUS: NOT VULNERABLE (your CPU vendor reported your CPU model as not vulnerable)
CVE-2018-12127 aka RIDL, microarchitectural load port data sampling (MLPDS)
* Mitigated according to the /sys interface: YES (Not affected)
* Kernel supports using MD_CLEAR mitigation: YES (md_clear found in /proc/cpuinfo)
* Kernel mitigation is enabled and active: NO
* SMT is either mitigated or disabled: NO
> STATUS: NOT VULNERABLE (your CPU vendor reported your CPU model as not vulnerable)
CVE-2019-11091 aka RIDL, microarchitectural data sampling uncacheable memory (MDSUM)
* Mitigated according to the /sys interface: YES (Not affected)
* Kernel supports using MD_CLEAR mitigation: YES (md_clear found in /proc/cpuinfo)
* Kernel mitigation is enabled and active: NO
* SMT is either mitigated or disabled: NO
> STATUS: NOT VULNERABLE (your CPU vendor reported your CPU model as not vulnerable)
CVE-2019-11135 aka ZombieLoad V2, TSX Asynchronous Abort (TAA)
* Mitigated according to the /sys interface: YES (Mitigation: TSX disabled)
* TAA mitigation is supported by kernel: YES (found tsx_async_abort in kernel image)
* TAA mitigation enabled and active: YES (Mitigation: TSX disabled)
> STATUS: NOT VULNERABLE (your CPU vendor reported your CPU model as not vulnerable)
CVE-2018-12207 aka No eXcuses, iTLB Multihit, machine check exception on page size changes (MCEPSC)
* Mitigated according to the /sys interface: YES (KVM: Mitigation: Split huge pages)
* This system is a host running a hypervisor: NO
* iTLB Multihit mitigation is supported by kernel: YES (found itlb_multihit in kernel image)
* iTLB Multihit mitigation enabled and active: YES (KVM: Mitigation: Split huge pages)
> STATUS: NOT VULNERABLE (this system is not running a hypervisor)
> SUMMARY: CVE-2017-5753:OK CVE-2017-5715:OK CVE-2017-5754:OK CVE-2018-3640:OK CVE-2018-3639:OK CVE-2018-3615:OK CVE-2018-3620:OK CVE-2018-3646:OK CVE-2018-12126:OK CVE-2018-12130:OK CVE-2018-12127:OK CVE-2019-11091:OK CVE-2019-11135:OK CVE-2018-12207:OK
```
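
When many testbeds are checked, the final `> SUMMARY:` line is the easiest part of the checker output to process automatically. The following is a minimal sketch, assuming the summary line has been captured as a single string; `parse_summary` is a hypothetical helper, not part of the checker script or of CSIT.

```python
import re

# Summary line as printed at the end of the checker output above.
SUMMARY = (
    "> SUMMARY: CVE-2017-5753:OK CVE-2017-5715:OK CVE-2017-5754:OK "
    "CVE-2018-3640:OK CVE-2018-3639:OK CVE-2018-3615:OK CVE-2018-3620:OK "
    "CVE-2018-3646:OK CVE-2018-12126:OK CVE-2018-12130:OK CVE-2018-12127:OK "
    "CVE-2019-11091:OK CVE-2019-11135:OK CVE-2018-12207:OK"
)


def parse_summary(line):
    """Return {cve_id: status} for every 'CVE-YYYY-NNNN:STATUS' token."""
    return dict(re.findall(r"(CVE-\d{4}-\d+):(\w+)", line))


statuses = parse_summary(SUMMARY)
# Anything other than OK is treated as needing attention.
not_ok = [cve for cve, status in statuses.items() if status != "OK"]
print(f"{len(statuses)} CVEs checked, {len(not_ok)} not OK")
```

For the output above this reports 14 CVEs checked and none flagged.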

### 3.7.6 Calibration Data - Haswell

The following sections include sample calibration data measured on the t1-sut1 server running in one of the Intel Xeon Haswell testbeds, as specified in FD.io CSIT Testbeds - Xeon Haswell<sup>176</sup>.

Calibration data obtained from all other servers in Haswell testbeds shows the same or similar values.

#### Linux cmdline

```
$ cat /proc/cmdline
BOOT_IMAGE=/vmlinuz-4.15.0-72-generic root=UUID=c59ae603-8076-41f4-bb5d-bc3fc8dd3ea1 ro isolcpus=1-17,19-35 nohz_full=1-17,19-35 rcu_nocbs=1-17,19-35 numa_balancing=disable intel_pstate=disable intel_iommu=on iommu=pt nmi_watchdog=0 audit=0 nosoftlockup processor.max_cstate=1 intel_idle.max_cstate=1 hpet=disable tsc=reliable mce=off console=tty0 console=ttyS0,115200n8
```
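
The CPU lists in the command line above use the kernel's range syntax (`1-17,19-35`). The following is a minimal sketch of how the isolated CPU set could be expanded for inspection, assuming only part of the command line is pasted in as a string; `parse_cmdline` and `expand_cpu_list` are hypothetical helpers, not CSIT functions.

```python
def expand_cpu_list(spec):
    """Expand a CPU list such as '1-17,19-35' into individual CPU ids."""
    cpus = []
    for part in spec.split(","):
        if "-" in part:
            first, last = part.split("-")
            cpus.extend(range(int(first), int(last) + 1))
        else:
            cpus.append(int(part))
    return cpus


def parse_cmdline(cmdline):
    """Map each kernel parameter to its value ('' for bare flags like 'ro').

    A parameter given more than once keeps only its last value.
    """
    params = {}
    for token in cmdline.split():
        key, _, value = token.partition("=")
        params[key] = value
    return params


# Excerpt of the /proc/cmdline content shown above.
CMDLINE = (
    "ro isolcpus=1-17,19-35 nohz_full=1-17,19-35 rcu_nocbs=1-17,19-35 "
    "numa_balancing=disable intel_pstate=disable intel_iommu=on iommu=pt"
)

params = parse_cmdline(CMDLINE)
isolated = expand_cpu_list(params["isolcpus"])
print(len(isolated), "isolated CPUs; CPUs 0 and 18 are not isolated")
```

The same expansion applies to `nohz_full` and `rcu_nocbs`, which carry identical lists here.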

#### Linux uname

```
$ uname -a
Linux t1-tg1 4.15.0-72-generic #81-Ubuntu SMP Tue Nov 26 12:20:02 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
```

#### **System-level Core Jitter**

```
$ sudo taskset -c 3 /home/testuser/pma_tools/jitter/jitter -i 30
Linux Jitter testing program version 1.8
Iterations=30
The pragram will execute a dummy function 80000 times
Display is updated every 20000 displayUpdate intervals
Timings are in CPU Core cycles
Inst_Min:    Minimum Excution time during the display update interval(default is ~1 second)
Inst_Max:    Maximum Excution time during the display update interval(default is ~1 second)
Inst_jitter: Jitter in the Excution time during rhe display update interval. This is the value of interest
last_Exec:   The Excution time of last iteration just before the display update
Abs_Min:     Absolute Minimum Excution time since the program started or statistics were reset
Abs_Max:     Absolute Maximum Excution time since the program started or statistics were reset
tmp:         Cumulative value calcualted by the dummy function
Interval:    Time interval between the display updates in Core Cycles
Sample No:   Sample number

   Inst_Min   Inst_Max   Inst_jitter last_Exec  Abs_min    Abs_max      tmp       Interval     Sample No
    160024     172636      12612     160028     160024     172636    1573060608 3205463144          1
    160024     188236      28212     160028     160024     188236     958595072 3205500844          2
    160024     185676      25652     160028     160024     188236     344129536 3205485976          3
    160024     172608      12584     160024     160024     188236    4024631296 3205472740          4
    160024     179260      19236     160028     160024     188236    3410165760 3205502164          5
```

<sup>176</sup> https://git.fd.io/csit/tree/docs/lab/testbeds\_ucs\_hsw\_hw\_bios\_cfg.md?h=rls2001

| Inst_Min | Inst_Max | Inst_jitter | last_Exec | Abs_min | Abs_max | tmp        | Interval   | Sample No |
|----------|----------|-------------|-----------|---------|---------|------------|------------|-----------|
| 160024   | 172432   | 12408       | 160024    | 160024  | 188236  | 2795700224 | 3205452036 | 6         |
| 160024   | 178820   | 18796       | 160024    | 160024  | 188236  | 2181234688 | 3205455408 | 7         |
| 160024   | 172512   | 12488       | 160028    | 160024  | 188236  | 1566769152 | 3205461528 | 8         |
| 160024   | 172636   | 12612       | 160028    | 160024  | 188236  | 952303616  | 3205478820 | 9         |
| 160024   | 173676   | 13652       | 160028    | 160024  | 188236  | 337838080  | 3205470412 | 10        |
| 160024   | 178776   | 18752       | 160028    | 160024  | 188236  | 4018339840 | 3205481472 | 11        |
| 160024   | 172788   | 12764       | 160028    | 160024  | 188236  | 3403874304 | 3205492336 | 12        |
| 160024   | 174616   | 14592       | 160028    | 160024  | 188236  |            | 3205474904 | 13        |
| 160024   | 174440   | 14416       | 160028    | 160024  | 188236  |            | 3205479448 | 14        |
| 160024   | 178748   | 18724       | 160024    | 160024  | 188236  | 1560477696 | 3205482668 | 15        |
| 160024   | 172588   | 12564       | 169404    | 160024  | 188236  | 946012160  | 3205510496 | 16        |
| 160024   | 172636   | 12612       | 160024    | 160024  | 188236  | 331546624  | 3205472204 | 17        |
| 160024   | 172480   | 12456       | 160024    | 160024  | 188236  | 4012048384 | 3205455864 | 18        |
| 160024   | 172740   | 12716       | 160028    | 160024  | 188236  | 3397582848 | 3205464932 | 19        |
| 160024   | 179200   | 19176       | 160028    | 160024  | 188236  | 2783117312 | 3205476012 | 20        |
| 160024   | 172480   | 12456       | 160028    | 160024  | 188236  | 2168651776 | 3205465632 | 21        |
| 160024   | 172728   | 12704       | 160024    | 160024  | 188236  | 1554186240 | 3205497204 | 22        |
| 160024   | 172620   | 12596       | 160028    | 160024  | 188236  | 939720704  | 3205466972 | 23        |
| 160024   | 172640   | 12616       | 160028    | 160024  | 188236  | 325255168  | 3205471216 | 24        |
| 160024   | 172484   | 12460       | 160028    | 160024  | 188236  |            | 3205467388 | 25        |
| 160024   | 172636   | 12612       | 160028    | 160024  | 188236  |            | 3205482748 | 26        |
| 160024   | 179056   | 19032       | 160024    | 160024  | 188236  |            | 3205467152 | 27        |
| 160024   | 172672   | 12648       | 160024    | 160024  | 188236  | 2162360320 | 3205483268 | 28        |
| 160024   | 176932   | 16908       | 160024    | 160024  | 188236  | 1547894784 | 3205488536 | 29        |
| 160024   | 172452   | 12428       | 160028    | 160024  | 188236  | 933429248  | 3205440636 | 30        |
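
The jitter tool describes `Inst_jitter` as the value of interest, and in the samples above it equals `Inst_Max` minus `Inst_Min`. The following is a minimal sketch that checks this relation on the first five samples and converts the worst case to microseconds. The 2.3 GHz figure is an assumption: it takes the nominal clock of the Xeon E5-2699 v3 reported for this testbed as the rate of the cycle counter.

```python
# (Inst_Min, Inst_Max, Inst_jitter) in core cycles for samples 1-5 above.
SAMPLES = [
    (160024, 172636, 12612),
    (160024, 188236, 28212),
    (160024, 185676, 25652),
    (160024, 172608, 12584),
    (160024, 179260, 19236),
]

# Each reported jitter value should be the spread between max and min.
for inst_min, inst_max, inst_jitter in SAMPLES:
    assert inst_max - inst_min == inst_jitter

CPU_GHZ = 2.3  # assumption: cycles are counted at the nominal 2.30 GHz clock

worst = max(jitter for _, _, jitter in SAMPLES)
worst_us = worst / (CPU_GHZ * 1000)  # cycles -> microseconds
print(f"worst-case jitter: {worst} cycles (about {worst_us:.1f} us)")
```

The worst sample in this subset matches the `Abs_max` spread of 188236 - 160024 cycles that holds for the rest of the run.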

#### **Memory Bandwidth**

```
$ sudo /home/testuser/mlc --bandwidth_matrix
Intel(R) Memory Latency Checker - v3.5
Command line parameters: --bandwidth_matrix

Using buffer size of 100.000MB/thread for reads and an additional 100.000MB/thread for writes
Measuring Memory Bandwidths between nodes within system
Bandwidths are in MB/sec (1 MB/sec = 1,000,000 Bytes/sec)
Using all the threads from each core if Hyper-threading is enabled
Using Read-only traffic type
                Numa node
Numa node            0       1
       0        57935.5 30265.2
       1        30284.6 58409.9
```

```
$ sudo /home/testuser/mlc --peak_injection_bandwidth
Intel(R) Memory Latency Checker - v3.5
Command line parameters: --peak_injection_bandwidth

Using buffer size of 100.000MB/thread for reads and an additional 100.000MB/thread for writes

Measuring Peak Injection Memory Bandwidths for the system
Bandwidths are in MB/sec (1 MB/sec = 1,000,000 Bytes/sec)
Using all the threads from each core if Hyper-threading is enabled
Using traffic with the following read-write ratios
ALL Reads : 115762.2
3:1 Reads-Writes : 106242.2
2:1 Reads-Writes : 103031.8
1:1 Reads-Writes : 87943.7
Stream-triad like: 100048.4
```

```
$ sudo /home/testuser/mlc --max_bandwidth
Intel(R) Memory Latency Checker - v3.5
Command line parameters: --max_bandwidth
Using buffer size of 100.000MB/thread for reads and an additional 100.000MB/thread for writes
Measuring Maximum Memory Bandwidths for the system
Will take several minutes to complete as multiple injection rates will be tried to get the best bandwidth
Bandwidths are in MB/sec (1 MB/sec = 1,000,000 Bytes/sec)
Using all the threads from each core if Hyper-threading is enabled
Using traffic with the following read-write ratios
ALL Reads        : 115782.41
3:1 Reads-Writes : 105965.78
2:1 Reads-Writes : 103162.38
1:1 Reads-Writes : 88255.82
Stream-triad like: 105608.10
```

#### **Memory Latency**

```
$ sudo /home/testuser/mlc --idle_latency
Intel(R) Memory Latency Checker - v3.5
Command line parameters: --idle_latency

Using buffer size of 200.000MB
Each iteration took 227.2 core clocks ( 99.0 ns)
```

```
$ sudo /home/testuser/mlc --loaded_latency
Intel(R) Memory Latency Checker - v3.5
Command line parameters: --loaded_latency
Using buffer size of 100.000MB/thread for reads and an additional 100.000MB/thread for writes
Measuring Loaded Latencies for the system
Using all the threads from each core if Hyper-threading is enabled
Using Read-only traffic type
Inject  Latency Bandwidth
Delay   (ns)    MB/sec
 00000  294.08  115841.6
 00002  294.27  115851.5
 00008  293.67  115821.8
 00015  278.92  115587.5
 00050  246.80  113991.2
 00100  206.86  104508.1
 00200  123.72   72873.6
 00300  113.35   52641.1
 00400  108.89   41078.9
 00500  108.11   33699.1
 00700  106.19   24878.0
 01000  104.75   17948.1
 01300  103.72   14089.0
 01700  102.95   11013.6
 02500  102.25    7756.3
 03500  101.81    5749.3
 05000  101.46    4230.4
 09000  101.05    2641.4
 20000  100.77    1542.5
```
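
The loaded-latency sweep above is easier to compare across testbeds once it is parsed into numbers. The following is a minimal sketch, assuming the three-column row layout shown in the output (inject delay, latency in ns, bandwidth in MB/sec). The helper names `parse_loaded_latency` and `summarize` are hypothetical and are not part of MLC or CSIT.

```python
import re

# A few rows copied from the --loaded_latency output above.
SAMPLE = """\
00000 294.08 115841.6
00100 206.86 104508.1
00200 123.72 72873.6
00400 108.89 41078.9
20000 100.77 1542.5
"""

# Assumed row layout: five-digit inject delay, latency (ns), bandwidth (MB/sec).
ROW = re.compile(r"^\s*(\d{5})\s+(\d+\.\d+)\s+(\d+\.\d+)\s*$")


def parse_loaded_latency(text):
    """Return a list of (inject_delay, latency_ns, bandwidth_mb_s) tuples."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            delay, latency, bandwidth = match.groups()
            rows.append((int(delay), float(latency), float(bandwidth)))
    return rows


def summarize(rows):
    """Summarize a sweep: peak bandwidth and how much latency grows under load."""
    latencies = [lat for _, lat, _ in rows]
    return {
        "peak_bandwidth_mb_s": max(bw for _, _, bw in rows),
        "min_latency_ns": min(latencies),
        "max_latency_ns": max(latencies),
        "latency_inflation": max(latencies) / min(latencies),
    }


if __name__ == "__main__":
    print(summarize(parse_loaded_latency(SAMPLE)))
```

For the rows sampled here, latency under full load is roughly three times the lightly loaded latency, which is the kind of ratio worth tracking between calibration runs.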

#### L1/L2/LLC Latency

```
$ sudo /home/testuser/mlc --c2c_latency
Intel(R) Memory Latency Checker - v3.5
Command line parameters: --c2c_latency
Measuring cache-to-cache transfer latency (in ns)...
Local Socket L2->L2 HIT  latency
Local Socket L2->L2 HITM latency    47.0

Remote Socket L2->L2 HITM latency (data address homed in writer socket)
                      Reader Numa Node
Writer Numa Node          0       1
            0             -   108.0
            1         106.9       -

Remote Socket L2->L2 HITM latency (data address homed in reader socket)
                      Reader Numa Node
Writer Numa Node          0       1
            0             -   107.7
            1         106.6       -
```

#### **Spectre and Meltdown Checks**

The following section displays the output of a shell script that reports whether the system is vulnerable to the several "speculative execution" CVEs that were made public in 2018. The script is available on the Spectre & Meltdown Checker GitHub<sup>177</sup>.

```
Spectre and Meltdown mitigation detection tool v0.43
awk: cannot open bash (No such file or directory)
Checking for vulnerabilities on current system
Kernel is Linux 4.15.0-72-generic #81-Ubuntu SMP Tue Nov 26 12:20:02 UTC 2019 x86_64
CPU is Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz
Hardware check
* Hardware support (CPU microcode) for mitigation techniques
  * Indirect Branch Restricted Speculation (IBRS)
    * SPEC_CTRL MSR is available: YES
    * CPU indicates IBRS capability: YES (SPEC_CTRL feature bit)
  * Indirect Branch Prediction Barrier (IBPB)
    * PRED_CMD MSR is available: YES
    * CPU indicates IBPB capability: YES (SPEC_CTRL feature bit)
  * Single Thread Indirect Branch Predictors (STIBP)
    * SPEC_CTRL MSR is available: YES
    * CPU indicates STIBP capability: YES (Intel STIBP feature bit)
  * Speculative Store Bypass Disable (SSBD)
    * CPU indicates SSBD capability: YES (Intel SSBD)
 * L1 data cache invalidation
   * FLUSH_CMD MSR is available: YES
    * CPU indicates L1D flush capability: YES (L1D flush feature bit)
  * Microarchitectural Data Sampling
   * VERW instruction is available: YES (MD_CLEAR feature bit)
 * Enhanced IBRS (IBRS_ALL)
   * CPU indicates ARCH_CAPABILITIES MSR availability: NO
    * ARCH_CAPABILITIES MSR advertises IBRS_ALL capability: NO
  * CPU explicitly indicates not being vulnerable to Meltdown/L1TF (RDCL_NO): NO
  * CPU explicitly indicates not being vulnerable to Variant 4 (SSB_NO): NO
  * CPU/Hypervisor indicates L1D flushing is not necessary on this system: NO
  * Hypervisor indicates host CPU might be vulnerable to RSB underflow (RSBA): NO
 * CPU explicitly indicates not being vulnerable to Microarchitectural Data Sampling (MDS_NO): NO
 * CPU explicitly indicates not being vulnerable to TSX Asynchronous Abort (TAA_NO): NO
 * CPU explicitly indicates not being vulnerable to iTLB Multihit (PSCHANGE_MSC_NO): NO
 * CPU explicitly indicates having MSR for TSX control (TSX_CTRL_MSR): NO
 * CPU supports Transactional Synchronization Extensions (TSX): NO
 * CPU supports Software Guard Extensions (SGX): NO
  * CPU microcode is known to cause stability problems: NO (model 0x3f family 0x6 stepping 0x2 ucode 0x43 cpuid 0x306f2)
  * CPU microcode is the latest known available version: awk: cannot open bash (No such file or directory)
UNKNOWN (latest microcode version for your CPU model is unknown)
* CPU vulnerability to the speculative execution attack variants
  * Vulnerable to CVE-2017-5753 (Spectre Variant 1, bounds check bypass): YES
  * Vulnerable to CVE-2017-5715 (Spectre Variant 2, branch target injection): YES
  * Vulnerable to CVE-2017-5754 (Variant 3, Meltdown, rogue data cache load): YES
  * Vulnerable to CVE-2018-3640 (Variant 3a, rogue system register read): YES
  * Vulnerable to CVE-2018-3639 (Variant 4, speculative store bypass): YES
  * Vulnerable to CVE-2018-3615 (Foreshadow (SGX), L1 terminal fault): NO
  * Vulnerable to CVE-2018-3620 (Foreshadow-NG (OS), L1 terminal fault): YES
  * Vulnerable to CVE-2018-3646 (Foreshadow-NG (VMM), L1 terminal fault): YES
  * Vulnerable to CVE-2018-12126 (Fallout, microarchitectural store buffer data sampling (MSBDS)): YES
  * Vulnerable to CVE-2018-12130 (ZombieLoad, microarchitectural fill buffer data sampling (MFBDS)): YES
```

<sup>177</sup> https://github.com/speed47/spectre-meltdown-checker

```
* Vulnerable to CVE-2018-12127 (RIDL, microarchitectural load port data sampling (MLPDS)): YES
 * Vulnerable to CVE-2019-11091 (RIDL, microarchitectural data sampling uncacheable memory_
 * Vulnerable to CVE-2019-11135 (ZombieLoad V2, TSX Asynchronous Abort (TAA)): NO
  * Vulnerable to CVE-2018-12207 (No eXcuses, iTLB Multihit, machine check exception on page size changes (MCEPSC)): YES
CVE-2017-5753 aka Spectre Variant 1, bounds check bypass
* Mitigated according to the /sys interface: YES (Mitigation: usercopy/swapgs barriers and __user pointer sanitization)
* Kernel has array_index_mask_nospec: YES (1 occurrence(s) found of x86 64 bits array_index_mask_nospec())
* Kernel has the Red Hat/Ubuntu patch: NO
* Kernel has mask_nospec64 (arm64): NO
> STATUS: NOT VULNERABLE (Mitigation: usercopy/swapgs barriers and __user pointer sanitization)
CVE-2017-5715 aka Spectre Variant 2, branch target injection
* Mitigated according to the /sys interface: YES (Mitigation: Full generic retpoline, IBPB: conditional, IBRS_FW, RSB filling)
* Mitigation 1
 * Kernel is compiled with IBRS support: YES
    * IBRS enabled and active: YES (for firmware code only)
 * Kernel is compiled with IBPB support: YES
   * IBPB enabled and active: YES
* Mitigation 2
 * Kernel has branch predictor hardening (arm): NO
 * Kernel compiled with retpoline option: YES
   * Kernel compiled with a retpoline-aware compiler: YES (kernel reports full retpoline_
> STATUS: NOT VULNERABLE (Full retpoline + IBPB are mitigating the vulnerability)
CVE-2017-5754 aka Variant 3, Meltdown, rogue data cache load
* Mitigated according to the /sys interface: YES (Mitigation: PTI)
* Kernel supports Page Table Isolation (PTI): YES
  * PTI enabled and active: YES
  * Reduced performance impact of PTI: YES (CPU supports INVPCID, performance impact of PTI will be greatly reduced)
* Running as a Xen PV DomU: NO
> STATUS: NOT VULNERABLE (Mitigation: PTI)
CVE-2018-3640 aka Variant 3a, rogue system register read
* CPU microcode mitigates the vulnerability: YES
> STATUS: NOT VULNERABLE (your CPU microcode mitigates the vulnerability)
CVE-2018-3639 aka Variant 4, speculative store bypass
* Mitigated according to the /sys interface: YES (Mitigation: Speculative Store Bypass disabled via prctl and seccomp)
* Kernel supports disabling speculative store bypass (SSB): YES (found in /proc/self/status)
* SSB mitigation is enabled and active: YES (per-thread through prctl)
* SSB mitigation currently active for selected processes: YES (systemd-journald systemd-logind_
> STATUS: NOT VULNERABLE (Mitigation: Speculative Store Bypass disabled via prctl and seccomp)
CVE-2018-3615 aka Foreshadow (SGX), L1 terminal fault
* CPU microcode mitigates the vulnerability: N/A
> STATUS: NOT VULNERABLE (your CPU vendor reported your CPU model as not vulnerable)
CVE-2018-3620 aka Foreshadow-NG (OS), L1 terminal fault
* Mitigated according to the /sys interface: YES (Mitigation: PTE Inversion; VMX: conditional cache flushes, SMT disabled)
* Kernel supports PTE inversion: YES (found in kernel image)
* PTE inversion enabled and active: YES
> STATUS: NOT VULNERABLE (Mitigation: PTE Inversion; VMX: conditional cache flushes, SMT disabled)
CVE-2018-3646 aka Foreshadow-NG (VMM), L1 terminal fault
* Information from the /sys interface: Mitigation: PTE Inversion; VMX: conditional cache flushes, SMT disabled
* This system is a host running a hypervisor: NO
* Mitigation 1 (KVM)
 * EPT is disabled: NO
* Mitigation 2
 * L1D flush is supported by kernel: YES (found flush_l1d in /proc/cpuinfo)
 * L1D flush enabled: YES (conditional flushes)
  * Hardware-backed L1D flush supported: YES (performance impact of the mitigation will be greatly reduced)
 * Hyper-Threading (SMT) is enabled: NO
> STATUS: NOT VULNERABLE (this system is not running a hypervisor)
CVE-2018-12126 aka Fallout, microarchitectural store buffer data sampling (MSBDS)
* Mitigated according to the /sys interface: YES (Mitigation: Clear CPU buffers; SMT disabled)
* Kernel supports using MD_CLEAR mitigation: YES (md_clear found in /proc/cpuinfo)
* Kernel mitigation is enabled and active: YES
* SMT is either mitigated or disabled: YES
> STATUS: NOT VULNERABLE (Your microcode and kernel are both up to date for this mitigation, and mitigation is enabled)
CVE-2018-12130 aka ZombieLoad, microarchitectural fill buffer data sampling (MFBDS)
* Mitigated according to the /sys interface: YES (Mitigation: Clear CPU buffers; SMT disabled)
* Kernel supports using MD_CLEAR mitigation: YES (md_clear found in /proc/cpuinfo)
* Kernel mitigation is enabled and active: YES
* SMT is either mitigated or disabled: YES
> STATUS: NOT VULNERABLE (Your microcode and kernel are both up to date for this mitigation, and mitigation is enabled)
CVE-2018-12127 aka RIDL, microarchitectural load port data sampling (MLPDS)
* Mitigated according to the /sys interface: YES (Mitigation: Clear CPU buffers; SMT disabled)
* Kernel supports using MD_CLEAR mitigation: YES (md_clear found in /proc/cpuinfo)
* Kernel mitigation is enabled and active: YES
* SMT is either mitigated or disabled: YES
> STATUS: NOT VULNERABLE (Your microcode and kernel are both up to date for this mitigation, and mitigation is enabled)
CVE-2019-11091 aka RIDL, microarchitectural data sampling uncacheable memory (MDSUM)
* Mitigated according to the /sys interface: YES (Mitigation: Clear CPU buffers; SMT disabled)
* Kernel supports using MD_CLEAR mitigation: YES (md_clear found in /proc/cpuinfo)
* Kernel mitigation is enabled and active: YES
* SMT is either mitigated or disabled: YES
> STATUS: NOT VULNERABLE (Your microcode and kernel are both up to date for this mitigation, and mitigation is enabled)
CVE-2019-11135 aka ZombieLoad V2, TSX Asynchronous Abort (TAA)
* Mitigated according to the /sys interface: YES (Not affected)
* TAA mitigation is supported by kernel: YES (found tsx_async_abort in kernel image)
* TAA mitigation enabled and active: NO
> STATUS: NOT VULNERABLE (your CPU vendor reported your CPU model as not vulnerable)
CVE-2018-12207 aka No eXcuses, iTLB Multihit, machine check exception on page size changes (MCEPSC)
* Mitigated according to the /sys interface: YES (KVM: Mitigation: Split huge pages)
* This system is a host running a hypervisor: NO
* iTLB Multihit mitigation is supported by kernel: YES (found itlb_multihit in kernel image)
* iTLB Multihit mitigation enabled and active: YES (KVM: Mitigation: Split huge pages)
> STATUS: NOT VULNERABLE (this system is not running a hypervisor)
```

```
> SUMMARY: CVE-2017-5753:OK CVE-2017-5715:OK CVE-2017-5754:OK CVE-2018-3640:OK CVE-2018-3639:OK CVE-2018-3615:OK CVE-2018-3620:OK CVE-2018-3646:OK CVE-2018-12126:OK CVE-2018-12130:OK CVE-2018-12127:OK CVE-2019-11091:OK CVE-2019-11135:OK CVE-2018-12207:OK
```
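
When many servers are checked, the one-line summary is the most convenient part of the output to process automatically. The following is a minimal sketch, assuming the `CVE-ID:STATUS` token format seen in the summary line above. The helper names `parse_summary` and `not_ok` are hypothetical, and any status other than `OK` is simply treated as needing attention.

```python
import re

# Summary line copied from the checker output above.
SUMMARY = (
    "> SUMMARY: CVE-2017-5753:OK CVE-2017-5715:OK CVE-2017-5754:OK "
    "CVE-2018-3640:OK CVE-2018-3639:OK CVE-2018-3615:OK CVE-2018-3620:OK "
    "CVE-2018-3646:OK CVE-2018-12126:OK CVE-2018-12130:OK CVE-2018-12127:OK "
    "CVE-2019-11091:OK CVE-2019-11135:OK CVE-2018-12207:OK"
)


def parse_summary(line):
    """Map each CVE identifier in a summary line to its reported status."""
    return dict(re.findall(r"(CVE-\d{4}-\d+):(\w+)", line))


def not_ok(statuses):
    """Return the sorted CVE identifiers whose status is anything but OK."""
    return sorted(cve for cve, status in statuses.items() if status != "OK")


if __name__ == "__main__":
    statuses = parse_summary(SUMMARY)
    print(len(statuses), "CVEs checked; needing attention:", not_ok(statuses))
```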

#### 3.7.7 Calibration Data - Denverton

The following sections include sample calibration data measured on a Denverton server at Intel SH labs.

Testing on a 2-node Atom Denverton testbed took place at Intel Corporation, carefully adhering to FD.io CSIT best practices.

#### Linux cmdline

#### Linux uname

```
$ uname -a
Linux 4.15.0-36-generic #39~16.04.1-Ubuntu SMP Tue Sep 25 08:59:23 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
```

#### **System-level Core Jitter**

```
$ sudo taskset -c 2 /home/testuser/pma_tools/jitter/jitter -c 2 -i 20
Linux Jitter testing program version 1.9
Iterations=20
The pragram will execute a dummy function 80000 times
Display is updated every 20000 displayUpdate intervals
Thread affinity will be set to core_id:2
Timings are in CPU Core cycles
Inst_Min:    Minimum Excution time during the display update interval(default is ~1 second)
Inst_Max:    Maximum Excution time during the display update interval(default is ~1 second)
Inst_jitter: Jitter in the Excution time during rhe display update interval. This is the value of interest
last_Exec:   The Excution time of last iteration just before the display update
Abs_Min:     Absolute Minimum Excution time since the program started or statistics were reset
Abs_Max:     Absolute Maximum Excution time since the program started or statistics were reset
tmp:         Cumulative value calcualted by the dummy function
Interval:    Time interval between the display updates in Core Cycles
Sample No:   Sample number

Inst_Min  Inst_Max  Inst_jitter  last_Exec  Abs_min  Abs_max         tmp    Interval  Sample No
  177530    196100        18570     177530   177530   196100  4156751872  3556820054          1
  177530    200784        23254     177530   177530   200784   321060864  3556897644          2
  177530    196346        18816     177530   177530   200784   780337152  3556918674          3
  177530    195962        18432     177530   177530   200784  1239613440  3556847928          4
  177530    195960        18430     177530   177530   200784  1698889728  3556860214          5
  177530    198824        21294     177530   177530   200784  2158166016  3556854934          6
  177530    198522        20992     177530   177530   200784  2617442304  3556862410          7
  177530    196362        18832     177530   177530   200784  3076718592  3556851636          8
  177530    199114        21584     177530   177530   200784  3535994880  3556870846          9
  177530    197194        19664     177530   177530   200784  3995271168  3556933584         10
  177530    198272        20742     177536   177530   200784   159580160  3556869044         11
  177530    197586        20056     177530   177530   200784   618856448  3556903482         12
  177530    196072        18542     177530   177530   200784  1078132736  3556825540         13
  177530    196354        18824     177530   177530   200784  1537409024  3556881664         14
  177530    195906        18376     177530   177530   200784  1996685312  3556839924         15
  177530    199066        21536     177530   177530   200784  2455961600  3556860220         16
  177530    196968        19438     177530   177530   200784  2915237888  3556871890         17
  177530    195896        18366     177530   177530   200784  3374514176  3556855338         18
  177530    196020        18490     177530   177530   200784  3833790464  3556839820         19
  177530    196030        18500     177530   177530   200784  4293066752  3556820106         20
```
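
The jitter tool flags `Inst_jitter` as the value of interest, and in the samples above it equals `Inst_Max - Inst_Min`. The following is a minimal sketch that parses sample rows, checks that relationship, and reports the worst and average jitter. It assumes the nine-column row layout listed in the tool's legend; the names `COLUMNS`, `parse_rows` and `jitter_stats` are hypothetical and not part of the jitter tool.

```python
from statistics import mean

# Column order assumed from the legend printed by the jitter tool.
COLUMNS = (
    "inst_min", "inst_max", "inst_jitter", "last_exec",
    "abs_min", "abs_max", "tmp", "interval", "sample_no",
)

# First three samples copied from the Denverton output above.
SAMPLE_ROWS = """\
177530 196100 18570 177530 177530 196100 4156751872 3556820054 1
177530 200784 23254 177530 177530 200784 321060864 3556897644 2
177530 196346 18816 177530 177530 200784 780337152 3556918674 3
"""


def parse_rows(text):
    """Turn whitespace-separated sample rows into dictionaries of integers."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Skip legend lines, headers and anything that is not a full data row.
        if len(fields) == len(COLUMNS) and all(f.isdigit() for f in fields):
            rows.append(dict(zip(COLUMNS, map(int, fields))))
    return rows


def jitter_stats(rows):
    """Verify Inst_jitter == Inst_Max - Inst_Min, then summarize the jitter."""
    for row in rows:
        if row["inst_jitter"] != row["inst_max"] - row["inst_min"]:
            raise ValueError(f"inconsistent sample {row['sample_no']}")
    jitters = [row["inst_jitter"] for row in rows]
    return {"max_jitter": max(jitters), "mean_jitter": mean(jitters)}


if __name__ == "__main__":
    print(jitter_stats(parse_rows(SAMPLE_ROWS)))
```

All values are in CPU core cycles, as stated in the tool's own header.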

#### **Memory Bandwidth**

```
$ sudo /home/testuser/mlc --peak_injection_bandwidth
Intel(R) Memory Latency Checker - v3.5
Command line parameters: --peak_injection_bandwidth

Using buffer size of 100.000MB/thread for reads and an additional 100.000MB/thread for writes

Measuring Peak Injection Memory Bandwidths for the system
Bandwidths are in MB/sec (1 MB/sec = 1,000,000 Bytes/sec)
Using all the threads from each core if Hyper-threading is enabled
Using traffic with the following read-write ratios
ALL Reads : 28150.0
3:1 Reads-Writes : 27425.0
2:1 Reads-Writes : 27565.4
1:1 Reads-Writes : 27489.3
Stream-triad like: 26878.2
```

```
$ sudo /home/testuser/mlc --max_bandwidth
Intel(R) Memory Latency Checker - v3.5
Command line parameters: --max_bandwidth
Using buffer size of 100.000MB/thread for reads and an additional 100.000MB/thread for writes
Measuring Maximum Memory Bandwidths for the system
Will take several minutes to complete as multiple injection rates will be tried to get the best bandwidth
Bandwidths are in MB/sec (1 MB/sec = 1,000,000 Bytes/sec)
Using all the threads from each core if Hyper-threading is enabled
Using traffic with the following read-write ratios
ALL Reads        : 30032.40
3:1 Reads-Writes : 27450.88
2:1 Reads-Writes : 27567.46
1:1 Reads-Writes : 27501.90
Stream-triad like: 27124.82
```

#### **Memory Latency**

```
$ sudo /home/testuser/mlc --idle_latency
Intel(R) Memory Latency Checker - v3.5
Command line parameters: --idle_latency
Using buffer size of 200.000MB
Each iteration took 186.7 core clocks ( 93.4 ns)
```
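
The idle-latency line reports the same measurement in two units, core clocks and nanoseconds, so their ratio gives the core frequency that MLC observed. The following is a minimal sketch of that arithmetic using the two readings from this report. The function name `implied_core_ghz` is hypothetical, and pairing the 227.2-clock reading with the Xeon E5-2699 v3 is an assumption based on the CPU named in the preceding Spectre and Meltdown check.

```python
def implied_core_ghz(core_clocks, latency_ns):
    """Core frequency in GHz implied by a clocks/ns pair (cycles per nanosecond)."""
    return core_clocks / latency_ns


# (core clocks, ns) pairs as printed by `mlc --idle_latency` in this report.
# The CPU labels are assumptions taken from the checker output, not from MLC.
READINGS = {
    "Xeon E5-2699 v3 @ 2.30GHz": (227.2, 99.0),
    "Atom C3858 @ 2.00GHz": (186.7, 93.4),
}

if __name__ == "__main__":
    for cpu, (clocks, ns) in READINGS.items():
        print(f"{cpu}: about {implied_core_ghz(clocks, ns):.3f} GHz implied")
```

Both ratios land close to the nominal frequencies, which is a quick sanity check that the cores were not running throttled or boosted during calibration.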


```
00002   135.47   27176.9
00008   134.97   27063.3
00015   134.41   26825.6
00050   139.83   28419.1
00100   124.28   22616.4
00200   109.40   14139.8
00300   104.56   10275.1
00400   102.02    8120.0
00500   100.38    6751.4
00700    98.30    5124.9
01000    96.56    3852.7
01300    95.65    3149.0
01700    95.06    2585.4
02500    94.43    1988.8
03500    94.16    1621.1
05000    93.95    1343.1
09000    93.65    1052.6
20000    93.43     851.7
```

#### L1/L2/LLC Latency

```
$ sudo /home/testuser/mlc --c2c_latency
Intel(R) Memory Latency Checker - v3.5
Command line parameters: --c2c_latency

Measuring cache-to-cache transfer latency (in ns)...
Local Socket L2->L2 HIT latency 8.8
Local Socket L2->L2 HITM latency 8.8
```

#### **Spectre and Meltdown Checks**

The following section displays the output of a shell script that reports whether the system is vulnerable to the several "speculative execution" CVEs that were made public in 2018. The script is available on the Spectre & Meltdown Checker GitHub<sup>178</sup>.

```
Spectre and Meltdown mitigation detection tool v0.42
Checking for vulnerabilities on current system
Kernel is Linux 4.15.0-51-generic #55-Ubuntu SMP Wed May 15 14:27:21 UTC 2019 x86_64
CPU is Intel(R) Atom(TM) CPU C3858 @ 2.00GHz
Hardware check
* Hardware support (CPU microcode) for mitigation techniques
 * Indirect Branch Restricted Speculation (IBRS)
   * SPEC_CTRL MSR is available: YES
    * CPU indicates IBRS capability: YES (SPEC_CTRL feature bit)
 * Indirect Branch Prediction Barrier (IBPB)
   * PRED_CMD MSR is available: YES
    * CPU indicates IBPB capability: YES (SPEC_CTRL feature bit)
 * Single Thread Indirect Branch Predictors (STIBP)
   * SPEC_CTRL MSR is available: YES
    * CPU indicates STIBP capability: YES (Intel STIBP feature bit)
 * Speculative Store Bypass Disable (SSBD)
   * CPU indicates SSBD capability: YES (Intel SSBD)
 * L1 data cache invalidation
   * FLUSH_CMD MSR is available: NO
    * CPU indicates L1D flush capability: NO
```

<sup>178</sup> https://github.com/speed47/spectre-meltdown-checker

```
* Microarchitecture Data Sampling
   * VERW instruction is available: YES (MD_CLEAR feature bit)
 * Enhanced IBRS (IBRS_ALL)
   * CPU indicates ARCH_CAPABILITIES MSR availability: YES
    * ARCH_CAPABILITIES MSR advertises IBRS_ALL capability: NO
 * CPU explicitly indicates not being vulnerable to Meltdown/L1TF (RDCL_NO): YES
 * CPU explicitly indicates not being vulnerable to Variant 4 (SSB_NO): NO
 * CPU/Hypervisor indicates L1D flushing is not necessary on this system: YES
 * Hypervisor indicates host CPU might be vulnerable to RSB underflow (RSBA): NO
 * CPU explicitly indicates not being vulnerable to Microarchitectural Data Sampling (MDS_NO): YES
 * CPU supports Software Guard Extensions (SGX): NO
 * CPU microcode is known to cause stability problems: NO (model 0x5f family 0x6 stepping 0x1 ucode 0x2e cpuid 0x506f1)
 * CPU microcode is the latest known available version: awk: fatal: cannot open file `bash for reading (No such file or directory)
UNKNOWN (latest microcode version for your CPU model is unknown)
* CPU vulnerability to the speculative execution attack variants
 * Vulnerable to CVE-2017-5753 (Spectre Variant 1, bounds check bypass): YES
 * Vulnerable to CVE-2017-5715 (Spectre Variant 2, branch target injection): YES
 * Vulnerable to CVE-2017-5754 (Variant 3, Meltdown, rogue data cache load): NO
 * Vulnerable to CVE-2018-3640 (Variant 3a, rogue system register read): YES
 * Vulnerable to CVE-2018-3639 (Variant 4, speculative store bypass): YES
 * Vulnerable to CVE-2018-3615 (Foreshadow (SGX), L1 terminal fault): NO
 * Vulnerable to CVE-2018-3620 (Foreshadow-NG (OS), L1 terminal fault): NO
 * Vulnerable to CVE-2018-3646 (Foreshadow-NG (VMM), L1 terminal fault): NO
 * Vulnerable to CVE-2018-12126 (Fallout, microarchitectural store buffer data sampling (MSBDS)):_
 * Vulnerable to CVE-2018-12130 (ZombieLoad, microarchitectural fill buffer data sampling (MFBDS)): NO
 * Vulnerable to CVE-2018-12127 (RIDL, microarchitectural load port data sampling (MLPDS)): NO
 * Vulnerable to CVE-2019-11091 (RIDL, microarchitectural data sampling uncacheable memory (MDSUM)): NO
CVE-2017-5753 aka Spectre Variant 1, bounds check bypass
* Mitigated according to the /sys interface: YES (Mitigation: __user pointer sanitization)
* Kernel has array_index_mask_nospec: YES (1 occurrence(s) found of x86 64 bits array_index_mask_nospec())
* Kernel has the Red Hat/Ubuntu patch: NO
* Kernel has mask_nospec64 (arm64): NO
> STATUS: NOT VULNERABLE (Mitigation: __user pointer sanitization)
CVE-2017-5715 aka Spectre Variant 2, branch target injection
* Mitigated according to the /sys interface: YES (Mitigation: Full generic retpoline, IBPB:_
* Kernel is compiled with IBRS support: YES
   * IBRS enabled and active: YES (for firmware code only)
 * Kernel is compiled with IBPB support: YES
   * IBPB enabled and active: YES
* Mitigation 2
 * Kernel has branch predictor hardening (arm): NO
 * Kernel compiled with retpoline option: YES
   * Kernel compiled with a retpoline-aware compiler: YES (kernel reports full retpoline_
> STATUS: NOT VULNERABLE (Full retpoline + IBPB are mitigating the vulnerability)
CVE-2017-5754 aka Variant 3, Meltdown, rogue data cache load
* Mitigated according to the /sys interface: YES (Not affected)
* Kernel supports Page Table Isolation (PTI): YES
 * PTI enabled and active: UNKNOWN (dmesg truncated, please reboot and relaunch this script)
 * Reduced performance impact of PTI: NO (PCID/INVPCID not supported, performance impact of PTI will be significant)
* Running as a Xen PV DomU: NO
> STATUS: NOT VULNERABLE (your CPU vendor reported your CPU model as not vulnerable)
CVE-2018-3640 aka Variant 3a, rogue system register read
* CPU microcode mitigates the vulnerability: YES
> STATUS: NOT VULNERABLE (your CPU microcode mitigates the vulnerability)
CVE-2018-3639 aka Variant 4, speculative store bypass
* Mitigated according to the /sys interface: YES (Mitigation: Speculative Store Bypass disabled via prctl and seccomp)
* Kernel supports disabling speculative store bypass (SSB): YES (found in /proc/self/status)
* SSB mitigation is enabled and active: YES (per-thread through prctl)
* SSB mitigation currently active for selected processes: YES (systemd-journald systemd-logind systemd-networkd systemd-resolved systemd-timesyncd systemd-udevd)
> STATUS: NOT VULNERABLE (Mitigation: Speculative Store Bypass disabled via prctl and seccomp)
CVE-2018-3615 aka Foreshadow (SGX), L1 terminal fault
* CPU microcode mitigates the vulnerability: N/A
> STATUS: NOT VULNERABLE (your CPU vendor reported your CPU model as not vulnerable)
CVE-2018-3620 aka Foreshadow-NG (OS), L1 terminal fault
* Mitigated according to the /sys interface: YES (Not affected)
* Kernel supports PTE inversion: YES (found in kernel image)
* PTE inversion enabled and active: NO
> STATUS: NOT VULNERABLE (your CPU vendor reported your CPU model as not vulnerable)
CVE-2018-3646 aka Foreshadow-NG (VMM), L1 terminal fault
* Information from the /sys interface: Not affected
* This system is a host running a hypervisor: NO
* Mitigation 1 (KVM)
 * EPT is disabled: NO
* Mitigation 2
 * L1D flush is supported by kernel: YES (found flush_l1d in kernel image)
 * L1D flush enabled: NO
 * Hardware-backed L1D flush supported: NO (flush will be done in software, this is slower)
 * Hyper-Threading (SMT) is enabled: NO
> STATUS: NOT VULNERABLE (your CPU vendor reported your CPU model as not vulnerable)
CVE-2018-12126 aka Fallout, microarchitectural store buffer data sampling (MSBDS)
* Mitigated according to the /sys interface: YES (Not affected)
* Kernel supports using MD_CLEAR mitigation: YES (md_clear found in /proc/cpuinfo)
* Kernel mitigation is enabled and active: NO
* SMT is either mitigated or disabled: NO
> STATUS: NOT VULNERABLE (your CPU vendor reported your CPU model as not vulnerable)
CVE-2018-12130 aka ZombieLoad, microarchitectural fill buffer data sampling (MFBDS)
* Mitigated according to the /sys interface: YES (Not affected)
* Kernel supports using MD_CLEAR mitigation: YES (md_clear found in /proc/cpuinfo)
* Kernel mitigation is enabled and active: NO
* SMT is either mitigated or disabled: NO
> STATUS: NOT VULNERABLE (your CPU vendor reported your CPU model as not vulnerable)
CVE-2018-12127 aka RIDL, microarchitectural load port data sampling (MLPDS)
* Mitigated according to the /sys interface: YES (Not affected)
* Kernel supports using MD_CLEAR mitigation: YES (md_clear found in /proc/cpuinfo)
* Kernel mitigation is enabled and active: NO
* SMT is either mitigated or disabled: NO
> STATUS: NOT VULNERABLE (your CPU vendor reported your CPU model as not vulnerable)
CVE-2019-11091 aka RIDL, microarchitectural data sampling uncacheable memory (MDSUM)
* Mitigated according to the /sys interface: YES (Not affected)
```


#### 3.7.8 Calibration Data - TaiShan

The following sections include sample calibration data measured on the s17-t33-sut1 server running in one of the Cortex-A72 testbeds.

Calibration data obtained from all other servers in the TaiShan testbeds shows the same or similar values.

#### Linux cmdline

```
$ cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-4.15.0-54-generic root=/dev/mapper/huawei--1--vg-root ro isolcpus=1-15,17-31,33-47,49-63 nohz_full=1-15 17-31,33-47,49-63 rcu_nocbs=1-15 17-31,33-47,49-63 intel_iommu=on nmi_watchdog=0 audit=0 nosoftlockup processor.max_cstate=1 console=ttyAMA0,115200n8
```

#### Linux uname

#### **System-level Core Jitter**

```
$ sudo taskset -c 3 /home/testuser/pma_tools/jitter/jitter -i 20
Linux Jitter testing program version 1.9
Iterations=30
The pragram will execute a dummy function 80000 times
Display is updated every 20000 displayUpdate intervals
Thread affinity will be set to core_id:7
Timings are in CPU Core cycles
Inst_Min:    Minimum Excution time during the display update interval(default is ~1 second)
Inst_Max:    Maximum Excution time during the display update interval(default is ~1 second)
Inst_jitter: Jitter in the Excution time during rhe display update interval. This is the value of interest
last_Exec:   The Excution time of last iteration just before the display update
Abs_Min:     Absolute Minimum Excution time since the program started or statistics were reset
Abs_Max:     Absolute Maximum Excution time since the program started or statistics were reset
tmp:         Cumulative value calcualted by the dummy function
Interval:    Time interval between the display updates in Core Cycles
Sample No:   Sample number

Inst_Min  Inst_Max  Inst_jitter  last_Exec  Abs_min  Abs_max         tmp    Interval  Sample No
  160022    172254        12232     160042   160022   172254  1903230976  3204401362          1
  160022    173148        13126     160044   160022   173148   814809088  3204619316          2
```

(continues on next page)

3.7. Test Environment 649

| /          | •    |          | ١.    |
|------------|------|----------|-------|
| (continued | trom | previous | page) |

|                       |        |       |        |        |        | (continued from previ |          |
|-----------------------|--------|-------|--------|--------|--------|-----------------------|----------|
| 160022<br>→3          | 169460 | 9438  | 160044 | 160022 | 173148 | 4021354496 3204391306 | ٦        |
| 160024                | 170270 | 10246 | 160044 | 160022 | 173148 | 2932932608 3204385830 | u        |
| <b>4</b> 160022       | 169660 | 9638  | 160044 | 160022 | 173148 | 1844510720 3204387290 | u        |
| →5<br>160022          | 169410 | 9388  | 160040 | 160022 | 173148 | 756088832 3204375832  | <u>.</u> |
| <b>→</b> 6 160022     | 169012 | 8990  | 160042 | 160022 | 173148 | 3962634240 3204378924 |          |
| →7<br>160022          | 169556 | 9534  | 160044 | 160022 | 173148 | 2874212352 3204374882 |          |
| <b>⇔</b> 8            |        |       |        |        |        |                       | J        |
| 160022<br><b>→</b> 9  | 171684 | 11662 | 160042 | 160022 | 173148 | 1785790464 3204394596 | J        |
| 160022                | 171546 | 11524 | 160024 | 160022 | 173148 | 697368576 3204602774  | _        |
| 160022<br>→11         | 169248 | 9226  | 160042 | 160022 | 173148 | 3903913984 3204401676 | u        |
| 160022                | 168458 | 8436  | 160042 | 160022 | 173148 | 2815492096 3204256350 | <u>.</u> |
| 160022                | 169574 | 9552  | 160044 | 160022 | 173148 | 1727070208 3204278116 | L.       |
| →13<br>160022         | 169352 | 9330  | 160044 | 160022 | 173148 | 638648320 3204327234  | u        |
| →14<br>160022         | 169100 | 9078  | 160044 | 160022 | 173148 | 3845193728 3204388132 |          |
| →15<br>160022         | 169338 | 9316  | 160042 | 160022 | 173148 | 2756771840 3204380724 | L L      |
| <b>⇔</b> 16           |        |       |        |        |        |                       |          |
| 160022<br><b>→</b> 17 | 170828 | 10806 | 160046 | 160022 | 173148 | 1668349952 3204430452 | ü        |
| 160022<br>→18         | 173162 | 13140 | 160026 | 160022 | 173162 | 579928064 3204611318  | u u      |
| 160022<br>→19         | 170482 | 10460 | 160042 | 160022 | 173162 | 3786473472 3204389896 | _        |
| 160024<br>→20         | 170704 | 10680 | 160044 | 160022 | 173162 | 2698051584 3204422126 | u        |
| 160024                | 169302 | 9278  | 160044 | 160022 | 173162 | 1609629696 3204397334 | u        |
| →21<br>160022         | 171848 | 11826 | 160044 | 160022 | 173162 | 521207808 3204389818  | u u      |
| →22<br>160022         | 169438 | 9416  | 160042 | 160022 | 173162 | 3727753216 3204395382 |          |
| →23<br>160022         | 169312 | 9290  | 160042 | 160022 | 173162 | 2639331328 3204371202 | _        |
| <b>⇔</b> 24 160022    | 171368 | 11346 | 160044 | 160022 | 173162 | 1550909440 3204440464 |          |
| <b>⇒</b> 25           |        |       |        |        |        |                       | <u>.</u> |
| 160022<br>→26         | 171998 | 11976 | 160042 | 160022 | 173162 | 462487552 3204609440  | u        |
| 160022<br><b>⇔</b> 27 | 169740 | 9718  | 160046 | 160022 | 173162 | 3669032960 3204405826 | u        |
| 160022<br>→28         | 169610 | 9588  | 160044 | 160022 | 173162 | 2580611072 3204390608 | u u      |
| 160022                | 169254 | 9232  | 160044 | 160022 | 173162 | 1492189184 3204399760 | <u>.</u> |
| →29<br>160022<br>→30  | 169386 | 9364  | 160046 | 160022 | 173162 | 403767296 3204417762  | ı        |
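In the samples above, Inst_jitter is the difference between Inst_Max and Inst_Min for each display interval. The following is a minimal Python sketch that checks this relation and reports the worst-case jitter. It assumes each sample is a whitespace-separated row of nine integers in the column order of the header; `JitterSample` and `parse_samples` are illustrative names, not part of the jitter tool.

```python
from typing import List, NamedTuple


class JitterSample(NamedTuple):
    """One display-interval row of the jitter tool output (illustrative type)."""
    inst_min: int
    inst_max: int
    inst_jitter: int
    last_exec: int
    abs_min: int
    abs_max: int
    tmp: int
    interval: int
    sample_no: int


def parse_samples(text: str) -> List[JitterSample]:
    """Parse sample rows; lines that are not exactly nine integers are skipped."""
    samples = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 9 and all(field.isdigit() for field in fields):
            samples.append(JitterSample(*map(int, fields)))
    return samples


# First three rows of the output above.
ROWS = """
160022 172254 12232 160042 160022 172254 1903230976 3204401362 1
160022 173148 13126 160044 160022 173148  814809088 3204619316 2
160022 169460  9438 160044 160022 173148 4021354496 3204391306 3
"""

samples = parse_samples(ROWS)
for sample in samples:
    # Inst_jitter is the spread between the slowest and fastest execution.
    assert sample.inst_jitter == sample.inst_max - sample.inst_min

worst = max(sample.inst_jitter for sample in samples)
print(f"worst-case jitter over {len(samples)} samples: {worst} core cycles")
```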

#### **Spectre and Meltdown Checks**

The following section displays the output of a shell script that reports whether the system is vulnerable to the several "speculative execution" CVEs that were made public in 2018. The script is available on Spectre & Meltdown Checker Github<sup>179</sup>.

```
Spectre and Meltdown mitigation detection tool v0.43
awk: cannot open bash (No such file or directory)
Checking for vulnerabilities on current system
Kernel is Linux 4.15.0-23-generic #25-Ubuntu SMP Wed May 23 18:02:16 UTC 2018 x86_64
CPU is Intel(R) Xeon(R) Platinum 8180 CPU @ 2.50GHz
* Hardware support (CPU microcode) for mitigation techniques
  * Indirect Branch Restricted Speculation (IBRS)
    * SPEC_CTRL MSR is available: YES
    * CPU indicates IBRS capability: YES (SPEC_CTRL feature bit)
  * Indirect Branch Prediction Barrier (IBPB)
    * PRED_CMD MSR is available: YES
    * CPU indicates IBPB capability: YES (SPEC_CTRL feature bit)
  * Single Thread Indirect Branch Predictors (STIBP)
    * SPEC_CTRL MSR is available: YES
    * CPU indicates STIBP capability: YES (Intel STIBP feature bit)
  * Speculative Store Bypass Disable (SSBD)
    * CPU indicates SSBD capability: NO
  * L1 data cache invalidation
    * FLUSH_CMD MSR is available: NO
    * CPU indicates L1D flush capability: NO
  * Microarchitectural Data Sampling
    * VERW instruction is available: NO
  * Enhanced IBRS (IBRS_ALL)
    * CPU indicates ARCH_CAPABILITIES MSR availability: NO
    * ARCH_CAPABILITIES MSR advertises IBRS_ALL capability: NO
  * CPU explicitly indicates not being vulnerable to Meltdown/L1TF (RDCL_NO): NO
  * CPU explicitly indicates not being vulnerable to Variant 4 (SSB_NO): NO
  * CPU/Hypervisor indicates L1D flushing is not necessary on this system: NO
  * Hypervisor indicates host CPU might be vulnerable to RSB underflow (RSBA): NO
  * CPU explicitly indicates not being vulnerable to Microarchitectural Data Sampling (MDS_NO): NO
  * CPU explicitly indicates not being vulnerable to TSX Asynchronous Abort (TAA_NO): NO
  * CPU explicitly indicates not being vulnerable to iTLB Multihit (PSCHANGE_MSC_NO): NO
  * CPU explicitly indicates having MSR for TSX control (TSX_CTRL_MSR): NO
  * CPU supports Transactional Synchronization Extensions (TSX): YES (RTM feature bit)
  * CPU supports Software Guard Extensions (SGX): NO
  * CPU microcode is known to cause stability problems: NO (model 0x55 family 0x6 stepping 0x4 ucode 0x2000043 cpuid 0x50654)
  * CPU microcode is the latest known available version: awk: cannot open bash (No such file or directory)
UNKNOWN (latest microcode version for your CPU model is unknown)
* CPU vulnerability to the speculative execution attack variants
  * Vulnerable to CVE-2017-5753 (Spectre Variant 1, bounds check bypass): YES
  * Vulnerable to CVE-2017-5715 (Spectre Variant 2, branch target injection): YES
  * Vulnerable to CVE-2017-5754 (Variant 3, Meltdown, rogue data cache load): YES
  * Vulnerable to CVE-2018-3640 (Variant 3a, rogue system register read): YES
  * Vulnerable to CVE-2018-3639 (Variant 4, speculative store bypass): YES
  * Vulnerable to CVE-2018-3615 (Foreshadow (SGX), L1 terminal fault): NO
  * Vulnerable to CVE-2018-3620 (Foreshadow-NG (OS), L1 terminal fault): YES
  * Vulnerable to CVE-2018-3646 (Foreshadow-NG (VMM), L1 terminal fault): YES
  * Vulnerable to CVE-2018-12126 (Fallout, microarchitectural store buffer data sampling (MSBDS)): YES
  * Vulnerable to CVE-2018-12130 (ZombieLoad, microarchitectural fill buffer data sampling (MFBDS)): YES
```

<sup>179</sup> https://github.com/speed47/spectre-meltdown-checker

```
  * Vulnerable to CVE-2018-12127 (RIDL, microarchitectural load port data sampling (MLPDS)): YES
  * Vulnerable to CVE-2019-11091 (RIDL, microarchitectural data sampling uncacheable memory (MDSUM)): YES
  * Vulnerable to CVE-2019-11135 (ZombieLoad V2, TSX Asynchronous Abort (TAA)): YES
  * Vulnerable to CVE-2018-12207 (No eXcuses, iTLB Multihit, machine check exception on page size changes (MCEPSC)): YES
CVE-2017-5753 aka Spectre Variant 1, bounds check bypass
* Mitigated according to the /sys interface: YES (Mitigation: __user pointer sanitization)
* Kernel has array_index_mask_nospec: YES (1 occurrence(s) found of x86 64 bits array_index_mask_nospec())
* Kernel has the Red Hat/Ubuntu patch: NO
* Kernel has mask_nospec64 (arm64): NO
> STATUS: NOT VULNERABLE (Mitigation: __user pointer sanitization)
CVE-2017-5715 aka Spectre Variant 2, branch target injection
* Mitigated according to the /sys interface: YES (Mitigation: Full generic retpoline, IBPB, IBRS_FW)
* Mitigation 1
  * Kernel is compiled with IBRS support: YES
    * IBRS enabled and active: YES (for firmware code only)
  * Kernel is compiled with IBPB support: YES
    * IBPB enabled and active: YES
* Mitigation 2
  * Kernel has branch predictor hardening (arm): NO
  * Kernel compiled with retpoline option: YES
    * Kernel compiled with a retpoline-aware compiler: YES (kernel reports full retpoline compilation)
  * Kernel supports RSB filling: YES
> STATUS: NOT VULNERABLE (Full retpoline + IBPB are mitigating the vulnerability)
CVE-2017-5754 aka Variant 3, Meltdown, rogue data cache load
* Mitigated according to the /sys interface: YES (Mitigation: PTI)
* Kernel supports Page Table Isolation (PTI): YES
  * PTI enabled and active: YES
  * Reduced performance impact of PTI: YES (CPU supports INVPCID, performance impact of PTI will be greatly reduced)
* Running as a Xen PV DomU: NO
> STATUS: NOT VULNERABLE (Mitigation: PTI)
CVE-2018-3640 aka Variant 3a, rogue system register read
* CPU microcode mitigates the vulnerability: NO
> STATUS: VULNERABLE (an up-to-date CPU microcode is needed to mitigate this vulnerability)
CVE-2018-3639 aka Variant 4, speculative store bypass
* Mitigated according to the /sys interface: NO (Vulnerable)
* Kernel supports disabling speculative store bypass (SSB): YES (found in /proc/self/status)
* SSB mitigation is enabled and active: NO
> STATUS: VULNERABLE (Your CPU doesnt support SSBD)
CVE-2018-3615 aka Foreshadow (SGX), L1 terminal fault
* CPU microcode mitigates the vulnerability: N/A
> STATUS: NOT VULNERABLE (your CPU vendor reported your CPU model as not vulnerable)
CVE-2018-3620 aka Foreshadow-NG (OS), L1 terminal fault
* Kernel supports PTE inversion: NO
* PTE inversion enabled and active: UNKNOWN (sysfs interface not available)
> STATUS: VULNERABLE (Your kernel doesnt support PTE inversion, update it)
CVE-2018-3646 aka Foreshadow-NG (VMM), L1 terminal fault
* This system is a host running a hypervisor: NO
* Mitigation 1 (KVM)
* EPT is disabled: NO
* Mitigation 2
 * L1D flush is supported by kernel: NO
 * L1D flush enabled: UNKNOWN (cant find or read /sys/devices/system/cpu/vulnerabilities/l1tf)
 * Hardware-backed L1D flush supported: NO (flush will be done in software, this is slower)
 * Hyper-Threading (SMT) is enabled: YES
> STATUS: NOT VULNERABLE (this system is not running a hypervisor)
CVE-2018-12126 aka Fallout, microarchitectural store buffer data sampling (MSBDS)
* Kernel supports using MD_CLEAR mitigation: NO
> STATUS: VULNERABLE (Neither your kernel or your microcode support mitigation, upgrade both to mitigate the vulnerability)
CVE-2018-12130 aka ZombieLoad, microarchitectural fill buffer data sampling (MFBDS)
* Kernel supports using MD_CLEAR mitigation: NO
> STATUS: VULNERABLE (Neither your kernel or your microcode support mitigation, upgrade both to mitigate the vulnerability)
CVE-2018-12127 aka RIDL, microarchitectural load port data sampling (MLPDS)
* Kernel supports using MD_CLEAR mitigation: NO
> STATUS: VULNERABLE (Neither your kernel or your microcode support mitigation, upgrade both to mitigate the vulnerability)
CVE-2019-11091 aka RIDL, microarchitectural data sampling uncacheable memory (MDSUM)
* Kernel supports using MD_CLEAR mitigation: NO
> STATUS: VULNERABLE (Neither your kernel or your microcode support mitigation, upgrade both to mitigate the vulnerability)
CVE-2019-11135 aka ZombieLoad V2, TSX Asynchronous Abort (TAA)
* TAA mitigation is supported by kernel: NO
* TAA mitigation enabled and active: NO (tsx_async_abort not found in sysfs hierarchy)
> STATUS: VULNERABLE (Your kernel doesnt support TAA mitigation, update it)
CVE-2018-12207 aka No eXcuses, iTLB Multihit, machine check exception on page size changes (MCEPSC)
* This system is a host running a hypervisor: NO
* iTLB Multihit mitigation is supported by kernel: NO
* iTLB Multihit mitigation enabled and active: NO (itlb_multihit not found in sysfs hierarchy)
> STATUS: NOT VULNERABLE (this system is not running a hypervisor)
> SUMMARY: CVE-2017-5753:OK CVE-2017-5715:OK CVE-2017-5754:OK CVE-2018-3640:KO CVE-2018-3639:KO CVE-2018-3615:OK CVE-2018-3620:KO CVE-2018-3646:OK CVE-2018-12126:KO CVE-2018-12130:KO CVE-2018-12127:KO CVE-2019-11091:KO CVE-2019-11135:KO CVE-2018-12207:OK
```
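The SUMMARY line at the end of the checker output is convenient for automated comparison between testbeds. The following is a minimal Python sketch, assuming the `CVE-id:OK|KO` token format seen above; `parse_summary` is an illustrative helper and not part of the checker script.

```python
import re
from typing import Dict


def parse_summary(line: str) -> Dict[str, bool]:
    """Map each CVE id to True when the checker reports OK and False for KO."""
    return {
        cve: status == "OK"
        for cve, status in re.findall(r"(CVE-\d{4}-\d+):(OK|KO)", line)
    }


# Summary line taken from the output above.
SUMMARY = (
    "> SUMMARY: CVE-2017-5753:OK CVE-2017-5715:OK CVE-2017-5754:OK "
    "CVE-2018-3640:KO CVE-2018-3639:KO CVE-2018-3615:OK CVE-2018-3620:KO "
    "CVE-2018-3646:OK CVE-2018-12126:KO CVE-2018-12130:KO CVE-2018-12127:KO "
    "CVE-2019-11091:KO CVE-2019-11135:KO CVE-2018-12207:OK"
)

status = parse_summary(SUMMARY)
not_mitigated = [cve for cve, ok in status.items() if not ok]
print(f"{len(not_mitigated)} of {len(status)} checks report KO:")
for cve in not_mitigated:
    print(f"  {cve}")
```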

# 3.7.9 SUT Settings - Linux

System provisioning is done by a combination of PXE boot unattended install and Ansible<sup>180</sup>, as described in CSIT Testbed Setup<sup>181</sup>.

Below is a subset of the running configuration:

#### 1. Ubuntu 18.04.x LTS

```
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 18.04.3 LTS
Release: 18.04
Codename: bionic
```


<sup>180</sup> https://www.ansible.com

<sup>181</sup> https://git.fd.io/csit/tree/resources/tools/testbed-setup/README.md?h=rls2001

#### **Linux Boot Parameters**

- isolcpus=<cpu number>-<cpu number> Used for all CPU cores apart from the first core of each socket. The isolated cores run VPP worker threads and Qemu/LXC processes. https://www.kernel.org/doc/Documentation/admin-guide/kernel-parameters.txt
- intel\_pstate=disable [X86] Do not enable intel\_pstate as the default scaling driver for the supported processors. The Intel P-State driver decides which P-state (CPU core power state) to use based on the policy requested from the cpufreq core. [X86: either 32-bit or 64-bit x86] https://www.kernel.org/doc/Documentation/cpu-freq/intel-pstate.txt
- nohz\_full=<cpu number>-<cpu number> [KNL,BOOT] In kernels built with CONFIG\_NO\_HZ\_FULL=y, set the specified list of CPUs whose tick will be stopped whenever possible. The boot CPU will be forced outside the range to maintain the timekeeping. The CPUs in this range must also be included in the rcu\_nocbs= set. Specifies the adaptive-ticks CPU cores, causing the kernel to avoid sending scheduling-clock interrupts to the listed cores as long as they have a single runnable task. [KNL: is a kernel start-up parameter; SMP: the kernel is an SMP kernel]. https://www.kernel.org/doc/Documentation/timers/NO\_HZ.txt
- rcu\_nocbs [KNL] In kernels built with CONFIG\_RCU\_NOCB\_CPU=y, set the specified list of CPUs to be no-callback CPUs, which never queue RCU (read-copy update) callbacks. https://www.kernel.org/doc/Documentation/admin-guide/kernel-parameters.txt
- numa\_balancing=disable [KNL,X86] Disable automatic NUMA balancing.
- intel\_iommu=enable [DMAR] Enable Intel IOMMU driver (DMAR) option.
- iommu=on, iommu=pt [x86, IA-64] Disable IOMMU bypass, using IOMMU for PCI devices.
- nmi\_watchdog=0 [KNL,BUGS=X86] Debugging features for SMP kernels. Turn off the hardlockup detector in nmi\_watchdog.
- nosoftlockup [KNL] Disable the soft-lockup detector.
- tsc=reliable Disable clocksource stability checks for TSC. [x86] reliable: mark the tsc clocksource as reliable; this disables clocksource verification at runtime, as well as the stability checks done at bootup. Used to enable high-resolution timer mode on older hardware and in virtualized environments.
- hpet=disable [X86-32,HPET] Disable HPET and use PIT instead.
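The constraint that every nohz\_full CPU must also appear in the rcu\_nocbs set can be checked mechanically. The following is a minimal Python sketch under stated assumptions: the command line string is a hypothetical example (on a live system it would come from /proc/cmdline), CPU lists use only the plain `a-b,c` form, and `parse_cpu_list`, `parse_cmdline` and `missing_from_rcu_nocbs` are illustrative helper names.

```python
from typing import Dict, Set


def parse_cpu_list(spec: str) -> Set[int]:
    """Expand a kernel CPU list such as '1-15,17-31' into a set of CPU ids."""
    cpus: Set[int] = set()
    for part in spec.split(","):
        if "-" in part:
            first, last = part.split("-")
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus


def parse_cmdline(cmdline: str) -> Dict[str, str]:
    """Split a kernel command line into key/value pairs; flags map to ''."""
    params: Dict[str, str] = {}
    for token in cmdline.split():
        key, _, value = token.partition("=")
        params[key] = value
    return params


def missing_from_rcu_nocbs(params: Dict[str, str]) -> Set[int]:
    """Return the nohz_full CPUs that are not covered by rcu_nocbs."""
    nohz_full = parse_cpu_list(params["nohz_full"])
    rcu_nocbs = parse_cpu_list(params["rcu_nocbs"])
    return nohz_full - rcu_nocbs


# Hypothetical command line, used for illustration only.
CMDLINE = (
    "isolcpus=1-15,17-31 nohz_full=1-15,17-31 rcu_nocbs=1-15,17-31 "
    "intel_pstate=disable nmi_watchdog=0 nosoftlockup"
)

params = parse_cmdline(CMDLINE)
missing = missing_from_rcu_nocbs(params)
print("nohz_full is covered by rcu_nocbs" if not missing else f"missing: {sorted(missing)}")
```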

### **Hugepages Configuration**

Huge pages are managed via sysctl configuration located in /etc/sysctl.d/90-csit.conf on each testbed. The default huge page size is 2M. The exact number of huge pages depends on the testbed. All values are defined in the Ansible inventory hosts files.
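As a rough illustration of how the page count translates into reserved memory, the sketch below parses a sysctl-style fragment and multiplies the page count by the 2M page size. The `vm.nr_hugepages` key is the standard Linux sysctl for the huge page pool, but its use in 90-csit.conf and the value 4096 are assumptions made for this example; the actual per-testbed values live in the Ansible inventory.

```python
from typing import Dict


def parse_sysctl(text: str) -> Dict[str, str]:
    """Parse 'key = value' lines of a sysctl.d fragment, ignoring comments."""
    settings: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if "=" in line:
            key, value = line.split("=", 1)
            settings[key.strip()] = value.strip()
    return settings


def hugepage_memory_mib(nr_hugepages: int, page_size_kib: int = 2048) -> int:
    """Memory reserved by the huge page pool, in MiB (2M pages by default)."""
    return nr_hugepages * page_size_kib // 1024


# Hypothetical fragment; the real value depends on the testbed.
EXAMPLE_CONF = """
# number of 2M huge pages (illustrative value)
vm.nr_hugepages=4096
"""

settings = parse_sysctl(EXAMPLE_CONF)
reserved = hugepage_memory_mib(int(settings["vm.nr_hugepages"]))
print(f"{settings['vm.nr_hugepages']} huge pages reserve {reserved} MiB")
```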

### 3.7.10 DUT Settings - DPDK

#### **DPDK Version**

DPDK-19.08

#### **DPDK Compile Parameters**

make install T=<arch>-native-linuxapp-gcc -j

#### **Testpmd Startup Configuration**

Testpmd startup configuration changes per test case with different settings for \$\$CORES, \$\$RXQ and max-pkt-len parameter if test is sending jumbo frames. Startup command template:

```
testpmd -c $$CORE_MASK -n 4 -- --numa --nb-ports=2 --portmask=0x3 --nb-cores=$$CORES [--max-pkt-len=9000] --txqflags=0 --forward-mode=io --rxq=$$RXQ --txq=$$TXQ --burst=64 --rxd=1024 --txd=1024 --disable-link-check --auto-start
```

#### **L3FWD Startup Configuration**

L3FWD startup configuration changes per test case with different settings for \$\$CORES and enable-jumbo parameter if test is sending jumbo frames. Startup command template:

```
l3fwd -l $$CORE_LIST -n 4 -- -P -L -p 0x3 --config='${port_config}' [--enable-jumbo --max-pkt-len=9000] --eth-dest=0,${adj_mac0} --eth-dest=1,${adj_mac1} --parse-ptype
```
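The l3fwd `--config` option takes a comma-separated list of `(port,queue,lcore)` tuples. The following is a minimal Python sketch of how a `${port_config}` value could be assembled. The one-lcore-per-queue assignment is an assumption made for illustration (the report does not state how CSIT distributes queues across lcores), and `l3fwd_port_config` is a hypothetical helper.

```python
from typing import List


def l3fwd_port_config(ports: List[int], queues_per_port: int,
                      lcores: List[int]) -> str:
    """Assign one lcore to each (port, queue) pair, in order."""
    needed = len(ports) * queues_per_port
    if len(lcores) < needed:
        raise ValueError(f"need {needed} lcores, got {len(lcores)}")
    entries = []
    lcore_iter = iter(lcores)
    for port in ports:
        for queue in range(queues_per_port):
            entries.append(f"({port},{queue},{next(lcore_iter)})")
    return ",".join(entries)


# Two ports, one RX queue each, served by lcores 1 and 2.
print(l3fwd_port_config(ports=[0, 1], queues_per_port=1, lcores=[1, 2]))
```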

# 3.7.11 TG Settings - TRex

#### **TG Version**

TRex v2.73

#### **DPDK Version**

**DPDK v19.05** 

#### **TG Build Script Used**

TRex installation<sup>182</sup>

### **TG Startup Configuration**

### **TG Startup Command**

```
$ sh -c 'cd <t-rex-install-dir>/scripts/ && sudo nohup ./t-rex-64 -i -c 7 --prefix $(hostname) --hdrh > /tmp/trex.log 2>&1 &' > /dev/null
```

#### **TG API Driver**

TRex driver<sup>183</sup>


<sup>182</sup> https://git.fd.io/csit/tree/resources/tools/trex/trex\_installer.sh?h=rls2001

<sup>183</sup> https://git.fd.io/csit/tree/resources/tools/trex/trex\_stateless\_profile.py?h=rls2001

# 3.8 Documentation

CSIT DPDK Performance Tests Documentation<sup>184</sup> contains detailed functional description and input parameters for each test case.

<sup>184</sup> https://docs.fd.io/csit/rls2001/doc/tests.dpdk.perf.html

**CHAPTER** 

**FOUR** 

# **VPP DEVICE**

# 4.1 Overview

# 4.1.1 Virtual Topologies

CSIT VPP Device tests are executed in physical containerized topologies created on demand using a set of scripts hosted and developed in the CSIT repository. The tests run on physical bare-metal servers hosted by the LF FD.io project. Based on the packet path through SUT containers, three distinct logical topology types are used for VPP DUT data plane testing:

- 1. vfNIC-to-vfNIC switching topologies.
- 2. vfNIC-to-vhost-user switching topologies.
- 3. vfNIC-to-memif switching topologies.

# vfNIC-to-vfNIC Switching

The simplest physical topology for a software data plane application like VPP is vfNIC-to-vfNIC switching. Tested virtual topologies for 2-Node testbeds are shown in the figures below.



SUT1 is a Docker container (running Ubuntu, depending on the test suite), and TG is a Traffic Generator (running in an Ubuntu container). SUTs run the VPP software application in Linux user mode as a Device Under Test (DUT) within the container. TG runs the Scapy software application as a packet Traffic Generator. Network connectivity between SUTs and to TG is provided using virtual functions of physical NICs.

Virtual topologies are created on demand whenever a verification job is started (e.g. triggered by a gerrit patch submission) and destroyed upon completion of all functional tests. Each node is a container running on a physical server. During test execution, all nodes are reachable through the Management network (not shown above for clarity).

### vfNIC-to-vhost-user Switching

vfNIC-to-vhost-user switching topology test cases require VPP DUT to communicate with Virtual Machine (VM) over Vhost-user virtual interfaces. VM is created on SUT1 for the duration of these particular test cases only. Virtual test topology with VM is shown in the figure below.



# vfNIC-to-memif Switching

vfNIC-to-memif switching topology test cases require VPP DUT to communicate with another Docker container over memif interfaces. The container is created for the duration of these particular test cases only and runs the same VPP version as the DUT. The virtual test topology with memif is shown in the figure below.




# **4.1.2 Functional Tests Coverage**

CSIT-2001 includes following VPP functionality tested in VPP Device environment:

| Functionality        | Description                                                                                          |
|----------------------|------------------------------------------------------------------------------------------------------|
| ACL                  | Ingress Access Control List security for L2 Bridge-Domain MAC switching, IPv4 routing, IPv6 routing. |
| COP                  | COP address white-list and black-list filtering for IPv4 and IPv6 routing.                           |
| IPSec                | IPSec tunnel and transport modes.                                                                    |
| IPv4                 | IPv4 routing, ICMPv4.                                                                                |
| IPv6                 | IPv6 routing, ICMPv6.                                                                                |
| L2BD                 | L2 Bridge-Domain switching for untagged Ethernet.                                                    |
| L2XC                 | L2 Cross-Connect switching for untagged Ethernet.                                                    |
| Memif Interface      | Baseline VPP memif interface tests.                                                                  |
| QoS Policer Metering | Ingress packet rate metering and marking for IPv4, IPv6.                                             |
| Tap Interface        | Baseline Linux tap interface tests.                                                                  |
| VLAN Tag             | L2 VLAN subinterfaces.                                                                               |
| Vhost-user Interface | Baseline VPP vhost-user interface tests.                                                             |
| VXLAN                | VXLAN overlay tunneling for L2-over-IPv4 and -over-IPv6.                                             |

# 4.1.3 Tests Naming

CSIT-2001 follows a common structured naming convention for all performance and system functional tests, introduced in CSIT-17.01.

The naming should be intuitive for the majority of the tests. A complete description of the CSIT test naming convention is provided in *Test Naming* (page 673).

# 4.2 Release Notes

# 4.2.1 Changes in CSIT-2001

- 1. TEST FRAMEWORK
  - Bug fixes.
  - ARM platform compatibility.
- 2. TEST COVERAGE
  - Increased test coverage: Dot1q, IPsec, 802.1ad VXLAN, COP whitelist, COP blacklist, QoS Policer Metering, iACL whitelist, AVF driver, TAP Interface.
  - Align vpp\_device L2 Robot Keywords with performance L2 Robot Keywords.

## 4.2.2 Known Issues

List of known issues in CSIT-2001 for VPP functional tests in VPP Device:

| # | JiraID | Issue Description |
|---|--------|-------------------|
| 1 |        |                   |

# 4.3 Integration Tests

#### 4.3.1 Abstract

FD.io VPP software data plane technology has become very popular across a wide range of VPP ecosystem use cases, putting higher pressure on continuous verification of VPP software quality.

This document describes a proposal for design and implementation of extended continuous VPP testing by extending existing test environments. Furthermore it describes and summarizes implementation details of Integration and System tests platform 1-Node VPP\_Device. It aims to provide a complete end-to-end view of 1-Node VPP\_Device environment in order to improve extendability and maintenance, under the guideline of VPP core team.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 8174<sup>185</sup>.


<sup>185</sup> https://tools.ietf.org/html/rfc8174.html

### 4.3.2 Overview



# 4.3.3 Physical Testbeds

All FD.io CSIT vpp-device tests are executed on physical testbeds built with bare-metal servers hosted by LF FD.io project. Two 1-node testbed topologies are used:

- 2-Container Topology: Consisting of one Docker container acting as SUT (System Under Test) and one Docker container as TG (Traffic Generator), both connected in ring topology via physical NIC cross-connecting.

Current FD.io production testbeds are built with servers based on one processor generation of Intel Xeons: Skylake (Platinum 8180). Testbeds built with servers based on Arm processors are in the process of being added to FD.io production.

The following section describes the existing production 1n-skx testbed.

### 1-Node Xeon Skylake (1n-skx)

1n-skx testbed is based on single SuperMicro SYS-7049GP-TRT server equipped with two Intel Xeon Skylake Platinum 8180 2.5 GHz 28 core processors. Physical testbed topology is depicted in a figure below.



Server is populated with the following NIC models:

- 1. NIC-1: x710-da4 4p10GE Intel.
- 2. NIC-2: x710-da4 4p10GE Intel.

All Intel Xeon Skylake servers run with Intel Hyper-Threading enabled, doubling the number of logical cores exposed to Linux, with 56 logical cores and 28 physical cores per processor socket.

NIC interfaces are shared using Linux vfio\_pci and VPP VF drivers:

- DPDK VF driver,
- Fortville AVF driver.

Provided Intel x710-da4 4p10GE NICs support 32 VFs per interface, 128 per NIC.
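The per-NIC VF count and the logical core count quoted above follow from simple multiplication. The short Python sketch below reproduces the arithmetic; the per-server totals are derived here from the stated figures and are not quoted from the report.

```python
# Figures stated in the text above.
VFS_PER_INTERFACE = 32          # VFs supported per x710 port
PORTS_PER_NIC = 4               # x710-da4 is a 4-port 10GE NIC
NICS_PER_SERVER = 2             # NIC-1 and NIC-2
PHYSICAL_CORES_PER_SOCKET = 28  # Xeon Platinum 8180
THREADS_PER_CORE = 2            # Hyper-Threading enabled
SOCKETS_PER_SERVER = 2

vfs_per_nic = VFS_PER_INTERFACE * PORTS_PER_NIC
vfs_per_server = vfs_per_nic * NICS_PER_SERVER          # derived value
logical_cores_per_socket = PHYSICAL_CORES_PER_SOCKET * THREADS_PER_CORE
logical_cores_per_server = logical_cores_per_socket * SOCKETS_PER_SERVER  # derived value

print(f"VFs per NIC: {vfs_per_nic}, VFs per server: {vfs_per_server}")
print(f"logical cores per socket: {logical_cores_per_socket}, "
      f"per server: {logical_cores_per_server}")
```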

Complete 1n-skx testbed specification is available on the CSIT LF Testbeds<sup>186</sup> wiki page.

Total of two 1n-skx testbeds are in operation in FD.io labs.

#### 1-Node Virtualbox (1n-vbox)

The 1n-skx testbed can run in a single VirtualBox VM. This solution replaces the previously used Vagrant environment based on 3 VMs.

The VirtualBox VM MAY be created by Vagrant and MUST have 4 additional virtio NICs, with each pair attached to separate private networks to simulate back-to-back connections. The NICs SHOULD use the 82545EM device model (otherwise the model can be changed in the bootstrap scripts). Example of Vagrant configuration:

```
Vagrant.configure(2) do |c|
  c.vm.network "private_network", type: "dhcp", auto_config: false,
      virtualbox__intnet: "port1", nic_type: "82545EM"
  c.vm.network "private_network", type: "dhcp", auto_config: false,
      virtualbox__intnet: "port2", nic_type: "82545EM"

  c.vm.provider :virtualbox do |v|
    v.customize ["modifyvm", :id, "--nicpromisc2", "allow-all"]
    v.customize ["modifyvm", :id, "--nicpromisc3", "allow-all"]
    v.customize ["modifyvm", :id, "--nicpromisc4", "allow-all"]
    v.customize ["modifyvm", :id, "--nicpromisc5", "allow-all"]
  end
end
```

<sup>186</sup> https://wiki.fd.io/view/CSIT/Testbeds:\_Xeon\_Skx,\_Arm,\_Atom.

Vagrant VM is populated with the following NIC models:

- 1. NIC-1: 82545EM Intel.
- 2. NIC-2: 82545EM Intel.
- 3. NIC-3: 82545EM Intel.
- 4. NIC-4: 82545EM Intel.

#### 4.3.4 Containers

It was agreed on the TWS (Technical Work Stream) call to continue with Ubuntu 18.04 LTS as the baseline system, with an OPTIONAL extension to CentOS 7 and SuSE on demand [TWSLink].

All DCR (Docker container) images are REQUIRED to be hosted on Docker registry available from LF network, publicly available and trackable. For backup, tracking and contributing purposes all Dockerfiles (including files needed for building container) MUST be available and stored in [fdiocsitgerrit] repository under appropriate folders. This allows the peer review process to be done for every change of infrastructure related to scope of this document. Currently only **csit-shim-dcr** and **csit-sut-dcr** containers will be stored and maintained under CSIT repository by CSIT contributors.

At the time of designing the solution described in this document, the interconnection between [dockerhub] and [fdiocsitgerrit] for automated build purposes and image hosting cannot be established in a way that is trusted and respects the security of the FD.io project. Unless this is addressed, DCR images will be placed in the custom registry service [fdioregistry]. Automated Jenkins jobs will be created in line with the long-term solution for container lifecycle and the ability to build new versions of Docker images.

In parallel, an effort has started to find an outsourced Docker registry service.

#### Versioning

As of the initial version of vpp-device, only a single latest version of the Docker image is hosted on [dockerhub]. This will be addressed as a further improvement with proper semantic versioning.

### jenkins-slave-dcr

This DCR acts as the Jenkins slave (also known as the Jenkins minion). It connects over the SSH protocol to TCP port 6022 of **csit-shim-dcr** and executes the non-interactive reservation script. Nomad is responsible for scheduling this container execution onto a specific **1-Node VPP\_Device** testbed. It executes the CSIT environment including the CSIT framework.

All software dependencies, including VPP/DPDK, that are not present in the **csit-sut-dcr** container image and/or need to be compiled prior to running on **csit-sut-dcr** SHOULD be compiled in this container.

- Container Image Location: Docker image at snergster/vpp-ubuntu18.
- Container Definition: Docker file specified at [JenkinsSlaveDcrFile].
- Initializing: Container is initialized from within Consul by HashiCorp and Nomad by HashiCorp.

#### csit-shim-dcr

This DCR acts as an intermediate layer running the script responsible for orchestrating topologies under test and for reservation. It is responsible for managing VF resources and allocating them to DUT (Device Under Test) and TG (Traffic Generator) containers. This MUST be done on **csit-shim-dcr**. This image also acts as the generic reservation arbiter, making sure that only Y simulations are spawned on any given HW node.

- Container Image Location: Docker image at snergster/csit-shim.
- Container Definition: Docker file specified at [CsitShimDcrFile].
- Initializing: Container is initialized from within Consul by HashiCorp and Nomad by HashiCorp. Required docker parameters, to be able to run nested containers with VF reservation system are: privileged, net=host, pid=host.
- Connectivity: Over SSH only, using <host>:6022 format. Currently using root user account as primary. The Jenkins slave is able to connect via an environment variable, since the Jenkins slave does not actually know which host it is running on.

```
ssh -p 6022 root@10.30.51.<node>
```

#### csit-sut-dcr

This DCR acts as an SUT (System Under Test). Any DUT or TG application is installed there. It is RECOMMENDED to install DUT and all DUT dependencies via commands rpm -ihv on RedHat based OS or dpkg -i on Debian based OS.

The container is designed to be a very lightweight Docker image that only installs packages and executes binaries (previously built or downloaded on **jenkins-slave-dcr**) and contains libraries necessary to run the CSIT framework, including those required by DUT/TG.

- Container Image Location: Docker image at snergster/csit-sut.
- Container Definition: Docker file specified at [CsitSutDcrFile].
- Initializing:

```
docker run
# Run the container in the background and print the new container ID.
--detach=true
# Give extended privileges to this container. A "privileged" container is
# given access to all devices and able to run nested containers.
--privileged
# Publish all exposed ports to random ports on the host interfaces.
--publish-all
# Automatically remove the container when it exits.
--rm
# Size of /dev/shm.
dcr_stc_params+="--shm-size 512M "
# Override access to PCI bus by attaching a filesystem mount to the
# container.
dcr_stc_params+="--mount type=tmpfs,destination=/sys/bus/pci/devices "
# Mount vfio to be able to bind to see bound interfaces. We cannot use
# --device=/dev/vfio as this does not see newly bound interfaces.
dcr_stc_params+="--volume /dev/vfio:/dev/vfio "
# Mount docker.sock to be able to use docker daemon of the host.
dcr_stc_params+="--volume /var/run/docker.sock:/var/run/docker.sock "
# Mount /opt/boot/ where VM kernel and initrd are located.
dcr_stc_params+="--volume /opt/boot/:/opt/boot/ "
# Mount host hugepages for VMs.
dcr_stc_params+="--volume /dev/hugepages/:/dev/hugepages/ "
```

The container name is concatenated from the **csit-** prefix and a UUID generated uniquely for each container instance.

- Connectivity: Over SSH only, using <host>[:<port>] format. Currently using the root user account as primary.

```
ssh -p <port> root@10.30.51.<node>
```

The container is required to run as --privileged so that it can create nested containers and have full read/write access to sysfs (for bind/unbind). Docker automatically picks a free network port (--publish-all) for SSH connectivity. To limit access to the PCI bus, the container creates a tmpfs mount in the PCI bus tree. The CSIT reservation script dynamically links only those PCI devices (NICs) that are reserved for the particular container, so it does not collide with other containers. To make vfio work, access to /dev/vfio must be granted.

### 4.3.5 Environment initialization

All 1-node servers are to be managed and provisioned via the [ansiblelink] set of playbooks with the *vpp-device* role. Full playbooks can be found under the [fdiocsitansible] directory. This way we are able to track all configuration changes of physical servers in gerrit (in structured yaml format), extend *vpp-device* to additional servers with less effort, and re-stage servers in case of failure.

SR-IOV VF initialization is done via systemd service during host system boot up. Service with name *csit-initialize-vfs.service* is created under systemd system context (/etc/systemd/system/). By default service is calling /usr/local/bin/csit-initialize-vfs.sh with single parameter:

- start: Creates maximum number of virtual functions (VFs) (detected from sriov\_totalvfs) for each whitelisted PCI device.
- stop: Removes all virtual functions (VFs) for all whitelisted PCI devices.

Service is considered active even when all of its processes exited successfully. Stopping service will automatically remove VFs.

```
[Unit]
Description=CSIT Initialize SR-IOV VFs
After=network.target

[Service]
Type=oneshot
RemainAfterExit=True
ExecStart=/usr/local/bin/csit-initialize-vfs.sh start
ExecStop=/usr/local/bin/csit-initialize-vfs.sh stop

[Install]
WantedBy=default.target
```

The script is driven by two array variables, pci\_blacklist and pci\_whitelist. They MUST store all PCI addresses in <domain>:<bus>:<device>.<func> format, where:

- pci\_blacklist: PCI addresses to be skipped from VFs initialization (useful for e.g. excluding management network interfaces).
- pci\_whitelist: PCI addresses to be included for VFs initialization.
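
A minimal sketch, in Python rather than the script's own bash, of how entries could be checked for the fully qualified domain:bus:device.function form before they are placed in either list. The function name, the regular expression and the sample addresses are illustrative assumptions and are not part of the CSIT script.

```python
import re

# Assumed layout: 4 hex digits of domain, 2 hex digits of bus,
# 2 hex digits of device (00-1f) and one digit of function (0-7).
PCI_ADDR_RE = re.compile(
    r"^[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[01][0-9a-fA-F]\.[0-7]$"
)


def is_full_pci_address(addr):
    """Return True if addr is a fully qualified PCI address."""
    return PCI_ADDR_RE.match(addr) is not None


# Hypothetical sample entries: the second one lacks the domain part.
samples = ["0000:18:00.0", "18:00.0"]
valid = [addr for addr in samples if is_full_pci_address(addr)]
```

With such a check, a short address like 18:00.0 would be rejected early instead of silently failing to match a sysfs path.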

### 4.3.6 VF reservation

During the topology initialization phase of the script, a mutex is used to prevent multiple instances of the script from interfering with each other during resource allocation. Mutual exclusion ensures that no two distinct instances of the script get the same resource list.

Reservation function reads the list of all available virtual function network devices in system:

```
# Find the first ${device_count} number of available TG Linux network
# VF device names. Only allowed VF PCI IDs are filtered.
for netdev in ${tg_netdev[@]}; do
    for netdev_path in $(grep -l "${pci_id}" \
                         /sys/class/net/${netdev}*/device/device \
                         2> /dev/null)
    do
        if [[ ${#TG_NETDEVS[@]} -lt ${device_count} ]]; then
            tg_netdev_name=$(dirname ${netdev_path})
            tg_netdev_name=$(dirname ${tg_netdev_name})
            TG_NETDEVS+=($(basename ${tg_netdev_name}))
            break
        fi
    done
    if [[ ${#TG_NETDEVS[@]} -eq ${device_count} ]]; then
        break
    fi
done
```

Where \${pci\_id} is the ID of a white-listed VF PCI ID. For more information please see [pciids]. This acts as a security constraint to prevent taking other unwanted interfaces. The output list of all VF network devices is split into two lists for the TG and SUT side of the connection. The first two items from each TG or SUT network devices list are taken and exposed directly to the namespace of the container. This can be done via commands:

```
$ ip link set ${netdev} netns ${DCR_CPIDS[tg]}
$ ip link set ${netdev} netns ${DCR_CPIDS[dut1]}
```

In this stage, symbolic links to PCI devices under the sysfs bus directory tree are also created in the running containers. Once VF devices are assigned to the container namespace and PCI devices are linked to the running containers, the mutex is released. The selected VF network devices automatically disappear from the parent namespace, so another instance of the script will not find them under that namespace.

Once Docker container exits, network device is returned back into parent namespace and can be reused.
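
The reserve-under-mutex pattern can be sketched in Python as follows. This is only an illustration under stated assumptions: the lock file location, the function name and the device names are hypothetical, and the in-memory list stands in for the list of free VF devices that the real script reads from sysfs.

```python
import fcntl
import os
import tempfile


def reserve_devices(lock_path, free_devices, count):
    """Remove and return `count` devices while holding an exclusive lock."""
    with open(lock_path, "w") as lock_file:
        # Blocks until no other process holds the exclusive lock.
        fcntl.flock(lock_file, fcntl.LOCK_EX)
        try:
            if len(free_devices) < count:
                raise RuntimeError("not enough free devices")
            reserved = free_devices[:count]
            # Reserved devices are no longer visible to later callers.
            del free_devices[:count]
            return reserved
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)


# Hypothetical usage with made-up device names.
lock_path = os.path.join(tempfile.mkdtemp(), "reservation.lock")
free = ["vf0", "vf1", "vf2"]
reserved = reserve_devices(lock_path, free, 2)
```

Because the list is inspected and modified only while the lock is held, two concurrent callers cannot be handed the same device.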

### 4.3.7 Network traffic isolation - Intel i40evf

In a virtualized environment, on Intel(R) Server Adapters that support SR-IOV, the virtual function (VF) may be subject to malicious behavior. Software-generated layer two frames, like IEEE 802.3x (link flow control), IEEE 802.1Qbb (priority based flow-control), and others of this type, are not expected and can throttle traffic between the host and the virtual switch, reducing performance. To resolve this issue, configure all SR-IOV enabled ports for VLAN tagging. This configuration allows unexpected, and potentially malicious, frames to be dropped. [inteli40e]

To configure VLAN tagging for the ports on an SR-IOV enabled adapter, use the following command. The VLAN configuration SHOULD be done before the VF driver is loaded or the VM is booted. [inteli40e]

```
$ ip link set dev <PF netdev id> vf <id> vlan <vlan id>
```

For example, the following instructions will configure PF eth0 and the first VF on VLAN 10.

```
$ ip link set dev eth0 vf 0 vlan 10
```

VLAN Tag Packet Steering allows you to send all packets with a specific VLAN tag to a particular SR-IOV virtual function (VF). Further, this feature allows you to designate a particular VF as trusted, and allows that trusted VF to request selective promiscuous mode on the Physical Function (PF). [inteli40e]

To set a VF as trusted or untrusted, enter the following command in the Hypervisor:

```
$ ip link set dev eth0 vf 1 trust [on|off]
```

Once the VF is designated as trusted, use the following commands in the VM to set the VF to promiscuous mode. [inteli40e]

- For promiscuous all:

```
$ ip link set eth2 promisc on
```

- For promiscuous Multicast:

```
$ ip link set eth2 allmulti on
```

**Note:** By default, the ethtool priv-flag vf-true-promisc-support is set to *off*, meaning that promiscuous mode for the VF will be limited. To set the promiscuous mode for the VF to true promiscuous and allow the VF to see all ingress traffic, use the following command:

\$ ethtool --set-priv-flags p261p1 vf-true-promisc-support on

The vf-true-promisc-support priv-flag does not enable promiscuous mode; rather, it designates which type of promiscuous mode (limited or true) you will get when you enable promiscuous mode using the ip link commands above. Note that this is a global setting that affects the entire device. However, the vf-true-promisc-support priv-flag is only exposed to the first PF of the device. The PF remains in limited promiscuous mode (unless it is in MFP mode) regardless of the vf-true-promisc-support setting. [inteli40e]

The *csit-initialize-vfs.service* service described earlier is responsible for assigning 802.1Q VLAN tagging to each virtual function via the physical function, for every PCI address in the white-list, using the following (simplified) code.

```
SCRIPT_DIR="$(dirname $(readlink -e "${BASH_SOURCE[0]}"))"
source "${SCRIPT_DIR}/csit-initialize-vfs-data.sh"
# Initialize whitelisted NICs with maximum number of VFs.
pci_idx=0
for pci_addr in ${PCI_WHITELIST[@]}; do
    if ! [[ ${PCI_BLACKLIST[*]} =~ "${pci_addr}" ]]; then
        pci_path="/sys/bus/pci/devices/${pci_addr}"
        # SR-IOV initialization
        case "${1:-start}" in
            "start" )
                sriov_totalvfs=$(< "${pci_path}"/sriov_totalvfs)
                ;;
            "stop" )
                sriov_totalvfs=0
                ;;
        esac
        echo ${sriov_totalvfs} > "${pci_path}"/sriov_numvfs
        # SR-IOV 802.1Q isolation
        case "${1:-start}" in
            "start" )
                pf=$(basename "${pci_path}"/net/*)
                for vf in $(seq "${sriov_totalvfs}"); do
                    # PCI address index in array (pairing siblings).
                    if [[ -n ${PF_INDICES[@]} ]]
                    then
                        vlan_pf_idx=${PF_INDICES[$pci_addr]}
                    else
                        vlan_pf_idx=$((pci_idx % (${#PCI_WHITELIST[@]}/2)))
                    fi
                    # 802.1Q base offset.
                    vlan_bs_off=1100
                    # 802.1Q PF PCI address offset.
                    vlan_pf_off=$(( vlan_pf_idx * 100 + vlan_bs_off ))
                    # 802.1Q VF PCI address offset.
                    vlan_vf_off=$(( vlan_pf_off + vf - 1 ))
                    # VLAN string.
                    vlan_str="vlan ${vlan_vf_off}"
                    # MAC string.
                    mac5="$(printf '%x' ${pci_idx})"
                    mac6="$(printf '%x' $(( vf - 1 )))"
                    mac_str="mac ba:dc:0f:fe:${mac5}:${mac6}"
                    # Set 802.1Q VLAN id and MAC address
                    ip link set ${pf} vf $(( vf - 1 )) ${mac_str} ${vlan_str}
                    ip link set ${pf} vf $(( vf - 1 )) trust on
                    ip link set ${pf} vf $(( vf - 1 )) spoof off
                done
                pci_idx=$(( pci_idx + 1 ))
                ;;
        esac
        rmmod i40evf
        modprobe i40evf
    fi
done
```

Assignment starts at VLAN 1100 and increments by 1 for each VF and by 100 for each white-listed PCI address up to the middle of the PCI list. The second half of the list is assumed to hold directly (cable) paired siblings, which are assigned the same 802.1Q VLANs as their siblings.
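
The arithmetic can be checked with a small Python sketch that mirrors the fallback branch of the script (the case where PF\_INDICES is empty). The function name and the four-entry white-list used in the usage lines are assumptions made for illustration.

```python
def vf_vlan_and_mac(pci_idx, vf, whitelist_len, base=1100):
    """Return (VLAN id, MAC address) for one VF; vf is 1-based as in seq."""
    # Second half of the white-list maps onto the first half (siblings).
    vlan_pf_idx = pci_idx % (whitelist_len // 2)
    vlan = base + vlan_pf_idx * 100 + (vf - 1)
    # Last two MAC bytes: PCI index and zero-based VF number, in hex.
    mac = "ba:dc:0f:fe:{:x}:{:x}".format(pci_idx, vf - 1)
    return vlan, mac


# Assumed white-list of four PFs: indices 0 and 2 are cabled siblings.
first = vf_vlan_and_mac(pci_idx=0, vf=1, whitelist_len=4)
sibling = vf_vlan_and_mac(pci_idx=2, vf=1, whitelist_len=4)
other = vf_vlan_and_mac(pci_idx=1, vf=3, whitelist_len=4)
```

Here the first VF on PF 0 and the first VF on its sibling PF 2 both land on VLAN 1100, while the third VF on PF 1 gets VLAN 1202. The MAC addresses stay distinct because they encode the unreduced PCI index.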

### 4.3.8 Open tasks

#### Security

**Note:** Switch to non-privileged containers: As of now all three container flavors are using privileged containers to make them work. Explore options to switch containers to non-privileged with explicit rather than implicit privileges.

**Note:** Switch to testuser account instead of root.

#### Maintainability

**Note:** Docker image distribution: Create jenkins jobs with full pipeline of CI/CD for CSIT Docker images.

#### Stability

**Note:** Implement queueing mechanism: Currently there is no mechanism that would place starving jobs in a queue in case of no resources available.

**Note:** Replace reservation script with Docker network plugin written in GOLANG/SH/Python - platform independent.

### 4.3.9 Links

# 4.4 Documentation

CSIT VPP Device Tests Documentation<sup>197</sup> contains detailed functional description and input parameters for each test case.

<sup>197</sup> https://docs.fd.io/csit/rls2001/doc/tests.vpp.device.html

**CHAPTER** 

**FIVE** 

# **CSIT FRAMEWORK**

# 5.1 Design

FD.io CSIT system design needs to meet continuously expanding requirements of FD.io projects including VPP, related sub-systems (e.g., plugin applications, DPDK drivers) and FD.io applications (e.g., DPDK applications), as well as growing number of compute platforms running those applications. With CSIT project scope and charter including both FD.io continuous testing AND performance trending/comparisons, those evolving requirements further amplify the need for CSIT framework modularity, flexibility and usability.

### 5.1.1 Design Hierarchy

CSIT follows a hierarchical system design with SUTs and DUTs at the bottom level of the hierarchy, presentation level at the top level and a number of functional layers in-between. The current CSIT system design including CSIT framework is depicted in the figure below.



**CSIT System Design Hierarchy** 

A brief bottom-up description is provided here:

- 1. SUTs, DUTs, TGs
  - SUTs - Systems Under Test;
  - DUTs - Devices Under Test;
  - TGs - Traffic Generators;
- 2. Level-1 libraries - Robot and Python
  - Lowest level CSIT libraries abstracting underlying test environment, SUT, DUT and TG specifics;
  - Used commonly across multiple L2 KWs;
  - Performance and functional tests:
    - L1 KWs (KeyWords) are implemented as RF libraries and Python libraries;
  - Performance TG L1 KWs:
    - All L1 KWs are implemented as Python libraries:
      - Support for TRex only today;
      - CSIT IXIA drivers in progress;
  - Performance data plane traffic profiles:
    - TG-specific stream profiles provide full control of:
      - Packet definition - layers, MACs, IPs, ports, combinations thereof e.g. IPs and UDP ports;
      - Stream definitions - different streams can run together, delayed, one after each other;
      - Stream profiles are independent of CSIT framework and can be used in any TRex setup, can be sent anywhere to repeat tests with exactly the same setup;
      - Easily extensible - one can create a new stream profile that meets test requirements;
      - Same stream profile can be used for different tests with the same traffic needs;
  - Functional data plane traffic scripts:
    - Scapy specific traffic scripts;
- 3. Level-2 libraries - Robot resource files:
  - Higher level CSIT libraries abstracting required functions for executing tests;
  - L2 KWs are classified into the following functional categories:
    - Configuration, test, verification, state report;
    - Suite setup, suite teardown;
    - Test setup, test teardown;
- 4. Tests - Robot:
  - Test suites with test cases;
  - Performance tests using physical testbed environment:
    - VPP;
    - DPDK-Testpmd;
    - DPDK-L3Fwd;
  - Tools:
    - Documentation generator;
    - Report generator;
    - Testbed environment setup ansible playbooks;
    - Operational debugging scripts;

### 5.1.2 Test Lifecycle Abstraction

A well coded test must follow a disciplined abstraction of the test lifecycle that includes setup, configuration, test and verification. In addition, to improve test execution efficiency, the common aspects of test setup and configuration shared across multiple test cases should be done only once. Translating these high-level guidelines into the Robot Framework, one arrives at the definition of well coded RF tests for FD.io CSIT. Anatomy of Good Tests for CSIT:

- 1. Suite Setup - Suite startup Configuration common to all Test Cases in suite: uses Configuration KWs, Verification KWs, StateReport KWs;
- 2. Test Setup - Test startup Configuration common to multiple Test Cases: uses Configuration KWs, StateReport KWs;
- 3. Test Case - uses L2 KWs with RF Gherkin style:
  - prefixed with {Given} - Verification of Test setup, reading state: uses Configuration KWs, Verification KWs, StateReport KWs;
  - prefixed with {When} - Test execution: uses Configuration KWs, Test KWs;
  - prefixed with {Then} - Verification of Test execution, reading state: uses Verification KWs, StateReport KWs;
- 4. Test Teardown - post Test teardown with Configuration cleanup and Verification common to multiple Test Cases: uses Configuration KWs, Verification KWs, StateReport KWs;
- 5. Suite Teardown - Suite post-test Configuration cleanup: uses Configuration KWs, Verification KWs, StateReport KWs;

### 5.1.3 RF Keywords Functional Classification

CSIT RF KWs are classified into the functional categories matching the test lifecycle events described earlier. All CSIT RF L2 and L1 KWs have been grouped into the following functional categories:

- 1. Configuration;
- 2. Test;
- 3. Verification;
- 4. StateReport;
- 5. SuiteSetup;
- 6. TestSetup;
- 7. SuiteTeardown;
- 8. TestTeardown;

### 5.1.4 RF Keywords Naming Guidelines

Readability counts: "..code is read much more often than it is written." Hence following a good and consistent grammar practice is important when writing RF KeyWords and Tests. All CSIT test cases are coded using Gherkin style and include only L2 KWs references. L2 KWs are coded using simple style and include L2 KWs, L1 KWs, and L1 python references. To improve readability, the proposal is to use the same grammar for both RF KW styles, and to formalize the grammar of English sentences used for naming the RF KWs. RF KWs names are short sentences expressing functional description of the command. They must follow English sentence grammar in one of the following forms:


- 1. **Imperative** - verb-object(s): "Do something", verb in base form.
- 2. **Declarative** - subject-verb-object(s): "Subject does something", verb in a third-person singular present tense form.
- 3. **Affirmative** - modal\_verb-verb-object(s): "Subject should be something", "Object should exist", verb in base form.
- 4. **Negative** - modal\_verb-Not-verb-object(s): "Subject should not be something", "Object should not exist", verb in base form.

Passive form MUST NOT be used. However a usage of past participle as an adjective is okay. See usage examples provided in the Coding guidelines section below. Following sections list applicability of the above grammar forms to different RF KW categories. Usage examples are provided, both good and bad.

### 5.1.5 Coding Guidelines

Coding guidelines can be found on Design optimizations wiki page<sup>198</sup>.

# 5.2 Test Naming

### 5.2.1 Background

CSIT-2001 follows a common structured naming convention for all performance and system functional tests, introduced in CSIT-1701.

The naming should be intuitive for the majority of the tests. A complete description of the CSIT test naming convention is provided on the CSIT test naming wiki page<sup>199</sup>. Below are a few illustrative examples of the naming usage for test suites across CSIT performance, functional and Honeycomb management test areas.

### 5.2.2 Naming Convention

The CSIT approach is to use tree naming convention and to encode following testing information into test suite and test case names:

- 1. packet network port configuration
  - port type, physical or virtual;
  - number of ports;
  - NIC model, if applicable;
  - port-NIC locality, if applicable;
- 2. packet encapsulations;
- 3. VPP packet processing
  - packet forwarding mode;
  - packet processing function(s);
- 4. packet forwarding path
  - if present, network functions (processes, containers, VMs) and their topology within the computer;
- 5. main measured variable, type of test.

<sup>198</sup> https://wiki.fd.io/view/CSIT/Design\_Optimizations

<sup>199</sup> https://wiki.fd.io/view/CSIT/csit-test-naming

The proposed convention is to encode ports and NICs on the left (underlay), followed by the outer-most frame header, then other stacked headers up to the header processed by vSwitch-VPP, then the VPP forwarding function, then encap on the vhost interface, number of vhost interfaces, number of VMs. If chained VMs are present, they get added on the right. Test topology is expected to be symmetric, in other words packets enter and leave the SUT through ports specified on the left of the test name. Here are some examples to illustrate the convention, followed by the complete legend, and tables mapping the new test filenames to old ones.

### 5.2.3 Naming Examples

CSIT test suite naming examples (filename.robot) for common tested VPP topologies:

- 1. Physical port to physical port a.k.a. NIC-to-NIC, Phy-to-Phy, P2P
  - PortNICConfig-WireEncapsulation-PacketForwardingFunction-PacketProcessingFunctionN-TestType
  - 10ge2p1x520-dot1q-l2bdbasemaclrn-ndrdisc.robot => 2 ports of 10GE on Intel x520 NIC, dot1q tagged Ethernet, L2 bridge-domain baseline switching with MAC learning, NDR throughput discovery.
  - 10ge2p1x520-ethip4vxlan-l2bdbasemaclrn-ndrchk.robot => 2 ports of 10GE on Intel x520 NIC, IPv4 VXLAN Ethernet, L2 bridge-domain baseline switching with MAC learning, NDR throughput check.
  - 10ge2p1x520-ethip4-ip4base-ndrdisc.robot => 2 ports of 10GE on Intel x520 NIC, IPv4 baseline routed forwarding, NDR throughput discovery.
  - 10ge2p1x520-ethip6-ip6scale200k-ndrdisc.robot => 2 ports of 10GE on Intel x520 NIC, IPv6 scaled up routed forwarding, NDR throughput discovery.
  - 10ge2p1x520-ethip4-ip4base-iacldstbase-ndrdisc.robot => 2 ports of 10GE on Intel x520 NIC, IPv4 baseline routed forwarding, ingress Access Control Lists baseline matching on destination, NDR throughput discovery.
  - 40ge2p1vic1385-ethip4-ip4base-ndrdisc.robot => 2 ports of 40GE on Cisco vic1385 NIC, IPv4 baseline routed forwarding, NDR throughput discovery.
  - eth2p-ethip4-ip4base-func.robot => 2 ports of Ethernet, IPv4 baseline routed forwarding, functional tests.
- 2. Physical port to VM (or VM chain) to physical port a.k.a. NIC2VM2NIC, P2V2P, NIC2VMchain2NIC, P2V2V2P
  - PortNICConfig-WireEncapsulation-PacketForwardingFunction-PacketProcessingFunction1-...-PacketProcessingFunctionN-VirtEncapsulation-VirtPortConfig-VMconfig-TestType
  - 10ge2p1x520-dot1q-l2bdbasemaclrn-eth-2vhost-1vm-ndrdisc.robot => 2 ports of 10GE on Intel x520 NIC, dot1q tagged Ethernet, L2 bridge-domain switching to/from two vhost interfaces and one VM, NDR throughput discovery.
  - 10ge2p1x520-ethip4vxlan-l2bdbasemaclrn-eth-2vhost-1vm-ndrdisc.robot => 2 ports of 10GE on Intel x520 NIC, IPv4 VXLAN Ethernet, L2 bridge-domain switching to/from two vhost interfaces and one VM, NDR throughput discovery.
  - 10ge2p1x520-ethip4vxlan-l2bdbasemaclrn-eth-4vhost-2vm-ndrdisc.robot => 2 ports of 10GE on Intel x520 NIC, IPv4 VXLAN Ethernet, L2 bridge-domain switching to/from four vhost interfaces and two VMs, NDR throughput discovery.
  - eth2p-ethip4vxlan-l2bdbasemaclrn-eth-2vhost-1vm-func.robot => 2 ports of Ethernet, IPv4 VXLAN Ethernet, L2 bridge-domain switching to/from two vhost interfaces and one VM, functional tests.
- 3. API CRUD tests - Create (Write), Read (Retrieve), Update (Modify), Delete (Destroy) operations for configuration and operational data
  - ManagementTestKeyword-ManagementOperation-ManagedFunction1-...-ManagedFunctionN-ManagementAPI1-ManagementAPIN-TestType
  - mgmt-cfg-lisp-apivat-func => configuration of LISP with VAT API calls, functional tests.
  - mgmt-cfg-l2bd-apihc-apivat-func => configuration of L2 Bridge-Domain with Honeycomb API and VAT API calls, functional tests.
  - mgmt-oper-int-apihcnc-func => reading status and operational data of interface with Honeycomb NetConf API calls, functional tests.
  - mgmt-cfg-int-tap-apihcnc-func => configuration of tap interfaces with Honeycomb NetConf API calls, functional tests.
  - mgmt-notif-int-subint-apihcnc-func => notifications of interface and sub-interface events with Honeycomb NetConf Notifications, functional tests.

For complete description of CSIT test naming convention please refer to CSIT test naming wiki page<sup>200</sup>.
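
As an illustration of how the physical-port-to-physical-port (P2P) pattern listed above decomposes, the following Python sketch splits a suite filename into its dash-separated fields. The function name and the dictionary keys are hypothetical; the sketch handles only the P2P form and is not taken from the CSIT code base.

```python
def parse_p2p_suite_name(filename):
    """Split a P2P suite filename into the fields of the naming pattern."""
    name = filename
    if name.endswith(".robot"):
        name = name[:-len(".robot")]
    fields = name.split("-")
    if len(fields) < 4:
        raise ValueError("expected at least four dash-separated fields")
    return {
        "port_nic_config": fields[0],
        "wire_encapsulation": fields[1],
        "forwarding_function": fields[2],
        # Zero or more processing functions sit before the test type.
        "processing_functions": fields[3:-1],
        "test_type": fields[-1],
    }


parsed = parse_p2p_suite_name(
    "10ge2p1x520-ethip4-ip4base-iacldstbase-ndrdisc.robot"
)
```

For this filename the sketch yields ip4base as the forwarding function, iacldstbase as the single processing function and ndrdisc as the test type.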

# 5.3 Presentation and Analytics

### 5.3.1 Overview

The presentation and analytics layer (PAL) is the fourth layer of CSIT hierarchy. The model of presentation and analytics layer consists of four sub-layers, bottom up:

- sL1 - Data - input data to be processed:
  - Static content - .rst text files, .svg static figures, and other files stored in the CSIT git repository.
  - Data to process - .xml files generated by Jenkins jobs executing tests, stored as robot results files (output.xml).
  - Specification - .yaml file with the models of report elements (tables, plots, layout, ...) generated by this tool. There is also the configuration of the tool and the specification of input data (jobs and builds).
- sL2 - Data processing
  - The data are read from the specified input files (.xml) and stored as multi-indexed pandas.Series<sup>201</sup>.
  - This layer also provides an interface to input data and filtering of the input data.
- sL3 - Data presentation - This layer generates the elements specified in the specification file:
  - Tables: .csv files linked to static .rst files.
  - Plots: .html files generated using plot.ly linked to static .rst files.
- sL4 - Report generation - Sphinx generates required formats and versions:
  - formats: html, pdf
  - versions: minimal, full (TODO: define the names and scope of versions)

<sup>200</sup> https://wiki.fd.io/view/CSIT/csit-test-naming

<sup>201</sup> https://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.html



### 5.3.2 Data

#### **Report Specification**

The report specification file defines which data is used and which outputs are generated. It is human readable and structured. It is easy to add / remove / change items. The specification includes:

- Specification of the environment.
- Configuration of debug mode (optional).
- Specification of input data (jobs, builds, files, ...).
- Specification of the output.
- What and how is generated: What: plots, tables. How: specification of all properties and parameters.
- .yaml format.

#### **Structure of the specification file**

The specification file is organized as a list of dictionaries distinguished by the type:

```
-
  type: "environment"
-
  type: "configuration"
-
  type: "debug"
-
  type: "static"
-
  type: "input"
-
  type: "output"
-
  type: "table"
-
  type: "plot"
-
  type: "file"
```

Each type represents a section. The sections "environment", "debug", "static", "input" and "output" are listed only once in the specification; "table", "file" and "plot" can be there multiple times.

Sections "debug", "table", "file" and "plot" are optional.

Table(s), file(s) and plot(s) are referred to as "elements" in this text. It is possible to define and implement other elements if needed.
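
The occurrence rules stated above can be expressed as a short Python check over a specification that has already been loaded into a list of dictionaries. This is a sketch only: the function name is hypothetical, it covers just the rules stated in this text (including that "input" may be absent when "debug" is present), and it does not reflect how PAL itself validates the file.

```python
from collections import Counter

# Sections that may appear at most once, and sections that may be absent.
SINGLE = ("environment", "debug", "static", "input", "output")
OPTIONAL = ("debug", "table", "file", "plot")


def check_sections(spec):
    """Return a list of problems found in a list-of-dictionaries spec."""
    counts = Counter(item.get("type") for item in spec)
    problems = []
    for name in SINGLE:
        if counts[name] > 1:
            problems.append(
                "section '{}' is listed {} times".format(name, counts[name])
            )
        # "input" is ignored when the debug mode is configured.
        input_not_needed = name == "input" and counts["debug"] > 0
        if counts[name] == 0 and name not in OPTIONAL and not input_not_needed:
            problems.append("section '{}' is missing".format(name))
    return problems


spec = [
    {"type": "environment"}, {"type": "configuration"}, {"type": "static"},
    {"type": "input"}, {"type": "output"}, {"type": "plot"}, {"type": "plot"},
]
problems = check_sections(spec)
```

A specification with two "plot" elements passes, while one with two "environment" sections would be reported.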

#### **Section: Environment**

This section has the following parts:

- type: "environment" says that this is the section "environment".
- configuration - configuration of the PAL.
- paths - paths used by the PAL.
- urls - urls pointing to the data sources.
- make-dirs - a list of the directories to be created by the PAL while preparing the environment.
- remove-dirs - a list of the directories to be removed while cleaning the environment.
- build-dirs - a list of the directories where the results are stored.

The structure of the section "Environment" is as follows (example):

```
type: "environment"
configuration:
  # Debug mode:
  # - Skip:
  #   - Download of input data files
  # - Read data from given zip / xml files
  # - Set the configuration as it is done in normal mode
  # If the section "type: debug" is missing, CFG[DEBUG] is set to 0.
  CFG[DEBUG]: 0
paths:
  # Top level directories:
  ## Working directory
  DIR[WORKING]: "_tmp"
  ## Build directories
  DIR[BUILD,HTML]: "_build"
  DIR[BUILD,LATEX]: "_build_latex"
  # Static .rst files
  DIR[RST]: "../../docs/report"
  # Working directories
  ## Input data files (.zip, .xml)
  DIR[WORKING,DATA]: "{DIR[WORKING]}/data"
  ## Static source files from git
  DIR[WORKING,SRC]: "{DIR[WORKING]}/src"
  DIR[WORKING,SRC,STATIC]: "{DIR[WORKING,SRC]}/_static"
  # Static html content
  DIR[STATIC]: "{DIR[BUILD,HTML]}/_static"
  DIR[STATIC,VPP]: "{DIR[STATIC]}/vpp"
  DIR[STATIC,DPDK]: "{DIR[STATIC]}/dpdk"
  DIR[STATIC,ARCH]: "{DIR[STATIC]}/archive"
  # Detailed test results
  DIR[DTR]: "{DIR[WORKING,SRC]}/detailed_test_results"
  DIR[DTR,PERF,DPDK]: "{DIR[DTR]}/dpdk_performance_results"
  DIR[DTR,PERF,VPP]: "{DIR[DTR]}/vpp_performance_results"
  DIR[DTR,FUNC,VPP]: "{DIR[DTR]}/vpp_functional_results"
  DIR[DTR,PERF,VPP,IMPRV]: "{DIR[WORKING,SRC]}/vpp_performance_tests/performance_improvements"
  # Detailed test configurations
  DIR[DTC]: "{DIR[WORKING,SRC]}/test_configuration"
  DIR[DTC,PERF,VPP]: "{DIR[DTC]}/vpp_performance_configuration"
  DIR[DTC,FUNC,VPP]: "{DIR[DTC]}/vpp_functional_configuration"
  # Detailed tests operational data
  DIR[DTO]: "{DIR[WORKING,SRC]}/test_operational_data"
  DIR[DTO,PERF,VPP]: "{DIR[DTO]}/vpp_performance_operational_data"
  # .css patch file to fix tables generated by Sphinx
  DIR[CSS_PATCH_FILE]: "{DIR[STATIC]}/theme_overrides.css"
  DIR[CSS_PATCH_FILE2]: "{DIR[WORKING,SRC,STATIC]}/theme_overrides.css"
urls:
  URL[JENKINS,CSIT]: "https://jenkins.fd.io/view/csit/job"
  URL[JENKINS,HC]: "https://jenkins.fd.io/view/hc2vpp/job"
make-dirs:
# List the directories which are created while preparing the environment.
# All directories MUST be defined in "paths" section.
- "DIR[WORKING,DATA]"
- "DIR[STATIC,VPP]"
- "DIR[STATIC,DPDK]"
- "DIR[STATIC,ARCH]"
- "DIR[BUILD,LATEX]"
- "DIR[WORKING,SRC]"
- "DIR[WORKING,SRC,STATIC]"
remove-dirs:
# List the directories which are deleted while cleaning the environment.
# All directories MUST be defined in "paths" section.
#- "DIR[BUILD,HTML]"
build-dirs:
# List the directories where the results (build) is stored.
# All directories MUST be defined in "paths" section.
- "DIR[BUILD,HTML]"
- "DIR[BUILD,LATEX]"
```

It is possible to use defined items in the definition of other items, e.g.:

```
DIR[WORKING,DATA]: "{DIR[WORKING]}/data"
```

will be automatically changed to

```
DIR[WORKING,DATA]: "_tmp/data"
```
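
The substitution can be sketched in Python as a repeated expansion of {KEY} references over a flat mapping. The function name and the fixed-point loop are assumptions chosen for illustration; the report does not describe how PAL implements this step.

```python
import re

# A reference is any text in braces, e.g. {DIR[WORKING]}.
PLACEHOLDER = re.compile(r"\{([^{}]+)\}")


def resolve(paths):
    """Expand {KEY} references between entries of a flat mapping."""
    resolved = dict(paths)
    # Nested references need at most one pass per entry to settle.
    for _ in range(len(resolved)):
        changed = False
        for key in list(resolved):
            value = resolved[key]
            # Unknown keys are left untouched rather than guessed.
            new_value = PLACEHOLDER.sub(
                lambda m: resolved.get(m.group(1), m.group(0)), value
            )
            if new_value != value:
                resolved[key] = new_value
                changed = True
        if not changed:
            break
    return resolved


paths = resolve({
    "DIR[WORKING]": "_tmp",
    "DIR[WORKING,DATA]": "{DIR[WORKING]}/data",
    "DIR[WORKING,SRC]": "{DIR[WORKING]}/src",
    "DIR[WORKING,SRC,STATIC]": "{DIR[WORKING,SRC]}/_static",
})
```

Note that the lookup is by exact key text, so a reference written with a space after the comma would not match a key defined without one.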

#### **Section: Configuration**

This section specifies the groups of parameters which are repeatedly used in the elements defined later in the specification file. It has the following parts:

- data sets - Specification of data sets used later in elements' specifications to define the input data.
- plot layouts - Specification of plot layouts used later in plots' specifications to define the plot layout.

The structure of the section "Configuration" is as follows (example):

```
type: "configuration"
data-sets:
  plot-vpp-throughput-latency:
    csit-vpp-perf-1710-all:
    - 11
    - 12
    - 13
    - 14
    - 15
    - 16
    - 17
    - 18
    - 19
    - 20
  vpp-perf-results:
    csit-vpp-perf-1710-all:
    - 20
    - 23
plot-layouts:
  plot-throughput:
    xaxis:
      autorange: True
      autotick: False
      fixedrange: False
      gridcolor: "rgb(238, 238, 238)"
      linecolor: "rgb(238, 238, 238)"
      linewidth: 1
      showgrid: True
      showline: True
      showticklabels: True
      tickcolor: "rgb(238, 238, 238)"
      tickmode: "linear"
      title: "Indexed Test Cases"
      zeroline: False
    yaxis:
      gridcolor: "rgb(238, 238, 238)"
      hoverformat: ".4s"
      linecolor: "rgb(238, 238, 238)"
      linewidth: 1
      range: []
      showgrid: True
      showline: True
      showticklabels: True
      tickcolor: "rgb(238, 238, 238)"
      title: "Packets Per Second [pps]"
      zeroline: False
    boxmode: "group"
    boxgroupgap: 0.5
    autosize: False
    margin:
      t: 50
      b: 20
      l: 50
      r: 20
    showlegend: True
    legend:
      orientation: "h"
    width: 700
    height: 1000
```

The definitions from this section are used in the elements, e.g.:

```
type: "plot"
title: "VPP Performance 64B-1t1c-(eth|dot1q|dot1ad)-(l2xcbase|l2bdbasemaclrn)-ndrdisc"
algorithm: "plot_performance_box"
output-file-type: ".html"
output-file: "{DIR[STATIC,VPP]}/64B-1t1c-l2-sel1-ndrdisc"
data:
  "plot-vpp-throughput-latency"
filter: "'64B' and ('BASE' or 'SCALE') and 'NDRDISC' and '1T1C' and ('L2BDMACSTAT' or 'L2BDMACLRN' or 'L2XCFWD') and not 'VHOST'"
parameters:
- "throughput"
- "parent"
traces:
  hoverinfo: "x+y"
  boxpoints: "outliers"
  whiskerwidth: 0
layout:
  title: "64B-1t1c-(eth|dot1q|dot1ad)-(l2xcbase|l2bdbasemaclrn)-ndrdisc"
  layout:
    "plot-throughput"
```
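
One plausible reading of the filter string in such an element is a boolean expression over the tags of each test, where a quoted name is true when the test carries that tag. The Python sketch below evaluates a filter under that assumption; the function name and the evaluation strategy are illustrative and are not confirmed by this report as the PAL implementation.

```python
import re

# A quoted tag name such as 'NDRDISC'.
TAG = re.compile(r"'([^']+)'")


def matches(filter_expr, tags):
    """Evaluate a tag filter against the set of tags of one test."""
    # Replace every quoted tag by True or False for this test.
    condition = TAG.sub(lambda m: repr(m.group(1) in tags), filter_expr)
    # Only booleans, and/or/not, parentheses and whitespace may remain.
    if re.sub(r"True|False|and|or|not|[()\s]", "", condition):
        raise ValueError("unsupported filter: " + filter_expr)
    return eval(condition, {"__builtins__": {}}, {})


expr = (
    "'64B' and ('BASE' or 'SCALE') and 'NDRDISC' and '1T1C' and "
    "('L2BDMACSTAT' or 'L2BDMACLRN' or 'L2XCFWD') and not 'VHOST'"
)
# Hypothetical tag set of a single test case.
tags = {"64B", "BASE", "NDRDISC", "1T1C", "L2BDMACLRN"}
selected = matches(expr, tags)
```

The whitelist check before `eval` keeps the sketch from executing anything other than a pure boolean expression.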

#### **Section: Debug mode**

This section is optional as it configures the debug mode. It is used if one does not want to download input data files and use local files instead.

If the debug mode is configured, the "input" section is ignored.

This section has the following parts:

- type: "debug" - says that this is the section "debug".
- general:
  - input-format - xml or zip.
  - extract - if "zip" is defined as the input format, this file is extracted from the zip file, otherwise this parameter is ignored.
- builds - list of builds from which the data is used. Must include a job name as a key and then a list of builds and their output files.

The structure of the section "Debug" is as follows (example):

```
type: "debug"
general:
  input-format: "zip"  # zip or xml
  extract: "robot-plugin/output.xml"  # Only for zip
builds:
  # The files must be in the directory DIR[WORKING,DATA]
  csit-dpdk-perf-1707-all:
  -
    build: 10
    file: "csit-dpdk-perf-1707-all__10.xml"
  -
    build: 9
    file: "csit-dpdk-perf-1707-all__9.xml"
  csit-vpp-functional-1707-ubuntu1604-virl:
  -
    build: lastSuccessfulBuild
    file: "csit-vpp-functional-1707-ubuntu1604-virl-lastSuccessfulBuild.xml"
  hc2vpp-csit-integration-1707-ubuntu1604:
  -
    build: lastSuccessfulBuild
    file: "hc2vpp-csit-integration-1707-ubuntu1604-lastSuccessfulBuild.xml"
  csit-vpp-perf-1707-all:
  -
    build: 16
    file: "csit-vpp-perf-1707-all__16__output.xml"
  -
    build: 17
    file: "csit-vpp-perf-1707-all__17__output.xml"
```

#### **Section: Static**

This section defines the static content which is stored in git and will be used as a source to generate the report.

This section has these parts:

- type: "static" - says that this section is the "static".
- src-path - path to the static content.
- dst-path - destination path where the static content is copied and then processed.

```
type: "static"
src-path: "{DIR[RST]}"
dst-path: "{DIR[WORKING, SRC]}"
```

#### **Section: Input**

This section defines the data used to generate elements. It is mandatory if the debug mode is not used.

This section has the following parts:

- type: "input" - says that this section is the "input".
- general - parameters common to all builds:
  - file-name: file to be downloaded.
  - file-format: format of the downloaded file, ".zip" or ".xml" are supported.
  - download-path: path to be added to the url pointing to the file, e.g.: "{job}/{build}/robot/report/\*zip\*/{filename}"; {job}, {build} and {filename} are replaced by proper values defined in this section.
  - extract: file to be extracted from the downloaded zip file, e.g.: "output.xml"; if an xml file is downloaded, this parameter is ignored.
- builds - list of jobs (keys) and numbers of builds whose output data will be downloaded.

The structure of the section "Input" is as follows (example from 17.07 report):

```
type: "input" # Ignored in debug mode
general:
  file-name: "robot-plugin.zip"
  file-format: ".zip"
  download-path: "{job}/{build}/robot/report/*zip*/{filename}"
  extract: "robot-plugin/output.xml"
builds:
  csit-vpp-perf-1707-all:
  - 9
  - 10
  - 13
  - 14
  - 15
  - 16
  - 17
  - 18
  - 19
  - 21
  - 22
  csit-dpdk-perf-1707-all:
  - 1
  - 2
  - 3
  - 5
  - 6
  - 7
  - 8
  - 9
  - 10
  csit-vpp-functional-1707-ubuntu1604-virl:
  - lastSuccessfulBuild
  hc2vpp-csit-perf-master-ubuntu1604:
  - 8
  - 9
  hc2vpp-csit-integration-1707-ubuntu1604:
  - lastSuccessfulBuild
```
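As a rough illustration of how the "download-path" template could be expanded for each job and build, the following Python sketch substitutes {job}, {build} and {filename}. The function name `build_download_urls` and the base URL are hypothetical and are not taken from PAL.

```python
def build_download_urls(base_url, general, builds):
    """Expand the download-path template for every (job, build) pair."""
    urls = []
    for job, build_list in builds.items():
        for build in build_list:
            path = general["download-path"].format(
                job=job, build=build, filename=general["file-name"])
            urls.append(f"{base_url.rstrip('/')}/{path}")
    return urls


general = {
    "file-name": "robot-plugin.zip",
    "download-path": "{job}/{build}/robot/report/*zip*/{filename}",
}
builds = {
    "csit-vpp-perf-1707-all": [9, 10],
    "csit-vpp-functional-1707-ubuntu1604-virl": ["lastSuccessfulBuild"],
}

# "https://jenkins.example.org/job" is a placeholder, not the real server.
for url in build_download_urls("https://jenkins.example.org/job", general, builds):
    print(url)
```

Build identifiers are passed through unchanged, so symbolic names such as lastSuccessfulBuild work the same way as build numbers.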

#### **Section: Output**

This section specifies which format(s) will be generated (html, pdf) and which versions will be generated for each format.

This section has the following parts:

- type: "output" - says that this section is the "output".
- format: html or pdf.
- version: defined for each format separately.

The structure of the section "Output" is as follows (example):

```
type: "output"
format:
  html:
  - full
  pdf:
  - full
  - minimal
```

TODO: define the names of versions

#### Content of "minimal" version

TODO: define the name and content of this version

#### **Section: Table**

This section defines a table to be generated. There can be 0 or more "table" sections.

This section has the following parts:

- type: "table" - says that this section defines a table.
- title: Title of the table.
- algorithm: Algorithm which is used to generate the table. The other parameters in this section must provide all information needed by the used algorithm.
- template: (optional) a .csv file used as a template while generating the table.
- output-file-ext: extension of the output file.
- output-file: file which the table will be written to.
- columns: specification of table columns:
  - title: The title used in the table header.
  - data: Specification of the data; it has two parts, command and arguments:
    - command:
      - template - take the data from the template, arguments:
        - number of the column in the template.
      - data - take the data from the input data, arguments:
        - jobs and builds whose data will be used.
      - operation - performs an operation with the data already in the table, arguments:
        - operation to be done, e.g.: mean, stdev, relative\_change (compute the relative change between two columns) and display number of data samples ~= number of test jobs. The operations are implemented in utils.py. TODO: Move from utils.py to e.g. operations.py
        - numbers of columns whose data will be used (optional).
- data: Specify the jobs and builds whose data is used to generate the table.
- filter: filter based on tags applied on the input data; if "template" is used, filtering is based on the template.
- parameters: Only these parameters will be put to the output data structure.

The structure of the section "Table" is as follows (example of "table\_performance\_improvements"):

```
type: "table"
title: "Performance improvements"
algorithm: "table_performance_improvements"
template: "{DIR[DTR,PERF,VPP,IMPRV]}/tmpl_performance_improvements.csv"
output-file-ext: ".csv"
output-file: "{DIR[DTR,PERF,VPP,IMPRV]}/performance_improvements"
columns:
- title: "VPP Functionality"
  data: "template 1"
- title: "Test Name"
  data: "template 2"
- title: "VPP-16.09 mean [Mpps]"
  data: "template 3"
- title: "VPP-17.01 mean [Mpps]"
  data: "template 4"
- title: "VPP-17.04 mean [Mpps]"
  data: "template 5"
- title: "VPP-17.07 mean [Mpps]"
  data: "data csit-vpp-perf-1707-all mean"
- title: "VPP-17.07 stdev [Mpps]"
  data: "data csit-vpp-perf-1707-all stdev"
- title: "17.04 to 17.07 change [%]"
  data: "operation relative_change 5 4"
data:
  csit-vpp-perf-1707-all:
  - 9
  - 10
  - 13
  - 14
  - 15
  - 16
  - 17
  - 18
  - 19
  - 21
filter: "template"
parameters:
- "throughput"
```
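The column "data" strings above consist of a command followed by its arguments. The Python sketch below shows one way such a string could be split and how the named operations could be computed. The helper names and the sample values are hypothetical; the definition of relative change used here (percentage difference against a reference value) is an assumption, not taken from utils.py.

```python
from statistics import mean, stdev


def parse_column_command(spec):
    """Split a column "data" string into its command and arguments."""
    command, *args = spec.split()
    if command not in ("template", "data", "operation"):
        raise ValueError(f"unknown command: {command}")
    return command, args


def relative_change(reference, value):
    """Relative change of value against reference, in percent (assumed)."""
    return (value - reference) / reference * 100.0


samples = [10.0, 11.0, 12.0]  # made-up throughput samples [Mpps]

print(parse_column_command("template 3"))
print(parse_column_command("data csit-vpp-perf-1707-all mean"))
print(parse_column_command("operation relative_change 5 4"))
print(mean(samples), stdev(samples))
print(relative_change(8.0, 10.0))
```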

Example of "table\_details" which generates "Detailed Test Results - VPP Performance Results":

```
type: "table"
title: "Detailed Test Results - VPP Performance Results"
algorithm: "table_details"
output-file-ext: ".csv"
output-file: "{DIR[WORKING]}/vpp_performance_results"
columns:
- title: "Name"
  data: "data test_name"
- title: "Documentation"
  data: "data test_documentation"
- title: "Status"
  data: "data test_msg"
data:
  csit-vpp-perf-1707-all:
  - 17
filter: "all"
parameters:
- "parent"
- "doc"
- "msg"
```

Example of "table\_details" which generates "Test configuration - VPP Performance Test Configs":

```
type: "table"
title: "Test configuration - VPP Performance Test Configs"
algorithm: "table_details"
output-file-ext: ".csv"
output-file: "{DIR[WORKING]}/vpp_test_configuration"
columns:
- title: "Name"
  data: "data name"
- title: "VPP API Test (VAT) Commands History - Commands Used Per Test Case"
  data: "data show-run"
data:
  csit-vpp-perf-1707-all:
  - 17
filter: "all"
parameters:
- "parent"
- "name"
- "show-run"
```

#### **Section: Plot**

This section defines a plot to be generated. There can be 0 or more "plot" sections.

This section has these parts:

- type: "plot" - says that this section defines a plot.
- title: Plot title used in the logs. The title which is displayed is in the section "layout".
- output-file-type: format of the output file.
- output-file: file which the plot will be written to.
- algorithm: Algorithm used to generate the plot. The other parameters in this section must provide all information needed by plot.ly to generate the plot. For example:
  - traces
  - layout

  These parameters are transparently passed to plot.ly.
- data: Specify the jobs and numbers of builds whose data is used to generate the plot.
- filter: filter applied on the input data.
- parameters: Only these parameters will be put to the output data structure.

The structure of the section "Plot" is as follows (example of a plot showing throughput in a chart box-with-whiskers):

```
type: "plot"
title: "VPP Performance 64B-1t1c-(eth|dot1q|dot1ad)-(l2xcbase|l2bdbasemaclrn)-ndrdisc"
algorithm: "plot_performance_box"
output-file-type: ".html"
output-file: "{DIR[STATIC, VPP]}/64B-1t1c-l2-sel1-ndrdisc"
data:
  csit-vpp-perf-1707-all:
  - 9
  - 10
  - 13
  - 14
  - 15
  - 16
  - 17
  - 18
  - 19
  - 21
# Keep this formatting, the filter is enclosed with " (quotation mark) and
# each tag is enclosed with ' (apostrophe).
filter: "'64B' and 'BASE' and 'NDRDISC' and '1T1C' and ('L2BDMACSTAT' or 'L2BDMACLRN' or 'L2XCFWD') and not 'VHOST'"
parameters:
- "throughput"
- "parent"
traces:
  hoverinfo: "x+y"
  boxpoints: "outliers"
  whiskerwidth: 0
layout:
  title: "64B-1t1c-(eth|dot1q|dot1ad)-(l2xcbase|l2bdbasemaclrn)-ndrdisc"
  xaxis:
    autorange: True
    autotick: False
    fixedrange: False
    gridcolor: "rgb(238, 238, 238)"
    linecolor: "rgb(238, 238, 238)"
    linewidth: 1
    showgrid: True
    showline: True
    showticklabels: True
    tickcolor: "rgb(238, 238, 238)"
    tickmode: "linear"
    title: "Indexed Test Cases"
    zeroline: False
  yaxis:
    gridcolor: "rgb(238, 238, 238)"
    hoverformat: ".4s"
    linecolor: "rgb(238, 238, 238)"
    linewidth: 1
    range: []
    showgrid: True
    showline: True
    showticklabels: True
    tickcolor: "rgb(238, 238, 238)"
    title: "Packets Per Second [pps]"
    zeroline: False
  boxmode: "group"
  boxgroupgap: 0.5
  autosize: False
  margin:
    t: 50
    b: 20
    l: 50
    r: 20
  showlegend: True
  legend:
    orientation: "h"
  width: 700
  height: 1000
```

The structure of the section "Plot" is as follows (example of a plot showing latency in a box chart):

```
type: "plot"
title: "VPP Latency 64B-1t1c-(eth|dot1q|dot1ad)-(l2xcbase|l2bdbasemaclrn)-ndrdisc"
algorithm: "plot_latency_box"
output-file-type: ".html"
output-file: "{DIR[STATIC, VPP]}/64B-1t1c-l2-sel1-ndrdisc-lat50"
data:
  csit-vpp-perf-1707-all:
  - 9
  - 10
  - 13
  - 14
  - 15
  - 16
  - 17
  - 18
  - 19
  - 21
filter: "'64B' and 'BASE' and 'NDRDISC' and '1T1C' and ('L2BDMACSTAT' or 'L2BDMACLRN' or 'L2XCFWD') and not 'VHOST'"
parameters:
- "latency"
- "parent"
traces:
  boxmean: False
layout:
  title: "64B-1t1c-(eth|dot1q|dot1ad)-(l2xcbase|l2bdbasemaclrn)-ndrdisc"
  xaxis:
    autorange: True
    autotick: False
    fixedrange: False
    gridcolor: "rgb(238, 238, 238)"
    linecolor: "rgb(238, 238, 238)"
    linewidth: 1
    showgrid: True
    showline: True
    showticklabels: True
    tickcolor: "rgb(238, 238, 238)"
    tickmode: "linear"
    title: "Indexed Test Cases"
    zeroline: False
  yaxis:
    gridcolor: "rgb(238, 238, 238)"
    hoverformat: ""
    linecolor: "rgb(238, 238, 238)"
    linewidth: 1
    range: []
    showgrid: True
    showline: True
    showticklabels: True
    tickcolor: "rgb(238, 238, 238)"
    title: "Latency min/avg/max [uSec]"
    zeroline: False
  boxmode: "group"
  boxgroupgap: 0.5
  autosize: False
  margin:
    t: 50
    b: 20
    l: 50
    r: 20
  showlegend: True
  legend:
    orientation: "h"
  width: 700
  height: 1000
```

The structure of the section "Plot" is as follows (example of a plot showing VPP HTTP server performance in a box chart with pre-defined data "plot-vpp-httlp-server-performance" set and plot layout "plot-cps"):

```
type: "plot"
title: "VPP HTTP Server Performance"
algorithm: "plot_http_server_perf_box"
output-file-type: ".html"
output-file: "{DIR[STATIC, VPP]}/http-server-performance-cps"
data:
  "plot-vpp-httlp-server-performance"
# Keep this formatting, the filter is enclosed with " (quotation mark) and
# each tag is enclosed with ' (apostrophe).
filter: "'HTTP' and 'TCP_CPS'"
parameters:
- "result"
- "name"
traces:
  hoverinfo: "x+y"
  boxpoints: "outliers"
  whiskerwidth: 0
layout:
  title: "VPP HTTP Server Performance"
  layout:
    "plot-cps"
```

#### **Section: file**

This section defines a file to be generated. There can be 0 or more "file" sections.

This section has the following parts:

- type: "file" - says that this section defines a file.
- title: Title of the file.
- algorithm: Algorithm which is used to generate the file. The other parameters in this section must provide all information needed by the used algorithm.
- output-file-ext: extension of the output file.
- output-file: file which the file will be written to.
- file-header: The header of the generated .rst file.
- dir-tables: The directory with the tables.
- data: Specify the jobs and builds whose data is used to generate the file.
- filter: filter based on tags applied on the input data; if "all" is used, no filtering is done.
- parameters: Only these parameters will be put to the output data structure.
- chapters: the hierarchy of chapters in the generated file.
- start-level: the level of the top-level chapter.

The structure of the section "file" is as follows (example):

```
type: "file"
title: "VPP Performance Results"
algorithm: "file_test_results"
output-file-ext: ".rst"
output-file: "{DIR[DTR,PERF,VPP]}/vpp_performance_results"
file-header: "\n.. |br| raw:: html\n\n
  →|preout| raw:: html\n\n \n\n"
dir-tables: "{DIR[DTR,PERF,VPP]}"
data:
  csit-vpp-perf-1707-all:
  - 22
filter: "all"
parameters:
- "name"
- "doc"
- "level"
data-start-level: 2 # 0, 1, 2, ...
chapters-start-level: 2 # 0, 1, 2, ...
```

#### Static content

- Manually created / edited files.
- .rst files, static .csv files, static pictures (.svg), ...
- Stored in CSIT git repository.

No more details about the static content in this document.

## Data to process

PAL processes test results and other information produced by Jenkins jobs. The data are currently stored as Robot Framework results in Jenkins (TODO: store the data in Nexus), as .zip and/or .xml files.

## 5.3.3 Data processing

As the first step, the data are downloaded and stored locally (typically on a Jenkins slave). If .zip files are used, the given .xml files are extracted for further processing.

Parsing of the .xml files is performed by a class derived from "robot.api.ResultVisitor"; only the necessary methods are overridden. Only the necessary data is extracted from the .xml file and stored in a structured form.

The parsed data are stored as a multi-indexed pandas.Series. Its structure is as follows:

```
<job name>
  <build>
    <metadata>
    <suites>
    <tests>
```

"job name", "build", "metadata", "suites", "tests" are indexes to access the data. For example:

```
job 1 name:
  build 1:
    metadata: metadata
    suites: suites
    tests: tests
  build N:
    metadata: metadata
    suites: suites
    tests: tests
job M name:
  build 1:
    metadata: metadata
    suites: suites
    tests: tests
  build N:
    metadata: metadata
    suites: suites
    tests: tests
```

Using the indexes data["job 1 name"]["build 1"]["tests"] (e.g.: data["csit-vpp-perf-1704-all"]["17"]["tests"]) we get all tests of the given build together with all their data.

Data will not be accessible directly using indexes, but using getters and filters.

#### Structure of metadata:

```
"metadata": {
    "version": "VPP version",
    "job": "Jenkins job name",
    "build": "Information about the build"
},
```

#### Structure of suites:

```
"suites": {
    "Suite name 1": {
        "doc": "Suite 1 documentation",
        "parent": "Suite 1 parent"
    },
    "Suite name N": {
        "doc": "Suite N documentation",
        "parent": "Suite N parent"
    }
}
```

#### Structure of tests:

Performance tests:

```
"tests": {
    "ID": {
        "name": "Test name",
        "parent": "Name of the parent of the test",
        "doc": "Test documentation",
        "msg": "Test message",
        "tags": ["tag 1", "tag 2", "tag n"],
        "type": "PDR" | "NDR",
        "throughput": {
            "value": int,
            "unit": "pps" | "bps" | "percentage"
        },
        "latency": {
            "direction1": {
                "100": {
                    "min": int,
                    "avg": int,
                    "max": int
                },
                "50": { # Only for NDR
                    "min": int,
                    "avg": int,
                    "max": int
                },
                "10": { # Only for NDR
                    "min": int,
                    "avg": int,
                    "max": int
                }
            },
            "direction2": {
                "100": {
                    "min": int,
                    "avg": int,
                    "max": int
                },
                "50": { # Only for NDR
                    "min": int,
                    "avg": int,
                    "max": int
                },
                "10": { # Only for NDR
                    "min": int,
                    "avg": int,
                    "max": int
                }
            }
        },
        "lossTolerance": "lossTolerance", # Only for PDR
        "vat-history": "DUT1 and DUT2 VAT History",
        "show-run": "Show Run"
    },
    "ID": {
        # next test
    }
}
```

Functional tests:

```
"tests": {
    "ID": {
        "name": "Test name",
        "parent": "Name of the parent of the test",
        "doc": "Test documentation",
        "msg": "Test message",
        "tags": ["tag 1", "tag 2", "tag n"],
        "vat-history": "DUT1 and DUT2 VAT History",
        "show-run": "Show Run",
        "status": "PASS" | "FAIL"
    },
    "ID": {
        # next test
    }
}
```

Note: ID is the lowercase full path to the test.

### **Data filtering**

The first step when generating an element is getting the data needed to construct the element. The data are filtered from the processed input data.

The data filtering is based on:

- job name(s).
- build number(s).
- tag(s).
- required data - only this data is included in the output.

WARNING: The filtering is based on tags, so be careful with tagging.

For example, an element whose specification includes:

```
data:
    csit-vpp-perf-1707-all:
    - 9
    - 10
    - 13
    - 14
    - 15
    - 16
    - 17
    - 18
    - 19
    - 21
filter:
    - "'64B' and 'BASE' and 'NDRDISC' and '1T1C' and ('L2BDMACSTAT' or 'L2BDMACLRN' or 'L2XCFWD') and not 'VHOST'"
```

will be constructed using data from the job "csit-vpp-perf-1707-all", for all listed builds and the tests with the list of tags matching the filter conditions.
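To make the tag matching concrete, the following Python sketch evaluates a filter expression of the form shown above against the tags of one test. It is only an illustration under stated assumptions: the function name `tags_match` is hypothetical, and the approach (replacing each quoted tag by True or False and evaluating the remaining boolean expression) is not claimed to be how PAL implements filtering.

```python
import re

_TAG = re.compile(r"'([^']*)'")
_SAFE = re.compile(r"(?:True|False|and|or|not|[()\s])*")


def tags_match(filter_expr, tags):
    """Return True if the tags of a test satisfy the filter expression."""
    tag_set = set(tags)
    # Replace every quoted tag by True/False depending on its presence.
    expr = _TAG.sub(lambda m: str(m.group(1) in tag_set), filter_expr)
    # Only boolean literals, operators and parentheses may remain.
    if not _SAFE.fullmatch(expr):
        raise ValueError(f"unsupported filter expression: {filter_expr!r}")
    return bool(eval(expr, {"__builtins__": {}}, {}))


FILTER = ("'64B' and 'BASE' and 'NDRDISC' and '1T1C' and "
          "('L2BDMACSTAT' or 'L2BDMACLRN' or 'L2XCFWD') and not 'VHOST'")

print(tags_match(FILTER, ["64B", "BASE", "NDRDISC", "1T1C", "L2XCFWD"]))
print(tags_match(FILTER, ["64B", "BASE", "NDRDISC", "1T1C", "L2XCFWD", "VHOST"]))
```

The first call matches; the second does not, because the test also carries the excluded tag 'VHOST'. A tag that is missing or misspelled on a test silently evaluates to False, which is why careful tagging matters.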

The output data structure for filtered test data is:

```
- job 1
  - build 1
    - test 1
      - parameter 1
      - parameter 2
      ...
      - parameter n
    ...
    - test n
    ...
  ...
  - build n
...
- job n
```
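The job / build / test / parameter nesting can be sketched in Python with plain dictionaries. The function name `select_parameters`, the `keep` callback and the sample data are hypothetical; PAL itself stores the parsed data as a pandas.Series, which this sketch deliberately does not reproduce.

```python
def select_parameters(data, jobs_builds, parameters, keep=lambda test: True):
    """Return job -> build -> test -> {parameter: value} for kept tests."""
    result = {}
    for job, builds in jobs_builds.items():
        for build in builds:
            build_key = str(build)
            for test_id, test in data[job][build_key]["tests"].items():
                if not keep(test):
                    continue
                selected = {name: test.get(name) for name in parameters}
                result.setdefault(job, {}).setdefault(build_key, {})[test_id] = selected
    return result


# Made-up parsed data with a single job, one build and two tests.
parsed = {
    "csit-vpp-perf-1707-all": {
        "17": {
            "tests": {
                "tests.vpp.perf.l2.test-a": {
                    "name": "test-a", "parent": "suite-l2",
                    "tags": ["64B", "1T1C"],
                    "throughput": {"value": 100, "unit": "pps"},
                },
                "tests.vpp.perf.l2.test-b": {
                    "name": "test-b", "parent": "suite-l2",
                    "tags": ["IMIX", "1T1C"],
                    "throughput": {"value": 50, "unit": "pps"},
                },
            }
        }
    }
}

filtered = select_parameters(
    parsed,
    {"csit-vpp-perf-1707-all": [17]},
    ["throughput", "parent"],
    keep=lambda test: "64B" in test["tags"],
)
print(filtered)
```

Only the parameters listed in the element's "parameters" key survive, so the output stays small even when the parsed test records are large.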

### **Data analytics**

The data analytics part implements:

- methods to compute statistical data from the filtered input data.
- trending.

## Throughput Speedup Analysis - Multi-Core with Multi-Threading

Throughput Speedup Analysis (TSA) calculates throughput speedup ratios for tested 1-, 2- and 4-core multi-threaded VPP configurations using the following formula:

```
                            N_core_throughput
N_core_throughput_speedup = -----------------
                            1_core_throughput
```
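A short Python sketch of the ratio above, using made-up throughput values; the function name `throughput_speedup` is hypothetical.

```python
def throughput_speedup(throughput_by_cores):
    """Map number of cores -> N_core_throughput / 1_core_throughput."""
    base = throughput_by_cores[1]
    return {cores: value / base
            for cores, value in sorted(throughput_by_cores.items())}


measured = {1: 10.0e6, 2: 19.0e6, 4: 36.0e6}  # made-up values [pps]
print(throughput_speedup(measured))
```

With these numbers the 2-core and 4-core speedups are 1.9 and 3.6, i.e. slightly below the ideal linear scaling of 2 and 4.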

Multi-core throughput speedup ratios are plotted in grouped bar graphs for throughput tests with 64B/78B frame size, with number of cores on X-axis and speedup ratio on Y-axis.

For easier comparison, multiple test result data sets are plotted in each graph:

- graph type: grouped bars;
- graph X-axis: (testcase index, number of cores);
- graph Y-axis: speedup factor.

A subset of the existing performance tests is covered by TSA graphs.

### Model for TSA:

```
type: "plot"
title: "TSA: 64B-*-(eth|dot1q|dot1ad)-(l2xcbase|l2bdbasemaclrn)-ndrdisc"
algorithm: "plot_throughput_speedup_analysis"
output-file-type: ".html"
output-file: "{DIR[STATIC, VPP]}/10ge2p1x520-64B-l2-tsa-ndrdisc"
data:
  "plot-throughput-speedup-analysis"
filter: "'NIC_Intel-X520-DA2' and '64B' and 'BASE' and 'NDRDISC' and ('L2BDMACSTAT' or 'L2BDMACLRN' or 'L2XCFWD') and not 'VHOST'"
parameters:
- "throughput"
- "parent"
- "tags"
layout:
  title: "64B-*-(eth|dot1q|dot1ad)-(l2xcbase|l2bdbasemaclrn)-ndrdisc"
  layout:
    "plot-throughput-speedup-analysis"
```

### Comparison of results from two sets of the same test executions

This algorithm enables comparison of results coming from two sets of the same test executions. It is used to quantify performance changes across all tests after test environment changes, e.g. operating system upgrades or patches, or hardware changes.

It is assumed that each set of test executions includes multiple runs of the same tests, 10 or more, to verify the repeatability of test results and to yield statistically meaningful data.

Comparison results are presented in a table with a specified number of the best and the worst relative changes between the two sets. The following table columns are defined:

- name of the test;
- throughput mean values of the reference set;
- throughput standard deviation of the reference set;
- throughput mean values of the set to compare;
- throughput standard deviation of the set to compare;
- relative change of the mean values.

#### The model

The model specifies:

- type: "table" means this section defines a table.
- title: Title of the table.
- algorithm: Algorithm which is used to generate the table. The other parameters in this section must provide all information needed by the used algorithm.
- output-file-ext: Extension of the output file.
- output-file: File which the table will be written to.
- reference - the builds which are used as the reference for comparison.
- compare - the builds which are compared to the reference.
- data: Specify the sources, jobs and builds, providing data for generating the table.
- filter: Filter based on tags applied on the input data, if "template" is used, filtering is based on the template.
- parameters: Only these parameters will be put to the output data structure.
- nr-of-tests-shown: Number of the best and the worst tests presented in the table. Use 0 (zero) to present all tests.

## Example:

```
type: "table"
title: "Performance comparison"
algorithm: "table_perf_comparison"
output-file-ext: ".csv"
output-file: "{DIR[DTR,PERF,VPP,IMPRV]}/vpp_performance_comparison"
reference:
  title: "csit-vpp-perf-1801-all - 1"
  data:
    csit-vpp-perf-1801-all:
    - 1
    - 2
compare:
  title: "csit-vpp-perf-1801-all - 2"
  data:
    csit-vpp-perf-1801-all:
    - 1
    - 2
data:
  "vpp-perf-comparison"
filter: "all"
parameters:
- "name"
- "parent"
- "throughput"
nr-of-tests-shown: 20
```
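The comparison described above can be sketched in a few lines of Python. This is an illustration only: the function name `compare_sets`, the row keys and the sample data are hypothetical, and the relative change is assumed to be the percentage difference of the compared mean against the reference mean.

```python
from statistics import mean, stdev


def compare_sets(reference, compare, nr_of_tests_shown=0):
    """Compare per-test throughput samples of two sets of test executions."""
    rows = []
    for name in sorted(set(reference) & set(compare)):
        ref_mean, cmp_mean = mean(reference[name]), mean(compare[name])
        rows.append({
            "name": name,
            "ref-mean": ref_mean,
            "ref-stdev": stdev(reference[name]),
            "cmp-mean": cmp_mean,
            "cmp-stdev": stdev(compare[name]),
            "change [%]": (cmp_mean - ref_mean) / ref_mean * 100.0,
        })
    # Best relative changes first, worst last.
    rows.sort(key=lambda row: row["change [%]"], reverse=True)
    # 0 means "show all tests"; otherwise keep the N best and the N worst.
    if nr_of_tests_shown and len(rows) > 2 * nr_of_tests_shown:
        rows = rows[:nr_of_tests_shown] + rows[-nr_of_tests_shown:]
    return rows


# Made-up throughput samples [Mpps]; real sets would hold 10 or more runs.
reference = {"test-a": [10.0, 10.0], "test-b": [8.0, 8.0], "test-c": [4.0, 4.0]}
compare = {"test-a": [12.0, 13.0], "test-b": [8.0, 8.0], "test-c": [3.0, 3.0]}

for row in compare_sets(reference, compare, nr_of_tests_shown=1):
    print(row["name"], round(row["change [%]"], 1))
```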

## Advanced data analytics

In the future advanced data analytics (ADA) will be added to analyze the telemetry data collected from SUT telemetry sources and correlate it to performance test results.

#### **TODO**

- describe the concept of ADA.
- add specification.

## 5.3.4 Data presentation

This stage generates the plots and tables according to the report models defined in the specification file. The elements are generated using the algorithms and data specified in their models.

#### **Tables**

- tables are generated by algorithms implemented in PAL, the model includes the algorithm and all necessary information.
- output format: csv
- generated tables are stored in specified directories and linked to .rst files.

#### **Plots**

- plot.ly<sup>202</sup> is currently used to generate plots, the model includes the type of plot and all the necessary information to render it.
- output format: html.
- generated plots are stored in specified directories and linked to .rst files.

## 5.3.5 Report generation

The report is generated using Sphinx and the Read\_the\_Docs template. PAL generates html and pdf formats. The content of the report can be defined by specifying the version (TODO: define the names and content of versions).

<sup>202</sup> https://plot.ly/

#### The process

- 1. Read the specification.
- 2. Read the input data.
- 3. Process the input data.
- 4. For each element (plot, table, file) defined in the specification:
  - a. Get the data needed to construct the element using a filter.
  - b. Generate the element.
  - c. Store the element.
- 5. Generate the report.
- 6. Store the report (Nexus).

The process is model driven. The elements' models (tables, plots, files and the report itself) are defined in the specification file. The script reads the elements' models from the specification file and generates the elements.

It is easy to add elements to be generated in the report. If a new type of an element is required, only a new algorithm needs to be implemented and integrated.

## 5.3.6 Continuous Performance Measurements and Trending

## Performance analysis and trending execution sequence:

CSIT PA runs performance analysis, change detection and trending using specified trend analysis metrics over the rolling window of last <N> sets of historical measurement data. PA is defined as follows:

- 1. PA job triggers:
  - 1. By PT job at its completion.
  - 2. Manually from Jenkins UI.
- 2. Download and parse archived historical data and the new data:
  - 1. New data from latest PT job is evaluated against the rolling window of <N> sets of historical data.
  - 2. Download RF output.xml files and compressed archived data.
  - 3. Parse out the data filtering test cases listed in PA specification (part of CSIT PAL specification file).
- 3. Calculate trend metrics for the rolling window of <N> sets of historical data:
  - 1. Calculate quartiles Q1, Q2, Q3.
  - 2. Trim outliers using IQR.
  - 3. Calculate TMA and TMSD.
  - 4. Calculate normal trending range per test case based on TMA and TMSD.
- 4. Evaluate new test data against trend metrics:
  - 1. If within the range of (TMA +/- 3\*TMSD) => Result = Pass, Reason = Normal.
  - 2. If below the range => Result = Fail, Reason = Regression.
  - 3. If above the range => Result = Pass, Reason = Progression.
- 5. Generate and publish results:
  - 1. Relay evaluation result to job result.
  - 2. Generate a new set of trend analysis summary graphs and drill-down graphs:
    - 1. Summary graphs to include measured values with Normal, Progression and Regression markers. MM shown in the background if possible.
    - 2. Drill-down graphs to include MM, TMA and TMSD.
  - 3. Publish trend analysis graphs in html format on https://docs.fd.io/csit/master/trending/.
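Steps 3 and 4 of the sequence above can be sketched in Python as follows. The sketch rests on several assumptions: TMA and TMSD are read here as the mean and the standard deviation of the window after outlier trimming (this section does not expand the acronyms), outliers are values outside Q1 - 1.5\*IQR .. Q3 + 1.5\*IQR, and quartiles are computed with the default method of `statistics.quantiles`. The function names and the sample window are hypothetical.

```python
from statistics import mean, quantiles, stdev


def trend_metrics(window, outlier_const=1.5):
    """Return (TMA, TMSD) of a window after IQR-based outlier trimming."""
    q1, _q2, q3 = quantiles(window, n=4)
    iqr = q3 - q1
    low, high = q1 - outlier_const * iqr, q3 + outlier_const * iqr
    trimmed = [value for value in window if low <= value <= high]
    return mean(trimmed), stdev(trimmed)


def evaluate(new_value, tma, tmsd):
    """Classify a new measurement against the range TMA +/- 3*TMSD."""
    if new_value < tma - 3 * tmsd:
        return "Fail", "Regression"
    if new_value > tma + 3 * tmsd:
        return "Pass", "Progression"
    return "Pass", "Normal"


# Made-up historical throughput values [Mpps]; 30.0 is an obvious outlier.
history = [10.0, 10.2, 9.8, 10.1, 9.9, 10.0, 10.3, 9.7, 30.0]
tma, tmsd = trend_metrics(history)
print(round(tma, 3), round(tmsd, 3))
for sample in (10.1, 9.0, 11.0):
    print(sample, evaluate(sample, tma, tmsd))
```

For this window the outlier is dropped, the trimmed mean is about 10.0 and the trimmed standard deviation about 0.2, so the normal range is roughly 9.4 to 10.6.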

#### Parameters to specify:

General section - parameters common to all plots:

- type: "cpta";
- title: The title of this section;
- output-file-type: only ".html" is supported;
- output-file: path where the generated files will be stored.

#### Plots section:

- plot title;
- output file name;
- input data for plots;
  - job to be monitored - the Jenkins job whose results are used as input data for this test;
  - builds used for trending plot(s) - specified by a list of build numbers or by a range of builds defined by the first and the last build number;
- tests to be displayed in the plot, defined by a filter;
- list of parameters to extract from the data;
- plot layout.

### Example:

```
type: "cpta"
title: "Continuous Performance Trending and Analysis"
output-file-type: ".html"
output-file: "{DIR[STATIC, VPP]}/cpta"
plots:
  - title: "VPP 1T1C L2 64B Packet Throughput - Trending"
    output-file-name: "l2-1t1c-x520"
    data: "plot-performance-trending-vpp"
    filter: "'NIC_Intel-X520-DA2' and 'MRR' and '64B' and ('BASE' or 'SCALE') and '1T1C' and ('L2BDMACSTAT' or 'L2BDMACLRN' or 'L2XCFWD') and not 'VHOST' and not 'MEMIF'"
    parameters:
    - "result"
    layout: "plot-cpta-vpp"
  - title: "DPDK 4T4C IMIX MRR Trending"
    output-file-name: "dpdk-imix-4t4c-xl710"
    data: "plot-performance-trending-dpdk"
    filter: "'NIC_Intel-XL710' and 'IMIX' and 'MRR' and '4T4C' and 'DPDK'"
    parameters:
    - "result"
    layout: "plot-cpta-dpdk"
```

#### The Dashboard

Performance dashboard tables provide the latest VPP throughput trend, trend compliance and detected anomalies, all on a per VPP test case basis. The Dashboard is generated as three tables for 1t1c, 2t2c and 4t4c MRR tests.

At first, the .csv tables are generated (only the table for 1t1c is shown):

```
type: "table"
title: "Performance trending dashboard"
algorithm: "table_perf_trending_dash"
output-file-ext: ".csv"
output-file: "{DIR[STATIC, VPP]}/performance-trending-dashboard-1t1c"
data: "plot-performance-trending-all"
filter: "'MRR' and '1T1C'"
parameters:
- "name"
- "parent"
- "result"
ignore-list:
- "tests.vpp.perf.l2.10ge2p1x520-eth-l2bdscale1mmaclrn-mrr.tc01-64b-1t1c-eth-l2bdscale1mmaclrn-ndrdisc"
outlier-const: 1.5
window: 14
evaluated-window: 14
long-trend-window: 180
```

Then, html tables stored inside .rst files are generated:

```
type: "table"
title: "HTML performance trending dashboard 1t1c"
algorithm: "table_perf_trending_dash_html"
input-file: "{DIR[STATIC, VPP]}/performance-trending-dashboard-1t1c.csv"
output-file: "{DIR[STATIC, VPP]}/performance-trending-dashboard-1t1c.rst"
```

## 5.3.7 Root Cause Analysis

Root Cause Analysis (RCA) is performed by re-analysing archived performance results for a specified:

- range of job builds,
- set of specific tests and
- PASS/FAIL criteria to detect performance change.

In addition, PAL generates trending plots to show performance over the specified time interval.

### Root Cause Analysis - Option 1: Analysing Archived VPP Results

This option can be used to speed up the process, or when the existing data is sufficient. In this case, PAL uses existing data saved in Nexus, searches for performance degradations and generates plots to show performance over the specified time interval for the selected tests.

#### **Execution Sequence**

- 1. Download and parse archived historical data and the new data.
- 2. Calculate trend metrics.
- 3. Find regression / progression.
- 4. Generate and publish results:
  - 1. Summary graphs to include measured values with Progression and Regression markers.
  - 2. List the DUT build(s) where the anomalies were detected.

## **CSIT PAL Specification**

- What to test:
  - first build (Good); specified by the Jenkins job name and the build number
  - last build (Bad); specified by the Jenkins job name and the build number
  - step (1..n).
- Data:
  - tests of interest; list of tests (full name is used) whose results are used.

### Example:

TODO

## 5.3.8 API

### List of modules, classes, methods and functions

```
specification_parser.py
    class Specification
        Methods:
            read_specification
            set_input_state
            set_input_file_name
        Getters:
            specification
            environment
            debug
            is_debug
            input
            builds
            output
            tables
            plots
            files
            static
input_data_parser.py
    class InputData
        Methods:
            read_data
            filter_data
        Getters:
            data
            metadata
            suites
            tests
environment.py
    Functions:
        clean_environment
    class Environment
        Methods:
            set_environment
        Getters:
            environment
input_data_files.py
    Functions:
        download_data_files
        unzip_files
generator_tables.py
    Functions:
        generate_tables
    Functions implementing algorithms to generate particular types of
    tables (called by the function "generate_tables"):
        table_details
        table_performance_improvements
generator_plots.py
    Functions:
        generate_plots
    Functions implementing algorithms to generate particular types of
    plots (called by the function "generate_plots"):
        plot_performance_box
        plot_latency_box
generator_files.py
    Functions:
        generate_files
    Functions implementing algorithms to generate particular types of
    files (called by the function "generate_files"):
        file_test_results
report.py
    Functions:
        generate_report
    Functions implementing algorithms to generate particular types of
    report (called by the function "generate_report"):
        generate_html_report
        generate_pdf_report
    Other functions called by the function "generate_report":
        archive_input_data
        archive_report
```

### **PAL** functional diagram



## How to add an element

An element can be added by adding its model to the specification file. If the element is to be generated by an existing algorithm, only its parameters must be set.

If a brand new type of element needs to be added, also the algorithm must be implemented. Element generation algorithms are implemented in the files with names starting with "generator" prefix. The name of the function implementing the algorithm and the name of algorithm in the specification file have to be the same.
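The name-matching rule can be illustrated with a minimal sketch. It is not PAL's actual dispatch code: the registry `ALGORITHMS`, the function `table_new_kind`, the helper `generate_element` and the layout of `model` are hypothetical and only show how an algorithm name from the specification could be resolved to a function of the same name.

```python
# Sketch only: PAL's real dispatch code may differ.

def table_details(table):
    """Stand-in for an existing algorithm from generator_tables.py."""
    return f"details: {table['title']}"


def table_new_kind(table):
    """Hypothetical algorithm implemented for a brand new table type."""
    return f"new kind: {table['title']}"


# Hypothetical registry of the functions found in the "generator" files,
# keyed by function name.
ALGORITHMS = {func.__name__: func for func in (table_details, table_new_kind)}


def generate_element(model):
    """Call the function whose name equals the model's "algorithm" value."""
    try:
        algorithm = ALGORITHMS[model["algorithm"]]
    except KeyError:
        raise NameError(f"Algorithm {model['algorithm']!r} is not implemented.")
    return algorithm(model)


# Hypothetical element model, as it might look once read from the
# specification file.
model = {"type": "table", "title": "Example", "algorithm": "table_new_kind"}
generated = generate_element(model)
print(generated)
```

If the name in the specification does not match any implemented function, the lookup fails, which is why both names have to be identical.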

# 5.4 CSIT RF Tags Descriptions

All CSIT test cases are labelled with Robot Framework tags, which allow easy identification of the test case type, test case grouping, and selection for execution. The following sections list the currently used CSIT tags and their documentation, based on the content of the tag documentation rst file<sup>203</sup>.
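As an illustration of tag-based selection, the sketch below filters a set of test cases by required and excluded tags. The test names, the `tests` dictionary and the `select_tests` helper are hypothetical; in practice the selection is done by Robot Framework itself, not by this code.

```python
def select_tests(tests, include=(), exclude=()):
    """Return names of tests carrying all `include` tags and no `exclude` tag."""
    required, forbidden = set(include), set(exclude)
    return sorted(
        name
        for name, tags in tests.items()
        if required <= tags and not (forbidden & tags)
    )


# Hypothetical test cases, each labelled with tags from the sections below.
tests = {
    "hypothetical-64b-1c-l2xc-ndrpdr": {
        "PERFTEST", "NDRPDR", "64B", "1C", "L2XCFWD"},
    "hypothetical-64b-2c-ip4-mrr": {
        "PERFTEST", "MRR", "64B", "2C", "IP4FWD"},
    "hypothetical-78b-1c-ip6-ndrpdr": {
        "PERFTEST", "NDRPDR", "78B", "1C", "IP6FWD", "SKIP_PATCH"},
}

selected = select_tests(tests, include=["NDRPDR", "1C"], exclude=["SKIP_PATCH"])
print(selected)
```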

# 5.4.1 Testbed Topology Tags

## 2\_NODE\_DOUBLE\_LINK\_TOPO

2 nodes connected in a circular topology with two links interconnecting the devices.

## 2\_NODE\_SINGLE\_LINK\_TOPO

2 nodes connected in a circular topology with at least one link interconnecting devices.

## 3\_NODE\_DOUBLE\_LINK\_TOPO

3 nodes connected in a circular topology with two links interconnecting the devices.

## 3\_NODE\_SINGLE\_LINK\_TOPO

3 nodes connected in a circular topology with at least one link interconnecting devices.

# 5.4.2 Objective Tags

## SKIP\_PATCH

Test case(s) marked to not run in case of vpp-csit-verify (i.e. VPP patch) and csit-vpp-verify jobs (i.e. CSIT patch).

### SKIP\_VPP\_PATCH

Test case(s) marked to not run in case of vpp-csit-verify (i.e. VPP patch).

# **5.4.3 Environment Tags**

## **HW\_ENV**

DUTs and TGs are running on bare metal.

### VM\_ENV

DUTs and TGs are running in a virtual environment.

## VPP\_VM\_ENV

DUTs with VPP, capable of running a virtual machine.

<sup>203</sup> https://git.fd.io/csit/tree/docs/tag\_documentation.rst?h=rls2001

# 5.4.4 NIC Model Tags

## NIC\_Intel-X520-DA2

Intel X520-DA2 NIC.

## NIC\_Intel-XL710

Intel XL710 NIC.

## NIC\_Intel-X710

Intel X710 NIC.

## NIC\_Intel-XXV710

Intel XXV710 NIC.

## NIC\_Cisco-VIC-1227

VIC-1227 by Cisco.

## NIC\_Cisco-VIC-1385

VIC-1385 by Cisco.

# 5.4.5 Scaling Tags

## FIB\_20K

2x10,000 entries in single fib table.

## FIB\_200K

2x100,000 entries in single fib table.

## FIB\_2M

2x1,000,000 entries in single fib table.

## L2BD\_1

Test with 1 L2 bridge domain.

## L2BD\_10

Test with 10 L2 bridge domains.

## L2BD\_100

Test with 100 L2 bridge domains.

## L2BD\_1K

Test with 1000 L2 bridge domains.

### VLAN\_1

Test with 1 VLAN sub-interface.

## VLAN\_10

Test with 10 VLAN sub-interfaces.

## VLAN\_100

Test with 100 VLAN sub-interfaces.

## VLAN\_1K

Test with 1000 VLAN sub-interfaces.

## VXLAN\_1

Test with 1 VXLAN tunnel.

## VXLAN\_10

Test with 10 VXLAN tunnels.

## VXLAN\_100

Test with 100 VXLAN tunnels.

## VXLAN\_1K

Test with 1000 VXLAN tunnels.

## TNL\_{t}

IPSec in tunnel mode - {t} tunnels.

## SRC\_USER\_1

Traffic flow with 1 unique IP (users) in one direction.

## SRC\_USER\_10

Traffic flow with 10 unique IPs (users) in one direction.

## SRC\_USER\_100

Traffic flow with 100 unique IPs (users) in one direction.

## SRC\_USER\_1000

Traffic flow with 1000 unique IPs (users) in one direction.

## SRC\_USER\_2000

Traffic flow with 2000 unique IPs (users) in one direction.

### SRC\_USER\_4000

Traffic flow with 4000 unique IPs (users) in one direction.

#### 100\_FLOWS

Traffic stream with 100 unique flows (10 IPs/users x 10 UDP ports) in one direction.

#### 10k\_FLOWS

Traffic stream with 10 000 unique flows (10 IPs/users x 1000 UDP ports) in one direction.

### 100k\_FLOWS

Traffic stream with 100 000 unique flows (100 IPs/users x 1000 UDP ports) in one direction.
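The flow counts in these tags are the product of the number of source IPs and the number of UDP ports. A minimal sketch of that arithmetic follows; the address and port ranges are hypothetical and chosen only to make the flows distinct.

```python
from itertools import product


def unique_flows(n_ips, n_ports):
    """Build the set of (source IP, UDP port) pairs for one direction."""
    # Hypothetical addressing and port range, for illustration only.
    ips = [f"10.0.{i // 256}.{i % 256}" for i in range(n_ips)]
    ports = range(1024, 1024 + n_ports)
    return set(product(ips, ports))


# (IPs/users, UDP ports) per tag, as given in the tag descriptions.
FLOW_TAGS = {
    "100_FLOWS": (10, 10),
    "10k_FLOWS": (10, 1000),
    "100k_FLOWS": (100, 1000),
}

flow_counts = {tag: len(unique_flows(*dims)) for tag, dims in FLOW_TAGS.items()}
for tag, count in flow_counts.items():
    print(tag, count)
```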

# 5.4.6 Test Category Tags

#### **FUNCTEST**

All functional test cases.

### **PERFTEST**

All performance test cases.

# **5.4.7 Performance Type Tags**

## **NDRPDR**

Single test finding both No Drop Rate and Partial Drop Rate simultaneously. The search is done by an optimized algorithm which performs multiple trial runs at different durations and transmit rates. The results come from the final trials, which have a duration of 30 seconds.

### **MRR**

Performance tests where the TG sends the traffic at maximum rate (line rate) and reports total sent/received packets over the trial duration. The result is an average of 10 trials of 1 second duration.
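The averaging step can be sketched as follows. This assumes the reported value is the mean of the per-trial receive rates; the packet counts are made-up sample data and `mrr` is a hypothetical helper, not a CSIT function.

```python
from statistics import mean


def mrr(received_per_trial, trial_duration=1.0):
    """Average receive rate [pps] over equally long trials."""
    rates = [count / trial_duration for count in received_per_trial]
    return mean(rates)


# Made-up packet counts received in 10 trials of 1 second each.
received = [1_000_000 + 1_000 * i for i in range(10)]
mrr_value = mrr(received)
print(mrr_value)
```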

## **SOAK**

Performance tests using PLRsearch to find the critical load.

#### **RECONF**

Performance tests that measure lost packets (time) during reconfiguration while the full-throughput offered load is applied.

# **5.4.8 Ethernet Frame Size Tags**

These tags describe the traffic offered by the Traffic Generator, or the "primary" traffic in case of asymmetric load. For traffic between DUTs, or for "secondary" traffic, see the \${overhead} value.

#### 64B

64B frames used for test. Generic Ethernet or IPv4.

#### 78B

78B frames used for test. IPv6.

#### 114B

114B frames used for test. IPv4+VXLAN.

#### 118B

118B frames used for test. Dot1q+IPv4+VXLAN.

#### **IMIX**

IMIX frame sequence (28x 64B, 16x 570B, 4x 1518B) used for test.
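The average frame size of this IMIX sequence follows directly from the listed counts. The sketch below computes it and, as an added assumption not stated in this section, the theoretical Ethernet frame rate using 20 bytes of per-frame overhead (preamble, start-of-frame delimiter and inter-frame gap).

```python
# Frame size [B] -> number of frames in one IMIX sequence (from the tag text).
IMIX = {64: 28, 570: 16, 1518: 4}

total_frames = sum(IMIX.values())
total_bytes = sum(size * count for size, count in IMIX.items())
average_frame_size = total_bytes / total_frames


def max_frame_rate(link_bps, frame_size, overhead=20):
    """Theoretical frames per second; `overhead` is an assumed 20 B per frame."""
    return link_bps / ((frame_size + overhead) * 8)


print(total_frames, total_bytes, round(average_frame_size, 2))
print(round(max_frame_rate(10e9, 64)))
print(round(max_frame_rate(10e9, average_frame_size)))
```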

#### 1460B

1460B frames used for test.

## 1480B

1480B frames used for test.

#### 1514B

1514B frames used for test.

### 1518B

1518B frames used for test.

#### 9000B

9000B frames used for test.

# 5.4.9 Test Type Tags

## **BASE**

Baseline test cases, no encapsulation, no feature(s) configured in tests.

### **IP4BASE**

IPv4 baseline test cases, no encapsulation, no feature(s) configured in tests.

### **IP6BASE**

IPv6 baseline test cases, no encapsulation, no feature(s) configured in tests.

### **L2XCBASE**

L2XC baseline test cases, no encapsulation, no feature(s) configured in tests.

#### L2BDBASE

L2BD baseline test cases, no encapsulation, no feature(s) configured in tests.

#### L2PATCH

L2PATCH baseline test cases, no encapsulation, no feature(s) configured in tests.

#### **SCALE**

Scale test cases.

### **ENCAP**

Test cases where encapsulation is used. Use also encapsulation tag(s).

## **FEATURE**

At least one feature is configured in test cases. Use also feature tag(s).

## **TCP**

Tests which use TCP.

## TCP\_CPS

Performance tests which measure connections per second using HTTP requests.

## TCP\_RPS

Performance tests which measure requests per second using HTTP requests.

## **HTTP**

Tests which use HTTP.

## **NF\_DENSITY**

Performance tests that measure throughput of multiple VNF and CNF service topologies at different service densities.

# 5.4.10 NF Service Density Tags

#### **CHAIN**

NF service density tests with VNF or CNF service chain topology(ies).

#### PIPE

NF service density tests with CNF service pipeline topology(ies).

## NF\_L3FWDIP4

NF service density tests with DPDK l3fwd IPv4 routing as NF workload.

#### NF\_VPPIP4

NF service density tests with VPP IPv4 routing as NF workload.

### {r}R{c}C

Service density matrix locator {r}R{c}C: {r}Row denotes the number of service instances, {c}Column denotes the number of NFs per service instance. {r}=(1,2,4,6,8,10), {c}=(1,2,4,6,8,10).

### {n}VM{t}T

Service density {n}VM{t}T: {n} is the number of NF QEMU VMs, {t} is the number of threads per NF.

### {n}DCR{t}T

Service density {n}DCR{t}T: {n} is the number of NF Docker containers, {t} is the number of threads per NF.
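The three parameterised tag forms above can be decoded mechanically. The following is a minimal sketch; the `PATTERNS` table and the `parse_density_tag` helper are hypothetical and are not part of CSIT.

```python
import re

# One pattern per parameterised tag form described above.
PATTERNS = {
    "matrix": re.compile(r"^(?P<rows>\d+)R(?P<cols>\d+)C$"),
    "vm": re.compile(r"^(?P<n>\d+)VM(?P<t>\d+)T$"),
    "dcr": re.compile(r"^(?P<n>\d+)DCR(?P<t>\d+)T$"),
}


def parse_density_tag(tag):
    """Return (kind, parameters) for a service density tag, else None."""
    for kind, pattern in PATTERNS.items():
        match = pattern.match(tag)
        if match:
            return kind, {key: int(val) for key, val in match.groupdict().items()}
    return None


kind, params = parse_density_tag("2R4C")
# Total NFs = service instances x NFs per service instance.
total_nfs = params["rows"] * params["cols"]
print(kind, params, total_nfs)
print(parse_density_tag("4VM2T"))
print(parse_density_tag("4DCR1T"))
```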

## {n}\_ADDED\_CHAINS

{n} is the number of chains (or pipelines) added (and/or removed) during a RECONF test.

# 5.4.11 Forwarding Mode Tags

## L2BDMACSTAT

VPP L2 bridge-domain, L2 MAC static.

#### L2BDMACLRN

VPP L2 bridge-domain, L2 MAC learning.

### L2XCFWD

VPP L2 point-to-point cross-connect.

#### **IP4FWD**

VPP IPv4 routed forwarding.

### **IP6FWD**

VPP IPv6 routed forwarding.

## LOADBALANCER\_MAGLEV

VPP Load balancer maglev mode.

## LOADBALANCER\_L3DSR

VPP Load balancer l3dsr mode.

## LOADBALANCER\_NAT4

VPP Load balancer nat4 mode.

# 5.4.12 Underlay Tags

### **IP4UNRLAY**

IPv4 underlay.

## **IP6UNRLAY**

IPv6 underlay.

## **MPLSUNRLAY**

MPLS underlay.

# 5.4.13 Overlay Tags

#### **L2OVRLAY**

L2 overlay.

## **IP4OVRLAY**

IPv4 overlay (IPv4 payload).

### **IP6OVRLAY**

IPv6 overlay (IPv6 payload).

# 5.4.14 Tagging Tags

## DOT1Q

All test cases with dot1q.

#### DOT1AD

All test cases with dot1ad.

# **5.4.15 Encapsulation Tags**

#### ETH

All test cases with base Ethernet (no encapsulation).

#### LISP

All test cases with LISP.

### **LISPGPE**

All test cases with LISP-GPE.

## LISP\_IP4o4

All test cases with LISP\_IP4o4.

## LISPGPE\_IP4o4

All test cases with LISPGPE\_IP4o4.

## LISPGPE\_IP6o4

All test cases with LISPGPE\_IP6o4.

## LISPGPE\_IP4o6

All test cases with LISPGPE\_IP4o6.

## LISPGPE\_IP6o6

All test cases with LISPGPE\_IP6o6.

## **VXLAN**

All test cases with VXLAN.

## **VXLANGPE**

All test cases with VXLAN-GPE.

#### GRE

All test cases with GRE.

### **IPSEC**

All test cases with IPSEC.

### SRv6

All test cases with Segment routing over IPv6 dataplane.

### SRv6\_1SID

All SRv6 test cases with single SID.

### SRv6\_2SID\_DECAP

All SRv6 test cases with two SIDs and with decapsulation.

## SRv6\_2SID\_NODECAP

All SRv6 test cases with two SIDs and without decapsulation.

# **5.4.16 Interface Tags**

### PHY

All test cases which use physical interface(s).

### **VHOST**

All test cases which use VHOST.

### VHOST\_256

All test cases which use VHOST with qemu queue size set to 256.

### VHOST\_1024

All test cases which use VHOST with qemu queue size set to 1024.

## CFS\_OPT

All test cases which use a VM with optimised scheduler policy.

### **TUNTAP**

All test cases which use TUN and TAP.

#### **AFPKT**

All test cases which use AFPKT.

### **NETMAP**

All test cases which use Netmap.

### **MEMIF**

All test cases which use Memif.

## SINGLE\_MEMIF

All test cases which use only a single Memif connection per DUT. One DUT instance is running in a container having one physical interface exposed to the container.

#### **LBOND**

All test cases which use link bonding (BondEthernet interface).

### LBOND\_DPDK

All test cases which use DPDK link bonding.

#### LBOND\_VPP

All test cases which use VPP link bonding.

#### LBOND\_MODE\_XOR

All test cases which use link bonding with mode XOR.

## LBOND\_MODE\_LACP

All test cases which use link bonding with mode LACP.

### LBOND\_LB\_L34

All test cases which use link bonding with load-balance mode l34.

## LBOND\_1L

All test cases which use one link for link bonding.

## LBOND\_2L

All test cases which use two links for link bonding.

## DRV\_AVF

All test cases which use the Intel Adaptive Virtual Function (AVF) device plugin for VPP. This plugin provides native device support for Intel AVF. AVF is a driver specification for current and future Intel Virtual Function devices. In essence, today this driver can be used only with Intel XL710 / X710 / XXV710 adapters.

## DRV\_VFIO\_PCI

All test cases which use the vfio-pci device driver. It supports a variety of NIC adapters.

### DRV\_RDMA\_CORE

All test cases which use the rdma-core device driver. It supports Mellanox NIC adapters.

### RXQ\_SIZE\_{n}

All test cases in which the RXQ size (RX descriptors) is set to {n}. Default is 0, which means VPP (API) default.

### TXQ\_SIZE\_{n}

All test cases in which the TXQ size (TX descriptors) is set to {n}. Default is 0, which means VPP (API) default.

# **5.4.17 Feature Tags**

## IACLDST

iACL destination.

## **COPWHLIST**

COP whitelist.

#### NAT44

NAT44 configured and tested.

### NAT64

NAT64 configured and tested.

### ACL

ACL plugin configured and tested.

### **IACL**

ACL plugin configured and tested on input path.

### OACL

ACL plugin configured and tested on output path.

## **ACL\_STATELESS**

ACL plugin configured and tested in stateless mode (permit action).

### **ACL\_STATEFUL**

ACL plugin configured and tested in stateful mode (permit+reflect action).

### ACL1

ACL plugin configured and tested with 1 not-hitting ACE.

### ACL10

ACL plugin configured and tested with 10 not-hitting ACEs.

## ACL50

ACL plugin configured and tested with 50 not-hitting ACEs.

## SRv6\_PROXY

SRv6 endpoint to SR-unaware appliance via proxy.

## SRv6\_PROXY\_STAT

SRv6 endpoint to SR-unaware appliance via static proxy.

## SRv6\_PROXY\_DYN

SRv6 endpoint to SR-unaware appliance via dynamic proxy.

## SRv6\_PROXY\_MASQ

SRv6 endpoint to SR-unaware appliance via masquerading proxy.

# **5.4.18 Encryption Tags**

## **IPSECSW**

Crypto in software.

### **IPSECHW**

Crypto in hardware.

#### **IPSECTRAN**

IPSec in transport mode.

### **IPSECTUN**

IPSec in tunnel mode.

## **IPSECINT**

IPSec in interface mode.

## AES

IPSec using AES algorithms.

## AES\_128\_CBC

IPSec using AES 128 CBC algorithms.

## AES\_128\_GCM

IPSec using AES 128 GCM algorithms.

## AES\_256\_GCM

IPSec using AES 256 GCM algorithms.

## **HMAC**

IPSec using HMAC integrity algorithms.

## HMAC\_SHA\_256

IPSec using HMAC SHA 256 integrity algorithms.

## HMAC\_SHA\_512

IPSec using HMAC SHA 512 integrity algorithms.

# 5.4.19 Client-Workload Tags

#### VM

All test cases which use at least one virtual machine.

## LXC

All test cases which use Linux container and LXC utils.

#### DRC

All test cases which use at least one Docker container.

#### **DOCKER**

All test cases which use Docker as container manager.

#### **APP**

All test cases with specific APP use.

# 5.4.20 Container Orchestration Tags

#### **1VSWITCH**

VPP running in Docker container acting as VSWITCH.

## **1VNF**

1 VPP running in Docker container acting as VNF work load.

### 2VNF

2 VPP running in 2 Docker containers acting as VNF work load.

## **4VNF**

4 VPP running in 4 Docker containers acting as VNF work load.

# 5.4.21 Multi-Threading Tags

#### **STHREAD**

**Dynamic tag.** All test cases using a single poll mode thread.

#### **MTHREAD**

**Dynamic tag.** All test cases using more than one poll mode driver thread.

#### 1NUMA

All test cases with packet processing on single socket.

#### 2NUMA

All test cases with packet processing on two sockets.

### 1C

1 worker thread pinned to 1 dedicated physical core; or if HyperThreading is enabled, 2 worker threads each pinned to a separate logical core within 1 dedicated physical core. Main thread pinned to core 1.

#### 2C

2 worker threads pinned to 2 dedicated physical cores; or if HyperThreading is enabled, 4 worker threads each pinned to a separate logical core within 2 dedicated physical cores. Main thread pinned to core 1.

#### 4C

4 worker threads pinned to 4 dedicated physical cores; or if HyperThreading is enabled, 8 worker threads each pinned to a separate logical core within 4 dedicated physical cores. Main thread pinned to core 1.

#### 1T1C

**Dynamic tag.** 1 worker thread pinned to 1 dedicated physical core. 1 receive queue per interface. Main thread pinned to core 1.

#### **2T2C**

**Dynamic tag.** 2 worker threads pinned to 2 dedicated physical cores. 1 receive queue per interface. Main thread pinned to core 1.

#### 4T4C

**Dynamic tag.** 4 worker threads pinned to 4 dedicated physical cores. 2 receive queues per interface. Main thread pinned to core 1.

#### 2T1C

**Dynamic tag.** 2 worker threads each pinned to a separate logical core within 1 dedicated physical core. 1 receive queue per interface. Main thread pinned to core 1.

#### 4T2C

**Dynamic tag.** 4 worker threads each pinned to a separate logical core within 2 dedicated physical cores. 2 receive queues per interface. Main thread pinned to core 1.

## 8T4C

**Dynamic tag.** 8 worker threads each pinned to a separate logical core within 4 dedicated physical cores. 4 receive queues per interface. Main thread pinned to core 1.
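The dynamic {t}T{c}C tags encode the worker thread and physical core counts, from which the number of threads per core follows. The sketch below decodes them; the receive queue counts are copied from the tag descriptions above, while the `describe_thread_tag` helper and the result keys are hypothetical.

```python
import re

# Receive queues per interface, as listed in the tag descriptions above.
RX_QUEUES = {"1T1C": 1, "2T2C": 1, "4T4C": 2, "2T1C": 1, "4T2C": 2, "8T4C": 4}


def describe_thread_tag(tag):
    """Decode a {t}T{c}C tag into thread, core and queue counts."""
    match = re.fullmatch(r"(\d+)T(\d+)C", tag)
    if match is None or tag not in RX_QUEUES:
        raise ValueError(f"Unknown multi-threading tag: {tag!r}")
    threads, cores = (int(group) for group in match.groups())
    return {
        "worker_threads": threads,
        "physical_cores": cores,
        # 2 threads per core corresponds to the HyperThreading variants.
        "threads_per_core": threads // cores,
        "rx_queues_per_interface": RX_QUEUES[tag],
    }


for tag in RX_QUEUES:
    print(tag, describe_thread_tag(tag))
```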

# 5.4.22 Honeycomb Tags

## **HC\_FUNC**

Honeycomb functional test cases.

## HC\_NSH

Honeycomb NSH test cases.

## **HC\_PERSIST**

Honeycomb persistence test cases.

## HC\_REST\_ONLY

(Exclusion tag) Honeycomb test cases that cannot be run in Netconf mode using the ODL client for Restconf -> Netconf translation.

## **BIBLIOGRAPHY**

- [lxc] Linux Containers: https://linuxcontainers.org/
- [lxcnamespace] Resource management: Linux kernel Namespaces and cgroups: https://www.cs.ucsb.edu/~rich/class/cs293b-cloud/papers/lxc-namespace.pdf
- [stgraber] LXC 1.0: Blog post series: https://stgraber.org/2013/12/20/lxc-1-0-blog-post-series/
- [lxcsecurity] Linux Containers Security: https://linuxcontainers.org/lxc/security/
- [capabilities] Linux manual - capabilities - overview of Linux capabilities: http://man7.org/linux/man-pages/man7/capabilities.7.html
- [cgroup1] Linux kernel documentation: cgroups: https://www.kernel.org/doc/Documentation/cgroup-v1/cgroups.txt
- [cgroup2] Linux kernel documentation: Control Group v2: https://www.kernel.org/doc/Documentation/cgroup-v2.txt
- [selinux] SELinux Project Wiki: http://selinuxproject.org/page/Main_Page
- [lxcsecfeatures] LXC 1.0: Security features: https://stgraber.org/2014/01/01/lxc-1-0-security-features/
- [lxcsource] Linux Containers source: https://github.com/lxc/lxc
- [apparmor] Ubuntu AppArmor: https://wiki.ubuntu.com/AppArmor
- [seccomp] SECure COMPuting with filters: https://www.kernel.org/doc/Documentation/prctl/seccomp_filter.txt
- [docker] Docker: https://www.docker.com/what-docker
- [k8sdoc] Kubernetes documentation: https://kubernetes.io/docs/home/
- [TWSLink] TWS: https://wiki.fd.io/view/CSIT/TWS
- [dockerhub] Docker hub: https://hub.docker.com/
- [fdiocsitgerrit] FD.io/CSIT gerrit: https://gerrit.fd.io/r/CSIT
- [fdioregistry] FD.io registry
- [JenkinsSlaveDcrFile] jenkins-slave-dcr-file: https://github.com/snergfdio/multivppcache/blob/master/ubuntu18/Dockerfile
- [CsitShimDcrFile] csit-shim-dcr-file: https://github.com/snergfdio/multivppcache/blob/master/csit-shim/Dockerfile
- [CsitSutDcrFile] csit-sut-dcr-file: https://github.com/snergfdio/multivppcache/blob/master/csit-sut/Dockerfile
- [ansiblelink] ansible: https://www.ansible.com/
- [fdiocsitansible] Fd.io/CSIT ansible: https://git.fd.io/csit/tree/resources/tools/testbed-setup/ansible
- [inteli40e] Intel i40e: https://downloadmirror.intel.com/26370/eng/readme.txt
- [pciids] pci ids: http://pci-ids.ucw.cz/v2.2/pci.ids