Release Notes

Changes in CSIT-2210

VPP PERFORMANCE TESTS
- Added new performance testbed 3n-snr (3 Node SnowRidge, with Intel Atom processors), to later replace 3n-dnv and 2n-dnv (3 and 2 Node Denverton) testbeds.
- Added GTPU HW offload tests using VPP GTPU hardware offload with Intel e810 4p25ge NICs (3n-icx testbeds only). These tests were already present in CSIT-2206, but yielded invalid results because they used TRex v2.97, which was incompatible with the e810 NICs used for those tests.
- Added Wireguard tests using VPP software crypto (3n-icx, 3n-snr testbeds) and using built-in hardware crypto QAT device (3n-snr testbed only).
- Reduction of tests: Removed certain test variations executed iteratively for the report (as well as in daily and weekly trending) due to physical testbed overload.
TEST FRAMEWORK
- CSIT-2210 executes all VPP v22.10 performance tests using VPP ubuntu2204 images, due to the CSIT execution environment change noted below. This applies to all performance testbeds except Denverton. Consequently, VPP v22.06 has not been re-tested in the CSIT-2210 environment, as no ubuntu2204 images are available for that VPP version. Performance comparison between VPP v22.10 (current version) and VPP v22.06 (previous version) may be impacted by the VPP build environment change (ubuntu2004 to ubuntu2204) and the CSIT environment change. See Root Cause Analysis for Performance Changes for details.
- CSIT test environment version has been updated to ver. 11, see Environment Versioning.
- TCP TPUT profiles had to be changed, as newer TRex versions are not deterministic enough when deciding when to send an ACK.
- CSIT PAPI support: Due to issues with PAPI performance, and deprecation of VAT, VPP CLI is used in CSIT for many VPP scale tests. See Known Issues.
- General Code Housekeeping: Ongoing code optimizations and bug fixes.
PRESENTATION AND ANALYTICS LAYER
- C-Dash performance dashboard received an updated UI and an updated backend, improving its performance and robustness.

Known Issues

New

#	JiraID	Issue Description
1	CSIT-1850	2n-dnv: sporadic 1518B tput tests failing to establish required sessions.
2	CSIT-1864	2n-clx: half of the packets lost on PDR tests.
3	CSIT-1868	2n-clx: All ldpreload-nginx tests fail when trying to start nginx.
4	CSIT-1871	3n-snr: 25GE interface between SUT and TG/TRex goes down randomly.
5	CSIT-1877	3n-alt, 3n-tsh: VM tests failing to boot VM.
6	CSIT-1883	3n-snr: All hwasync wireguard tests failing when trying to verify device.
7	CSIT-1884	2n-clx, 2n-icx: All NAT44DET NDR PDR IMIX over 1M sessions BIDIR tests failing to create enough sessions.
8	CSIT-1885	3n-icx: 9000b ip4 ip6 l2 NDRPDR AVF tests are failing to forward traffic.
9	CSIT-1886	3n-icx: Wireguard tests with 100 and more tunnels are failing PDR criteria.
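
Several issues above refer to tests failing PDR criteria. As a rough illustration only, the sketch below checks a trial result against a loss-ratio limit. It assumes the PDR goal is a packet loss ratio of at most 0.5% and that NDR means zero loss; the function and constant names are hypothetical and are not taken from the CSIT code base.

```python
# Illustrative sketch only: the names below are hypothetical and the
# 0.5% threshold is an assumption about the PDR loss-ratio goal.
PDR_MAX_LOSS_RATIO = 0.005  # assumed PDR goal: at most 0.5% of packets lost
NDR_MAX_LOSS_RATIO = 0.0    # assumed NDR goal: no packets lost


def loss_ratio(sent: int, received: int) -> float:
    """Return the fraction of sent packets that were not received."""
    if sent <= 0:
        raise ValueError("sent must be positive")
    if not 0 <= received <= sent:
        raise ValueError("received must be between 0 and sent")
    return (sent - received) / sent


def meets_criteria(sent: int, received: int,
                   max_loss: float = PDR_MAX_LOSS_RATIO) -> bool:
    """Check whether a single trial stays within the allowed loss ratio."""
    return loss_ratio(sent, received) <= max_loss
```

Under these assumptions, a trial that loses half of its packets (as described in CSIT-1864) has a loss ratio of 0.5 and fails the check, while a trial losing 4 packets out of 1000 passes the PDR check but not the NDR one.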

Previous

Issues reported in previous releases which still affect the current results.

#	JiraID	Issue Description
1	CSIT-1671, VPP-1763	CSIT scale tests cannot use PAPI, because it is much slower than VAT/CLI (it takes much longer to program VPP). This needs to be addressed on the PAPI side. Currently, the time-critical code uses VAT running large files with exec statements and CLI commands. Still, we needed to reduce the number of scale tests run to keep the overall duration reasonable. More improvements are needed to achieve sufficient configuration speed.
2	CSIT-1782	Multicore AVF tests are failing when trying to create interface. Frequency is reduced by CSIT workaround, but occasional failures do still happen.
3	CSIT-1785, VPP-1972	NAT44ED tests fail to establish all TCP sessions, at least at max scale, in the allotted time (limited by the 500s session timeout), because slow path performance is worse than previously measured and calibrated for. CSIT removed the max scale NAT tests to avoid this issue.
4	CSIT-1799	All NAT44-ED 16M sessions CPS scale tests fail while setting NAT44 address range.
5	CSIT-1800	All Geneve L3 mode scale tests (1024 tunnels) are failing.
6	CSIT-1801	9000B payload frames not forwarded over tunnels due to violating supported Max Frame Size (VxLAN, LISP, SRv6).
7	CSIT-1802	AF-XDP - NDR tests failing from time to time.
8	CSIT-1804	All testbeds: NDR tests failing from time to time.
9	CSIT-1808	All tests with 9000B payload frames not forwarded over memif interfaces.
10	CSIT-1827	3n-icx, 3n-skx: all AVF crypto tests sporadically fail. 1518B with no traffic, IMIX with excessive packet loss.
11	CSIT-1835	3n-icx: QUIC vppecho BPS tests failing on timeout when checking hoststack finished.
12	CSIT-1849	2n-skx, 2n-clx, 2n-icx: UDP 16m TPUT tests fail to create all sessions.

Fixed

Issues reported in previous releases which were fixed in this release:

#	JiraID	Issue Description
1	CSIT-1834	2n-icx, 2n-skx: sporadic AVF soak tests failing to find critical load with PLRsearch.
2	CSIT-1846	2n-skx, 2n-clx, 2n-icx: ALL 1518B TCP tput tests failing with big packet loss.
3	CSIT-1851	trending regression: various icelake tests around 2022-04-15. Somewhat expected consequence of a VPP usability fix; the previous VPP compiler version was too new for the OS used.

Root Cause Analysis for Performance Changes

List of RCAs in CSIT-2210 for VPP performance changes:

#	JiraID	Issue Description
1	CSIT-1887	rls2210 RCA: ASTF tests. TRex upgrade decreased TRex performance. NAT results not affected, except on Denverton due to interference from VPP-2010.
2	CSIT-1888	rls2210 RCA: testbed differences, especially for ipsec. Not caused by VPP code nor CSIT code. Most probable cause is clang-14 behavior.
3	CSIT-1889	rls2210 RCA: policy-outbound-nocrypto. When VPP added spd fast path matching (Gerrit 36097), it decreased MRR of the corresponding tests, at least on 3n-alt.