Why VPP is Awesome: Real-World Performance Testing

Andrej Binder
July 14, 2025
See how VPP outperforms Linux bridging in real-world tests. Learn why it's essential for fast, scalable, and reliable embedded networking at 10G and beyond.

In high-performance networking, everyone claims to be fast. But you can’t fake packets. So we built a lab, wired it for 10G, and asked a simple question:

How does traditional Linux bridging compare to VPP in the real world?

This wasn’t about squeezing every ounce of performance through kernel tuning. It wasn’t about theoretical max speeds. It was about running VPP through its paces in real-world, untuned scenarios—just like our customers do.

Spoiler: the results weren’t close.

🤔 Wait... What is VPP, and Why Does It Matter to Big Network?

Before we dive into the benchmark data, let’s answer the obvious question: what is VPP—and why are we testing it?

VPP stands for Vector Packet Processing, an open-source framework developed by the Fast Data Project (FD.io). It’s designed to do one thing exceptionally well: move packets fast. Instead of processing each packet one at a time through the Linux kernel, VPP handles them in batches—vectors—minimizing CPU overhead, avoiding unnecessary copies, and bypassing the kernel altogether. The result? Higher throughput and lower latency.

That kernel bypass is critical to Big Network’s architecture. Every Big Network Edge device—whether Edge Lite, Edge Pro, or IRG—uses VPP under the hood. These appliances aren’t general-purpose servers. They’re purpose-built packet-forwarding machines, connecting people, places, and devices through intelligent routing and multi-path tunnels. So if there’s a faster, more efficient way to move traffic, we want it.

And we wanted to see just how well VPP holds up in real-world conditions. 

So we ran three tests:

  • Test #1: Direct connection — no software involved
  • Test #2: Linux bridge — packets forwarded by the Linux kernel
  • Test #3: VPP bridge — kernel bypass with vector packet processing

Test Lab Setup

Every great showdown needs a solid crew. Ours was built entirely on Big Network Edge Pro devices—each powered by a 4-core Intel Atom C3558 CPU. All nodes were wired via 10GbE DAC cables and used a default MTU of 1500.
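
Before each run, link speed and MTU can be sanity-checked with standard tools. (The interface name eno1 below is an assumption; use whatever names your NICs enumerate as.)

ethtool eno1 | grep Speed    # expect Speed: 10000Mb/s on the DAC links
ip link show eno1            # confirms state UP and mtu 1500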

Here’s who played what role:

  • 🛠 Destroyer – The iperf3 client. It throws the packets. Hard.
  • 🗑 Dumpster – The iperf3 server. It catches everything. No judgment.
  • 🚀 Datapult – The bridge node. Sometimes old-school Linux, sometimes full-throttle VPP.
  • 🔌 Netinator – Our standby router for future tests and packet-mangling ambitions.

Each test ran for 15 seconds using iperf3.  
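
For reference, each run boils down to something like this (the 10.10.10.2 address is just a placeholder, not the lab's actual addressing):

iperf3 -s                      # on Dumpster: listen for incoming tests
iperf3 -c 10.10.10.2 -t 15     # on Destroyer: 15-second TCP test toward Dumpster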

Test #1: Direct Connection (Destroyer → Dumpster)

This is the baseline: a single 10GbE DAC cable between the client and server.

We start with the cleanest possible setup. Two devices. One cable. No middleman. This shows us what perfect looks like.

  • No routing.
  • No bridging.
  • Just clean, raw TCP between two machines.

Packets traveled straight from Destroyer to Dumpster over a single 10GbE DAC link. 

Result:

  • Throughput: 9.40 Gbps
  • Retransmits: 140

This is what full line-rate performance looks like. Minimal retransmits. Clean, simple, effective.

Test #2: Linux Bridge (Destroyer → Datapult [Linux bridge] → Dumpster)

This test introduces Datapult, configured as a traditional Linux software bridge using the classic brctl method, which relies on the kernel’s networking stack.

# bring up both 10G ports
/usr/sbin/ip link set eno1 up
/usr/sbin/ip link set eno2 up
# create the kernel bridge and attach both ports to it
/usr/sbin/brctl addbr br10g
/usr/sbin/ip link set br10g up
/usr/sbin/brctl addif br10g eno1
/usr/sbin/brctl addif br10g eno2
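
Side note: brctl comes from the legacy bridge-utils package. If you want to reproduce this without it, the same bridge can be built with iproute2 alone; here’s a minimal sketch assuming the same interface names:

/usr/sbin/ip link add name br10g type bridge
/usr/sbin/ip link set br10g up
/usr/sbin/ip link set eno1 master br10g
/usr/sbin/ip link set eno2 master br10g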

Linux is doing all the bridging in the kernel, using interrupt-driven drivers and softirq processing.

Result:

  • Throughput: 3.70 Gbps
  • Retransmits: 9600

Interpretation: The setup works, but performance cratered. CPU monitoring showed heavy ksoftirqd load, a common bottleneck when pushing high packet rates through the Linux kernel without offloads or tuning.

Wait. What’s ksoftirqd? It’s the per-CPU kernel thread that picks up deferred interrupt work, including network packet processing, when traffic arrives faster than the kernel can handle inline. When it gets overwhelmed, latency spikes and packet drops pile up. That’s exactly what we saw.

CPU was overloaded. Retransmits exploded. This is what happens when a modern 10G workload runs into the limits of the legacy Linux kernel path.
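
If you want to spot this symptom on your own hardware, a couple of standard commands during a test run are enough (nothing VPP- or vendor-specific here):

top -b -n 1 | grep ksoftirqd    # a ksoftirqd thread pegging a core is the tell-tale sign
cat /proc/softirqs              # watch the NET_RX counters climb between samples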

Test #3: VPP Bridge (Destroyer → Datapult [VPP L2 Bridge] → Dumpster)

We took the same setup from Test #2 and replaced the Linux bridge with a VPP-based Layer 2 bridge using DPDK-bound NICs and poll-mode drivers:

# load vfio-pci and hand the 10G NICs over to DPDK (run the bind once per port)
modprobe vfio-pci
dpdk-devbind.py --bind=vfio-pci <PCI_ID>
# bring both DPDK-bound ports up inside VPP
vppctl set interface state TenGigabitEthernet9/0/0 up
vppctl set interface state TenGigabitEthernet9/0/1 up
# create an L2 bridge domain and add both ports to it
vppctl create bridge-domain 1
vppctl set interface l2 bridge TenGigabitEthernet9/0/0 1
vppctl set interface l2 bridge TenGigabitEthernet9/0/1 1
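
Once traffic is flowing, a few vppctl show commands make a handy sanity check (exact output varies by VPP version):

vppctl show bridge-domain 1 detail    # confirms both ports are members of the bridge domain
vppctl show interface                 # rx/tx packet counters on the TenGigabitEthernet ports
vppctl show runtime                   # per-graph-node stats, including average vectors per call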

Unlike Linux, VPP doesn’t use interrupts or kernel queues. It runs entirely in user space with predictable, high-efficiency batching.

Result:

  • Throughput: 9.26 Gbps
  • Retransmits: 1877

Interpretation: Near line-rate performance, just like Test #1—but now through a full software bridge. The improvement over Linux bridging was massive.

VPP Isn’t Just a Performance Boost—It’s a Prerequisite

There are many ways to forward packets in Linux. But the traditional path—processing each packet through the kernel—isn’t built for modern, high-speed networking.

Big Network isn’t terminating sessions. We’re not running apps. We’re moving packets as fast as physics allows. And that means bypassing bottlenecks like ksoftirqd, memory copies, and interrupt handling.

VPP changes the game.

With poll-mode drivers, vector batching, and a fully user-space stack, VPP delivers not just better performance, but more consistency, control, and scalability. In our testing, it nearly matched direct-connect performance—while Linux bridging fell apart.

Sure, VPP brings architectural complexity. But the payoff is worth it: more speed, less latency, and a future-proof foundation for embedded routing.

  • Direct and VPP Bridge yield the best throughput.
  • Linux Bridge kills performance and reliability.

By bypassing the kernel, VPP sidesteps hundreds of thousands of lines of generic packet-handling logic. Instead, it operates in user space using poll-mode drivers and vectorized batch processing, reducing latency, increasing throughput, and scaling predictably. In our testing, it delivered near line-rate performance with far fewer retransmits than the Linux kernel bridge, and that wasn’t with special tuning. That was out of the box.

🏁 TL;DR

  • Linux bridging is easy, but not performant.
  • VPP bridging is fast, scalable, and efficient.
  • There are other options, like Open vSwitch, eBPF/XDP, and netmap, but each comes with its own steep, specialized learning curve.

If you're serious about modern networking, especially at 10G and beyond, VPP isn't optional — it's essential.

Ready to make your packets fly? Let's talk VPP.
