Introduction
The network namespace from the patchset has been benchmarked with tbench and netperf. The tests are simple and only measure the impact of network virtualization on TCP throughput and CPU usage.
Two kernels were tested, in several configurations:
- 2.6.20 for the reference values
- 2.6.20-lxc8
  - network namespace compiled out (CONFIG_NET_NS=no)
  - network namespace compiled in (CONFIG_NET_NS=yes)
    - without container
    - inside a container with a real network device
    - inside a container with ip_forward, route and etun
    - inside a container with a bridge and etun
Each benchmark was run from two machines, with netperf and tbench as clients. A dedicated machine with a RH4 kernel ran the benchmark servers.
Both netperf and tbench were run on:
- Intel Xeon EM64T, dual-processor 2.8 GHz with hyperthreading activated, 4 GB of RAM and Gigabit NIC (tg3)
- AMD Athlon MP 1800+, dual-processor 1.5 GHz, 1 GB of RAM and Gigabit NIC (dl2000)
Every test was run on both machines in order to get an overhead relative to each CPU.
Vanilla 2.6.20
| Netperf | CPU usage (%) | Throughput (Mbits/s) | Service Demand (us/KB) |
|---|---|---|---|
| on xeon | 5.99 | 941.38 | 2.084 |
| on athlon | 28.17 | 844.82 | 5.462 |
| Tbench | Throughput (Mbits/s) |
|---|---|
| on xeon | 66.35 |
| on athlon | 65.31 |
lxc 2.6.20-lxc8
With net_ns compiled out
| Netperf | CPU usage (%) / overhead | Throughput (Mbits/s) / changed | Service Demand (us/KB) |
|---|---|---|---|
| on xeon | 6.04 / +0.8 % | 941.33 / 0 % | 2.104 |
| on athlon | 28.45 / +1 % | 840.76 / -0.5 % | 5.545 |
| Tbench | Throughput (Mbits/s) / changed |
|---|---|
| on xeon | 65.69 / -1 % |
| on athlon | 65.35 / -0.2 % |
Observation: no noticeable overhead.
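The "overhead" and "changed" figures in the tables are percentages relative to the vanilla 2.6.20 values. The page does not state how they were derived, so the following is only a minimal sketch assuming the usual relative-change formula, (measured - baseline) / baseline. The function name `relative_change` is hypothetical; the numbers are copied from the netperf tables above.

```python
def relative_change(baseline: float, measured: float) -> float:
    """Return the change of `measured` relative to `baseline`, in percent."""
    return (measured - baseline) / baseline * 100.0


# CPU usage (%) reported by netperf, copied from the tables above.
vanilla_cpu = {"xeon": 5.99, "athlon": 28.17}      # vanilla 2.6.20
net_ns_off_cpu = {"xeon": 6.04, "athlon": 28.45}   # 2.6.20-lxc8, net_ns compiled out

for host, baseline in vanilla_cpu.items():
    change = relative_change(baseline, net_ns_off_cpu[host])
    print(f"{host}: {change:+.1f} %")
```

With these inputs the sketch reproduces the +0.8 % and +1 % CPU overhead figures shown for this configuration.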
With net_ns compiled in
Without container
| Netperf | CPU usage (%) / overhead | Throughput (Mbits/s) / changed | Service Demand (us/KB) |
|---|---|---|---|
| on xeon | 6.02 / +0.5 % | 941.34 / 0 % | 2.097 |
| on athlon | 27.93 / -0.8 % | 833.53 / -1.3 % | 5.490 |
| Tbench | Throughput (Mbits/s) / changed |
|---|---|
| on xeon | 66.00 / -0.5 % |
| on athlon | 64.94 / -0.9 % |
Observation: no noticeable overhead.
Inside the container with real device
| Netperf | CPU usage (%) / overhead | Throughput (Mbits/s) / changed | Service Demand (us/KB) |
|---|---|---|---|
| on xeon | 5.60 / -6.5 % | 941.42 / 0 % | 1.949 |
| on athlon | 27.73 / -1.5 % | 835.11 / +1.5 % | 5.440 |
| Tbench | Throughput (Mbits/s) / changed |
|---|---|
| on xeon | 74.36 / +12 % |
| on athlon | 70.87 / +8.2 % |
Observation: no noticeable overhead. The network interface is used only
by the container, so I guess it does not compete with other network
traffic, which would explain the better performance.
Inside the container with etun and routes
| Netperf | CPU usage (%) / overhead | Throughput (Mbits/s) / changed | Service Demand (us/KB) |
|---|---|---|---|
| on xeon | 16.25 / +171 % | 941.31 / 0 % | 5.657 |
| on athlon | 49.99 / +77 % | 828.94 / -1.9 % | 9.880 |
| Tbench | Throughput (Mbits/s) / changed |
|---|---|
| on xeon | 65.61 / -1.1 % |
| on athlon | 62.58 / -4.5 % |
Observation: the CPU overhead is very large (+171 % on the Xeon, +77 % on
the Athlon). Throughput is slightly reduced on the less powerful machine.
Inside the container with etun and bridge
| Netperf | CPU usage (%) / overhead | Throughput (Mbits/s) / changed | Service Demand (us/KB) |
|---|---|---|---|
| on xeon | 18.39 / +207 % | 941.30 / 0 % | 6.400 |
| on athlon | 49.94 / +77 % | 823.75 / -2.5 % | 9.933 |
| Tbench | Throughput (Mbits/s) / changed |
|---|---|
| on xeon | 66.52 / +0.2 % |
| on athlon | 61.07 / -6.8 % |
Observation: the CPU overhead is very large (+207 % on the Xeon, +77 % on
the Athlon). Throughput is slightly reduced on the less powerful machine.
General observations
The objective of having no performance degradation when the network
namespace is compiled out of the kernel is reached.
When the network namespace is compiled in and the network is used
outside a container, there is no performance degradation either.
The patchset allows network devices to be moved between namespaces, which
is clearly a good feature. It also shows that the network namespace code adds no overhead when the physical network device is used directly inside the container.
The loss of performance is very noticeable inside the container and
seems to be directly related to the use of the pair device and to the
specific network configuration needed for the container. When packets
are sent by the container, the MAC address is that of the pair device
but the IP address is not owned by the host. The host therefore has to
act as a router and forward the packets, which adds a noticeable
overhead.
A hack has been made in the ip_forward function to avoid a useless skb_cow when using the pair device/tunnel device, and it reduces the overhead by half. When the bridge configuration is used and CONFIG_BRIDGE_NETFILTER is off, the CPU overhead is also reduced by half.
The patches associated with these old tests on 2.6.20 are no longer available.