
Unexpected high dom0 load for bridges, especially with VLAN tag
I've been working on cutting down the number of "little" boxes here and
rebuilt a perimeter firewall and interior router/firewall on Xen 4.1.1
running Debian Buster dom0 on an Intel i3-7100T
(c. 2017, 2 cores, 4 threads, 3.4 GHz)
with a dual-port Intel PCI NIC (believed genuine)
in addition to the onboard NICs.

TL;DR

I'm seeing ~140-150% dom0 load in xentop when passing ~250 Mbit/s of
packets between the two domUs on a dedicated, two-port Open vSwitch
bridge.

This seems excessive for what should be "just a wire" between the two
(other traffic for them is on PCI pass-through of the Intel NICs).

Taking the function of these domUs out of the picture and bringing up
two "fresh" Debian Buster domUs, iperf3 still shows seemingly high
load, especially if VLAN tags are involved. This is seen with both
Open vSwitch and Linux bridges:

Without VLAN tag

    at 300 Mbits/s    ~18% dom0 load
    at 1000 Mbits/s   ~40% dom0 load

With VLAN tagging/detagging from the domU interfaces

    at 300 Mbits/s    ~40% dom0 load
    at 1000 Mbits/s   ~115% dom0 load
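
(For reference, the tests are nothing fancier than the following; the
address and duration here are illustrative:)

    # on one test domU (server side)
    iperf3 -s

    # on the other test domU, across the bridge
    iperf3 -c 10.0.0.2 -b 300M -t 30

    # on dom0, watching load during the run
    xentop -d 1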

As there are only two ports on the bridge and two MAC addresses
involved, this seems high. No bridge filtering is configured.
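
(One related check: if the br_netfilter module were loaded, bridged
frames would be passed through netfilter, which adds per-packet cost.
The relevant sysctls exist only when that module is loaded:)

    sysctl net.bridge.bridge-nf-call-iptables
    sysctl net.bridge.bridge-nf-call-arptables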

It is especially surprising that using a single, consistent VLAN tag
"on the wire" doubles or triples the load.

This is reasonably consistent whether the bridge is Open vSwitch, a
Linux bridge set up with Debian /etc/network/interfaces config, one
created with `brctl`, or one created with `ip link add ... type bridge`.
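
(For the `ip link` variant, the setup is minimal; the bridge and VIF
names here are placeholders:)

    ip link add br-test type bridge
    ip link set br-test up
    ip link set vif1.0 master br-test
    ip link set vif2.0 master br-test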


Is this kind of load expected?


Is there any configuration of either style bridge that might
significantly improve this?


(At least for now, I need to stick with tagging the VLAN as I'm trying
to unravel why running without the tag causes some throughput problems
with the domUs involved.)



More detail:

xen 4.1.1
ovs 2.10.1
Linux xen-i3 4.19.0-9-amd64 #1 SMP Debian 4.19.118-2 (2020-04-29) x86_64
GNU/Linux


The interior router uses one of the Intel card's NICs via PCI
pass-through to connect to the Cisco SG300 switch for "inside" access
to VLAN trunks. It is running FreeBSD 12.1 in HVM mode.

The perimeter firewall uses the other of the Intel card's NICs to
connect through the Cisco to the DOCSIS modem. The Comcast line is
good for ~250 Mbps down and ~10 Mbps up. It is running Debian Buster
in PV mode, booted through grub-x86_64-xen.custom.bin (to recognize
the ZFS file system on which it runs).

The two are connected through a dedicated, two-port Open vSwitch
bridge, with the same VLAN tag they were running with when the two
functions each had their own physical hardware.
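
(A sketch of confirming the layout from dom0 with standard OVS
tooling; the bridge name matches the configs below:)

    # should list exactly the two VIFs
    ovs-vsctl list-ports ovsbr1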

When running a bandwidth test from a local host to a remote server,
the outbound packet path, as I understand it, is:

    Interior host sources
    Interior host sends via Cisco SG300
    Cisco SG300 forwards to Intel NIC "0" on PCI pass-through to wildside (interior)
    Wildside processes, routes over VIF pair, tagged
    Received on other end of pair by dom0
    Packet bridged by dom0
    Packet goes out another VIF pair to front (perimeter)
    Front receives packet at other end of VIF pair
    Front routes packet out Intel NIC "1" on PCI pass-through
    Cisco SG300 forwards packet to the modem
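
(One way to check where along this path the tag is present is to
capture on the dom0 end of a VIF with link-level headers shown; the
VIF name here is illustrative:)

    # -e prints the Ethernet header, so any 802.1Q tag is visible
    tcpdump -e -n -c 20 -i vif1.0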

Examining htop on dom0 under load shows truncated names that appear to
be queues, three or four associated with each of the two involved VIFs.
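
(Something like the following lists them without htop's display
truncation, though kernel thread names are themselves capped at 15
characters; my assumption is that these are the netback per-queue
kernel threads:)

    ps -eLo pid,comm | grep vif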

No special configuration of kernel governor, CPU affinity, or the like
has been done on dom0 or any of the domUs.
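
(That is, nothing along these lines has been applied; CPU numbers
here are illustrative:)

    # explicit pinning -- not done
    xl vcpu-pin Domain-0 all 0-1
    xl vcpu-pin wildside all 2
    # hypervisor-side frequency governor -- also left at its default
    xenpm set-scaling-governor performance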


I've run them both tagged, and was working to cut over to untagged on
both, but have run into a dribble of throughput when I do. As that
involves a non-Linux domU, I'll work through that in another thread.

The current xl config has front untagged and wildside still tagged.


Front (perimeter router)

vif = [
    'script=vif-openvswitch,type=vif,vifname=front-zfs_xn0,bridge=ovsbr0:<mgmt VLAN>:<other VLAN>',
    'script=vif-openvswitch,type=vif,vifname=front-zfs_xn1,bridge=ovsbr1.<link VLAN>',
]

pci = [
  '01:00.0',
]



Wildside (interior router)

vif = [
    'script=vif-openvswitch,type=vif,vifname=wildside_xn0,bridge=ovsbr1:<link VLAN>',
]

pci = [
  '01:00.1',
]
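
(What the vif-openvswitch script actually programmed can be read back
from dom0; a sketch, assuming the values land in the OVS Port records:)

    ovs-vsctl get port wildside_xn0 tag
    ovs-vsctl get port front-zfs_xn0 trunks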

The VIF names seem to be within the typical 15-character limit.

This was previously running on an AMD GX-412TC (4 core, 1 GHz) and a
Celeron J1900 (4 core, 2 GHz).

The i3-7100T and the NICs on its Intel card have been used to
benchmark networking at up to GigE symmetric rates.


I've tried to "direct wire" the two domUs by specifying a backend
for the VIF in the xl config. Though I am surprised that the VIF pair
and a two-port bridge are apparently so CPU hungry, even at low
speeds, such a connection would seem to simplify things by removing
one VIF pair and the bridge entirely.
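
(The attempt was along these lines, per xl-network-configuration(5);
the vifname is illustrative:)

    # in front's xl config: request the vif backend in the wildside
    # domain rather than dom0
    vif = [ 'backend=wildside,vifname=front_direct' ]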

Even if that were possible, it still leaves me with concerns around
using VLAN trunking and its apparent impact on CPU load. This all came
about as suricata was the next service I was going to try to move to
the Xen box.


Thanks!

Jeff Kletsky
Re: Unexpected high dom0 load for bridges, especially with VLAN tag
On 5/25/20 1:49 PM, Jeff Kletsky wrote:
> I've been working on cutting down the number of "little" boxes here and
> rebuilt a perimeter firewall and interior router/firewall on Xen 4.1.1

Why such an old version of Xen? That is 7 years old. Buster is on xen 4.11.3 https://packages.debian.org/buster/xen-hypervisor-common
Re: Unexpected high dom0 load for bridges, especially with VLAN tag
On 5/28/20 3:57 PM, Sarah Newman wrote:
> On 5/25/20 1:49 PM, Jeff Kletsky wrote:
>> I've been working on cutting down the number of "little" boxes here and
>> rebuilt a perimeter firewall and interior router/firewall on Xen 4.1.1
>
> Why such an old version of Xen? That is 7 years old. Buster is on xen
> 4.11.3 https://packages.debian.org/buster/xen-hypervisor-common
>
>

Thanks for the catch on that!

I'm not sure why I thought that it was 4.1.1 -- it is apparently
4.11.3+24-g14b62ab3e5-1~deb10u1


$ apt info 'xen*' 2>/dev/null | egrep -A1 'Package: xen'
Package: xen-utils-common
Version: 4.11.3+24-g14b62ab3e5-1~deb10u1
--
Package: xen-system-amd64
Version: 4.11.3+24-g14b62ab3e5-1~deb10u1
--
Package: xen-doc
Version: 4.11.3+24-g14b62ab3e5-1~deb10u1
--
Package: xen-hypervisor-4.11-amd64
Version: 4.11.3+24-g14b62ab3e5-1~deb10u1
--
Package: xen-utils-4.11
Version: 4.11.3+24-g14b62ab3e5-1~deb10u1
--
Package: xen-hypervisor-common
Version: 4.11.3+24-g14b62ab3e5-1~deb10u1
--
Package: xenstore-utils
Version: 4.11.3+24-g14b62ab3e5-1~deb10u1
--
Package: xen-tools
Version: 4.8-1
--
Package: xenwatch
Version: 0.5.4-4+b1
--


Checking the apt logs confirms 4.11.3+24-g14b62ab3e5-1~deb10u1


Running

apt update
apt list --upgradable

does not show any xen-related updates available.


Jeff