Mailing List Archive

Trouble with Zero Copy Performance
Hello,

I'm having high CPU issues with an nProbe/Zero Copy setup on a system that is intended to receive up to 20 Gbps of traffic on an X520-DA2 NIC. The traffic is split by a network switch so that each interface on the X520 receives the same amount. Currently the traffic load is 500 kpps / 3.5 Gbps on each interface, and the CPU is right around 95% utilized on each core the interfaces are pinned to. If I start up nProbe without zero copy the performance is about the same, maybe a bit better. I've gone through the documentation and six months of listserv posts, and tried just about every setting I can think of. I'm stuck and could use some help or hints on where to go to troubleshoot this. When the traffic gets higher than stated above (which happens every day for hours during peak), the two CPU cores are totally pegged.
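To put the figures above in perspective, here is a rough back-of-the-envelope calculation of the average packet size and the CPU cycles spent per packet. It is only a sketch: it assumes the nominal 2.53 GHz clock (no turbo), treats the quoted 95% as the busy fraction of one core, and ignores Ethernet framing overhead. The variable names are mine.

```bash
#!/usr/bin/env bash
# Figures quoted in the post, per interface.
pps=500000            # 500 kpps
bps=3500000000        # 3.5 Gbps
cpu_hz=2530000000     # X3440 nominal clock, 2.53 GHz (assumption: no turbo)
cpu_pct=95            # observed utilisation of the pinned core

# Average packet size in bytes: bits per second / 8 / packets per second.
avg_pkt_bytes=$(( bps / 8 / pps ))

# Approximate CPU cycles consumed per packet at the observed load.
cycles_per_pkt=$(( cpu_hz * cpu_pct / 100 / pps ))

echo "average packet size: ${avg_pkt_bytes} bytes"
echo "cycles per packet:   ${cycles_per_pkt}"
```

With these inputs the script reports an average packet of 875 bytes and roughly 4800 cycles per packet, which gives a concrete number to compare against when changing settings.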

Here are some system/setup specifics:

Intel(R) Xeon(R) CPU X3440 @ 2.53GHz (hyperthreading enabled)
Intel Corporation Ethernet Server Adapter X520-2 (p1p1 & p1p2)

[root@flowgen log]# cat /etc/redhat-release
CentOS Linux release 7.2.1511 (Core)
[root@flowgen log]# uname -a
Linux flowgen 3.10.0-327.36.1.el7.x86_64 #1 SMP Sun Sep 18 13:04:29 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

Using PFRING_ZC v.6.4.1.160615
Welcome to nProbe v.7.4.160928 (r5333) for x86_64-unknown-linux-gnu
Sep 28 12:24:00 Installed: ixgbe-zc-4.1.5.859-dkms.noarch
RSS=1,1
IRQ for p1p1 pinned to CPU3
IRQ for p1p2 pinned to CPU4
LRO OFF, GRO OFF, RXVLAN OFF, RING=32768, HUGEPAGES=1204
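Since hyperthreading is enabled and the IRQs are pinned to CPU3 and CPU4, one thing that may be worth ruling out is whether those two logical CPUs share a physical core. The sketch below reads the standard Linux sysfs topology files; `SYSFS_CPU_ROOT` is a hypothetical override I added so the functions can be tried against a fake directory tree, and the function names are illustrative.

```bash
#!/usr/bin/env bash
# Print the hardware threads that share a physical core with the given CPU.
# SYSFS_CPU_ROOT is a hypothetical override (not a standard variable); by
# default the real sysfs location is used.
siblings_of() {
    local cpu="$1"
    local root="${SYSFS_CPU_ROOT:-/sys/devices/system/cpu}"
    cat "${root}/cpu${cpu}/topology/thread_siblings_list"
}

# Succeed (exit status 0) if both CPUs report the same sibling list,
# i.e. they are hyperthreads of the same physical core.
share_core() {
    [ "$(siblings_of "$1")" = "$(siblings_of "$2")" ]
}

# Example with the CPU numbers from the post (uncomment to run):
# share_core 3 4 && echo "CPU3 and CPU4 share a physical core"
```

If the two CPUs turn out to be siblings, the two nProbe instances would be competing for one core; if not, this can be crossed off the list.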

[root@flowgen log]# cat /proc/net/pf_ring/dev/p1p1/info
Name: p1p1
Index: 10
Address: <redacted>
Polling Mode: NAPI/ZC
Type: Ethernet
Family: Intel ixgbe 82599
Max # TX Queues: 1
# Used RX Queues: 1
Num RX Slots: 32768
Num TX Slots: 32768

/usr/local/bin/nprobe --daemon-mode --pid-file /var/nprobe_p1p1.pid --interface zc:p1p1 --as-list /usr/share/ntopng/httpdocs/geoip/GeoIPASNum.dat --lifetime-timeout 300 --verbose 1 --syslog nProbe_p1p1 --collector udp://redacted.uconn.edu:21583 --in-iface-idx 1 --out-iface-idx 1
/usr/local/bin/nprobe --daemon-mode --pid-file /var/nprobe_p1p2.pid --interface zc:p1p2 --as-list /usr/share/ntopng/httpdocs/geoip/GeoIPASNum.dat --lifetime-timeout 300 --verbose 1 --syslog nProbe_p1p2 --collector udp://redacted.uconn.edu:21583 --in-iface-idx 2 --out-iface-idx 2

Licenses look OK:

[root@flowgen pf_ring]# zcount -i zc:p1p1 -C
License Ok

Doing nprobe -v shows nProbe Standard and doing nprobe --help shows nProbe Pro, both with valid licenses

One other thing, not sure if it is related: I get the following error and warning when starting the above in zero copy mode only (they go away without zero copy mode):

29/Sep/2016 14:14:31 [util.c:4371] ERROR: Cannot get hw addr for zc:p1p1
29/Sep/2016 14:14:32 [nprobe.c:5620] WARNING: Unable to set pcap capture direction

Thanks in advance,


- Mike
Re: Trouble with Zero Copy Performance
Michael,
95% CPU load is already too much. I would look at the nProbe traces to see whether the number of slots, fragments, etc. is OK. If you do not decrease the CPU load, then what you describe during traffic spikes is expected, although not desirable. Please let me know if you see anything strange in the logs. As for the warning about the MAC address, it is already fixed in the development version. I suggest using RSS to virtualise the interface and starting multiple nProbe instances, one per virtual queue.

That said, have you tried nProbe Cento? I believe this is the app you need for 20G+. Give it a try and let me know how it goes. You can read more about it here: http://www.ntop.org/nprobe/flow-based-monitoring-nprobe-cento-vs-standardpro/

Regards Luca
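
The RSS suggestion above could look roughly like the following. This is a sketch under assumptions, not a tested configuration: it assumes the ixgbe-zc driver has been reloaded with more than one RSS queue per port (for example RSS=2,2 instead of the RSS=1,1 in the original post), uses the `zc:<interface>@<queue>` naming to address individual queues, and pins each instance with `taskset`. The function name, the pid-file naming, and the choice of cores are illustrative, and the remaining nProbe options from the original command lines are omitted for brevity. The script only prints the commands so they can be reviewed before anything is started.

```bash
#!/usr/bin/env bash
# Print (do not run) one nProbe command line per RSS queue of a ZC interface.
# Arguments: interface name, number of RSS queues, first CPU core to use.
# Only a subset of the original nProbe options is shown.
print_nprobe_cmds() {
    local iface="$1" queues="$2" first_core="$3"
    local q
    for (( q = 0; q < queues; q++ )); do
        echo "taskset -c $(( first_core + q )) /usr/local/bin/nprobe" \
             "--daemon-mode --pid-file /var/nprobe_${iface}_${q}.pid" \
             "--interface zc:${iface}@${q}" \
             "--collector udp://redacted.uconn.edu:21583"
    done
}

# Two queues on p1p1, pinned to cores 0 and 1 (illustrative choice).
print_nprobe_cmds p1p1 2 0
```

Each printed line starts a separate nProbe bound to one queue, so the per-interface load is spread over as many cores as there are queues.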


> On 30 Sep 2016, at 17:05, Lang, Michael <mike.lang@uconn.edu> wrote: