Mailing List Archive

Zbalance Issue
I am having an issue with zbalance_ipc where we are dropping ~10% of packets at 1.5 Gbps+ per interface even before any consuming applications begin to consume the data. The setup is a 2x18-core server where one application uses a single queue and the other application consumes 18 queues, and we have one zbalance_ipc process per NUMA node. When we start zbalance_ipc with a single queue (-n 1), we see no loss, but with -n 18,1 we see ~10% packet drops before anything is hooked to the dummies:


23/Jan/2019 21:58:59 [zbalance_ipc.c:266] Absolute Stats: Recv 1'547'957'308 pkts (138'616'411 drops) - Forwarded 3'095'914'616 pkts (0 drops)
23/Jan/2019 21:58:59 [zbalance_ipc.c:305] zc:p1p2 RX 1547957313 pkts Dropped 138616411 pkts (8.2 %)
23/Jan/2019 21:58:59 [zbalance_ipc.c:319] Q 0 RX 0 pkts Dropped 0 pkts (0.0 %)
23/Jan/2019 21:58:59 [zbalance_ipc.c:319] Q 1 RX 0 pkts Dropped 0 pkts (0.0 %)
23/Jan/2019 21:58:59 [zbalance_ipc.c:319] Q 2 RX 0 pkts Dropped 0 pkts (0.0 %)
23/Jan/2019 21:58:59 [zbalance_ipc.c:319] Q 3 RX 0 pkts Dropped 0 pkts (0.0 %)
23/Jan/2019 21:58:59 [zbalance_ipc.c:319] Q 4 RX 0 pkts Dropped 0 pkts (0.0 %)
23/Jan/2019 21:58:59 [zbalance_ipc.c:319] Q 5 RX 0 pkts Dropped 0 pkts (0.0 %)
23/Jan/2019 21:58:59 [zbalance_ipc.c:319] Q 6 RX 0 pkts Dropped 0 pkts (0.0 %)
23/Jan/2019 21:58:59 [zbalance_ipc.c:319] Q 7 RX 0 pkts Dropped 0 pkts (0.0 %)
23/Jan/2019 21:58:59 [zbalance_ipc.c:319] Q 8 RX 0 pkts Dropped 0 pkts (0.0 %)
23/Jan/2019 21:58:59 [zbalance_ipc.c:319] Q 9 RX 0 pkts Dropped 0 pkts (0.0 %)
23/Jan/2019 21:58:59 [zbalance_ipc.c:319] Q 10 RX 0 pkts Dropped 0 pkts (0.0 %)
23/Jan/2019 21:58:59 [zbalance_ipc.c:319] Q 11 RX 0 pkts Dropped 0 pkts (0.0 %)
23/Jan/2019 21:58:59 [zbalance_ipc.c:319] Q 12 RX 0 pkts Dropped 0 pkts (0.0 %)
23/Jan/2019 21:58:59 [zbalance_ipc.c:319] Q 13 RX 0 pkts Dropped 0 pkts (0.0 %)
23/Jan/2019 21:58:59 [zbalance_ipc.c:319] Q 14 RX 0 pkts Dropped 0 pkts (0.0 %)
23/Jan/2019 21:58:59 [zbalance_ipc.c:319] Q 15 RX 0 pkts Dropped 0 pkts (0.0 %)
23/Jan/2019 21:58:59 [zbalance_ipc.c:319] Q 16 RX 0 pkts Dropped 0 pkts (0.0 %)
23/Jan/2019 21:58:59 [zbalance_ipc.c:319] Q 17 RX 0 pkts Dropped 0 pkts (0.0 %)
23/Jan/2019 21:58:59 [zbalance_ipc.c:319] Q 18 RX 0 pkts Dropped 0 pkts (0.0 %)
23/Jan/2019 21:58:59 [zbalance_ipc.c:338] Actual Stats: Recv 256'628.26 pps (0.00 drops) - Forwarded 513'257.53 pps (0.00 drops)

The interfaces are Intel NICs, 2x10GE, but we're only consuming data off of one interface on each card.

0% packet loss (running in the background so that I can get logs, but normally run with -d):

sudo /usr/bin/zbalance_ipc -i zc:p1p2 -a -q 131072 -c 99 -n 1 -m 1 -S 4 -g 2 -r 0:dummy57 -l /var/log/zbalance_p1p2.log -v -p -u /dev/hugepages &

sudo /usr/bin/zbalance_ipc -i zc:p4p2 -a -q 131072 -c 97 -n 1 -m 1 -S 5 -g 3 -r 0:dummy59 -l /var/log/zbalance_p4p2.log -v -p -u /dev/hugepages &
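One way to sanity-check that traffic is actually reaching one of the dummy queues above could be something like the following (dummy57 from the first command is just an example; since a dummy is an ordinary kernel interface, any kernel capture tool works):

sudo ip link add dummy57 type dummy   # only if the interface does not already exist
sudo ip link set dummy57 up
sudo tcpdump -ni dummy57 -c 10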



~10% packet drops to start, greater with actual consumers on the dummies:



sudo /usr/bin/zbalance_ipc -i zc:p1p2 -a -q 131072 -c 98 -n 18,1 -m 4 -S 8 -g 6 -r 0:dummy20 -r 1:dummy21 -r 2:dummy22 -r 3:dummy23 -r 4:dummy24 -r 5:dummy25 -r 6:dummy26 -r 7:dummy27 -r 8:dummy28 -r 9:dummy30 -r 10:dummy31 -r 11:dummy32 -r 12:dummy33 -r 13:dummy34 -r 14:dummy35 -r 15:dummy36 -r 16:dummy37 -r 17:dummy38 -r 18:dummy57 -l /var/log/zbalance_p1p2.log -v -p -u /dev/hugepages &



sudo /usr/bin/zbalance_ipc -i zc:p4p2 -a -q 131072 -c 96 -n 18,1 -m 4 -S 9 -g 7 -r 0:dummy0 -r 1:dummy1 -r 2:dummy2 -r 3:dummy3 -r 4:dummy4 -r 5:dummy5 -r 6:dummy6 -r 7:dummy7 -r 8:dummy8 -r 9:dummy10 -r 10:dummy11 -r 11:dummy12 -r 12:dummy13 -r 13:dummy14 -r 14:dummy15 -r 15:dummy16 -r 16:dummy17 -r 17:dummy18 -r 18:dummy59 -l /var/log/zbalance_p4p2.log -v -p -u /dev/hugepages &
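For comparison, the same 18,1 fan-out without the kernel dummy interfaces would simply drop the -r remappings and leave native ZC queues, which consumers then attach to as zc:<cluster>@<queue> (zc:98@0 .. zc:98@18 for the first command); a sketch along those lines:

sudo /usr/bin/zbalance_ipc -i zc:p1p2 -a -q 131072 -c 98 -n 18,1 -m 4 -S 8 -g 6 -l /var/log/zbalance_p1p2.log -v -p -u /dev/hugepages &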


p4p2 is hooked to NUMA node 1
p1p2 is hooked to NUMA node 0
Other relevant data from the server:

NUMA node0 CPU(s): 0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38,40,42,44,46,48,50,52,54,56,58,60,62,64,66,68,70

NUMA node1 CPU(s): 1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39,41,43,45,47,49,51,53,55,57,59,61,63,65,67,69,71



ixgbe.conf:

RSS=1,1,1,1,1,1 numa_cpu_affinity=2,2,2,2,3,3
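Assuming this is the PF_RING ZC driver configuration, these options are passed to the ixgbe module at load time (RSS=1,1,1,1,1,1 appears to request a single RSS queue per port, which is what you want when zbalance_ipc does the distribution); loading by hand would look roughly like:

sudo rmmod ixgbe
sudo insmod /path/to/ixgbe-zc/ixgbe.ko RSS=1,1,1,1,1,1 numa_cpu_affinity=2,2,2,2,3,3   # module path is illustrative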



PF_RING version is:

zbalance_ipc -h

zbalance_ipc - (C) 2014-2018 ntop.org

Using PFRING_ZC v.7.3.0.180817
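As a cross-check, the kernel module reports its own version and settings separately from the userspace library; something like this should show what is actually loaded:

cat /proc/net/pf_ring/info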



Let me know if I'm missing something obvious, or if you have any ideas on how to improve performance. Thanks so much!

Cheers,



Nate





Re: Zbalance Issue [ In reply to ]
Hi Nate
please note that a dummy interface is a kernel interface with limited
performance. This is the case regardless of whether any consumers are actually
running (when you run a consumer, you just add load, increasing the packet
loss), as zbalance_ipc is sending traffic to the kernel anyway.

Best Regards
Alfredo

Re: Zbalance Issue [ In reply to ]
So what is the recommended way to run zbalance_ipc with Docker containers?
https://www.ntop.org/pf_ring/best-practices-for-using-bro_ids-with-pf_ring-zc-reliably/
It sounds like the dummy interfaces are not very scalable.

I've got everything working natively with pf_ring directly, but when a container crashes/stops, the zbalance queue disappears (e.g. if I start zbalance_ipc -c 4 -n 1,11 and do a pf_count on zc:4@0, things work fine, but if I start a consumer on zc:4@0 and it crashes/stops, I can't do a pf_count on zc:4@0 any longer).
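For what it's worth, a minimal consumer for one of those queues, assuming the standard PF_RING API (pfring_open on the zc:<cluster>@<queue> name), could look roughly like the sketch below; the signal handler is only there so an orderly container stop detaches from zc:4@0 cleanly, a hard crash is still the problematic case described above:

/* Hypothetical minimal consumer sketch for zc:4@0 (cluster 4, queue 0). */
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include "pfring.h"

static pfring *ring = NULL;

static void term_handler(int sig) {
  if (ring) pfring_close(ring);   /* detach cleanly from the cluster queue */
  exit(0);
}

int main(void) {
  u_char *pkt;
  struct pfring_pkthdr hdr;

  ring = pfring_open("zc:4@0", 1536, PF_RING_PROMISC);
  if (ring == NULL) { perror("pfring_open"); return 1; }

  signal(SIGTERM, term_handler);  /* docker stop sends SIGTERM first */
  signal(SIGINT,  term_handler);

  pfring_enable_ring(ring);

  while (1) {
    if (pfring_recv(ring, &pkt, 0, &hdr, 1) > 0)   /* buffer_len 0 = zero-copy pointer */
      printf("got %u bytes\n", hdr.len);
  }
  return 0;
}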

Cheers,

Nate



Re: Zbalance Issue [ In reply to ]
Hi Nate
correct, dummy interfaces are not always a viable solution, depending on the
traffic rate. For better performance you should use native ZC queues; however,
as you noticed, if the Bro process crashes it leaves the queue in an inconsistent
state and the cluster does not allow you to use it again.

Alfredo
