Mailing List Archive

nProbe performance and packet drops
Hello list,

I'm testing nProbe listening on two different 10Gb interfaces (using
PF_RING's i40e driver). Since we need to merge the traffic of these two
links, we use a custom application (built on the PF_RING sources) that
creates a virtual interface carrying the traffic of the original physical
ones, as sketched below.
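
Schematically, the data path is:

p2p1 (10G) --\
              +--> custom ZC app --> virtual interface (zc:1@0) --> nProbe
p2p2 (10G) --/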

The total traffic is about 3 Gbps (and we expect more), but we are seeing
around 50-60% packet drops between our application and nProbe. When
testing with zcount/pfcount instead of nProbe, we see 0% drops.
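
For reference, the zero-drop baseline was measured with something like
this (the queue name is just illustrative of our setup):

pfcount -i zc:1@0   # read from queue 0 of ZC cluster 1, i.e. the virtual interface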

I've tuned some of the nProbe parameters (hash-size, max-num-flows,
idle-timeout, ...), but noticed no significant change.
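
For reference, those knobs map to these flags in the configuration below
(the comments are my own annotations):

-w=4048000    # --hash-size: flow hash size
-M=10000000   # --max-num-flows: limit on simultaneous active flows
-d=30         # --idle-timeout, in seconds
-t=60         # --lifetime-timeout (max flow duration), in seconds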

Drops decrease when we disable the Kafka export and all the plugins
(GTPv1, GTPv2 and HTTP), but we always see some (around 1-2%).

I'm pasting my nProbe configuration and some traffic statistics below.

Do you have any recommendations for improving this performance?
Thanks a lot in advance.


-- System:

nProbe: nprobe-8.0.171020-5797.x86_64
System RAM: 64GB
System CPU: 12 cores
System OS: CentOS Linux release 7.4.1708 (Core)
Linux Kernel: 3.10.0-693.11.6.el7.x86_64 #1 SMP Thu Jan 4 01:06:37 UTC
2018 x86_64 x86_64 x86_64 GNU/Linux

-- nProbe configuration:

-n=none
-i=zc:1@0
-s=128
-t=60
-d=30
-a=0
-e=1
-B=10
-w=4048000
-M=10000000
-z=0
-S=1:1
-E=0:0
-g=/var/run/nprobe-zc1-0.pid
--vlanid-as-iface-idx=none
-V=9
-T="%IPV4_SRC_ADDR %IPV4_DST_ADDR %IN_PKTS %IN_BYTES %OUT_PKTS %OUT_BYTES
%FIRST_SWITCHED %LAST_SWITCHED %L4_SRC_PORT %L4_DST_PORT %TCP_FLAGS
%PROTOCOL %SRC_TOS %SRC_AS %DST_AS %L7_PROTO %L7_PROTO_NAME %SRC_IP_COUNTRY
%SRC_IP_CITY %SRC_IP_LONG %SRC_IP_LAT %DST_IP_COUNTRY %DST_IP_CITY
%DST_IP_LONG %DST_IP_LAT %SRC_VLAN %DST_VLAN %DOT1Q_SRC_VLAN
%DOT1Q_DST_VLAN %DIRECTION %SSL_SERVER_NAME %SRC_AS_MAP %DST_AS_MAP
%HTTP_METHOD %HTTP_RET_CODE %HTTP_REFERER %HTTP_UA %HTTP_MIME %HTTP_HOST
%HTTP_SITE %UPSTREAM_TUNNEL_ID %UPSTREAM_SESSION_ID %DOWNSTREAM_TUNNEL_ID
%DOWNSTREAM_SESSION_ID %UNTUNNELED_PROTOCOL %UNTUNNELED_IPV4_SRC_ADDR
%UNTUNNELED_L4_SRC_PORT %UNTUNNELED_IPV4_DST_ADDR %UNTUNNELED_L4_DST_PORT
%GTPV2_REQ_MSG_TYPE %GTPV2_RSP_MSG_TYPE %GTPV2_C2S_S1U_GTPU_TEID
%GTPV2_C2S_S1U_GTPU_IP %GTPV2_S2C_S1U_GTPU_TEID %GTPV2_S5_S8_GTPC_TEID
%GTPV2_S2C_S1U_GTPU_IP %GTPV2_C2S_S5_S8_GTPU_TEID
%GTPV2_S2C_S5_S8_GTPU_TEID %GTPV2_C2S_S5_S8_GTPU_IP
%GTPV2_S2C_S5_S8_GTPU_IP %GTPV2_END_USER_IMSI %GTPV2_END_USER_MSISDN
%GTPV2_APN_NAME %GTPV2_ULI_MCC %GTPV2_ULI_MNC %GTPV2_ULI_CELL_TAC
%GTPV2_ULI_CELL_ID %GTPV2_RESPONSE_CAUSE %GTPV2_RAT_TYPE %GTPV2_PDN_IP
%GTPV2_END_USER_IMEI %GTPV2_C2S_S5_S8_GTPC_IP %GTPV2_S2C_S5_S8_GTPC_IP
%GTPV2_C2S_S5_S8_SGW_GTPU_TEID %GTPV2_S2C_S5_S8_SGW_GTPU_TEID
%GTPV2_C2S_S5_S8_SGW_GTPU_IP %GTPV2_S2C_S5_S8_SGW_GTPU_IP
%GTPV1_REQ_MSG_TYPE %GTPV1_RSP_MSG_TYPE %GTPV1_C2S_TEID_DATA
%GTPV1_C2S_TEID_CTRL %GTPV1_S2C_TEID_DATA %GTPV1_S2C_TEID_CTRL
%GTPV1_END_USER_IP %GTPV1_END_USER_IMSI %GTPV1_END_USER_MSISDN
%GTPV1_END_USER_IMEI %GTPV1_APN_NAME %GTPV1_RAT_TYPE %GTPV1_RAI_MCC
%GTPV1_RAI_MNC %GTPV1_RAI_LAC %GTPV1_RAI_RAC %GTPV1_ULI_MCC %GTPV1_ULI_MNC
%GTPV1_ULI_CELL_LAC %GTPV1_ULI_CELL_CI %GTPV1_ULI_SAC %GTPV1_RESPONSE_CAUSE
%SRC_FRAGMENTS %DST_FRAGMENTS %CLIENT_NW_LATENCY_MS %SERVER_NW_LATENCY_MS
%APPL_LATENCY_MS %RETRANSMITTED_IN_BYTES %RETRANSMITTED_IN_PKTS
%RETRANSMITTED_OUT_BYTES %RETRANSMITTED_OUT_PKTS %OOORDER_IN_PKTS
%OOORDER_OUT_PKTS %FLOW_ACTIVE_TIMEOUT %FLOW_INACTIVE_TIMEOUT %MIN_TTL
%MAX_TTL %IN_SRC_MAC %OUT_DST_MAC %PACKET_SECTION_OFFSET %FRAME_LENGTH
%SRC_TO_DST_MAX_THROUGHPUT %SRC_TO_DST_MIN_THROUGHPUT
%SRC_TO_DST_AVG_THROUGHPUT %DST_TO_SRC_MAX_THROUGHPUT
%DST_TO_SRC_MIN_THROUGHPUT %DST_TO_SRC_AVG_THROUGHPUT
%NUM_PKTS_UP_TO_128_BYTES %NUM_PKTS_128_TO_256_BYTES
%NUM_PKTS_256_TO_512_BYTES %NUM_PKTS_512_TO_1024_BYTES
%NUM_PKTS_1024_TO_1514_BYTES %NUM_PKTS_OVER_1514_BYTES %LONGEST_FLOW_PKT
%SHORTEST_FLOW_PKT %NUM_PKTS_TTL_EQ_1 %NUM_PKTS_TTL_2_5 %NUM_PKTS_TTL_5_32
%NUM_PKTS_TTL_32_64 %NUM_PKTS_TTL_64_96 %NUM_PKTS_TTL_96_128
%NUM_PKTS_TTL_128_160 %NUM_PKTS_TTL_160_192 %NUM_PKTS_TTL_192_224
%NUM_PKTS_TTL_224_255 %DURATION_IN %DURATION_OUT %TCP_WIN_MIN_IN
%TCP_WIN_MAX_IN %TCP_WIN_MSS_IN %TCP_WIN_SCALE_IN %TCP_WIN_MIN_OUT
%TCP_WIN_MAX_OUT %TCP_WIN_MSS_OUT %TCP_WIN_SCALE_OUT"
-A=/usr/share/ntopng/httpdocs/geoip/GeoIPASNum.dat
--city-list=/usr/share/ntopng/httpdocs/geoip/GeoLiteCity.dat
--kafka "127.0.0.1:9092;TEST"
-D=t
--tunnel
-b=1
--smart-udp-frags
-f="udp"

-- nProbe traffic statistics:

19/Jan/2018 18:49:15 [nprobe.c:3106] Average traffic: [227.07 K pps][All
Traffic 1.18 Gb/sec][IP Traffic 1.05 Gb/sec][ratio 0.89]
19/Jan/2018 18:49:15 [nprobe.c:3114] Current traffic: [222.87 K pps][1.14
Gb/sec]
19/Jan/2018 18:49:15 [nprobe.c:3120] Current flow export rate: [5350.3
flows/sec]
19/Jan/2018 18:49:15 [nprobe.c:3123] Flow drops: [export queue too
long=0][too many flows=0][ELK queue flow drops=0]
19/Jan/2018 18:49:15 [nprobe.c:3128] Export Queue: 510035/4048000 [12.6 %]
19/Jan/2018 18:49:15 [nprobe.c:3133] Flow Buckets:
[active=541656][allocated=1051691][toBeExported=510035]
19/Jan/2018 18:49:15 [nprobe.c:3139] Kafka [flows exported=9071081/5350.3
flows/sec][msgs sent=9071081/1.0 flows/msg][send errors=0]
19/Jan/2018 18:49:15 [nprobe.c:2956] Processed packets: 388520559 (max
bucket search: 19)
19/Jan/2018 18:49:15 [nprobe.c:2939] Fragment queue length: 0
19/Jan/2018 18:49:15 [nprobe.c:2962] WARNING: Your bucket search is too
slow (19): expect drops
19/Jan/2018 18:49:15 [nprobe.c:2965] Flow export stats: [0 bytes/0 pkts][0
flows/0 pkts sent]
19/Jan/2018 18:49:15 [nprobe.c:2975] Flow drop stats: [1241630041
bytes/273259031 pkts][0 flows]
19/Jan/2018 18:49:15 [nprobe.c:2980] Total flow stats: [1241630041
bytes/273259031 pkts][0 flows/0 pkts sent]
19/Jan/2018 18:49:15 [nprobe.c:2991] Kafka [flows exported=9071083][msgs
sent=9071083/1.0 flows/msg][send errors=0]


-- Our application statistics:

19/Jan/2018 18:49:15 [zcluster.c:173] Absolute Stats: Recv 5'604'403'857
pkts (0 drops) - Forwarded 2'197'198'847 pkts (3'407'205'010 drops)
19/Jan/2018 18:49:15 [zcluster.c:205] (In p2p1) RX 1176494979 pkts
Dropped 0 pkts (0.0 %)
19/Jan/2018 18:49:15 [zcluster.c:205] (In p2p2) RX 4427908920 pkts
Dropped 0 pkts (0.0 %)
19/Jan/2018 18:49:15 [zcluster.c:219] (Out Local) num:0 RX
2197166603 pkts Dropped 3407205052 pkts (60.8 %)
19/Jan/2018 18:49:15 [zcluster.c:239] Actual Stats: Recv 538'086.38 pps
(0.00 drops) - Forwarded 209'878.02 pps (328'208.35 drops)


--
Regards,
David Notivol
dnotivol@gmail.com
Re: nProbe performance and packet drops
Hello,

Sorry for replying to myself.

Just adding some troubleshooting information: we've tested using zbalance
to create the virtual interface instead of our application, and we are
getting the same results; we still see drops.
Thanks.

Regards,
David Notivol.

Re: nProbe performance and packet drops
David,
sorry for the delay. What you can also do is the following:
1. Enable RSS, let's say with two queues.
2. Start one nProbe instance per queue:

nprobe -i eth1@0,eth2@0 -g 1 ...
nprobe -i eth1@1,eth2@1 -g 2 ...

If this is not enough, you can increase the number of RSS queues so that each probe has fewer messages to process.
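
For example, with the PF_RING ZC i40e driver the RSS queues can be set
when loading the module (a sketch; the module path and the per-port queue
counts are placeholders to adapt):

rmmod i40e
insmod ./i40e.ko RSS=2,2   # 2 RSS queues on each of the two ports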

Regards, Luca

Re: nProbe performance and packet drops
Thanks, Luca, for your advice.

Running several nProbe instances, we're able to handle all the traffic.
Just for the record, we finally got it working using zbalance_ipc with a
GTP hash and four nProbe instances:

zbalance_ipc -i p2p1,p2p2 -c 1 -n 4 -m 4 -a -p
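
Each nProbe instance then reads one queue of ZC cluster 1; a minimal
sketch reusing the flags from my configuration above (remaining options
omitted):

nprobe -i=zc:1@0 -g=/var/run/nprobe-zc1-0.pid ...
nprobe -i=zc:1@1 -g=/var/run/nprobe-zc1-1.pid ...
nprobe -i=zc:1@2 -g=/var/run/nprobe-zc1-2.pid ...
nprobe -i=zc:1@3 -g=/var/run/nprobe-zc1-3.pid ...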

Regards,
David.


--
Regards,
David Notivol
dnotivol@gmail.com