I am fairly new to nprobe and have been experimenting with its many
command-line options. I have a few general questions on which I would
appreciate clarification.
nprobe -v
Welcome to nProbe v.8.0.171020 (r5797) for x86_64-unknown-linux-gnu
with native PF_RING acceleration.
Copyright 2002-17 ntop.org
Build OS: CentOS Linux release 7.3.1611 (Core)
SystemID: 68A2B43E76056A7E
GIT rev: 8.0-stable:478c52c6ce70feaf6c65fe4806be05f75fe0e196:20171020
License: Invalid nProbe license (/etc/nprobe.license) [Missing license file]
Q1. When running on a multi-core host, will nprobe utilize all cores?
Somewhere, I thought I saw something about it being single threaded but now
cannot find that reference. This question goes to sizing my HW. I am seeing
~5% CPU load for one router's flow (about 2500 flow records/sec). I will
ultimately need more than 20x this volume, so I will eventually need to
deploy N hosts in a full production setup. I just want to know if there are any
settings needed to enable nprobe to fully utilize all cores on a given host.
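As a back-of-envelope check on the numbers above, here is a small sketch of the sizing arithmetic. It assumes CPU load scales linearly with flow rate, which I have not verified, and it does not settle whether the 5% is a share of one core or of the whole host. The function name is just for illustration.

```python
# Rough sizing arithmetic, assuming CPU load grows linearly with flow rate.
# That linearity is an unverified assumption, not a measured property of nprobe.

def projected_cpu_percent(measured_cpu_percent, scale_factor):
    """Scale a measured CPU percentage by a traffic multiplier."""
    return measured_cpu_percent * scale_factor

measured_cpu = 5.0      # ~5% CPU observed for one router
measured_rate = 2500    # flow records/sec from that router
scale = 20              # target is at least 20x the current volume

target_rate = measured_rate * scale
projected = projected_cpu_percent(measured_cpu, scale)

print(f"target rate: {target_rate} flow records/sec")
print(f"projected CPU at {scale}x: {projected}%")
```

If nprobe turns out to be single threaded and the 5% is a share of one core, 20x would land at roughly one saturated core, which is why the threading answer matters for how many instances or hosts I plan for.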
Q2. I am running with this configuration:
[root@vmwdnacollector01 ~]# cat /etc/nprobe/nprobe.conf
--interface=none
--collector=none
--collector-port=2055
--verbose=1
--flow-version=9
--hash-size=262144
--kafka="kafka01:9092;netflow-raw;1"
--dump-stats=/var/log/nprobe/stats.txt
--event-log=/var/log/nprobe/events.txt
-T="%IPV4_SRC_ADDR %IPV4_DST_ADDR %L4_SRC_PORT %L4_DST_PORT %IPV4_SRC_MASK
%IPV4_DST_MASK %IPV4_NEXT_HOP %IN_PKTS %IN_BYTES %OUT_PKTS %OUT_BYTES
%FIRST_SWITCHED %LAST_SWITCHED %TCP_FLAGS %PROTOCOL %SRC_TOS %DIRECTION
%EXPORTER_IPV4_ADDRESS"
I am collecting netflow V9 records from a Cisco router. I was sort of
expecting that the record would include the IP address of the router
because I need that to know where the data came from for upstream
enrichment. I have nprobe publishing to Kafka. But, looking at the raw
flows coming from the router, there is no field that identifies the router
IP. So, I experimented and added a -T <template> definition that matches
the actual fields coming from the router. Then I added the
%EXPORTER_IPV4_ADDRESS field (which is NOT in the raw record from the
router) and voila, the IP address of the router shows up in that field. So,
I assume that nprobe is simply adding the source IP address of each
incoming flow record into that field, as well as mapping each field in the
incoming flow record into the matching field in my defined template - sort
of "cherry picking" the fields out of the source record and packing them
into my template.
So, my question on this point is: am I doing this correctly by defining
my own template? It is the only way I have found to get the router IP into the record.
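To make my assumption concrete, here is a minimal sketch of the behavior I think I am observing. It is only a model of my guess, not nprobe's actual implementation. The function name and the sample values are hypothetical; the field names come from my template.

```python
# Model of the assumed "cherry picking": for each field named in the template,
# copy the value from the incoming record, and fill EXPORTER_IPV4_ADDRESS from
# the source IP of the UDP packet that carried the record.
# This is a guess at the behavior, not how nprobe is actually written.

def apply_template(source_record, template_fields, exporter_ip):
    out = {}
    for name in template_fields:
        if name == "EXPORTER_IPV4_ADDRESS":
            # Not present in the router's record; taken from the packet source.
            out[name] = exporter_ip
        elif name in source_record:
            out[name] = source_record[name]
        # Fields the router did not send are simply left out in this sketch.
    return out

# Made-up record using documentation address ranges.
incoming = {
    "IPV4_SRC_ADDR": "198.51.100.10",
    "IPV4_DST_ADDR": "203.0.113.20",
    "IN_BYTES": 1500,
    "IN_PKTS": 3,
}
template = ["IPV4_SRC_ADDR", "IPV4_DST_ADDR", "IN_PKTS", "IN_BYTES",
            "EXPORTER_IPV4_ADDRESS"]

result = apply_template(incoming, template, exporter_ip="192.0.2.1")
print(result)
```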
Q3. It appears, for the mode I am operating in, that no license is required
to allow this to work. When I run in the mode where nprobe sniffs packets
from my local interface, it will only produce 25K flows then stops if there
is no license. However, in collector mode, where it just receives flows
from a router and forwards them as JSON to Kafka, it runs for millions of
flows. So, the question here is: do I need a license for this sort of use case?
Q4. The Kafka producer has a boatload of configuration options, but nprobe
only exposes a couple of basic options (topic, acks, brokers). Is that it, or
is there some way to provide additional configuration information to the
embedded producer? For example, to properly aggregate data flows, I would
like to partition the topic on the IPV4_SRC_ADDR. I am running in a
multi-tenant environment where each tenant can have overlapping private IP
addresses that we see in the flows. So, I need to aggregate the flows by
TENANT_ID + IPV4_SRC_ADDR, for example. I see no way to configure this with
nprobe + kafka mode.
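To show the kind of keying I have in mind, here is a rough sketch of deriving a partition from TENANT_ID + IPV4_SRC_ADDR. This is not an nprobe or Kafka API. TENANT_ID is a field from my own enrichment, and CRC32 is only a stand-in for whatever hash a real partitioner would use.

```python
import zlib

# Hypothetical partitioning rule: all flows for the same tenant and source
# address map to the same partition, so overlapping private addresses from
# different tenants are kept apart by the tenant part of the key.

def partition_for(tenant_id, src_addr, num_partitions):
    key = f"{tenant_id}|{src_addr}".encode("utf-8")
    # CRC32 is a stand-in hash; it is deterministic across processes and runs.
    return zlib.crc32(key) % num_partitions

# Two tenants using the same private address produce two different keys.
p_a = partition_for("tenant-a", "10.0.0.5", 12)
p_b = partition_for("tenant-b", "10.0.0.5", 12)
print(p_a, p_b)
```

The two tenants may still hash to the same partition by chance. What matters for aggregation is that a given tenant and address pair always lands on one partition.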
Q5. Is there any way to bind nprobe to a specific interface when used as a
collector in my use case? Meaning, I might need to run multiple instances
on a single host but I want to be able to configure routers to direct their
flow records to a specific IP address so that I can load-balance the flows
over N instances of nprobe running on a single host. I cannot find any
configuration option that will bind the UDP listening port to a specific
interface on a single host.
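For reference, this is the socket-level behavior I am after, sketched with a plain UDP socket. It says nothing about what nprobe supports. The function name is hypothetical, and the loopback address and port 0 are used only so the demo runs anywhere.

```python
import socket

# Bind a UDP listener to one specific local address instead of all interfaces.
# With one such socket per local IP, each collector instance would only see
# the routers that were pointed at its address.

def open_collector_socket(bind_ip, port):
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((bind_ip, port))  # a specific IP here, rather than 0.0.0.0
    return sock

# Demo on loopback; port 0 lets the OS pick a free port.
demo = open_collector_socket("127.0.0.1", 0)
bound_ip, bound_port = demo.getsockname()
print(f"listening on {bound_ip}:{bound_port}")
demo.close()
```

In my case the bind address would be one of the host's real IPs and the port would be 2055, with a separate nprobe instance behind each address.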
Thanks for any insights into my questions.