Mailing List Archive

Multithreaded nDPI
I'm looking for some advice on using nDPI in multithreaded code.

Some Googling tells me that I need a separate
ndpi_detection_module_struct for each thread, but beyond that I can't
find a lot of information. The example ndpiReader code seems to only
use threads when processing multiple independent pcap files or NICs.
That's not directly applicable to my situation, as I'm processing a
stream of packets from a single NIC, so I don't have completely
independent data sources like that.

So, I can use a 5-tuple hash to pick a consistent thread for each flow.
That would mean that each flow would consistently be processed with the
same ndpi_detection_module_struct.
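For illustration, here is a minimal sketch (my own, not nDPI code) of the kind of 5-tuple hash I have in mind. Sorting the two endpoints before hashing makes the hash direction-independent, so a packet and its reply always land on the same thread:

```c
#include <stdint.h>

#define N_THREADS 4

/* FNV-1a style mixing step; any reasonable hash would do. */
static uint32_t mix(uint32_t h, uint32_t v) {
    h ^= v;
    return h * 16777619u;
}

/* Pick a worker thread for a flow.  The two endpoints are ordered
 * first, so (src, dst) and (dst, src) hash to the same thread and a
 * flow's state never has to move between threads. */
uint32_t flow_thread(uint32_t ip_a, uint16_t port_a,
                     uint32_t ip_b, uint16_t port_b, uint8_t proto) {
    uint64_t a = ((uint64_t)ip_a << 16) | port_a;
    uint64_t b = ((uint64_t)ip_b << 16) | port_b;
    uint64_t lo = a < b ? a : b;
    uint64_t hi = a < b ? b : a;

    uint32_t h = 2166136261u;
    h = mix(h, (uint32_t)lo);
    h = mix(h, (uint32_t)(lo >> 32));
    h = mix(h, (uint32_t)hi);
    h = mix(h, (uint32_t)(hi >> 32));
    h = mix(h, proto);
    return h % N_THREADS;
}
```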

The documentation isn't very clear, but presumably each endpoint should
have a single ndpi_id_struct, irrespective of how many flows it has.
Each ndpi_id_struct is therefore shared between multiple flows and may
be used by any of the threads, in association with any of the
ndpi_detection_module_structs. I can obviously lock the ndpi_id_structs
so that each is only used by one thread at a time, but I'm not sure
whether using the same endpoint with multiple
ndpi_detection_module_structs is going to be a problem.

Finally, I'm not sure how much information nDPI shares between flows via
the ndpi_detection_module_struct. That is, when a flow is processed,
does nDPI store any information in the ndpi_detection_module_struct
about what it has learnt, and then use that information when processing
other flows? If so, splitting the flows between multiple
ndpi_detection_module_structs is fundamentally problematic.


I've also got a quick query about the ndpi_id_structs that are passed to
ndpi_detection_process_packet. The documentation just says that these
are "source" and "destination", but it isn't clear whether these are the
source/destination of the packet or of the flow. Obviously reply
packets will have the source and destination the opposite way around
from the flow's.


Many thanks.

--
- Steve Hill
Technical Director | Cyfarwyddwr Technegol
Opendium Online Safety & Web Filtering http://www.opendium.com
Diogelwch Ar-Lein a Hidlo Gwefan

Enquiries | Ymholiadau: sales@opendium.com +44-1792-824568
Support | Cefnogi: support@opendium.com +44-1792-825748

------------------------------------------------------------------------
Opendium Limited is a company registered in England and Wales.
Mae Opendium Limited yn gwmni sydd wedi'i gofrestru yn Lloegr a Chymru.

Company No. | Rhif Cwmni: 5465437
Highfield House, 1 Brue Close, Bruton, Somerset, BA10 0HY, England.
_______________________________________________
Ntop-dev mailing list
Ntop-dev@listgateway.unipi.it
http://listgateway.unipi.it/mailman/listinfo/ntop-dev
Re: Multithreaded nDPI [ In reply to ]
Hi Steve
Processing traffic from a single interface with multiple streams/threads is similar
to processing traffic from multiple interfaces with one thread per interface, as
long as traffic is distributed across streams based on the 5-tuple. This means
that, to avoid locking (for performance), you need separate per-thread data
structures, including the ndpi struct.
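As a sketch of that layout (illustrative only: the nDPI types are stubbed as void* so the example is self-contained; in real code each module would come from ndpi_init_detection_module()):

```c
#include <stdlib.h>

#define N_THREADS 4

/* One fully independent context per worker thread: its own detection
 * module and its own flow/peer storage, so no locks are ever taken. */
struct worker_ctx {
    void *ndpi_module;  /* struct ndpi_detection_module_struct * in real
                           code, from ndpi_init_detection_module() */
    void *flow_table;   /* this thread's private flows and ndpi_id_structs */
};

struct worker_ctx *make_workers(int n) {
    struct worker_ctx *w = calloc((size_t)n, sizeof(*w));
    if (!w) return NULL;
    for (int i = 0; i < n; i++) {
        /* Stand-in allocations; real code would initialise nDPI here,
         * once per thread, with no sharing between contexts. */
        w[i].ndpi_module = malloc(1);
        w[i].flow_table  = malloc(1);
    }
    return w;
}
```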

Alfredo

> On 8 Mar 2019, at 17:51, Steve Hill <steve@opendium.com> wrote:
>
>
> I'm looking for some advice on using nDPI in multithreaded code.
>
> Some Googling tells me that I need a separate ndpi_detection_module_struct for each thread, but beyond that I can't find a lot of information. The example ndpiReader code seems to only use threads when processing multiple independent pcap files or NICs. That's not directly applicable to my situation, as I'm processing a stream of packets from a single NIC so don't have completely independent data sources like that.
>
> So, I can use a 5-tuple hash to pick a consistent thread for each flow. That would mean that each flow would consistently be processed with the same ndpi_detection_module_struct.
>
> The documentation isn't very clear, but presumably each endpoint should have a single ndpi_id_struct, irrespective of how many flows it has. Therefore each of the ndpi_id_structs are shared between multiple flows and may therefore be used by any of the threads in association with any of the ndpi_detection_module_structs. I can obviously lock the ndpi_id_structs so that each can only be used by one thread at a time, but I'm not sure whether using the same endpoint with multiple ndpi_detection_module_structs is going to be a problem?
>
> Finally, I'm not sure how much information nDPI shares between flows via the ndpi_detection_module_struct. i.e. when a flow is processed, does nDPI store any information in ndpi_detection_module_struct about what it has learnt and then use that information when processing other flows? If so, splitting the flows between multiple ndpi_detection_module_structs is fundamentally problematic.
>
>
> I've also got a quick query about the ndpi_id_structs that are passed to ndpi_detection_process_packet. The documentation just says that these are "source" and "destination", but isn't clear on whether these are the source/destination of the packet or of the flow. Obviously reply packets will have the source/destination the opposite way around to the flow's source/destination.
>
>
> Many thanks.
>
> --
> - Steve Hill
> Technical Director | Cyfarwyddwr Technegol
> Opendium Online Safety & Web Filtering http://www.opendium.com
> Diogelwch Ar-Lein a Hidlo Gwefan
>
> Enquiries | Ymholiadau: sales@opendium.com +44-1792-824568
> Support | Cefnogi: support@opendium.com +44-1792-825748
>
> ------------------------------------------------------------------------
> Opendium Limited is a company registered in England and Wales.
> Mae Opendium Limited yn gwmni sydd wedi'i gofrestru yn Lloegr a Chymru.
>
> Company No. | Rhif Cwmni: 5465437
> Highfield House, 1 Brue Close, Bruton, Somerset, BA10 0HY, England.
> _______________________________________________
> Ntop-dev mailing list
> Ntop-dev@listgateway.unipi.it
> http://listgateway.unipi.it/mailman/listinfo/ntop-dev

Re: Multithreaded nDPI [ In reply to ]
On 08/03/2019 16:57, Alfredo Cardigliano wrote:

> processing traffic from a single interface with multiple streams/thread is similar
> to processing traffic from multiple interfaces with one thread per interface, as
> long as traffic is distributed across streams based on 5-tuple. This means that
> you need (to avoid locking, for performance) separate data structures including
> the ndpi struct.

So if a network endpoint now has several ndpi_id_structs (one for each
thread), how does that impact protocol detection? It looks like
historical information is recorded in this struct and used to help
protocol detection in future flows - surely that will break if I can no
longer guarantee that all of that endpoint's flows will be using the
same ndpi_id_struct?

--
- Steve Hill
Technical Director | Cyfarwyddwr Technegol
Opendium Online Safety & Web Filtering http://www.opendium.com
Diogelwch Ar-Lein a Hidlo Gwefan

Enquiries | Ymholiadau: sales@opendium.com +44-1792-824568
Support | Cefnogi: support@opendium.com +44-1792-825748

------------------------------------------------------------------------
Opendium Limited is a company registered in England and Wales.
Mae Opendium Limited yn gwmni sydd wedi'i gofrestru yn Lloegr a Chymru.

Company No. | Rhif Cwmni: 5465437
Highfield House, 1 Brue Close, Bruton, Somerset, BA10 0HY, England.
Re: Multithreaded nDPI [ In reply to ]
> On 8 Mar 2019, at 17:51, Steve Hill <steve@opendium.com> wrote:
>
>
> I'm looking for some advice on using nDPI in multithreaded code.
>
> Some Googling tells me that I need a separate ndpi_detection_module_struct for each thread, but beyond that I can't find a lot of information. The example ndpiReader code seems to only use threads when processing multiple independent pcap files or NICs. That's not directly applicable to my situation, as I'm processing a stream of packets from a single NIC so don't have completely independent data sources like that.
Yes. You need a separate module per thread to avoid locks. However, if for some reason you want to follow that path, that is also an option.
>
> So, I can use a 5-tuple hash to pick a consistent thread for each flow. That would mean that each flow would consistently be processed with the same ndpi_detection_module_struct.
The module contains the information about protocols etc., but for each flow you need to have the nDPI peers and flow info. So make sure that when a thread starts processing a flow, it is the only one that does so.
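A sketch of that ownership rule (illustrative names, not nDPI's API; the nDPI types are stubbed as void*). Each worker thread owns a private flow table, and because the 5-tuple hash always routes a flow's packets to the same thread, no lock is ever needed:

```c
#include <stdlib.h>

#define MAX_FLOWS 1024

struct flow_key {
    unsigned int   ip[2];
    unsigned short port[2];
    unsigned char  proto;
};

/* Each entry owns everything nDPI needs for one flow. */
struct flow_entry {
    struct flow_key key;
    int in_use;
    void *ndpi_flow;        /* struct ndpi_flow_struct * in real code */
    void *src_id, *dst_id;  /* the two ndpi_id_structs for this flow */
};

static int key_eq(const struct flow_key *a, const struct flow_key *b) {
    return a->ip[0] == b->ip[0] && a->ip[1] == b->ip[1] &&
           a->port[0] == b->port[0] && a->port[1] == b->port[1] &&
           a->proto == b->proto;
}

/* `table` is private to one worker thread, and the 5-tuple hash
 * guarantees all packets of a flow reach that thread, so exactly one
 * thread ever processes a given flow -- no locking required. */
struct flow_entry *lookup_or_create(struct flow_entry *table,
                                    const struct flow_key *k) {
    struct flow_entry *spare = NULL;
    for (int i = 0; i < MAX_FLOWS; i++) {
        if (table[i].in_use) {
            if (key_eq(&table[i].key, k))
                return &table[i];
        } else if (!spare) {
            spare = &table[i];
        }
    }
    if (spare) {            /* new flow: claim a free slot */
        spare->key = *k;
        spare->in_use = 1;
        /* Real code would allocate the ndpi_flow and id structs here. */
    }
    return spare;
}
```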
>
> The documentation isn't very clear, but presumably each endpoint should have a single ndpi_id_struct, irrespective of how many flows it has. Therefore each of the ndpi_id_structs are shared between multiple flows and may therefore be used by any of the threads in association with any of the ndpi_detection_module_structs. I can obviously lock the ndpi_id_structs so that each can only be used by one thread at a time, but I'm not sure whether using the same endpoint with multiple ndpi_detection_module_structs is going to be a problem?

The best example is the ntopng code; have a look at it. Please avoid locks: every lock/unlock is a waste of time, and you can avoid them entirely, as the ntopng code shows.

>
> Finally, I'm not sure how much information nDPI shares between flows via the ndpi_detection_module_struct. i.e. when a flow is processed, does nDPI store any information in ndpi_detection_module_struct about what it has learnt and then use that information when processing other flows? If so, splitting the flows between multiple ndpi_detection_module_structs is fundamentally problematic.

No, the flow information is stored inside the flow struct: https://github.com/ntop/ntopng/blob/dev/include/Flow.h#L63
>
>
> I've also got a quick query about the ndpi_id_structs that are passed to ndpi_detection_process_packet. The documentation just says that these are "source" and "destination", but isn't clear on whether these are the source/destination of the packet or of the flow. Obviously reply packets will have the source/destination the opposite way around to the flow's source/destination.

Correct. Again, please see https://github.com/ntop/ntopng/blob/dev/include/Flow.h#L84

Luca
>
>
> Many thanks.
>
> --
> - Steve Hill
> Technical Director | Cyfarwyddwr Technegol
> Opendium Online Safety & Web Filtering http://www.opendium.com
> Diogelwch Ar-Lein a Hidlo Gwefan
>
> Enquiries | Ymholiadau: sales@opendium.com +44-1792-824568
> Support | Cefnogi: support@opendium.com +44-1792-825748
>
> ------------------------------------------------------------------------
> Opendium Limited is a company registered in England and Wales.
> Mae Opendium Limited yn gwmni sydd wedi'i gofrestru yn Lloegr a Chymru.
>
> Company No. | Rhif Cwmni: 5465437
> Highfield House, 1 Brue Close, Bruton, Somerset, BA10 0HY, England.
> _______________________________________________
> Ntop-dev mailing list
> Ntop-dev@listgateway.unipi.it
> http://listgateway.unipi.it/mailman/listinfo/ntop-dev
Re: Multithreaded nDPI [ In reply to ]
> On 8 Mar 2019, at 18:13, Steve Hill <steve@opendium.com> wrote:
>
> On 08/03/2019 16:57, Alfredo Cardigliano wrote:
>
>> processing traffic from a single interface with multiple streams/thread is similar
>> to processing traffic from multiple interfaces with one thread per interface, as
>> long as traffic is distributed across streams based on 5-tuple. This means that
>> you need (to avoid locking, for performance) separate data structures including
>> the ndpi struct.
>
> So if a network endpoint now has several ndpi_id_structs (one for each thread) how does that impact protocol detection? It looks like historical information is recorded in this struct which is used to help protocol detection in future flows - that will surely break if I can no longer guarantee that all of that endpoint's flows will be using the same ndpi_id_struct?

Yes, you need to avoid shuffling/mixing the structs. Please see the ntopng code.

Luca
>
> --
> --
> - Steve Hill
> Technical Director | Cyfarwyddwr Technegol
> Opendium Online Safety & Web Filtering http://www.opendium.com
> Diogelwch Ar-Lein a Hidlo Gwefan
>
> Enquiries | Ymholiadau: sales@opendium.com +44-1792-824568
> Support | Cefnogi: support@opendium.com +44-1792-825748
>
> ------------------------------------------------------------------------
> Opendium Limited is a company registered in England and Wales.
> Mae Opendium Limited yn gwmni sydd wedi'i gofrestru yn Lloegr a Chymru.
>
> Company No. | Rhif Cwmni: 5465437
> Highfield House, 1 Brue Close, Bruton, Somerset, BA10 0HY, England.
> _______________________________________________
> Ntop-dev mailing list
> Ntop-dev@listgateway.unipi.it
> http://listgateway.unipi.it/mailman/listinfo/ntop-dev

Re: Multithreaded nDPI [ In reply to ]
On 10/03/2019 09:26, Luca Deri wrote:

> The best example is the ntopng code. have a look at it. Please avoid
> locks, they are a waste of time every time you lock/unlock. You can
> avoid them, see the ntopng code

I've looked through the ntopng code, and as far as I can see it doesn't
provide an example of processing a single stream of packets (i.e.
receiving on a single NIC) across multiple threads.

As far as I can see (and please correct me if I'm wrong), ntopng uses
one thread per NIC and treats each NIC more or less independently. This
obviously solves some concurrency problems since it makes it likely
(although not guaranteed) that the flows handled by different threads
are completely unrelated to each other.

In ntopng, each thread appears to maintain its own pool of flows and
peers, which means that if a client attached to eth0 is talking to the
same server as a client attached to eth1, nDPI cannot share information
about the server between the threads.

My understanding is that nDPI can use knowledge of what was detected in
one flow to infer things about a different flow that involves the same
peer. If this is the case, surely it is impacted by partitioning this
data between threads?


When processing packets received from a single NIC, we have a much
harder job in splitting up "unrelated" flows. Take the following flows
as an example:
(1) udp, 10.0.0.1:1234 <-> 192.168.0.1:5678
(2) udp, 10.0.0.1:1234 <-> 192.168.0.2:5678
In a single threaded process, nDPI sees the first flow, records
information about it in the flow[1], peer[10.0.0.1] and
peer[192.168.0.1] structures. It then sees the second flow and records
information about it in the flow[2], peer[10.0.0.1] and
peer[192.168.0.2] structures. While processing the second flow it can
use what it already knows about peer[10.0.0.1] to make inferences about
the flow.

In a multithreaded process using a 5-tuple hash, the flows may be
allocated to different threads. Thread 1 sees the first flow, records
information about it in the thread[1]->flow[1],
thread[1]->peer[10.0.0.1] and thread[1]->peer[192.168.0.1] structures.
Thread 2 then sees the second flow and records information about it in
the thread[2]->flow[2], thread[2]->peer[10.0.0.1] and
thread[2]->peer[192.168.0.2] structures. It cannot access the
thread[1]->peer[10.0.0.1] structure, so is unable to use that
information to make inferences about the second flow.

How does this impact the detection process?

> No the flow information is stored inside the flow struct
> https://github.com/ntop/ntopng/blob/dev/include/Flow.h#L63

ndpi_detection_module_struct appears to include various caches that are
shared between flows, such as ookla_cache. If different flows see
different versions of those caches, is that not going to break some
detections?


--
- Steve Hill
Technical Director | Cyfarwyddwr Technegol
Opendium Online Safety & Web Filtering http://www.opendium.com
Diogelwch Ar-Lein a Hidlo Gwefan

Enquiries | Ymholiadau: sales@opendium.com +44-1792-824568
Support | Cefnogi: support@opendium.com +44-1792-825748

------------------------------------------------------------------------
Opendium Limited is a company registered in England and Wales.
Mae Opendium Limited yn gwmni sydd wedi'i gofrestru yn Lloegr a Chymru.

Company No. | Rhif Cwmni: 5465437
Highfield House, 1 Brue Close, Bruton, Somerset, BA10 0HY, England.
Re: Multithreaded nDPI [ In reply to ]
Steve

> On 13 Mar 2019, at 18:56, Steve Hill <steve@opendium.com> wrote:
>
> On 10/03/2019 09:26, Luca Deri wrote:
>
>> The best example is the ntopng code. have a look at it. Please avoid locks, they are a waste of time every time you lock/unlock. You can avoid them, see the ntopng code
>
> I've looked through the ntopng code, and as far as I can see it doesn't provide an example of processing a single stream of packets (i.e. receiving on a single NIC) across multiple threads.
>
> As far as I can see (and please correct me if I'm wrong), ntopng uses one thread per NIC and treats each NIC more or less independently.

Correct.

> This obviously solves some concurrency problems since it makes it likely (although not guaranteed) that the flows handled by different threads are completely unrelated to each other.

There are no concurrency problems in ntopng's design. Flows are created by ntopng threads when processing packets. Each thread processes packets coming from its own NIC. If multiple threads see the same packet, each one will create/update its own flow out of the packet metadata.

>
> In ntopng, each thread appears to maintain its own pool of flows and peers,

Correct.

> which means that if a client attached to eth0 is talking to the same server as a client attached to eth1, nDPI cannot share information about the server between the threads.
>
> My understanding is that nDPI can use knowledge of what was detected in one flow to infer things about a different flow that involves the same peer. If this is the case, surely it is impacted by partitioning this data between threads?

Each thread has its own nDPI data structures used for the detection. See NetworkInterface member ndpi_struct.

>
>
> When processing packets received from a single NIC, we have a much harder job in splitting up "unrelated" flows. Take the following flows as an example:
> (1) udp, 10.0.0.1:1234 <-> 192.168.0.1:5678
> (2) udp, 10.0.0.1:1234 <-> 192.168.0.2:5678
> In a single threaded process, nDPI sees the first flow, records information about it in the flow[1], peer[10.0.0.1] and peer[192.168.0.1] structures. It then sees the second flow and records information about it in the flow[2], peer[10.0.0.1] and peer[192.168.0.2] structures. While processing the second flow it can use what it already knows about peer[10.0.0.1] to make inferences about the flow.
>
> In a multithreaded process using a 5-tuple hash, the flows may be allocated to different threads. Thread 1 sees the first flow, records information about it in the thread[1]->flow[1], thread[1]->peer[10.0.0.1] and thread[1]->peer[192.168.0.1] structures. Thread 2 then sees the second flow and records information about it in the thread[2]->flow[2], thread[2]->peer[10.0.0.1] and thread[2]->peer[192.168.0.2] structures. It cannot access the thread[1]->peer[10.0.0.1] structure, so is unable to use that information to make inferences about the second flow.
>
> How does this impact the detection process?

Each thread performs the detection independently of the others; nDPI detection is on a per-thread basis.

>
>> No the flow information is stored inside the flow struct https://github.com/ntop/ntopng/blob/dev/include/Flow.h#L63
>
> ndpi_detection_module_struct appears to include various caches that are shared between flows, such as ookla_cache. If different flows see different versions of those caches, is that not going to break some detections?

Nothing is going to break.

Simone

>
>
> --
> - Steve Hill
> Technical Director | Cyfarwyddwr Technegol
> Opendium Online Safety & Web Filtering http://www.opendium.com
> Diogelwch Ar-Lein a Hidlo Gwefan
>
> Enquiries | Ymholiadau: sales@opendium.com +44-1792-824568
> Support | Cefnogi: support@opendium.com +44-1792-825748
>
> ------------------------------------------------------------------------
> Opendium Limited is a company registered in England and Wales.
> Mae Opendium Limited yn gwmni sydd wedi'i gofrestru yn Lloegr a Chymru.
>
> Company No. | Rhif Cwmni: 5465437
> Highfield House, 1 Brue Close, Bruton, Somerset, BA10 0HY, England.
> _______________________________________________
> Ntop-dev mailing list
> Ntop-dev@listgateway.unipi.it
> http://listgateway.unipi.it/mailman/listinfo/ntop-dev

Re: Multithreaded nDPI [ In reply to ]
Steve

> On 13 Mar 2019, at 18:56, Steve Hill <steve@opendium.com> wrote:
>
> On 10/03/2019 09:26, Luca Deri wrote:
>
>> The best example is the ntopng code. have a look at it. Please avoid locks, they are a waste of time every time you lock/unlock. You can avoid them, see the ntopng code
>
> I've looked through the ntopng code, and as far as I can see it doesn't provide an example of processing a single stream of packets (i.e. receiving on a single NIC) across multiple threads.
>
> As far as I can see (and please correct me if I'm wrong), ntopng uses one thread per NIC and treats each NIC more or less independently.

Correct.

> This obviously solves some concurrency problems since it makes it likely (although not guaranteed) that the flows handled by different threads are completely unrelated to each other.

No concurrency problems affect the design of ntopng. Flows are created by ntopng threads when processing packets. Each thread processes packets coming from its own NIC. If multiple threads see the same packet, each one will create/update its own flow out of packet metadata.

>
> In ntopng, each thread appears to maintain its own pool of flows and peers,

Correct.

> which means that if a client attached to eth0 is talking to the same server as a client attached to eth1, nDPI cannot share information about the server between the threads.
>
> My understanding is that nDPI can use knowledge of what was detected in one flow to infer things about a different flow that involves the same peer. If this is the case, surely it is impacted by partitioning this data between threads?

Each thread has its own nDPI data structures used for the detection. See NetworkInterface member ndpi_struct.

>
>
> When processing packets received from a single NIC, we have a much harder job in splitting up "unrelated" flows. Take the following flows as an example:
> (1) udp, 10.0.0.1:1234 <-> 192.168.0.1:5678
> (2) udp, 10.0.0.1:1234 <-> 192.168.0.2:5678
> In a single threaded process, nDPI sees the first flow, records information about it in the flow[1], peer[10.0.0.1] and peer[192.168.0.1] structures. It then sees the second flow and records information about it in the flow[2], peer[10.0.0.1] and peer[192.168.0.2] structures. While processing the second flow it can use what it already knows about peer[10.0.0.1] to make inferences about the flow.
>
> In a multithreaded process using a 5-tuple hash, the flows may be allocated to different threads. Thread 1 sees the first flow, records information about it in the thread[1]->flow[1], thread[1]->peer[10.0.0.1] and thread[1]->peer[192.168.0.1] structures. Thread 2 then sees the second flow and records information about it in the thread[2]->flow[2], thread[2]->peer[10.0.0.1] and thread[2]->peer[192.168.0.2] structures. It cannot access the thread[1]->peer[10.0.0.1] structure, so is unable to use that information to make inferences about the second flow.
>
> How does this impact the detection process?

Each thread performs the detection independently of the others: nDPI detection is done on a per-thread basis.

>
>> No the flow information is stored inside the flow struct https://github.com/ntop/ntopng/blob/dev/include/Flow.h#L63
>
> ndpi_detection_module_struct appears to include various caches that are shared between flows, such as ookla_cache. If different flows see different versions of those caches, is that not going to break some detections?

Nothing is going to break.

Simone

>
>
> --
> - Steve Hill
> Technical Director | Cyfarwyddwr Technegol
> Opendium Online Safety & Web Filtering http://www.opendium.com
> Diogelwch Ar-Lein a Hidlo Gwefan
>
> Enquiries | Ymholiadau: sales@opendium.com +44-1792-824568
> Support | Cefnogi: support@opendium.com +44-1792-825748
>
> ------------------------------------------------------------------------
> Opendium Limited is a company registered in England and Wales.
> Mae Opendium Limited yn gwmni sydd wedi'i gofrestru yn Lloegr a Chymru.
>
> Company No. | Rhif Cwmni: 5465437
> Highfield House, 1 Brue Close, Bruton, Somerset, BA10 0HY, England.
Re: Multithreaded nDPI
On 14/03/2019 09:31, Simone Mainardi wrote:

>> How does this impact the detection process?
>
> Each thread performs the detection independently from the other. nDPI
> detection is on a per-thread basis.

Thanks for your help so far, but I'm not sure this answers my question,
so I'll try to give a more real-world example:

The IRC protocol allows clients to set up peer-to-peer connections
directly between two clients (DCC) to transmit files. This works by one
IRC client sending a private message to another client via the server,
containing:
DCC SEND <filename> <my_address> <my_port>
The receiving client then makes a connection directly to
<my_address>:<my_port>.


In single threaded operation:

1. A client makes a connection to an IRC server.
2. nDPI creates a new flow for the connection (we'll call it flow[0]).
3. ndpi_search_irc_tcp() detects that this is an IRC flow and marks it
as NDPI_PROTOCOL_IRC.
4. The client sends a DCC request over flow[0].
5. ndpi_search_irc_tcp() examines the traffic, spots the DCC request and
records <my_port> within flow[0]->src->irc_port and flow[0]->dst->irc_port.
6. The other party makes a connection to <my_address>:<my_port>.
7. nDPI creates a new flow for the connection (we'll call it flow[1]),
and flow[1]->dst == flow[0]->src since one end is the client that we
already know about.
8. ndpi_search_irc_tcp() finds that the destination port of flow[1] is
recorded within flow[1]->dst->irc_port, and therefore infers that this
is an IRC DCC flow and marks it as NDPI_PROTOCOL_IRC.


Now compare to multi-threaded operation, using a 5-tuple hash to assign
each flow to a different thread. Each thread maintains separate flows
and peers data to allow it to be lockless:

1. A client makes a connection to an IRC server.
2. nDPI creates a new flow for the connection (we'll call it flow[0]) in
thread 0.
3. ndpi_search_irc_tcp() detects that this is an IRC flow and marks it
as NDPI_PROTOCOL_IRC.
4. The client sends a DCC request over flow[0].
5. ndpi_search_irc_tcp() examines the traffic, spots the DCC request and
records <my_port> within thread[0]->flow[0]->src->irc_port and
thread[0]->flow[0]->dst->irc_port.
6. The other party makes a connection to <my_address>:<my_port>.
7. nDPI creates a new flow for the connection (we'll call it flow[1])
which could be in any thread, since we're selecting the thread based on
a 5-tuple hash. We'll assume it is handled by thread 1.
8. Since thread[1]->flow[1]->dst and thread[0]->flow[0]->src are NOT the
same data structure, ndpi_search_irc_tcp() DOES NOT find that the
destination port of flow[1] is recorded within
thread[1]->flow[1]->dst->irc_port, and therefore fails to detect this
flow as IRC.


This kind of pattern is followed by quite a few dissectors, whereby data
recorded in the peer structures when examining one flow is later used to
detect another flow. As far as I can see, this cannot work when the
peer data is not shared between threads and the flows are being divided
between the threads based on a 5-tuple hash. I would certainly welcome
some advice on this.


>> ndpi_detection_module_struct appears to include various caches that
>> are shared between flows, such as ookla_cache. If different flows
>> see different versions of those caches, is that not going to break
>> some detections?
>
> Nothing is going to break.

Can you explain how ookla_cache is used within this struct? It looks to
me as though this is used by the http dissector to record IP addresses
that are associated with Ookla servers, and that the Ookla dissector
then uses that IP address cache to detect other flows as
NDPI_PROTOCOL_OOKLA.

If these flows do not all share the same ndpi_detection_module_struct, I
cannot see how Ookla detection can avoid breaking.


Many thanks.

Re: Multithreaded nDPI
Steve,

> On 14 Mar 2019, at 16:07, Steve Hill <steve@opendium.com> wrote:
>
> On 14/03/2019 09:31, Simone Mainardi wrote:
>
>>> How does this impact the detection process?
>> Each thread performs the detection independently from the other. nDPI
>> detection is on a per-thread basis.
>
> Thanks for your help so far, but I'm not sure this answers my question, so I'll try to give a more real world example:
>
> The IRC protocol allows clients to set up peer to peer connections directly between two clients (DCC) to transmit files. This works by one IRC client sending a private message to another client via the server, containing:
> DCC SEND <filename> <my_address> <my_port>
> The receiving client then makes a connection directly to <my_address>:<my_port>.
>
>
> In single threaded operation:
>
> 1. A client makes a connection to an IRC server.
> 2. nDPI creates a new flow for the connection (we'll call it flow[0]).
> 3. ndpi_search_irc_tcp() detects that this is an IRC flow and marks it as NDPI_PROTOCOL_IRC.
> 4. The client sends a DCC request over flow[0].
> 5. ndpi_search_irc_tcp() examines the traffic, spots the DCC request and records <my_port> within flow[0]->src->irc_port and flow[0]->dst->irc_port.
> 6. The other party makes a connection to <my_address>:<my_port>.
> 7. nDPI creates a new flow for the connection (we'll call it flow[1]), and flow[1]->dst == flow[0]->src since one end is the client that we already know about.
> 8. ndpi_search_irc_tcp() finds that the destination port of flow[1] is recorded within flow[1]->dst->irc_port, and therefore infers that this is an IRC DCC flow and marks it as NDPI_PROTOCOL_IRC.
>
>
> Now compare to multi-threaded operation, using a 5-tuple hash to assign each flow to a different thread. Each thread maintains separate flows and peers data to allow it to be lockless:
>
> 1. A client makes a connection to an IRC server.
> 2. nDPI creates a new flow for the connection (we'll call it flow[0]) in thread 0.
> 3. ndpi_search_irc_tcp() detects that this is an IRC flow and marks it as NDPI_PROTOCOL_IRC.
> 4. The client sends a DCC request over flow[0].
> 5. ndpi_search_irc_tcp() examines the traffic, spots the DCC request and records <my_port> within thread[0]->flow[0]->src->irc_port and thread[0]->flow[0]->dst->irc_port.
> 6. The other party makes a connection to <my_address>:<my_port>.
> 7. nDPI creates a new flow for the connection (we'll call it flow[1]) which could be in any thread, since we're selecting the thread based on a 5-tuple hash. We'll assume it is handled by thread 1.
> 8. Since thread[1]->flow[1]->dst and thread[0]->flow[0]->src are NOT the same data structure, ndpi_search_irc_tcp() DOES NOT find that the destination port of flow[1] is recorded within thread[1]->flow[1]->dst->irc_port, and therefore fails to detect this flow as IRC.
>
> This kind of pattern is followed by quite a few dissectors, whereby data recorded in the peer structures when examining one flow is later used to detect another flow. As far as I can see, this cannot work when the peer data is not shared between threads and the flows are being divided between the threads based on a 5-tuple hash. I would certainly welcome some advice on this.
>
>
>>> ndpi_detection_module_struct appears to include various caches that
>>> are shared between flows, such as ookla_cache. If different flows
>>> see different versions of those caches, is that not going to break
>>> some detections?
>> Nothing is going to break.
>
> Can you explain how ookla_cache is used within this struct? It looks to me as though this is used by the http dissector to record IP addresses that are associated with Ookla servers, and that the Ookla dissector then uses that IP address cache to detect other flows as NDPI_PROTOCOL_OOKLA.
>
> If these flows do not all share the same ndpi_detection_module_struct, I cannot see how Ookla detection won't break?


The point is that, in general, packets (and thus flows) are NOT balanced among threads using a 5-tuple hash, neither in ntopng nor in nDPI. All the traffic coming from an interface is processed by a single thread.

If you have to use a hash function for balancing (e.g., when using RSS), then you have to pick a function that guarantees that all the packets necessary to detect the applications you are interested in are delivered to the same thread. Typically, a 5-tuple hash works OK as long as src->dst and dst->src packets are hashed to the same thread.

nDPI structures have not been designed for concurrent use by multiple threads; this avoids costly synchronization mechanisms.


Simone
