Mailing List Archive

Replies / Requests Ratio
Hello,

I'm trying to understand how/why I am getting the "Replies / Requests
Ratio" warnings for DNS.

I am suspicious of these alerts, and would like to know how/why they are being
generated. I am suspicious for the following reasons: 1) If it really is
as bad as indicated, I should notice problems. 2) The "events" occur
immediately after I clear the alerts, and tend to persist for hours.

In any case, I cleared the alerts last night, and this is what they look
like:

06/05/2020 22:15:00 12:31:28 Warning Replies / Requests Ratio Host
edgemax.example.net
<http://xps-630i.scamlan.net:3000/lua/host_details.lua?ifid=2&host=10.12.17.1@1&page=historical&epoch_begin=1588864588&epoch_end=1588868188>
has received 54 DNS requests but sent 0 DNS replies [5 Minutes ratio: 0%]
06/05/2020 22:15:00 12:31:28 Warning Replies / Requests Ratio Host
pihole.example.net
<http://xps-630i.scamlan.net:3000/lua/host_details.lua?ifid=2&host=10.12.17.3@1&page=historical&epoch_begin=1588864588&epoch_end=1588868188>
has sent 93 DNS requests but received 3 DNS replies [5 Minutes ratio: 3.2%]

06/05/2020 22:15:00 12:31:28 Warning Replies / Requests Ratio Host
pihole-2.example.net
<http://xps-630i.scamlan.net:3000/lua/host_details.lua?ifid=2&host=10.12.17.4@1&page=historical&epoch_begin=1588864588&epoch_end=1588868188>
has sent 97 DNS requests but received 1 DNS reply [5 Minutes ratio: 1.0%]
Re: Replies / Requests Ratio
Hi Aaron,

The alerts that you are reporting basically tell you that these hosts
receive DNS requests but do not send a reply. To troubleshoot possible
problems, you should combine this information with your knowledge of
the network.

The first question to answer is: are those hosts expected to accept DNS
requests? If not, are the requests generated from the internet or from
the LAN? In the first case, a firewall rule to block such DNS requests may
be a good idea. In the latter case, some hosts in the LAN may be
misconfigured. In the case of the pihole hosts, I expect pihole to block
some DNS requests for advertisement sites, so this could be normal
behaviour. The following ntopng features may also help you:

https://www.ntop.org/guides/ntopng/advanced_features/live_pcap_download.html

https://www.ntop.org/guides/ntopng/using_with_other_tools/n2disk.html

https://www.ntop.org/guides/ntopng/historical_flows.html

Regards,
Emanuele

Re: Replies / Requests Ratio
Thank you for your response. In the screenshot below, can you please
explain the significance of the "Date/Time" and the "Duration" columns?
What do they mean in this context?

Do I understand correctly that all 3 hosts triggered the alert at 07:25:01
(or 07:30:01) this morning? And that all three alerts have been active for
the past 07:28:53? Does this mean that no additional DNS Reply/Request
issues have been detected?

I notice in "Past Alerts" tab, that there are many Reply/Request Alerts for
the same host with very short durations (screen shot #2). When/how does an
alert move from the "Engaged" to "Past" tab?

So in the 2nd screenshot, fire-TV had an alert at 06:20:00 for 05:00
minutes where 18 requests received 0 replies. Then another alert at
06:50:00 for 05:00 minutes. Were the 18 replies from the first alert
ultimately received? And were they received 5 minutes after the alert
occurred?

Context here is that 99% of the traffic is Internet traffic. Almost all of
the pihole traffic is to forwarders. BTW, the way pihole works (by
default) is it replies 0.0.0.0 for blocked hosts. It should respond to
every query.

I tried the live_pcap_download.html
<https://www.ntop.org/guides/ntopng/advanced_features/live_pcap_download.html>
lua, but couldn't figure out the bpf_filter:
curl --cookie "user=admin; password=xxxxx" "http://10.12.17.25:3000/lua/live_traffic.lua?ifid=0&duration=600&bpf_filter=\"port 53\""
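One thing that may be worth trying is percent-encoding the filter value instead of wrapping it in escaped quotes. The sketch below only builds the URL. The endpoint and parameter names (ifid, duration, bpf_filter) are taken from the command above, the helper name live_traffic_url is hypothetical, and whether ntopng needs this encoding is an assumption, not something confirmed in this thread.

```python
from urllib.parse import urlencode, quote


def live_traffic_url(base, ifid, duration, bpf_filter):
    """Build a live_traffic.lua URL with a percent-encoded BPF filter.

    Hypothetical helper: it only formats the query string and does not
    contact ntopng.
    """
    query = urlencode(
        {"ifid": ifid, "duration": duration, "bpf_filter": bpf_filter},
        quote_via=quote,  # encode spaces as %20 rather than '+'
    )
    return f"{base}/lua/live_traffic.lua?{query}"


url = live_traffic_url("http://10.12.17.25:3000", 0, 600, "port 53")
print(url)
# http://10.12.17.25:3000/lua/live_traffic.lua?ifid=0&duration=600&bpf_filter=port%2053
```

The resulting URL can be passed to curl in double quotes, with the same --cookie option as above.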

I also tried the download pcap on the if_stats.lua page. The downloaded
pcap file seems to only contain incoming data (see wireshark)?

If I just do a tshark on the same interface that ntopng is listening on, I
see all of the expected DNS query & replies. I am not able to correlate
the alerts to any missing packets.



Re: Replies / Requests Ratio
Hi Aaron,

Please see below:

On 5/8/20 10:27 PM, Aaron Scamehorn wrote:
> Thank you for your response.  In the screenshot below, can you please
> explain the significance of the "Date/Time" and the "Duration"
> columns?  What do they mean in this context?

Date/Time: the time when the alert was triggered. ntopng performs
periodic checks in order to trigger alerts. In this particular case, the
check on the replies/requests ratio is performed every 5 minutes, so the
problem started between 07:20 and 07:25.

Duration: the total time during which the problem was active. Again, the
check is performed every 5 minutes for this alert, so 5 minutes is the
granularity.

>
> Do I understand correctly that all 3 hosts triggered the alert at
> 07:25:01 (OR 07:30:01) this morning?  And that all three alerts are
> active for the past 07:28:53  hours?   Does this mean that there have
> been no new additional DNS Reply/Request issues have been detected?
As explained above, the problem started between 07:20 and 07:25. The
problem has been active on all three hosts for 07:28:53 (the
replies/requests ratio threshold was violated for that whole period).
>
> I notice in "Past Alerts" tab, that there are many Reply/Request
> Alerts for the same host with very short durations (screen shot #2). 
> When/how does an alert move from the "Engaged" to "Past" tab?
In this case, the engaged alert becomes a "past" alert when, at one of the
checks performed every 5 minutes, the replies/requests ratio threshold is
no longer violated. This can happen as soon as the next check is
performed (5 minutes).
>
> So in the 2nd screenshot, fire-TV had an alert at 06:20:00 for 05:00
> minutes where 18 requests received 0 replies.  Then another alert at
> 06:50:00 for 05:00 minutes.  Were the 18 replies from the first alert
> ultimately received?  And they were received 5 minutes the alert occurred?

The check is performed on the DNS packet counters. A DNS reply cannot
take 5 minutes to arrive. The fact that the alert was closed after
5/10 minutes could be due to one of these events:

- The host went idle

- The host did not send enough DNS requests

- The new DNS requests made by the host were answered successfully.
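
The three release conditions above can be illustrated with a small sketch of a counter-based check. This is not ntopng's actual code: the names DnsCounters, interval_ratio and alert_engaged, and the min_requests parameter, are hypothetical. The 50% default is the threshold Aaron mentions elsewhere in this thread, and the sample numbers come from the alerts in the first message.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DnsCounters:
    """Cumulative DNS packet counters for one host (hypothetical model)."""
    requests: int
    replies: int


def interval_ratio(prev: DnsCounters, curr: DnsCounters,
                   min_requests: int = 1) -> Optional[float]:
    """Replies/requests percentage for one check interval.

    Returns None when the host was idle or sent fewer than min_requests
    requests, so no ratio can be judged for that interval.
    """
    requests = curr.requests - prev.requests
    replies = curr.replies - prev.replies
    if requests < min_requests:
        return None
    return 100.0 * replies / requests


def alert_engaged(ratio: Optional[float], threshold_pct: float = 50.0) -> bool:
    """The alert stays engaged only while the ratio is below the threshold."""
    return ratio is not None and ratio < threshold_pct


# 93 requests and 3 replies in one interval, as in the pihole alert
ratio = interval_ratio(DnsCounters(0, 0), DnsCounters(93, 3))
print(round(ratio, 1), alert_engaged(ratio))  # 3.2 True
```

In this model, an idle interval or one where replies keep up with requests makes alert_engaged return False, which corresponds to the alert moving to the "Past" tab.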

>
> Context here is that 99% of the traffic is Internet traffic.  Almost
> all of the pihole traffic is to forwarders. BTW, the way pihole works
> (by default) is it replies 0.0.0.0 for blocked hosts.  It should
> respond to every query.
>
> I tried the live_pcap_download.html
> <https://www.ntop.org/guides/ntopng/advanced_features/live_pcap_download.html>
> lua, but couldn't figure out the bpf_filter:
> curl --cookie "user=admin; password=xxxxx"
>  "http://10.12.17.25:3000/lua/live_traffic.lua?ifid=0&duration=600&bpf_filter=\"port
> 53\""
>
> I also tried the download pcap on the if_stats.lua page. The
> downloaded pcap file seems to only contain incoming data (see wireshark)?
This is consistent with the above alerts. Please make sure that ntopng is
not dropping packets, as that would explain this behavior.
>
> If I just do a tshark on the same interface that ntopng is listening
> on, I see all of the expected DNS query & replies.  I am not able to
> correlate the alerts to any missing packets.

See response above.

Regards,

Emanuele

Re: Replies / Requests Ratio
Hi Emanuele,

Thank you again for the detailed responses.

From the interfaces page, I see these stats:
Total Traffic 91.6 GB [103,062,265 Pkts] Dropped Packets 0 Pkts
I don't see any dropped packets on the NIC either:
ethtool -S enp2s0
NIC statistics:
tx_packets: 0
rx_packets: 106581943
tx_errors: 0
rx_errors: 0
rx_missed: 0
align_errors: 0
tx_single_collisions: 0
tx_multi_collisions: 0
unicast: 105432876
broadcast: 350738
multicast: 1149060
tx_aborted: 0
tx_underrun: 0

As of right now, 2 of the hosts we are discussing are still in alert, at
the original Date/Time of 07:25:01, and Duration is now "3 Days, 08:06:59".

Given that my replies vs requests ratio is still configured at 50%, this
means that, at every 5 minute interval for the last 3 Days, 8 hours, said
host is receiving < 50% DNS replies, correct? I find this difficult to
believe, and cannot find ANY missing packets in my pcap file.

I captured a 30-minute pcap file with this command:
tcpdump -i enp2s0 -G 1800 -w /tmp/enp2s0.%FT%T.pcap host edgemax and port 53

This file contains DNS traffic to/from edgemax only.
I can count responses like this:
tshark -t a -r enp2s0.2020-05-11T13:00:02.pcap | grep -c "Standard query response"
349
And queries like this:
tshark -t a -r enp2s0.2020-05-11T13:00:02.pcap | grep -c "Standard query 0x"
349

In other words, no missing DNS responses in the 30 minutes spanning
13:00:02 to 13:29:51.
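
Equal totals do not show which query got which answer. A stricter check is to pair queries and responses by DNS transaction ID. The sketch below assumes its input lines come from something like "tshark -r FILE -Y dns -T fields -e dns.id -e dns.flags.response" (tab-separated ID and response flag). The function name unanswered_queries is hypothetical, and matching on the ID alone is a simplification, since IDs can repeat across clients.

```python
from collections import Counter


def unanswered_queries(lines):
    """Return {transaction_id: count} for queries with no matching response.

    Each line is expected to be "<dns.id>\t<dns.flags.response>". The flag
    is accepted as 1/0 or True/False, since tshark versions differ in how
    they print boolean fields.
    """
    pending = Counter()
    for line in lines:
        parts = line.strip().split("\t")
        if len(parts) != 2:
            continue  # skip malformed or empty lines
        txid, flag = parts
        if flag.lower() in ("1", "true"):
            # A response cancels one outstanding query with the same ID.
            if pending[txid] > 0:
                pending[txid] -= 1
        else:
            pending[txid] += 1
    return {txid: n for txid, n in pending.items() if n > 0}


sample = ["0x0001\t0", "0x0001\t1", "0x0002\t0"]
print(unanswered_queries(sample))  # {'0x0002': 1}
```

An empty result over the whole capture would support the claim that every query in the file was answered.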

I would think that the alert should "clear" because the threshold is not
exceeded within that 30 minute pcap file.

In any case, at 13:23, I manually clicked on the "Release" button for that
alert. 2 minutes later, at 13:25:00, I received this alert:
Host edgemax has received 62 DNS requests but sent 0 DNS replies [5 Minutes
ratio: 0%]

As stated previously, no missing DNS responses in the 30 minutes spanning
13:00:02 to 13:29:51. Why does ntopng think 62 replies are missing?

I exported 10 minutes of PCAP from if_stats.lua. Using the filter
"(ip.dst_host == "10.12.17.1" or ip.src_host == "10.12.17.1") and dns" I am
not able to find any missing DNS responses in Wireshark. Interestingly, if
I specify a BPF filter ("port 53"), the downloaded PCAP file seems to
contain only one side (i.e. edgemax is only a source, never a destination).
Without a BPF filter, the download is fine.




Re: Replies / Requests Ratio
Hi Aaron,

Please see below.

On 5/11/20 9:29 PM, Aaron Scamehorn wrote:
> Hi Emanuele,
>
> Thank you again for the detailed responses.
>
> From the interfaces page, I see these stats:
> Total Traffic 91.6 GB [103,062,265 Pkts] Dropped Packets 0 Pkts
>
> I don't see any dropped packets on the NIC either:
> ethtool -S enp2s0
> NIC statistics:
>      tx_packets: 0
>      rx_packets: 106581943
>      tx_errors: 0
>      rx_errors: 0
>      rx_missed: 0
>      align_errors: 0
>      tx_single_collisions: 0
>      tx_multi_collisions: 0
>      unicast: 105432876
>      broadcast: 350738
>      multicast: 1149060
>      tx_aborted: 0
>      tx_underrun: 0
>
> As of right now, 2 of the hosts we are discussing are still in alert,
> at the original Date/Time of 07:25:01, and Duration is now "3 Days,
> 08:06:59".
>
> Given that my replies vs requests ratio is still configured at 50%,
> this means that, at every 5 minute interval for the last 3 Days, 8
> hours, said host is receiving < 50% DNS replies, correct?  I find this
> difficult to believe, and cannot find ANY missing packets in my pcap file.
>
> I have captured a 30 minute pcap file captured with this command:
> tcpdump -i enp2s0 -G 1800 -w /tmp/enp2s0.%FT%T.pcap host edgemax and
> port 53
>
> This file contains DNS traffic to/from edgemax only.
> I can count responses like this:
> tshark -t a -r enp2s0.2020-05-11T13:00:02.pcap | grep -c "Standard
> query response"
> 349
> And queries like this:
> tshark -t a -r enp2s0.2020-05-11T13:00:02.pcap | grep -c "Standard
> query 0x"
> 349
>
> In other words, no missing DNS responses in the 30 minutes spanning
> 13:00:02 to 13:29:51.
>
> I would think that the alert should "clear" because the threshold is
> not exceeded within that 30 minute pcap file.
>
> In any case, at 13:23, I manually click on the "Release" button for
> that alert.  2 minutes later, at 13:25:00, I receive this alert:
> Host edgemax has received 62 DNS requests but sent 0 DNS replies [5
> Minutes ratio: 0%]
>
> As stated previously, no missing DNS responses in the 30 minutes
> spanning 13:00:02 to 13:29:51.  Why does ntopng think 62 replies are
> missing?

Please share your ntopng.conf. If you look at the active DNS flows in
ntopng, can you identify unidirectional flows? You can also try running
ntopng on the PCAP file (--original-speed -i file.pcap). If you can
reproduce the problem using the PCAP file, please send it to me privately
so that I can troubleshoot it.

>
> I exported 10 minutes of PCAP from if_stats.lua.  Using the filter
> "(ip.dst_host == "10.12.17.1" or ip.src_host == "10.12.17.1") and dns"
> I am not able to find any missing DNS responses in wireshark. 
> Interestingly, If I specify a BPF Filter ("port 53"), the downloaded
> PCAP file seems to only have 1 side (ie. edgemax is only a source,
> never a dest. Without a BPF Filter, the download is fine.

This is probably a bug, please open an issue at
https://github.com/ntop/ntopng.

Regards,

Emanuele

>
>
>
>
> On Mon, May 11, 2020 at 8:59 AM Emanuele Faranda <faranda@ntop.org
> <mailto:faranda@ntop.org>> wrote:
>
> Hi Aaron,
>
> Please see below:
>
> On 5/8/20 10:27 PM, Aaron Scamehorn wrote:
>> Thank you for your response.  In the screenshot below, can you
>> please explain the significance of the "Date/Time" and the
>> "Duration" columns?  What do they mean in this context?
>
> Date/Time: the time when the alert was triggered. Ntopng performs
> periodic checks in order to trigger alerts. In this particular
> case, the check on the requests/reply ratio is performed every 5
> minutes. So this means that problem started between 07:20 and 07:25 .
>
> Duration: the total time in which the problem was active. Again,
> the check is performed every 5 minutes for this alert so 5 minutes
> is the granularity.
>
>>
>> Do I understand correctly that all 3 hosts triggered the alert at
>> 07:25:01 (OR 07:30:01) this morning?  And that all three alerts
>> are active for the past 07:28:53  hours?   Does this mean that
>> there have been no new additional DNS Reply/Request issues have
>> been detected?
> As explained above, the problem started between 07:20 and 07:25 .
> For 07:28:53 hours the problem was active on all the three hosts
> (the requests/reply ratio threshold was exceeded for 07:28:53 hours).
>>
>> I notice in "Past Alerts" tab, that there are many Reply/Request
>> Alerts for the same host with very short durations (screen shot
>> #2).  When/how does an alert move from the "Engaged" to "Past" tab?
> In this case, the engaged alert becomes "past" alert when, after
> the check performed every 5 minutes, the requests/reply ratio
> threshold is not exceed anymore. This can happen as soon as the
> next check is performed (5 minutes).
>>
>> So in the 2nd screenshot, fire-TV had an alert at 06:20:00 for
>> 05:00 minutes where 18 requests received 0 replies.  Then another
>> alert at 06:50:00 for 05:00 minutes.  Were the 18 replies from
>> the first alert ultimately received?  And they were received 5
>> minutes the alert occurred?
>
> The check is performed on the DNS packet counters. A DNS request
> cannot take 5 minutes to be replied. The fact that the alert was
> closed after 5/10 minutes could be related to one of these events:
>
> - The host went idle
>
> - The host did not send enough DNS requests
>
> - The new DNS requests made by the host were successfully replied.
>
>>
>> Context here is that 99% of the traffic is Internet traffic. 
>> Almost all of the pihole traffic is to forwarders.  BTW, the way
>> pihole works (by default) is it replies 0.0.0.0 for blocked
>> hosts.  It should respond to every query.
>>
>> I tried the live_pcap_download.html
>> <https://www.ntop.org/guides/ntopng/advanced_features/live_pcap_download.html>
>> lua, but couldn't figure out the bpf_filter:
>> curl --cookie "user=admin; password=xxxxx"
>>  "http://10.12.17.25:3000/lua/live_traffic.lua?ifid=0&duration=600&bpf_filter=\
>> <http://10.12.17.25:3000/lua/live_traffic.lua?ifid=0&duration=600&bpf_filter=%5C>"port
>> 53\""
>>
>> I also tried the download pcap on the if_stats.lua page.   The
>> downloaded pcap file seems to only contain incoming data (see
>> wireshark)?
> This is consistent with the above alerts, please ensure that
> ntopng is not dropping packets as this would explain this behavior.
>>
>> If I just do a tshark on the same interface that ntopng is
>> listening on, I see all of the expected DNS query & replies.  I
>> am not able to correlate the alerts to any missing packets.
>
> See response above.
>
> Regards,
>
> Emanuele
>
>>
>>
>>
>> On Fri, May 8, 2020 at 2:53 AM Emanuele Faranda <faranda@ntop.org
>> <mailto:faranda@ntop.org>> wrote:
>>
>> Hi Aaron,
>>
>> The alerts that you are reporting basically tell you that
>> such hosts receive DNS requests but do not send a reply. In
>> order to troubleshoot possible problems you should augment
>> such information with the knowledge of your network.
>>
>> The first question to answer is, are that hosts expected to
>> accept DNS requests? If not, are the requests generated from
>> the internet or from the LAN? In the first case a firewall to
>> block such DNS requests may be a good idea . In the latter
>> case some hosts in the LAN may be misconfigured. In case of
>> the pihole hosts, I expect pihole to block some DNS requests
>> for advertisement sites so this could be a normal behaviour.
>> The following ntopng features may also help you:
>>
>> https://www.ntop.org/guides/ntopng/advanced_features/live_pcap_download.html
>>
>> https://www.ntop.org/guides/ntopng/using_with_other_tools/n2disk.html
>>
>> https://www.ntop.org/guides/ntopng/historical_flows.html
>>
>> Regards,
>> Emanuele
>>
>> On 5/7/20 5:57 PM, Aaron Scamehorn wrote:
>>> Hello,
>>>
>>> I'm trying to understand how/why I am getting the "Replies /
>>> Requests Ratio" warnings for DNS.
>>>
>>> I am suspicious of these alerts, and would like to know how/why
>>> they are being generated.  I am suspicious for the
>>> following reasons:  1) If it really is as bad as indicated,
>>> I should notice problems.  2) The "events" occur immediately
>>> after I clear the alerts, and tend to persist for hours.
>>>
>>> In any case, I cleared the alerts last night, and this is
>>> what they look like:
>>>
>>> 06/05/2020 22:15:00 12:31:28 Warning Replies / Requests
>>> Ratio Host edgemax.example.net
>>> <http://xps-630i.scamlan.net:3000/lua/host_details.lua?ifid=2&host=10.12.17.1@1&page=historical&epoch_begin=1588864588&epoch_end=1588868188>
>>> has received 54 DNS requests but sent 0 DNS replies [5
>>> Minutes ratio: 0%]
>>>
>>> 06/05/2020 22:15:00 12:31:28 Warning Replies / Requests
>>> Ratio Host pihole.example.net
>>> <http://xps-630i.scamlan.net:3000/lua/host_details.lua?ifid=2&host=10.12.17.3@1&page=historical&epoch_begin=1588864588&epoch_end=1588868188>
>>> has sent 93 DNS requests but received 3 DNS replies [5
>>> Minutes ratio: 3.2%]
>>> 06/05/2020 22:15:00 12:31:28 Warning Replies / Requests
>>> Ratio Host pihole-2.example.net
>>> <http://xps-630i.scamlan.net:3000/lua/host_details.lua?ifid=2&host=10.12.17.4@1&page=historical&epoch_begin=1588864588&epoch_end=1588868188>
>>> has sent 97 DNS requests but received 1 DNS reply [5 Minutes
>>> ratio: 1.0%]
>>>
>>>
>>>
>>>
>>> _______________________________________________
>>> Ntop mailing list
>>> Ntop@listgateway.unipi.it <mailto:Ntop@listgateway.unipi.it>
>>> http://listgateway.unipi.it/mailman/listinfo/ntop
Re: Replies / Requests Ratio [ In reply to ]
Emanuele,

Here is ntopng.conf
-G=/var/run/ntopng.pid
-i=enp2s0
-m=10.12.17.0/24
-S=local

I do see unidirectional flows in flows_stats.lua for DNS. Incidentally, I
do also see alerts w/ non-zero replies (though most alerts are 0):
Host pihole has sent 211 DNS requests but received 7 DNS replies

I tried 2 different 30 minute PCAP files. In both cases, right at the 10
minute mark, I got alerts. How can I get these PCAP files to you?

Thanks,
Aaron



On Tue, May 12, 2020 at 4:13 AM Emanuele Faranda <faranda@ntop.org> wrote:

> Hi Aaron,
>
> Please see below.
> On 5/11/20 9:29 PM, Aaron Scamehorn wrote:
>
> Hi Emanuele,
>
> Thank you again for the detailed responses.
>
> From the interfaces page, I see these stats:
> Total Traffic 91.6 GB [103,062,265 Pkts] Dropped Packets 0 Pkts
>
> I don't see any dropped packets on the NIC either:
> ethtool -S enp2s0
> NIC statistics:
>      tx_packets: 0
>      rx_packets: 106581943
>      tx_errors: 0
>      rx_errors: 0
>      rx_missed: 0
>      align_errors: 0
>      tx_single_collisions: 0
>      tx_multi_collisions: 0
>      unicast: 105432876
>      broadcast: 350738
>      multicast: 1149060
>      tx_aborted: 0
>      tx_underrun: 0
>
> As of right now, 2 of the hosts we are discussing are still in alert, at
> the original Date/Time of 07:25:01, and Duration is now "3 Days, 08:06:59".
>
> Given that my replies vs requests ratio is still configured at 50%, this
> means that, at every 5 minute interval for the last 3 Days, 8 hours, said
> host is receiving < 50% DNS replies, correct? I find this difficult to
> believe, and cannot find ANY missing packets in my pcap file.
>
> I captured a 30 minute pcap file with this command:
> tcpdump -i enp2s0 -G 1800 -w /tmp/enp2s0.%FT%T.pcap host edgemax and port 53
>
> This file contains DNS traffic to/from edgemax only.
> I can count responses like this:
> tshark -t a -r enp2s0.2020-05-11T13:00:02.pcap | grep -c "Standard query response"
> 349
> And queries like this:
> tshark -t a -r enp2s0.2020-05-11T13:00:02.pcap | grep -c "Standard query 0x"
> 349
>
> In other words, no missing DNS responses in the 30 minutes spanning
> 13:00:02 to 13:29:51.
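
Equal totals from grep do not by themselves show that every query was answered. A minimal Python sketch, assuming tshark's default one-line summaries (the sample lines are hypothetical and not taken from the capture discussed here), pairs queries and responses by DNS transaction ID:

```python
import re
from collections import Counter

# Matches "Standard query 0x1a2b ..." and "Standard query response 0x1a2b ...".
DNS_LINE = re.compile(r"Standard query (response )?(0x[0-9a-fA-F]+)")

def unanswered_ids(lines):
    """Return the transaction IDs of queries that never got a response."""
    pending = Counter()
    for line in lines:
        match = DNS_LINE.search(line)
        if not match:
            continue  # not a DNS summary line
        is_response = match.group(1) is not None
        txid = match.group(2).lower()
        if is_response:
            if pending[txid] > 0:
                pending[txid] -= 1
        else:
            pending[txid] += 1
    return sorted(txid for txid, count in pending.items() if count > 0)

# Hypothetical sample lines in the style of tshark summaries.
sample = [
    "1 0.000 10.12.17.3 -> 10.12.17.1 DNS 74 Standard query 0x1a2b A example.com",
    "2 0.010 10.12.17.1 -> 10.12.17.3 DNS 90 Standard query response 0x1a2b A example.com A 192.0.2.1",
    "3 0.020 10.12.17.3 -> 10.12.17.1 DNS 74 Standard query 0x3c4d A example.org",
]
print(unanswered_ids(sample))  # ['0x3c4d']
```

This is a simplification: transaction IDs are matched globally rather than per client address and port, so it is only a rough cross-check for a capture filtered to a single host.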
>
> I would think that the alert should "clear" because the threshold is not
> exceeded within that 30 minute pcap file.
>
> In any case, at 13:23, I manually click on the "Release" button for that
> alert. 2 minutes later, at 13:25:00, I receive this alert:
> Host edgemax has received 62 DNS requests but sent 0 DNS replies [5
> Minutes ratio: 0%]
>
> As stated previously, no missing DNS responses in the 30 minutes spanning
> 13:00:02 to 13:29:51. Why does ntopng think 62 replies are missing?
>
> Please report your ntopng.conf. If you look at the active ntopng DNS
> flows, can you identify unidirectional flows? You can also try to run
> ntopng on the PCAP file (--original-speed -i file.pcap). If you can
> reproduce using the PCAP file, please send it to me privately so that I can
> troubleshoot the problem.
>
>
> I exported 10 minutes of PCAP from if_stats.lua. Using the filter
> "(ip.dst_host == "10.12.17.1" or ip.src_host == "10.12.17.1") and dns" I am
> not able to find any missing DNS responses in wireshark. Interestingly, if
> I specify a BPF filter ("port 53"), the downloaded PCAP file seems to
> contain only one side (i.e., edgemax is only a source, never a destination).
> Without a BPF filter, the download is fine.
>
> This is probably a bug, please open an issue at
> https://github.com/ntop/ntopng .
>
> Regards,
>
> Emanuele
>
>
>
>
>
> On Mon, May 11, 2020 at 8:59 AM Emanuele Faranda <faranda@ntop.org> wrote:
>
>> Hi Aaron,
>>
>> Please see below:
>> On 5/8/20 10:27 PM, Aaron Scamehorn wrote:
>>
>> Thank you for your response. In the screenshot below, can you please
>> explain the significance of the "Date/Time" and the "Duration" columns?
>> What do they mean in this context?
>>
>> Date/Time: the time when the alert was triggered. Ntopng performs
>> periodic checks in order to trigger alerts. In this particular case, the
>> check on the requests/reply ratio is performed every 5 minutes, so the
>> problem started between 07:20 and 07:25.
>>
>> Duration: the total time during which the problem was active. Again, the
>> check is performed every 5 minutes for this alert, so 5 minutes is the
>> granularity.
>>
>>
>> Do I understand correctly that all 3 hosts triggered the alert at
>> 07:25:01 (or 07:30:01) this morning? And that all three alerts have been
>> active for the past 07:28:53 hours? Does this mean that no additional
>> DNS Reply/Request issues have been detected?
>>
>> As explained above, the problem started between 07:20 and 07:25. For
>> 07:28:53 hours the problem was active on all three hosts (the
>> requests/reply ratio threshold was exceeded for 07:28:53 hours).
>>
>>
>> I notice in the "Past Alerts" tab that there are many Reply/Request Alerts
>> for the same host with very short durations (screenshot #2). When/how
>> does an alert move from the "Engaged" to the "Past" tab?
>>
>> In this case, the engaged alert becomes a "past" alert when, after the
>> check performed every 5 minutes, the requests/reply ratio threshold is no
>> longer exceeded. This can happen as soon as the next check is performed (5
>> minutes).
>>
>>
>> So in the 2nd screenshot, fire-TV had an alert at 06:20:00 for 05:00
>> minutes where 18 requests received 0 replies. Then another alert at
>> 06:50:00 for 05:00 minutes. Were the 18 replies from the first alert
>> ultimately received? And were they received 5 minutes after the alert
>> occurred?
>>
>> The check is performed on the DNS packet counters. A DNS request cannot
>> take 5 minutes to be answered. The fact that the alert was closed after 5/10
>> minutes could be related to one of these events:
>>
>> - The host went idle
>>
>> - The host did not send enough DNS requests
>>
>> - The new DNS requests made by the host were successfully answered.
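
The arithmetic behind the alert text can be sketched in Python as follows. This is not ntopng's actual code: the 50% threshold is the value mentioned in this thread, the assumption that an alert fires when the ratio falls below it comes from the discussion, and `min_requests` and the function names are hypothetical stand-ins for the "not enough DNS requests" case.

```python
from typing import Optional

def replies_ratio(requests: int, replies: int) -> Optional[float]:
    """Replies as a percentage of requests; None when there were no requests."""
    if requests == 0:
        return None
    return 100.0 * replies / requests

def should_alert(requests: int, replies: int,
                 threshold_pct: float = 50.0,
                 min_requests: int = 1) -> bool:
    """Evaluate one 5-minute window of per-host DNS counters.

    min_requests is a hypothetical cut-off for idle or nearly idle hosts.
    """
    ratio = replies_ratio(requests, replies)
    if ratio is None or requests < min_requests:
        return False  # idle host or too few requests: nothing to report
    return ratio < threshold_pct

# Counters taken from the alerts quoted in this thread.
print(round(replies_ratio(93, 3), 1))  # 3.2
print(round(replies_ratio(97, 1), 1))  # 1.0
print(should_alert(54, 0))             # True
print(should_alert(349, 349))          # False
```

Under these assumptions an engaged alert stays engaged only while every consecutive 5-minute window evaluates to True, which matches the three closing conditions listed above.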
>>
>>
>> Context here is that 99% of the traffic is Internet traffic. Almost all
>> of the pihole traffic is to forwarders. BTW, the way pihole works (by
>> default) is it replies 0.0.0.0 for blocked hosts. It should respond to
>> every query.
>>
>> I tried the live_pcap_download.html
>> <https://www.ntop.org/guides/ntopng/advanced_features/live_pcap_download.html>
>> lua, but couldn't figure out the bpf_filter:
>> curl --cookie "user=admin; password=xxxxx" "http://10.12.17.25:3000/lua/live_traffic.lua?ifid=0&duration=600&bpf_filter=\"port 53\""
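
One likely source of trouble in the command above is the unencoded space and the escaped quotes inside the query string. A minimal Python sketch of percent-encoding the filter with the standard library follows. Whether ntopng accepts the filter in exactly this form is an assumption and has not been verified here; the host, port, and parameter names are the ones used in the command above.

```python
from urllib.parse import urlencode, quote

base = "http://10.12.17.25:3000/lua/live_traffic.lua"
params = {"ifid": 0, "duration": 600, "bpf_filter": "port 53"}

# quote_via=quote encodes the space as %20 instead of "+".
url = base + "?" + urlencode(params, quote_via=quote)
print(url)
# http://10.12.17.25:3000/lua/live_traffic.lua?ifid=0&duration=600&bpf_filter=port%2053
```

The resulting URL can then be passed to curl inside a single pair of quotes, with no backslash escapes needed.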
>>
>> I also tried the download pcap on the if_stats.lua page. The downloaded
>> pcap file seems to only contain incoming data (see wireshark)?
>>
>> This is consistent with the above alerts. Please ensure that ntopng is
>> not dropping packets, as this would explain this behavior.
>>
>>
>> If I just do a tshark on the same interface that ntopng is listening on,
>> I see all of the expected DNS query & replies. I am not able to correlate
>> the alerts to any missing packets.
>>
>> See response above.
>>
>> Regards,
>>
>> Emanuele
>>
>>
>>
>>
>> On Fri, May 8, 2020 at 2:53 AM Emanuele Faranda <faranda@ntop.org> wrote:
>>
>>> Hi Aaron,
>>>
>>> The alerts that you are reporting basically tell you that such hosts
>>> receive DNS requests but do not send a reply. In order to troubleshoot
>>> possible problems you should augment such information with the knowledge of
>>> your network.
>>>
>>> The first question to answer is: are those hosts expected to accept DNS
>>> requests? If not, are the requests generated from the internet or from the
>>> LAN? In the first case, a firewall to block such DNS requests may be a good
>>> idea. In the latter case, some hosts in the LAN may be misconfigured. In
>>> the case of the pihole hosts, I expect pihole to block some DNS requests for
>>> advertisement sites, so this could be normal behaviour. The following
>>> ntopng features may also help you:
>>>
>>>
>>> https://www.ntop.org/guides/ntopng/advanced_features/live_pcap_download.html
>>>
>>>
>>> https://www.ntop.org/guides/ntopng/using_with_other_tools/n2disk.html
>>>
>>> https://www.ntop.org/guides/ntopng/historical_flows.html
>>>
>>> Regards,
>>> Emanuele
>>> On 5/7/20 5:57 PM, Aaron Scamehorn wrote:
>>>
>>> Hello,
>>>
>>> I'm trying to understand how/why I am getting the "Replies / Requests
>>> Ratio" warnings for DNS.
>>>
>>> I am suspicious of these alerts, and would like to know how/why they are
>>> being generated. I am suspicious for the following reasons: 1) If it
>>> really is as bad as indicated, I should notice problems. 2) The "events"
>>> occur immediately after I clear the alerts, and tend to persist for hours.
>>>
>>> In any case, I cleared the alerts last night, and this is what they look
>>> like:
>>>
>>> 06/05/2020 22:15:00 12:31:28 Warning Replies / Requests Ratio Host
>>> edgemax.example.net
>>> <http://xps-630i.scamlan.net:3000/lua/host_details.lua?ifid=2&host=10.12.17.1@1&page=historical&epoch_begin=1588864588&epoch_end=1588868188>
>>> has received 54 DNS requests but sent 0 DNS replies [5 Minutes ratio: 0%]
>>>
>>> 06/05/2020 22:15:00 12:31:28 Warning Replies / Requests Ratio Host
>>> pihole.example.net
>>> <http://xps-630i.scamlan.net:3000/lua/host_details.lua?ifid=2&host=10.12.17.3@1&page=historical&epoch_begin=1588864588&epoch_end=1588868188>
>>> has sent 93 DNS requests but received 3 DNS replies [5 Minutes ratio: 3.2%]
>>>
>>> 06/05/2020 22:15:00 12:31:28 Warning Replies / Requests Ratio Host
>>> pihole-2.example.net
>>> <http://xps-630i.scamlan.net:3000/lua/host_details.lua?ifid=2&host=10.12.17.4@1&page=historical&epoch_begin=1588864588&epoch_end=1588868188>
>>> has sent 97 DNS requests but received 1 DNS reply [5 Minutes ratio: 1.0%]
>>>
>>>
>>>
>>>
Re: Replies / Requests Ratio [ In reply to ]
Hi Aaron,

Please contact us privately at faranda@ntop.org and mainardi@ntop.org.
Please ensure that the PCAP files only contain DNS traffic.

Regards,

Emanuele

Re: Replies / Requests Ratio [ In reply to ]
Aaron,

Writing to you here to continue the public discussion. The problem is
that the DNS requests have no VLAN tag, whereas the DNS replies carry
VLAN tag 1. As a result, ntopng splits each DNS flow into two
unidirectional flows. If you want ntopng to ignore the VLAN tag, you can
start it with the --ignore-vlans flag. This should fix your problem.
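
The effect can be illustrated with a small Python sketch. It is only a model of flow keying under stated assumptions, not ntopng's implementation: the key layout, the function names, the client port 40000, and the use of 0 for "untagged" are all hypothetical.

```python
from collections import defaultdict

def flow_key(src, dst, vlan, ignore_vlans=False):
    """Direction-independent flow key; the VLAN is part of it unless ignored."""
    first, second = sorted([src, dst])
    return (first, second, None if ignore_vlans else vlan)

def group_flows(packets, ignore_vlans=False):
    """Map each flow key to the set of directions seen for it."""
    flows = defaultdict(set)
    for src, dst, vlan in packets:
        flows[flow_key(src, dst, vlan, ignore_vlans)].add((src, dst))
    return flows

# One DNS exchange: untagged request (VLAN 0 here), reply tagged with VLAN 1.
packets = [
    (("10.12.17.3", 40000), ("10.12.17.1", 53), 0),
    (("10.12.17.1", 53), ("10.12.17.3", 40000), 1),
]

split = group_flows(packets)
merged = group_flows(packets, ignore_vlans=True)
print(len(split))   # 2
print(len(merged))  # 1
```

With the VLAN in the key, the request and the reply land in two separate flows that each see a single direction, which is how a fully answered query can be counted as a request without a reply.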

Regards,

Emanuele

On 5/13/20 3:06 PM, Emanuele Faranda wrote:
>
> Hi Aaron,
>
> Please contact us privately at faranda@ntop.org and mainardi@ntop.org.
> Please ensure that the PCAP files only contain DNS traffic.
>
> Regards,
>
> Emanuele
>
> On 5/12/20 5:13 PM, Aaron Scamehorn wrote:
>> Emanuele,
>>
>> Here is ntopng.conf
>> -G=/var/run/ntopng.pid
>> -i=enp2s0
>> -m=10.12.17.0/24
>> -S=local
>>
>> I do see unidirectional flows in flows_stats.lua for DNS. 
>> Incidentally, I do also see alerts w/ non-zero replies (though most
>> alerts are 0):
>> Host pihole has sent 211 DNS requests but received 7 DNS replies
>>
>> I tried 2 different 30 minute PCAP files.  In both cases, right at
>> the 10 minute mark, I got alerts.  How can I get these PCAP files to you?
>>
>> Thanks,
>> Aaron
>>
>>
>>
>> On Tue, May 12, 2020 at 4:13 AM Emanuele Faranda <faranda@ntop.org> wrote:
>>
>> Hi Aaron,
>>
>> Please see below.
>>
>> On 5/11/20 9:29 PM, Aaron Scamehorn wrote:
>>> Hi Emanuele,
>>>
>>> Thank you again for the detailed responses.
>>>
>>> From the interfaces page, I see these stats:
>>> Total Traffic 91.6 GB [103,062,265 Pkts] Dropped Packets 0 Pkts
>>>
>>> I don't see any dropped packets on the NIC either:
>>> ethtool -S enp2s0
>>> NIC statistics:
>>>      tx_packets: 0
>>>      rx_packets: 106581943
>>>      tx_errors: 0
>>>      rx_errors: 0
>>>      rx_missed: 0
>>>      align_errors: 0
>>>      tx_single_collisions: 0
>>>      tx_multi_collisions: 0
>>>      unicast: 105432876
>>>      broadcast: 350738
>>>      multicast: 1149060
>>>      tx_aborted: 0
>>>      tx_underrun: 0
>>>
>>> As of right now, 2 of the hosts we are discussing are still in
>>> alert, at the original Date/Time of 07:25:01, and Duration is
>>> now "3 Days, 08:06:59".
>>>
>>> Given that my replies vs requests ratio is still configured at
>>> 50%, this means that, at every 5 minute interval for the last 3
>>> Days, 8 hours, said host is receiving < 50% DNS replies,
>>> correct?  I find this difficult to believe, and cannot find ANY
>>> missing packets in my pcap file.
>>>
>>> I captured a 30 minute pcap file with this command:
>>> tcpdump -i enp2s0 -G 1800 -w /tmp/enp2s0.%FT%T.pcap host edgemax and port 53
>>>
>>> This file contains DNS traffic to/from edgemax only.
>>> I can count responses like this:
>>> tshark -t a -r enp2s0.2020-05-11T13:00:02.pcap | grep -c "Standard query response"
>>> 349
>>> And queries like this:
>>> tshark -t a -r enp2s0.2020-05-11T13:00:02.pcap | grep -c "Standard query 0x"
>>> 349
>>>
>>> In other words, no missing DNS responses in the 30 minutes
>>> spanning 13:00:02 to 13:29:51.
>>>
>>> I would think that the alert should "clear" because the
>>> threshold is not exceeded within that 30 minute pcap file.
>>>
>>> In any case, at 13:23, I manually click on the "Release" button
>>> for that alert.  2 minutes later, at 13:25:00, I receive this alert:
>>> Host edgemax has received 62 DNS requests but sent 0 DNS replies
>>> [5 Minutes ratio: 0%]
>>>
>>> As stated previously, no missing DNS responses in the 30 minutes
>>> spanning 13:00:02 to 13:29:51.  Why does ntopng think 62 replies
>>> are missing?
>>
>> Please report your ntopng.conf. If you look at the active ntopng
>> DNS flows, can you identify unidirectional flows? You can also
>> try to run ntopng on the PCAP file (--original-speed -i
>> file.pcap). If you can reproduce using the PCAP file, please send
>> it to me privately so that I can troubleshoot the problem.
>>
>>>
>>> I exported 10 minutes of PCAP from if_stats.lua.  Using the
>>> filter "(ip.dst_host == "10.12.17.1" or ip.src_host ==
>>> "10.12.17.1") and dns" I am not able to find any missing DNS
>>> responses in wireshark.  Interestingly, if I specify a BPF
>>> filter ("port 53"), the downloaded PCAP file seems to contain only
>>> one side (i.e., edgemax is only a source, never a destination).
>>> Without a BPF filter, the download is fine.
>>
>> This is probably a bug, please open an issue at
>> https://github.com/ntop/ntopng .
>>
>> Regards,
>>
>> Emanuele
>>
>>>
>>>
>>>
>>>
>>> On Mon, May 11, 2020 at 8:59 AM Emanuele Faranda <faranda@ntop.org> wrote:
>>>
>>> Hi Aaron,
>>>
>>> Please see below:
>>>
>>> On 5/8/20 10:27 PM, Aaron Scamehorn wrote:
>>>> Thank you for your response.  In the screenshot below, can
>>>> you please explain the significance of the "Date/Time" and
>>>> the "Duration" columns?  What do they mean in this context?
>>>
>>> Date/Time: the time when the alert was triggered. Ntopng
>>> performs periodic checks in order to trigger alerts. In this
>>> particular case, the check on the requests/reply ratio is
>>> performed every 5 minutes, so the problem started between
>>> 07:20 and 07:25.
>>>
>>> Duration: the total time during which the problem was active.
>>> Again, the check is performed every 5 minutes for this alert,
>>> so 5 minutes is the granularity.
>>>
>>>>
>>>> Do I understand correctly that all 3 hosts triggered the
>>>> alert at 07:25:01 (or 07:30:01) this morning?  And that all
>>>> three alerts have been active for the past 07:28:53 hours?  Does
>>>> this mean that no additional DNS Reply/Request issues have
>>>> been detected?
>>> As explained above, the problem started between 07:20 and
>>> 07:25. For 07:28:53 hours the problem was active on all
>>> three hosts (the requests/reply ratio threshold was exceeded
>>> for 07:28:53 hours).
>>>>
>>>> I notice in the "Past Alerts" tab that there are many
>>>> Reply/Request Alerts for the same host with very short
>>>> durations (screenshot #2).  When/how does an alert move
>>>> from the "Engaged" to the "Past" tab?
>>> In this case, the engaged alert becomes a "past" alert when,
>>> after the check performed every 5 minutes, the
>>> requests/reply ratio threshold is no longer exceeded. This
>>> can happen as soon as the next check is performed (5 minutes).
>>>>
>>>> So in the 2nd screenshot, fire-TV had an alert at 06:20:00
>>>> for 05:00 minutes where 18 requests received 0 replies.
>>>> Then another alert at 06:50:00 for 05:00 minutes.  Were the
>>>> 18 replies from the first alert ultimately received?  And
>>>> were they received 5 minutes after the alert occurred?
>>>
>>> The check is performed on the DNS packet counters. A DNS
>>> request cannot take 5 minutes to be answered. The fact that
>>> the alert was closed after 5/10 minutes could be related to
>>> one of these events:
>>>
>>> - The host went idle
>>>
>>> - The host did not send enough DNS requests
>>>
>>> - The new DNS requests made by the host were successfully
>>> answered.
>>>
>>>>
>>>> Context here is that 99% of the traffic is Internet
>>>> traffic.  Almost all of the pihole traffic is to
>>>> forwarders.  BTW, the way pihole works (by default) is it
>>>> replies 0.0.0.0 for blocked hosts.  It should respond to
>>>> every query.
>>>>
>>>> I tried the live_pcap_download.html
>>>> <https://www.ntop.org/guides/ntopng/advanced_features/live_pcap_download.html>
>>>> lua, but couldn't figure out the bpf_filter:
>>>> curl --cookie "user=admin; password=xxxxx"
>>>>  "http://10.12.17.25:3000/lua/live_traffic.lua?ifid=0&duration=600&bpf_filter=\
>>>> <http://10.12.17.25:3000/lua/live_traffic.lua?ifid=0&duration=600&bpf_filter=%5C>"port
>>>> 53\""
>>>>
>>>> I also tried the download pcap on the if_stats.lua page.  
>>>> The downloaded pcap file seems to only contain incoming
>>>> data (see wireshark)?
>>> This is consistent with the above alerts, please ensure that
>>> ntopng is not dropping packets as this would explain this
>>> behavior.
>>>>
>>>> If I just do a tshark on the same interface that ntopng is
>>>> listening on, I see all of the expected DNS query &
>>>> replies.  I am not able to correlate the alerts to any
>>>> missing packets.
>>>
>>> See response above.
>>>
>>> Regards,
>>>
>>> Emanuele
>>>
>>>>
>>>>
>>>>
>>>> On Fri, May 8, 2020 at 2:53 AM Emanuele Faranda
>>>> <faranda@ntop.org <mailto:faranda@ntop.org>> wrote:
>>>>
>>>> Hi Aaron,
>>>>
>>>> The alerts that you are reporting basically tell you
>>>> that such hosts receive DNS requests but do not send a
>>>> reply. In order to troubleshoot possible problems you
>>>> should augment such information with the knowledge of
>>>> your network.
>>>>
>>>> The first question to answer is, are that hosts
>>>> expected to accept DNS requests? If not, are the
>>>> requests generated from the internet or from the LAN?
>>>> In the first case a firewall to block such DNS requests
>>>> may be a good idea . In the latter case some hosts in
>>>> the LAN may be misconfigured. In case of the pihole
>>>> hosts, I expect pihole to block some DNS requests for
>>>> advertisement sites so this could be a normal
>>>> behaviour. The following ntopng features may also help you:
>>>>
>>>> https://www.ntop.org/guides/ntopng/advanced_features/live_pcap_download.html
>>>>
>>>> https://www.ntop.org/guides/ntopng/using_with_other_tools/n2disk.html
>>>>
>>>> https://www.ntop.org/guides/ntopng/historical_flows.html
>>>>
>>>> Regards,
>>>> Emanuele
>>>>
>>>> On 5/7/20 5:57 PM, Aaron Scamehorn wrote:
>>>>> Hello,
>>>>>
>>>>> I'm trying to understand how/why I am getting the
>>>>> "Replies / Requests Ratio" warnings for DNS.
>>>>>
>>>>> I am suspect of these alerts, and would like to know
>>>>> how/why they are being generated.  I am suspect for
>>>>> for the following reasons:  1) If it really is as bad
>>>>> as indicated, I should notice problems.  2) the
>>>>> "events' occur immediately after I clear the alerts,
>>>>> and tend to persist for hours.
>>>>>
>>>>> In any case, I cleared the alerts last night, and this
>>>>> is what they look like:
>>>>>
>>>>> 06/05/2020 22:15:00 12:31:28 Warning Replies /
>>>>> Requests Ratio Host edgemax.example.net
>>>>> <http://xps-630i.scamlan.net:3000/lua/host_details.lua?ifid=2&host=10.12.17.1@1&page=historical&epoch_begin=1588864588&epoch_end=1588868188>
>>>>> has received 54 DNS requests but sent 0 DNS replies [5
>>>>> Minutes ratio: 0%]
>>>>>
>>>>> 06/05/2020 22:15:00 12:31:28 Warning Replies /
>>>>> Requests Ratio Host pihole.example.net
>>>>> <http://xps-630i.scamlan.net:3000/lua/host_details.lua?ifid=2&host=10.12.17.3@1&page=historical&epoch_begin=1588864588&epoch_end=1588868188>
>>>>> has sent 93 DNS requests but received 3 DNS replies [5
>>>>> Minutes ratio: 3.2%]
>>>>> 06/05/2020 22:15:00 12:31:28 Warning Replies /
>>>>> Requests Ratio Host pihole-2.example.net
>>>>> <http://xps-630i.scamlan.net:3000/lua/host_details.lua?ifid=2&host=10.12.17.4@1&page=historical&epoch_begin=1588864588&epoch_end=1588868188>
>>>>> has sent 97 DNS requests but received 1 DNS reply [5
>>>>> Minutes ratio: 1.0%]
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> Ntop mailing list
>>>>> Ntop@listgateway.unipi.it <mailto:Ntop@listgateway.unipi.it>
>>>>> http://listgateway.unipi.it/mailman/listinfo/ntop
>>>> _______________________________________________
>>>> Ntop mailing list
>>>> Ntop@listgateway.unipi.it
>>>> <mailto:Ntop@listgateway.unipi.it>
>>>> http://listgateway.unipi.it/mailman/listinfo/ntop
>>>>
>>>>
>>>> _______________________________________________
>>>> Ntop mailing list
>>>> Ntop@listgateway.unipi.it <mailto:Ntop@listgateway.unipi.it>
>>>> http://listgateway.unipi.it/mailman/listinfo/ntop
>>> _______________________________________________
>>> Ntop mailing list
>>> Ntop@listgateway.unipi.it <mailto:Ntop@listgateway.unipi.it>
>>> http://listgateway.unipi.it/mailman/listinfo/ntop
>>>
>>>
>>> _______________________________________________
>>> Ntop mailing list
>>> Ntop@listgateway.unipi.it <mailto:Ntop@listgateway.unipi.it>
>>> http://listgateway.unipi.it/mailman/listinfo/ntop
>> _______________________________________________
>> Ntop mailing list
>> Ntop@listgateway.unipi.it <mailto:Ntop@listgateway.unipi.it>
>> http://listgateway.unipi.it/mailman/listinfo/ntop
>>
>>
>> _______________________________________________
>> Ntop mailing list
>> Ntop@listgateway.unipi.it
>> http://listgateway.unipi.it/mailman/listinfo/ntop
>
> _______________________________________________
> Ntop mailing list
> Ntop@listgateway.unipi.it
> http://listgateway.unipi.it/mailman/listinfo/ntop
Re: Replies / Requests Ratio [ In reply to ]
Interesting. I do recall seeing vlan tags on some but not all of the flows
in ntopng.

Looking at the pcaps now, I do see that traffic from the 2 pi-hole hosts
has vlan tags whereas other hosts have no vlan tag. So, the switch that
the pi-holes are connected to is adding vlan tags?

Anyway, I ran the 30 minute pcap file with the --ignore-vlans option, and
agree that it does resolve the issue with the pcap file.
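
A rough illustration of why ignoring the VLAN tag helps, as a Python sketch.
It assumes (this is not ntopng's actual code) that a flow is keyed on the VLAN
id plus the sorted pair of endpoints, so a request without a tag and its reply
with VLAN tag 1 land in different flows. The names flow_key and count_flows,
and the sample ports, are made up for the example.

```python
from collections import defaultdict

def flow_key(pkt, ignore_vlans=False):
    """Build a direction-independent flow key (hypothetical, not ntopng code)."""
    vlan = 0 if ignore_vlans else pkt["vlan"]
    # Sort the endpoints so a request and its reply map to the same key.
    a = (pkt["src"], pkt["sport"])
    b = (pkt["dst"], pkt["dport"])
    return (vlan,) + tuple(sorted([a, b]))

def count_flows(packets, ignore_vlans=False):
    """Group packets into flows; return {key: set of directions seen}."""
    flows = defaultdict(set)
    for pkt in packets:
        flows[flow_key(pkt, ignore_vlans)].add((pkt["src"], pkt["dst"]))
    return flows

# Illustrative packets: an untagged DNS request and its reply tagged VLAN 1.
packets = [
    {"vlan": 0, "src": "10.12.17.3", "sport": 40000, "dst": "10.12.17.1", "dport": 53},
    {"vlan": 1, "src": "10.12.17.1", "sport": 53, "dst": "10.12.17.3", "dport": 40000},
]

split = count_flows(packets)
merged = count_flows(packets, ignore_vlans=True)
print(len(split), "flows with the VLAN id in the key")
print(len(merged), "flow when the VLAN tag is ignored")
```

With the VLAN id in the key, each flow only ever sees one direction, which is
the kind of split that would leave requests without matching replies.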

Adding that config to the "prod" ntopng apparently introduces new
problems. I am now getting Replies / Requests Ratio alerts for HTTP on
various hosts. I have not seen these alerts before. These do not have the
prolonged duration that the DNS alerts were having; rather, these are all
of the 5 minute duration.

Could this be a boundary issue? Could a client send the requests in one 5
minute window, and the responses arrive in the next 5 minute window?
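
The boundary question can be reasoned about with a small Python sketch. It
assumes the check simply compares the request and reply counts seen within
each 5 minute window (the thread only says the check runs on packet counters
every 5 minutes; the exact logic in ntopng is not shown here). The function
window_ratios and the sample timestamps are hypothetical.

```python
from collections import Counter

WINDOW = 300  # seconds, the 5 minute check interval mentioned in the thread

def window_ratios(requests, replies, window=WINDOW):
    """Return {window_index: (n_requests, n_replies, ratio_percent_or_None)}."""
    req = Counter(int(t // window) for t in requests)
    rep = Counter(int(t // window) for t in replies)
    out = {}
    for w in sorted(set(req) | set(rep)):
        n_req, n_rep = req[w], rep[w]
        # The ratio is undefined when a window holds no requests.
        ratio = 100.0 * n_rep / n_req if n_req else None
        out[w] = (n_req, n_rep, ratio)
    return out

# 100 requests spread over the first window, each answered 50 ms later.
requests = [i * 3.0 for i in range(100)]
replies = [t + 0.05 for t in requests]
# One extra request sent just before the boundary, answered just after it.
requests.append(299.99)
replies.append(300.04)

for w, (n_req, n_rep, ratio) in window_ratios(requests, replies).items():
    print(w, n_req, n_rep, ratio)
```

Under this assumption only the few requests that straddle a boundary are
counted without their reply, so a boundary effect alone seems unlikely to
push a busy host below a 50% threshold.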

Aaron




On Wed, May 13, 2020 at 8:48 AM Emanuele Faranda <faranda@ntop.org> wrote:

> Aaron,
>
> Writing to you here to continue the public discussion. The problem is that
> the DNS requests have no VLAN tag whereas the DNS replies have the VLAN tag
> 1. So ntopng splits the DNS flows in two monodirectional flows. If you want
> to ignore the VLAN tag in ntopng you can use the --ignore-vlans flag in
> ntopng. This should fix your problem.
>
> Regards,
>
> Emanuele
Re: Replies / Requests Ratio [ In reply to ]
Hi Aaron,

The alerts on HTTP traffic should not be linked to the --ignore-vlans
option, as adding this option should actually improve the requests vs
reply ratio for HTTP as well, so I expect fewer alerts to be generated
than before.

Anyway, please monitor the situation and if you still think that there
is such a problem please provide a PCAP file privately with the HTTP
traffic so that we can inspect it.

Regards,

Emanuele

Re: Replies / Requests Ratio [ In reply to ]
Hi Emanuele,

It's been about 9 days since I added the --ignore-vlans option. The behavior
has definitely changed; however, I continue to get Replies / Requests Ratio
alerts.
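
For readers following the thread, here is a minimal sketch of why the VLAN tag mattered, based on Emanuele's explanation quoted further down (requests untagged, replies tagged with VLAN 1). The `Packet` type, the `flow_key` function and the port number are hypothetical names made up for illustration; this is not ntopng's code, only the general idea of a flow key that includes the VLAN ID.

```python
from typing import NamedTuple, Optional, Tuple


class Packet(NamedTuple):
    """Hypothetical, simplified view of a captured packet."""
    src: str
    dst: str
    sport: int
    dport: int
    vlan: Optional[int]  # None means the frame carried no VLAN tag


def flow_key(pkt: Packet, ignore_vlan: bool = False) -> Tuple:
    """Build a direction-independent flow key.

    The two endpoints are sorted so that a request and its reply map to
    the same key. The VLAN ID is part of the key unless ignore_vlan is set.
    """
    a, b = sorted([(pkt.src, pkt.sport), (pkt.dst, pkt.dport)])
    vlan = None if ignore_vlan else pkt.vlan
    return (a, b, vlan)


# A DNS request seen without a tag, and its reply seen with VLAN tag 1.
request = Packet("10.12.17.3", "10.12.17.1", 40000, 53, None)
reply = Packet("10.12.17.1", "10.12.17.3", 53, 40000, 1)

# With the VLAN in the key the two packets land in different flows,
# so each flow looks unidirectional.
print(flow_key(request) == flow_key(reply))

# Ignoring the VLAN puts both directions back into one flow.
print(flow_key(request, ignore_vlan=True) == flow_key(reply, ignore_vlan=True))
```

With the tag in the key, the request flow never sees a reply and the reply flow never sees a request, which is consistent with the "0 DNS replies" alerts earlier in the thread.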

I get far fewer alerts for DNS. As mentioned earlier, I am now getting
alerts for HTTP. Over 600 in the last 9 days:

"msg": "Host edgemax has received 117 HTTP requests but sent 51 HTTP replies [5 Minutes ratio: 43.2%] "
"msg": "Host edgemax has received 117 HTTP requests but sent 51 HTTP replies [5 Minutes ratio: 43.2%] "
"msg": "Host edgemax has received 117 HTTP requests but sent 51 HTTP replies [5 Minutes ratio: 43.2%] "
"msg": "Host edgemax has received 117 HTTP requests but sent 51 HTTP replies [5 Minutes ratio: 43.2%] "
"msg": "Host edgemax has received 118 HTTP requests but sent 51 HTTP replies [5 Minutes ratio: 42.9%] "
"msg": "Host edgemax has received 100 HTTP requests but sent 34 HTTP replies [5 Minutes ratio: 33.7%] "
"msg": "Host edgemax has received 78 HTTP requests but sent 34 HTTP replies [5 Minutes ratio: 43.0%] "

The duration is usually 10 minutes or less.
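
A side note on the percentages in these messages: they are not simply replies / requests. For example 51 / 117 would be 43.6%, while the alert reports 43.2%. Every figure quoted in this thread is reproduced by replies / (requests + 1). The sketch below checks that. The "+ 1" is inferred from the logged numbers only and has not been checked against the ntopng source; the function names are hypothetical, and the 50% threshold is the value mentioned in the quoted discussion below.

```python
THRESHOLD_PERCENT = 50.0  # value mentioned in the quoted discussion


def reply_ratio(requests: int, replies: int) -> float:
    """Replies vs requests, in percent, rounded to one decimal.

    Assumption: the denominator is requests + 1. This is inferred from the
    alert messages in this thread, not taken from the ntopng code.
    """
    return round(100.0 * replies / (requests + 1), 1)


def would_alert(requests: int, replies: int) -> bool:
    """True when the ratio falls below the configured threshold."""
    return reply_ratio(requests, replies) < THRESHOLD_PERCENT


# (requests, replies, percentage reported in the alert message)
samples = [
    (117, 51, 43.2),
    (118, 51, 42.9),
    (100, 34, 33.7),
    (78, 34, 43.0),
    (93, 3, 3.2),
    (97, 1, 1.0),
    (54, 0, 0.0),
]

for requests, replies, reported in samples:
    computed = reply_ratio(requests, replies)
    print(requests, replies, reported, computed, would_alert(requests, replies))
```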

I've sent you a PCAP file to reproduce.

Aaron


On Fri, May 15, 2020 at 4:26 AM Emanuele Faranda <faranda@ntop.org> wrote:

> Hi Aaron,
>
> The alerts on HTTP traffic should not be linked to the --ignore-vlans
> option: adding that option should actually improve the requests vs replies
> ratio for HTTP as well, so I would expect fewer alerts to be generated than
> before.
>
> Anyway, please monitor the situation and if you still think that there is
> such a problem please provide a PCAP file privately with the HTTP traffic
> so that we can inspect it.
>
> Regards,
>
> Emanuele
> On 5/13/20 4:55 PM, Aaron Scamehorn wrote:
>
> Interesting. I do recall seeing vlan tags on some but not all of the
> flows in ntopng.
>
> Looking at the pcaps now, I do see that traffic from the 2 pi-hole hosts
> has VLAN tags whereas other hosts have no VLAN tag. So, is the switch that
> the pi-holes are connected to adding VLAN tags?
>
> Anyway, I ran the 30 minute pcap file with the --ignore-vlans config, and
> agree that does resolve the issue with the pcap file.
>
> Adding that config to the "prod" ntopng apparently introduces new
> problems. I am now getting Replies / Requests Ratio alerts for HTTP on
> various hosts. I have not seen these alerts before. These do not have the
> prolonged duration that the DNS alerts were having; rather, these are all
> of the 5 minute duration.
>
> Could this be a boundary issue? Could a client send the requests in one
> 5-minute window and receive the responses in the next 5-minute window?
>
> Aaron
>
>
>
>
> On Wed, May 13, 2020 at 8:48 AM Emanuele Faranda <faranda@ntop.org> wrote:
>
>> Aaron,
>>
>> Writing to you here to continue the public discussion. The problem is
>> that the DNS requests have no VLAN tag whereas the DNS replies have the
>> VLAN tag 1. So ntopng splits each DNS flow into two unidirectional flows.
>> If you want ntopng to ignore the VLAN tag, you can use the --ignore-vlans
>> flag. This should fix your problem.
>>
>> Regards,
>>
>> Emanuele
>> On 5/13/20 3:06 PM, Emanuele Faranda wrote:
>>
>> Hi Aaron,
>>
>> Please contact us privately at faranda@ntop.org and mainardi@ntop.org .
>> Please ensure that the PCAP files only contain DNS traffic.
>>
>> Regards,
>>
>> Emanuele
>> On 5/12/20 5:13 PM, Aaron Scamehorn wrote:
>>
>> Emanuele,
>>
>> Here is ntopng.conf
>> -G=/var/run/ntopng.pid
>> -i=enp2s0
>> -m=10.12.17.0/24
>> -S=local
>>
>> I do see unidirectional flows in flows_stats.lua for DNS. Incidentally,
>> I do also see alerts w/ non-zero replies (though most alerts are 0):
>> Host pihole has sent 211 DNS requests but received 7 DNS replies
>>
>> I tried 2 different 30 minute PCAP files. In both cases, right at the 10
>> minute mark, I got alerts. How can I get these PCAP files to you?
>>
>> Thanks,
>> Aaron
>>
>>
>>
>> On Tue, May 12, 2020 at 4:13 AM Emanuele Faranda <faranda@ntop.org>
>> wrote:
>>
>>> Hi Aaron,
>>>
>>> Please see below.
>>> On 5/11/20 9:29 PM, Aaron Scamehorn wrote:
>>>
>>> Hi Emanuele,
>>>
>>> Thank you again for the detailed responses.
>>>
>>> From the interfaces page, I see these stats:
>>> Total Traffic 91.6 GB [103,062,265 Pkts] Dropped Packets 0 Pkts
>>> I don't see any dropped packets on the NIC either:
>>> ethtool -S enp2s0
>>> NIC statistics:
>>> tx_packets: 0
>>> rx_packets: 106581943
>>> tx_errors: 0
>>> rx_errors: 0
>>> rx_missed: 0
>>> align_errors: 0
>>> tx_single_collisions: 0
>>> tx_multi_collisions: 0
>>> unicast: 105432876
>>> broadcast: 350738
>>> multicast: 1149060
>>> tx_aborted: 0
>>> tx_underrun: 0
>>>
>>> As of right now, 2 of the hosts we are discussing are still in alert, at
>>> the original Date/Time of 07:25:01, and Duration is now "3 Days, 08:06:59".
>>>
>>> Given that my replies vs requests ratio is still configured at 50%, this
>>> means that, at every 5 minute interval for the last 3 Days, 8 hours, said
>>> host is receiving < 50% DNS replies, correct? I find this difficult to
>>> believe, and cannot find ANY missing packets in my pcap file.
>>>
>>> I captured a 30-minute pcap file with this command:
>>> tcpdump -i enp2s0 -G 1800 -w /tmp/enp2s0.%FT%T.pcap host edgemax and port 53
>>>
>>> This file contains DNS traffic to/from edgemax only.
>>> I can count responses like this:
>>> tshark -t a -r enp2s0.2020-05-11T13:00:02.pcap | grep -c "Standard query response"
>>> 349
>>> And queries like this:
>>> tshark -t a -r enp2s0.2020-05-11T13:00:02.pcap | grep -c "Standard query 0x"
>>> 349
>>>
>>> In other words, no missing DNS responses in the 30 minutes spanning
>>> 13:00:02 to 13:29:51.
>>>
>>> I would think that the alert should "clear" because the threshold is not
>>> exceeded within that 30 minute pcap file.
>>>
>>> In any case, at 13:23, I manually click on the "Release" button for that
>>> alert. 2 minutes later, at 13:25:00, I receive this alert:
>>> Host edgemax has received 62 DNS requests but sent 0 DNS replies [5
>>> Minutes ratio: 0%]
>>>
>>> As stated previously, no missing DNS responses in the 30 minutes
>>> spanning 13:00:02 to 13:29:51. Why does ntopng think 62 replies are
>>> missing?
>>>
>>> Please report your ntopng.conf. If you look at the active ntopng DNS
>>> flows, can you identify unidirectional flows? You can also try to run
>>> ntopng on the PCAP file (--original-speed -i file.pcap). If you can
>>> reproduce using the PCAP file, please send it to me privately so that I can
>>> troubleshoot the problem.
>>>
>>>
>>> I exported 10 minutes of PCAP from if_stats.lua. Using the filter
>>> "(ip.dst_host == "10.12.17.1" or ip.src_host == "10.12.17.1") and dns" I am
>>> not able to find any missing DNS responses in Wireshark. Interestingly, if
>>> I specify a BPF filter ("port 53"), the downloaded PCAP file seems to have
>>> only one side (i.e. edgemax is only a source, never a destination). Without
>>> a BPF filter, the download is fine.
>>>
>>> This is probably a bug, please open an issue at
>>> https://github.com/ntop/ntopng .
>>>
>>> Regards,
>>>
>>> Emanuele
>>>
>>>
>>>
>>>
>>>
>>> On Mon, May 11, 2020 at 8:59 AM Emanuele Faranda <faranda@ntop.org>
>>> wrote:
>>>
>>>> Hi Aaron,
>>>>
>>>> Please see below:
>>>> On 5/8/20 10:27 PM, Aaron Scamehorn wrote:
>>>>
>>>> Thank you for your response. In the screenshot below, can you please
>>>> explain the significance of the "Date/Time" and the "Duration" columns?
>>>> What do they mean in this context?
>>>>
>>>> Date/Time: the time when the alert was triggered. ntopng performs
>>>> periodic checks in order to trigger alerts. In this particular case, the
>>>> check on the requests/replies ratio is performed every 5 minutes, so the
>>>> problem started between 07:20 and 07:25.
>>>>
>>>> Duration: the total time in which the problem was active. Again, the
>>>> check is performed every 5 minutes for this alert so 5 minutes is the
>>>> granularity.
>>>>
>>>>
>>>> Do I understand correctly that all 3 hosts triggered the alert at
>>>> 07:25:01 (OR 07:30:01) this morning? And that all three alerts have been
>>>> active for the past 07:28:53 hours? Does this mean that no additional
>>>> DNS Reply/Request issues have been detected?
>>>>
>>>> As explained above, the problem started between 07:20 and 07:25 . For
>>>> 07:28:53 hours the problem was active on all the three hosts (the
>>>> requests/reply ratio threshold was exceeded for 07:28:53 hours).
>>>>
>>>>
>>>> I notice in "Past Alerts" tab, that there are many Reply/Request Alerts
>>>> for the same host with very short durations (screen shot #2). When/how
>>>> does an alert move from the "Engaged" to "Past" tab?
>>>>
>>>> In this case, the engaged alert becomes a "past" alert when, after the
>>>> check performed every 5 minutes, the requests/replies ratio threshold is
>>>> no longer exceeded. This can happen as soon as the next check is
>>>> performed (5 minutes).
>>>>
>>>>
>>>> So in the 2nd screenshot, fire-TV had an alert at 06:20:00 for 05:00
>>>> minutes where 18 requests received 0 replies. Then another alert at
>>>> 06:50:00 for 05:00 minutes. Were the 18 replies from the first alert
>>>> ultimately received? And were they received 5 minutes after the alert
>>>> occurred?
>>>>
>>>> The check is performed on the DNS packet counters. A DNS request cannot
>>>> take 5 minutes to be answered. The fact that the alert was closed after
>>>> 5/10 minutes could be related to one of these events:
>>>>
>>>> - The host went idle
>>>>
>>>> - The host did not send enough DNS requests
>>>>
>>>> - The new DNS requests made by the host were successfully answered.
>>>>
>>>>
>>>> Context here is that 99% of the traffic is Internet traffic. Almost
>>>> all of the pihole traffic is to forwarders. BTW, the way pihole works (by
>>>> default) is it replies 0.0.0.0 for blocked hosts. It should respond to
>>>> every query.
>>>>
>>>> I tried the live_pcap_download.html
>>>> <https://www.ntop.org/guides/ntopng/advanced_features/live_pcap_download.html>
>>>> lua, but couldn't figure out the bpf_filter:
>>>> curl --cookie "user=admin; password=xxxxx" "http://10.12.17.25:3000/lua/live_traffic.lua?ifid=0&duration=600&bpf_filter=\"port 53\""
>>>>
>>>> I also tried the download pcap on the if_stats.lua page. The
>>>> downloaded pcap file seems to only contain incoming data (see wireshark)?
>>>>
>>>> This is consistent with the above alerts. Please ensure that ntopng is
>>>> not dropping packets, as this would explain the behavior.
>>>>
>>>>
>>>> If I just do a tshark on the same interface that ntopng is listening
>>>> on, I see all of the expected DNS query & replies. I am not able to
>>>> correlate the alerts to any missing packets.
>>>>
>>>> See response above.
>>>>
>>>> Regards,
>>>>
>>>> Emanuele
>>>>
>>>>
>>>>
>>>>
>>>> On Fri, May 8, 2020 at 2:53 AM Emanuele Faranda <faranda@ntop.org>
>>>> wrote:
>>>>
>>>>> Hi Aaron,
>>>>>
>>>>> The alerts that you are reporting basically tell you that such hosts
>>>>> receive DNS requests but do not send a reply. In order to troubleshoot
>>>>> possible problems you should augment such information with the knowledge of
>>>>> your network.
>>>>>
>>>>> The first question to answer is: are those hosts expected to accept DNS
>>>>> requests? If not, are the requests generated from the internet or from the
>>>>> LAN? In the first case a firewall to block such DNS requests may be a good
>>>>> idea. In the latter case some hosts in the LAN may be misconfigured. In
>>>>> case of the pihole hosts, I expect pihole to block some DNS requests for
>>>>> advertisement sites so this could be a normal behaviour. The following
>>>>> ntopng features may also help you:
>>>>>
>>>>>
>>>>> https://www.ntop.org/guides/ntopng/advanced_features/live_pcap_download.html
>>>>>
>>>>>
>>>>> https://www.ntop.org/guides/ntopng/using_with_other_tools/n2disk.html
>>>>>
>>>>> https://www.ntop.org/guides/ntopng/historical_flows.html
>>>>>
>>>>> Regards,
>>>>> Emanuele
>>>>> On 5/7/20 5:57 PM, Aaron Scamehorn wrote:
>>>>>
>>>>> Hello,
>>>>>
>>>>> I'm trying to understand how/why I am getting the "Replies / Requests
>>>>> Ratio" warnings for DNS.
>>>>>
>>>>> I am suspicious of these alerts, and would like to know how/why they are
>>>>> being generated. I am suspicious for the following reasons: 1) If it
>>>>> really is as bad as indicated, I should notice problems. 2) The "events"
>>>>> occur immediately after I clear the alerts, and tend to persist for hours.
>>>>>
>>>>> In any case, I cleared the alerts last night, and this is what they
>>>>> look like:
>>>>>
>>>>> 06/05/2020 22:15:00 12:31:28 Warning Replies / Requests Ratio Host
>>>>> edgemax.example.net
>>>>> <http://xps-630i.scamlan.net:3000/lua/host_details.lua?ifid=2&host=10.12.17.1@1&page=historical&epoch_begin=1588864588&epoch_end=1588868188>
>>>>> has received 54 DNS requests but sent 0 DNS replies [5 Minutes ratio: 0%]
>>>>>
>>>>> 06/05/2020 22:15:00 12:31:28 Warning Replies / Requests Ratio Host
>>>>> pihole.example.net
>>>>> <http://xps-630i.scamlan.net:3000/lua/host_details.lua?ifid=2&host=10.12.17.3@1&page=historical&epoch_begin=1588864588&epoch_end=1588868188>
>>>>> has sent 93 DNS requests but received 3 DNS replies [5 Minutes ratio: 3.2%]
>>>>>
>>>>> 06/05/2020 22:15:00 12:31:28 Warning Replies / Requests Ratio Host
>>>>> pihole-2.example.net
>>>>> <http://xps-630i.scamlan.net:3000/lua/host_details.lua?ifid=2&host=10.12.17.4@1&page=historical&epoch_begin=1588864588&epoch_end=1588868188>
>>>>> has sent 97 DNS requests but received 1 DNS reply [5 Minutes ratio: 1.0%]
Re: Replies / Requests Ratio [ In reply to ]
Hi Aaron,

ntopng did not account for Ethernet frame padding, which caused ACK
packets to be parsed as HTTP replies, so the actual HTTP replies in
subsequent packets were ignored. This is fixed in commit
https://github.com/ntop/ntopng/commit/cba3ab2ea6258895fa7270330607f491fa942c47
A new package will be available in one hour. Please confirm that it
works on live traffic.

Regards,

Emanuele

On 5/22/20 7:11 AM, Aaron Scamehorn wrote:
> Hi Emanuele,
>
> It's been about 9 days since I added the --ignore-vlans option.  The
> behavior has definitely changed; however, I continue to get Replies /
> Requests Ratio alerts.
>
> I get far fewer alerts for DNS.  As mentioned earlier, I am now
> getting alerts for HTTP.  Over 600 in the last 9 days:
>
>     "msg": "Host edgemax has received 117 HTTP requests but sent 51 HTTP replies [5 Minutes ratio: 43.2%] "
>     "msg": "Host edgemax has received 117 HTTP requests but sent 51 HTTP replies [5 Minutes ratio: 43.2%] "
>     "msg": "Host edgemax has received 117 HTTP requests but sent 51 HTTP replies [5 Minutes ratio: 43.2%] "
>     "msg": "Host edgemax has received 117 HTTP requests but sent 51 HTTP replies [5 Minutes ratio: 43.2%] "
>     "msg": "Host edgemax has received 118 HTTP requests but sent 51 HTTP replies [5 Minutes ratio: 42.9%] "
>     "msg": "Host edgemax has received 100 HTTP requests but sent 34 HTTP replies [5 Minutes ratio: 33.7%] "
>     "msg": "Host edgemax has received 78 HTTP requests but sent 34 HTTP replies [5 Minutes ratio: 43.0%] "
>
> The duration is usually 10 minutes or less.
>
> I've sent you a PCAP file to reproduce.
>
> Aaron
>
>
> On Fri, May 15, 2020 at 4:26 AM Emanuele Faranda <faranda@ntop.org
> <mailto:faranda@ntop.org>> wrote:
>
> Hi Aaron,
>
> The alerts on HTTP traffic should not be linked to the
> --ignore-vlan option, as adding such option should actually
> improve the requests vs reply ratio also in case of HTTP so I
> expect less alerts to be generated than before.
>
> Anyway, please monitor the situation and if you still think that
> there is such a problem please provide a PCAP file privately with
> the HTTP traffic so that we can inspect it.
>
> Regards,
>
> Emanuele
>
> On 5/13/20 4:55 PM, Aaron Scamehorn wrote:
>> Interesting.  I do recall seeing vlan tags on some but not all of
>> the flows in ntopng.
>>
>> Looking at the pcaps now, I do see that traffic from the 2
>> pi-hole hosts have vlan tags whereas other hosts have no vlan
>> tag.  So, the switch that the pi-holes is adding vlan tags?
>>
>> Anyway, I ran the 30 minute pcap file with the --ignore-vlan
>> config, and agree that does resolve the issue with the pcap file.
>>
>> Adding that config to the "prod" ntopng apparently introduces new
>> problems.  I am now getting Replies / Requests Ratio alerts for
>> HTTP on various hosts.  I have not seen these alerts before. 
>> These do not have the prolonged duration that the DNS alerts were
>> having; rather, these are all of the 5 minute duration.
>>
>> Could this be a boundary issue?  Could client send the requests
>> in one 5 minute window, and the responses are on the next 5
>> minute window?
>>
>> Aaron
>>
>>
>>
>>
>> On Wed, May 13, 2020 at 8:48 AM Emanuele Faranda
>> <faranda@ntop.org <mailto:faranda@ntop.org>> wrote:
>>
>> Aaron,
>>
>> Writing to you here to continue the public discussion. The
>> problem is that the DNS requests have no VLAN tag whereas the
>> DNS replies have the VLAN tag 1. So ntopng splits the DNS
>> flows in two monodirectional flows. If you want to ignore the
>> VLAN tag in ntopng you can use the --ignore-vlans flag in
>> ntopng. This should fix your problem.
>>
>> Regards,
>>
>> Emanuele
>>
>> On 5/13/20 3:06 PM, Emanuele Faranda wrote:
>>>
>>> Hi Aaron,
>>>
>>> Please contact us privately at faranda@ntop.org
>>> <mailto:faranda@ntop.org> and mainardi@ntop.org
>>> <mailto:mainardi@ntop.org> . Please ensure that the PCAP
>>> files only contain DNS traffic.
>>>
>>> Regards,
>>>
>>> Emanuele
>>>
>>> On 5/12/20 5:13 PM, Aaron Scamehorn wrote:
>>>> Emanuele,
>>>>
>>>> Here is ntopng.conf
>>>> -G=/var/run/ntopng.pid
>>>> -i=enp2s0
>>>> -m=10.12.17.0/24 <http://10.12.17.0/24>
>>>> -S=local
>>>>
>>>> I do see unidirectional flows in flows_stats.lua for DNS. 
>>>> Incidentally, I do also see alerts w/ non-zero replies
>>>> (though most alerts are 0):
>>>> Host pihole has sent 211 DNS requests but received 7 DNS
>>>> replies
>>>>
>>>> I tried 2 different 30 minute PCAP files.  In both cases,
>>>> right at the 10 minute mark, I got alerts.  How can I get
>>>> these PCAP files to you?
>>>>
>>>> Thanks,
>>>> Aaron
>>>>
>>>>
>>>>
>>>> On Tue, May 12, 2020 at 4:13 AM Emanuele Faranda
>>>> <faranda@ntop.org <mailto:faranda@ntop.org>> wrote:
>>>>
>>>> Hi Aaron,
>>>>
>>>> Please see below.
>>>>
>>>> On 5/11/20 9:29 PM, Aaron Scamehorn wrote:
>>>>> Hi Emanuele,
>>>>>
>>>>> Thank you again for the detailed responses.
>>>>>
>>>>> From the interfaces page, I see these stats:
>>>>> Total Traffic 91.6 GB [103,062,265 Pkts] Dropped
>>>>> Packets 0 Pkts
>>>>>
>>>>> I don't see any dropped packets on the NIC either:
>>>>> ethtool -S enp2s0
>>>>> NIC statistics:
>>>>>      tx_packets: 0
>>>>>      rx_packets: 106581943
>>>>>      tx_errors: 0
>>>>>      rx_errors: 0
>>>>>      rx_missed: 0
>>>>>      align_errors: 0
>>>>>      tx_single_collisions: 0
>>>>>      tx_multi_collisions: 0
>>>>>      unicast: 105432876
>>>>>      broadcast: 350738
>>>>>      multicast: 1149060
>>>>>      tx_aborted: 0
>>>>>      tx_underrun: 0
>>>>>
>>>>> As of right now, 2 of the hosts we are discussing are
>>>>> still in alert, at the original Date/Time of 07:25:01,
>>>>> and Duration is now "3 Days, 08:06:59".
>>>>>
>>>>> Given that my replies vs requests ratio is still
>>>>> configured at 50%, this means that, at every 5 minute
>>>>> interval for the last 3 Days, 8 hours, said host is
>>>>> receiving < 50% DNS replies, correct?  I find this
>>>>> difficult to believe, and cannot find ANY missing
>>>>> packets in my pcap file.
>>>>>
>>>>> I have captured a 30 minute pcap file captured with
>>>>> this command:
>>>>> tcpdump -i enp2s0 -G 1800 -w /tmp/enp2s0.%FT%T.pcap
>>>>> host edgemax and port 53
>>>>>
>>>>> This file contains DNS traffic to/from edgemax only.
>>>>> I can count responses like this:
>>>>> tshark -t a -r enp2s0.2020-05-11T13:00:02.pcap | grep
>>>>> -c "Standard query response"
>>>>> 349
>>>>> And queries like this:
>>>>> tshark -t a -r enp2s0.2020-05-11T13:00:02.pcap | grep
>>>>> -c "Standard query 0x"
>>>>> 349
>>>>>
>>>>> In other words, no missing DNS responses in the 30
>>>>> minutes spanning 13:00:02 to 13:29:51.
>>>>>
>>>>> I would think that the alert should "clear" because
>>>>> the threshold is not exceeded within that 30 minute
>>>>> pcap file.
>>>>>
>>>>> In any case, at 13:23, I manually click on the
>>>>> "Release" button for that alert.  2 minutes later, at
>>>>> 13:25:00, I receive this alert:
>>>>> Host edgemax has received 62 DNS requests but sent 0
>>>>> DNS replies [5 Minutes ratio: 0%]
>>>>>
>>>>> As stated previously, no missing DNS responses in the
>>>>> 30 minutes spanning 13:00:02 to 13:29:51. Why does
>>>>> ntopng think 62 replies are missing?
>>>>
>>>> Please report your ntopng.conf. If you look at the
>>>> active ntopng DNS flows, can you identify
>>>> unidirectional flows? You can also try to run ntopng on
>>>> the PCAP file (--original-speed -i file.pcap). If you
>>>> can reproduce using the PCAP file, please send it to me
>>>> privately so that I can troubleshoot the problem.
>>>>
>>>>>
>>>>> I exported 10 minutes of PCAP from if_stats.lua. 
>>>>> Using the filter "(ip.dst_host == "10.12.17.1" or
>>>>> ip.src_host == "10.12.17.1") and dns" I am not able to
>>>>> find any missing DNS responses in wireshark.
>>>>> Interestingly, If I specify a BPF Filter ("port 53"),
>>>>> the downloaded PCAP file seems to only have 1 side
>>>>> (ie. edgemax is only a source, never a dest.  Without
>>>>> a BPF Filter, the download is fine.
>>>>
>>>> This is probably a bug, please open an issue at
>>>> https://github.com/ntop/ntopng .
>>>>
>>>> Regards,
>>>>
>>>> Emanuele
>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Mon, May 11, 2020 at 8:59 AM Emanuele Faranda
>>>>> <faranda@ntop.org <mailto:faranda@ntop.org>> wrote:
>>>>>
>>>>> Hi Aaron,
>>>>>
>>>>> Please see below:
>>>>>
>>>>> On 5/8/20 10:27 PM, Aaron Scamehorn wrote:
>>>>>> Thank you for your response.  In the screenshot
>>>>>> below, can you please explain the significance of
>>>>>> the "Date/Time" and the "Duration" columns?  What
>>>>>> do they mean in this context?
>>>>>
>>>>> Date/Time: the time when the alert was triggered.
>>>>> Ntopng performs periodic checks in order to
>>>>> trigger alerts. In this particular case, the check
>>>>> on the requests/reply ratio is performed every 5
>>>>> minutes. So this means that problem started
>>>>> between 07:20 and 07:25 .
>>>>>
>>>>> Duration: the total time in which the problem was
>>>>> active. Again, the check is performed every 5
>>>>> minutes for this alert so 5 minutes is the
>>>>> granularity.
>>>>>
>>>>>>
>>>>>> Do I understand correctly that all 3 hosts
>>>>>> triggered the alert at 07:25:01 (OR 07:30:01)
>>>>>> this morning?  And that all three alerts are
>>>>>> active for the past 07:28:53  hours?   Does this
>>>>>> mean that there have been no new additional DNS
>>>>>> Reply/Request issues have been detected?
>>>>> As explained above, the problem started between
>>>>> 07:20 and 07:25 . For 07:28:53 hours the problem
>>>>> was active on all the three hosts (the
>>>>> requests/reply ratio threshold was exceeded for
>>>>> 07:28:53 hours).
>>>>>>
>>>>>> I notice in "Past Alerts" tab, that there are
>>>>>> many Reply/Request Alerts for the same host with
>>>>>> very short durations (screen shot #2). When/how
>>>>>> does an alert move from the "Engaged" to "Past" tab?
>>>>> In this case, the engaged alert becomes "past"
>>>>> alert when, after the check performed every 5
>>>>> minutes, the requests/reply ratio threshold is not
>>>>> exceed anymore. This can happen as soon as the
>>>>> next check is performed (5 minutes).
>>>>>>
>>>>>> So in the 2nd screenshot, fire-TV had an alert at
>>>>>> 06:20:00 for 05:00 minutes where 18 requests
>>>>>> received 0 replies.  Then another alert at
>>>>>> 06:50:00 for 05:00 minutes.  Were the 18 replies
>>>>>> from the first alert ultimately received?  And
>>>>>> they were received 5 minutes the alert occurred?
>>>>>
>>>>> The check is performed on the DNS packet counters.
>>>>> A DNS request cannot take 5 minutes to be replied.
>>>>> The fact that the alert was closed after 5/10
>>>>> minutes could be related to one of these events:
>>>>>
>>>>> - The host went idle
>>>>>
>>>>> - The host did not send enough DNS requests
>>>>>
>>>>> - The new DNS requests made by the host were
>>>>> successfully replied.
>>>>>
>>>>>>
>>>>>> Context here is that 99% of the traffic is
>>>>>> Internet traffic.  Almost all of the pihole
>>>>>> traffic is to forwarders.  BTW, the way pihole
>>>>>> works (by default) is it replies 0.0.0.0 for
>>>>>> blocked hosts.  It should respond to every query.
>>>>>>
>>>>>> I tried the live_pcap_download.html
>>>>>> <https://www.ntop.org/guides/ntopng/advanced_features/live_pcap_download.html>
>>>>>> lua, but couldn't figure out the bpf_filter:
>>>>>> curl --cookie "user=admin; password=xxxxx"
>>>>>>  "http://10.12.17.25:3000/lua/live_traffic.lua?ifid=0&duration=600&bpf_filter=\
>>>>>> <http://10.12.17.25:3000/lua/live_traffic.lua?ifid=0&duration=600&bpf_filter=%5C>"port
>>>>>> 53\""
>>>>>>
>>>>>> I also tried the download pcap on the
>>>>>> if_stats.lua page.   The downloaded pcap file
>>>>>> seems to only contain incoming data (see wireshark)?
>>>>> This is consistent with the above alerts, please
>>>>> ensure that ntopng is not dropping packets as this
>>>>> would explain this behavior.
>>>>>>
>>>>>> If I just do a tshark on the same interface that
>>>>>> ntopng is listening on, I see all of the expected
>>>>>> DNS query & replies.  I am not able to correlate
>>>>>> the alerts to any missing packets.
>>>>>
>>>>> See response above.
>>>>>
>>>>> Regards,
>>>>>
>>>>> Emanuele
>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Fri, May 8, 2020 at 2:53 AM Emanuele Faranda
>>>>>> <faranda@ntop.org <mailto:faranda@ntop.org>> wrote:
>>>>>>
>>>>>> Hi Aaron,
>>>>>>
>>>>>> The alerts that you are reporting basically
>>>>>> tell you that such hosts receive DNS requests
>>>>>> but do not send a reply. In order to
>>>>>> troubleshoot possible problems you should
>>>>>> augment such information with the knowledge
>>>>>> of your network.
>>>>>>
>>>>>> The first question to answer is, are that
>>>>>> hosts expected to accept DNS requests? If
>>>>>> not, are the requests generated from the
>>>>>> internet or from the LAN? In the first case a
>>>>>> firewall to block such DNS requests may be a
>>>>>> good idea . In the latter case some hosts in
>>>>>> the LAN may be misconfigured. In case of the
>>>>>> pihole hosts, I expect pihole to block some
>>>>>> DNS requests for advertisement sites so this
>>>>>> could be a normal behaviour. The following
>>>>>> ntopng features may also help you:
>>>>>>
>>>>>> https://www.ntop.org/guides/ntopng/advanced_features/live_pcap_download.html
>>>>>>
>>>>>> https://www.ntop.org/guides/ntopng/using_with_other_tools/n2disk.html
>>>>>>
>>>>>> https://www.ntop.org/guides/ntopng/historical_flows.html
>>>>>>
>>>>>> Regards,
>>>>>> Emanuele
>>>>>>
>>>>>> On 5/7/20 5:57 PM, Aaron Scamehorn wrote:
>>>>>>> Hello,
>>>>>>>
>>>>>>> I'm trying to understand how/why I am
>>>>>>> getting the "Replies / Requests Ratio"
>>>>>>> warnings for DNS.
>>>>>>>
>>>>>>> I am suspect of these alerts, and would like
>>>>>>> to know how/why they are being generated.  I
>>>>>>> am suspect for for the following reasons: 
>>>>>>> 1) If it really is as bad as indicated, I
>>>>>>> should notice problems.  2) the "events'
>>>>>>> occur immediately after I clear the alerts,
>>>>>>> and tend to persist for hours.
>>>>>>>
>>>>>>> In any case, I cleared the alerts last
>>>>>>> night, and this is what they look like:
>>>>>>>
>>>>>>> 06/05/2020 22:15:00 12:31:28 Warning
>>>>>>> Replies / Requests Ratio Host
>>>>>>> edgemax.example.net
>>>>>>> <http://xps-630i.scamlan.net:3000/lua/host_details.lua?ifid=2&host=10.12.17.1@1&page=historical&epoch_begin=1588864588&epoch_end=1588868188>
>>>>>>> has received 54 DNS requests but sent 0 DNS
>>>>>>> replies [5 Minutes ratio: 0%]
>>>>>>>
>>>>>>> 06/05/2020 22:15:00 12:31:28 Warning
>>>>>>> Replies / Requests Ratio Host
>>>>>>> pihole.example.net
>>>>>>> <http://xps-630i.scamlan.net:3000/lua/host_details.lua?ifid=2&host=10.12.17.3@1&page=historical&epoch_begin=1588864588&epoch_end=1588868188>
>>>>>>> has sent 93 DNS requests but received 3 DNS
>>>>>>> replies [5 Minutes ratio: 3.2%]
>>>>>>> 06/05/2020 22:15:00 12:31:28 Warning
>>>>>>> Replies / Requests Ratio Host
>>>>>>> pihole-2.example.net
>>>>>>> <http://xps-630i.scamlan.net:3000/lua/host_details.lua?ifid=2&host=10.12.17.4@1&page=historical&epoch_begin=1588864588&epoch_end=1588868188>
>>>>>>> has sent 97 DNS requests but received 1 DNS
>>>>>>> reply [5 Minutes ratio: 1.0%]
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
> _______________________________________________
> Ntop mailing list
> Ntop@listgateway.unipi.it
> http://listgateway.unipi.it/mailman/listinfo/ntop
Re: Replies / Requests Ratio [ In reply to ]
Thanks, Emanuele.

I can confirm that I am no longer receiving alerts for HTTP request/replies.

Aaron


On Mon, May 25, 2020 at 5:18 AM Emanuele Faranda <faranda@ntop.org> wrote:

> Hi Aaron,
>
> ntopng did not account for Ethernet frame padding, which caused ACK
> packets to be parsed as HTTP replies, so the actual HTTP replies in
> subsequent packets were ignored. This is fixed in
> https://github.com/ntop/ntopng/commit/cba3ab2ea6258895fa7270330607f491fa942c47
> . A new package will be available in one hour. Please confirm that
> it works on live traffic.
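
The padding problem described above can be illustrated with a minimal Python sketch. It is not ntopng's code (the linked commit has the real fix); the function names and the hand-built frame are hypothetical. It assumes an untagged IPv4/TCP frame and shows why the payload length should come from the IP total-length field rather than from the captured frame size: a bare ACK is 54 bytes and is padded to the 60-byte Ethernet minimum, so a naive calculation sees 6 bytes of "payload".

```python
import struct

ETH_HEADER_LEN = 14  # untagged Ethernet II header (assumption: no VLAN tag)


def tcp_payload_len(frame: bytes) -> int:
    """Payload length derived from the IPv4 total-length field (ignores padding)."""
    ip = frame[ETH_HEADER_LEN:]
    ip_header_len = (ip[0] & 0x0F) * 4
    ip_total_len = struct.unpack("!H", ip[2:4])[0]
    tcp_header_len = (ip[ip_header_len + 12] >> 4) * 4
    return ip_total_len - ip_header_len - tcp_header_len


def naive_payload_len(frame: bytes) -> int:
    """Buggy variant: counts every byte after the headers, padding included."""
    ip = frame[ETH_HEADER_LEN:]
    ip_header_len = (ip[0] & 0x0F) * 4
    tcp_header_len = (ip[ip_header_len + 12] >> 4) * 4
    return len(frame) - ETH_HEADER_LEN - ip_header_len - tcp_header_len


def build_padded_ack() -> bytes:
    """Hand-built bare TCP ACK, zero-padded to the 60-byte Ethernet minimum."""
    eth = bytes(12) + b"\x08\x00"        # MAC addresses zeroed, EtherType IPv4
    ip = bytearray(20)
    ip[0] = 0x45                         # IPv4, 20-byte header
    ip[2:4] = struct.pack("!H", 40)      # total length: 20 (IP) + 20 (TCP)
    ip[9] = 6                            # protocol: TCP
    tcp = bytearray(20)
    tcp[12] = 5 << 4                     # data offset: 20-byte header
    tcp[13] = 0x10                       # ACK flag only
    frame = eth + bytes(ip) + bytes(tcp)
    return frame + bytes(60 - len(frame))  # 6 bytes of Ethernet padding


frame = build_padded_ack()
print(naive_payload_len(frame), tcp_payload_len(frame))
```

With the naive calculation the padding bytes look like application data, which is the kind of misparse the message above describes.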
>
> Regards,
>
> Emanuele
> On 5/22/20 7:11 AM, Aaron Scamehorn wrote:
>
> Hi Emanuele,
>
> It's been about 9 days since adding the --ignore-vlans option. The behavior
> has definitely changed, however, I continue to get Replies / Requests Ratio
> alerts.
>
> I get far fewer alerts for DNS. As mentioned earlier, I am now getting
> alerts for HTTP. Over 600 in the last 9 days:
>
> "msg": "Host edgemax has received 117 HTTP requests but sent 51 HTTP
> replies [5 Minutes ratio: 43.2%] "
> "msg": "Host edgemax has received 117 HTTP requests but sent 51 HTTP
> replies [5 Minutes ratio: 43.2%] "
> "msg": "Host edgemax has received 117 HTTP requests but sent 51 HTTP
> replies [5 Minutes ratio: 43.2%] "
> "msg": "Host edgemax has received 117 HTTP requests but sent 51 HTTP
> replies [5 Minutes ratio: 43.2%] "
> "msg": "Host edgemax has received 118 HTTP requests but sent 51 HTTP
> replies [5 Minutes ratio: 42.9%] "
> "msg": "Host edgemax has received 100 HTTP requests but sent 34 HTTP
> replies [5 Minutes ratio: 33.7%] "
> "msg": "Host edgemax has received 78 HTTP requests but sent 34 HTTP
> replies [5 Minutes ratio: 43.0%] "
>
> The duration is usually 10 minutes or less.
>
> I've sent you a PCAP file to reproduce.
>
> Aaron
>
>
> On Fri, May 15, 2020 at 4:26 AM Emanuele Faranda <faranda@ntop.org> wrote:
>
>> Hi Aaron,
>>
>> The alerts on HTTP traffic should not be linked to the --ignore-vlans
>> option: adding that option should actually improve the replies vs requests
>> ratio for HTTP as well, so I expect fewer alerts to be generated than
>> before.
>>
>> Anyway, please monitor the situation and if you still think that there is
>> such a problem please provide a PCAP file privately with the HTTP traffic
>> so that we can inspect it.
>>
>> Regards,
>>
>> Emanuele
>> On 5/13/20 4:55 PM, Aaron Scamehorn wrote:
>>
>> Interesting. I do recall seeing vlan tags on some but not all of the
>> flows in ntopng.
>>
>> Looking at the pcaps now, I do see that traffic from the 2 pi-hole hosts
>> has VLAN tags whereas traffic from other hosts has none. So, the switch
>> that the pi-holes are connected to is adding VLAN tags?
>>
>> Anyway, I ran the 30 minute pcap file with the --ignore-vlans config, and
>> agree that does resolve the issue with the pcap file.
>>
>> Adding that config to the "prod" ntopng apparently introduces new
>> problems. I am now getting Replies / Requests Ratio alerts for HTTP on
>> various hosts. I have not seen these alerts before. These do not have the
>> prolonged duration that the DNS alerts were having; rather, these are all
>> of the 5 minute duration.
>>
>> Could this be a boundary issue? Could a client send the requests in one 5
>> minute window, with the responses arriving in the next 5 minute window?
>>
>> Aaron
>>
>>
>>
>>
>> On Wed, May 13, 2020 at 8:48 AM Emanuele Faranda <faranda@ntop.org>
>> wrote:
>>
>>> Aaron,
>>>
>>> Writing to you here to continue the public discussion. The problem is
>>> that the DNS requests have no VLAN tag whereas the DNS replies have the
>>> VLAN tag 1. So ntopng splits each DNS flow into two unidirectional flows.
>>> If you want ntopng to ignore the VLAN tag, you can use the --ignore-vlans
>>> flag. This should fix your problem.
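
As a rough illustration of the explanation above, here is a Python sketch of how a flow key that includes the VLAN ID splits one DNS exchange into two one-way flows. The key layout, the `ignore_vlans` parameter, the client port 40000 and the use of 0 for "untagged" are assumptions made for the example, not ntopng internals; the IP addresses are the ones from the alerts in this thread.

```python
from collections import defaultdict


def flow_key(src, sport, dst, dport, vlan_id, ignore_vlans=False):
    """Direction-independent flow key; the VLAN ID is part of it unless ignored."""
    endpoints = tuple(sorted([(src, sport), (dst, dport)]))
    return (endpoints, 0 if ignore_vlans else vlan_id)


def count_flows(packets, ignore_vlans=False):
    """Number of distinct flows seen for a list of packet tuples."""
    flows = defaultdict(int)
    for src, sport, dst, dport, vlan_id in packets:
        flows[flow_key(src, sport, dst, dport, vlan_id, ignore_vlans)] += 1
    return len(flows)


packets = [
    ("10.12.17.3", 40000, "10.12.17.1", 53, 0),  # DNS request, no VLAN tag
    ("10.12.17.1", 53, "10.12.17.3", 40000, 1),  # DNS reply, VLAN tag 1
]

print(count_flows(packets))                     # request and reply land in separate flows
print(count_flows(packets, ignore_vlans=True))  # both directions share one flow
```

When the two directions end up in separate flows, each flow has requests without replies (or the reverse), which matches the near-0% ratios in the alerts.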
>>>
>>> Regards,
>>>
>>> Emanuele
>>> On 5/13/20 3:06 PM, Emanuele Faranda wrote:
>>>
>>> Hi Aaron,
>>>
>>> Please contact us privately at faranda@ntop.org and mainardi@ntop.org .
>>> Please ensure that the PCAP files only contain DNS traffic.
>>>
>>> Regards,
>>>
>>> Emanuele
>>> On 5/12/20 5:13 PM, Aaron Scamehorn wrote:
>>>
>>> Emanuele,
>>>
>>> Here is ntopng.conf
>>> -G=/var/run/ntopng.pid
>>> -i=enp2s0
>>> -m=10.12.17.0/24
>>> -S=local
>>>
>>> I do see unidirectional flows in flows_stats.lua for DNS. Incidentally,
>>> I do also see alerts w/ non-zero replies (though most alerts are 0):
>>> Host pihole has sent 211 DNS requests but received 7 DNS replies
>>>
>>> I tried 2 different 30 minute PCAP files. In both cases, right at the
>>> 10 minute mark, I got alerts. How can I get these PCAP files to you?
>>>
>>> Thanks,
>>> Aaron
>>>
>>>
>>>
>>> On Tue, May 12, 2020 at 4:13 AM Emanuele Faranda <faranda@ntop.org>
>>> wrote:
>>>
>>>> Hi Aaron,
>>>>
>>>> Please see below.
>>>> On 5/11/20 9:29 PM, Aaron Scamehorn wrote:
>>>>
>>>> Hi Emanuele,
>>>>
>>>> Thank you again for the detailed responses.
>>>>
>>>> From the interfaces page, I see these stats:
>>>> Total Traffic 91.6 GB [103,062,265 Pkts]
>>>> Dropped Packets 0 Pkts
>>>> I don't see any dropped packets on the NIC either:
>>>> ethtool -S enp2s0
>>>> NIC statistics:
>>>> tx_packets: 0
>>>> rx_packets: 106581943
>>>> tx_errors: 0
>>>> rx_errors: 0
>>>> rx_missed: 0
>>>> align_errors: 0
>>>> tx_single_collisions: 0
>>>> tx_multi_collisions: 0
>>>> unicast: 105432876
>>>> broadcast: 350738
>>>> multicast: 1149060
>>>> tx_aborted: 0
>>>> tx_underrun: 0
>>>>
>>>> As of right now, 2 of the hosts we are discussing are still in alert,
>>>> at the original Date/Time of 07:25:01, and Duration is now "3 Days,
>>>> 08:06:59".
>>>>
>>>> Given that my replies vs requests ratio is still configured at 50%,
>>>> this means that, at every 5 minute interval for the last 3 Days, 8 hours,
>>>> said host is receiving < 50% DNS replies, correct? I find this difficult
>>>> to believe, and cannot find ANY missing packets in my pcap file.
>>>>
>>>> I captured a 30 minute pcap file with this command:
>>>> tcpdump -i enp2s0 -G 1800 -w /tmp/enp2s0.%FT%T.pcap host edgemax and port 53
>>>>
>>>> This file contains DNS traffic to/from edgemax only.
>>>> I can count responses like this:
>>>> tshark -t a -r enp2s0.2020-05-11T13:00:02.pcap | grep -c "Standard query response"
>>>> 349
>>>> And queries like this:
>>>> tshark -t a -r enp2s0.2020-05-11T13:00:02.pcap | grep -c "Standard query 0x"
>>>> 349
>>>>
>>>> In other words, no missing DNS responses in the 30 minutes spanning
>>>> 13:00:02 to 13:29:51.
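
Equal totals strongly suggest that nothing is missing, but a stricter check pairs each query with its response by DNS transaction ID. The Python sketch below assumes the (client, transaction ID, is-response) tuples have already been extracted from the capture by some other tool; the function name and the sample values are hypothetical.

```python
def unanswered_queries(messages):
    """Return the (client, transaction_id) pairs that never received a response.

    `messages` is an iterable of (client, transaction_id, is_response) tuples
    in capture order.
    """
    pending = set()
    for client, txid, is_response in messages:
        key = (client, txid)
        if is_response:
            pending.discard(key)  # response seen: the query is answered
        else:
            pending.add(key)      # query seen: wait for its response
    return pending


sample = [
    ("10.12.17.3", 0x1A2B, False),
    ("10.12.17.3", 0x1A2B, True),
    ("10.12.17.4", 0x0C0D, False),  # no matching response in this sample
]
print(unanswered_queries(sample))
```

An empty result over the whole capture would confirm that every query was answered, independently of how ntopng groups the packets into flows.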
>>>>
>>>> I would think that the alert should "clear" because the threshold is
>>>> not exceeded within that 30 minute pcap file.
>>>>
>>>> In any case, at 13:23, I manually clicked the "Release" button for
>>>> that alert. 2 minutes later, at 13:25:00, I received this alert:
>>>> Host edgemax has received 62 DNS requests but sent 0 DNS replies [5
>>>> Minutes ratio: 0%]
>>>>
>>>> As stated previously, no missing DNS responses in the 30 minutes
>>>> spanning 13:00:02 to 13:29:51. Why does ntopng think 62 replies are
>>>> missing?
>>>>
>>>> Please report your ntopng.conf. If you look at the active ntopng DNS
>>>> flows, can you identify unidirectional flows? You can also try to run
>>>> ntopng on the PCAP file (--original-speed -i file.pcap). If you can
>>>> reproduce using the PCAP file, please send it to me privately so that I can
>>>> troubleshoot the problem.
>>>>
>>>>
>>>> I exported 10 minutes of PCAP from if_stats.lua. Using the filter
>>>> "(ip.dst_host == "10.12.17.1" or ip.src_host == "10.12.17.1") and dns" I am
>>>> not able to find any missing DNS responses in wireshark. Interestingly, if
>>>> I specify a BPF Filter ("port 53"), the downloaded PCAP file seems to only
>>>> have 1 side (i.e. edgemax is only a source, never a dest). Without a BPF
>>>> Filter, the download is fine.
>>>>
>>>> This is probably a bug, please open an issue at
>>>> https://github.com/ntop/ntopng .
>>>>
>>>> Regards,
>>>>
>>>> Emanuele
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Mon, May 11, 2020 at 8:59 AM Emanuele Faranda <faranda@ntop.org>
>>>> wrote:
>>>>
>>>>> Hi Aaron,
>>>>>
>>>>> Please see below:
>>>>> On 5/8/20 10:27 PM, Aaron Scamehorn wrote:
>>>>>
>>>>> Thank you for your response. In the screenshot below, can you please
>>>>> explain the significance of the "Date/Time" and the "Duration" columns?
>>>>> What do they mean in this context?
>>>>>
>>>>> Date/Time: the time when the alert was triggered. ntopng performs
>>>>> periodic checks in order to trigger alerts. In this particular case, the
>>>>> check on the replies/requests ratio is performed every 5 minutes. So this
>>>>> means that the problem started between 07:20 and 07:25.
>>>>>
>>>>> Duration: the total time in which the problem was active. Again, the
>>>>> check is performed every 5 minutes for this alert so 5 minutes is the
>>>>> granularity.
>>>>>
>>>>>
>>>>> Do I understand correctly that all 3 hosts triggered the alert at
>>>>> 07:25:01 (OR 07:30:01) this morning? And that all three alerts have been
>>>>> active for the past 07:28:53 hours? Does this mean that no additional
>>>>> DNS Reply/Request issues have been detected?
>>>>>
>>>>> As explained above, the problem started between 07:20 and 07:25. For
>>>>> 07:28:53 hours the problem was active on all three hosts (the
>>>>> replies/requests ratio threshold was exceeded for 07:28:53 hours).
>>>>>
>>>>>
>>>>> I notice in "Past Alerts" tab, that there are many Reply/Request
>>>>> Alerts for the same host with very short durations (screen shot #2).
>>>>> When/how does an alert move from the "Engaged" to "Past" tab?
>>>>>
>>>>> In this case, the engaged alert becomes a "past" alert when, after the
>>>>> check performed every 5 minutes, the replies/requests ratio threshold is no
>>>>> longer exceeded. This can happen as soon as the next check is performed (5
>>>>> minutes).
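
The periodic check described above can be sketched in Python as follows. This only illustrates the idea under stated assumptions: the thread does not give the exact formula ntopng uses or any minimum request count, the function names are hypothetical, and the 50% threshold is the value Aaron says he has configured.

```python
THRESHOLD_PCT = 50.0  # configured threshold mentioned in the thread


def replies_ratio(requests: int, replies: int) -> float:
    """Replies as a percentage of requests; 100% when there were no requests."""
    if requests == 0:
        return 100.0
    return 100.0 * replies / requests


def is_below_threshold(requests, replies, threshold_pct=THRESHOLD_PCT):
    """True when the ratio for one 5-minute window would trigger the alert."""
    return replies_ratio(requests, replies) < threshold_pct


def alert_timeline(samples, threshold_pct=THRESHOLD_PCT):
    """Alert state after each periodic check, one (requests, replies) pair per window."""
    return [
        "engaged" if is_below_threshold(req, rep, threshold_pct) else "clear"
        for req, rep in samples
    ]


# Counters taken from alerts quoted in this thread, followed by a healthy window.
print(alert_timeline([(54, 0), (62, 0), (349, 349)]))
```

An alert stays engaged for as long as consecutive checks fall below the threshold, and moves to "past" at the first check that does not.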
>>>>>
>>>>>
>>>>> So in the 2nd screenshot, fire-TV had an alert at 06:20:00 for 05:00
>>>>> minutes where 18 requests received 0 replies. Then another alert at
>>>>> 06:50:00 for 05:00 minutes. Were the 18 replies from the first alert
>>>>> ultimately received? And were they received 5 minutes after the alert
>>>>> occurred?
>>>>>
>>>>> The check is performed on the DNS packet counters. A DNS request
>>>>> cannot take 5 minutes to be answered. The fact that the alert was closed
>>>>> after 5/10 minutes could be related to one of these events:
>>>>>
>>>>> - The host went idle
>>>>>
>>>>> - The host did not send enough DNS requests
>>>>>
>>>>> - The new DNS requests made by the host were successfully answered.
>>>>>
>>>>>
>>>>> Context here is that 99% of the traffic is Internet traffic. Almost
>>>>> all of the pihole traffic is to forwarders. BTW, the way pihole works (by
>>>>> default) is it replies 0.0.0.0 for blocked hosts. It should respond to
>>>>> every query.
>>>>>
>>>>> I tried the live pcap download
>>>>> <https://www.ntop.org/guides/ntopng/advanced_features/live_pcap_download.html>
>>>>> endpoint (live_traffic.lua), but couldn't figure out the bpf_filter:
>>>>> curl --cookie "user=admin; password=xxxxx" "http://10.12.17.25:3000/lua/live_traffic.lua?ifid=0&duration=600&bpf_filter=\"port 53\""
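
One possible cause of the bpf_filter trouble is that the filter contains a space and literal quotes, which would normally need to be percent-encoded in a URL. The Python sketch below builds such a URL from the parameters in the curl command above; the helper name is hypothetical, and whether this resolves the one-sided capture reported later in the thread is not established here.

```python
from urllib.parse import quote, urlencode


def live_traffic_url(base: str, ifid: int, duration: int, bpf_filter: str) -> str:
    """Build a live_traffic.lua URL with a percent-encoded BPF filter."""
    query = urlencode(
        {"ifid": ifid, "duration": duration, "bpf_filter": bpf_filter},
        quote_via=quote,  # encode spaces as %20 rather than '+'
    )
    return f"{base}/lua/live_traffic.lua?{query}"


url = live_traffic_url("http://10.12.17.25:3000", 0, 600, "port 53")
print(url)
```

The resulting URL can be passed to curl in place of the hand-escaped one, with the same --cookie argument.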
>>>>>
>>>>> I also tried the download pcap on the if_stats.lua page. The
>>>>> downloaded pcap file seems to only contain incoming data (see wireshark)?
>>>>>
>>>>> This is consistent with the above alerts, please ensure that ntopng is
>>>>> not dropping packets as this would explain this behavior.
>>>>>
>>>>>
>>>>> If I just do a tshark on the same interface that ntopng is listening
>>>>> on, I see all of the expected DNS query & replies. I am not able to
>>>>> correlate the alerts to any missing packets.
>>>>>
>>>>> See response above.
>>>>>
>>>>> Regards,
>>>>>
>>>>> Emanuele
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Fri, May 8, 2020 at 2:53 AM Emanuele Faranda <faranda@ntop.org>
>>>>> wrote:
>>>>>
>>>>>> Hi Aaron,
>>>>>>
>>>>>> The alerts that you are reporting basically tell you that such hosts
>>>>>> receive DNS requests but do not send a reply. In order to troubleshoot
>>>>>> possible problems you should augment such information with the knowledge of
>>>>>> your network.
>>>>>>
>>>>>> The first question to answer is, are those hosts expected to accept
>>>>>> DNS requests? If not, are the requests generated from the internet or from
>>>>>> the LAN? In the first case a firewall to block such DNS requests may be a
>>>>>> good idea. In the latter case some hosts in the LAN may be misconfigured.
>>>>>> In case of the pihole hosts, I expect pihole to block some DNS requests for
>>>>>> advertisement sites so this could be normal behaviour. The following
>>>>>> ntopng features may also help you:
>>>>>>
>>>>>>
>>>>>> https://www.ntop.org/guides/ntopng/advanced_features/live_pcap_download.html
>>>>>>
>>>>>>
>>>>>> https://www.ntop.org/guides/ntopng/using_with_other_tools/n2disk.html
>>>>>>
>>>>>> https://www.ntop.org/guides/ntopng/historical_flows.html
>>>>>>
>>>>>> Regards,
>>>>>> Emanuele
>>>>>> On 5/7/20 5:57 PM, Aaron Scamehorn wrote:
>>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> I'm trying to understand how/why I am getting the "Replies / Requests
>>>>>> Ratio" warnings for DNS.
>>>>>>
>>>>>> I am suspicious of these alerts, and would like to know how/why they are
>>>>>> being generated. I am suspicious for the following reasons: 1) If it
>>>>>> really is as bad as indicated, I should notice problems. 2) the "events"
>>>>>> occur immediately after I clear the alerts, and tend to persist for hours.
>>>>>>
>>>>>> In any case, I cleared the alerts last night, and this is what they
>>>>>> look like:
>>>>>>
>>>>>> 06/05/2020 22:15:00 12:31:28 Warning Replies / Requests Ratio Host
>>>>>> edgemax.example.net
>>>>>> <http://xps-630i.scamlan.net:3000/lua/host_details.lua?ifid=2&host=10.12.17.1@1&page=historical&epoch_begin=1588864588&epoch_end=1588868188>
>>>>>> has received 54 DNS requests but sent 0 DNS replies [5 Minutes ratio: 0%]
>>>>>>
>>>>>> 06/05/2020 22:15:00 12:31:28 Warning Replies / Requests Ratio Host
>>>>>> pihole.example.net
>>>>>> <http://xps-630i.scamlan.net:3000/lua/host_details.lua?ifid=2&host=10.12.17.3@1&page=historical&epoch_begin=1588864588&epoch_end=1588868188>
>>>>>> has sent 93 DNS requests but received 3 DNS replies [5 Minutes ratio: 3.2%]
>>>>>>
>>>>>> 06/05/2020 22:15:00 12:31:28 Warning Replies / Requests Ratio Host
>>>>>> pihole-2.example.net
>>>>>> <http://xps-630i.scamlan.net:3000/lua/host_details.lua?ifid=2&host=10.12.17.4@1&page=historical&epoch_begin=1588864588&epoch_end=1588868188>
>>>>>> has sent 97 DNS requests but received 1 DNS reply [5 Minutes ratio: 1.0%]
>>>>>>
>>>>>>
>>>>>>
>>>>>>
> _______________________________________________
> Ntop mailing list
> Ntop@listgateway.unipi.it
> http://listgateway.unipi.it/mailman/listinfo/ntop
Re: Replies / Requests Ratio [ In reply to ]
Hi Aaron,

Thank you for reporting.

Regards,

Emanuele

On 5/26/20 4:41 PM, Aaron Scamehorn wrote:
> Thanks, Emanuele.
>
> I can confirm that I am no longer receiving alerts for HTTP
> request/replies.
>
> Aaron
>
>
> On Mon, May 25, 2020 at 5:18 AM Emanuele Faranda <faranda@ntop.org> wrote:
>
> Hi Aaron,
>
> ntopng did not account for Ethernet frame padding, which caused
> ACK packets to be parsed as HTTP replies, so the actual
> HTTP replies in subsequent packets were ignored. This is fixed in
> https://github.com/ntop/ntopng/commit/cba3ab2ea6258895fa7270330607f491fa942c47
> . A new package will be available in one hour. Please
> confirm that it works on live traffic.
>
> Regards,
>
> Emanuele
>
> On 5/22/20 7:11 AM, Aaron Scamehorn wrote:
>> Hi Emanuele,
>>
>> It's been about 9 days since adding the --ignore-vlans option.  The
>> behavior has definitely changed, however, I continue to get
>> Replies / Requests Ratio alerts.
>>
>> I get far fewer alerts for DNS.  As mentioned earlier, I am now
>> getting alerts for HTTP.  Over 600 in the last 9 days:
>>
>>     "msg": "Host edgemax has received 117 HTTP requests but sent
>> 51 HTTP replies [5 Minutes ratio: 43.2%] "
>>     "msg": "Host edgemax has received 117 HTTP requests but sent
>> 51 HTTP replies [5 Minutes ratio: 43.2%] "
>>     "msg": "Host edgemax has received 117 HTTP requests but sent
>> 51 HTTP replies [5 Minutes ratio: 43.2%] "
>>     "msg": "Host edgemax has received 117 HTTP requests but sent
>> 51 HTTP replies [5 Minutes ratio: 43.2%] "
>>     "msg": "Host edgemax has received 118 HTTP requests but sent
>> 51 HTTP replies [5 Minutes ratio: 42.9%] "
>>     "msg": "Host edgemax has received 100 HTTP requests but sent
>> 34 HTTP replies [5 Minutes ratio: 33.7%] "
>>     "msg": "Host edgemax has received 78 HTTP requests but sent
>> 34 HTTP replies [5 Minutes ratio: 43.0%] "
>>
>> The duration is usually 10 minutes or less.
>>
>> I've sent you a PCAP file to reproduce.
>>
>> Aaron
>>
>>
>> On Fri, May 15, 2020 at 4:26 AM Emanuele Faranda <faranda@ntop.org> wrote:
>>
>> Hi Aaron,
>>
>> The alerts on HTTP traffic should not be linked to the
>> --ignore-vlans option: adding that option should actually
>> improve the replies vs requests ratio for HTTP as well, so I
>> expect fewer alerts to be generated than before.
>>
>> Anyway, please monitor the situation and if you still think
>> that there is such a problem please provide a PCAP file
>> privately with the HTTP traffic so that we can inspect it.
>>
>> Regards,
>>
>> Emanuele
>>
>> On 5/13/20 4:55 PM, Aaron Scamehorn wrote:
>>> Interesting.  I do recall seeing vlan tags on some but not
>>> all of the flows in ntopng.
>>>
>>> Looking at the pcaps now, I do see that traffic from the 2
>>> pi-hole hosts has VLAN tags whereas traffic from other hosts
>>> has none. So, the switch that the pi-holes are connected to is
>>> adding VLAN tags?
>>>
>>> Anyway, I ran the 30 minute pcap file with the --ignore-vlans
>>> config, and agree that does resolve the issue with the pcap
>>> file.
>>>
>>> Adding that config to the "prod" ntopng apparently
>>> introduces new problems.  I am now getting Replies /
>>> Requests Ratio alerts for HTTP on various hosts.  I have not
>>> seen these alerts before.  These do not have the prolonged
>>> duration that the DNS alerts were having; rather, these are
>>> all of the 5 minute duration.
>>>
>>> Could this be a boundary issue?  Could a client send the
>>> requests in one 5 minute window, with the responses arriving in
>>> the next 5 minute window?
>>>
>>> Aaron
>>>
>>>
>>>
>>>
>>> On Wed, May 13, 2020 at 8:48 AM Emanuele Faranda <faranda@ntop.org> wrote:
>>>
>>> Aaron,
>>>
>>> Writing to you here to continue the public discussion.
>>> The problem is that the DNS requests have no VLAN tag
>>> whereas the DNS replies have the VLAN tag 1. So ntopng
>>> splits each DNS flow into two unidirectional flows. If
>>> you want ntopng to ignore the VLAN tag, you can use
>>> the --ignore-vlans flag. This should fix your
>>> problem.
>>>
>>> Regards,
>>>
>>> Emanuele
>>>
>>> On 5/13/20 3:06 PM, Emanuele Faranda wrote:
>>>>
>>>> Hi Aaron,
>>>>
>>>> Please contact us privately at faranda@ntop.org
>>>> and mainardi@ntop.org . Please ensure that the
>>>> PCAP files only contain DNS traffic.
>>>>
>>>> Regards,
>>>>
>>>> Emanuele
>>>>
>>>> On 5/12/20 5:13 PM, Aaron Scamehorn wrote:
>>>>> Emanuele,
>>>>>
>>>>> Here is ntopng.conf
>>>>> -G=/var/run/ntopng.pid
>>>>> -i=enp2s0
>>>>> -m=10.12.17.0/24
>>>>> -S=local
>>>>>
>>>>> I do see unidirectional flows in flows_stats.lua for
>>>>> DNS. Incidentally, I do also see alerts w/ non-zero
>>>>> replies (though most alerts are 0):
>>>>> Host pihole has sent 211 DNS requests but received 7
>>>>> DNS replies
>>>>>
>>>>> I tried 2 different 30 minute PCAP files.  In both
>>>>> cases, right at the 10 minute mark, I got alerts. How
>>>>> can I get these PCAP files to you?
>>>>>
>>>>> Thanks,
>>>>> Aaron
>>>>>
>>>>>
>>>>>
>>>>> On Tue, May 12, 2020 at 4:13 AM Emanuele Faranda <faranda@ntop.org> wrote:
>>>>>
>>>>> Hi Aaron,
>>>>>
>>>>> Please see below.
>>>>>
>>>>> On 5/11/20 9:29 PM, Aaron Scamehorn wrote:
>>>>>> Hi Emanuele,
>>>>>>
>>>>>> Thank you again for the detailed responses.
>>>>>>
>>>>>> From the interfaces page, I see these stats:
>>>>>> Total Traffic 91.6 GB [103,062,265 Pkts]
>>>>>> Dropped Packets 0 Pkts
>>>>>>
>>>>>> I don't see any dropped packets on the NIC either:
>>>>>> ethtool -S enp2s0
>>>>>> NIC statistics:
>>>>>>      tx_packets: 0
>>>>>>      rx_packets: 106581943
>>>>>>      tx_errors: 0
>>>>>>      rx_errors: 0
>>>>>>      rx_missed: 0
>>>>>>      align_errors: 0
>>>>>>  tx_single_collisions: 0
>>>>>>  tx_multi_collisions: 0
>>>>>>      unicast: 105432876
>>>>>>      broadcast: 350738
>>>>>>      multicast: 1149060
>>>>>>      tx_aborted: 0
>>>>>>      tx_underrun: 0
>>>>>>
>>>>>> As of right now, 2 of the hosts we are discussing
>>>>>> are still in alert, at the original Date/Time of
>>>>>> 07:25:01, and Duration is now "3 Days, 08:06:59".
>>>>>>
>>>>>> Given that my replies vs requests ratio is still
>>>>>> configured at 50%, this means that, at every 5
>>>>>> minute interval for the last 3 Days, 8 hours,
>>>>>> said host is receiving < 50% DNS replies,
>>>>>> correct?  I find this difficult to believe, and
>>>>>> cannot find ANY missing packets in my pcap file.
>>>>>>
>>>>>> I captured a 30 minute pcap file with this command:
>>>>>> tcpdump -i enp2s0 -G 1800 -w /tmp/enp2s0.%FT%T.pcap host edgemax and port 53
>>>>>>
>>>>>> This file contains DNS traffic to/from edgemax only.
>>>>>> I can count responses like this:
>>>>>> tshark -t a -r enp2s0.2020-05-11T13:00:02.pcap | grep -c "Standard query response"
>>>>>> 349
>>>>>> And queries like this:
>>>>>> tshark -t a -r enp2s0.2020-05-11T13:00:02.pcap | grep -c "Standard query 0x"
>>>>>> 349
>>>>>>
>>>>>> In other words, no missing DNS responses in the
>>>>>> 30 minutes spanning 13:00:02 to 13:29:51.
>>>>>>
>>>>>> I would think that the alert should "clear"
>>>>>> because the threshold is not exceeded within that
>>>>>> 30 minute pcap file.
>>>>>>
>>>>>> In any case, at 13:23, I manually clicked the
>>>>>> "Release" button for that alert.  2 minutes
>>>>>> later, at 13:25:00, I received this alert:
>>>>>> Host edgemax has received 62 DNS requests but
>>>>>> sent 0 DNS replies [5 Minutes ratio: 0%]
>>>>>>
>>>>>> As stated previously, no missing DNS responses in
>>>>>> the 30 minutes spanning 13:00:02 to 13:29:51. 
>>>>>> Why does ntopng think 62 replies are missing?
>>>>>
>>>>> Please report your ntopng.conf. If you look at the
>>>>> active ntopng DNS flows, can you identify
>>>>> unidirectional flows? You can also try to run
>>>>> ntopng on the PCAP file (--original-speed -i
>>>>> file.pcap). If you can reproduce using the PCAP
>>>>> file, please send it to me privately so that I can
>>>>> troubleshoot the problem.
>>>>>
>>>>>>
>>>>>> I exported 10 minutes of PCAP from if_stats.lua.
>>>>>> Using the filter "(ip.dst_host == "10.12.17.1" or
>>>>>> ip.src_host == "10.12.17.1") and dns" I am not
>>>>>> able to find any missing DNS responses in
>>>>>> wireshark.  Interestingly, if I specify a BPF
>>>>>> Filter ("port 53"), the downloaded PCAP file
>>>>>> seems to only have 1 side (i.e. edgemax is only a
>>>>>> source, never a dest).  Without a BPF Filter, the
>>>>>> download is fine.
>>>>>
>>>>> This is probably a bug, please open an issue at
>>>>> https://github.com/ntop/ntopng .
>>>>>
>>>>> Regards,
>>>>>
>>>>> Emanuele
>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Mon, May 11, 2020 at 8:59 AM Emanuele Faranda <faranda@ntop.org> wrote:
>>>>>>
>>>>>> Hi Aaron,
>>>>>>
>>>>>> Please see below:
>>>>>>
>>>>>> On 5/8/20 10:27 PM, Aaron Scamehorn wrote:
>>>>>>> Thank you for your response.  In the
>>>>>>> screenshot below, can you please explain the
>>>>>>> significance of the "Date/Time" and the
>>>>>>> "Duration" columns? What do they mean in
>>>>>>> this context?
>>>>>>
>>>>>> Date/Time: the time when the alert was
>>>>>> triggered. ntopng performs periodic checks in
>>>>>> order to trigger alerts. In this particular
>>>>>> case, the check on the replies/requests ratio
>>>>>> is performed every 5 minutes. So this means
>>>>>> that the problem started between 07:20 and 07:25.
>>>>>>
>>>>>> Duration: the total time in which the problem
>>>>>> was active. Again, the check is performed
>>>>>> every 5 minutes for this alert so 5 minutes
>>>>>> is the granularity.
>>>>>>
>>>>>>>
>>>>>>> Do I understand correctly that all 3 hosts
>>>>>>> triggered the alert at 07:25:01 (OR
>>>>>>> 07:30:01) this morning?  And that all three
>>>>>>> alerts have been active for the past 07:28:53
>>>>>>> hours? Does this mean that no additional
>>>>>>> DNS Reply/Request issues have been
>>>>>>> detected?
>>>>>> As explained above, the problem started
>>>>>> between 07:20 and 07:25. For 07:28:53 hours
>>>>>> the problem was active on all three hosts
>>>>>> (the replies/requests ratio threshold was
>>>>>> exceeded for 07:28:53 hours).
>>>>>>>
>>>>>>> I notice in "Past Alerts" tab, that there
>>>>>>> are many Reply/Request Alerts for the same
>>>>>>> host with very short durations (screen shot
>>>>>>> #2).  When/how does an alert move from the
>>>>>>> "Engaged" to "Past" tab?
>>>>>> In this case, the engaged alert becomes a
>>>>>> "past" alert when, after the check performed
>>>>>> every 5 minutes, the replies/requests ratio
>>>>>> threshold is no longer exceeded. This can
>>>>>> happen as soon as the next check is performed
>>>>>> (5 minutes).
>>>>>>>
>>>>>>> So in the 2nd screenshot, fire-TV had an
>>>>>>> alert at 06:20:00 for 05:00 minutes where 18
>>>>>>> requests received 0 replies.  Then another
>>>>>>> alert at 06:50:00 for 05:00 minutes.  Were
>>>>>>> the 18 replies from the first alert
>>>>>>> ultimately received?  And were they received
>>>>>>> 5 minutes after the alert occurred?
>>>>>>
>>>>>> The check is performed on the DNS packet
>>>>>> counters. A DNS request cannot take 5 minutes
>>>>>> to be answered. The fact that the alert was
>>>>>> closed after 5/10 minutes could be related to
>>>>>> one of these events:
>>>>>>
>>>>>> - The host went idle
>>>>>>
>>>>>> - The host did not send enough DNS requests
>>>>>>
>>>>>> - The new DNS requests made by the host were
>>>>>> successfully answered.
>>>>>>
>>>>>>>
>>>>>>> Context here is that 99% of the traffic is
>>>>>>> Internet traffic.  Almost all of the pihole
>>>>>>> traffic is to forwarders.  BTW, the way
>>>>>>> pihole works (by default) is it replies
>>>>>>> 0.0.0.0 for blocked hosts.  It should
>>>>>>> respond to every query.
>>>>>>>
>>>>>>> I tried the live pcap download
>>>>>>> <https://www.ntop.org/guides/ntopng/advanced_features/live_pcap_download.html>
>>>>>>> endpoint (live_traffic.lua), but couldn't figure out the bpf_filter:
>>>>>>> curl --cookie "user=admin; password=xxxxx" "http://10.12.17.25:3000/lua/live_traffic.lua?ifid=0&duration=600&bpf_filter=\"port 53\""
>>>>>>>
>>>>>>> I also tried the download pcap on the
>>>>>>> if_stats.lua page. The downloaded pcap file
>>>>>>> seems to only contain incoming data (see
>>>>>>> wireshark)?
>>>>>> This is consistent with the above alerts,
>>>>>> please ensure that ntopng is not dropping
>>>>>> packets as this would explain this behavior.
>>>>>>>
>>>>>>> If I just do a tshark on the same interface
>>>>>>> that ntopng is listening on, I see all of
>>>>>>> the expected DNS query & replies.  I am not
>>>>>>> able to correlate the alerts to any missing
>>>>>>> packets.
>>>>>>
>>>>>> See response above.
>>>>>>
>>>>>> Regards,
>>>>>>
>>>>>> Emanuele
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Fri, May 8, 2020 at 2:53 AM Emanuele
>>>>>>> Faranda <faranda@ntop.org> wrote:
>>>>>>>
>>>>>>> Hi Aaron,
>>>>>>>
>>>>>>> The alerts that you are reporting
>>>>>>> basically tell you that such hosts
>>>>>>> receive DNS requests but do not send a
>>>>>>> reply. In order to troubleshoot possible
>>>>>>> problems you should augment such
>>>>>>> information with the knowledge of your
>>>>>>> network.
>>>>>>>
>>>>>>> The first question to answer is, are
>>>>>>> those hosts expected to accept DNS
>>>>>>> requests? If not, are the requests
>>>>>>> generated from the internet or from the
>>>>>>> LAN? In the first case a firewall to
>>>>>>> block such DNS requests may be a good
>>>>>>> idea. In the latter case some hosts in
>>>>>>> the LAN may be misconfigured. In case of
>>>>>>> the pihole hosts, I expect pihole to
>>>>>>> block some DNS requests for
>>>>>>> advertisement sites so this could be a
>>>>>>> normal behaviour. The following ntopng
>>>>>>> features may also help you:
>>>>>>>
>>>>>>> https://www.ntop.org/guides/ntopng/advanced_features/live_pcap_download.html
>>>>>>>
>>>>>>> https://www.ntop.org/guides/ntopng/using_with_other_tools/n2disk.html
>>>>>>>
>>>>>>> https://www.ntop.org/guides/ntopng/historical_flows.html
>>>>>>>
>>>>>>> Regards,
>>>>>>> Emanuele
>>>>>>>
>>>>>>> On 5/7/20 5:57 PM, Aaron Scamehorn wrote:
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> I'm trying to understand how/why I am
>>>>>>>> getting the "Replies / Requests Ratio"
>>>>>>>> warnings for DNS.
>>>>>>>>
>>>>>>>> I am suspicious of these alerts, and
>>>>>>>> would like to know how/why they are being
>>>>>>>> generated.  I am suspicious for the
>>>>>>>> following reasons:  1) If it really is
>>>>>>>> as bad as indicated, I should notice
>>>>>>>> problems.  2) the "events" occur
>>>>>>>> immediately after I clear the alerts,
>>>>>>>> and tend to persist for hours.
>>>>>>>>
>>>>>>>> In any case, I cleared the alerts last
>>>>>>>> night, and this is what they look like:
>>>>>>>>
>>>>>>>> 06/05/2020 22:15:00 12:31:28 Warning
>>>>>>>> Replies / Requests Ratio Host
>>>>>>>> edgemax.example.net
>>>>>>>> <http://xps-630i.scamlan.net:3000/lua/host_details.lua?ifid=2&host=10.12.17.1@1&page=historical&epoch_begin=1588864588&epoch_end=1588868188>
>>>>>>>> has received 54 DNS requests but sent 0
>>>>>>>> DNS replies [5 Minutes ratio: 0%]
>>>>>>>>
>>>>>>>> 06/05/2020 22:15:00 12:31:28 Warning
>>>>>>>> Replies / Requests Ratio Host
>>>>>>>> pihole.example.net
>>>>>>>> <http://xps-630i.scamlan.net:3000/lua/host_details.lua?ifid=2&host=10.12.17.3@1&page=historical&epoch_begin=1588864588&epoch_end=1588868188>
>>>>>>>> has sent 93 DNS requests but received 3
>>>>>>>> DNS replies [5 Minutes ratio: 3.2%]
>>>>>>>>
>>>>>>>> 06/05/2020 22:15:00 12:31:28 Warning
>>>>>>>> Replies / Requests Ratio Host
>>>>>>>> pihole-2.example.net
>>>>>>>> <http://xps-630i.scamlan.net:3000/lua/host_details.lua?ifid=2&host=10.12.17.4@1&page=historical&epoch_begin=1588864588&epoch_end=1588868188>
>>>>>>>> has sent 97 DNS requests but received 1
>>>>>>>> DNS reply [5 Minutes ratio: 1.0%]
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> Ntop mailing list
>>>>>>>> Ntop@listgateway.unipi.it
>>>>>>>> http://listgateway.unipi.it/mailman/listinfo/ntop