[Sorry for the long post, but I'd rather include extra info than have to
follow up with more posts.]
I am attempting to use the "-S1" option to keep state between executions of
ntop. This seems to work well if I stop and start ntop when the
hostsInfo.db file is somewhat small.
For example, it seems to stop and start OK when the hostsInfo.db file is
about 9 MB (1091 records), but file sizes over 12 MB (1576 records)
cause it to fail with a segmentation fault. The biggest hostsInfo.db file
so far has been 21 MB (2676 records) for about 2 days' worth of monitoring.
There is no core file, but here is the gdb output from the latest run with
the following hostsInfo.db file:
hostname:/var/log/ntop# ls -l hostsInfo.db
-rw-r--r-- 1 root root 21687585 Aug 27 12:51 hostsInfo.db
hostname:/var/log/ntop# gdb /usr/local/bin/ntop
GNU gdb 19990928
Copyright 1998 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain
conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i686-pc-linux-gnu"...
(gdb) run -S1 -i eth0,eth1 -P /var/log/ntop -t5 -u ntop
Starting program: /usr/local/bin/ntop -S1 -i eth0,eth1 -P /var/log/ntop -t5
-u ntop
Wait please: ntop is coming up...
27/Aug/2001:12:41:47 Initializing IP services...
27/Aug/2001:12:41:48 [main.c:410] SSL is present but https is disabled: use
-W <https port> for enabling it
27/Aug/2001:12:41:48 [initialize.c:398] Initializing SSL...
27/Aug/2001:12:41:48 [initialize.c:431] Initializing GDBM...
27/Aug/2001:12:41:48 [initialize.c:624] Initializing network devices...
27/Aug/2001:12:41:48 [main.c:423] ntop v.2.0.0 MT (SSL) [i686-pc-linux]
(08/25/01 12:11:00 AM build)
27/Aug/2001:12:41:48 [main.c:443] Listening on [eth0,eth1]
27/Aug/2001:12:41:48 [main.c:444] Copyright 1998-2001 by Luca Deri
<deri@ntop.org>
27/Aug/2001:12:41:48 [main.c:445] Get the freshest ntop from
http://www.ntop.org/
27/Aug/2001:12:41:48 [main.c:446] Initialising...
27/Aug/2001:12:41:48 [initialize.c:941] Truncated network size to 1024 hosts
(real netmask 255.255.0.0)
27/Aug/2001:12:41:48 [initialize.c:941] Truncated network size to 1024 hosts
(real netmask 255.255.0.0)
27/Aug/2001:12:41:48 [plugin.c:317] Loading plugins (if any)...
27/Aug/2001:12:41:48 [plugin.c:335] Searching plugins in
/usr/local/lib/ntop/plugins
27/Aug/2001:12:41:48 [icmpPlugin.c:584] Welcome to icmpWatchPlugin. (C) 1999
by Luca Deri.
27/Aug/2001:12:41:48 [lastSeenPlugin.c:414] Welcome to LastSeenWatchPlugin.
(C) 1999 by Andrea Marangoni.
27/Aug/2001:12:41:48 [nfsPlugin.c:342] Welcome to nfsWatchPlugin. (C) 1999
by Luca Deri.
27/Aug/2001:12:41:48 [rmonPlugin.c:169] Welcome to ntopRmon. (C) 2000 by
Luca Deri.
27/Aug/2001:12:41:48 [rmonPlugin.c:171] WARNING: plugin disabled [missing
NET-SNMP]
27/Aug/2001:12:41:48 [wapPlugin.c:304] Welcome to WAPPlugin. (C) 2000 by
Luca Deri.
27/Aug/2001:12:41:48 [plugin.c:189] WARNING: unable to load plugin
'/usr/local/lib/ntop/plugins/libpep.so'
[/usr/local/lib/ntop/plugins/libpep.so: cannot open shared object file: No
such file or directory]
27/Aug/2001:12:41:48 [initialize.c:346] Resetting traffic statistics...
[New Thread 8431 (manager thread)]
[New Thread 8430 (initial thread)]
[New Thread 8432]
27/Aug/2001:12:41:48 [initialize.c:538] Started thread (1026) for network
packet analyser.
[New Thread 8433]
27/Aug/2001:12:41:48 [initialize.c:545] Started thread (2051) for host
traffic statistics.
[New Thread 8434]
27/Aug/2001:12:41:48 [initialize.c:553] Started thread (3076) for throughput
update.
[New Thread 8435]
27/Aug/2001:12:41:48 [initialize.c:561] Started thread (4101) for idle hosts
detection.
[New Thread 8436]
27/Aug/2001:12:41:48 [initialize.c:565] Started thread (5126) for idle TCP
sessions detection.
[New Thread 8437]
27/Aug/2001:12:41:48 [initialize.c:588] Started thread (6151) for DNS
address resolution.
27/Aug/2001:12:41:48 [plugin.c:404] Initialising plugins (if any)...
27/Aug/2001:12:41:48 [webInterface.c:949] Waiting for HTTP connections on
port 3000...
27/Aug/2001:12:41:48 [hash.c:1106] Purging Idle Hosts... (ignoreIdleTime=0,
actDevice=0)
27/Aug/2001:12:41:48 [hash.c:1156] Purging completed (0 sec/0 hosts
deleted).
[New Thread 8438]
27/Aug/2001:12:41:48 [main.c:505] Sniffying...
[New Thread 8439]
27/Aug/2001:12:41:48 [initialize.c:1094] Started thread (8201) for network
packet sniffing on eth0.
[New Thread 8440]
27/Aug/2001:12:41:48 [initialize.c:1094] Started thread (9226) for network
packet sniffing on eth1.
**********************27/Aug/2001:12:41:48 [hash.c:183] Extending hash:
[old=32, new=48]
[Switching to Thread 8439]
Program received signal SIGSEGV, Segmentation fault.
0x401d2c5e in _mapIdx (mappings=0x824c730, idx=1705, lastHashSize=32,
fileName=0x401f62e4 "hash.c", fileLine=534)
at hash.c:130
130 } else if(mappings[idx] == NO_PEER) {
(gdb) bt
#0 0x401d2c5e in _mapIdx (mappings=0x824c730, idx=1705, lastHashSize=32,
fileName=0x401f62e4 "hash.c",
fileLine=534) at hash.c:130
#1 0x401d440f in resizeHostHash (deviceToExtend=0, hashAction=1) at
hash.c:532
#2 0x401e3950 in processPacket (_deviceId=0x0, h=0xbe9ffc54, p=0x812d632
"\b") at pbuf.c:4344
#3 0x404076dd in pcap_read () from /usr/lib/libpcap.so.0
#4 0x40407cdf in pcap_dispatch () from /usr/lib/libpcap.so.0
#5 0x401d9749 in pcapDispatch (_i=0x0) at ntop.c:130
#6 0x40189c9f in pthread_start_thread () from /lib/libpthread.so.0
(gdb) l
125 #ifdef DEBUG
126 traceEvent(TRACE_INFO, "Mapping empty index %d [%s:%d]",
127 idx, fileName, fileLine);
128 #endif
129 return(NO_PEER);
130 } else if(mappings[idx] == NO_PEER) {
131 traceEvent(TRACE_WARNING,
132 "Mapping failed for index %d [%s:%d]",
133 idx, fileName, fileLine);
134 return(NO_PEER);
(gdb) p idx
$1 = 1705
(gdb) p mappings
$2 = (u_int *) 0x824c730
(gdb) p mappings[idx]
Cannot access memory at address 0x824e1d4.
(gdb) p mappings[idx-1]
Cannot access memory at address 0x824e1d0.
(gdb) p mappings[0
A parse error in expression, near `'.
(gdb) p mappings[0]
$3 = 0
(gdb) p mappings[10]
$4 = 46
(gdb) p mappings[100]
$5 = 0
(gdb) p mappings[1000]
Cannot access memory at address 0x824d6d0.
(gdb) q
The program is running. Exit anyway? (y or n) y
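As a cross-check on the session above, the addresses gdb refuses to read are exactly where an out-of-range element would sit. The following is a minimal sketch, not ntop code: element_address is a hypothetical helper, and the 4-byte element size assumes u_int is 32 bits wide, as it is on i686 Linux.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of ntop): the address of element idx in an
   array that starts at base and holds elements of elem_size bytes. */
uintptr_t element_address(uintptr_t base, size_t idx, size_t elem_size) {
  return base + (uintptr_t)(idx * elem_size);
}
```

With base 0x824c730 and 4-byte elements, index 1705 lands at 0x824e1d4, index 1704 at 0x824e1d0, and index 1000 at 0x824d6d0, matching the three "Cannot access memory" addresses. That is consistent with a plain out-of-range read: idx is 1705 while lastHashSize is only 32.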
The closest I've gotten to debugging it is to change hash.c like this:
hostname:~/ntop-cvs/ntop# diff -u hash.c.orig hash.c
--- hash.c.orig Mon Aug 27 13:57:40 2001
+++ hash.c Mon Aug 27 13:53:35 2001
@@ -127,15 +127,15 @@
idx, fileName, fileLine);
#endif
return(NO_PEER);
- } else if(mappings[idx] == NO_PEER) {
- traceEvent(TRACE_WARNING,
- "Mapping failed for index %d [%s:%d]",
- idx, fileName, fileLine);
- return(NO_PEER);
} else if(idx >= lastHashSize) {
traceEvent(TRACE_WARNING,
"Index %d out of range (0...%d) [%s:%d]",
idx, lastHashSize, fileName, fileLine);
+ return(NO_PEER);
+ } else if(mappings[idx] == NO_PEER) {
+ traceEvent(TRACE_WARNING,
+ "Mapping failed for index %d [%s:%d]",
+ idx, fileName, fileLine);
return(NO_PEER);
} else {
#ifdef DEBUG
But then I just get lots of "27/Aug/2001:13:55:09 [hash.c:131] Index 3949
out of range (0...542) [hash.c:541]" messages.
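The reordering in the diff above boils down to checking the index against the table size before reading the table. Here is a stripped-down sketch of that ordering, under stated assumptions: map_idx_checked and SKETCH_NO_PEER are hypothetical names, the sentinel value is made up for illustration, and the real _mapIdx also logs via traceEvent and has an earlier branch that is not shown here.

```c
#include <limits.h>

/* Assumed sentinel for this sketch only; ntop's NO_PEER may be defined differently. */
#define SKETCH_NO_PEER UINT_MAX

/* Hypothetical stand-in for the lookup in _mapIdx: the range check runs
   first, so an out-of-range idx never dereferences mappings[]. */
unsigned int map_idx_checked(const unsigned int *mappings,
                             unsigned int idx,
                             unsigned int lastHashSize) {
  if (idx >= lastHashSize)
    return SKETCH_NO_PEER;   /* out of range: do not touch memory */
  return mappings[idx];      /* safe: idx < lastHashSize */
}
```

This only turns the crash into a warning. It does not explain why indexes such as 3949 are being looked up in a table of 542 entries, which is presumably the real bug.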
I've tried increasing MAX_SUBNET_HOSTS in initialize.c from 1024 to 4096
since that seems to be larger than the number of hosts in hostsInfo.db, but
it still seg faults.
I've worked around it by increasing HASH_INITIAL_SIZE in ntop.h to 10000,
but it eventually seg faults when it tries to purge idle hosts:
27/Aug/2001:15:31:22 [hash.c:1106] Purging Idle Hosts... (ignoreIdleTime=0,
actDevice=0)
[Switching to Thread 31440]
Program received signal SIGSEGV, Segmentation fault.
0x401d5500 in freeGlobalHostPeers (el=0x85c5570, flaggedHosts=0x890e500 "")
at hash.c:844
844 if(flaggedHosts[el->securityHostPkts.closedEmptyTCPConnSent.peersIndexes[j]])
(gdb) bt
#0 0x401d5500 in freeGlobalHostPeers (el=0x85c5570, flaggedHosts=0x890e500
"") at hash.c:844
#1 0x401d6008 in purgeIdleHosts (ignoreIdleTime=0, actDevice=0) at
hash.c:1148
#2 0x401da30f in scanIdleLoop (notUsed=0x0) at ntop.c:606
#3 0x40189c9f in pthread_start_thread () from /lib/libpthread.so.0
I can keep digging, but perhaps someone more familiar with the code can
spot the problem.
This is on a Debian 2.2 system, kernel 2.2.19, libpcap 0.4a6. I used CVS to
check out the ntop code on 2001-08-24 around 23:27 CDT. I've got two
interfaces, which are monitoring two gigabit Ethernet ports on a pair of
switches used by our Auspex NFS file server. And yes, this is a
non-subnetted Class B network with over 2000 hosts at last count. Don't ask
me, I didn't design it.
And thanks, Luca!
Regards,
Owen Crow
Systems Programmer (Unix)
BMC Software, Inc.