Mailing List Archive

Re: several messages
On Sun, 14 Nov 2004, Michel Arboi wrote:

> > You might even dump the result into a big file and map the file
> > readonly
>
> Not easy, because the tree contains pointers and each cell is
> allocated individually on the heap, among other things like strings.

It can be done but it requires familiarity with the exact behaviour of
mmap() and the memory layout of the platform the process runs on.

An alternative solution is to use offsets rather than pointers (hand-made
PIC code). It adds some complexity (and extra CPU cycles) to compute the
pointers, but the benefit of having one shared copy of the data *might*
outweigh the disadvantages.
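
For illustration, a minimal sketch of the offset idea (hypothetical
types, not Nessus code): nodes in a read-only mapping refer to each
other by offsets from the mapping base, so every process can map the
same file at a different address.

/* Hypothetical sketch of an offset-based tree in a shared, read-only
 * mapping; it only illustrates "offsets instead of pointers". */
#include <stddef.h>
#include <stdint.h>

struct shared_node {
    uint32_t left;   /* offset of the left child from the mapping base, 0 = none */
    uint32_t right;  /* offset of the right child, 0 = none */
    uint32_t name;   /* offset of a NUL-terminated string in the same mapping */
};

/* Resolving an offset back into a pointer is the extra work (and the
 * extra CPU cycles) mentioned above. */
static inline const struct shared_node *
node_at(const void *base, uint32_t off)
{
    return off ? (const struct shared_node *)((const char *)base + off) : NULL;
}

static inline const char *str_at(const void *base, uint32_t off)
{
    return (const char *)base + off;
}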

> > Perhaps such warnings should be promoted to "Nessus Alerts"?
>
> Not the 1st one. But the 2nd one could be:
> "A web server was detected on this port but was missed by the 1st
> detector. Your report might be incomplete. You should re-run your scan
> with high timeouts"
> This way, we can even issue warnings on all ports, not only standard
> ports.
>
> I can fix find_service2 & its brothers.

Cool.

> > Do you need more than 3 syscalls to read the whole file: open(), read(),
> > and close()?
>
> For most scripts, no. For bigger scripts, maybe a couple of reads.

It is always possible to read the whole script with a single read()
(unless some exception interrupts it). On the other hand, you cannot
do this with stdio and it'd be a waste of precious RAM to read the whole
source text (however big) just to parse it character by character and
discard it.
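
Just to make the point concrete, a rough sketch of slurping a script
with a single read() (a hypothetical helper; error handling is trimmed
and short reads/EINTR are ignored):

#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: read the whole file with one read() call. */
char *slurp_file(const char *path, size_t *len)
{
    struct stat st;
    char *buf = NULL;
    ssize_t n = -1;
    int fd = open(path, O_RDONLY);

    if (fd < 0 || fstat(fd, &st) < 0)
        goto out;
    buf = malloc((size_t)st.st_size + 1);
    if (buf == NULL)
        goto out;
    n = read(fd, buf, (size_t)st.st_size);  /* single read(), barring interruptions */
out:
    if (fd >= 0)
        close(fd);
    if (n < 0) {
        free(buf);
        return NULL;
    }
    buf[n] = '\0';
    *len = (size_t)n;
    return buf;
}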

BTW: the excessive use of ungetc() in the lexer is weird. Glibc implements
getc() as a rather efficient macro but ungetc() is a relatively complex
function, and I suspect other libc implementations are similar.
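
A lexer can avoid ungetc() altogether by keeping its own one-character
pushback slot, e.g. something like this hypothetical sketch (the hot
path stays on the getc() macro):

#include <stdio.h>

/* Hypothetical one-character pushback kept by the lexer itself. */
typedef struct {
    FILE *fp;
    int   pushed;   /* pushed-back character, or EOF when the slot is empty */
} lex_input;

static inline int lex_getc(lex_input *in)
{
    if (in->pushed != EOF) {
        int c = in->pushed;
        in->pushed = EOF;
        return c;
    }
    return getc(in->fp);   /* usually a cheap macro */
}

static inline void lex_ungetc(lex_input *in, int c)
{
    in->pushed = c;        /* only one character of lookahead */
}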

> >> If anybody finds a way to benchmark such a thing without writing a
> >> new interpreter, that would be great.

> We already have this in exec.c (it prints the result of getresources)

You mean getrusage(), right? If it is implemented by sampling (it is on
Linux), it can yield, well, less precise results when the process in
question does lots of I/O and there are many context switches (and this
happens with multiple nessusd processes running simultaneously). On the
other hand, other statistics, namely the minor and major page fault counts,
are precise and can be very useful.
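
For reference, reading those counters looks like this (getrusage() is
standard; only the printing helper is made up, and the fault counters
are the ones mentioned above):

#include <stdio.h>
#include <sys/resource.h>

/* Print CPU time (sampled on Linux) and the precise page fault counters. */
void print_self_rusage(void)
{
    struct rusage ru;

    if (getrusage(RUSAGE_SELF, &ru) == 0)
        printf("user %ld.%06ld s, sys %ld.%06ld s, minflt %ld, majflt %ld\n",
               (long)ru.ru_utime.tv_sec, (long)ru.ru_utime.tv_usec,
               (long)ru.ru_stime.tv_sec, (long)ru.ru_stime.tv_usec,
               ru.ru_minflt, ru.ru_majflt);
}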

Moreover, finer profiling is able to identify "hot spots" where the
investment into optimization yields the best results.

BTW, have you tried compiling libnasl and libnessus with profiling enabled
(gcc -pg)?

> What I need is a way to estimate the speed gain between the current
> interpreter and a VM based interpreter without rewriting the whole
> interpreter with the VM. I.e., I don't want to do it and discover that
> it wasn't worth the effort.

You yourself say that memory is more limiting than CPU power.
I think it'd be a standard case of premature optimization to rewrite the
whole interpreter now. Let's identify the bottlenecks first.

> > BTW: One obvious optimization: rewrite recv_line() to read input in bigger
> > chunks rather than sucking it by single characters.
>
> This is done indirectly by buffered network IO. I considered it an
> experimental feature, so it is enabled only by http_open_socket.

I see.

> Should we generalize it?

Probably (although imho a 64 kB buffer is a bit too big for most
protocols).
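
A rough sketch of what a buffered recv_line() could look like (the conn
structure and the buffer size are illustrative, not the actual Nessus
API): recv() pulls a chunk into a per-connection buffer and lines are
carved out of it instead of issuing one recv() per character.

#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

#define RECV_BUF_SZ 8192   /* e.g. 8 kB instead of 64 kB */

struct conn {
    int    fd;
    char   buf[RECV_BUF_SZ];
    size_t len;             /* bytes currently buffered */
};

/* Copy one '\n'-terminated line into out; return its length, or -1 on
 * EOF, error, or an overlong line. */
ssize_t recv_line_buffered(struct conn *c, char *out, size_t outsz)
{
    for (;;) {
        char *nl = memchr(c->buf, '\n', c->len);
        if (nl != NULL) {
            size_t line = (size_t)(nl - c->buf) + 1;
            size_t n = line < outsz ? line : outsz - 1;
            memcpy(out, c->buf, n);
            out[n] = '\0';
            memmove(c->buf, c->buf + line, c->len - line);
            c->len -= line;
            return (ssize_t)n;
        }
        if (c->len == sizeof(c->buf))
            return -1;                         /* line longer than the buffer */
        ssize_t r = recv(c->fd, c->buf + c->len, sizeof(c->buf) - c->len, 0);
        if (r <= 0)
            return -1;                         /* connection closed or error */
        c->len += (size_t)r;
    }
}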


On Sun, 14 Nov 2004, Renaud Deraison wrote:

> I want (1): run more processes at the same time. If we can lower the CPU
> usage by 50%, assuming that the network connection is not the
> bottleneck, [...]

...and RAM is not the bottleneck either (swapping is slow enough to make
most CPU optimizations irrelevant). ;)

> > > The problem of the original HTTP caching mechanism [...]
> I don't have exact figures - it worked fine for me but Michel attempted to
> strangle me because it brought his laptop to its knees during an
> audit, so I disabled it for now and I'm thinking of a better approach.

This looks more like a "phase change" rather than "graceful degradation".
I'd expect the latter behaviour from increased fork() overhead;
the former one suggests thrashing caused by excessive swapping or
something similar.


On Sun, 14 Nov 2004, Michel Arboi wrote:

> On Sun Nov 14 2004 at 12:36, Renaud Deraison wrote:
> > No, but the kernel has to actually read the file from disk - and this is
> > slow.
>
> Unless we have enough memory and the scripts are kept in the file
> system cache. The scripts directory weighs 27 MB, which is not that big.

Right.

On the other hand, a traditional Unix-like filesystem is not the most
efficient storage for lots of small files. I get 4982 files occupying 22
MB on ext3 with 4 kB blocks (all-2.0.tar.gz from Nov 8, I haven't got a
newer version handy). The directory itself is 163 kB big. The gunzipped
tar is less than 16 MB (~2/3 of 22 MB).

I wonder what made the difference in Renaud's experiment with the plugin
caching server?

> When I had the plugin-server running (one process loading all the
> plugins in memory and "handing them out" to other processes), the system
> load was lower (fewer open()/read()/close() calls) but the results were
> not entirely there yet (and these changes bring a lot of complexity to
> the behavior of Nessus, so I decided to remove them). However, this

After all, there had to be quite a lot of overhead with reads, writes, and
context switches. Was it the 6+ MB of RAM saved in comparison with the
native buffer cache? Or the open() overhead (big directories can be slow)?
Would it help to work with a single big file instead of thousands of tiny
files?
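
To make the last question concrete, one hypothetical layout for such a
big file: a small index mapping every script name to an (offset, length)
pair, so fetching a plugin costs one lookup and one pread() instead of
an open()/read()/close() in a huge directory. Nothing like this exists
in Nessus; it is just a sketch of the idea.

#include <stdint.h>

/* Hypothetical pack file: header, then 'count' index entries, then the
 * script bodies back to back. */
struct pack_header {
    char     magic[8];     /* e.g. "NSLPACK\0" */
    uint64_t count;        /* number of index entries that follow */
};

struct pack_entry {
    char     name[64];     /* script file name, NUL-padded */
    uint64_t offset;       /* byte offset of the script body in the pack file */
    uint64_t length;       /* length of the script body in bytes */
};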


On Wed, 17 Nov 2004, Don Kitchen wrote:

> Even after the port scan of a system is finished, yes, it can take quite
> a bit of time to get through all the scripts. When a lot of systems are
> being scanned simultaneously it can put a great load on the CPU.

BTW, I got the impression there is a lot of disk I/O (perhaps logging?)
when the daemon "gets through all the scripts" even without running any of
them.

> (Another option that comes to mind is a firewall/no firewall detected/
> known no firewall result, possibly even set by the user if they wish. It's
> a waste of time to be told about "holes in the firewall" ala port 53 etc,
> when I'm scanning a LAN and not going through a firewall.)

You can always encounter a filter on the target machine itself (we live in
the era of personal firewalls, XP Service Pack 2, and popular magazines
calling ICMP Echo a "dangerous hacker tool"... even clueless lusers filter
network traffic on their PCs).


On Thu, 18 Nov 2004, Michel Arboi wrote:

> This means that we would have to put version numbers in the KB, and
> maybe add a new kind of optimization, something like script_require_version.
> I'm afraid this is not a simple modification.

You can add multiple keys for "at least version X.Y.Z is installed".
E.g. 1.15 would have one key for 1.14 and one for 1.15 (assuming both
versions are interesting milestones; see below).

Advantages:
- works without any new code,
- makes it easy to accommodate any weird version numbering.

Disadvantages:
- eats more memory, and perhaps more CPU during dependency checking,
- some cooperation is needed not to add keys for every version but only
for interesting ones.


On Thu, 18 Nov 2004, Renaud Deraison wrote:

> It would make more sense to implement one of the two following :
>
> - When in command-line mode, get_kb_item() prompts the user for
> a value ;
>
> - When in command-line mode, the user can 'import' a KB when executing
> the script ;

I have implemented it and use it to replay plugins that have returned
dubious results during a regular test (via nessusd). It works fine. Well,
it gets complicated when something in the saved KB file has to be changed
manually, but it is still doable.

An interactive mode could make it more user-friendly but I am afraid the
interactivity would become pretty annoying when you need to run the script
multiple times, or when you want to reproduce the exact conditions of the
regular test. Moreover, one would have to make sure get_kb_item() is
not called during time-critical operations (i.e. while the host on the
other end of the wire is waiting for a message from us).

Anyway, I agree it is highly undesirable to have plugins whose standard
behaviour is not reproducible in command-line mode (in fact, this is the
exact reason why I bothered to patch nasl to read a KB file).


--Pavel Kankovsky aka Peak [ Boycott Microsoft--http://www.vcnet.com/bms ]
"Resistance is futile. Open your source code and prepare for assimilation."
Re: Re: several messages
On Sat Nov 20 2004 at 02:35, Pavel Kankovsky wrote:

> It can be done but it requires familiarity with the exact behaviour of
> mmap() and the memory layout of the platform the process runs on.

Is it portable?

> An alternative solution is to use offsets rather than pointers (hand-made
> PIC code). It adds some complexity (and extra CPU cycles) to compute the
> pointers, but the benefit of having one shared copy of the data *might*
> outweigh the disadvantages.

I played with oprofile & gprof but the results look strange. It seems
that the NASL interpreter does not eat that much CPU.
Anyway, I removed my "include cache", as the performance gain is tiny
-- if there is really any gain. Bison is really a good beast.

>> I can fix find_service2 & its brothers.
> Cool.

It's done, but only displayed with "verbose" reports.

> It is always possible to read the whole script with a single read()
> (unless some exception interrupts it). On the other hand, you cannot
> do this with stdio and it'd be a waste of precious RAM to read the whole
> source text (however big) just to parse it character by character and
> discard it.

Yes, and it would probably lose the benefit of the filesystem read-ahead.

> BTW: the excessive use of ungetc() in the lexer is weird.

I had to keep a buffer. As stdio provides one, I used it :)

> You mean getrusage(), right?

Yes

> BTW, have you tried compiling libnasl and libnessus with profiling enabled
> (gcc -pg)?

Yes. But the results are strange and I prefer to get more reliable
results rather than publish something wrong.

> You yourself say that memory is more limiting than CPU power.

Because I use old laptops and scan a couple of machines at a time :)

> I think it'd be a standard case of premature optimization to rewrite the
> whole interpreter now. Let's identify the bottlenecks first.

I think so. My (unreliable?) profiling results say so.

> Probably (although imho a 64 kB buffer is a bit too big for most
> protocols).

8 KB?

> the former one suggests thrashing caused by excessive swapping or
> something similar.

Definitely.

> BTW, I got the impression there is a lot of disk I/O (perhaps
> logging?) when the daemon "gets through all the scripts" even without
> running any of them.

Maybe. Buffering logs might be an answer -- but we need a logging
daemon to do that.

> You can add multiple keys for "at least version X.Y.Z is installed".
> E.g. 1.15 would have one key for 1.14 and one for 1.15 (assuming both
> versions are interesting milestones; see below).
> Advantages:
> - works without any new code,
> - makes it easy to accommodate any weird version numbering.

I see another problem. Let's say we have scripts that test flaws in
Apache/2.0.47 and we identify an Apache/2.0.50 on the target. So we
set: AtLeast/Apache/2.0.48 (and maybe AtLeast/Apache/2.0.50 too)
and our scripts use:
script_exclude_keys("AtLeast/Apache/2.0.48");
So far so good.
But if there is a 2nd vulnerable Apache running on another port, we
miss it.

That's why I implemented optimization_level: level 2 ignores
script_exclude_keys instructions, if you want to be sure that you do
not forget anything.

If we build the system the other way around, it will work: adding a key
that says that we have _at most_ version 2.0.50 installed. In my example,
we would have two keys, one for the vulnerable server and one for the
patched server, and our scripts would call
script_require_keys("AtMost/Apache/2.0.47",
                    "AtMost/Apache/2.0.40");  # for example...

But the list of dependencies could grow too big :-\

Note that this kind of problem can hardly happen with IIS, because it
is installed only once. But Apache is bundled with many products: you may
install your own Apache and also have IBM_HTTP_SERVER and the Oracle web
server... So maybe we could use your system, but restrict it to
software like IIS that is (nearly) part of the operating system.

--
arboi@alussinan.org http://arboi.da.ru
NASL2 reference manual http://michel.arboi.free.fr/nasl2ref/
Re: Re: several messages
Michel Arboi wrote:

> > You can add multiple keys for "at least version X.Y.Z is installed".
> > E.g. 1.15 would have one key for 1.14 and one for 1.15 (assuming both
> > versions are interesting milestones; see below).
> > Advantages:
> > - works without any new code,
> > - makes it easy to accommodate any weird version numbering.

> I see another problem. Let's say we have scripts that test flaws in
> Apache/2.0.47 and we identify an Apache/2.0.50 on the target. So we
> set: AtLeast/Apache/2.0.48 (and maybe AtLeast/Apache/2.0.50 too)
> and our scripts use:
> script_exclude_keys("AtLeast/Apache/2.0.48");
> So far so good.
> But if there is a 2nd vulnerable Apache running on another port, we
> miss it.

> That's why I implemented optimization_level: level 2 ignores
> script_exclude_keys instructions, if you want to be sure that you do
> not forget anything.

I agree that it's possible for there to be multiple versions of the same web
server software on a system.

But I submit that the real problem, then, is that either the KB should
provide the correct key to the correct plugin (i.e. matched to the port the
plugin is working on), or the banner collector (which is the part that
figures out what versions are installed) should detect that multiple web
servers are present and set its keys accordingly (for example, not set
AtLeast keys that don't apply to both servers).

Thanks
