Mailing List Archive

clamd cache (was Re: clamscan --disable-cache)
Dave Sill via clamav-users <clamav-users@lists.clamav.net> wrote:
>
> > >Skipping multiple copies of the same file won't really help because
> > >the duplication is across systems, and because every file will be
> > >rescanned every time clamscan is run.
> >
> > That's not true of clamdscan.
>
> Hmm...that's promising. I'll give it a try.

Unfortunately, it looks like the cache is too small to help.

I ran clamdscan twice on my /home (69k files) and got:

# clamdscan --fdpass /home
/home/de5/eicar.tar.gz: Eicar-Signature FOUND
WARNING: /home/de5/.cisco/hostscan/.libcsd.ipc: Not supported file type
WARNING: Directory recursion limit reached
WARNING: Directory recursion limit reached
WARNING: Directory recursion limit reached
WARNING: Directory recursion limit reached
WARNING: Directory recursion limit reached
WARNING: Directory recursion limit reached
WARNING: Directory recursion limit reached
WARNING: Directory recursion limit reached
WARNING: Directory recursion limit reached
WARNING: Directory recursion limit reached
WARNING: Directory recursion limit reached
WARNING: Directory recursion limit reached
WARNING: Directory recursion limit reached
WARNING: Directory recursion limit reached
/home/s4i/eicar.tar.gz: Eicar-Signature FOUND

----------- SCAN SUMMARY -----------
Infected files: 2
Total errors: 1
Time: 1428.433 sec (23 m 48 s)
# clamdscan --fdpass /home
/home/de5/eicar.tar.gz: Eicar-Signature FOUND
WARNING: /home/de5/.cisco/hostscan/.libcsd.ipc: Not supported file type
WARNING: Directory recursion limit reached
WARNING: Directory recursion limit reached
WARNING: Directory recursion limit reached
WARNING: Directory recursion limit reached
WARNING: Directory recursion limit reached
WARNING: Directory recursion limit reached
WARNING: Directory recursion limit reached
WARNING: Directory recursion limit reached
WARNING: Directory recursion limit reached
WARNING: Directory recursion limit reached
WARNING: Directory recursion limit reached
WARNING: Directory recursion limit reached
WARNING: Directory recursion limit reached
WARNING: Directory recursion limit reached
/home/s4i/eicar.tar.gz: Eicar-Signature FOUND

----------- SCAN SUMMARY -----------
Infected files: 2
Total errors: 1
Time: 1355.057 sec (22 m 35 s)
#

But on /boot, with 342 files:

# clamdscan --fdpass /boot
/boot: OK

----------- SCAN SUMMARY -----------
Infected files: 0
Time: 21.186 sec (0 m 21 s)
# clamdscan --fdpass /boot
/boot: OK

----------- SCAN SUMMARY -----------
Infected files: 0
Time: 0.362 sec (0 m 0 s)
#

I don't see that the cache size is run-time configurable. Is that right?

-Dave

_______________________________________________

clamav-users mailing list
clamav-users@lists.clamav.net
https://lists.clamav.net/mailman/listinfo/clamav-users


Help us build a comprehensive ClamAV guide:
https://github.com/vrtadmin/clamav-faq

http://www.clamav.net/contact.html#ml
Re: clamd cache (was Re: clamscan --disable-cache)
Hi there,

On Wed, 30 Sep 2020, Dave Sill via clamav-users wrote:

> Unfortunately, it looks like the cache is too small to help.
>
> I ran clamdscan twice on my /home (69k files) and got:
>
> # clamdscan --fdpass /home
> /home/de5/eicar.tar.gz: Eicar-Signature FOUND
> WARNING: /home/de5/.cisco/hostscan/.libcsd.ipc: Not supported file type
> WARNING: Directory recursion limit reached
> WARNING: Directory recursion limit reached

All this means is that the scanner won't recurse deeper than fifteen
directories when it does a recursive directory scan (that's the default,
but it's configurable), and you have a directory tree under /home that's
deeper than that. See 'MaxDirectoryRecursion' in the clamd.conf man page.
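
To see how deep a tree actually goes before touching the limit, a rough sketch like the one below may help. It assumes GNU find (for -printf), and max_depth is a made-up helper name, not a ClamAV tool. Whether clamd counts levels exactly the way find does has not been checked here, so treat the result as an estimate.

```shell
# Hypothetical helper: print the deepest directory nesting below a root.
# Assumes GNU find; %d prints each directory's depth, with the root at 0.
max_depth() {
    find "$1" -type d -printf '%d\n' 2>/dev/null | sort -n | tail -n 1
}
```

For example, `max_depth /home` prints a single number. If it exceeds the limit, a line such as "MaxDirectoryRecursion 20" in clamd.conf (20 is only an example value), followed by a clamd restart, should quiet the warnings.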

There are quite a few limits of this kind to try to prevent you from
DOSing yourself; it's worth perusing the man pages.

--

73,
Ged.

Re: clamd cache (was Re: clamscan --disable-cache)
It looks like my point was lost in the noise so I'll try to distill it.

I ran clamdscan twice on my /home (69k files) and got:

# clamdscan --fdpass /home
...
Time: 1428.433 sec (23 m 48 s)
# clamdscan --fdpass /home
...
Time: 1355.057 sec (22 m 35 s)
#

The cache only saved a little over a minute on a 24 minute scan.

But on /boot, with 342 files:

# clamdscan --fdpass /boot
...
Time: 21.186 sec (0 m 21 s)
# clamdscan --fdpass /boot
...
Time: 0.362 sec (0 m 0 s)
#

Here, on a much smaller scan, the cache made a huge difference. That
tells me that the cache isn't large enough to significantly speed up
large scans.
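
To put numbers on that comparison, the Time: lines can be turned into a speedup factor. This is only a small sketch: scan_seconds and speedup are hypothetical helper names, and it assumes the summary format shown above.

```shell
# Hypothetical helper: pull the seconds value out of a clamdscan
# summary read from stdin (the line that starts with "Time:").
scan_seconds() {
    awk '/^Time:/ { print $2 }'
}

# Hypothetical helper: ratio of first-run time to second-run time.
speedup() {
    awk -v first="$1" -v second="$2" 'BEGIN { printf "%.2f\n", first / second }'
}
```

With the figures above, `speedup 1428.433 1355.057` prints 1.05 for /home, while `speedup 21.186 0.362` prints 58.52 for /boot.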

I don't see that the cache size is run-time configurable. Is that right?

-Dave

Re: clamd cache (was Re: clamscan --disable-cache)
Hi there,

On Thu, 1 Oct 2020, Dave Sill via clamav-users wrote:

> It looks like my point was lost in the noise ...

Sorry, I guess it was late and I was in a hurry to get to bed. :(

> The cache only saved a little over a minute on a 24 minute scan.

I tried something similar here on a directory with only 4k files:

----------- SCAN SUMMARY -----------
Infected files: 63
Time: 831.193 sec (13 m 51 s)
Start Date: 2020:10:01 13:05:29
End Date: 2020:10:01 13:19:21

----------- SCAN SUMMARY -----------
Infected files: 63
Time: 55.386 sec (0 m 55 s)
Start Date: 2020:10:01 13:33:15
End Date: 2020:10:01 13:34:10

The infected files were expected. Maybe some more experimentation is
called for. I'm running something with more and much larger files as
I write.

> ... on a much smaller scan, the cache made a huge difference. That
> tells me that the cache isn't large enough to significantly speed up
> large scans.

It might be too soon to draw that conclusion. It's possible that the
daemon reloaded its database during your test, and I'd expect that to
cause any cached results to be discarded for obvious reasons.

> I don't see that the cache size is run-time configurable. Is that right?

Correct, but I'd thought its size would be limited only by the RAM you
have free. If you look at the code in libclamav/cache.c you can see
that struct cache_set is just a few pointers, and if you only have 69k
files under your home directory I wouldn't expect storage of that many
sets of pointers to be an issue.

I'll dig into this a bit more when I have chance if somebody doesn't
beat me to it.

--

73,
Ged.

Re: clamd cache (was Re: clamscan --disable-cache)
"G.W. Haywood via clamav-users" <clamav-users@lists.clamav.net> wrote:
> Hi there,
>
> On Thu, 1 Oct 2020, Dave Sill via clamav-users wrote:
>
> >It looks like my point was lost in the noise ...
>
> Sorry, I guess it was late and I was in a hurry to get to bed. :(

No worries. Thanks for your help.

> >... on a much smaller scan, the cache made a huge difference. That
> >tells me that the cache isn't large enough to significantly speed up
> >large scans.
>
> It might be too soon to draw that conclusion. It's possible that the
> daemon reloaded its database during your test, and I'd expect that to
> cause any cached results to be discarded for obvious reasons.

Fair enough. I re-ran the same scan three times after rebooting and got
the following run times:

20:46
19:37
19:18

And the clamd logs show "SelfCheck: Database status OK" every 10 minutes
but no DB updates.
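
A log check along these lines can be scripted. The sketch below assumes clamd logs to a file and relies only on the "SelfCheck: Database status OK" text quoted above: any other SelfCheck line is counted as worth a closer look. The function name is made up, and the log path in the usage note is a placeholder.

```shell
# Hypothetical helper: count SelfCheck log lines that do NOT report
# "Database status OK". A non-zero count suggests the database may have
# changed (and been reloaded) during the test window.
count_unusual_selfchecks() {
    grep 'SelfCheck:' "$1" | grep -v -c 'Database status OK' || true
}
```

For example, `count_unusual_selfchecks /path/to/clamd.log` would print 0 for a log like the one described here.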

> >I don't see that the cache size is run-time configurable. Is that right?
>
> Correct, but I'd thought its size would be limited only by the RAM you
> have free. If you look at the code in libclamav/cache.c you can see
> that struct cache_set is just a few pointers, and if you only have 69k
> files under your home directory I wouldn't expect storage of that many
> sets of pointers to be an issue.
>
> I'll dig into this a bit more when I have chance if somebody doesn't
> beat me to it.

I only have 16 GB RAM on this system but it still shows 1 GB free.
Maybe it limits itself somehow.

-Dave

Re: clamd cache (was Re: clamscan --disable-cache)
I'm not intimately familiar with how the scan cache works. I believe the cache size isn't limited at all. Adding a scan result to the cache should never fail, unless the system runs out of memory. I also would expect your first clamDscan scan to be slower than all subsequent scans of the same files up until a clamd database reload clears the cache.

A database check without a reload shouldn't clear the cache. If it does, that would certainly be a bug.

I wish I had more time to dig in and investigate to help further but I'm a little overwhelmed with other tasks at present. I'll try to keep an eye on the thread to see if something comes up that I can be of help with.

-Micah


Re: clamd cache (was Re: clamscan --disable-cache)
Hi there,

On Thu, 1 Oct 2020, Dave Sill via clamav-users wrote:
> "G.W. Haywood via clamav-users" <clamav-users@lists.clamav.net> wrote:
>
>> It might be too soon to draw that conclusion. It's possible that the
>> daemon reloaded its database during your test, and I'd expect that to
>> cause any cached results to be discarded for obvious reasons.
>
> Fair enough. I re-ran the same scan three times after rebooting and got
> the following run times:
>
> 20:46
> 19:37
> 19:18
>
> And the clamd logs show "SelfCheck: Database status OK" every 10 minutes
> but no DB updates.
> ...
> I only have 16 GB RAM on this system ...

Only 4GB on my clamd server.

$ du -sh images/
16G images/
$ find ./images -type f | wc -l
11586
$ clamdscan images/
...
Time: 12547.333 sec (209 m 7 s)
...
$ clamdscan images/
...
Time: 1477.782 sec (24 m 37 s)

Trying a bigger directory, this is going to take a while...

$ find ./Personal -type f | wc -l
144191


--

73,
Ged.

Re: clamd cache (was Re: clamscan --disable-cache)
"G.W. Haywood via clamav-users" <clamav-users@lists.clamav.net> wrote:
>
> Only 4GB on my clamd server.
>
> $ du -sh images/
> 16G images/
> $ find ./images -type f | wc -l
> 11586
> $ clamdscan images/
> ...
> Time: 12547.333 sec (209 m 7 s)
> ...
> $ clamdscan images/
> ...
> Time: 1477.782 sec (24 m 37 s)

That's a nice boost.

> Trying a bigger directory, this is going to take a while...
>
> $ find ./Personal -type f | wc -l
> 144191

I'm trying some tests on a desktop system now, in case the problem is
system-specific.

-Dave

Re: clamd cache (was Re: clamscan --disable-cache)
On the desktop system:

$ find Mail -type f|wc -l
123719

# clamdscan --fdpass ~de5/Mail
Time: 2137.531 sec (35 m 37 s)

# clamdscan --fdpass ~de5/Mail
Time: 2138.778 sec (35 m 38 s)

So, still not seeing a benefit from the cache.

Both of my test systems are RHEL 7, so off to try another platform.

-Dave

Re: clamd cache (was Re: clamscan --disable-cache)
Dave Sill via clamav-users <clamav-users@lists.clamav.net> wrote:
>
> Both of my test systems are RHEL 7, so off to try another platform.

On Fedora 32:

# find ~dave/Mail -type f|wc -l
26671

# clamdscan --fdpass ~dave/Mail
Time: 932.395 sec (15 m 32 s)

# clamdscan --fdpass ~dave/Mail
Time: 489.627 sec (8 m 9 s)

So that's an improvement. Less than I was hoping for, though.

-Dave

Re: clamd cache (was Re: clamscan --disable-cache)
Hi there,

On Fri, 2 Oct 2020, G.W. Haywood wrote:

> Trying a bigger directory, this is going to take a while...

Doesn't look like telling us anything this side of Christmas so I've
killed the process. Time to think a bit harder. Stay tuned.

--

73,
Ged.

Re: clamd cache (was Re: clamscan --disable-cache)
Hello again,

On Sat, 3 Oct 2020, G.W. Haywood via clamav-users wrote:

> Stay tuned.

Perhaps try enabling libclamav debug logging.

During your scans I suspect that ClamAV may be reaching some limit(s)
which is causing caching to be disabled. The limits are mostly
tunable (in some cases perhaps care may be needed to avoid a DOS).

In libclamav/scanners.c look for the function

static void emax_reached(cli_ctx *ctx)

and for calls to it.

There will at least be debug messages when some of the calls to that
function occur, but perhaps not all, which might be an issue to be
addressed - along with some other logging issues that have already been
identified in other discussions. It will be very easy to include some
extra logging if need be.

--

73,
Ged.

Re: clamd cache (was Re: clamscan --disable-cache)
"G.W. Haywood via clamav-users" <clamav-users@lists.clamav.net> wrote:
>
> Perhaps try enabling libclamav debug logging.

I poked around a bit and didn't see an obvious way to do that, like a
configure option or a .h file. Couldn't really tell where it would be
logging.

> During your scans I suspect that ClamAV may be reaching some limit(s)
> which is causing caching to be disabled. The limits are mostly
> tunable (in some cases perhaps care may be needed to avoid a DOS).

Played around with MemoryLimit=infinite in the systemd unit file and
didn't see that it helped. One thing that did help was the -m/--multiscan
option to clamdscan. Depending on the number of cores, I've seen scans
2-5x faster.

Interestingly, I just ran some more tests on my Fedora 32 desktop, and I
see caching working great:

# clamdscan --fdpass ~dave/Mail
Time: 932.016 sec (15 m 32 s)

# clamdscan -m --fdpass ~dave/Mail
Time: 90.140 sec (1 m 30 s)

# clamdscan -m --fdpass ~dave/Mail
Time: 12.845 sec (0 m 12 s)

# clamdscan --fdpass ~dave/Mail
Time: 9.975 sec (0 m 9 s)

# clamdscan -m --fdpass ~dave/Mail
Time: 3.102 sec (0 m 3 s)

# find ~dave/Mail -type f |wc -l
26675
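
Back-to-back timings like these can be collected with a small wrapper. This is a sketch only: repeat_timed is a hypothetical helper, it measures whole seconds with date, and it discards the command's output and exit status (clamdscan exits non-zero when it finds something).

```shell
# Hypothetical helper: run a command N times and print the wall-clock
# seconds each run took. Usage: repeat_timed N command [args...]
repeat_timed() {
    runs=$1
    shift
    i=1
    while [ "$i" -le "$runs" ]; do
        start=$(date +%s)
        "$@" >/dev/null 2>&1 || true   # ignore output and exit status
        end=$(date +%s)
        echo "run $i: $((end - start)) s"
        i=$((i + 1))
    done
}
```

For example, `repeat_timed 3 clamdscan -m --fdpass ~dave/Mail` would print three lines, one per run, so a cold first run and cached later runs can be compared directly.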

So, time to test RHEL 8 and Ubuntu.

Thanks, Ged!

-Dave

Re: clamd cache (was Re: clamscan --disable-cache)
Hi there,

On Wed, 7 Oct 2020, Dave Sill via clamav-users wrote:

> "G.W. Haywood via clamav-users" <clamav-users@lists.clamav.net> wrote:
>>
>> Perhaps try enabling libclamav debug logging.
>
> I poked around a bit and didn't see an obvious way to do that ...

You just need a line

Debug yes

in clamd.conf. There might be a "Debug no" in there already. Until
just now I've never wondered what would happen if you had both of them
in the config file; I don't think you'll want to try it. :/

> Couldn't really tell where it would be logging.

It logs to the same places that clamd logs, i.e. syslog or a file
depending on LogFile and LogSyslog in clamd.conf. The man page entry
for LogFile just says "STRING"; it might be a bit clearer if it said
/path/to/file or something like that.
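
A quick way to see which of these settings are present in a config file, and whether any of them appears more than once, is sketched below. show_log_settings is a made-up helper name, and the config path in the usage note is a placeholder since it differs between distributions.

```shell
# Hypothetical helper: list every uncommented Debug, LogFile and
# LogSyslog line in a clamd.conf-style file, with line numbers, so
# that conflicting duplicates (e.g. "Debug no" and "Debug yes") show up.
show_log_settings() {
    grep -E -n '^[[:space:]]*(Debug|LogFile|LogSyslog)[[:space:]]' "$1"
}
```

For example, `show_log_settings /path/to/clamd.conf` prints one line per matching directive. If Debug shows up twice, removing one of the lines avoids having to find out which one wins.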

> One thing that did help was the -m/--multiscan option to clamdscan.
> Depending on the number of cores, I've seen scans 2-5x faster.

To be expected if you use more CPU power of course. Caching should
give bigger gains, and be additional to what you get from CPU power.

> ... on my Fedora 32 desktop, and I see caching working great:

Ah, getting somewhere now!

> So, time to test RHEL 8 and Ubuntu.
>
> Thanks, Ged!

Please keep us posted - and you're very welcome. :)

--

73,
Ged.
