Mailing List Archive

[clamav-users] Squid + ClamAV
Hello.

I'm trying the combination Squid + C-ICAP + SquidClamAV + ClamAV, and
I'm seeing terrible performance.
It seems there's no SquidClamAV specific mailing list and asking on
generic Squid list did not help much.
Perhaps someone here is using the same thing or knows how to better
tweak the engine.



The whole thing is working, but page loading times vary a lot:
sometimes pages load as fast as without virus scanning, but often (the
same pages) take several seconds to display (with ClamAV eating a
lot of CPU).
I tried to see what is being scanned, but since SquidClamAV uses inline
connections, clamdtop seems to be of no help.

So I'm looking for suggestions on how to fine-tune ClamAV (and/or
SquidClamAV) for this specific use case: whitelists, blacklists, which
rulesets to use or avoid, and so on.

Has anyone already looked at this?

bye & Thanks
av.

_______________________________________________

clamav-users mailing list
clamav-users@lists.clamav.net
https://lists.clamav.net/mailman/listinfo/clamav-users


Help us build a comprehensive ClamAV guide:
https://github.com/vrtadmin/clamav-faq

http://www.clamav.net/contact.html#ml
Re: [clamav-users] Squid + ClamAV
On Wednesday, 1 April 2020 15:47:09 CEST, Andrea Venturoli via clamav-users
wrote:
> Hello.
>
> I'm trying the combination Squid + C-ICAP + SquidClamAV + ClamAV, and
> I'm seeing terrible performance.
> It seems there's no SquidClamAV specific mailing list and asking on
> generic Squid list did not help much.
> Perhaps someone here is using the same thing or knows how to better
> tweak the engine.

Hello,

A few years ago I used squid + c-icap + clamav (without squidclamav), and it
worked fine. I'm not sure why I stopped using it; maybe it broke on a server
upgrade or something. (And I had good antivirus on the clients anyway.)

> The whole thing is working, but page loading times vary a lot:
> sometimes they'll load as fast as without virus scanning, but often (the
> same pages) will take several seconds to display (with ClamAV eating a
> lot of CPU).
> I tried to see what is being scanned, but since SquidClamAV uses inline
> connections, clamdtop seems to be helpless.

Are you running clamav as a daemon? Is c-icap using the daemon socket (as
clamdscan does)? If not, it might be spawning clamscan for every downloaded
page, and clamav startup takes a very long time (parsing all the rules).

Also check that you have enough memory: both clamav and squid can eat a lot,
so make sure you are not swapping.
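
One way to confirm that clamd answers on its socket, and to see how long a
single scan takes outside Squid, is to talk to the daemon directly. Below is
a minimal Python sketch using clamd's documented PING and INSTREAM commands.
The socket path is an assumption (use the LocalSocket value from your own
clamd.conf), and the helper names are made up for illustration.

```python
import socket
import struct
import time

# Assumption: adjust to the LocalSocket value in your clamd.conf.
CLAMD_SOCKET = "/var/run/clamav/clamd.sock"


def instream_frames(data: bytes, chunk_size: int = 8192) -> bytes:
    """Encode data for clamd's INSTREAM command.

    Each chunk is sent as a 4-byte big-endian length followed by the
    chunk itself; a zero-length chunk terminates the stream.
    """
    out = bytearray(b"zINSTREAM\0")
    for i in range(0, len(data), chunk_size):
        chunk = data[i:i + chunk_size]
        out += struct.pack("!I", len(chunk)) + chunk
    out += struct.pack("!I", 0)
    return bytes(out)


def clamd_request(payload: bytes, path: str = CLAMD_SOCKET) -> str:
    """Send one request to clamd over its Unix socket and return the reply."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as s:
        s.connect(path)
        s.sendall(payload)
        reply = bytearray()
        while True:
            part = s.recv(4096)
            if not part:
                break
            reply += part
            # Replies to "z"-prefixed commands are NUL-terminated.
            if reply.endswith(b"\0"):
                break
        return reply.rstrip(b"\0").decode("utf-8", "replace")


def timed_scan(data: bytes, path: str = CLAMD_SOCKET):
    """Return (reply, seconds) for scanning data via INSTREAM."""
    start = time.monotonic()
    reply = clamd_request(instream_frames(data), path)
    return reply, time.monotonic() - start
```

With a running daemon, clamd_request(b"zPING\0") should come back with PONG,
and timed_scan() on a saved copy of a slow page would show whether the delay
is in clamd itself or somewhere in the layers above it.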

Best Regards
Vladislav Kurz

Re: [clamav-users] Squid + ClamAV
On 2020-04-01 16:08, Vladislav Kurz via clamav-users wrote:

> Hello,

Thanks for your answer.



> Are you running clamav as daemon?

Sure.



> Is c-icap using the daemon socket (as if running clamdscan)?

AFAIK it does.
I've got this in its config:
> # Path to the clamd socket, use clamd_local if you use Unix socket or if clamd
> # is listening on an Inet socket, comment clamd_local and set the clamd_ip and
> # clamd_port to the corresponding value.
> clamd_local /var/run/clamav/clamd.sock




> If not it might be spawning clamscan for every downloaded

I see "clamd" using a lot of CPU when I surf the web, not a clamscan process.




> Also check if you have enough memory both clamav and squid can eat a lot, so
> check if you are not swapping.

The systems where I'm trying this have from 4 to 64 GiB of RAM, which should
be enough. The one I'm looking at now has 16 GiB and 1% swap in use (which is
the minimum I've ever seen, basically meaning no swapping).



bye & Thanks
av.

Re: [clamav-users] Squid + ClamAV
Hi there,

On Wed, 1 Apr 2020, Andrea Venturoli via clamav-users wrote:

> I'm trying the combination Squid + C-ICAP + SquidClamAV + ClamAV, and I'm
> seeing terrible performance.
> ...
> Perhaps someone here is using the same thing or knows how to better
> tweak the engine.

I'm not surprised that the performance is terrible. :/

To me it sounds like this will not be a quick tweak but a project, and
a lot of work, but it might prove to be a valuable contribution to the
security of a large number of users.

> ... page loading times vary a lot: sometimes they'll load as fast
> as without virus scanning, but often (the same pages) will take
> several seconds to display (with ClamAV eating a lot of CPU).

Still no surprises.

> So I'm looking for suggestions on how to fine-tune ClamAV (and/or
> SquidClamAV) for this specific use case: ...

It's a very interesting experiment but I'm not sure that the designers
of ClamAV (and of the various databases available for it) anticipated
that they would be used in this way. It bears some resemblance to
on-access scanning but it's sufficiently different to demand a lot of
thought. My approach would probably be to start with very little in
the signature database(s) and gradually add things which might prove
useful, at the same time excluding anything which might be expected to
be nearly useless in this application, all the time logging verbosely.

You might need to put extra intelligence into splitting content from
headers etc. before you pass the data to the scanner. The hashing
algorithm which ClamAV uses to avoid repeating scans of data might
need some work. An individual signature can sometimes cause the
scanning engine to work really hard when a superficially similar
signature does not, so I don't think you'll be able to tackle the
performance problem at the database level. I imagine you'll want to
set up instrumentation to attempt to measure the performance of the
individual signatures - or at least of the separate databases, which
would give only a rough idea of the scale of the problem but possibly
allow you to do binary searches for slow regexes. I guess you'd need
to automate a lot of that, or maybe crowdsource might work.
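
A minimal Python sketch of such a binary search follows. It is only an
illustration: is_slow is a hypothetical callback that you would have to write
yourself (for example, write the candidate signatures to a test database,
load it into a scanner, and time a scan of a sample), and the sketch assumes
that a single signature is responsible for the slowdown.

```python
from typing import Callable, Optional, Sequence


def find_slow_signature(
    signatures: Sequence[str],
    is_slow: Callable[[Sequence[str]], bool],
) -> Optional[str]:
    """Bisect a signature list down to one signature that makes scans slow.

    is_slow(subset) is a hypothetical hook: it should load only `subset`
    into a test scanner, scan a sample, and report whether that was slow.
    Assumes exactly one culprit; returns None if the full set is not slow.
    """
    if not signatures or not is_slow(signatures):
        return None
    candidates = list(signatures)
    while len(candidates) > 1:
        mid = len(candidates) // 2
        first_half = candidates[:mid]
        # Keep whichever half still reproduces the slowdown.
        candidates = first_half if is_slow(first_half) else candidates[mid:]
    return candidates[0]
```

Each round halves the candidate list, so the number of test loads grows with
the logarithm of the signature count. If several signatures are slow, or the
slowness only shows up in combination, this simple bisection will not be
enough.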

HTH. Please do keep us informed of any progress.

--

73,
Ged.

Re: [clamav-users] Squid + ClamAV
On Wed, Apr 01, 2020 at 03:47:09PM +0200, Andrea Venturoli via clamav-users wrote:
>
> The whole thing is working, but page loading times vary a lot: sometimes
> they'll load as fast as without virus scanning, but often (the same pages)
> will take several seconds to display (with ClamAV eating a lot of CPU).

You'll want to at least apply the reload patch to clamd, or you will get
hangs while signatures are loading.

https://bugzilla.clamav.net/show_bug.cgi?id=10979

Re: [clamav-users] Squid + ClamAV
On Wed, Apr 01, 2020 at 04:36:15PM +0100, G.W. Haywood via clamav-users wrote:
> Hi there,
>
> On Wed, 1 Apr 2020, Andrea Venturoli via clamav-users wrote:
>
> >I'm trying the combination Squid + C-ICAP + SquidClamAV + ClamAV, and I'm
> >seeing terrible performance.
> >...
> >Perhaps someone here is using the same thing or knows how to better
> >tweak the engine.
>
> I'm not surprised that the performance is terrible. :/
>
> To me it sounds like this will not be a quick tweak but a project, and
> a lot of work, but it might prove to be a valuable contribution to the
> security of a large number of users.

There's nothing new about HTTP scanning, even with ClamAV. I co-maintained
the HAVP scanner (havp.org / havp.hege.li) for years; it used a very clever
method and worked fine. But pretty much all websites are SSL encrypted
these days, so there's nothing to scan unless you do nasty man-in-the-middle
decryption. Everyone has virus scanners on their PC, browsers have all
sorts of protection, etc. The days of proxy scanning are long gone; these
days it's just categorizing and blacklisting URLs.


Re: [clamav-users] Squid + ClamAV
I've been using HAVP with libclamav for years now, and have liked it.
Now, of course, the prevalence of HTTPS limits the utility of something
like HAVP. (I sometimes wonder what the *net* improvement in security
is when HTTPS is used, given that one is now almost totally dependent
on how secure the Web server is.)

P.S. What I would really like to see is for browsers to have hooks to
attach a plugin virus scanner like HAVP or clamd for scanning of the
*decrypted* content. (Centralized MITM scanning invalidates some
security and privacy principles, in my opinion.)


On Wed, 1 Apr 2020 20:38:41 +0300
Henrik K <hege@hege.li> wrote:

> On Wed, Apr 01, 2020 at 04:36:15PM +0100, G.W. Haywood via
> clamav-users wrote:
> > Hi there,
> >
> > On Wed, 1 Apr 2020, Andrea Venturoli via clamav-users wrote:
> >
> > >I'm trying the combination Squid + C-ICAP + SquidClamAV + ClamAV,
> > >and I'm seeing terrible performance.
> > >...
> > >Perhaps someone here is using the same thing or knows how to better
> > >tweak the engine.
> >
> > I'm not surprised that the performance is terrible. :/
> >
> > To me it sounds like this will not be a quick tweak but a project,
> > and a lot of work, but it might prove to be a valuable contribution
> > to the security of a large number of users.
>
> There's nothing new about HTTP scanning even with ClamAV. I
> co-maintained HAVP scanner (havp.org / havp.hege.li) for years, it
> had a very clever method and worked fine. But pretty much all
> websites are SSL encrypted these days, so there's nothing to scan
> unless you do nasty man-in-the-middle decryption. Everyone has virus
> scanners on their PC, browsers have all sorts of protection etc.
> The days of proxy scanning are long gone, it's just categorizing and
> blacklisting urls these days..

Re: [clamav-users] Squid + ClamAV
On 2020-04-01 19:38, Henrik K wrote:

>> But pretty much all
>> websites are SSL encrypted these days, so there's nothing to scan
>> unless you do nasty man-in-the-middle decryption. Everyone has virus
>> scanners on their PC, browsers have all sorts of protection etc.
>> The days of proxy scanning are long gone, it's just categorizing and
>> blacklisting urls these days..

Well, you'll need MITM anyway if you want to see HTTPS URLs and be able
to blacklist them.



> (I sometimes wonder what the *net* improvement in security
> is when HTTPS is used, given that one is now almost totally dependent
> on how secure the Web server is.)

Rather, I wonder what the net security improvement of *HTTPS everywhere*
is: if TLS was limited to sites where it's needed/useful, our job would
be much easier.




bye
av.

P.S.
I'm investigating your other message about the reload patch.
Thanks.

Re: [clamav-users] Squid + ClamAV
On Thu, Apr 02, 2020 at 08:14:21AM +0200, Andrea Venturoli via clamav-users wrote:
> On 2020-04-01 19:38, Henrik K wrote:
>
> >> But pretty much all
> >> websites are SSL encrypted these days, so there's nothing to scan
> >> unless you do nasty man-in-the-middle decryption. Everyone has virus
> >> scanners on their PC, browsers have all sorts of protection etc.
> >> The days of proxy scanning are long gone, it's just categorizing and
> >> blacklisting urls these days..
>
> Well, you'll need MITM anyway if you want to see HTTPS URLs and be able to
> blacklist them.

There's always the request hostname, which can be used. For many
organizations that's enough to filter on.
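
As a rough illustration of hostname-only filtering (this is not Squid's own
ACL syntax), the check amounts to a suffix match on the requested host. A
minimal Python sketch; the blocked domains are placeholders and the function
name is made up:

```python
def is_blocked(hostname: str, blocked_domains: set) -> bool:
    """Return True if hostname equals a blocked domain or is a subdomain of one."""
    labels = hostname.lower().rstrip(".").split(".")
    # Check "a.b.example.com", "b.example.com", "example.com", "com" in turn.
    return any(".".join(labels[i:]) in blocked_domains for i in range(len(labels)))


# Placeholder entries, for illustration only.
BLOCKED = {"malware.example", "ads.example.net"}
```

Matching on whole labels rather than raw string suffixes keeps
"notmalware.example" from being caught by an entry for "malware.example".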


Re: [clamav-users] Squid + ClamAV
On 2020-04-02 08:14, Andrea Venturoli wrote:

> P.S.
> I'm investigating your other message about the reload patch.

The patch is working.
However, almost nothing has changed: from the logs I see DB reloads
two or three times per day (hard to hit even if you try :) and in the
meantime I still see slowness (which must come from something else, then).

bye & Thanks
av.

Re: [clamav-users] Squid + ClamAV
On 06/04/2020 15:53, Andrea Venturoli via clamav-users wrote:
> On 2020-04-02 08:14, Andrea Venturoli wrote:
>
>> P.S.
>> I'm investigating your other message about the reload patch.
>
> Patch is working.
> However almost nothing has changed: from the logs I see DB reloads
> twice/three times per day... hard to hit if you try :) and in the
> meanwhile I still see slowness (which comes from something else, then).

In my experience, a database check and reload is sometimes triggered
when a scan is initiated. I started noticing it when I reverted
from the threaded reload patch.

Good luck
Reio

Re: [clamav-users] Squid + ClamAV
On 2020-04-01 17:36, G.W. Haywood via clamav-users wrote:

> My approach would probably be to start with very little in
> the signature database(s) and gradually add things which might prove
> useful, at the same time excluding anything which might be expected to
> be nearly useless in this application, all the time logging verbosely.

I thought about this, but it is going to be a *long* job.



> You might need to put extra intelligence into splitting content from
> headers etc. before you pass the data to the scanner.

I guess the above layers (squid+c-icap+squidclamav) already do this.



>  I imagine you'll want to
> set up instrumentation to attempt to measure the performance of the
> individual signatures - or at least of the separate databases

You imagine right :)
Any idea how this can be achieved?



bye & Thanks
av.

Re: [clamav-users] Squid + ClamAV
Hi there,

On Tue, 7 Apr 2020, Andrea Venturoli via clamav-users wrote:
> On 2020-04-01 17:36, G.W. Haywood via clamav-users wrote:
>
>> ... I imagine you'll want to set up instrumentation to attempt to
>> measure the performance of the individual signatures - or at least
>> of the separate databases
>
> You imagine right :)
> Any idea how this can be achieved?

You can certainly run multiple clamd daemons - I do that routinely for
other reasons, using a custom milter - and load different databases in
each. That would be a start at least.

I haven't given much thought to individual signatures but at worst you
could allocate one of the daemons for that purpose; maybe you name one
signature specially, say TEST-SIGNATURE, then create TEST-SIGNATURE as
whatever individual sig you want to test, then reload your single-sig
database containing TEST-SIGNATURE in the single-sig daemon (that will
be fairly quick if there's just one sig to load). That could all be
automated fairly easily but it seems to me that there ought to be a
better way. Maybe it's the kind of thing that could go in the ClamAV
Bugzilla as a feature request - something like "per-sig reporting". :)
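
A minimal Python sketch of how the per-daemon comparison could be timed,
assuming each clamd instance has its own Unix socket and its own database
directory. The socket paths and labels below are hypothetical, and scan is
whatever function you use to submit a sample to one daemon (for example an
INSTREAM client); none of this is an existing ClamAV tool.

```python
import statistics
import time
from typing import Callable, Dict, Iterable


def time_daemons(
    sockets: Dict[str, str],
    samples: Iterable[bytes],
    scan: Callable[[str, bytes], object],
    repeats: int = 3,
) -> Dict[str, float]:
    """Return the median time in seconds to scan all samples, per daemon.

    sockets maps a label (e.g. the database loaded in that daemon) to
    the daemon's socket path; scan(socket_path, data) performs one scan.
    """
    samples = list(samples)
    results = {}
    for label, path in sockets.items():
        timings = []
        for _ in range(repeats):
            start = time.perf_counter()
            for data in samples:
                scan(path, data)
            timings.append(time.perf_counter() - start)
        results[label] = statistics.median(timings)
    return results


# Hypothetical layout: one clamd per database, each with its own socket.
DAEMONS = {
    "main": "/var/run/clamav/clamd-main.sock",
    "daily": "/var/run/clamav/clamd-daily.sock",
    "bytecode": "/var/run/clamav/clamd-bytecode.sock",
}
```

Feeding the same saved set of slow pages to every daemon makes the medians
comparable; the single-sig daemon could be added as one more entry.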

I certainly don't subscribe to the view expressed in this thread (if
that's the view that was expressed, and I'm not simply misrepresenting
it) that this has all been done before. Some of it has, sure, but it
still seems that there are issues, and room for some lateral thinking.

--

73,
Ged.

Re: [clamav-users] Squid + ClamAV
On Tue, Apr 07, 2020 at 11:27:50AM +0100, G.W. Haywood via clamav-users wrote:
>
> I certainly don't subscribe to the view expressed in this thread (if
> that's the view that was expressed, and I'm not simply misrepresenting
> it) that this has all been done before. Some of it has, sure, but it
> still seems that there are issues, and room for some lateral thinking.

You are of course correct that one must choose carefully which signatures
to use; the number of signatures has bloated considerably in recent years.


Re: [clamav-users] Squid + ClamAV
> On Apr 7, 2020, at 10:24 AM, Henrik K <hege@hege.li> wrote:
>
> On Tue, Apr 07, 2020 at 11:27:50AM +0100, G.W. Haywood via clamav-users wrote:
>>
>> I certainly don't subscribe to the view expressed in this thread (if
>> that's the view that was expressed, and I'm not simply misrepresenting
>> it) that this has all been done before. Some of it has, sure, but it
>> still seems that there are issues, and room for some lateral thinking.
>
> You are of course correct in that one must carefully choose what signatures
> to use, the amount of signatures have bloated much in recent years.

You say that like it’s a bad thing.