Mailing List Archive

Best way to introduce feature to reduce memory footprint?
Hey,

I'd like to get some thoughts on reducing the memory footprint of ClamAV
(clamd)...

Do you already have some ideas about how to introduce a feature like
this?

Is my assumption correct that the loaded signature database in memory is
the biggest part?

So my idea would be to introduce some kind of optional/configurable
round-robin loading of the signatures in limited blocks:
scan with the loaded signature block, load the next signature block
from the db files and scan again, and so on, until all signatures have
been used.
The next file to scan goes through the same procedure again...
Of course this would slow down the scan process, but it would free
memory on the system.
At least this is the idea.
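
The block-wise procedure described above can be sketched roughly as follows. This is only a toy illustration, not ClamAV code: the "signatures" are plain byte strings matched by substring search, and the names `blocks` and `scan_round_robin` are hypothetical. In a real implementation each block would have to be read from the db files at the marked point.

```python
from typing import Iterator, List, Optional


def blocks(signatures: List[bytes], block_size: int) -> Iterator[List[bytes]]:
    """Yield the signature list in consecutive blocks of at most block_size."""
    for start in range(0, len(signatures), block_size):
        yield signatures[start:start + block_size]


def scan_round_robin(data: bytes, signatures: List[bytes],
                     block_size: int) -> Optional[bytes]:
    """Scan data against one block at a time; return the first match or None."""
    for block in blocks(signatures, block_size):
        # Assumption: this is where the next block would be loaded from disk,
        # replacing the previous one, so only block_size signatures are held.
        for sig in block:
            if sig in data:
                return sig
    return None


# Hypothetical toy signatures, for illustration only.
sigs = [b"EICAR", b"evil", b"dropper", b"worm", b"trojan"]
print(scan_round_robin(b"a harmless file", sigs, block_size=2))
print(scan_round_robin(b"contains a worm!", sigs, block_size=2))
```

Every scanned file walks through all blocks again, so the cost of loading each block is paid once per file.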

I've only had a quick look at the code for how the clamd engine is
initialized and how the db is loaded. So I might have missed something
to think about.

Thanks for your help.

Markus
_______________________________________________

clamav-devel mailing list
clamav-devel@lists.clamav.net
https://lists.clamav.net/mailman/listinfo/clamav-devel

Please submit your patches to our Bugzilla: http://bugzilla.clamav.net

Help us build a comprehensive ClamAV guide:
https://github.com/vrtadmin/clamav-faq

http://www.clamav.net/contact.html#ml
Re: Best way to introduce feature to reduce memory footprint?
Hi there,

On Sun, 8 Sep 2019, Markus Kolb wrote:

> I'd like to get some thoughts on reducing the memory footprint of ClamAV
> (clamd)...

My thought: Once upon a time they used to say that hardware accounted
for 90% of the cost of a computer system; then they used to say it was
more like 50/50; more recently, by comparison with the software costs,
hardware costs approximately nothing.

Short version: memory's cheap.

> Do you already have some ideas about how to introduce a feature like
> this?

Speaking personally, no.

> Is my assumption correct that the loaded signature database in memory is
> the biggest part?

Don't assume things. Measure them.

> So my idea would be to introduce some kind of optional/configurable
> round-robin loading of the signatures in limited blocks:
> scan with the loaded signature block, load the next signature block
> from the db files and scan again, and so on, until all signatures have
> been used.

Perhaps you haven't been reading recent posts on the users' list.

> The next file to scan goes through the same procedure again...
> Of course this would slow down the scan process, but it would free
> memory on the system.

The *scanning* speed isn't going to be the problem with your idea.

> ... I might have missed something to think about.

The scanning engine was designed to scan for millions of signatures
very efficiently. But it loads them rather slowly. Even if you were
to split the database into parts which were one tenth the size of the
existing databases, they would each take seconds to load. But a scan
typically takes a fraction of a second. So I think what you've missed
is the speed of *loading* the signatures. Unless you have some way of
improving that (and everyone here would be *very* pleased to see that)
then your idea will have little to recommend it to most ClamAV users.
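
The load-versus-scan argument above can be put into rough numbers. The figures below are assumptions chosen only to match the orders of magnitude mentioned (seconds to load a tenth of the database, a fraction of a second to scan); they are not measurements, and `per_file_seconds` is a hypothetical helper.

```python
def per_file_seconds(n_blocks: int, load_seconds_per_block: float,
                     scan_seconds_per_block: float) -> float:
    """Time to scan one file when every block is loaded, then scanned."""
    return n_blocks * (load_seconds_per_block + scan_seconds_per_block)


# Assumed: whole database already resident, one scan of 0.2 s per file.
resident = per_file_seconds(n_blocks=1, load_seconds_per_block=0.0,
                            scan_seconds_per_block=0.2)

# Assumed: ten blocks, 3 s to load each, 0.02 s to scan against each.
blockwise = per_file_seconds(n_blocks=10, load_seconds_per_block=3.0,
                             scan_seconds_per_block=0.02)

print(f"resident database: {resident:.1f} s per file")
print(f"block-wise loading: {blockwise:.1f} s per file")
print(f"slowdown: about {blockwise / resident:.0f}x")
```

With numbers of this kind, almost all of the per-file time goes into loading rather than scanning.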

There might be a way to run multiple daemons, each with part of the
database already loaded, which could be paged in/out of swap quicker
than it would be to dump and reload the separate database parts. The
idea is pure speculation on my part, and I wouldn't think it worth
pursuing unless I wanted to run ClamAV on a Raspberry Pi on the ISS.

What's your use case?

--

73,
Ged.
_______________________________________________

Re: Best way to introduce feature to reduce memory footprint?
Markus Kolb wrote:

> Hey,
>
> I'd like to get some thoughts on reducing the memory footprint of
> ClamAV (clamd)...

This was discussed quite intensively on the openSUSE list just recently:

Create /etc/systemd/system/clamd.service.d/memlimit.conf and add:

MemoryLimit=500M (for instance)
TimeoutSec=300s

This reduces the working set according to your spec, I have had it
running in 300M. Occasionally a scan will take longer, even up to a
minute, but 90% (in my context) are still processed in less than 1 sec.
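
For reference, a minimal sketch of what the complete drop-in file might look like, generated here from Python. The `[Service]` section header is not shown in the post but is where systemd expects both settings; `render_dropin` is a hypothetical helper, and the values are just the examples given above. On systems using the unified cgroup hierarchy, systemd documents `MemoryMax=` as the replacement for `MemoryLimit=`, so the directive name may need adjusting.

```python
def render_dropin(memory_limit: str = "500M", timeout: str = "300s") -> str:
    """Return the text of a systemd drop-in carrying the two settings."""
    lines = [
        "[Service]",                    # both settings belong to this section
        f"MemoryLimit={memory_limit}",  # example value from the post
        f"TimeoutSec={timeout}",        # allow slow starts while swapping
    ]
    return "\n".join(lines) + "\n"


# Print the file contents; writing them to
# /etc/systemd/system/clamd.service.d/memlimit.conf is left to the admin.
print(render_dropin(), end="")
```

After the file is in place, `systemctl daemon-reload` followed by a restart of the service is needed for the limit to take effect.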



--
Per Jessen, Zürich (11.1°C)
http://www.dns24.ch/ - free dynamic DNS, made in Switzerland.

_______________________________________________

Re: Best way to introduce feature to reduce memory footprint?
On 08.09.2019 at 18:35, G.W. Haywood wrote:
> Hi there,
>
> What's your use case?

Well, to be able to run it on limited hardware without clamd using e.g.
40% of memory while it isn't doing any work.

And yes, by longer scan time I meant the DB loading needed before each
scan.
_______________________________________________

Re: Best way to introduce feature to reduce memory footprint?
On 08.09.2019 at 19:28, Per Jessen wrote:
> Markus Kolb wrote:
>
>> Hey,
>>
>> I'd like to get some thoughts on reducing the memory footprint of
>> ClamAV (clamd)...
>
[...]
> MemoryLimit=500M (for instance)
> This reduces the working set according to your spec, I have had it
> running in 300M. Occasionally a scan will take longer, even up to a
> minute, but 90% (in my context) are still processed in less than 1 sec.

First, thanks: this is a good idea, and for my personal use case it
would do.
But it doesn't really reduce the memory footprint.
With e.g. MemoryLimit=300M it is swapping the rest...

VmPeak: 1076528 kB
VmSize: 1010992 kB
VmLck: 0 kB
VmPin: 0 kB
VmHWM: 304828 kB
VmRSS: 10888 kB
RssAnon: 6760 kB
RssFile: 4128 kB
RssShmem: 0 kB
VmData: 811616 kB
VmStk: 132 kB
VmExe: 168 kB
VmLib: 14556 kB
VmPTE: 1672 kB
VmPMD: 16 kB
VmSwap: 740816 kB

Without limit:

VmPeak: 1002792 kB
VmSize: 937256 kB
VmLck: 0 kB
VmPin: 0 kB
VmHWM: 752020 kB
VmRSS: 752020 kB
RssAnon: 752020 kB
RssFile: 0 kB
RssShmem: 0 kB
VmData: 800760 kB
VmStk: 132 kB
VmExe: 168 kB
VmLib: 14556 kB
VmPTE: 1648 kB
VmPMD: 16 kB
VmSwap: 0 kB
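
The two dumps above can be compared by adding resident and swapped-out memory for each case. A small sketch, assuming the lines come from /proc/<pid>/status of the clamd process; only the VmRSS and VmSwap values are copied from the dumps, and `parse_status` and `footprint_kb` are hypothetical helper names.

```python
from typing import Dict


def parse_status(text: str) -> Dict[str, int]:
    """Parse 'Key: value kB' lines into a dict of integer kB values."""
    fields = {}
    for line in text.strip().splitlines():
        key, value = line.split(":", 1)
        fields[key.strip()] = int(value.split()[0])
    return fields


def footprint_kb(fields: Dict[str, int]) -> int:
    """Resident plus swapped-out memory, in kB."""
    return fields["VmRSS"] + fields["VmSwap"]


limited = parse_status("""
VmRSS: 10888 kB
VmSwap: 740816 kB
""")
unlimited = parse_status("""
VmRSS: 752020 kB
VmSwap: 0 kB
""")

print(footprint_kb(limited))    # 751704
print(footprint_kb(unlimited))  # 752020
```

Both cases come to roughly 752 MB in total; the limit only changes how much of it sits in RAM and how much in swap.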

What would happen if there is no swapspace? OOM?

_______________________________________________

Re: Best way to introduce feature to reduce memory footprint?
On 08.09.2019 at 18:35, G.W. Haywood wrote:

> Perhaps you haven't been reading recent posts on the users' list.

Which thread topic do you mean?
The "How to boost clamav? Reloading database results in a talking
timeout?"?

_______________________________________________

Re: Best way to introduce feature to reduce memory footprint?
Markus Kolb wrote:

> On 08.09.2019 at 19:28, Per Jessen wrote:
>> Markus Kolb wrote:
>>
>>> Hey,
>>>
>>> I'd like to get some thoughts on reducing the memory footprint of
>>> ClamAV (clamd)...
>>
> [...]
>> MemoryLimit=500M (for instance)
>> This reduces the working set according to your spec, I have had it
>> running in 300M. Occasionally a scan will take longer, even up to a
>> minute, but 90% (in my context) are still processed in less than 1
>> sec.
>
> First, thanks: this is a good idea, and for my personal use case it
> would do.
> But it doesn't really reduce the memory footprint.

True, only operationally. I just thought I would mention it, it's an
elegant solution when you want/need to reduce clamd memory consumption.

> With e.g. MemoryLimit=300M it is swapping the rest...

Yes, that is the whole point - swap out what isn't actively used.
Sometimes a scan will take longer, due to swapping in, but that's all.

> What would happen if there is no swapspace? OOM?

Almost certainly, yes.



--
Per Jessen, Zürich (11.5°C)
http://www.dns24.ch/ - free dynamic DNS, made in Switzerland.

_______________________________________________

Re: Best way to introduce feature to reduce memory footprint?
On 08.09.2019 at 19:28, Per Jessen wrote:

> This was discussed quite intensively on the openSUSE list just
> recently:
>
> Create /etc/systemd/system/clamd.service.d/memlimit.conf and add:
>
> MemoryLimit=500M (for instance)
> TimeoutSec=300s

It seems I cheered too soon.
This doesn't always work as expected. ;-(

On forking, the source process no longer gets/needs CPU time and
always times out.
CPU usage drops from 100% to 0-2% and the process is mostly dead, so it
times out.
The exact limit (300M, 400M, 700M, 800M) doesn't matter.
The system is over 70% idle and there is enough free physical memory.
Looks like some bug (systemd/kernel) in openSUSE 15.0.
Or maybe it is because it is a KVM vhost? I don't know.
On a bare-metal host with openSUSE 15.1 it works.
_______________________________________________

Re: Best way to introduce feature to reduce memory footprint?
Hello again,

On Mon, 9 Sep 2019, Markus Kolb wrote:
> On 08.09.2019 at 18:35, G.W. Haywood wrote:
>
> > Perhaps you haven't been reading recent posts on the users' list.
>
> Which thread topic do you mean? The "How to boost clamav? Reloading
> database results in a talking timeout?"?

Yes.

Another alternative might be to run a separate machine just for clamd.
Something like the new Raspberry Pi 4 could do the job quite well, if
a little more slowly than the average i9 laptop.

--

73,
Ged.
_______________________________________________

Re: Best way to introduce feature to reduce memory footprint?
Markus Kolb wrote:

> On 08.09.2019 at 19:28, Per Jessen wrote:
>
>> This was discussed quite intensively on the openSUSE list just
>> recently:
>>
>> Create /etc/systemd/system/clamd.service.d/memlimit.conf and add:
>>
>> MemoryLimit=500M (for instance)
>> TimeoutSec=300s
>
> It seems I cheered too soon.
> This doesn't always work as expected. ;-(
>
> On forking, the source process no longer gets/needs CPU time and
> always times out.
> CPU usage drops from 100% to 0-2% and the process is mostly dead, so it
> times out.

Change the timeout?

> The exact limit (300M, 400M, 700M, 800M) doesn't matter.
> The system is over 70% idle and there is enough free physical memory.
> Looks like some bug (systemd/kernel) in openSUSE 15.0.
> Or maybe it is because it is a KVM vhost? I don't know.
> On a bare-metal host with openSUSE 15.1 it works.

The latter is what I tested it on, but I have some much smaller test
systems - they manage to run clamd in e.g. 784MB of memory.



--
Per Jessen, Zürich (23.1°C)
http://www.cloudsuisse.com/ - your owncloud, hosted in Switzerland.
