Mailing List Archive

Best practices when using caching http proxy as cvd private mirror
On https://docs.clamav.net/appendix/CvdPrivateMirror.html#use-an-http-proxy
Am looking for best practices on how an http proxy should be configured in this scenario.  Some questions:
1) What mechanism should a proxy use to detect a stale cached file?  Want to avoid stale files obviously, but also reduce load to the public mirrors and chance of rate limiting.  I see ETag, Cache-Control, Expires headers in HTTP responses from database.clamav.net.  And have seen cvdupdate specify the If-Modified-Since header in requests.  So a lot of choices, which are preferred?
2) I see that curl requests to database.clamav.net fail unless I override the User-Agent header to have a value similar to what freshclam does, such as "CVDUPDATE/0".  If I have to manually set this in a proxy, is there guidance on what a good future-proof value is?  It feels weird to lie in the request.
3) Happy to hear any dissenting opinions on the HTTP proxy idea.  Is it lower risk to just run cvdupdate, or freshclam coupled with an internal web server?  On the surface a caching proxy seems simpler: fewer moving parts, less to maintain.
Thanks!
Aaron
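
For readers unfamiliar with the validators named in question 1, the following is a minimal sketch of how a cache can revalidate a stored file with a conditional request. It makes no network calls and is not a statement of how database.clamav.net or any particular proxy behaves; the function names `conditional_headers` and `apply_response` are hypothetical, and serving the cached copy on an error status is an assumed design choice.

```python
from typing import Dict, Optional, Tuple


def conditional_headers(etag: Optional[str],
                        last_modified: Optional[str]) -> Dict[str, str]:
    """Build revalidation headers from validators stored with a cached file."""
    headers: Dict[str, str] = {}
    if etag:
        # The origin answers 304 if its current ETag still matches.
        headers["If-None-Match"] = etag
    if last_modified:
        # Fallback validator based on the stored Last-Modified date.
        headers["If-Modified-Since"] = last_modified
    return headers


def apply_response(status: int, cached_body: bytes,
                   new_body: Optional[bytes]) -> Tuple[bytes, bool]:
    """Return (body to serve, whether the cache entry was replaced)."""
    if status == 304:
        # Not Modified: the cached copy is still current.
        return cached_body, False
    if status == 200 and new_body is not None:
        # The file changed upstream: store and serve the new copy.
        return new_body, True
    # Assumption: on any other status, keep serving the cached copy.
    return cached_body, False
```

A 304 response carries no body, so a revalidation that finds nothing new costs the origin very little compared with a full download.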
Re: Best practices when using caching http proxy as cvd private mirror
Hi there,

On Thu, 8 Sep 2022, Aaron Leliaert via clamav-users wrote:

> On https://docs.clamav.net/appendix/CvdPrivateMirror.html#use-an-http-proxy
> Am looking for best practices on how an http proxy should be
> configured in this scenario.  Some questions:
>
> 1) What mechanism should a proxy use to detect a stale cached file?
>  Want to avoid stale files obviously, but also reduce load to the
> public mirrors and chance of rate limiting.

There are no public mirrors any more, it's a Content Delivery Network
provided by Cloudflare which also provides some protection against
Denial of Service attacks - which have been part of the landscape for
some time now. You probably don't need to worry about stale files, it
happens occasionally but the signatures aren't updated much more often
than daily and you could e.g. set up a cron job to mail you if nothing
changes in your copy of the official signature database for 48 hours.
I've been using ClamAV for about two decades and I can't remember the
last time I had to do *anything* about it. It Just Works. Whether it
will then find what you're looking for is another question entirely...
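
The 48-hour check suggested above could be sketched roughly as follows. This is an illustration under stated assumptions, not an official tool: it treats the newest modification time among `.cvd` and `.cld` files as a stand-in for "the database last changed", and the names `newest_db_mtime`, `is_stale` and `MAX_AGE_SECONDS` are hypothetical.

```python
import os
import time
from typing import Optional

DB_SUFFIXES = (".cvd", ".cld")   # official database file extensions
MAX_AGE_SECONDS = 48 * 3600      # the 48-hour threshold from the text


def newest_db_mtime(db_dir: str) -> Optional[float]:
    """Return the newest mtime among database files, or None if none exist."""
    mtimes = [
        os.path.getmtime(os.path.join(db_dir, name))
        for name in os.listdir(db_dir)
        if name.endswith(DB_SUFFIXES)
    ]
    return max(mtimes) if mtimes else None


def is_stale(db_dir: str, now: Optional[float] = None,
             max_age: float = MAX_AGE_SECONDS) -> bool:
    """True if no database file has changed within max_age seconds."""
    now = time.time() if now is None else now
    newest = newest_db_mtime(db_dir)
    # Assumption: a directory with no database files counts as stale.
    return newest is None or (now - newest) > max_age
```

A cron job could call `is_stale()` on the local database directory and print a warning when it returns True; cron normally mails any output to the job's owner.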

> 2) I see that curl requests to database.clamav.net fail unless I
> override the User-Agent header to have a value similar to what
> freshclam does, such as "CVDUPDATE/0".  If I have to manually set
> this in a proxy, is there guidance on what a good future-proof value
> is?  It feels weird to lie in the request.

Using curl and lying in the requests is likely to get the requesting
IP banned. My understanding is that you have two choices, you either
use (preferably) freshclam or (if necessary) cvdupdate, and that the
use of curl and similar is essentially forbidden. You will see notes
to this effect in the mailing list, many from Joel, if you search it.
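
As a rough sketch of the cvdupdate route, a scheduled job could simply invoke the tool rather than fetch files by hand. This assumes cvdupdate is installed and provides a `cvd` command with an `update` subcommand, as its README describes, and that its database directory has already been configured separately. The wrapper `refresh_mirror` and its injectable `run` and `which` parameters are hypothetical, added only to keep the example testable.

```python
import shutil
import subprocess


def refresh_mirror(run=subprocess.run, which=shutil.which) -> bool:
    """Refresh the local mirror via cvdupdate; return False if it is missing."""
    if which("cvd") is None:
        # cvdupdate is not on PATH, so there is nothing to run.
        return False
    # check=True raises CalledProcessError if the update fails.
    run(["cvd", "update"], check=True)
    return True
```

The directory that cvdupdate maintains can then be served to internal freshclam clients by whatever web server is already in use.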

> 3) Happy to hear any dissenting opinions on the HTTP proxy idea.

Now that the files are distributed by a Content Delivery Network, I
think the need for local caching proxies is much reduced (the CDN can
cope with much more traffic) but you will certainly want to avoid the
appearance of being abusive. That isn't too difficult unless you're
managing a large number of clients on your network. For a few dozen
machines I haven't used a proxy for years. What sort of numbers are
you dealing with?

Please note that replies direct to my clamav@ address are rejected,
it accepts mail only from the mailing list.

--

73,
Ged.
_______________________________________________

clamav-users mailing list
clamav-users@lists.clamav.net
https://lists.clamav.net/mailman/listinfo/clamav-users


Help us build a comprehensive ClamAV guide:
https://github.com/Cisco-Talos/clamav-documentation

https://docs.clamav.net/#mailing-lists-and-chat
Re: Best practices when using caching http proxy as cvd private mirror
What I don’t understand about threads like this:

During my time at Cisco, Micah literally built multiple tools to correctly handle the CDN framework: cvdupdate and freshclam itself. Yet people are going out of their way to try to fake CVDUPDATE to create a local mirror, which is literally what cvdupdate was invented for.

> On Sep 8, 2022, at 3:59 AM, G.W. Haywood via clamav-users <clamav-users@lists.clamav.net> wrote:
>
> Hi there,
>
> On Thu, 8 Sep 2022, Aaron Leliaert via clamav-users wrote:
>
>> On https://docs.clamav.net/appendix/CvdPrivateMirror.html#use-an-http-proxy
>> Am looking for best practices on how an http proxy should be
>> configured in this scenario. Some questions:
>> 1) What mechanism should a proxy use to detect a stale cached file?
>> Want to avoid stale files obviously, but also reduce load to the
>> public mirrors and chance of rate limiting.
>
> There are no public mirrors any more, it's a Content Delivery Network
> provided by Cloudflare which also provides some protection against
> Denial of Service attacks - which have been part of the landscape for
> some time now. You probably don't need to worry about stale files, it
> happens occasionally but the signatures aren't updated much more often
> than daily and you could e.g. set up a cron job to mail you if nothing
> changes in your copy of the official signature database for 48 hours.
> I've been using ClamAV for about two decades and I can't remember the
> last time I had to do *anything* about it. It Just Works. Whether it
> will then find what you're looking for is another question entirely...
>
>> 2) I see that curl requests to database.clamav.net fail unless I
>> override the User-Agent header to have a value similar to what
>> freshclam does, such as "CVDUPDATE/0". If I have to manually set
>> this in a proxy, is there guidance on what a good future-proof value
>> is? It feels weird to lie in the request.
>
> Using curl and lying in the requests is likely to get the requesting
> IP banned. My understanding is that you have two choices, you either
> use (preferably) freshclam or (if necessary) cvdupdate, and that the
> use of curl and similar is essentially forbidden. You will see notes
> to this effect in the mailing list, many from Joel, if you search it.
>
>> 3) Happy to hear any dissenting opinions on the HTTP proxy idea.
>
> Now that the files are distributed by a Content Delivery Network, I
> think the need for local caching proxies is much reduced (the CDN can
> cope with much more traffic) but you will certainly want to avoid the
> appearance of being abusive. That isn't too difficult unless you're
> managing a large number of clients on your network. For a few dozen
> machines I haven't used a proxy for years. What sort of numbers are
> you dealing with?
>
> Please note that replies direct to my clamav@ address are rejected,
> it accepts mail only from the mailing list.
>
> --
>
> 73,
> Ged.