[clamav-users] ClamAV® blog: ClamAV, CVDs, CDIFFs and the magic behind the curtain

ClamAV, CVDs, CDIFFs and the magic behind the curtain

The number of malicious files that ClamAV can detect has increased immensely over the past few years, but with this increase in efficacy come some challenges with scale.

Some of these challenges have required drastic measures to ensure the effective operation of the ClamAV infrastructure, including blocking certain methods of downloading the official ClamAV signature sets. To give the community more insight into these matters, we’d like to discuss some of these challenges in depth and outline future changes and optimizations coming to the product.

ClamAV signatures come in a variety of formats, one for each of the distinct detection methods that the ClamAV file scanning engine supports. ClamAV also uses the ClamAV Virus Database (CVD) file format, which serves as a container for the compressed and digitally-signed official signature sets that power ClamAV — daily.cvd, main.cvd, and bytecode.cvd. Each signature set serves a different purpose:

* bytecode.cvd contains all compiled bytecode signatures evaluated by the bytecode interpreter engine
* daily.cvd contains signatures for the latest threats (updated daily)
* main.cvd contains signatures previously in daily.cvd that have been shown to have a low false-positive risk

< — More — >

Please read the rest of the post at the above link.

Joel Esler
Manager, Communities Division
Cisco Talos Intelligence Group

Re: [clamav-users] ClamAV® blog: ClamAV, CVDs, CDIFFs and the magic behind the curtain
On Fri, 19 Mar 2021, Joel Esler (jesler) via clamav-users wrote:

> ClamAV, CVDs, CDIFFs and the magic behind the curtain

3. ... This is an expensive operation in terms of bandwidth
because daily.cvd and main.cvd are, currently, 105 MB and 117 MB,
... For example, for an update where 10,000 signatures were removed
from daily, the corresponding CDIFF was only around 60 KB in size.
To update via CDIFF, FreshClam determines the version of the database
on disk and requests every CDIFF between that version and the latest.
Assuming each of those CDIFFs exists on the server (only the last
90 days worth are currently kept) ...

60KB * 90 ~= 5MB << 100MB.
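
Spelled out, the estimate above looks roughly like the following Python sketch. It assumes the 60 KB figure quoted for one example CDIFF is typical and that one CDIFF is published per day over the 90-day window; neither is guaranteed by the post, and the variable names are only illustrative.

```python
# Back-of-the-envelope comparison of patching vs. a full download.
# Sizes come from the quoted blog text; "typical" CDIFF size and
# "one CDIFF per day" are assumptions made for this estimate.
KB = 1000
MB = 1000 * KB

cdiff_size = 60 * KB        # example CDIFF size from the post
retained_cdiffs = 90        # assumed: one CDIFF per day, 90 days kept
daily_cvd_size = 105 * MB   # current daily.cvd size from the post

patch_total = cdiff_size * retained_cdiffs

print(f"CDIFFs: {patch_total / MB:.1f} MB")
print(f"daily.cvd: {daily_cvd_size / MB:.0f} MB")
print(f"full download is roughly {daily_cvd_size // patch_total}x larger")
```

Under these assumptions, fetching every retained CDIFF costs about 5.4 MB, well under a tenth of the full daily.cvd.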

A zero-byte CDIFF indicates that FreshClam should download the CVD
instead. This is sometimes preferred to patching when a significant
portion of the CVD changes, like when a large portion of daily is
migrated to main in a single update.

So a machine which is 100 updates behind will download 100+MB of .cvd
instead of <10MB of .cdiff files :-(
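
To make the case above concrete, here is a toy Python model of the decision as I read it from the post. It is not FreshClam's code: plan_update, RETAINED and the version numbers are hypothetical, the 90-day window is treated as "the last 90 updates", and the fallback to the full CVD when a CDIFF is missing is my inference from the quoted text.

```python
from typing import AbstractSet, List, Optional

RETAINED = 90  # assumed: the server keeps CDIFFs for the last 90 updates


def plan_update(local: int, latest: int,
                zero_byte: AbstractSet[int] = frozenset()) -> Optional[List[int]]:
    """Return the CDIFF versions to fetch, or None to download the full CVD.

    local and latest are database version numbers; zero_byte holds the
    versions whose CDIFF is empty (the "download the CVD instead" marker).
    """
    needed = list(range(local + 1, latest + 1))
    oldest_available = latest - RETAINED + 1
    for version in needed:
        # A CDIFF that is no longer kept, or is zero bytes, forces the CVD.
        if version < oldest_available or version in zero_byte:
            return None
    return needed


# 100 updates behind: the oldest CDIFFs are gone, so fall back to the CVD.
print(plan_update(26000, 26100))
# 50 updates behind: patch with 50 CDIFFs.
print(len(plan_update(26050, 26100)))
# A zero-byte CDIFF anywhere in the range also forces the CVD.
print(plan_update(26050, 26100, frozenset({26075})))
```

In this model a single missing or empty CDIFF anywhere in the range discards the whole patch path, which is why being only slightly past the retention window costs a full download.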

I think I may have read that the limit of 90 CDIFF files was being reviewed,
which sounds like a good idea
(except of course when there has been a large daily -> main migration).

Is it possible to configure freshclam to keep the (verified) CDIFFs if the
update fails, so that they don't have to be downloaded again on the next
update attempt?


Andrew C. Aitchison Kendal, UK

