[clamav-users] Issues counting strings with Yara rules.
Hi there,

My main use for ClamAV is identifying spam. One of the more effective
ways I've found to defeat the spammers is to count strings in incoming
mail messages with Yara rules. Spammers read mailing lists, so in case
it would help them I won't say much about which strings I'm counting or
how I use the counts, but that doesn't matter for this discussion.

Using ClamAV's Yara implementation, counting strings works fine for me
for string literals defined in rules using the double-quote syntax:

rule string_rule {
    strings:
        $identifier = "string"
    condition:
        #identifier > 2
}

but AFAICT this doesn't work if the string is defined as a regex,
even if the 'regex' is a literal string constant:

rule regex_rule {
    strings:
        $identifier = /regex/
    condition:
        #identifier > 2
}

I wondered if this might be an example of something which works only
in certain versions of the Yara implementation, but from the docs at
VirusTotal it isn't clear to me if this should be expected to work at
all. Both literals and regexes are called "strings" in the docs, and
the # sigil seems intended to hold a count of the associated string.
But although I've experimented quite a bit with regexes, I've failed
to get a condition to trigger on a count for a regex.
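
When a count condition never fires, it can help to confirm independently how many times the pattern occurs in the test message, so that a sample with too few matches is ruled out. The following is only a minimal sketch using Python's standard re module, not anything from ClamAV or Yara. The helper name count_matches and the sample text are made up for illustration, Python's regex syntax is not identical to Yara's, and I am assuming that a match is counted at every start offset (so overlapping matches count separately), which may not be how a given Yara implementation counts.

```python
import re


def count_matches(pattern: bytes, data: bytes) -> int:
    """Count the start offsets in data at which pattern matches.

    Hypothetical helper for cross-checking a sample; it is not part of
    ClamAV or Yara.
    """
    # Wrapping the pattern in a lookahead makes each match zero-width,
    # so the scan advances one byte at a time and overlapping
    # occurrences are counted separately (an assumption about counting).
    probe = re.compile(b"(?=(?:" + pattern + b"))")
    return sum(1 for _ in probe.finditer(data))


# Made-up sample standing in for a mail message.
sample = b"regex one, regex two, regex three"

count = count_matches(rb"regex", sample)
print(count)
print(count > 2)  # mirrors the rule condition '#identifier > 2'
```

If the count here exceeds the threshold but the rule still does not trigger, the sample itself is probably not the problem.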

There are ways around it, but I fear the resulting searches might be
rather inefficient. I haven't measured their efficiency yet.

I'm sure there are people on this list with more experience of Yara
than I have. Has anyone any insights to offer?



