Hi Dismas,
First of all a disclaimer, for everyone else who happens to read this.
This is all going to be about handling email. That's more or less all
I do with ClamAV. Depending on your source of information, emails are
implicated in around 80% to 90% of all network compromises so it might
be relevant to people not scanning emails. But if you're mainly into
scanning filesystems then I guess that this won't be much use to you.
In my view if you find something nasty in the filesystem in some place
that it's not supposed to be (in my case that's anywhere other than in
one of the directories under /var/ where mail gets processed) then you
have a compromise, and you can stop scanning and start re-installing.
Another one of my hobby-horses is looking at metadata. Things one can
find out about things; in this case things one can find out about mail
by e.g. looking at the headers, and doing some investigation into what
is found in there. That will be what the rest of this post is about.
The MTA feeds clamd with mail as it comes in on the wire via a milter.
A milter is provided with ClamAV but we don't use it; we use our own.
Ours does some other things that aren't especially relevant to scanning.
There's nothing wrong with clamav-milter.
On Mon, 21 Sep 2020, Dismas Axel (Thomas) via clamav-users wrote:
> Yes, I was referring to by "this kind of file" for .xz file types
IF I were going to do this perhaps I'd say to myself, "Well, as far as
I can remember, nobody has ever sent me a compressed file unless I've
asked them to do that. So if one arrives unannounced I'll reject it."
Then I'd look in the message body, and as it's MIME I'd see something
like the extract below (this is in fact your last mail to me but it was
a .png file. I've edited it, to pretend that it was a .xz file):
8<----------------------------------------------------------------------
--b1_mar5aAkWHbYN3D3ZO4yFNHayVn6P2W1zCHleNR1cro
Content-Type: image/png; name=thespam.xz
Content-Transfer-Encoding: base64
Content-Disposition: attachment; filename=thespam.xz
iVBORw0KGgoAAAANSUhEUgAAAzYAAAJNCAYAAADj3eD3AAAfSHpUWHRSYXcgcHJvZmlsZSB0eXBl
....
....
8<----------------------------------------------------------------------
Then I might write signatures which blocked things like the above.
Below I'm using simple Perl regexes to explain. The '/' characters
quote the patterns. The variable text is in the two strings ".*" in
each pattern, meaning "any string of text including the empty string".
Later on I'll do something similar with Yara rules.
Signature 1:
/Content-Type:.*name=.*xz/i
and
Signature 2:
/Content-Disposition:.*name=.*xz/i
These signatures don't care what's in the .xz file although they would
need to be exercised because there are probably very many ways which I
haven't thought of for malicious senders to get around them. If your
system doesn't do Perl Compatible Regular Expressions (PCRE) then you
might have to come up with a completely different kind of pattern, I'm
just showing these as an example of what might be done with what I'm
calling metadata. Not looking at the content itself (unless you take
the word 'content' in a very literal way) but at something about the
content. In this case the content is a compressed file, and what I'll
call the MIME meta-information says the content is a compressed file.
It also says what its name will be when the dumb mail client extracts
it from the message and saves it somewhere to the filesystem so that
its hapless user can then uncompress the file and run it, thus having
his computer join some botnet and perpetuate the problem. All I've
done here is look at the line giving the name. Obviously if a malware
author lied about that, just looking at the name might not work.
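For anyone who would rather poke at this in something runnable, here
is a rough Python sketch of the same two checks, run against the edited
MIME headers quoted earlier. The function name is made up for the
example, and I've tightened the patterns a little (a literal dot before
'xz' and a word boundary after it). That tightening is an assumption of
the sketch, not part of the two signatures as written.

```python
import re

# Same idea as Signatures 1 and 2. The escaped dot and the trailing \b are
# additions made for this sketch only.
SIGNATURES = [
    re.compile(r"^Content-Type:.*name=.*\.xz\b", re.IGNORECASE),
    re.compile(r"^Content-Disposition:.*name=.*\.xz\b", re.IGNORECASE),
]

def names_xz_attachment(header_lines):
    """Return True if any MIME header line names a .xz file."""
    return any(sig.search(line) for sig in SIGNATURES for line in header_lines)

part_headers = [
    "Content-Type: image/png; name=thespam.xz",
    "Content-Transfer-Encoding: base64",
    "Content-Disposition: attachment; filename=thespam.xz",
]
print(names_xz_attachment(part_headers))
print(names_xz_attachment(["Content-Type: image/png; name=photo.png"]))
```

Note that this only looks at what the headers claim. It says nothing
about what is actually inside the attachment.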
There are other ways to see what you're dealing with which I won't go
into. ClamAV offers things like the detection of 'PUA's or 'Possibly
Unwanted Applications'. It's all in the documentation, do spend some
quality time with it.
> Here is the header of the spam email and attached is the screenshot
> of the fake email containing this .xz file:
It's mail from a genuine user's computer which has unfortunately been
compromised, and is being used by a criminal to send mail which will
pass SPF forgery tests. But note that, earlier in this mail, I said
"IF I were going to do this ..."
That's a big IF. In fact I wouldn't have done anything like that. I
wouldn't have had to do _anything_ to block that message. To see why,
first let's look at the headers in the mail you received. There are
four headers below and I've broken the two long ones using backslash-\
escaped newlines:
> Return-Path: <y.safary@kums.ac.ir>
> Delivered-To: y.safary@kums.ac.ir
> Received-SPF: Pass (sender SPF authorized) identity=mailfrom; \
> client-ip=5.63.15.95; helo=server507.dnslake.com; \
> envelope-from=y.safary@kums.ac.ir; receiver=y.safary@kums.ac.ir
> Received: from Server507.dnslake.com \
> (webmail.kums.ac.ir [5.63.15.95]) \
> (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 \
> (256/256 bits)) (No client certificate requested)
The headers tell me that the message came from IP address 5.63.15.95.
That's about the only thing I trust thus far (*). If the headers are
to be believed this is a server in Iran. Let's see if we believe that
by looking in our GeoIP database. Here's the output exactly as I saw
it when I typed the SQL query on our GeoIP server:
milter=> SELECT * FROM xm_geo2 WHERE network && '5.63.15.95' ;
   network    |    asnet     |   geonet    | asnum |                           as_name                           | geoname_id | country_iso_code
--------------+--------------+-------------+-------+-------------------------------------------------------------+------------+------------------
 5.63.15.0/24 | 5.63.15.0/24 | 5.63.8.0/21 | 16292 | Kermanshah University of Medical Science and Health Service |     112931 | IR
(1 row)
We use the excellent MaxMind data service. Here is the query output
again, this time edited to give shorter lines in case your mail reader
wraps the long lines above and makes a mess:
milter=> SELECT * FROM xm_geo2 WHERE network && '5.63.15.95' ;
network           5.63.15.0/24
asnet             5.63.15.0/24
geonet            5.63.8.0/21
asnum             16292
as_name           "Kermanshah University of Medical Science and Health Service"
geoname_id        112931
country_iso_code  IR
(1 row)
So yes, it looks like the server is in Iran, which amongst the email
sources in my experience is a shining paragon of virtue (**).
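If the '&&' in that query looks odd: as far as I know it is the
PostgreSQL "overlaps" test for network types, so the query asks which
rows have a network that contains the address. Here is a small Python
sketch of the same containment test using only the standard library.
The two networks are copied from the query output, and the dictionary
is just for the example.

```python
import ipaddress

ip = ipaddress.ip_address("5.63.15.95")

# Networks taken from the query output above.
networks = {
    "network": ipaddress.ip_network("5.63.15.0/24"),
    "geonet": ipaddress.ip_network("5.63.8.0/21"),
}

for label, net in networks.items():
    # "ip in net" is True when the address falls inside the network.
    print(label, net, ip in net)
```

Both tests come out True, which is consistent with the single row that
the database returned.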
Let's, er, dig into this a bit more:
laptop3:~$ >>> dig +short -x 5.63.15.95
webmail.kums.ac.ir.
laptop3:~$ >>> dig +short server507.dnslake.com
5.63.15.95
laptop3:~$ >>> dig +short -t txt kums.ac.ir
"v=spf1 mx a a:dns.kums.ac.ir a:dns2.kums.ac.ir ~all"
laptop3:~$ >>> dig +short -t mx kums.ac.ir
10 mail.kums.ac.ir.
laptop3:~$ >>> dig +short mail.kums.ac.ir
5.63.15.95
A lot happened there at the shell prompt. I'll explain it a little.
For clarity I'll edit the laptop's prompts to make them a single '$'
symbol, instead of what I usually see on the laptop. I work remotely
on a lot of machines at once, and I need to see some information in
the prompts so I don't accidentally do things on the wrong machine!
First I did a reverse DNS lookup on the IP address using 'dig'.
$ dig +short -x 5.63.15.95
webmail.kums.ac.ir.
The reply came back with the name of the server, webmail.kums.ac.ir.
That agrees well with the mail headers. Probably because that's how
the mail headers were generated. :/
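In case it isn't obvious what 'dig -x' is doing: a reverse lookup is
an ordinary PTR query for a name built from the reversed octets of the
address. A tiny Python sketch (the function name is mine) which builds
that name without touching the network:

```python
import ipaddress

def ptr_name(ip):
    """Return the DNS name that a reverse (PTR) lookup for ip asks about."""
    return ipaddress.ip_address(ip).reverse_pointer

print(ptr_name("5.63.15.95"))  # 95.15.63.5.in-addr.arpa
```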
Then I did a forward lookup on the 'helo' name which it used when it
opened the SMTP conversation with your server (or maybe it was your
mail provider's server) to send the message - again using 'dig'.
$ dig +short -t a server507.dnslake.com
5.63.15.95
The mail server probably didn't do that - so it's new information.
The IP address for that name is also 5.63.15.95. All makes sense so
far, I'm starting to think I might even believe some of it. I won't
do more DNS queries for that name right now, do them in your own time
later if you wish. Count the nameservers. Same for the KUMS domain.
Consider what will happen if they have a single-point network failure,
and whether you think the administrators are doing a good job of work.
Back to digging: I saw that there is a "Received-SPF:" header in the
message. The SPF result was 'pass' although whatever utility actually
added the header wasn't very helpful. By way of explanation it says
only "sender SPF authorized". It doesn't say which part of the SPF
record it used to arrive at that result, and I want to find out.
As luck is on my side it doesn't take too long - three DNS queries.
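As an aside, the fields in a "Received-SPF:" header like the one
quoted earlier are easy to pull apart mechanically. This is a minimal
Python sketch, not a full parser: the function name is invented, and it
assumes simple unquoted key=value pairs separated by semicolons, which
is what that header happens to contain.

```python
import re

def parse_received_spf(value):
    """Split a Received-SPF value into its result word and key=value pairs."""
    result = value.split(None, 1)[0].lower()
    # Keys may contain hyphens; values run up to the next ';' or whitespace.
    pairs = dict(re.findall(r"([\w-]+)=([^;\s]+)", value))
    return result, pairs

header_value = (
    "Pass (sender SPF authorized) identity=mailfrom; "
    "client-ip=5.63.15.95; helo=server507.dnslake.com; "
    "envelope-from=y.safary@kums.ac.ir; receiver=y.safary@kums.ac.ir"
)
result, fields = parse_received_spf(header_value)
print(result, fields["client-ip"], fields["helo"])
```

That gives the result word plus the client IP and helo name, which are
the inputs for the DNS checks.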
The first query I make is
$ dig +short -t txt kums.ac.ir
"v=spf1 mx a a:dns.kums.ac.ir a:dns2.kums.ac.ir ~all"
which says "show me all the TXT records for the domain 'kums.ac.ir'".
There's only one, and it's the SPF record I'm looking for. It starts
with "v=spf1 mx a ..." which is my first (and only) bit of bad luck.
The people who wrote this record don't really know what they're doing
(as we could have guessed, because they're letting their compromised
machines send fraudulent mail all over the planet) so they've used an
obviously inefficient mechanism for the first term in the SPF record,
which means I have to do two more DNS queries to find what I want.
So we keep digging. The SPF record says that any MX for the domain is
authorized to send mail on its behalf. So my next DNS lookup is
$ dig +short -t mx kums.ac.ir
10 mail.kums.ac.ir.
which says "show me the MX records for 'kums.ac.ir'". It tells us that
there's one MX for the domain, "mail.kums.ac.ir". Let's see what IPs
we can find for it:
$ dig +short mail.kums.ac.ir
5.63.15.95
That's an 'A' lookup (dig does those by default if you don't tell it)
and that's the IP which sent the message. Everything hangs together
really well here. It doesn't always do that, and then you might have
to do some serious digging. :(
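To make the three-query walk a bit more concrete, here is a Python
sketch which goes through the SPF record term by term and reports the
first 'mx' or 'a' mechanism that covers the sending IP. It does no DNS
at all: the MX and A tables are filled in by hand from the dig output
above, and any name I didn't look up is simply treated as having no
address. That is an assumption of the sketch, not a fact about their
DNS. It also ignores qualifiers, 'include:', 'ip4:' and the rest of
SPF, so treat it as an illustration only.

```python
import ipaddress

# Answers copied from the dig output above. Names that were not queried
# are absent, so they resolve to nothing here (assumption of this sketch).
MX = {"kums.ac.ir": ["mail.kums.ac.ir"]}
A = {"mail.kums.ac.ir": ["5.63.15.95"], "server507.dnslake.com": ["5.63.15.95"]}

def authorizing_mechanism(spf_record, domain, client_ip):
    """Return the first 'mx' or 'a' term that matches client_ip, or None."""
    ip = ipaddress.ip_address(client_ip)
    for term in spf_record.split()[1:]:        # skip the "v=spf1" tag
        name, _, target = term.partition(":")
        target = target or domain              # bare "mx"/"a" mean the domain itself
        if name == "mx":
            hosts = MX.get(target, [])         # MX names, then their addresses
        elif name == "a":
            hosts = [target]
        else:
            continue                           # "~all" etc. are not evaluated here
        addresses = [addr for host in hosts for addr in A.get(host, [])]
        if ip in map(ipaddress.ip_address, addresses):
            return term
    return None

record = "v=spf1 mx a a:dns.kums.ac.ir a:dns2.kums.ac.ir ~all"
print(authorizing_mechanism(record, "kums.ac.ir", "5.63.15.95"))  # mx
```

With the data from this post the answer is the very first term, 'mx',
which matches what the manual digging found.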
Now I'm in a position to say that if this message had been sent to us,
I'd never have seen it unless I looked for it in the logs or database
for rejected messages (yes, we write them to a database).
Our filter rules block all mail from Iran:
mail6:~$ >>> grep IR /etc/mail/.../extensible-milter_country_blacklist
ID => 4, IL => 4, IN => 4, IQ => 4, IR => 4, IS => 1, IT => 1,
These are configuration values for a Perl hash, which can effectively
block mail from countries which are never expected to send mail to us.
The 'ISO' two-letter country code for Iran is 'IR'. It gets a score
of '4' in this configuration. The meaning (to our Sendmail milter) is
"Don't accept any mail from IPs associated with this country code".
Here's why I block all mail from Iran:
Every time a connection comes in, the server looks up the IP in about
a dozen DNS Block Lists (DNSBLs). Each list has a unique policy for
listing, and we give any listing a 'weight' which ranges from 1 to 3.
The weights are all added together to reach a score. The server will
reject outright any connection from a server which scores 3 or more in
this calculation. A listing at Spamhaus for example has a weight
of 3. If your IP is listed by Spamhaus, we'll never accept mail from
you. When we get a connection, the milter writes stuff to a database
for perusal at our whim. Now let's see the average DNSBL score for
all connections _ever_ made to our servers from Iranian IPs:
milter=> SELECT avg(bl_score) FROM connections WHERE country_code='IR';
avg
---------------------
10.8538205980066445
(1 row)
I could have calculated something like a standard deviation as well
but you get the idea. The average DNSBL score for connections from
IPs listed in the GeoIP database as being in Iran is just under 11.
We don't do business in Iran and the average DNSBL score is about 11
so we don't accept mail from Iran - it will undoubtedly be malicious.
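The weighting scheme described above is simple enough to sketch in a
few lines of Python. Only the Spamhaus weight of 3 and the reject
threshold of 3 come from this post. The zone names (including which
Spamhaus zone is meant) and the other weights are placeholders invented
for the example, and the function names are mine. No DNS lookups are
done here; the sketch only builds the query name and adds up weights
for whichever lists are assumed to have returned a hit.

```python
import ipaddress

# Only "Spamhaus = 3" is from the post. The other zones and weights are
# made-up placeholders, as is the choice of Spamhaus zone name.
WEIGHTS = {"zen.spamhaus.org": 3, "dnsbl.example.net": 1, "rbl.example.org": 2}
REJECT_AT = 3  # reject outright at a total of 3 or more

def dnsbl_query_name(ip, zone):
    """Build the name looked up in a DNSBL: reversed octets plus the zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone

def bl_score(listed_in):
    """Add up the weights of every list that returned a hit."""
    return sum(WEIGHTS[zone] for zone in listed_in)

def verdict(listed_in):
    return "reject" if bl_score(listed_in) >= REJECT_AT else "accept"

print(dnsbl_query_name("5.63.15.95", "zen.spamhaus.org"))
print(verdict(["dnsbl.example.net"]))
print(verdict(["dnsbl.example.net", "rbl.example.org"]))
```

A single hit on a low-weight list is let through, while two smaller
hits together reach the threshold, which is the point of weighting.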
All that goes for Indonesia, Israel, India and Iraq as well. Their
average scores are 12.5, 9.7, 11.0 and 13.8 respectively. Italy (9.7)
and Iceland (4.3) get the benefit of the rather considerable doubt in
our config as we might occasionally do business there. Italy is quite
a big spam source in my experience so it gets a score, Iceland is not
such a problem but we're less likely to be doing business there. You
might want to think about writing ClamAV signatures to match headers.
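The country table shown in the grep output earlier could be consulted
along these lines. This is a Python rendering for illustration only;
our milter is Perl and I'm not reproducing its code here. The scores
are copied from that output. The threshold of 4, and treating unlisted
countries as 0, are assumptions made for the sketch.

```python
# Scores copied from the configuration line shown earlier.
COUNTRY_SCORE = {"ID": 4, "IL": 4, "IN": 4, "IQ": 4, "IR": 4, "IS": 1, "IT": 1}

# Assumption: the post only says that a score of 4 means "don't accept".
REJECT_AT = 4

def country_verdict(iso_code):
    """Return 'reject' for blocked country codes, else 'continue'."""
    score = COUNTRY_SCORE.get(iso_code.upper(), 0)  # unlisted: 0 (assumption)
    return "reject" if score >= REJECT_AT else "continue"

print(country_verdict("IR"))
print(country_verdict("IT"))
```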
A quick whois on the IP 5.63.15.95 shows they just have a /24. Using
the very useful 'valli.org' I looked up the sending IP in quite a lot
of DNS block lists:
http://multirbl.valli.org/lookup/5.63.15.95.html
It seems the IP range is relatively well managed, by comparison with
some that I see. There was only one list hit there, and it's a list
that we don't use. So although a GeoIP country code check would have
caught this one, our DNSBL score check would not. That's the problem
with DNSBLs when it's a very recent compromise. The spam traps have
not yet seen a lot of traffic from the IP or IP range. If Kermanshah
University's mail administrator gets on top of it quickly, their IPs
might not even be listed at all.
Summary:
For the majority of mail the IP is *much* easier to use for blocking,
and more reliable, than writing expressions to match on body content.
If you can relatively easily cut out the vast majority of the trash
you can concentrate your efforts more effectively on what remains.
You can write ClamAV signatures for more than just the content, and it
can be easier to target the metadata than to target the actual content
(which is totally controlled by the malware authors so it's easier for
them to obfuscate than is much of the metadata).
You shouldn't overlook the obvious. Here's another header from the
malicious mail which was sent to you:
> From: "UCO Bank" <y.safary@kums.ac.ir>
This is a phishing thing. We reject any mail with the word 'bank' in
a 'From:' header using a Perl regex in the (extensible-)milter. I'd
think that a Yara pattern to do that could work well. If you must,
you could exclude your own bank from the pattern - but we don't even
do that.
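For illustration, the 'bank' test on the 'From:' header could look
something like this in Python. It is not our milter code (that's Perl);
the function name is invented, and the sketch doesn't decode RFC 2047
encoded words, so a sender who encodes the display name would slip
past it.

```python
import re
from email import message_from_string

BANK = re.compile(r"bank", re.IGNORECASE)

def from_mentions_bank(raw_message):
    """Return True if any From: header contains 'bank' (case-insensitive)."""
    msg = message_from_string(raw_message)
    # Only the From: header values are examined, never the body.
    return any(BANK.search(value) for value in msg.get_all("From", []))

raw = 'From: "UCO Bank" <y.safary@kums.ac.ir>\nSubject: hello\n\nNothing to see.\n'
print(from_mentions_bank(raw))
```

Parsing the message first means that the word 'bank' appearing only in
the body doesn't trigger the check.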
Here are some Yara matches that I cobbled together to match some of
this afternoon's spam, and one to match a 'From:' header with "bank"
in it. None of the afternoon spam would have got through anyway, so I
wouldn't normally bother doing this - it's just for example.
8<----------------------------------------------------------------------
pi4b530214:/EXPORTS/clamav/databases# >>> cat My_Spam.yara
rule My_Spam_Rule // block some random spam
{
strings:
$mymatcha = /\r\nSubject:[\W\w]*B2B marketing/ nocase ascii
$mymatchb = /\r\nSubject:[\W\w]*Free SEO Audit/ nocase ascii
$mymatchc = /\r\nFrom:[\W\w]*bank/ nocase ascii
$mymatchd = "Summ Now" nocase ascii
condition:
$mymatcha or $mymatchb or $mymatchc or $mymatchd
}
8<----------------------------------------------------------------------
As we're scanning mail using clamd and a milter, all I have to do is
drop the file into the clamav database directory and issue a RELOAD
command to clamd. Using clamdscan the setup behaves much the same.
Below you can see clamd logging the database reload and then the spam
that it's finding:
8<----------------------------------------------------------------------
Sep 21 17:26:35 clamd[31759]: Reading databases from /EXPORTS/clamav/databases
Sep 21 17:27:39 clamd[31759]: Database correctly reloaded (11319285 signatures)
Sep 21 17:27:39 clamd[31759]: Activating the newly loaded database...
Sep 21 17:28:50 clamd[31759]: instream(192.168.44.11@34538): YARA.My_Spam_rule.UNOFFICIAL FOUND
Sep 21 17:29:48 clamd[31759]: instream(192.168.44.11@34552): Sanesecurity.Jurlbl.0fddc2.UNOFFICIAL FOUND
Sep 21 17:29:57 clamd[31759]: instream(192.168.44.11@34556): YARA.My_Spam_Rule.UNOFFICIAL FOUND
Sep 21 17:30:35 clamd[31759]: instream(192.168.44.11@34572): OK
Sep 21 17:31:47 clamd[31759]: instream(192.168.44.11@34586): YARA.My_Spam_rule.UNOFFICIAL FOUND
Sep 21 17:42:51 clamd[31759]: instream(192.168.44.11@34712): OK
Sep 21 17:46:18 clamd[31759]: instream(192.168.44.11@34746): YARA.My_Spam_rule.UNOFFICIAL FOUND
Sep 21 17:50:49 clamd[31759]: instream(192.168.44.11@34798): YARA.My_Spam_rule.UNOFFICIAL FOUND
Sep 21 17:50:57 clamd[31759]: instream(192.168.44.11@34812): YARA.My_Spam_Rule.UNOFFICIAL FOUND
Sep 21 18:03:25 clamd[31759]: instream(192.168.44.11@34946): OK
Sep 21 18:03:40 clamd[31759]: instream(192.168.44.11@34950): Heuristics.Phishing.Email.SSL-Spoof FOUND
Sep 21 18:03:50 clamd[31759]: instream(192.168.44.11@34954): YARA.My_Spam_Rule.UNOFFICIAL FOUND
8<----------------------------------------------------------------------
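For what it's worth, the RELOAD step mentioned earlier is just a short
command sent to clamd's control socket. Here is a minimal Python sketch
of it, under a few assumptions: that clamd is listening on a local Unix
socket, that the newline-delimited 'n' command prefix described in the
clamd man page is used, and that the function names (which are mine)
suit you. The socket path is not something I can guess for your system;
use whatever LocalSocket is set to in your clamd.conf.

```python
import socket

def clamd_command(name):
    """Frame a clamd command using the newline-delimited 'n' prefix."""
    return b"n" + name.encode("ascii") + b"\n"

def reload_clamd(socket_path):
    """Send RELOAD to clamd over a Unix socket and return its one-line reply."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as conn:
        conn.connect(socket_path)
        conn.sendall(clamd_command("RELOAD"))
        return conn.recv(1024).decode("ascii", "replace").strip()

print(clamd_command("RELOAD"))
```

Nothing is sent until reload_clamd() is called with a real socket path.
If I remember the man page correctly, the reply to a successful request
is "RELOADING".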
Check ClamAV's limitations for Yara rules, and test a few before you
write lots of them. Watch out for efficiency issues; you don't want
to DoS your own machines. This applies to all regex-type matching.
Of course I had to remove those Yara rules to send this message. :/
Sorry, no time for more now, have to run.
HTH
--
73,
Ged.
(*) I don't really trust that yet.
(**) If you'll believe that you'll believe anything.
_______________________________________________
clamav-users mailing list
clamav-users@lists.clamav.net
https://lists.clamav.net/mailman/listinfo/clamav-users
Help us build a comprehensive ClamAV guide:
https://github.com/vrtadmin/clamav-faq
http://www.clamav.net/contact.html#ml