homograph spam
Are there any plugins or techniques that can deal with UTF-8 homographs?
In particular, I'm seeing a lot of attempts to get past filters that
would match on a word like 'amazon', but do not catch it because the 'm'
has been replaced by the UTF-8 version of 'm' that looks identical.

I understand that UTF-8 From and Subject are legitimate, so I do not
want to just block those, but it seems like we should look for typical
homographs in the middle of words and add a weighted score for these.

I do have 'normalize_charset 1' set here.

--
micah
Re: homograph spam
On Wed, 17 Jun 2020, micah anderson wrote:

> Are there any plugins or techniques that can deal with UTF-8 homographs?
> In particular, I'm seeing a lot of attempts to get past filters that
> would match on a word like 'amazon', but do not catch it because the 'm'
> has been replaced by the UTF-8 version of 'm' that looks identical.

Yes, look at the FUZZY_* rules, the ReplaceTags plugin and the
25_replace.cf rules file in the base ruleset.

> I understand that UTF-8 From and Subject are legitimate, so I do not
> want to just block those, but it seems like we should look for typical
> homographs in the middle of words and add a weighted score for these.

Unfortunately that sort of obfuscation requires specific rules, as there's
no general way to detect it in arbitrary words.

You'd probably want something like:

ifplugin Mail::SpamAssassin::Plugin::ReplaceTags
# The lookahead skips the plain spelling, so only obfuscated forms hit
body FUZZY_AMAZON /\s(?!amazon)<A><M><A><Z><O><N>/i
replace_rules FUZZY_AMAZON
describe FUZZY_AMAZON Obfuscated "Amazon"
endif
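
If the stock <M> tag does not cover the look-alike you are seeing, a
local tag is one option. A rough sketch, not tested against real mail:
the tag name MHOMO and the rule name LOCAL_HOMOGRAPH_AMAZON are made up
for illustration, the two look-alike characters are only examples, and
it assumes the rule is matched against the UTF-8 bytes of the message
and that replace_start/replace_end are the usual < and > from
25_replace.cf.

```
ifplugin Mail::SpamAssassin::Plugin::ReplaceTags
# Hypothetical tag: plain "m" plus two look-alikes as UTF-8 byte sequences
#   \xe2\x85\xbf = U+217F SMALL ROMAN NUMERAL ONE THOUSAND
#   \xef\xbd\x8d = U+FF4D FULLWIDTH LATIN SMALL LETTER M
replace_tag MHOMO (?:m|\xe2\x85\xbf|\xef\xbd\x8d)
# The lookahead skips the plain spelling, as in the rule above
body LOCAL_HOMOGRAPH_AMAZON /\s(?!amazon)a<MHOMO>azon/i
replace_rules LOCAL_HOMOGRAPH_AMAZON
describe LOCAL_HOMOGRAPH_AMAZON "Amazon" with a look-alike "m"
# Assumed starting score; adjust after watching what it hits
score LOCAL_HOMOGRAPH_AMAZON 1.0
endif
```

Running "spamassassin --lint" after adding it should catch syntax
mistakes before the rule goes live.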


--
John Hardin KA7OHZ http://www.impsec.org/~jhardin/
jhardin@impsec.org FALaholic #11174 pgpk -a jhardin@impsec.org
key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79