Mailing List Archive

Catch subtly-different Reply-To domain
Is there a rule to catch cases where the domain of the Reply-To header
is a subtle variant of that in the From header? Take this (real) example
from a phishing email sent yesterday:

From: "Karen Howard" <karen@interfacefm.com>
Reply-To: "Karen Howard" <karen@intrefacefm.com>

I realise that other elements of the address can be different without
being a reliable spam indicator but I think that interfacefm.com ->
intrefacefm.com are so similar and yet different that they should be
worth a few points. But I can't think how to write such a rule myself.
Re: Catch subtly-different Reply-To domain
On 2021-02-20 08:58, Dominic Raferd wrote:
> Is there a rule to catch cases where the domain of the Reply-To header
> is a subtle variant of that in the From header? Take this (real) example
> from a phishing email sent yesterday:
>
> From: "Karen Howard" <karen@interfacefm.com>
> Reply-To: "Karen Howard" <karen@intrefacefm.com>
>
> I realise that other elements of the address can be different without
> being a reliable spam indicator but I think that interfacefm.com ->
> intrefacefm.com are so similar and yet different that they should be
> worth a few points. But I can't think how to write such a rule myself.

Use the "Damerau–Levenshtein distance" to calculate the similarity.
I have long wanted to try this, but never found the time.

Michael
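
For anyone who wants to see the metric in action, here is a minimal Python sketch of the restricted Damerau–Levenshtein distance (also known as optimal string alignment). The function name osa_distance is made up for illustration; it is not part of SpamAssassin or of any plugin.

```python
def osa_distance(a: str, b: str) -> int:
    """Restricted Damerau-Levenshtein (optimal string alignment) distance."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all i characters of a
    for j in range(cols):
        d[0][j] = j  # insert all j characters of b
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution (or match)
            )
            # Swap of two adjacent characters counts as a single edit.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[rows - 1][cols - 1]


print(osa_distance("interfacefm.com", "intrefacefm.com"))  # prints 1
```

The two domains in the example differ only by the swapped "er"/"re", so the distance is 1.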
Re: Catch subtly-different Reply-To domain
On Sun, 21 Feb 2021 11:28:51 +0100
Michael Storz wrote:

> On 2021-02-20 08:58, Dominic Raferd wrote:
> > Is there a rule to catch cases where the domain of the Reply-To
> > header is a subtle variant of that in the From header? Take this
> > (real) example from a phishing email sent yesterday:
> >
> > From: "Karen Howard" <karen@interfacefm.com>
> > Reply-To: "Karen Howard" <karen@intrefacefm.com>

> Use the "Damerau–Levenshtein distance" to calculate the similarity.
> I have long wanted to try this, but never found the time.

Did you have a particular use in mind for that? The example above doesn't
seem all that useful as a phishing technique, as it will fail DMARC.

My suspicion is that they are trying to exploit mail systems that
haven't yet adopted DMARC checking and that interfacefm.com was chosen
for its SPF record:

v=spf1 +a +mx +a:ns1.c57578.sgvps.net include:_spf.mailspamprotection.com

There's no -all or ~all on the end.
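
The missing "all" can be checked mechanically. What follows is a rough Python sketch rather than a full SPF parser: it only looks for an "all" mechanism in the top-level record and ignores redirect= modifiers and the contents of includes. The helper name spf_all_qualifier is hypothetical.

```python
from typing import Optional


def spf_all_qualifier(record: str) -> Optional[str]:
    """Return the qualifier of the 'all' mechanism ('+', '-', '~' or '?'),
    or None if the record has no 'all' mechanism."""
    for term in record.split()[1:]:  # skip the leading "v=spf1"
        t = term.lower()
        if t == "all":
            return "+"  # a bare "all" has the default "+" qualifier
        if len(t) == 4 and t[0] in "+-~?" and t[1:] == "all":
            return t[0]
    return None


record = "v=spf1 +a +mx +a:ns1.c57578.sgvps.net include:_spf.mailspamprotection.com"
print(spf_all_qualifier(record))  # prints None
```

With no "all" mechanism and no redirect modifier, RFC 7208 gives a default result of neutral when nothing matches, so mail from unlisted hosts is not explicitly failed.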
Re: Catch subtly-different Reply-To domain
On 21/02/2021 13:56, RW wrote:
> On Sun, 21 Feb 2021 11:28:51 +0100
> Michael Storz wrote:
>
>> On 2021-02-20 08:58, Dominic Raferd wrote:
>>> Is there a rule to catch cases where the domain of the Reply-To
>>> header is a subtle variant of that in the From header? Take this
>>> (real) example from a phishing email sent yesterday:
>>>
>>> From: "Karen Howard" <karen@interfacefm.com>
>>> Reply-To: "Karen Howard" <karen@intrefacefm.com>
>> Use the "Damerau–Levenshtein distance" to calculate the similarity.
>> I have long wanted to try this, but never found the time.
> Did you have a particular use in mind for that? The example above doesn't
> seem all that useful as a phishing technique, as it will fail DMARC.
>
> My suspicion is that they are trying to exploit mail systems that
> haven't yet adopted DMARC checking and that interfacefm.com was chosen
> for its SPF record:
>
> v=spf1 +a +mx +a:ns1.c57578.sgvps.net include:_spf.mailspamprotection.com
>
> There's no -all or ~all on the end.
Yes this mail passed DMARC and it is cases like this that I want to
catch. 99% of domains have not implemented full DMARC with
p=quarantine|reject, so one can't rely on it (although it has a valuable
role).
Re: Catch subtly-different Reply-To domain
On Sun, 21 Feb 2021 14:04:20 +0000
Dominic Raferd wrote:

> On 21/02/2021 13:56, RW wrote:

> >>> From: "Karen Howard" <karen@interfacefm.com>
> >>> Reply-To: "Karen Howard" <karen@intrefacefm.com>

> Yes this mail passed DMARC

How did it pass DMARC when it has the domain being spoofed in the from
header?
Re: Catch subtly-different Reply-To domain
On 2021-02-21 17:00, RW wrote:
> On Sun, 21 Feb 2021 14:04:20 +0000
> Dominic Raferd wrote:
>
>> On 21/02/2021 13:56, RW wrote:
>
>> >>> From: "Karen Howard" <karen@interfacefm.com>
>> >>> Reply-To: "Karen Howard" <karen@intrefacefm.com>
>
>> Yes this mail passed DMARC
>
> How did it pass DMARC when it has the domain being spoofed in the from
> header?

Both domains can have DMARC records, but DMARC only tests the From header.

DKIM can also sign the Reply-To header.
Re: Catch subtly-different Reply-To domain
On 21/02/2021 16:20, Benny Pedersen wrote:
> On 2021-02-21 17:00, RW wrote:
>> On Sun, 21 Feb 2021 14:04:20 +0000
>> Dominic Raferd wrote:
>>
>>> On 21/02/2021 13:56, RW wrote:
>>
>>> >>> From: "Karen Howard" <karen@interfacefm.com>
>>> >>> Reply-To: "Karen Howard" <karen@intrefacefm.com>
>>
>>> Yes this mail passed DMARC
>>
>> How did it pass DMARC when it has the domain being spoofed in the from
>> header?
>
> both domains can have dmarc, but only from header is dmarc tested
>
> and dkim can sign reply-to
and interfacefm.com (like most domains) does not publish a DMARC policy,
so it must pass
Re: Catch subtly-different Reply-To domain
On Sun, 21 Feb 2021 17:00:32 +0000
Dominic Raferd wrote:

> On 21/02/2021 16:20, Benny Pedersen wrote:
> > On 2021-02-21 17:00, RW wrote:
> >> On Sun, 21 Feb 2021 14:04:20 +0000
> >> Dominic Raferd wrote:
> >>
> >>> On 21/02/2021 13:56, RW wrote:
> >>
> >>> >>> From: "Karen Howard" <karen@interfacefm.com>
> >>> >>> Reply-To: "Karen Howard" <karen@intrefacefm.com>
> >>
> >>> Yes this mail passed DMARC
> >>
> >> How did it pass DMARC when it has the domain being spoofed in the
> >> from header?
> >
> > both domains can have dmarc, but only from header is dmarc tested
> >
> > and dkim can sign reply-to
> and interfacefm.com (like most domains) does not publish a DMARC
> policy, so it must pass


But it does:

$ dig +short txt _dmarc.interfacefm.com
"v=DMARC1; p=none; rua=mailto:postmaster@interfacefm.com"

Presumably interfacefm.com has been hacked, but not to the extent that
they can intercept incoming replies.
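
To make the difference between publishing a DMARC record and enforcing one concrete, here is a small Python sketch that parses the record returned above. parse_dmarc and is_enforcing are hypothetical helper names, and the sketch deliberately ignores the pct and sp tags.

```python
def parse_dmarc(record: str) -> dict:
    """Split a DMARC TXT record into a {tag: value} dictionary."""
    tags = {}
    for part in record.split(";"):
        if "=" in part:
            key, _, value = part.partition("=")
            tags[key.strip().lower()] = value.strip()
    return tags


def is_enforcing(record: str) -> bool:
    """True only if the policy asks receivers to quarantine or reject."""
    tags = parse_dmarc(record)
    return tags.get("v") == "DMARC1" and tags.get("p", "").lower() in (
        "quarantine",
        "reject",
    )


record = "v=DMARC1; p=none; rua=mailto:postmaster@interfacefm.com"
print(parse_dmarc(record)["p"])  # prints none
print(is_enforcing(record))      # prints False
```

So the domain does publish a policy, but p=none only requests reports and asks receivers to take no action on failures.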
Re: Catch subtly-different Reply-To domain
On 21/02/2021 17:37, RW wrote:
> On Sun, 21 Feb 2021 17:00:32 +0000
> Dominic Raferd wrote:
>
>> On 21/02/2021 16:20, Benny Pedersen wrote:
>>> On 2021-02-21 17:00, RW wrote:
>>>> On Sun, 21 Feb 2021 14:04:20 +0000
>>>> Dominic Raferd wrote:
>>>>
>>>>> On 21/02/2021 13:56, RW wrote:
>>>>
>>>>>>>> From: "Karen Howard" <karen@interfacefm.com>
>>>>>>>> Reply-To: "Karen Howard" <karen@intrefacefm.com>
>>>>
>>>>> Yes this mail passed DMARC
>>>> How did it pass DMARC when it has the domain being spoofed in the
>>>> from header?
>>> both domains can have dmarc, but only from header is dmarc tested
>>>
>>> and dkim can sign reply-to
>> and interfacefm.com (like most domains) does not publish a DMARC
>> policy, so it must pass
> But it does:
>
> $ dig +short txt _dmarc.interfacefm.com
> "v=DMARC1; p=none; rua=mailto:postmaster@interfacefm.com"
>
> Presumably interfacefm.com has been hacked, but not to the extent that
> they can intercept incoming replies.

I stand corrected; but as they specify p=none, the mail must still pass.
Re: Catch subtly-different Reply-To domain
On 2021-02-21 19:44, Dominic Raferd wrote:

>> Presumably interfacefm.com has been hacked, but not to the extent that
>> they can intercept incoming replies.
>
> I stand corrected; but as they specify p=none, the mail must still
> pass.

In what way should it pass?

DMARC tests SPF and DKIM, and OpenDMARC from GitHub trunk validates ARC
chains as well; there is no guarantee that anything passes.

Only SendGrid made that mistake, sorry SendGrid.
Re: Catch subtly-different Reply-To domain
On 21/02/2021 20:09, Benny Pedersen wrote:
> On 2021-02-21 19:44, Dominic Raferd wrote:
>
>>> Presumably interfacefm.com has been hacked, but not to the extent that
>>> they can intercept incoming replies.
>>
>> I stand corrected; but as they specify p=none, the mail must still pass.
>
> In what way should it pass?
>
> DMARC tests SPF and DKIM, and OpenDMARC from GitHub trunk validates ARC
> chains as well; there is no guarantee that anything passes.
>
> Only SendGrid made that mistake, sorry SendGrid.

p=none is an instruction from the domain controller *not* to reject
emails from their domain even when they fail DMARC testing. So the end
result is that this mail should pass through DMARC testing.

DMARC is a red herring here. My original question wouldn't be relevant
if the sending domain had an enforced DMARC policy
(p=quarantine|reject), but they don't.

Michael's suggestion is interesting. There is a github project allowing
Levenshtein numbers to be calculated and used in SA, I will see if there
is a way to apply it in this situation. Thanks to all for their input.
Re: Catch subtly-different Reply-To domain
On 2021-02-21 23:00, Dominic Raferd wrote:

> p=none is an instruction from the domain controller *not* to reject
> emails from their domain even when they fail DMARC testing. So the end
> result is that this mail should pass through DMARC testing.

Remember that DMARC can pass on an SPF pass alone, even if DKIM fails.

This is probably not what many people expect.

> DMARC is a red herring here. My original question wouldn't be relevant
> if the sending domain had an enforced DMARC policy
> (p=quarantine|reject), but they don't.

Good, but there is still no guarantee that SPF or DKIM pass, or even
that the ARC seal gives a pass result.

> Michael's suggestion is interesting. There is a github project
> allowing Levenshtein numbers to be calculated and used in SA, I will
> see if there is a way to apply it in this situation. Thanks to all for
> their input.

Like SpamAssassin's DKIM_VALID_EF, which is not part of the DKIM
standard. Do you want sid-milter results to be used in DMARC?

There is some low-hanging fruit that should never be used.
Re: Catch subtly-different Reply-To domain
On Sun, 21 Feb 2021, Dominic Raferd wrote:

> On 21/02/2021 20:09, Benny Pedersen wrote:
>> On 2021-02-21 19:44, Dominic Raferd wrote:
>>
>>>> Presumably interfacefm.com has been hacked, but not to the extent that
>>>> they can intercept incoming replies.
>>>
>>> I stand corrected; but as they specify p=none, the mail must still pass.
>>
>> In what way should it pass?
>>
>> DMARC tests SPF and DKIM, and OpenDMARC from GitHub trunk validates ARC
>> chains as well; there is no guarantee that anything passes.
>>
>> Only SendGrid made that mistake, sorry SendGrid.
>
> p=none is an instruction from the domain controller *not* to reject emails
> from their domain even when they fail DMARC testing. So the end result is
> that this mail should pass through DMARC testing.
>
> DMARC is a red herring here. My original question wouldn't be relevant if the
> sending domain had an enforced DMARC policy (p=quarantine|reject), but they
> don't.
>
> Michael's suggestion is interesting. There is a github project allowing
> Levenshtein numbers to be calculated and used in SA, I will see if there is a
> way to apply it in this situation. Thanks to all for their input.

It would have to be a plugin, and there's a CPAN module for calculating
Levenshtein numbers so most of the heavy lifting is already done.


--
John Hardin KA7OHZ http://www.impsec.org/~jhardin/
jhardin@impsec.org pgpk -a jhardin@impsec.org
key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
Avatar: the highest grossing Pocahontas remake ever. -- Chris Sauer
-----------------------------------------------------------------------
Tomorrow: George Washington's 289th Birthday
Re: Catch subtly-different Reply-To domain
On Sun, 21 Feb 2021, John Hardin wrote:

> On Sun, 21 Feb 2021, Dominic Raferd wrote:
>
>> On 21/02/2021 20:09, Benny Pedersen wrote:
>>> On 2021-02-21 19:44, Dominic Raferd wrote:
>>>
>>>>> Presumably interfacefm.com has been hacked, but not to the extent that
>>>>> they can intercept incoming replies.
>>>>
>>>> I stand corrected; but as they specify p=none, the mail must still pass.
>>>
>>> In what way should it pass?
>>>
>>> DMARC tests SPF and DKIM, and OpenDMARC from GitHub trunk validates ARC
>>> chains as well; there is no guarantee that anything passes.
>>>
>>> Only SendGrid made that mistake, sorry SendGrid.
>>
>> p=none is an instruction from the domain controller *not* to reject emails
>> from their domain even when they fail DMARC testing. So the end result is
>> that this mail should pass through DMARC testing.
>>
>> DMARC is a red herring here. My original question wouldn't be relevant if
>> the sending domain had an enforced DMARC policy (p=quarantine|reject), but
>> they don't.
>>
>> Michael's suggestion is interesting. There is a github project allowing
>> Levenshtein numbers to be calculated and used in SA, I will see if there is
>> a way to apply it in this situation. Thanks to all for their input.
>
> It would have to be a plugin, and there's a CPAN module for calculating
> Levenshtein numbers so most of the heavy lifting is already done.

Sigh. Ignore that, that's exactly what it is. I need to stop replying so
quickly to stuff.

--
John Hardin KA7OHZ http://www.impsec.org/~jhardin/
Re: Catch subtly-different Reply-To domain
On Sun, 21 Feb 2021 16:32:01 -0800 (PST)
John Hardin wrote:

> On Sun, 21 Feb 2021, John Hardin wrote:
>
> > On Sun, 21 Feb 2021, Dominic Raferd wrote:

> >> Michael's suggestion is interesting. There is a github project
> >> allowing Levenshtein numbers to be calculated and used in SA, I
> >> will see if there is a way to apply it in this situation. Thanks
> >> to all for their input.
> >
> > It would have to be a plugin, and there's a CPAN module for
> > calculating Levenshtein numbers so most of the heavy lifting is
> > already done.
>
> Sigh. Ignore that, that's exactly what it is. I need to stop replying
> so quickly to stuff.

I don't think there was anything wrong in pointing out that it's
available from CPAN.

There is also a Damerau–Levenshtein version which is probably a better
choice as the transposition of two adjacent characters counts as 1
difference rather than 2.
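
To illustrate the "1 rather than 2" point, here is a minimal Python sketch of the plain Levenshtein distance, which has no transposition operation. The function name levenshtein is only for illustration and does not refer to the CPAN module's API.

```python
def levenshtein(a: str, b: str) -> int:
    """Plain Levenshtein distance: insertions, deletions, substitutions only."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution (or match)
            ))
        prev = curr
    return prev[-1]


print(levenshtein("interfacefm.com", "intrefacefm.com"))  # prints 2
```

The swapped "er"/"re" costs two substitutions here, whereas a Damerau–Levenshtein implementation would count the same swap as a single edit.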
Re: Catch subtly-different Reply-To domain
On Mon, 22 Feb 2021, RW wrote:

> On Sun, 21 Feb 2021 16:32:01 -0800 (PST)
> John Hardin wrote:
>
>> On Sun, 21 Feb 2021, John Hardin wrote:
>>
>>> On Sun, 21 Feb 2021, Dominic Raferd wrote:
>
>>>> Michael's suggestion is interesting. There is a github project
>>>> allowing Levenshtein numbers to be calculated and used in SA, I
>>>> will see if there is a way to apply it in this situation. Thanks
>>>> to all for their input.
>>>
>>> It would have to be a plugin, and there's a CPAN module for
>>> calculating Levenshtein numbers so most of the heavy lifting is
>>> already done.
>>
>> Sigh. Ignore that, that's exactly what it is. I need to stop replying
>> so quickly to stuff.
>
> I don't think there was anything wrong in pointing out that it's
> available from CPAN.
>
> There is also a Damerau–Levenshtein version which is probably a better
> choice as the transposition of two adjacent characters counts as 1
> difference rather than 2.

I was more sighing about: "allowing ... to be ... used in SA" "It would
have to be a plugin"

:)

--
John Hardin KA7OHZ http://www.impsec.org/~jhardin/
Re: Catch subtly-different Reply-To domain
On 22/02/2021 15:05, RW wrote:
> On Sun, 21 Feb 2021 16:32:01 -0800 (PST)
> John Hardin wrote:
>
>> On Sun, 21 Feb 2021, John Hardin wrote:
>>
>>> On Sun, 21 Feb 2021, Dominic Raferd wrote:
>>>> Michael's suggestion is interesting. There is a github project
>>>> allowing Levenshtein numbers to be calculated and used in SA, I
>>>> will see if there is a way to apply it in this situation. Thanks
>>>> to all for their input.
>>> It would have to be a plugin, and there's a CPAN module for
>>> calculating Levenshtein numbers so most of the heavy lifting is
>>> already done.
>> Sigh. Ignore that, that's exactly what it is. I need to stop replying
>> so quickly to stuff.
> I don't think there was anything wrong in pointing out that it's
> available from CPAN.
>
> There is also a Damerau–Levenshtein version which is probably a better
> choice as the transposition of two adjacent characters counts as 1
> difference rather than 2.
That sounds better, but I don't know how to employ it to make a rule for
SA. My idea is to compare the domain part of the 'From' and 'Reply-To'
addresses, scoring for a close but not exact match (maybe
Damerau–Levenshtein between 1 and 3). The same logic could also be used
to compare the domain part of the 'From' to a list of domains that are
prone to impersonation (and don't have DMARC policy with
p=reject|quarantine).
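
The idea above can be sketched outside SA in a few lines of Python. This is only an illustration of the logic, not a SpamAssassin plugin: all function names are hypothetical, the 1 to 3 threshold is the guess from the paragraph above, and the distance used is the restricted (optimal string alignment) form of Damerau–Levenshtein.

```python
from email.utils import parseaddr


def osa_distance(a: str, b: str) -> int:
    """Restricted Damerau-Levenshtein (optimal string alignment) distance."""
    d = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        d[i][0] = i
    for j in range(len(b) + 1):
        d[0][j] = j
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1, d[i - 1][j - 1] + cost)
            # Adjacent transposition counts as one edit.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


def domain_of(header_value: str) -> str:
    """Lower-cased domain of the address in a header value, or '' if none."""
    _, addr = parseaddr(header_value)
    if "@" not in addr:
        return ""
    return addr.rpartition("@")[2].lower()


def is_lookalike(dom_a: str, dom_b: str, max_distance: int = 3) -> bool:
    """True for a close but not exact match (distance 1..max_distance)."""
    if not dom_a or not dom_b:
        return False
    return 1 <= osa_distance(dom_a, dom_b) <= max_distance


def reply_to_lookalike(from_hdr: str, reply_to_hdr: str) -> bool:
    """Compare the From domain with the Reply-To domain."""
    return is_lookalike(domain_of(from_hdr), domain_of(reply_to_hdr))


def from_resembles_protected(from_hdr: str, protected: list) -> bool:
    """Compare the From domain with a list of impersonation-prone domains."""
    dom = domain_of(from_hdr)
    return any(is_lookalike(dom, p.lower()) for p in protected)


print(reply_to_lookalike('"Karen Howard" <karen@interfacefm.com>',
                         '"Karen Howard" <karen@intrefacefm.com>'))  # prints True
print(from_resembles_protected("karen@intrefacefm.com",
                               ["interfacefm.com"]))                 # prints True
```

A fixed threshold is likely to misfire on short domains (a.com and b.com are also at distance 1), so a real rule would probably want a minimum length or a threshold relative to the domain length.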
Re: Catch subtly-different Reply-To domain
On 22/02/2021 15:45, Dominic Raferd wrote:
> On 22/02/2021 15:05, RW wrote:
>>
>>>> On Sun, 21 Feb 2021, Dominic Raferd wrote:
>>>>> Michael's suggestion is interesting. There is a github project
>>>>> allowing Levenshtein numbers to be calculated and used in SA, I
>>>>> will see if there is a way to apply it in this situation. Thanks
>>>>> to all for their input.
>>>>
>> There is also a Damerau–Levenshtein version which is probably a better
>> choice as the transposition of two adjacent characters counts as 1
>> difference rather than 2.
> That sounds better, but I don't know how to employ it to make a rule for
> SA. My idea is to compare the domain part of the 'From' and 'Reply-To'
> addresses, scoring for a close but not exact match (maybe
> Damerau–Levenshtein between 1 and 3). The same logic could also be used
> to compare the domain part of the 'From' to a list of domains that are
> prone to impersonation (and don't have DMARC policy with
> p=reject|quarantine).

I have now implemented this using the (updated) code at
https://github.com/fmbla/spamassassin-levenshtein. This was super-easy
as the new LEVENSHTEIN_REPLY rule does exactly what I need - I just
added the 3 files to /etc/spamassassin and added 1 line to
/etc/spamassassin/z_local.cf:

score LEVENSHTEIN_REPLY 4

My thanks to the coder! Now I need a real-world case to see it in action...