Mailing List Archive

Dinged for .Date
Hello friends!

We make a handheld game system called Playdate, and our site lives at play.date. We find that our support email often doesn’t get delivered, making for occasionally very angry customers.

In debugging this, we’re looking at spam score.

In SA, .date is one of the “bad domains” that gets a heavily punished score (4.497) right out of the gate:

FROM_SUSPICIOUS_NTLD 0.499
FROM_SUSPICIOUS_NTLD_FP 1.999
PDS_OTHER_BAD_TLD 1.999
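
To put those three numbers in context, here is a minimal Python sketch that adds them up and compares the total with SpamAssassin's stock required_score of 5.0. The variable names are illustrative, and the 5.0 threshold is an assumption about the receiving site, since administrators can change it:

```python
# Rule scores as listed above for mail sent from a .date address.
tld_rule_scores = {
    "FROM_SUSPICIOUS_NTLD": 0.499,
    "FROM_SUSPICIOUS_NTLD_FP": 1.999,
    "PDS_OTHER_BAD_TLD": 1.999,
}

# Assumption: the receiving site keeps the stock required_score of 5.0.
REQUIRED_SCORE = 5.0

# Round to three decimals to avoid floating-point noise in the printout.
tld_penalty = round(sum(tld_rule_scores.values()), 3)
headroom = round(REQUIRED_SCORE - tld_penalty, 3)

print(f"Penalty from the TLD alone: {tld_penalty}")
print(f"Points left before the spam threshold: {headroom}")
```

Under that assumption, any other rules adding up to roughly half a point would be enough to push a message over the threshold.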

I found this bug on this topic:

https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7915

And a poster says “Unfortunately, the science backs up that the TLDs are problematic.”

I was trying to research “the science” to understand it. The SA code references the following four sources:

# new TLDs used for spamming
# https://www.spamhaus.org/statistics/tlds/
# http://www.surbl.org/tld
# https://ntldstats.com/fraud
# https://dnslytics.com/tld

Looking at these:

1. Spamhaus says that .date is 3.1% bad. (.com is 1.3% bad.)
2. SURBL ranks .date at #187.
3. The nTLDStats page is just returning a 404, idk
4. DNSlytics just returns basic information about .date, like that it has 6,977 domains

Can anyone help me understand “the science”? And how these domains are chosen for such a heavy punishment?

Is there any path to redemption, or is it that once they’re added, they're dinged forever?

Thanks for your help!

Best,
Cabel Sasser
Panic
Re: Dinged for .Date
On Mon, 2024-01-15 at 15:58 -0800, Cabel Sasser wrote:
>
> Can anyone help me understand “the science”? And how these domains are chosen for such a heavy punishment?

What you're facing is essentially an economic problem. Everyone knows
dot-com, and to a lesser extent dot-net and dot-org. But everything
else is junk: if you're the fifth guy to try to buy example.com, you're
probably not who people are looking for when they type www.example.com
into their web browsers. The other TLDs are also much harder for people
to remember if they see it on a commercial. As a result, dot-info, dot-
biz, and everything after have always been considered knock-offs.

When the wave of new gTLDs hit, the value of each successive one became
diluted even further. By the time you get to dot-date, you're at what
should be, like, somebody's 40th choice for a domain name. How do you
sell that? At a huge fucking discount, if you want anyone to buy it!

That's one half of your economic problem.

Now imagine you're trying to block spammers by domain name, and there's
one particular set of domain names that they can get at a 90% discount
because nobody wants them otherwise. Regardless of how many legitimate
companies use those domains, the signal to noise ratio is going to be
crap.

So, the other half of your economic problem is: how much money does it
cost me (as a recipient) to block dot-date, versus how much does it
cost me to not block it? We have customers who complain about spam and
customers who complain about blocked messages. It's a pretty easy
calculation for a recipient to make, and the result for me at least is
that it's less work (i.e. less expensive) to just block every new gTLD
and whitelist the few legitimate senders brave enough to live there.
Re: Dinged for .Date
Hi Michael!

I totally understand what you’re saying. I get it 100%. But your math doesn’t quite add up for me.

There are 1,239 gTLDs. The SpamAssassin source* blocks just *22* of them.

If you believe every new gTLD is garbage (and I get that!), why isn’t SpamAssassin automatically dinging, say, 1,200+ of them?

Or put another way, why _these_ 22, and _only_ these 22, and not the rest?

That’s the “science” I’m trying to understand! :)

(And I’m still curious if there is any path of redemption for these 22.)

Best,
Cabel
Panic

PS: In the future, believe me, we won’t pick any of the gTLDs in this list. It’s also possible we can just send email from panic.com which we’ve now owned for nearly 30 years, but I'm still really curious!

* Assuming I’m reading this right at https://apache.googlesource.com/spamassassin/+/refs/tags/sa-update_3.4.4_20220326083106/rulesrc/sandbox/pds/20_ntld.cf


> On Jan 15, 2024, at 4:35 PM, Michael Orlitzky <michael@orlitzky.com> wrote:
>
> On Mon, 2024-01-15 at 15:58 -0800, Cabel Sasser wrote:
>>
>> Can anyone help me understand “the science”? And how these domains are chosen for such a heavy punishment?
>
> What you're facing is essentially an economic problem. Everyone knows
> dot-com, and to a lesser extent dot-net and dot-org. But everything
> else is junk: if you're the fifth guy to try to buy example.com, you're
> probably not who people are looking for when they type www.example.com
> into their web browsers. The other TLDs are also much harder for people
> to remember if they see it on a commercial. As a result, dot-info, dot-
> biz, and everything after have always been considered knock-offs.
>
> When the wave of new gTLDs hit, the value of each successive one became
> diluted even further. By the time you get to dot-date, you're at what
> should be, like, somebody's 40th choice for a domain name. How do you
> sell that? At a huge fucking discount, if you want anyone to buy it!
>
> That's one half of your economic problem.
>
> Now imagine you're trying to block spammers by domain name, and there's
> one particular set of domain names that they can get at a 90% discount
> because nobody wants them otherwise. Regardless of how many legitimate
> companies use those domains, the signal to noise ratio is going to be
> crap.
>
> So, the other half of your economic problem is: how much money does it
> cost me (as a recipient) to block dot-date, versus how much does it
> cost me to not block it? We have customers who complain about spam and
> customers who complain about blocked messages. It's a pretty easy
> calculation for a recipient to make, and the result for me at least is
> that it's less work (i.e. less expensive) to just block every new gTLD
> and whitelist the few legitimate senders brave enough to live there.
Re: Dinged for .Date
On Mon, 15 Jan 2024, Cabel Sasser wrote:

> There are 1,239 gTLDs. The SpamAssassin source* blocks just *22* of them.
>
> If you believe every new gTLD is garbage (and I get that!), why isn’t SpamAssassin automatically dinging, say, 1,200+ of them?
>
> Or put another way, why _these_ 22, and _only_ these 22, and not the rest?
>
> That’s the “science” I’m trying to understand! :)

Primarily it's the real-world email traffic that scoring contributors use
to evaluate the effectiveness of the rules and automatically assign their
scores (called "masscheck"). We basically see a lot of spam from those 22
TLDs, and little or no ham, so rules that penalize those TLDs perform well
with few "false positives" in those corpora.
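
The idea of checking a TLD rule against labelled mail can be sketched in a few lines of Python. This is only a toy illustration, not the actual masscheck tooling: the corpus, the addresses in it, and the helper names are all made up, and SUSPECT_TLDS is a stand-in that contains only the one TLD named in this thread.

```python
from email.utils import parseaddr

# Hypothetical stand-in for the penalized TLD list; only .date comes from this thread.
SUSPECT_TLDS = {"date"}


def from_tld(from_header: str) -> str:
    """Return the lowercase TLD of the address in a From: header value."""
    _, address = parseaddr(from_header)
    domain = address.rpartition("@")[2]
    return domain.rpartition(".")[2].lower()


def rule_hits(from_header: str) -> bool:
    """True when the sender's TLD is on the suspect list."""
    return from_tld(from_header) in SUSPECT_TLDS


def evaluate(corpus):
    """Count rule hits on spam and on ham in a labelled corpus.

    corpus is an iterable of (from_header, is_spam) pairs.
    Returns (spam_hits, ham_hits, share_of_hits_that_are_spam).
    """
    spam_hits = sum(1 for hdr, is_spam in corpus if is_spam and rule_hits(hdr))
    ham_hits = sum(1 for hdr, is_spam in corpus if not is_spam and rule_hits(hdr))
    total = spam_hits + ham_hits
    share = spam_hits / total if total else None
    return spam_hits, ham_hits, share


# Invented sample messages, purely for illustration.
toy_corpus = [
    ("Deals <deals@cheap-offers.date>", True),
    ("win@prizes-now.date", True),
    ("Support <support@play.date>", False),
    ("Alice <alice@example.com>", False),
    ("bob@example.org", True),
]

spam_hits, ham_hits, share = evaluate(toy_corpus)
print(spam_hits, ham_hits, share)
```

A rule looks good in this kind of check when it hits a lot of spam and very little ham. A single legitimate sender such as play.date barely moves the result unless the contributors happen to receive much of its mail.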

> (And I’m still curious if there is any path of redemption for these 22.)

Most likely, SA specifically whitelisting legit domains in those poisonous
TLDs which are brought to our attention by, for instance, reports like
yours. Less likely but possible: seeing enough ham claiming to be from
those TLDs in the masscheck contributors' corpora that the scores for
those rules are automatically reduced.

A possible alternative that is under your control and will likely get
faster positive results than SA rules changes: register the domain
playdatesupport.com for your support department's use. They can still
*receive* email at support@play.date, but for outbound email that wouldn't
be the From: domain and thus wouldn't suffer the TLD reputational hit. (If
you do that, avoid setting "ReplyTo: support@play.date", as that would
also take a reputation hit.)
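
A minimal Python sketch of what such an outbound message could look like, using the standard library's email package. It assumes playdatesupport.com has actually been registered and set up for sending; the function name, display name, recipient, and message text are hypothetical:

```python
from email.message import EmailMessage


def build_support_reply(to_addr: str, subject: str, body: str) -> EmailMessage:
    """Build an outbound support message whose From: avoids the .date TLD."""
    msg = EmailMessage()
    # Assumption: playdatesupport.com is registered and configured for sending.
    msg["From"] = "Playdate Support <support@playdatesupport.com>"
    msg["To"] = to_addr
    msg["Subject"] = subject
    # Deliberately no Reply-To header pointing at play.date, since that
    # address would bring the TLD penalty back.
    msg.set_content(body)
    return msg


msg = build_support_reply(
    "customer@example.com",
    "Re: Your Playdate order",
    "Thanks for writing in!",
)
print(msg)
```

Inbound mail to support@play.date is unaffected by this; only the headers on outgoing messages change.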



--
John Hardin KA7OHZ http://www.impsec.org/~jhardin/
jhardin@impsec.org pgpk -a jhardin@impsec.org
key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
People that keep dreaming about the wasteland, labyrinths and
quick cash, die in amusing ways. -- Root the Dragon
-----------------------------------------------------------------------
2 days until Benjamin Franklin's 318th Birthday
Re: Dinged for .Date
On Mon, 2024-01-15 at 17:06 -0800, Cabel Sasser wrote:
>
> There are 1,239 gTLDs. The SpamAssassin source* blocks just *22* of them.
>

The official unofficial KAM ruleset blocks a few more, and there are
plenty of third-party URIBLs that essentially block gTLDs through SA,
albeit at one level of abstraction.


> If you believe every new gTLD is garbage (and I get that!), why isn’t SpamAssassin automatically dinging, say, 1,200+ of them?
>
> Or put another way, why _these_ 22, and _only_ these 22, and not the rest?

Be careful what you wish for :P
Re: Dinged for .Date
* Michael Orlitzky:

> the result for me at least is that it's less work (i.e. less
> expensive) to just block every new gTLD and whitelist the few
> legitimate senders brave enough to live there.

My guess is that a significant number of mail service administrators use
the same approach. I definitely do, and my experience with suspecting
every new gTLD to be abused by spammers has been a good one. It is not
nice for the few legitimate users out there to be required to prove
their legitimacy before being permitted to send mail to our servers, but
these users are so few and far between that I cannot even remember the
last time I cleared a sender's domain.

-Ralph
Re: Dinged for .Date
Hi Josh,

Thank you so much for your reply!

> Most likely, SA specifically whitelisting legit domains in those poisonous TLDs which are brought to our attention by, for instance, reports like yours. Less likely but possible: seeing enough ham claiming to be from those TLDs in the masscheck contributors' corpora that the scores for those rules are automatically reduced.
>
> A possible alternative that is under your control and will likely get faster positive results than SA rules changes: register the domain playdatesupport.com for your support department's use. They can still *receive* email at support@play.date, but for outbound email that wouldn't be the From: domain and thus wouldn't suffer the TLD reputational hit. (If you do that, avoid setting "ReplyTo: support@play.date", as that would also take a reputation hit.)

Great thoughts, and I’ll discuss them with the crew.

Regarding a (potential) whitelist of play.date — The Only Good .Date Domain® — would I… file a bug on that idea?

Best,
Cabel
Panic

PS: my last curiosity question: is there any built-in process within SA for re-reviewing the 22 “bad domains” periodically? Is it possible some get “better” over time or is that a pipe dream on my part?
Re: Dinged for .Date
On 1/16/2024 4:49 PM, Cabel Sasser wrote:
> Hi Josh,
>
> Thank you so much for your reply!
>
>> Most likely, SA specifically whitelisting legit domains in those poisonous TLDs which are brought to our attention by, for instance, reports like yours. Less likely but possible: seeing enough ham claiming to be from those TLDs in the masscheck contributors' corpora that the scores for those rules are automatically reduced.
>>
>> A possible alternative that is under your control and will likely get faster positive results than SA rules changes: register the domain playdatesupport.com for your support department's use. They can still *receive* email at support@play.date, but for outbound email that wouldn't be the From: domain and thus wouldn't suffer the TLD reputational hit. (If you do that, avoid setting "ReplyTo: support@play.date", as that would also take a reputation hit.)
> Great thoughts, and I’ll discuss them with the crew.


This - getting a .com domain to send mail - is really the only
choice you have.

If SpamAssassin were to whitelist your domain *today*, it would be
some period of time until all the people running SA have the updated
rules. I don't know how long, but I'm guessing many months. For
some, years.

I also can't imagine that SA is the only software filter preventing
you from successfully using your .date domain for mail, so fixing SA
won't do anything for those others.

The alternative is playing whack-a-mole asking individual sites to
whitelist you until the end of time.


  -- Noel Jones
Re: Dinged for .Date
Hi,

On Mon, Jan 15, 2024 at 05:06:11PM -0800, Cabel Sasser wrote:
> If you believe every new gTLD is garbage (and I get that!), why isn’t SpamAssassin automatically dinging, say, 1,200+ of them?

I have to second the advice to send email from a different domain.
It's just going to be the case that the .date TLD is abused by
people sending shadier dating-related emails and the operators of
that TLD have poisoned the well by making it cheap and easy to do
so.

Even if you somehow got negative scoring in SpamAssassin fixed for
your specific domain, there's going to be countless private,
non-SpamAssassin-based rule sets out there that penalise .date
domains.

It is a similar argument to "why can't I send email out from
$ARBITRARY_GHETTO_HOSTER ? I'm not a spammer!" You can argue forever
that as you're not a spammer and can prove you've never sent any
spam, ever, why would receivers penalise you just for being at a
hoster that is popular with a problematic class of clientele? The
answer being that recipients are just working with what info they
have, and it'll be hard work to convince a significant number of
them that you're different. Is the work worth it? Generally not;
other options exist.

Thanks,
Andy

--
https://bitfolk.com/ -- No-nonsense VPS hosting
Re: Dinged for .Date
On 2024-01-16 at 18:33:23 UTC-0500 (Tue, 16 Jan 2024 17:33:23 -0600)
Noel <noeldude@gmail.com>
is rumored to have said:

> This - getting a .com domain to send mail - is really the only choice
> you have.

I have not seen *.net or *.org domains run into major deliverability
problems, and some ccTLDs have reasonably decent reputations.

But yes, a *.com is how most people would want to go.

> If SpamAssassin were to whitelist your domain *today*, it would be some
> period of time until all the people running SA have the updated rules.
> I don't know how long, but I'm guessing many months. For some, years.

The long tail is long, but since we encourage all sites to get updates
daily, the sites which lag more than a week are likely failing in many
other ways as well. The long tail is very low. If I put a rule into my
SA sandbox tonight, and it is good enough, it will be on most SA
machines within 4-5 days and will be essentially everywhere worth caring
about in 10. If Kevin makes a change in the KAM list, most of his users
will have the rule the next day, as he does not depend on the RuleQA
process.

SA removing .date from the lists of suspect TLDs would likely fix all
noticeable problems the OP has related to SA within a fortnight. That
*DOES NOT* mean their headaches from using a .date domain would end,
because most users' mailboxes are not protected by SA directly or
indirectly.

> I also can't imagine that SA is the only software filter preventing
> you from successfully using your .date domain for mail, so fixing SA
> won't do anything for those others.

SA may have more installs than any other spam classification tool, but
there's a broad understanding amongst the maintainers that none of the
behemoth mailbox providers (Google, Microsoft, Yahoo/AOL/Oath, GMX,
Apple, etc.) use SA in any way. Fastmail may, Runbox does (or did a few
years ago,) Proton probably does, and it is pretty much universal in the
small-scale mailbox provider/outsourcer world, to the extent that world
still exists. And yet, we cannot compare in scale to the world that uses
proprietary secret filters.

> The alternative is playing whack-a-mole asking individual sites to
> whitelist you until the end of time.

In theory, yes. In practice, not so much. Once you get the big guys on
board and educate direct business partners, the number × size of sites
rejecting independently based on a TLD is not so big.



--
Bill Cole
bill@scconsult.com or billcole@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not Currently Available For Hire