Mailing List Archive

Routing failed deliveries through an ESP
In an ideal world, I'd have a single dnslookup router that happily
delivered mail all day long. But host reputation is a fickle beast, and
it's painful to have mail sit around deferred or frozen until I get our IP
taken off the DNSBL list of the week.

As a solution to this game of whack-a-mole, I'm curious if I can configure
Exim in such a way that it attempts a first delivery through the normal
dnslookup router, and if the mail is rejected, retry with a router that
sends it through an ESP (Mailgun in my case) which has way more IPs and
resources to keep them clean.

How might I configure my routers to ignore an initial 5xx response from the
first router and attempt another (and maybe future) deliveries through an
alternate router?

Thanks!
Lance
--
## List details at https://lists.exim.org/mailman/listinfo/exim-users
## Exim details at http://www.exim.org/
## Please use the Wiki with this list - http://wiki.exim.org/
Re: Routing failed deliveries through an ESP
On 17 April 2023 03:08:29 Lance Lovette via Exim-users
<exim-users@exim.org> wrote:
> How might I configure my routers to ignore an initial 5xx response from the
> first router and attempt another (and maybe future) deliveries through an
> alternate router?

I'm going to make the very obvious and morally correct answer: you don't.

If you get a 5xx error from the receiver's MX, you do the right thing and
abide by it. They're telling you they didn't want your message.

If you've got such a problem with IP or domain reputation that you end up
on DNSBLs with any frequency, you need to work on that rather than palming
off your messages to a third party.

That said: why not just send via the ESP in the first place?

Graeme

Re: Routing failed deliveries through an ESP
On 17/04/2023 02:01, Lance Lovette via Exim-users wrote:
> How might I configure my routers to ignore an initial 5xx response from the
> first router and attempt another (and maybe future) deliveries through an
> alternate router?

You can't. A permanent error response for a message is definitive.
--
Cheers,
Jeremy


Re: Routing failed deliveries through an ESP
> I'm going to make the very obvious and morally correct answer: you don't.

I truly understand and at a basic level agree with that position. I'm
simply trying to balance that with what is analogous to a short-term
network outage. I need to have a failover in place to keep the business
functioning while I work to resolve the issue.

> why not just send via the ESP in the first place?

Cost savings. We'd prefer to pay the ESP to deliver only what it must and
let our server deliver most of the messages most of the time.

This is top of mind now because we're about to stand up a new server and I
won't have a good picture of the IP reputation until the bounces start
rolling in. It will take weeks to get everything running smoothly. In the
meantime, bounces will cause chaos :)

The alternative is to implement a process outside of Exim that monitors the
reject log and re-attempts delivery, skipping dnslookup, but I'm hoping the
right Exim router configuration will save us the (non-trivial) effort.

Thanks!
Lance

Re: Routing failed deliveries through an ESP
• Lance Lovette via Exim-users [2023-04-16 21:01]:
[...]
> How might I configure my routers to ignore an initial 5xx response from the
> first router and attempt another (and maybe future) deliveries through an
> alternate router?

Maybe the recipient verification callout facility could be used: an ACL could
set ACL variables depending on the callout verification result, and a router
could later be selected depending on those ACL variables.

See
https://www.exim.org/exim-html-current/doc/html/spec_html/ch-access_control_lists.html#SECTcallver

See also
https://www.exim.org/exim-html-current/doc/html/spec_html/ch-access_control_lists.html#SECTaclvariables

K.

Re: Routing failed deliveries through an ESP
On 2023-04-17 at 03:54:37 UTC-0400 (Mon, 17 Apr 2023 08:54:37 +0100)
Graeme Fowler via Exim-users <graeme@graemef.net>
is rumored to have said:

> On 17 April 2023 03:08:29 Lance Lovette via Exim-users
> <exim-users@exim.org> wrote:
>> How might I configure my routers to ignore an initial 5xx response
>> from the
>> first router and attempt another (and maybe future) deliveries
>> through an
>> alternate router?
>
> I'm going to make the very obvious and morally correct answer: you
> don't.
>
> If you get a 5xx error from the receiver's MX, you do the right thing
> and abide by it. They're telling you they didn't want your message.

There's a rational basis for an exception for 5xx before MAIL FROM, when
the target only has the connection parameters and HELO name to use as a
basis for rejection. Re-routing via a fallback path isn't entirely
unjustifiable in that case, as it changes those elements of the
transaction.


> If you've got such a problem with IP or domain reputation that you end
> up on DNSBLs with any frequency, you need to work on that rather than
> palming off your messages to a third party.

Like it or not, DNSBLs are far from the only reason MTAs use to reject
mail. In the case of early 5xx rejections, it is likely that a public
DNSBL is not the mechanism in use. Fixing whatever problem caused a
particular site to get cranky about Linode or OVH or Digital Ocean or
whatever other garbage VPS provider is a problem this month isn't
feasible for their individual customers.

> That said: why not just send via the ESP in the first place?

ESPs come with their own reputational issues. Deliverability for modest
volume non-bulk mail is a difficult problem.



--
Bill Cole
bill@scconsult.com or billcole@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not Currently Available For Hire

Re: Routing failed deliveries through an ESP
On Mon, Apr 17, 2023 at 08:54:37AM +0100, Graeme Fowler via Exim-users wrote:

> > How might I configure my routers to ignore an initial 5xx response from the
> > first router and attempt another (and maybe future) deliveries through an
> > alternate router?
>
> If you get a 5xx error from the receiver's MX, you do the right thing and
> abide by it. They're telling you they didn't want your message.

A mail transaction (transmission of a particular message) begins at the
MAIL command and ends at DOT. Any errors outside that context are not
message specific. In particular, they might simply reflect the
unwillingness of the host in question to accept any mail, which may not
be the case with other MX hosts.

In decades past, when the now-popular MTAs (Exim and Postfix) were new and
evolving their basic SMTP protocol features, it was not uncommon for
some receiving systems (IIRC Microsoft Exchange) to intermittently
return 5XX when their load was too high.

Consequently, at least Postfix was then, and is still by default now
"tolerant" of 5XX greetings:

smtp_skip_5xx_greeting = yes

This is limited to just the initial banner, not EHLO or later, so
apparently transient misguided 5XX responses to EHLO are not a common
problem. Therefore, I'd be inclined to consider also 5XX in response
to EHLO as a reason to abandon delivery and bounce the envelope.

--
Viktor.

Re: Routing failed deliveries through an ESP
On 17/04/2023 14:08, Bill Cole via Exim-users wrote:
> There's a rational basis for an exception for 5xx before MAIL FROM, when the target only has the connection parameters and HELO name to use as a basis for rejection. Re-routing via a fallback path isn't entirely unjustifiable in that case, as it changes those elements of the transaction.

Exim treats what you're talking of as a "host error" rather than a "message error",
and goes on to try the next host in the list of possibles determined by the routing
stage. Commonly that would be a lower-priority MX for the domain.
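
The distinction drawn in this part of the thread (a 5xx before MAIL FROM says something about the host, a 5xx inside the mail transaction says something about the message) can be sketched as a small classifier. This is only an illustration of the reasoning, not Exim's or Postfix's actual logic; the Stage names and the classify_reply function are made up for the example.

```python
from enum import IntEnum


class Stage(IntEnum):
    """Points in an SMTP session at which a reply can arrive (hypothetical names)."""
    BANNER = 1   # greeting sent when the connection opens
    EHLO = 2     # reply to EHLO/HELO
    MAIL = 3     # reply to MAIL FROM: the mail transaction starts here
    RCPT = 4
    DATA = 5
    DOT = 6      # reply to the final "." that ends the message


def classify_reply(stage: Stage, code: int) -> str:
    """Label a reply as 'ok', 'temporary', 'host-error' or 'message-error'."""
    if code < 400:
        return "ok"
    if code < 500:
        return "temporary"
    # 5xx before MAIL FROM: the receiver has seen only the connection
    # parameters and the HELO name, so the rejection is not message specific.
    if stage < Stage.MAIL:
        return "host-error"
    # 5xx from MAIL FROM through DOT falls inside the mail transaction.
    return "message-error"
```

In this model classify_reply(Stage.BANNER, 554) is a host error, while classify_reply(Stage.DOT, 550) is a message error. What an MTA then does with each label (try the next host, bounce) is a separate policy decision.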
--
Cheers,
Jeremy


Re: Routing failed deliveries through an ESP
> There's a rational basis for an exception for 5xx before MAIL FROM,
> when the target only has the connection parameters and HELO
> name to use as a basis for rejection

Unfortunately, Google, in the case of an outright IP-based block, doesn't
reject the message until after DATA has been submitted.

After wrestling with this for a few days, my solution to mitigate some
fallout from host-based rejections is a router condition that allows me to
easily avoid routing to problematic domains while the issue is resolved.
(The router after this sends everything through the ESP.)

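FIRST_MX_HOST = ${extract{2}{ \n}{${lookup dnsdb{mx=$domain}{$value}}}{$value}fail}

r_direct:
  driver = dnslookup
  transport = t_smtp
  domains = ! +local_domains
  # first_delivery is an expansion condition, so it is wrapped in ${if ...};
  # a bare "first_delivery" string would always count as true
  condition = ${if first_delivery}
  # skip bounces (empty return path)
  condition = ${if !eq{$return_path}{}}
  # skip this router when the first MX host is listed in force-esp-mxhosts
  condition = ${lookup {FIRST_MX_HOST}nwildlsearch{/etc/exim/force-esp-mxhosts}{false}{true}}
  ignore_target_hosts = 0.0.0.0 : 127.0.0.0/8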
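
To make the lookup logic above concrete, here is a rough Python model of what the FIRST_MX_HOST macro and the nwildlsearch condition compute. It is an illustration only: the function names are invented here, the matching covers just literal keys and keys with a leading asterisk (regular-expression keys are ignored), the forced-failure case of the macro is left out, and the MX data and patterns are made-up examples.

```python
import re
from typing import Iterable, Optional


def first_mx_host(mx_lookup_output: str) -> Optional[str]:
    # Split on every single space or newline and take the second field,
    # i.e. the host name of the first MX record in "preference host" lines.
    fields = re.split(r"[ \n]", mx_lookup_output)
    return fields[1] if len(fields) >= 2 and fields[1] else None


def key_matches(pattern: str, host: str) -> bool:
    # Case-insensitive match; a leading "*" means "ends with".
    pattern, host = pattern.lower(), host.lower()
    if pattern.startswith("*"):
        return host.endswith(pattern[1:])
    return host == pattern


def use_direct_router(host: str, patterns: Iterable[str]) -> bool:
    # The router condition yields "false" on a match, so a listed host
    # skips r_direct and falls through to the ESP router.
    return not any(key_matches(p, host) for p in patterns)
```

For example, with the sample lookup output "10 mx1.blocked.example\n20 mx2.blocked.example" and the single pattern "*.blocked.example", first_mx_host returns mx1.blocked.example and use_direct_router returns False.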

Two questions for the experts:

- Is there a more efficient way to achieve this?
- Does Exim have a mechanism to invoke a script with rejected messages, so
I can either re-send the message or add the host to my force-esp-mxhosts?

Thanks!
Lance

Re: Routing failed deliveries through an ESP
On Thu, 20 Apr 2023, Lance Lovette via Exim-users wrote:

>> There's a rational basis for an exception for 5xx before MAIL FROM,
>> when the target only has the connection parameters and HELO
>> name to use as a basis for rejection
>
> Unfortunately, Google, in the case of an outright IP-based block, doesn't
> reject the message until after DATA has been submitted.
>
> After wrestling with this for a few days, my solution to mitigate some
> fallout from host-based rejections is a router condition that allows me to
> easily avoid routing to problematic domains while the issue is resolved.
> (The router after this sends everything through the ESP.)
>
> FIRST_MX_HOST = ${extract{2}{ \n}{${lookup
> dnsdb{mx=$domain}{$value}}}{$value}fail}
> r_direct:
> driver = dnslookup
> transport = t_smtp
> domains = ! +local_domains
> condition = first_delivery
> condition = ${if !eq{$return_path}{}}
> condition = ${lookup
> {FIRST_MX_HOST}nwildlsearch{/etc/exim/force-esp-mxhosts}{false}{true}}
> ignore_target_hosts = 0.0.0.0 : 127.0.0.0/8
>
> Two questions for the experts:
>
> - Is there a more efficient way to achieve this?
> - Does Exim have a mechanism to invoke a script with rejected messages, so

${run ...} will run the command.
I am not sure how you test for a rejected message.

--
Andrew C. Aitchison Kendal, UK
andrew@aitchison.me.uk

Re: Routing failed deliveries through an ESP
On 20/04/2023 15:47, Lance Lovette via Exim-users wrote:
> Does Exim have a mechanism to invoke a script with rejected messages

We already told you no.
--
Cheers,
Jeremy


Re: Routing failed deliveries through an ESP
On 2023-04-20 at 10:47:15 UTC-0400 (Thu, 20 Apr 2023 10:47:15 -0400)
Lance Lovette via Exim-users <lance.lovette+exim-users@gmail.com>
is rumored to have said:

>> There's a rational basis for an exception for 5xx before MAIL FROM,
>> when the target only has the connection parameters and HELO
>> name to use as a basis for rejection
>
> Unfortunately, Google, in the case of an outright IP-based block,
> doesn't
> reject the message until after DATA has been submitted.

Then you should not, under any circumstances, retry sending that message
via ANY path. Just don't do it. That message has FAILED. It has been
REJECTED.




--
Bill Cole
bill@scconsult.com or billcole@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not Currently Available For Hire

Re: Routing failed deliveries through an ESP
On 2023-04-20, Lance Lovette via Exim-users <exim-users@exim.org> wrote:
>> There's a rational basis for an exception for 5xx before MAIL FROM,
>> when the target only has the connection parameters and HELO
>> name to use as a basis for rejection
>
> Unfortunately, Google, in the case of an outright IP-based block, doesn't
> reject the message until after DATA has been submitted.
>
> After wrestling with this for a few days, my solution to mitigate some
> fallout from host-based rejections is a router condition that allows me to
> easily avoid routing to problematic domains while the issue is resolved.
> (The router after this sends everything through the ESP.)
>
> FIRST_MX_HOST = ${extract{2}{ \n}{${lookup
> dnsdb{mx=$domain}{$value}}}{$value}fail}
> r_direct:
> driver = dnslookup
> transport = t_smtp
> domains = ! +local_domains
> condition = first_delivery
> condition = ${if !eq{$return_path}{}}
> condition = ${lookup
> {FIRST_MX_HOST}nwildlsearch{/etc/exim/force-esp-mxhosts}{false}{true}}
> ignore_target_hosts = 0.0.0.0 : 127.0.0.0/8
>
> Two questions for the experts:
>
> - Is there a more efficient way to achieve this?

You could put the IP addresses in ignore_target_hosts instead.

> - Does Exim have a mechanism to invoke a script with rejected messages, so
> I can either re-send the message or add the host to my force-esp-mxhosts?

You can detect rejections using event_action, but it's not very easy to
set up.

You set up event_action as an ${acl... expansion, and then in the ACL you
branch according to which event is happening.

When you detect a fake rejection, you could then store the fact in a ratelimit.

The ratelimit can then be tested in the main delivery router (again via an
${acl... expansion), and a ratelimit failure used to skip that router.


Preventing the processing of the bounce is harder, but you can do it by
arranging for the not-smtp ACL to return "drop" when it sees a bounce
from one of these messages. You'll probably need to pass some
details to this ACL in the headers of the bounce message so that
the ACL can know which transport is producing the error.

Detecting the rejection and setting the ratelimit could also be done
here instead, I guess.


This is tying Exim up in knots; it will probably be fairly fragile.

--
Jasen.

Re: Routing failed deliveries through an ESP
On 21 April 2023 4:43:45 UTC, Jasen Betts via Exim-users <exim-users@exim.org> wrote:

>you can detect rejections using event_action
>
>When you detect a fake rejection you could then store the fact in a ratelimit.
>
>the ratelimit can then be tested in the main delivery router (again via a
>${acl... expansion) and a ratelimit failure used to skip that router.

When I recently played with ratelimit inside an event action (Exim 4.94), I
got an error, something like "ratelimit not allowed here"...

Did I do something wrong?

regards


--
Slavko
https://www.slavino.sk/

Re: Routing failed deliveries through an ESP
On 21/04/2023 06:55, Slavko via Exim-users wrote:
> Did I do something wrong?

Would need the actual error message to guess.
--
Cheers,
Jeremy


Re: Routing failed deliveries through an ESP
On 21 April 2023 8:23:50 UTC, Jeremy Harris via Exim-users <exim-users@exim.org> wrote:
>On 21/04/2023 06:55, Slavko via Exim-users wrote:
>> Did I do something wrong?
>
>Would need the actual error message to guess.

OK, I no longer have the exact message, but IIRC it may
be related to the per_addr option, as I want to count
unique failed recipients per authenticated user...

I will try again in the near future and, if needed,
return to this topic with the exact error...

regards


--
Slavko
https://www.slavino.sk/

Re: Routing failed deliveries through an ESP
On 21/04/2023 13:13, Slavko via Exim-users wrote:
> it can
> be related to per_addr option

per_addr can only be used in the rcpt acl.
You'd possibly be able to just use count=1,
if this was an event raised once per thing
you want counted.
--
Cheers,
Jeremy


Re: Routing failed deliveries through an ESP
On 21 April 2023 13:40:47 UTC, Jeremy Harris via Exim-users <exim-users@exim.org> wrote:

>per_addr can only be used in the rcpt acl.
>You'd possibly be able to just use count=1,
>if this was an event raised once per thing
>you want counted.

OK, I get the idea, thanks.

I had previously concluded, wrongly, that ratelimit cannot
be used outside the "basic" ACLs. I will play with it again.

regards


--
Slavko
https://www.slavino.sk/

Re: Routing failed deliveries through an ESP
> put the ip addresses in ignore_target_hosts instead.

Excellent suggestion! That option does exactly what my lookup is doing in a
much more reliable and efficient manner. I had glossed over that option
because its documentation describes in so much detail the handling of IP
addresses, I assumed it wasn't suited for wildcard host names. Never
underestimate Exim! To which, let me take a moment to praise the folks
behind Exim for such comprehensive documentation. Exim is a sophisticated
piece of software and the documentation is every bit as impressive.

> you can detect rejections using event_action
> arranging for the not-smtp ACL to return "drop"

You gave me good food for thought, and after some trial and error I was
able to implement exactly what I intended using a shell script and some
strategic ${run} calls. It's opened a whole world of possibilities
including something I've had an eye towards implementing - message event
webhooks.

I use a transport event_action to run a script to capture message events
and DSN messages. I can then conditionally add a header to the
bounce_message_file that acl_not_smtp can evaluate to drop the bounce,
which in turn freezes the original message. All very straightforward.
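
A hook of the kind described here might look roughly like the Python sketch below. It is not Lance's script (he used a shell script that is not shown in the thread), and everything in it is an assumption made for illustration: the function names, the keyword list used to guess at an IP-based block, and the idea that the event name, MX host, and SMTP response text are handed over as plain strings by whatever invokes it.

```python
from pathlib import Path

# Hypothetical keywords that suggest a host-reputation rejection.
BLOCK_HINTS = ("blocked", "blacklist", "reputation", "dnsbl")


def looks_like_ip_block(smtp_response: str) -> bool:
    # Only permanent (5xx) replies containing one of the hints qualify.
    text = smtp_response.strip().lower()
    return text.startswith("5") and any(hint in text for hint in BLOCK_HINTS)


def record_host(list_file: Path, host: str) -> bool:
    """Append host to the list file once; return True if it was added."""
    host = host.strip().lower()
    existing = set()
    if list_file.exists():
        existing = {line.strip().lower()
                    for line in list_file.read_text().splitlines()}
    if not host or host in existing:
        return False
    with list_file.open("a") as handle:
        handle.write(host + "\n")
    return True


def handle_event(event: str, host: str, response: str, list_file: Path) -> bool:
    # React only to permanent delivery failures that look host-based.
    if event != "msg:fail:delivery":
        return False
    if not looks_like_ip_block(response):
        return False
    return record_host(list_file, host)
```

Pointing list_file at the force-esp-mxhosts file would make later messages for that MX host skip the direct router. A keyword heuristic like this is crude, so reviewing the file by hand is probably wise.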

One side-effect I'm curious about is the error message logged when a bounce
message is dropped.

> Error while reading message with no usable sender address: rejected by non-SMTP ACL: local configuration problem

Any chance there's a "less problematic" way to drop a bounce? :) If I
implement a webhook based mechanism to report bounces, it would be nice to
not fill mainlog with this message. (I unset log_reject_target to keep the
headers out of rejectlog but mainlog still gets a report.)

Thanks!
Lance

Re: Routing failed deliveries through an ESP
Hi,

On Fri, 21 Apr 2023 14:40:47 +0100, Jeremy Harris via Exim-users
<exim-users@exim.org> wrote:

> On 21/04/2023 13:13, Slavko via Exim-users wrote:
> > it can
> > be related to per_addr option
>
> per_addr can only be used in the rcpt acl.
> You'd possibly be able to just use count=1,
> if this was an event raised once per thing
> you want counted.

Sorry for the delay, but I have now found time to play with it.

I re-enabled ratelimit in the msg:fail:delivery event. I was wrong: it was
not per_addr but per_rcpt, though the result is the same. As I call it
from a nested ACL (acl=), the log is not useful.

AFAIK, the msg:fail:delivery event is raised once per failed recipient,
and that is exactly what I want to count: the rate of failed recipients.
Previously I did it with a recipient callout, but IMO events are better,
as no separate callout is done (and I cannot use hold in the callout, as I
use BATV to modify the envelope sender).

I reread the ratelimit documentation and found that per_conn and per_cmd
are the only options for which it does not mention the ACLs where they can
be used. I guess that approach was chosen before events were introduced.
If the definition were inverted, that is, a list of ACLs where a
particular option cannot be used, then many of the options could be used
in events.

I had success with per_cmd/count=1, but I am not sure whether that is right.
I can see the right number in the ratelimit DB, but I have only done basic
testing so far. I guess that per_conn will not be useful for counting
failed recipients in this case, but I am not sure. Can you please confirm that?

I am not sure yet how to deal with the same recipient failing repeatedly.
Could unique=$local_part@$domain be used there?
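
The counting goal described above (each failed recipient counted once per authenticated user, however often it fails) can be modelled with a plain set per user. This sketch only illustrates the intended semantics; it is not how Exim's ratelimit is implemented, it keeps no time window, and the class name and the choice to lowercase addresses are assumptions of the example.

```python
from collections import defaultdict


class UniqueFailureCounter:
    """Count distinct failed recipients per authenticated user (illustrative)."""

    def __init__(self) -> None:
        self._seen = defaultdict(set)

    def record(self, user: str, recipient: str) -> int:
        # Adding to a set ignores repeats, so a recipient that fails
        # again does not raise the count for that user.
        self._seen[user].add(recipient.lower())
        return len(self._seen[user])
```

Recording the same recipient twice for one user leaves that user's count at 1; a second, different recipient raises it to 2.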

regards

--
Slavko
https://www.slavino.sk