Parsing bounce reports accurately
Hello,

Excuse me for being somewhat OT, but this is most likely
the group of people best equipped to give me an answer
outside of the comp.mail.misc newsgroup:

I have a database of people that I send emails customized
to the information I have about them, specifically their
occupation and where they are looking for work.

I need to handle bounces cleanly, so what I did was to:

1) Generate a unique non-guessable ID for each person
and store it in their database record. This is so
that I have something to authenticate that this is
indeed a real bounce from a real email, and not a
fake bounce from someone wanting to mess with me.

2) Send the emails from someuser@www.domain.com with a
custom header:

X-Bounce-Disposition: <unique-non-guessable-id>

3) Set up exim to relay for 'www.domain.com' and set
message_body_visible = 8192 (probably 2X as big as
needed)

4) Put this router into exim.conf:

bounce_router:
driver = domainlist
transport = bounce_transport
route_list = "www.domain.com"

5) Put this transport in:

bounce_transport:
driver = pipe
command = "/usr/bin/handle_bounces ${message_body}"
user = mail
group = mail
return_output
log_output
prefix =
suffix =

6) handle_bounces greps the X-Bounce-Disposition header
out of the bounce report message body and uses it to
increment the bounce count for that name. When the
threshold bounce count is reached we declare the email
address bad and stop mailing to it.
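To make steps 1 and 6 concrete, here is a rough sketch in Python of
what the ID generation and the handle_bounces logic amount to. It is
not the actual script: the function names, the dict standing in for
the database, and the threshold of 3 are all made up for illustration.

```python
import re
import secrets

BOUNCE_THRESHOLD = 3  # assumed value; pick whatever threshold suits the list

# Finds the custom header wherever the remote MTA quoted it in the bounce body.
HEADER_RE = re.compile(
    r"^[ \t]*X-Bounce-Disposition:[ \t]*(\S+)",
    re.IGNORECASE | re.MULTILINE,
)


def new_bounce_id():
    """Step 1: a random, non-guessable ID (128 bits of randomness, URL-safe text)."""
    return secrets.token_urlsafe(16)


def extract_bounce_id(body):
    """Step 6: pull the ID out of the bounce report body, or None if it is absent."""
    match = HEADER_RE.search(body)
    return match.group(1) if match else None


def record_bounce(db, body):
    """Count one bounce against a known ID; return True once the address is bad.

    db is a stand-in dict mapping ID -> {"bounces": int, "bad": bool}.
    Missing or unknown IDs are ignored, so a forged bounce changes nothing.
    """
    bounce_id = extract_bounce_id(body)
    record = db.get(bounce_id) if bounce_id else None
    if record is None:
        return False
    record["bounces"] += 1
    if record["bounces"] >= BOUNCE_THRESHOLD:
        record["bad"] = True
    return record["bad"]
```

The point of looking the ID up before counting anything is that a
bounce without a valid ID is simply dropped, which is also exactly
why MTAs that strip the header are a problem.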

OK, so my problem is that not every MTA sends back my
X-Bounce-Disposition header in the error report; one such
MTA is MS Exchange 5.5. Nor does it return the Message-ID,
which I could otherwise use instead. If Exchange 5.5 is the
only MTA that does this, I have only a small problem, since
the people I mail to tend not to have corporate email
accounts, AFAICS.

The good part is that so far that's the only MTA that I've
found to be a problem. AOL, CompuServe, Earthlink, rr.net,
Exim, Sendmail, qmail, etc. all return the header. For the
market I deal with, that's > 90% coverage, but getting that
last 10% would be valuable.

So my questions are:

1) What other MTAs are you aware of that don't return the
full headers from the message that caused the problem?
2) What headers can I count on being returned? (I need
something solid I can authenticate against so some guy
doesn't send me a bunch of fake bounces and invalidate
my entire DB)
3) Is there a survey of MTA standards compliance somewhere
(especially WRT RFC 1894) that I can use to fine-tune
the handle_bounces script?
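
To show what I mean in question 3: for MTAs that do send an RFC 1894
report (a multipart/report message with a message/delivery-status
part), handle_bounces could read the per-recipient fields roughly as
below. This is only a sketch using Python's standard email package;
parse_dsn is a made-up name, and it only tells me which address
failed, not whether the bounce is genuine.

```python
import email
from email import policy


def parse_dsn(raw_message):
    """Return per-recipient fields from an RFC 1894 report, or [] if it is not one."""
    msg = email.message_from_string(raw_message, policy=policy.default)
    if msg.get_content_type() != "multipart/report":
        return []
    results = []
    for part in msg.walk():
        if part.get_content_type() != "message/delivery-status":
            continue
        # The parser exposes the status part as a list of header blocks:
        # the first is per-message, the following ones are per-recipient.
        for block in part.get_payload():
            if "Final-Recipient" in block:
                results.append({
                    "recipient": str(block["Final-Recipient"]),
                    "action": str(block.get("Action", "")),
                    "status": str(block.get("Status", "")),
                })
    return results
```

Anything that comes back as an empty list would have to fall through
to the plain header grep, or be set aside for a human to look at.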

TIA for any assistance you can render.

--
Christopher L. Everett
Chief Technology Officer
The Medical Banner Exchange
Physicians Employment on the Internet