Mailing List Archive

RBL timeouts
Hi,

I recently installed spamassassin 2.63-1 on RHEL3, using Theo's RPMs
rebuilt from the src.rpm.

I was previously using spamassassin 2.61-1, installed the same way on
another RHEL3 server, with no trouble. It seemed to be handling RBLs
normally.

For some reason, I am now getting messages in the log (w/debugging enabled)
as follows:

Jan 31 16:38:24 bwnjlin1 spamd[1333]: debug: RBL: success for 0 of 17 queries
Jan 31 16:38:24 bwnjlin1 spamd[1333]: debug: RBL: timeout for sorbs,sorbs-notfirsthop after 15 seconds
Jan 31 16:38:24 bwnjlin1 spamd[1333]: debug: RBL: timeout for njabl-notfirsthop,njabl after 15 seconds
Jan 31 16:38:24 bwnjlin1 spamd[1333]: debug: RBL: timeout for opm after 15 seconds

...etc for remaining RBL checks...

Sometimes it is 0 of 9 queries, and sometimes 0 of 17 queries.

Looking for advice on how to debug this problem.

Thanks, Bill Wraith
Re: RBL timeouts
Bill Wraith <bwraith@nji.com> writes:

> For some reason, I am now getting messages in the log (w/debugging enabled)
> as follows:
>
> Jan 31 16:38:24 bwnjlin1 spamd[1333]: debug: RBL: success for 0 of 17
> queries
> Jan 31 16:38:24 bwnjlin1 spamd[1333]: debug: RBL: timeout for
> sorbs,sorbs-notfirsthop after 15 seconds
> Jan 31 16:38:24 bwnjlin1 spamd[1333]: debug: RBL: timeout for
> njabl-notfirsthop,njabl after 15 seconds
> Jan 31 16:38:24 bwnjlin1 spamd[1333]: debug: RBL: timeout for opm after
> 15 seconds
>
> ...etc for remaining RBL checks...
>
> Sometimes it is 0 of 9 queries, and sometimes 0 of 17 queries.

Typically 1 query is for the domain and 8 are for the IP addresses in
each Received header, so 9 means one Received header and 17 means two
Received headers.
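
As a rough illustration of that arithmetic, here is a minimal Python
sketch. The function name and its default values are made up for this
example; they only restate the "1 for the domain, 8 per Received header"
rule of thumb above and are not taken from SpamAssassin itself.

```python
def expected_rbl_queries(received_headers, per_header_lookups=8, domain_lookups=1):
    """Estimate the RBL query count from the rule of thumb in this thread.

    Assumption: one lookup for the domain plus a fixed number of IP
    lookups for each Received header. The real count depends on which
    RBL rules are enabled.
    """
    if received_headers < 0:
        raise ValueError("received_headers must be non-negative")
    return domain_lookups + per_header_lookups * received_headers


if __name__ == "__main__":
    for n in (1, 2):
        print(n, "Received header(s):", expected_rbl_queries(n), "queries")
```

With these defaults, one Received header gives 9 and two give 17, which
matches the two totals seen in the debug log.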

Why 0? Probably means your network connection is not working, DNS is
not working, or something like that.

Daniel

--
Daniel Quinlan anti-spam (SpamAssassin), Linux, and
http://www.pathname.com/~quinlan/ open source consulting
Re: RBL timeouts
Daniel Quinlan wrote:

> Bill Wraith <bwraith@nji.com> writes:
>
>
>>For some reason, I am now getting messages in the log (w/debugging enabled)
>>as follows:
>>
>>Jan 31 16:38:24 bwnjlin1 spamd[1333]: debug: RBL: success for 0 of 17
>>queries
>>Jan 31 16:38:24 bwnjlin1 spamd[1333]: debug: RBL: timeout for
>>sorbs,sorbs-notfirsthop after 15 seconds
>>Jan 31 16:38:24 bwnjlin1 spamd[1333]: debug: RBL: timeout for
>>njabl-notfirsthop,njabl after 15 seconds
>>Jan 31 16:38:24 bwnjlin1 spamd[1333]: debug: RBL: timeout for opm after
>>15 seconds

> Why 0? Probably means your network connection is not working, DNS is
> not working, or something like that.

Daniel,

I do have network connectivity, including pings to the various RBL
domains, and I can use dig to manually check an RBL. I get an immediate
response to a DNS query such as "dig @opm.blitzed.org
78.44.216.opm.blitzed.org". Is there a way to discover some details about
why the RBL check is failing inside spamassassin, given that I am
successfully able to manually query several of the RBL lists w/dig?

Thanks, Bill
Re: RBL timeouts
(For some reason, your mail client sent two separate unique responses
with different Message-IDs and different recipients in the To: headers
(one to me and one to the list). Please only send one response,
preferably always including the list.)

Bill Wraith <bwraith@nji.com> writes:

> I do have network connectivity, including pings to the various RBL
> domains, and I can use dig to manually check an RBL. I get immediate
> response to a DNS query such as "dig @opm.blitzed.org
> 78.44.216.opm.blitzed.org.

SpamAssassin does not query RBL DNS servers directly. That would be bad
for DNS caching and horrible for performance. You need to do queries
through your own DNS servers:

$ host 2.0.0.127.opm.blitzed.org

(or with dig)

It's also good to test stuff out with the 127.0.0.2 test address which
should always return 127.something from a working RBL.

Also, 78.44.216.opm.blitzed.org is not a valid query, but perhaps that
was just a typo.
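
For anyone unsure how such a query name is put together, here is a small
Python sketch of the usual convention: reverse the four octets of the
IPv4 address and append the list's zone. The helper names are invented
for this example, and it assumes IPv4 only.

```python
import ipaddress


def dnsbl_query_name(ip, zone):
    """Build a DNSBL query name: reversed IPv4 octets followed by the zone.

    ipaddress.IPv4Address raises ValueError for malformed input such as
    a three-octet string.
    """
    addr = ipaddress.IPv4Address(ip)
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_well_formed_query(name, zone):
    """Return True if name is exactly four numeric labels (0-255) plus zone."""
    suffix = "." + zone
    name = name.rstrip(".")
    if not name.endswith(suffix):
        return False
    labels = name[: -len(suffix)].split(".")
    return len(labels) == 4 and all(
        label.isascii() and label.isdigit() and int(label) <= 255
        for label in labels
    )


if __name__ == "__main__":
    # The 127.0.0.2 test address mentioned above.
    print(dnsbl_query_name("127.0.0.2", "opm.blitzed.org"))
    # Only three octets in front of the zone, so not a valid query.
    print(is_well_formed_query("78.44.216.opm.blitzed.org", "opm.blitzed.org"))
```

The resulting name is what you would pass to host or dig, without an
@server argument, so that the lookup goes through your own resolver.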

It could be a problem with Net::DNS on your system or some DNS problem
that only affects Net::DNS and not other tools like dig.

> Is there a way to discover some details about why the RBL check is
> failing inside spamassassin, given that I am successfully able to
> manually query several of the RBL lists w/dig?

Other than turning on debugging (which you already have), not really.
It hasn't been necessary.

--
Daniel Quinlan anti-spam (SpamAssassin), Linux, and
http://www.pathname.com/~quinlan/ open source consulting
Re: RBL timeouts
Alex wrote on 2022-12-02 14:04:

> Any bind experts know of a way to record which nameserver is timing
> out so I can perhaps exclude them? Any idea why it wouldn't just
> rotate to the next one, or even how to confirm whether it's doing
> that?

You are:

1: using RBLs that are not default in spamassassin
2: not checking third-party sites to see whether the IPs are listed

Remove the dead RBLs from spamassassin, problem solved.

> Links:
> ------
> [1] http://168.22.111.13.bb.barracudacentral.org
> [2] http://168.22.111.13.bb.barracudacentral.org/IN/A
> [3] http://216.209.245.104.bb.barracudacentral.org/A/IN
> [4] http://17.31.10.37.cidr.bl.mcafee.com
> [5] http://17.31.10.37.cidr.bl.mcafee.com/IN/A

https://multirbl.valli.org/lookup/13.111.22.168.html
https://multirbl.valli.org/lookup/216.209.245.104.html
https://multirbl.valli.org/lookup/37.10.31.17.html

Seems OK. Remove cidr.bl.mcafee.com, or convince multirbl to add it :=)
Re: RBL timeouts
On 2022-12-02 at 08:04:40 UTC-0500 (Fri, 2 Dec 2022 08:04:40 -0500)
Alex <mysqlstudent@gmail.com>
is rumored to have said:

> Hi,
>
> Is anyone (everyone?) also experiencing DNS timeouts with barracuda?

Chronically, for years, until I gave up on them. Not worthy of production
use.

> 02-Dec-2022 07:03:02.229 query-errors: client @0x7fd19d26c968
> 127.0.0.1#37098 (168.22.111.13.bb.barracudacentral.org): query failed
> (timed out) for 168.22.111.13.bb.barracudacentral.org/IN/A at
> ../../../lib/ns/query.c:7729
> 02-Dec-2022 07:03:21.458 lame-servers: SERVFAIL unexpected RCODE
> resolving '216.209.245.104.bb.barracudacentral.org/A/IN': 3.13.7.254#53

But that is NOT a timeout. SERVFAIL is an explicit affirmative reply
that the answering server cannot give any valid answer to the query.
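
To keep the two cases apart when reading a long log, something like the
following Python sketch could tally them per DNSBL zone. The regular
expressions are written against the two sample lines quoted in this
thread only, and the function names are invented here; other BIND
versions or logging configurations may format these messages
differently. Note that, in these samples, only the SERVFAIL line names
the remote server that answered.

```python
import re
from collections import Counter

# Patterns modelled on the two sample log lines quoted in this thread.
TIMEOUT_RE = re.compile(r"query failed \(timed out\) for (?P<name>\S+?)/IN/")
SERVFAIL_RE = re.compile(
    r"SERVFAIL unexpected RCODE resolving "
    r"'(?P<name>[^'/]+)/[^']*': (?P<server>[\d.]+)#\d+"
)


def classify(line):
    """Return (kind, query_name, remote_server) or None for other lines."""
    m = SERVFAIL_RE.search(line)
    if m:
        return ("servfail", m["name"], m["server"])
    m = TIMEOUT_RE.search(line)
    if m:
        return ("timeout", m["name"], None)
    return None


def zone_of(query_name):
    """Drop the four reversed-IPv4 labels (assumes an IPv4 DNSBL query)."""
    return ".".join(query_name.split(".")[4:])


def tally(lines):
    """Count timeouts and SERVFAILs per DNSBL zone."""
    counts = Counter()
    for line in lines:
        result = classify(line)
        if result:
            kind, name, _server = result
            counts[(kind, zone_of(name))] += 1
    return counts
```

Each log record has to be passed in as a single line, so records that
were wrapped by a mail client need to be rejoined first.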

> I'm also seeing a few timeouts from mcafee:
>
> 24-Nov-2022 16:12:37.151 query-errors: client @0x7fd19f7a4f68
> 127.0.0.1#47466 (17.31.10.37.cidr.bl.mcafee.com): query failed (timed out)
> for 17.31.10.37.cidr.bl.mcafee.com/IN/A at ../../../lib/ns/query.c:7729
>
> I don't necessarily think there's something wrong with my nameservers -
> I'm more just surprised that such high-profile companies are having
> problems and wanted to confirm.

Big companies have big problems. High-profile companies have
high-profile problems.

> Any bind experts know of a way to record which nameserver is timing out
> so I can perhaps exclude them? Any idea why it wouldn't just rotate to
> the next one, or even how to confirm whether it's doing that?

The SERVFAIL errors are very likely immune to any workaround attempt.
The timeouts should already be handled as best they can be by BIND & the
system resolver, given reasonable query timeout and retry values, such
as OS defaults. Note that it may not make sense for a resolver to allow
slow DNSBL lookups to block a message transaction from proceeding.
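
The idea of not letting a slow lookup hold up a message can be sketched
in Python as below. This is only an illustration under stated
assumptions: the function name, the 5 second default, and the use of
socket.gethostbyname as the resolver are choices made for this example,
not how BIND, the system resolver, or SpamAssassin implement it.

```python
import concurrent.futures
import socket


def lookup_with_deadline(name, timeout_s=5.0, resolve=socket.gethostbyname):
    """Run one lookup, but give up after timeout_s seconds.

    Returns ("ok", answer), ("timeout", None) or ("error", None).
    A timed-out lookup keeps running in its worker thread; the caller
    simply stops waiting for it.
    """
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = executor.submit(resolve, name)
    try:
        return ("ok", future.result(timeout=timeout_s))
    except concurrent.futures.TimeoutError:
        # No answer in time: treat the list as "no result", do not block.
        return ("timeout", None)
    except OSError:
        # Resolver-level failure, e.g. socket.gaierror for a missing name.
        return ("error", None)
    finally:
        # Do not wait for a still-running lookup to finish.
        executor.shutdown(wait=False)
```

The resolve parameter is there so that a different lookup function (or a
fake one, for testing) can be swapped in without touching the deadline
logic.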

It is unlikely that you can tune BIND and/or your system resolver to
reduce timeouts in any meaningful way. The exception to that would be
if your system is generally overloaded and BIND is just not getting the
resources (CPU and memory) it needs to operate quickly. You would likely
notice that sort of overload.



--
Bill Cole
bill@scconsult.com or billcole@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not Currently Available For Hire