Mailing List Archive: [Bug 7881] New: sa-learn queries uribl for already known messages

https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7881

Bug ID: 7881
Summary: sa-learn queries uribl for already known messages
Product: Spamassassin
Version: 3.4.4
Hardware: PC
OS: Linux
Status: NEW
Severity: normal
Priority: P2
Component: Learner
Assignee: dev@spamassassin.apache.org
Reporter: alexander@affine.space
Target Milestone: Undefined

When sa-learn is run on already known messages, it still makes uribl queries
for the URLs they contain. If one has a lot of messages, this can result in
getting blocked by uribl (URIBL_BLOCKED) due to too many requests in a short
time. A caching name server doesn't help, because the messages can contain a
lot of different URLs.

How to reproduce:

1. Start tcpdump to obtain all queries made:

tcpdump -i lo udp and port 53 | grep uribl

If you don't use a local DNS server, you might need to change "lo" to your
network device.

2. Run sa-learn, e.g.

/usr/bin/sa-learn --ham /home/USER/Maildir/cur/

3. The output of tcpdump shows a lot of domains like
example.org.multi.uribl.com.

4. Run sa-learn again on the same directory.

5. The same DNS queries show up again.
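
To compare the two runs more precisely than by eye, the tcpdump output of each run can be saved and reduced to the set of queried names. The following is only a sketch: the function name uribl_names and the file names run1.log and run2.log are made up for illustration, and it assumes the queries go to the multi.uribl.com zone as in step 3.

```shell
#!/bin/sh
# Hypothetical helper (not part of SpamAssassin): read tcpdump text output
# on stdin and print each distinct *.multi.uribl.com name once, sorted.
uribl_names() {
    grep -oE '[A-Za-z0-9._-]+\.multi\.uribl\.com' | sort -u
}

# Assumed usage; run1.log and run2.log are illustrative file names,
# captured while the first and the second sa-learn run were in progress:
#   tcpdump -l -i lo udp and port 53 > run1.log
#   tcpdump -l -i lo udp and port 53 > run2.log
#
#   uribl_names < run1.log > run1.names
#   uribl_names < run2.log > run2.names
#   comm -12 run1.names run2.names    # names queried in both runs
```

If the last command prints many names, the second run repeated lookups for messages that had already been learned in the first run.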

My bayes_seen file seems to get updated; at least the file's timestamp changes.
Running sa-learn with -L gets rid of the queries and of tcpdump's output (also
related: bug 5837). The issue isn't that sa-learn makes queries in general,
though, but that it makes them for already known messages.
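
As a stopgap, the -L workaround mentioned above can be wrapped so that it is not forgotten in cron jobs or scripts. This is only a sketch: learn_local and the SA_LEARN variable are hypothetical names, and -L turns off all network lookups during learning, not just the ones for already known messages.

```shell
#!/bin/sh
# SA_LEARN is a hypothetical override for the sa-learn binary to run;
# it defaults to whatever "sa-learn" is found in PATH.
SA_LEARN=${SA_LEARN:-sa-learn}

# Hypothetical wrapper: always pass -L (local tests only, no network
# access) and forward all other arguments unchanged.
learn_local() {
    "$SA_LEARN" -L "$@"
}

# Assumed usage, matching the path from step 2:
#   learn_local --ham /home/USER/Maildir/cur/
```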
