[Bug 3224] New: sa-learn failure causes locking problem, spin, and db corruption
http://bugzilla.spamassassin.org/show_bug.cgi?id=3224

Summary: sa-learn failure causes locking problem, spin, and db
corruption
Product: Spamassassin
Version: 2.63
Platform: Other
OS/Version: FreeBSD
Status: NEW
Severity: major
Priority: P5
Component: Tools
AssignedTo: spamassassin-dev@incubator.apache.org
ReportedBy: ed.randall@ingenotech.com


It's possible for sa-learn to exit without releasing the bayes.lock on
the database. I don't know how it got into this state, but it happened
to me today after an sa-learn --spam --mbox --no-rebuild on a large (44 MB)
spam file.


When I later returned to the machine to check on my sa-learn process,
it appeared to have finished successfully; there were no error messages
on the screen. But CPU was at 100%, with 3 spamd processes chewing up the
CPU.

Every subsequent incoming mail seemed to cause a new spamd to
form, which spun whilst waiting for this lock. Why does it not
just give up after a certain number of retries? Instead of giving
up, it keeps retrying, and the lock file itself just keeps growing...
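
For illustration, here is a minimal sketch of the bounded-retry behaviour asked about above. It is not SpamAssassin's code. It assumes a hard-link style lock like the one the debug output below reports ("link to /var/spamassassin/bayes.lock: link ok"), and the names acquire_lock, release_lock, max_retries and delay are hypothetical.

```python
import os
import socket
import time


def acquire_lock(lock_path, max_retries=30, delay=1.0):
    """Try to take lock_path via a hard link. Return True on success, False after giving up."""
    # Per-process temp file, named host + pid like the one shown in the debug log.
    tmp_path = "{}.{}.{}".format(lock_path, socket.gethostname(), os.getpid())
    with open(tmp_path, "w") as fh:
        fh.write("{}\n".format(os.getpid()))
    try:
        for attempt in range(max_retries + 1):
            try:
                # link() is atomic and fails if lock_path already exists.
                os.link(tmp_path, lock_path)
                return True
            except FileExistsError:
                if attempt < max_retries:
                    # Sleep between attempts instead of spinning at 100% CPU.
                    time.sleep(delay)
        # Bounded: report failure instead of looping forever.
        return False
    finally:
        # The temp name is no longer needed whether or not the link succeeded.
        os.unlink(tmp_path)


def release_lock(lock_path):
    """Drop the lock so other processes can take it."""
    os.unlink(lock_path)
```

A caller that gets False back could skip the Bayes step for that message rather than wait indefinitely. Deciding when a leftover lock is stale is a separate question that this sketch does not attempt.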

Having stopped all the spamd processes, I ran sa-learn --rebuild --debug-level
99 and got the following output. It came out of the lock loop when I
removed bayes.lock, but it is still hanging on the "something fishy" line
after 10 minutes, again at 100% CPU. I guess my database is now screwed.
sa-learn should at least have the decency to make a backup copy of the DB
if it's capable of screwing up all that valuable training.
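
As a stopgap, the backup suggested above can be taken by hand before running a rebuild. The following is only a sketch: the function name backup_bayes_files and the ".bak-" suffix are made up for illustration, and the file names bayes_toks and bayes_seen are taken from the debug output below. It assumes spamd and sa-learn are stopped while the copy is made, so the files are not changing underneath it.

```python
import os
import shutil
import time


def backup_bayes_files(db_dir, names=("bayes_toks", "bayes_seen"), stamp=None):
    """Copy each existing Bayes DB file to <name>.bak-<stamp>. Return the copies made."""
    stamp = stamp or time.strftime("%Y%m%d-%H%M%S")
    copies = []
    for name in names:
        src = os.path.join(db_dir, name)
        if os.path.exists(src):
            dst = "{}.bak-{}".format(src, stamp)
            # copy2 preserves timestamps and permissions along with the data.
            shutil.copy2(src, dst)
            copies.append(dst)
    return copies
```

For the setup in this report the call would be backup_bayes_files("/var/spamassassin"), run before sa-learn --rebuild.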

# sa-learn --rebuild --debug-level 99
debug: Score set 0 chosen.
debug: running in taint mode? yes
debug: Running in taint mode, removing unsafe env vars, and resetting PATH
debug: PATH included '/usr/local/jdk1.2.2/bin', keeping.
debug: PATH included '/home/ed/projects/env/bin', keeping.
debug: PATH included '/sbin', keeping.
debug: PATH included '/bin', keeping.
debug: PATH included '/usr/sbin', keeping.
debug: PATH included '/usr/bin', keeping.
debug: PATH included '/usr/games', keeping.
debug: PATH included '/usr/local/bin', keeping.
debug: PATH included '/usr/X11R6/bin', keeping.
debug: PATH included '/home/ed/bin', keeping.
debug: Final PATH set to: /usr/local/jdk1.2.2/bin:/home/ed/projects/env/bin:/sbin:/bin:/usr/sbin:/usr/bin:/usr/games:/usr/local/bin:/usr/X11R6/bin:/home/ed/bin
debug: using "/usr/local/share/spamassassin" for default rules dir
debug: using "/etc/mail/spamassassin" for site rules dir
debug: using "/root/.spamassassin/user_prefs" for user prefs file
debug: bayes: 79236 tie-ing to DB file R/O /var/spamassassin/bayes_toks
debug: bayes: 79236 tie-ing to DB file R/O /var/spamassassin/bayes_seen
debug: bayes: found bayes db version 2
debug: Score set 2 chosen.
debug: Initialising learner
debug: Initialising learner
debug: Syncing Bayes journal and expiring old tokens...
debug: lock: 79236 created /var/spamassassin/bayes.lock.srv79.server4me.com.79236
debug: lock: 79236 trying to get lock on /var/spamassassin/bayes with 0 retries
debug: lock: 79236 trying to get lock on /var/spamassassin/bayes with 1 retries
debug: lock: 79236 trying to get lock on /var/spamassassin/bayes with 2 retries
debug: lock: 79236 trying to get lock on /var/spamassassin/bayes with 3 retries
debug: lock: 79236 trying to get lock on /var/spamassassin/bayes with 4 retries
debug: lock: 79236 trying to get lock on /var/spamassassin/bayes with 5 retries
debug: lock: 79236 trying to get lock on /var/spamassassin/bayes with 6 retries
debug: lock: 79236 trying to get lock on /var/spamassassin/bayes with 7 retries
debug: lock: 79236 trying to get lock on /var/spamassassin/bayes with 8 retries
debug: lock: 79236 trying to get lock on /var/spamassassin/bayes with 9 retries
debug: lock: 79236 trying to get lock on /var/spamassassin/bayes with 10 retries
debug: lock: 79236 trying to get lock on /var/spamassassin/bayes with 11 retries
debug: lock: 79236 trying to get lock on /var/spamassassin/bayes with 12 retries
debug: lock: 79236 trying to get lock on /var/spamassassin/bayes with 13 retries
debug: lock: 79236 trying to get lock on /var/spamassassin/bayes with 14 retries
debug: lock: 79236 trying to get lock on /var/spamassassin/bayes with 15 retries
debug: lock: 79236 trying to get lock on /var/spamassassin/bayes with 16 retries
debug: lock: 79236 trying to get lock on /var/spamassassin/bayes with 17 retries
debug: lock: 79236 trying to get lock on /var/spamassassin/bayes with 18 retries
debug: lock: 79236 trying to get lock on /var/spamassassin/bayes with 19 retries
debug: lock: 79236 trying to get lock on /var/spamassassin/bayes with 20 retries
debug: lock: 79236 trying to get lock on /var/spamassassin/bayes with 21 retries
debug: lock: 79236 trying to get lock on /var/spamassassin/bayes with 22 retries
debug: lock: 79236 trying to get lock on /var/spamassassin/bayes with 23 retries
debug: lock: 79236 trying to get lock on /var/spamassassin/bayes with 24 retries
debug: lock: 79236 trying to get lock on /var/spamassassin/bayes with 25 retries
debug: lock: 79236 trying to get lock on /var/spamassassin/bayes with 26 retries
debug: lock: 79236 trying to get lock on /var/spamassassin/bayes with 27 retries
debug: lock: 79236 trying to get lock on /var/spamassassin/bayes with 28 retries
debug: lock: 79236 trying to get lock on /var/spamassassin/bayes with 29 retries
debug: lock: 79236 trying to get lock on /var/spamassassin/bayes with 30 retries
debug: lock: 79236 trying to get lock on /var/spamassassin/bayes with 31 retries
debug: lock: 79236 trying to get lock on /var/spamassassin/bayes with 32 retries
debug: lock: 79236 trying to get lock on /var/spamassassin/bayes with 33 retries
debug: lock: 79236 trying to get lock on /var/spamassassin/bayes with 34 retries
debug: lock: 79236 trying to get lock on /var/spamassassin/bayes with 35 retries
debug: lock: 79236 trying to get lock on /var/spamassassin/bayes with 36 retries
debug: lock: 79236 trying to get lock on /var/spamassassin/bayes with 37 retries
debug: lock: 79236 trying to get lock on /var/spamassassin/bayes with 38 retries
debug: lock: 79236 trying to get lock on /var/spamassassin/bayes with 39 retries
debug: lock: 79236 trying to get lock on /var/spamassassin/bayes with 40 retries
debug: lock: 79236 trying to get lock on /var/spamassassin/bayes with 41 retries
debug: lock: 79236 trying to get lock on /var/spamassassin/bayes with 42 retries
debug: lock: 79236 link to /var/spamassassin/bayes.lock: link ok
debug: bayes: 79236 tie-ing to DB file R/W /var/spamassassin/bayes_toks
debug: bayes: 79236 tie-ing to DB file R/W /var/spamassassin/bayes_seen
debug: bayes: found bayes db version 2
debug: bayes: expiry check keep size, 75% of max: 112500
debug: bayes: token count: 179347, final goal reduction size: 66847
debug: bayes: First pass? Current: 1080467989, Last: 1080024097, atime: 2764800, count: 18702, newdelta: 773516, ratio: 3.57432360175382
debug: bayes: Can't use estimation method for expiry, something fishy, calculating optimal atime delta (first pass)

--- hangs forever ---
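
For anyone reading the expiry lines in the debug output above, the logged numbers are consistent with the arithmetic below. This is inferred from the logged values only, not taken from the SpamAssassin source. The variable names are made up for illustration, and the maximum of 150000 is an assumption derived from "75% of max: 112500".

```python
# Values copied from the debug log.
token_count = 179347
last_atime_delta = 2764800    # "atime" in the log, in seconds (32 days)
last_reduction_count = 18702  # "count" in the log

# Assumption: 112500 is logged as 75% of max, so max would be 150000.
max_db_size = 150000
keep_size = int(max_db_size * 0.75)

# How many tokens would need to go to get back down to keep_size.
goal_reduction = token_count - keep_size

# Ratio of the current goal to the logged count, and the delta scaled by it.
ratio = goal_reduction / last_reduction_count
new_delta = int(last_atime_delta / ratio)

print(keep_size, goal_reduction, ratio, new_delta)
```

This gives 112500, 66847, a ratio of about 3.5743 and a delta of 773516, matching the log. It does not explain why the following "first pass" calculation never finishes.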

Regards

Ed



------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.