Mailing List Archive

[Spamassassin Wiki] Update of "OutOfMemoryProblems" by JustinMason
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Spamassassin Wiki" for change notification.

The following page has been changed by JustinMason:
http://wiki.apache.org/spamassassin/OutOfMemoryProblems

The comment on the change is:
expanded, incorporated stuff from SpamdKillingServer

------------------------------------------------------------------------------
- = I keep running out of memory every now and again. What's happening? =
+ = Memory problems with SpamAssassin =

- == Simultaneous Scans ==
+ If you are seeing mailserver meltdowns due to load imposed by SpamAssassin, here's a checklist of items you should run through.

- If you're using spamassassin to filter all mail on your domain, you will run out of memory, unless you impose a limit on simultaneous scans.
+ == Spamd ==

+ If you have enough load to run into memory problems, and you're not using ''spamd'', MailScanner, Amavisd or another system that keeps the {{{Mail::SpamAssassin}}} modules loaded persistently, then that should be your first priority.
+
+ The overhead of starting the Perl interpreter, compiling the code, compiling the SpamAssassin ruleset, and so on is quite high; ''spamd'' was designed to get around that problem. It greatly reduces the per-message scan time.
+
+ == Simultaneous scans ==
+
+ If you're using spamassassin to filter all mail on your domain, you ''will'' run out of memory, unless you impose a limit on simultaneous scans.
+
- Basically, many MTAs (mail transfer agents) will allow unlimited simultaneous deliveries to local mail accounts. If you then start a process to handle each of those deliveries -- as you will if you insert SpamAssassin into the delivery process -- those processes will chew up RAM and bog down the system.
+ Many MTAs (mail transfer agents) will allow unlimited simultaneous deliveries to local mail accounts. If you then start a process to handle each of those deliveries -- as you will if you insert SpamAssassin into the delivery process -- those processes will chew up RAM and bog down the system.

Many spammers do not impose any kind of rate-throttling on their sending side; they just want to send as much spam as quickly as possible. If the receiving system breaks, that's not their problem.

@@ -22, +30 @@


This helps throttle down the load further upstream, and is very beneficial.

- == Very large mails ==
+ In extremis, if you're using procmail, you can set it to scan just one mail at a time using this stanza in the procmailrc:

- You should also ensure you do not scan very large mails, as described in SpamdKillingServer.
+ {{{
+ :0fw: spamassassin.lock
+ * < 300000
+ | spamc
+ }}}
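
For setups that do not go through procmail, the same "only N scans at a time" idea can be sketched in Python. This is a minimal illustration, not something from the wiki page: the `ScanLimiter` class and the `scan_with_spamc` helper are hypothetical names, and the helper assumes `spamc` is on the PATH and reads the message on stdin, as it does in the procmail recipe.

```python
import subprocess
import threading


class ScanLimiter:
    """Allow at most max_concurrent scans to run at the same time."""

    def __init__(self, scan, max_concurrent=1):
        self._scan = scan  # hypothetical callable: message bytes -> result
        self._slots = threading.BoundedSemaphore(max_concurrent)

    def __call__(self, message):
        # Block until a slot is free, much as procmail waits on its lockfile.
        with self._slots:
            return self._scan(message)


def scan_with_spamc(message):
    """Pipe one message through spamc and return the processed message.

    Assumption: spamc is installed and a spamd instance is reachable.
    """
    done = subprocess.run(["spamc"], input=message, capture_output=True)
    return done.stdout
```

A delivery agent would create one shared `ScanLimiter(scan_with_spamc, max_concurrent=1)` and call it for every message; raising `max_concurrent` trades memory for throughput.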

== Very large mails ==

- You should also avoid using very large custom rules files. The larger custom rules files available from the SA community will exacerbate this problem. The following CustomRulesets ''will'' kill your server:
+ SpamAssassin will attempt to scan anything you throw at it. However, the time taken to scan a message rises sharply with its size. The best way to deal with this problem is to limit the size of messages that get scanned by SpamAssassin. Tests show that larger messages are overwhelmingly likely to be non-spam, given the economics of spamming.

- * sa-blacklist.cf or sa-blacklist-uri.cf
+ For procmail, the following recipe works:
+
+ {{{
+ :0fw
+ * < 300000
+ | spamc
+ }}}
+
+ (The "* < 300000" line prevents messages of 300,000 bytes or more, roughly 293 KiB, from being passed to SpamAssassin.)
+
+ If you're using spamd, you should be aware that spamc will only pass messages under 250k in size to spamd by default. This is a built-in limit that you can override using the "-s" option on the command-line. Refer to the spamc man page for more details.
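
If the scanner is called from your own glue code rather than from procmail, the same size gate is a one-line check. The sketch below is an illustration only; `should_scan` and `SCAN_LIMIT_BYTES` are hypothetical names, and the 300,000-byte threshold is simply the value used in the procmail recipe.

```python
SCAN_LIMIT_BYTES = 300_000  # same threshold as the procmail recipe


def should_scan(message: bytes, limit: int = SCAN_LIMIT_BYTES) -> bool:
    """Return True when the message is small enough to hand to the scanner."""
    # Mirrors procmail's "* < 300000": strictly smaller than the limit.
    return len(message) < limit
```

Messages that fail the check are delivered unscanned, which matches what the procmail recipe does when its size condition does not match.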
+
+ == Heavyweight custom rules ==
+
+ You should also avoid using very large custom rules files. The larger custom rules files available from the SA community can double or triple your memory usage.
+
+ In particular, the following CustomRulesets ''will'' kill your server:
+
+ * sa-blacklist.cf
+ * sa-blacklist-uri.cf
* bigevil.cf

- If you are using these rulesets, consider using [http://wiki.apache.org/spamassassin/SURBL] instead.
+ Avoid these at all costs on a busy server. If you are using these rulesets, consider using [http://wiki.apache.org/spamassassin/SURBL] instead.
+
+ Try running without custom rulesets, measure memory usage, and re-add them gradually. By doing this you can determine which rulesets are worth the memory/accuracy trade-off.
+
+ == Network tests ==
+
+ If your scan times are very high (greater than 10 seconds per message) and you have network tests enabled, there's a good chance you're running into latency issues caused by network slowness.
+
+ Ensure you are running a local caching DNS server on the mailserver itself. This greatly reduces the latency of DNS requests, which SpamAssassin uses heavily.
+
+ Consider turning off Razor, Pyzor, or DCC, if you have those enabled. They can be very slow at times, and will typically increase latency by a second or two in the best case.
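
One rough way to check whether DNS latency is part of the problem is to time a few lookups from the mailserver. The sketch below is an assumption-laden illustration rather than a SpamAssassin tool: `time_lookups` is a hypothetical helper, and it uses the system resolver through `socket.gethostbyname`, which is not necessarily the path SpamAssassin's own DNS queries take.

```python
import socket
import time


def time_lookups(names, resolve=socket.gethostbyname):
    """Return a dict mapping each hostname to its lookup time in seconds."""
    timings = {}
    for name in names:
        start = time.perf_counter()
        try:
            resolve(name)
        except OSError:
            pass  # a failed lookup still costs time, so record it anyway
        timings[name] = time.perf_counter() - start
    return timings
```

Running `time_lookups(["example.org"])` twice in a row gives a feel for the effect of caching: with a local caching DNS server the second call should usually be much faster than the first.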

== AWL ==

- If you're doing the above and still running out of memory, check the size of your AWL databases. Version 3.0.3 fixes an AWL specific bug that can cause memory bloat from large AWL db files.
+ If you're doing the above and still running out of memory intermittently, check the size of your AWL databases. Version 3.0.3 fixes an AWL specific bug that can cause memory bloat from large AWL db files.

You can also use the --clean switch of the tools/check_whitelist utility to remove entries with < n hits, providing a way to clean out the db. This should resolve this specific problem.
[Spamassassin Wiki] Update of "OutOfMemoryProblems" by JustinMason

The following page has been changed by JustinMason:
http://wiki.apache.org/spamassassin/OutOfMemoryProblems

------------------------------------------------------------------------------
= I keep running out of memory every now and again. What's happening? =
+
+ == Simultaneous Scans ==

If you're using spamassassin to filter all mail on your domain, you will run out of memory, unless you impose a limit on simultaneous scans.

@@ -20, +22 @@


This helps throttle down the load further upstream, and is very beneficial.

+ == Very large mails ==
+
You should also ensure you do not scan very large mails, as described in SpamdKillingServer.

- You should also avoid using very large custom rules files. The larger custom rules files available from the SA community will exacerbate this problem. A prime example is sa-blacklist.cf or sa-blacklist-uri.cf by Bill Stearns (See CustomRulesets). If you have memory problems using these rulesets, consider using [http://wiki.apache.org/spamassassin/SURBL] instead.
+ == Very large mails ==

- If you're doing the above and still running out of memory, check the size of your AWL databases. Version 3.0.3 fixes an AWL specific bug that can cause memory bloat from large AWL db files. You can also use the --clean switch of the tools/check_whitelist utility to remove entries with < n hits, providing a way to clean out the db. This should resolve this specific problem.
+ You should also avoid using very large custom rules files. The larger custom rules files available from the SA community will exacerbate this problem. The following CustomRulesets ''will'' kill your server:

+ * sa-blacklist.cf or sa-blacklist-uri.cf
+ * bigevil.cf
+
+ If you are using these rulesets, consider using [http://wiki.apache.org/spamassassin/SURBL] instead.
+
+ == AWL ==
+
+ If you're doing the above and still running out of memory, check the size of your AWL databases. Version 3.0.3 fixes an AWL specific bug that can cause memory bloat from large AWL db files.
+
+ You can also use the --clean switch of the tools/check_whitelist utility to remove entries with < n hits, providing a way to clean out the db. This should resolve this specific problem.
+