Mailing List Archive

Automatically removing old mail
Hello,

I brought up this topic two times now and by now accept the fact that
nobody likes it being integrated in Sieve. ;) So here is a new approach:

Some mail systems allow the concept of "rolling/automatically expiring"
mailboxes/folders which automatically remove old mail, which is useful
for newsletter lists and junk mail folders. It is important that this
function is not performed by the client, because clients usually don't
clean up during vacation.

For that reason, I suggest a Maildir extension, which allows the mail
system to remove mail. Perhaps there is a better name than expire,
because people might expect a different behaviour by that name. It is
controlled by creating a file called "expire" in maildir base directories
with a similar structure to the first line of "maildirsize": A list of
numbers with a suffix that specifies their meaning:

1000000s 86400t 100c

That would limit messages to a total of 1MByte, at most one day old and
at most 100 messages. When delivering mail, old mail will be removed
until these constraints are met.
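
To make the format concrete, here is a minimal sketch of how such a line
could be split into its three limits in plain sh. The function and variable
names are made up for illustration, and silently ignoring unknown suffixes
is an assumption, not part of the proposal:

```shell
# Sketch: parse one line of an "expire" file such as
# "1000000s 86400t 100c" into three shell variables.
# Suffixes as in the proposal: s = total bytes, t = maximum age in
# seconds, c = maximum message count.
# parse_expire and the max_* names are hypothetical.
parse_expire() {
    max_size= max_age= max_count=
    # Unquoted on purpose: split the line on whitespace.
    for field in $1; do
        value=${field%?}            # everything but the last character
        suffix=${field#"$value"}    # the last character
        case $value in
            ''|*[!0-9]*) continue ;;   # skip fields without a plain number
        esac
        case $suffix in
            s) max_size=$value ;;
            t) max_age=$value ;;
            c) max_count=$value ;;
        esac
    done
}

parse_expire "1000000s 86400t 100c"
echo "size=$max_size age=$max_age count=$max_count"
```

A limit that is missing from the line simply stays empty, so a file
containing only "100c" would restrict the message count and nothing else.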

It is still open how it should behave when delivering a 2M message.
The choice is between delivering, rejecting or discarding it.
Right now, the mail is delivered.

I have no idea if limiting the message age is really useful to anybody.
I don't care how old the mails in my junk folder are, as long as they
don't take up too much storage.

Any opinions are appreciated.

Michael

--
## List details at http://lists.exim.org/mailman/listinfo/exim-dev Exim details at http://www.exim.org/ ##
Re: Automatically removing old mail
Michael Haardt wrote:
> Hello,
>
> I brought up this topic two times now and by now accept the fact that
> nobody likes it being integrated in Sieve. ;) So here is a new approach:
>
> Some mail systems allow the concept of "rolling/automatically expiring"
> mailboxes/folders which automatically remove old mail, which is useful
> for newsletter lists and junk mail folders. It is important that this
> function is not performed by the client, because clients usually don't
> clean up during vacation.
>
> For that reason, I suggest a Maildir extension, which allows the mail
> system to remove mail. Perhaps there is a better name than expire,
> because people might expect a different behaviour by that name. It is
> controlled by creating a file called "expire" in maildir base directories
> with a similar structure to the first line of "maildirsize": A list of
> numbers with a suffix that specifies their meaning:
>
> 1000000s 86400t 100c
>
> That would limit messages to a total of 1MByte, at most one day old and
> at most 100 messages. When delivering mail, old mail will be removed
> until these constraints are met.
>
> It is still open how it should behave when delivering a 2M message.
> The choice is between delivering, rejecting or discarding it.
> Right now, the mail is delivered.
>
> I have no idea if limiting the message age is really useful to anybody.
> I don't care how old the mails in my junk folder are, as long as they
> don't take up too much storage.
>
> Any opinions are appreciated.

That seems a lot like overkill when a simple one line command stuck in
cron does it just fine.

# Anything over 14 days old
find . -type f -mtime +14 -exec rm {} \;
# Anything over 1 MB and more than 2 days old
find . -type f -mtime +2 -size +1000k -exec rm {} \;

Why re-invent the wheel? You're always going to deliver the message so
it's not really in Exim's domain.

--
The Exim Manual
http://www.exim.org/docs.html
http://www.exim.org/exim-html-current/doc/html/spec_html/index.html

Re: Automatically removing old mail
> That seems a lot like overkill when a simple one line command stuck in
> cron does it just fine.
>
> # Anything over 14 days old
> find . -type f -mtime +14 -exec rm {} \;
> # Anything over 1 MB and more than 2 days old
> find . -type f -mtime +2 -size +1000k -exec rm {} \;
>
> Why re-invent the wheel? You're always going to deliver the message so
> it's not really in Exim's domain.

You could at least have used -o to merge both runs and -print0 of GNU
find together with GNU xargs.
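
A merged version might look roughly like this. It is only a sketch: it
assumes GNU find and GNU xargs, takes the thresholds from the quoted
commands, and the function name is made up:

```shell
# Sketch: one pass over the tree instead of two.
# Assumes GNU find and GNU xargs (-print0, -0 and -r are extensions).
# expire_by_find is a hypothetical name; $1 is the directory to clean.
expire_by_find() {
    # Match files older than 14 days, OR files that are both older than
    # 2 days and larger than 1000 KiB. NUL-separated names keep odd
    # file names intact; -r skips rm when nothing matched.
    find "$1" -type f \
        \( -mtime +14 -o \( -mtime +2 -size +1000k \) \) \
        -print0 | xargs -0 -r rm -f --
}

# Usage (path is an example only):
#   expire_by_find /var/mail/user/Maildir
```

Note that this still removes any regular file that matches, not just
messages.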

Then, use the same regex as Exim to follow folders. As written, you are
also removing maildirsize files and vacation databases - not good. The script is not
folder-specific, but of course you could parse the expire files, hacking
find a little to change parameters per directory. Then you still have
the problem of scanning many folders that are unchanged. Add checking
timestamps and the like to speed it up. A sequential approach makes poor
use of a RAID, so you use multiple processes for parallel I/O. Ugly,
but it works. Except that the load peak now disturbs the system,
so you need to spread the I/O out a little more, tuning it per RAID.
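
To give an idea of where this leads, here is a rough sketch of only the
first two steps: acting solely on folders that carry an "expire" file, and
deleting only below cur/ and new/ so that control files survive. It assumes
GNU find, honours only the "t" limit, does not cope with newlines in path
names, and uses a made-up function name. None of the timestamp checks or
parallel I/O is in here yet:

```shell
# Sketch only: walk a maildir tree and expire by age per folder.
# expire_tree is a hypothetical name; $1 is the maildir base directory.
expire_tree() {
    find "$1" -type f -name expire -print | while IFS= read -r spec; do
        folder=${spec%/expire}
        age=
        # Pick the field that is a plain number followed by "t".
        for field in $(cat "$spec"); do
            case $field in
                *[!0-9]*t|t) ;;               # malformed, ignore
                *t) age=${field%t} ;;
            esac
        done
        [ -n "$age" ] || continue             # no age limit in this folder
        minutes=$((age / 60))                 # find -mmin counts minutes
        for sub in cur new; do
            [ -d "$folder/$sub" ] || continue
            # Only message directories are touched, so maildirsize and
            # similar files in the folder itself are left alone.
            find "$folder/$sub" -type f -mmin "+$minutes" -exec rm -f -- {} +
        done
    done
}
```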

Cleaning up as part of delivery has a good chance of finding much of the data in
the page cache already, and it only cleans up folders that have changed,
at the time they change. Exim already does parallel deliveries, making good
use of RAIDs, and things happen in time without disturbing anything else.

Some things only look easy until you try them.

Michael
