Mailing List Archive

[PATCH] procmail-like filters for dbmail
The attached patch enables dbmail to use a simple procmail-like
mechanism for filtering incoming mail. Rather than always delivering to
INBOX, users can now filter their mail with regular expressions and
direct it to different folders. For example, a pattern like
^List-Id:.*dbmail\.org
can be used to direct all mail from the dbmail lists to a specific
folder.

It should be possible to let untrusted users create these filters, and
matching is very fast. The price is that it can only do fairly simple
matching (POSIX extended regexes) and, unlike procmail, has no command
execution etc.
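
As a rough illustration of the kind of matching described above, here is a minimal sketch using the standard POSIX <regex.h> calls. It is not taken from the patch; the helper name header_matches() is hypothetical, and the choice of flags is an assumption about what a header filter would want.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper (not from the patch): returns 1 if `pattern` matches
 * somewhere in `headers`, 0 if it does not, and -1 if the pattern does not
 * compile. */
int header_matches(const char *pattern, const char *headers)
{
    regex_t re;
    int rc;

    assert(pattern != NULL && headers != NULL);

    /* REG_EXTENDED: POSIX extended syntax.
     * REG_NEWLINE:  ^ and $ anchor at each header line, and . does not
     *               cross a line break.
     * REG_NOSUB:    only a yes/no answer is needed. */
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NEWLINE | REG_NOSUB) != 0)
        return -1;

    rc = regexec(&re, headers, 0, NULL, 0);
    regfree(&re);
    return rc == 0 ? 1 : 0;
}
```

Called with the pattern ^List-Id:.*dbmail\.org and a complete header block, this matches only when a header line starts with List-Id:. Header names are case-insensitive in mail, so a real filter might also want REG_ICASE.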

To make this work, my patch adds the "filter" table. If this table is
not present, everything will work like always, so backward compatibility
shouldn't be an issue.
A method for users to input these regular expressions is not included,
as it will probably be a part of a web interface for user administration
at each site.

--
Jonas Jensen <jbj@knef.dk>
Re: [PATCH] procmail-like filters for dbmail
I feel that you're attaching too late into the delivery chain. The
function insert_messages() in pipe.c should probably be the place for this
to go (at the same time breaking insert_messages() into smaller parts).

Interestingly, it is also the exact place where my LDAP patch needs to
hook in to create the INBOX for a user who does not have one yet (ie it's
their first email since the account was created in the LDAP directory).
You probably have a similar issue if a mailbox is specified for a filter
rule but does not exist yet.

With the filters applied higher up on the delivery chain, they can also be
used to generate bounces, forwards or call external programs.

If you're interested, I'm working on a Sieve based solution. The project
has quite a few steps involved; I'm at about step 3 of 2349058725. I'd
be happy to share the work on this project!

Check out libsieve.sf.net to see the library I set up, although the API
needs an overhaul (I have the design document but haven't coded it yet).
The best part is that it plugs into existing MUA infrastructure, using the
ManageSieve draft RFC to do filter uploading and the Sieve language for
the filters. There's a bunch of Sieve script editors/uploaders written in
Perl, PHP and for X. As always, freshmeat.net is your friend ;-)

Really nice patch, though! For those who need filtering yesterday (umm,
me...) and in case the Sieve project tanks, this looks great.

Aaron


On 19 Apr 2003, Jonas Jensen wrote:

> The attached patch enables dbmail to use a simple procmail-like
> mechanism for filtering incoming mail.
> [...]
Re: [PATCH] procmail-like filters for dbmail
On Mon, 2003-04-21 at 02:15, Aaron Stone wrote:
> Interestingly, it is also the exact place where my LDAP patch needs to
> hook in to create the INBOX for a user who does not have one yet (ie it's
> their first email since the account was created in the LDAP directory).
> You probably have a similar issue if a mailbox is specified for a filter
> rule but does not exist yet.

Nope. The SELECT statement I use makes an implicit INNER JOIN between
filters and mailboxes, making sure that the mailbox exists, otherwise
the row won't be returned.
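
The query itself is in the patch and is not shown in this thread. As a sketch of the effect only, the following C function does over in-memory arrays what an inner join on mailbox_idnr would do: a filter whose mailbox is missing simply drops out of the result. The struct names and layout are hypothetical, and mailbox ids are assumed to be unique.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical in-memory stand-ins for rows of the filter and mailbox tables. */
struct filter_row  { const char *pattern; unsigned long mailbox_idnr; };
struct mailbox_row { unsigned long mailbox_idnr; const char *name; };

/* Copies into `out` only those filters whose mailbox_idnr has a matching
 * mailbox row. Returns the number of rows written (at most out_cap). */
size_t join_filters(const struct filter_row *filters, size_t nfilters,
                    const struct mailbox_row *boxes, size_t nboxes,
                    struct filter_row *out, size_t out_cap)
{
    size_t n = 0;

    assert(filters != NULL && boxes != NULL && out != NULL);
    for (size_t i = 0; i < nfilters && n < out_cap; i++) {
        for (size_t j = 0; j < nboxes; j++) {
            if (filters[i].mailbox_idnr == boxes[j].mailbox_idnr) {
                out[n++] = filters[i];   /* mailbox exists: keep the rule */
                break;
            }
        }
        /* no matching mailbox: the rule is silently skipped */
    }
    return n;
}
```

The consequence for delivery is that a rule pointing at a deleted folder never matches, so the message falls through to the default destination instead of failing.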

> With the filters applied higher up on the delivery chain, they can also be
> used to generate bounces, forwards or call external programs.

This would be nice. So what you mean is that the filters table should
have an owner_idnr field and a deliver_to field, replacing the
mailbox_idnr field it has now. The code to scan for filters probably
belongs around the call to auth_check_user (which should be renamed to
auth_check_alias IMHO), as it needs to be called recursively.

For example, if Alice has a filter forwarding some mail to Bob, the
final destination will be resolved like this:
    alias (alice@domain)
    -> userid (5)
    -> filter (mail that belongs to bob)
    -> alias (bob@domain)
    -> userid (6)
    -> filter (no filters match)
    -> mailbox (INBOX)

Numbers in filters.deliver_to point to mailboxes while numbers in
aliases.deliver_to point to userids. To handle this I propose a
resolve_alias() function for pipe.c. Here's my idea, in pseudo code:

/* Returns a string describing either an external forward or a mailbox id */
string resolve_alias(string deliver_to, const char *headers)
{
    while (1) {
        /* look up the full address first, then fall back to the @domain part */
        string resolved = auth_check_user(deliver_to);
        if (!resolved && strchr(deliver_to, '@'))
            resolved = auth_check_user(strchr(deliver_to, '@'));
        if (!resolved)
            return deliver_to;  /* nothing matched: external forward */
        deliver_to = resolved;

        if (is_numeric(deliver_to)) {
            /* a userid was returned and must be resolved to a mailboxid */
            string filter_destination = db_check_filters(deliver_to, headers);

            if (filter_destination) {
                /* a filter matched, returning either a mailboxid or an alias */
                if (is_numeric(filter_destination))
                    return filter_destination;
                else
                    deliver_to = filter_destination;
            } else {
                /* no filters matched, default to INBOX */
                return db_create_and_get_inbox(deliver_to);
            }
        } else {
            return deliver_to;
        }
    }
}
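
To make the Alice/Bob chain concrete, here is a small self-contained sketch in plain C. It is an illustration under assumptions, not dbmail code: the tables are hard-coded arrays, lookup_alias() and check_filters() are hypothetical stand-ins for auth_check_user() and db_check_filters(), a substring test replaces the regex match, filters only forward to addresses, and the @domain fallback is left out. The max_hops bound is an addition so that two users forwarding to each other cannot loop forever.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the aliases and filters tables. */
struct alias  { const char *address; const char *deliver_to; };
struct filter { const char *owner; const char *needle; const char *deliver_to; };

static const struct alias aliases[] = {
    { "alice@domain", "5" },
    { "bob@domain",   "6" },
};
static const struct filter filters[] = {
    /* Alice (userid 5) forwards mail addressed to bob on to Bob. */
    { "5", "To: bob", "bob@domain" },
};

static int is_numeric(const char *s)
{
    if (*s == '\0')
        return 0;
    for (; *s; s++)
        if (!isdigit((unsigned char)*s))
            return 0;
    return 1;
}

/* Stand-in for auth_check_user(): address -> deliver_to, or NULL. */
static const char *lookup_alias(const char *address)
{
    for (size_t i = 0; i < sizeof aliases / sizeof aliases[0]; i++)
        if (strcmp(aliases[i].address, address) == 0)
            return aliases[i].deliver_to;
    return NULL;
}

/* Stand-in for db_check_filters(): first rule of `userid` that matches. */
static const char *check_filters(const char *userid, const char *headers)
{
    for (size_t i = 0; i < sizeof filters / sizeof filters[0]; i++)
        if (strcmp(filters[i].owner, userid) == 0 &&
            strstr(headers, filters[i].needle) != NULL)
            return filters[i].deliver_to;
    return NULL;
}

/* Returns the userid whose INBOX receives the mail, or the address itself
 * if it is not local. Returns NULL if max_hops is exhausted. */
const char *resolve(const char *deliver_to, const char *headers, int max_hops)
{
    assert(deliver_to != NULL && headers != NULL);
    while (max_hops-- > 0) {
        const char *next = lookup_alias(deliver_to);
        if (next == NULL || !is_numeric(next))
            return next ? next : deliver_to;   /* external forward */
        const char *dest = check_filters(next, headers);
        if (dest == NULL)
            return next;                       /* no filter: this user's INBOX */
        deliver_to = dest;                     /* filter forwarded elsewhere */
    }
    return NULL;
}
```

With these tables, resolve("alice@domain", "To: bob\n", 8) walks alice -> 5 -> filter -> bob -> 6 and ends at userid 6, while the same call with headers that do not mention bob stops at userid 5.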

> If you're interested, I'm working on a Sieve based solution. The project
> has quite a few steps involved; I'm at about step 3 of 2349058725. I'd
> be happy to share the work on this project!

At first I also wanted a richer mail filtering language like
Sieve/procmail/maildrop, but now I think simple regexes are in many
cases better. dbmail is often used for serving large numbers of
untrusted users, and allowing them to write filters in a rich language
could hurt performance. It would also be necessary to rip out the
command execution and any possibilities for writing infinite loops in
those languages to avoid security problems.

Also, I found that my entire .procmailrc could be represented with
simple regexes. I think that's the case for 95% of dbmail users (users,
not admins).

> Really nice patch, though! For those who need filtering yesterday (umm,
> me...) and in case the Sieve project tanks, this looks great.

Of course, the two solutions can coexist if/when you finish your
libsieve project.

BTW, I'll be offline until Friday.

--
Jonas Jensen <jbj@knef.dk>
Re: [PATCH] procmail-like filters for dbmail
On 21 Apr 2003, Jonas Jensen wrote:

> On Mon, 2003-04-21 at 02:15, Aaron Stone wrote:
> > Interestingly, it is also the exact place where my LDAP patch needs to
> > hook in to create the INBOX for a user who does not have one yet (ie it's
> > their first email since the account was created in the LDAP directory).
> > You probably have a similar issue if a mailbox is specified for a filter
> > rule but does not exist yet.
>
> Nope. The SELECT statement I use makes an implicit INNER JOIN between
> filters and mailboxes, making sure that the mailbox exists, otherwise
> the row won't be returned.
>

Oh, I didn't see that. It's good to have someone around who knows their
SQL really well ;-)

> > With the filters applied higher up on the delivery chain, they can also be
> > used to generate bounces, forwards or call external programs.
>
> This would be nice. So what you mean is that the filters table should
> have an owner_idnr field and a deliver_to field, replacing the
> mailbox_idnr field it has now. The code to scan for filters probably
> belongs around the call to auth_check_user (which should be renamed to
> auth_check_alias IMHO), as it needs to be called recursively.
>
> For example, if Alice has a filter forwarding some mail to Bob, the
> final destination will be resolved like this:
> alias (alice@domain)
> -> userid (5)
> -> filter (mail that belongs to bob)
> -> alias (bob@domain)
> -> userid (6)
> -> filter (no filters match)
> -> mailbox (INBOX)
>

It's interesting that you would insert a short circuit for local forwards.
I think it would be easier just to pass all email forwards back to the
MTA. This also adds the appropriate headers and whatnot indicating the
path that the mail took.

> Numbers in filters.deliver_to point to mailboxes while numbers in
> aliases.deliver_to point to userids. To handle this I propose a
> resolve_alias() function for pipe.c. Here's my idea, in pseudo code:
>
> /* Returns a string describing either an external forward or a mailbox id */
> string resolve_alias(string deliver_to, const char *headers)
> {
>     while (1) {
>         /* look up the full address first, then fall back to the @domain part */
>         string resolved = auth_check_user(deliver_to);
>         if (!resolved && strchr(deliver_to, '@'))
>             resolved = auth_check_user(strchr(deliver_to, '@'));
>         if (!resolved)
>             return deliver_to;  /* nothing matched: external forward */
>         deliver_to = resolved;
>
>         if (is_numeric(deliver_to)) {
>             /* a userid was returned and must be resolved to a mailboxid */
>             string filter_destination = db_check_filters(deliver_to, headers);
>
>             if (filter_destination) {
>                 /* a filter matched, returning either a mailboxid or an alias */
>                 if (is_numeric(filter_destination))
>                     return filter_destination;
>                 else
>                     deliver_to = filter_destination;
>             } else {
>                 /* no filters matched, default to INBOX */
>                 return db_create_and_get_inbox(deliver_to);
>             }
>         } else {
>             return deliver_to;
>         }
>     }
> }
>

I like where some of this is going, but I am hoping to divide up these
pieces further. I'll get some code together and post the prototypes, but
basically everything will have something like these arguments:

int blah( FILE *instream, char *header, list userids, bounces, forwards );

> > If you're interested, I'm working on a Sieve based solution. The project
> > has quite a few steps involved; I'm at about step 3 of 2349058725. I'd
> > be happy to share the work on this project!
>
> At first I also wanted a richer mail filtering language like
> Sieve/procmail/maildrop, but now I think simple regexes are in many
> cases better. dbmail is often used for serving large numbers of
> untrusted users, and allowing them to write filters in a rich language
> would slow down performance. It would also be necessary to rip out the
> command execution and any possibilities for writing infinite loops in
> those languages to avoid security problems.
>

Sieve is written specifically for email filtering, and as such does not
include commands to call external programs or to run in loops of any kind.
In fact, there aren't even variables (yet; there's a draft in progress).

> Also, I found that my entire .procmailrc could be represented with
> simple regexes. I think that's the case for 95% of dbmail users (users,
> not admins).
>
> > Really nice patch, though! For those who need filtering yesterday (umm,
> > me...) and in case the Sieve project tanks, this looks great.
>
> Of course, the two solutions can coexist if/when you finish your
> libsieve project.
>
> BTW, I'll be offline until Friday.
>

Take care!

> --
> Jonas Jensen <jbj@knef.dk>
>
> _______________________________________________
> Dbmail-dev mailing list
> Dbmail-dev@dbmail.org
> http://twister.fastxs.net/mailman/listinfo/dbmail-dev
>
Re: [PATCH] procmail-like filters for dbmail
So Jonas, just last night I posted my rewritten pipe.c, which now has a
hook for filtering built into the delivery chain. The flow is a little
different than I was expecting, but I'm content with the results, and glad
that I didn't have to touch anything in the database layer.

The flow I sketched in my pipe.c rewrite first copies messages into a
target user's INBOX, and then calls the function execute_filters(useridnr,
msgidnr). The filters can then run on the header and/or body and toss the
message around as needed, sending a rejection to the From address and
deleting it, forwarding and possibly deleting it, or just moving it to
another mailbox.

I'd like to write this function in a way that supports both Sieve and Regex
filters. Let's confer on the Regex filters, and perhaps expand their abilities
to do rejects and forwards, since at this level it would be quite easy to
simply add another field or two with an action type and a forward entry.
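
One way the extra fields could look, as a sketch only: the thread names moving, forwarding and rejecting as outcomes, but the enum, the struct layout and the validity check below are all hypothetical. Treating mailbox_idnr 0 as "no mailbox" is likewise an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical action types for a filter rule. */
enum filter_action { ACTION_MOVE, ACTION_FORWARD, ACTION_REJECT };

/* Hypothetical rule layout: the existing pattern and mailbox id, plus an
 * action type and a forward entry. */
struct filter_rule {
    const char *pattern;          /* regular expression to match */
    enum filter_action action;    /* what to do on a match */
    unsigned long mailbox_idnr;   /* target for ACTION_MOVE, 0 if unused */
    const char *forward_to;       /* target for ACTION_FORWARD, else NULL */
};

/* Returns 1 if the rule carries the data its action needs, else 0. */
int rule_is_valid(const struct filter_rule *rule)
{
    assert(rule != NULL);
    if (rule->pattern == NULL || rule->pattern[0] == '\0')
        return 0;
    switch (rule->action) {
    case ACTION_MOVE:
        return rule->mailbox_idnr != 0;
    case ACTION_FORWARD:
        return rule->forward_to != NULL && rule->forward_to[0] != '\0';
    case ACTION_REJECT:
        return 1;
    }
    return 0;   /* unknown action value */
}
```

A check like this would let whatever front end stores the rules refuse a forward rule with no address before it ever reaches delivery.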

Aaron


On Mon, 21 Apr 2003, Aaron Stone wrote:

> [...]
> I like where some of this is going, but I am hoping to divide up these
> pieces further. I'll get some code together and post the prototypes, but
> basically everything will have something like these arguments:
>
> int blah( FILE *instream, char *header, list userids, bounces, forwards );
> [...]
Re: [PATCH] procmail-like filters for dbmail
Hello Aaron,

> The flow I sketched in my pipe.c rewrite first copies messages into a
> target user's INBOX, and then calls the function execute_filters(useridnr,
> msgidnr). The filters can then run on the header and/or body and toss the
> message around as needed, sending a rejection to the From address and
> deleting it, forwarding and possibly deleting it, or just moving it to another mailbox.

Just a thought - it seems there'd be a race condition here: a message
has been inserted into INBOX, but before the filters get executed to
move it somewhere else, a user reads their mailbox (e.g. via imap). I
don't know whether this would cause actual errors or just an
inconsistent view of the mailbox. Errors seem possible: a recent imap
trace posted here queried for message metadata matching both
message_idnr and mailbox_idnr, and that lookup could fail if the
mailbox_idnr changed after it was first read. Considering some of those
filters will likely be for matching virus headers, etc., there's a
chance someone could end up opening such a message that should have
been deleted.

Also, I've not looked at your code, so you may well have addressed
this....

Jesse


--
Jesse Norell
jesse (at) kci.net
Re: [PATCH] procmail-like filters for dbmail
You're correct, thanks for pointing that out! There are a few other
problems yet to work out; I had also noticed an issue when the specified
mailbox does not exist, among other things, but I totally missed this one.

In this version, I tried to avoid changes to the db layer, but it
looks like that won't hold up for much longer...

Aaron


On Mon, 28 Apr 2003, Jesse Norell wrote:

> Just a thought - it seems there'd be a race-condition here where
> a message has been inserted into INBOX, but before the filters get
> executed to move it somewhere else, a user reads their mailbox (eg.
> via imap).
> [...]
Re: [PATCH] procmail-like filters for dbmail
Hey,

Did you use a temporary mailbox (idnr=0)? If so, you could just
dump the message there and everything else should probably work
(you already give it the user_idnr and message_idnr). Make
execute_filters default to INBOX delivery, or else clean up after it
if the message is still in the temp mailbox (you'd have to consider
handling multiple recipients of the same message).
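
A sketch of that ordering, under assumptions: the message is a plain struct rather than a database row, run_filters() is a hypothetical stand-in that uses a substring test instead of the real matching, and multiple recipients are not modelled. It only shows that the message goes temp mailbox -> filters -> INBOX fallback, so it is never visible in INBOX before the filters have run.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TEMP_MAILBOX 0UL   /* staging mailbox, idnr=0 as suggested above */

struct message { unsigned long mailbox_idnr; const char *headers; };

/* Hypothetical filter pass: moves the message and returns 1 if a rule
 * matches, otherwise leaves it where it is and returns 0. */
static int run_filters(struct message *msg, unsigned long lists_box)
{
    if (strstr(msg->headers, "List-Id:") != NULL) {
        msg->mailbox_idnr = lists_box;
        return 1;
    }
    return 0;
}

/* Stages the message in the temp mailbox, filters it, and falls back to
 * the INBOX if it is still in the temp mailbox afterwards. */
void deliver(struct message *msg, unsigned long inbox, unsigned long lists_box)
{
    assert(msg != NULL && msg->headers != NULL);
    msg->mailbox_idnr = TEMP_MAILBOX;
    if (!run_filters(msg, lists_box) || msg->mailbox_idnr == TEMP_MAILBOX)
        msg->mailbox_idnr = inbox;   /* default delivery doubles as cleanup */
}
```

This single-threaded sketch says nothing about locking; in the database the final move would still need to be one atomic update for the race to go away.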

---- Original Message ----
From: Aaron Stone <dbmail-dev@dbmail.org>
To: dbmail-dev@dbmail.org
Subject: Re: [Dbmail-dev] [PATCH] procmail-like filters for dbmail
Sent: Mon, 28 Apr 2003 08:39:05 -0700 (PDT)

You're correct, thanks for pointing that out! There are a few other
problems yet to work out, I also noticed an issue if the specified
mailbox does not exist, among other things -- totally missed this one.

[...]

-- End Original Message --


--
Jesse Norell
jesse (at) kci.net
Re: [PATCH] procmail-like filters for dbmail
I think I might be able to go this route. The middle-level function
execute_filters() would need to be a bit more psychic, but not a lot.
Mostly it would just need to know about mailbox delivery, and in fact
that's probably the most appropriate place for that to go. I suddenly
found myself coding an INBOX find/create routine in insert_messages()
and really didn't like the feel of where that was going.
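
For reference, the find/create step itself is small. The following is a sketch over an in-memory table, not dbmail's database layer; the struct layout, the table type and the function name are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_BOXES 16

/* Hypothetical in-memory mailbox table. */
struct mailbox {
    unsigned long idnr;
    unsigned long owner_idnr;
    char name[32];
};
struct mailbox_table {
    struct mailbox rows[MAX_BOXES];
    size_t count;
    unsigned long next_idnr;
};

/* Returns the idnr of mailbox `name` for `owner`, creating it if missing.
 * Returns 0 if the table is full or the name does not fit. */
unsigned long find_or_create_mailbox(struct mailbox_table *t,
                                     unsigned long owner, const char *name)
{
    assert(t != NULL && name != NULL);

    /* find: an existing mailbox wins */
    for (size_t i = 0; i < t->count; i++)
        if (t->rows[i].owner_idnr == owner &&
            strcmp(t->rows[i].name, name) == 0)
            return t->rows[i].idnr;

    /* create: append a new row */
    if (t->count == MAX_BOXES || strlen(name) >= sizeof t->rows[0].name)
        return 0;
    struct mailbox *box = &t->rows[t->count++];
    box->idnr = t->next_idnr++;
    box->owner_idnr = owner;
    strcpy(box->name, name);   /* length checked above */
    return box->idnr;
}
```

Calling it twice for the same owner and "INBOX" returns the same id both times, which is the property the delivery path relies on for a user's first message.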

I should have time to write execute_filters() this evening; with luck,
that will pretty much finish straightening out the delivery path!

Aaron


On Mon, 28 Apr 2003, Jesse Norell wrote:

> Did you use a temporary mailbox (idnr=0)? If so, you could just
> dump the message there and everything else should probably work
> (you already give it the user_idnr and message_idnr).
> [...]