Mailing List Archive: local_scan interface discussion

Jul 14, 2002, 6:43 PM

[
If this should be discussed in a separate (new) forum, let me know
and we'll set something up. Otherwise please keep the discussion
properly threaded so people can follow or ignore it as they wish.
]

My reason for writing this is to facilitate collaboration to design a
"perfect" local_scan interface. Philip has the final say regarding
what will or will not be included in the official exim release, of
course. I want to help make the interface meet my desires, and also
to help implement it (now :-)) as I am able.

Please contribute your $0.02 (in whatever monetary unit you deem most
appropriate).

The Problem :
The current problem with the local_scan API is that it requires exim
to be recompiled whenever the scanner is changed. This is annoying
when the scanner changes often or for experimenting with different
scanners, and runs directly counter to distributions of pre-packaged
software. A better implementation will decouple the scanner from the
rest of exim.

The Goal :
The goal of this discussion is to develop a design for the
implementation of the local_scan function which decouples the scanner
from exim proper. The goals of this decoupling are to
1) increase flexibility wrt choosing a scanner
2) allow scanners to be (re)compiled without rebuilding the rest of exim
3) allow distributions (eg RedHat or Debian) to provide separate
packages for exim and each of the available scanners
4) allow users of a distribution to use a scanner while still
using the packaged version of exim

Another goal is for the design (and implementation) to be acceptable
to Philip and included in the standard exim release. This is
important for supporting goals #3 and #4 above.

I think some basic premises must be agreed upon :

o Incompatible changes will occur from time to time. Preventing
these changes from disrupting anything is not realistic. (eg
exim4 breaks compatibility with all exim3 config files)

o When (not if) an incompatibility arises, it is desirable to
detect, report, and gracefully handle it.

o Following the XP philosophy, the simplest method that works is
best. I would like to keep the implementation as small and as
simple as possible, while still meeting the goals.

To start with, I'll list some ideas I have which I don't think are
very realistic. I am doing this so that they can be "vetoed" right
from the beginning.

o Use CORBA or XML-RPC or some such middleware to communicate
between exim and the scanner.

Pros:
o on-the-wire protocols are standard
o implementation libraries are available
o allows scanners to be written in any language
o eliminates C-level binary compatibility concerns

Cons:
o would require too much complexity in exim itself to
implement its side of the communication

o Embed a high-level language interpreter (eg perl, python, or
java), and let it dynamically load modules and whatnot

Pros:
o eliminates C-level binary compatibility concerns
o allows (forces, rather) local_scan functions to be written
in a language other than C

Cons:
o complexity
o increases the size of "exim" since it would contain an
extra interpreter
o increases performance overhead due to startup requirements
of the extra interpreter
o stirs up language wars

o Same as above, but implement the interpreter ourselves

Pros:
o eliminates C-level binary compatibility concerns
o allows (forces, rather) local_scan functions to be written
in a language other than C
o avoids existing language wars

Cons:
o way too much complexity
o who has time to create or learn Yet Another high level
language anyways?
o could create a new language war

On a more practical level, I have these ideas :

o Treat the local_scan the same way other (dynamic) libraries, such
as libldap or libpq, are treated. Let the system's dynamic linker
deal with loading the local scan library at runtime.

Pros:
o eliminates C-level binary compatibility concerns
o eliminates the need to write code dealing with dlopen(), etc.

Cons:
o only allows a single liblocal_scan to be installed at any
time (AFAIK)

Questions :
o How would a system with multiple local_scan libraries
installed behave?
o How would the admin specify which one to use?

The last two ideas, below, are the most practical, I think.

o Create an interface that leverages existing IPC mechanisms such as
pipes, UNIX domain sockets (not the same thing as fifos, aka named
pipes: sockets are bidirectional and use the socket API), or TCP
sockets to communicate with a scanner. The scanner would be a
separate, complete, application.

Pros:
o eliminates C-level binary compatibility concerns
o allows local_scan functions to be written in any language
o avoids the language wars that embedding an interpreter would create

Cons:
o Requires creating a new protocol.
(or beating an existing one (maybe LMTP or BSMTP) into a
shape suitable for this use)
o Could be complex.

Comments:
o The complexity of creating and implementing a new protocol
can be minimized by devising a sufficiently simple
protocol.
o If this mechanism is chosen, then additional discussion on
the merits of each IPC mechanism and protocol choices will
need to follow.

Additional Data:
One idea I had for this is using a pipe. exim would open a
pipe to the specified scanner program. The message would be
passed to the scanner on stdin. The exit code from the
scanner program would determine what exim should do with it --
accept, tempreject, permreject. If the scanner rejects the
message, its output would be the message to return to the
other server. Otherwise its output would specify how the
message should be modified (namely adding or modifying
headers). The format of the header modification text is a
detail that can be worked out later.
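
A minimal sketch of the pipe idea, to show how little code the calling
side needs. It is written in Python only for brevity (exim itself is C).
Everything named here is an assumption for illustration: the exit-code
numbers (0 = accept, 1 = tempreject, 2 = permreject), the function name
run_scanner, and the throwaway demo scanner. Treating a scanner that
cannot be started, times out, or returns an unknown exit code as
tempreject is one possible policy, not a decided one.

```python
import subprocess
import sys

# Hypothetical exit-code convention; the proposal does not assign numbers.
ACCEPT, TEMPREJECT, PERMREJECT = 0, 1, 2
VERDICTS = {ACCEPT: "accept", TEMPREJECT: "tempreject", PERMREJECT: "permreject"}


def run_scanner(argv, message, timeout=30):
    """Pipe `message` (bytes) to the scanner's stdin.

    Returns (verdict, output), where output is whatever the scanner
    wrote to stdout: the rejection text on a reject, or header
    modification text (format still undefined) on an accept.
    """
    try:
        proc = subprocess.run(
            argv, input=message, stdout=subprocess.PIPE, timeout=timeout
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        # Scanner missing, not executable, or hung: defer the message.
        return "tempreject", str(exc).encode()
    # Unknown codes (including death by signal) also defer the message.
    return VERDICTS.get(proc.returncode, "tempreject"), proc.stdout


# A throwaway scanner, run as a separate process, that permanently
# rejects any message containing the marker "X-Spam-Test".
DEMO_SCANNER = [
    sys.executable, "-c",
    "import sys\n"
    "body = sys.stdin.buffer.read()\n"
    "if b'X-Spam-Test' in body:\n"
    "    print('message refused by demo scanner')\n"
    "    sys.exit(2)\n"
    "sys.exit(0)\n",
]
```

Calling run_scanner(DEMO_SCANNER, msg) with a message that contains the
marker yields a "permreject" verdict plus the scanner's text; any other
message is accepted. The scanner could equally be a shell script or a
compiled program, which is the point of this option.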

o Use libdl (dlopen, dlsym) to load an admin-specified .so.

Pros:
o doesn't require a lot of code
o an initial implementation is already available
o the scanner API is almost identical to the current one
o no new protocols need to be devised

Cons:
o C is (very apparently) not well suited for dynamic programs
o the libdl API doesn't provide any type checking the way
the C compiler does (or the way python does for "dynamic"
modules)
o This makes it easy for an admin to shoot a 3-sided hole in
exim. If a bad .so is specified (accidentally or
maliciously), exim _could_ have a hard time handling it
gracefully. It will more than likely crash if the ABI
checking doesn't catch the mismatch.
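
A rough sketch of the load-and-look-up step with graceful failure, to
make the libdl option concrete. It uses Python's ctypes because, on
POSIX systems, ctypes.CDLL wraps dlopen() and attribute lookup wraps
dlsym(); exim would call those directly from C. The function name
load_scanner and any path passed to it are hypothetical; the symbol
name local_scan comes from the existing interface.

```python
import ctypes


def load_scanner(path, symbol="local_scan"):
    """Try to load an admin-specified shared object.

    Returns (function, None) on success or (None, reason) on failure,
    so the caller can report the problem instead of crashing.
    """
    try:
        lib = ctypes.CDLL(path)  # dlopen() underneath on POSIX
    except OSError as exc:
        return None, f"cannot load {path!r}: {exc}"
    try:
        func = getattr(lib, symbol)  # dlsym() underneath
    except AttributeError:
        return None, f"{path!r} has no symbol {symbol!r}"
    # Note: nothing here verifies the function's argument or return
    # types. A symbol with the right name but the wrong signature is
    # returned just the same, which is the type-checking gap listed
    # in the cons above.
    return func, None
```

A missing file or a missing symbol is caught and reported; a wrong
signature is not, so some explicit version or ABI check would still be
needed on top of this.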

-D

--
If we claim we have not sinned, we make Him out to be a liar and His
Word has no place in our lives.
I John 1:10

http://dman.ddts.net/~dman/