
pipelining vs. single-message delivery (was Re: qmail 1.00 born. wish list for 1.1)
qmail's current behaviour is similar to Netscape's behaviour, where it
opens multiple TCP connections to fetch multiple images at the same
time. But as has been shown by the W3C
<http://www.w3.org/pub/WWW/Protocols/HTTP/Performance/Pipeline.html>,
pipelined connections are far more efficient on bandwidth. Bandwidth may
be "getting cheaper", but it isn't cheap yet, and efforts like the
above make it obvious that there is a lot to be saved (an order of
magnitude in some of those test cases).

Pipelining is available as an extension to SMTP, so similar results could
be achieved with SMTP if it were more widely deployed. The changes
required to support pipelining in Apache weren't that extensive; they
consisted mostly of better heuristics as to when to flush to the socket in
order to support non-pipelining clients. So it's reasonable to think that
it wouldn't be hard to put pipelining into the major MTAs (qmail-smtpd
already supports it).
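
To make the round-trip saving concrete, here is a rough Python sketch that
only counts how many times a client has to stop and wait for the server
while sending one message. It is a model, not an SMTP client: the function
name command_groups, the addresses, and the host name client.example are
made up for illustration, and it ignores the TCP handshake and the server
greeting. It assumes the server has advertised the pipelining extension
when pipelining=True.

```python
def command_groups(sender, recipients, pipelining):
    """Return the batches of SMTP commands a client would write for one
    message. The client waits for the server's replies after each batch,
    so len(result) is a rough count of round trips."""
    envelope = [f"MAIL FROM:<{sender}>"]
    envelope += [f"RCPT TO:<{rcpt}>" for rcpt in recipients]
    envelope.append("DATA")

    if pipelining:
        # Assumption: the server advertised PIPELINING in its EHLO reply,
        # so the whole envelope can go out in one write, with DATA last.
        batches = [envelope]
    else:
        # Without the extension, each command waits for its own reply.
        batches = [[command] for command in envelope]

    return (
        [["EHLO client.example"]]      # must be answered before pipelining
        + batches
        + [["<message text>", "."]]    # message body and terminating dot
        + [["QUIT"]]
    )


recipients = ["a@example.org", "b@example.org", "c@example.org"]
for pipelining in (False, True):
    batches = command_groups("list@example.com", recipients, pipelining)
    print("pipelining" if pipelining else "one at a time", len(batches))
```

By this count, three recipients cost 8 waits without pipelining and 4 with
it, and the pipelined figure stays at 4 however many recipients share the
connection. How much that matters in practice depends on link latency,
which is exactly what a real measurement would have to show.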

I would be very interested in seeing a study of mail transfer behaviour
over various link qualities, between various MTAs, and using pipelined
versus parallel connections. Of particular interest is behaviour between
qmail and sendmail (in various incarnations) since sendmail is
undoubtedly the most common MTA.

One big win that qmail's model achieves is that it does not do MX lookups
on the entire recipient list prior to initiating delivery. That is one of
the factors that slow down sendmail's delivery on large lists. Even a
simplistic approach such as sorting the recipient list by domain and using
pipelining for recipients at the same domain is likely to be faster than
what qmail does now.
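
The "sort by domain" idea can be sketched in a few lines of Python. This
is only an illustration of the grouping step, not anything qmail or
sendmail actually does: group_by_domain is a hypothetical helper, it
assumes every address contains an "@", and it treats the domain as
case-insensitive. It also ignores the fact that different domains may
share the same MX hosts.

```python
from collections import defaultdict


def group_by_domain(recipients):
    """Group recipient addresses by domain, with domains in sorted order.

    Each group could then be delivered over one connection, with the
    RCPT TO commands for that group pipelined."""
    groups = defaultdict(list)
    for address in recipients:
        # Split on the last "@" and fold the domain to lower case.
        domain = address.rsplit("@", 1)[1].lower()
        groups[domain].append(address)
    return dict(sorted(groups.items()))


batches = group_by_domain(
    ["a@Example.org", "b@example.com", "c@example.org"]
)
for domain, addresses in batches.items():
    # One MX lookup and one connection per domain, made only when that
    # domain's turn comes up, rather than for the whole list up front.
    print(domain, addresses)
```

Because the lookup for each domain happens only when its batch is about to
be sent, this keeps the property that delivery can start before the whole
recipient list has been resolved.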

At any rate, every time this topic comes up the claim is made that qmail's
approach is superior regardless of the waste of bandwidth. However, I've
never seen research that shows this while accounting for side effects such
as MX lookups, or for the effect on the receiving host (like running out
of processes or file handles).

Dean

On 23 Feb 1997, Russell Nelson wrote:
> > - find a solution to the annoying explosion method (group of mails to the
> > same host are sent in parallel by separate processes/TCP connections
> > instead of more efficient delivery)
>
> But it *is* more efficient in terms of the only cost that is
> increasing -- the time of the people using the computers. Every other
> cost of email delivery is dropping through the floor -- even in
> PTT-dominated and third-world countries. CPU, disk, and bandwidth are
> all getting cheaper. There is no reason for an MTA, being written in
> the late 90's, to attempt to conserve these resources at the cost of
> human time.
>
> There may be rare circumstances in which bandwidth is more expensive
> than human time. In that case, it makes sense to batch, compress, and
> transport the email to a relay point where the bandwidth is cheaper.
> If bandwidth is truly a concern, then address that concern in an
> effective manner, rather than taking half measures like aggregating
> mail to like hosts.
>
> --
> -russ <nelson@crynwr.com> http://www.crynwr.com/~nelson
> Crynwr Software sells network driver support | PGP ok
> 521 Pleasant Valley Rd. | +1 315 268 1925 voice | Peace, Justice, Freedom:
> Potsdam, NY 13676-3213 | +1 315 268 9201 FAX | pick two (only mostly true)
>