Mailing List Archive

Standardizing the Exim documentation
I have spent a lot of time recently investigating what might be done to
convert the Exim documentation sources (main manual and filter document)
to some kind of standard text format. The executive summary is that I
think it is not unreasonable to consider moving to XML DocBook format.
However, there are a lot of ifs and buts in the detail. Therefore, I
have written a document about it. Please read the document at

http://www.cus.cam.ac.uk/~ph10/exim-docbook.html

A PDF version is available at

http://www.cus.cam.ac.uk/~ph10/exim-docbook.pdf

The document is 6 pages long (in the PDF version). Please send me your
comments, views, and corrections to anything that I might have got
wrong.

Philip

--
Philip Hazel University of Cambridge Computing Service,
ph10@cus.cam.ac.uk Cambridge, England. Phone: +44 1223 334714.
Re: Standardizing the Exim documentation
Philip Hazel <ph10@cus.cam.ac.uk> writes:

> I have spent a lot of time recently investigating what might be done to
> convert the Exim documentation sources (main manual and filter document)
> to some kind of standard text format. The executive summary is that I
> think it is not unreasonable to consider moving to XML DocBook format.

I have played around with XML DocBook a few times. My personal impression:
overkill for documentation[1], and the stylesheets sometimes do not work or
need a special XSLT processor.

> However, there are a lot of ifs and buts in the detail. Therefore, I
> have written a document about it. Please read the document at
>
> http://www.cus.cam.ac.uk/~ph10/exim-docbook.html
>
> A PDF version is available at
>
> http://www.cus.cam.ac.uk/~ph10/exim-docbook.pdf
>
> The document is 6 pages long (in the PDF version). Please send me your
> comments, views, and corrections to anything that I might have got
> wrong.
>
> Philip
>
> -- Philip Hazel University of Cambridge Computing Service,
> ph10@cus.cam.ac.uk Cambridge, England. Phone: +44 1223 334714.


Footnotes:
[1] for the needs of exim documentation.

--
Christoph Kliemt
entici GmbH, Goldenbergstr.1
D 50354 Hürth
Phone + 49 (0) 22 33 92 82 44
Re: Standardizing the Exim documentation
Hi!

Sorry for sending a reply twice... I sent the first mail a bit too
early.. ;-)

Philip Hazel <ph10@cus.cam.ac.uk> writes:

> I have spent a lot of time recently investigating what might be done to
> convert the Exim documentation sources (main manual and filter document)
> to some kind of standard text format. The executive summary is that I
> think it is not unreasonable to consider moving to XML DocBook format.

I have played around with XML DocBook a few times. My personal impression:
overkill for documentation[1], and the stylesheets sometimes do not work or
need a special XSLT processor.

[...]

> The document is 6 pages long (in the PDF version). Please send me your
> comments, views, and corrections to anything that I might have got
> wrong.

Ok.. here it comes.

2. What documentation standard should we use?

Since XML DocBook is a moving target, I am afraid that a lot of
superfluous work will have to be done just to keep up with current versions.

Workaround: steal a simple DTD and extend it if needed. I have done so
with Cocoon's document DTD. It's simple and easy to extend.

3. Maintaining the master source

Edit the xml-source with a "validating" xml-Editor. I use an sgml-mode
in xemacs.

4. Generating output from Docbook

Cocoon is worth a look: http://cocoon.apache.org/ OK, it's a
web development framework, but it has everything needed, including
"serializers" to PDF and PS, plus svg2jpeg, svg2png, svg2tiff etc.

5. What is still outstanding?

figures: what about svg?

Texinfo: if there is an XML master file, I do not see why there can't be
stylesheets that do that transformation. I did not take a closer look at
this, but Google delivered 940 hits for "xslt stylesheet texinfo".

regards

christoph

Footnotes:
[1] for the needs of exim documentation.

--
Christoph Kliemt
entici GmbH, Goldenbergstr.1
D 50354 Hürth
Phone + 49 (0) 22 33 92 82 44
Re: Standardizing the Exim documentation
On Wed, 3 Nov 2004, Christoph Kliemt wrote:

> Since XML DocBook is a moving target, I am afraid that a lot of
> superfluous work will have to be done just to keep up with current versions.

What is moving? In my experiments, what was generated by Asciidoc seemed
to be the same as the markup described in the 1999 O'Reilly DocBook
book. It's all relatively simple stuff. Certainly the ways of processing
it are changing, though.

> Edit the xml-source with a "validating" xml-Editor. I use an sgml-mode
> in xemacs.

I don't use emacs, and I know I'm not the only person in the world who
doesn't. :-) The "conglomerate" editor is such an editor, and no doubt
there will be others in future.

> Cocoon is worth a look: http://cocoon.apache.org/

I'll take a look.

> figures: what about svg?

I didn't know about that. Again, I'll take a look. Thanks for the
pointer.

> Texinfo: if there is an XML master file, I do not see why there can't be
> stylesheets that do that transformation. I did not take a closer look at
> this, but Google delivered 940 hits for "xslt stylesheet texinfo".

Sure, but you need something to do the processing. As I said, it looks
like there are projects for doing this, but I didn't have enough time to
fully work it through.

Regards,
Philip

--
Philip Hazel University of Cambridge Computing Service,
ph10@cus.cam.ac.uk Cambridge, England. Phone: +44 1223 334714.
Re: Standardizing the Exim documentation
Moin!

Philip Hazel <ph10@cus.cam.ac.uk> writes:

> On Wed, 3 Nov 2004, Christoph Kliemt wrote:
>
>> Since XML DocBook is a moving target, I am afraid that a lot of
>> superfluous work will have to be done just to keep up with current versions.
>
> What is moving? In my experiments, what was generated by Asciidoc seemed
> to be the same as the markup described in the 1999 O'Reilly DocBook
> book.

Hmm... we should agree on what we mean by "DocBook". What came to my mind
was this: http://www.docbook.org/xml/index.html

[...]

>> Edit the xml-source with a "validating" xml-Editor. I use an sgml-mode
>> in xemacs.
>
> I don't use emacs, and I know I'm not the only person in the world who
> doesn't. :-)

Yes, I know there are a lot of people out there who have not found the way
to the light... ;-)

[...]

>> Cocoon is worth a look: http://cocoon.apache.org/
>
> I'll take a look.

Maybe Cocoon is a bit too net-centric.

http://xml.apache.org/

[...]

>> Texinfo: if there is an XML master file, I do not see why there can't be
>> stylesheets that do that transformation. I did not take a closer look
>> at this, but Google delivered 940 hits for "xslt stylesheet texinfo".
>
> Sure, but you need something to do the processing. As I said, it looks
> like there are projects for doing this, but I didn't have enough time to
> fully work it through.

If we use standard-conforming stylesheets, any XSLT processor will do the
job (I often use Xalan from the Apache project).

Regards, Christoph
Re: Standardizing the Exim documentation
On Thu, 4 Nov 2004, Christoph Kliemt wrote:

> > Sure, but you need something to do the processing. As I said, it looks
> > like there are projects for doing this, but I didn't have enough time to
> > fully work it through.
>
> If we use standard-conforming stylesheets, any XSLT processor will do the
> job (I often use Xalan from the Apache project).

If you say so! (I am no expert.) It's just that I know how fiddly it is
to generate Texinfo from the current source because of the way a
properly hyperlinked Texinfo source has to be, so I'm just a bit
skeptical. (You can't generate it in a single pass, for instance. At
least, I found I had to read ahead to the next chapter so that I could
use its "node name" in the link at the start of this chapter. Of course
this was years ago; maybe Texinfo itself has changed in the meantime.)
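
To illustrate the read-ahead problem described above, here is a minimal Python sketch. It assumes the explicit Texinfo pointer form "@node name, next, previous, up"; the function name, the chapter names and the "Top" parent are hypothetical, and this is not the code used for the Exim documentation.

```python
from typing import List


def node_lines(chapters: List[str], top: str = "Top") -> List[str]:
    """Build one '@node name, next, previous, up' line per chapter.

    All chapter names must be known before the first line is emitted,
    because each line names the chapter that follows it.
    """
    lines = []
    for i, name in enumerate(chapters):
        # "next" refers to a chapter that has not been written yet.
        nxt = chapters[i + 1] if i + 1 < len(chapters) else ""
        # The first chapter points back at the top node.
        prev = chapters[i - 1] if i > 0 else top
        lines.append(f"@node {name}, {nxt}, {prev}, {top}")
    return lines
```

Collecting the chapter names in a first pass and emitting the node lines in a second is one way round the single-pass limitation; whether current Texinfo tools still need explicit pointers is a separate question.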

Anyway, it doesn't really matter. There probably will be a way of doing
it.

Regards,
Philip

--
Philip Hazel University of Cambridge Computing Service,
ph10@cus.cam.ac.uk Cambridge, England. Phone: +44 1223 334714.
Re: Standardizing the Exim documentation
I'm coming into this feeling that I need to be convinced that we
actually have to change the base format for the main spec documents.

It's a significant time since I used DocBook, but I did do all my
documentation sets in it for a few months (because it seemed to be a
better long term bet than LaTeX at the time). However I dropped it and
translated documents out of it after that period because:-
* I just don't enjoy pain that much
* The output quality was horrid (Philip mentions the typographical
quality)
* LaTeX hasn't died yet

I would be unhappy about moving the current format into a less rich
environment, and if we lose information about the content function (ie
by having to lose differentiation of options, flags, addresses etc in
the source) I would consider that to be a retrograde step. An
alternative I guess would be to have our own DTD with bells and whistles
as required - and a cvs (or other version system) which enforced
validation before checkin.
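
A check of the kind suggested above could start out as something like the following Python sketch. Note the assumptions: it only tests XML well-formedness using the standard library, which is weaker than validating against a DTD (that would need an external validating parser), and the function names and the way a version-control hook would call them are hypothetical.

```python
import sys
import xml.etree.ElementTree as ET
from typing import Iterable, Optional


def well_formedness_error(data: bytes) -> Optional[str]:
    """Return a parser error message, or None if the XML is well-formed."""
    try:
        ET.fromstring(data)
    except ET.ParseError as exc:
        return str(exc)
    return None


def check_files(paths: Iterable[str]) -> int:
    """Check each file; return 1 if any file fails, otherwise 0."""
    status = 0
    for path in paths:
        with open(path, "rb") as fh:
            error = well_formedness_error(fh.read())
        if error is not None:
            # Report the problem so the check-in can be refused.
            print(f"{path}: {error}", file=sys.stderr)
            status = 1
    return status
```

A hook script could finish with `sys.exit(check_files(sys.argv[1:]))` so that a non-zero exit status blocks the check-in.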

In terms of what we need to get out, we obviously need:-
* Printable format - with right (ie ISO) and wrong (US Letter)
format outputs. This pretty much means either PS or probably
better now PDF (ideally with indexing).
* On-line readable form - ie HTML or the various related things.
We should also be using id and class tagging on HTML so the
formatting can be made appropriate using CSS etc.
* ASCII.

I don't know how much texinfo is really needed - I think I asked for it
in the early days because it was a good intermediate format that I could
use to generate structured HTML from.

However I could well be convinced at present to stay with SGCAL format
and write stuff to spit out docbook/xml or some other good intermediate
format.

As for diagrams, SVG is good, although the tools I have used for it up
to now have not been as good as I would like, but they are improving
reasonably quickly and it's becoming a standard output format for several
other tools both free and otherwise.

It would also be worth considering having dot and the other graphviz
formats used for some classes of diagrams. See
http://www.graphviz.org/

I do not see wiki stuff as currently being a good place to put the main
documents. However I am hoping they can contribute to the document
development. Maybe the way here is to have a wiki output format which
we spit new releases into, and then monitor changes to those trees to
see how people are updating those documents. That would require some
work to make it work well. This is not unlike the postgresql
documentation which has all pages in a form where comments can be added
to the bottom.

Nigel.

--
[ Nigel Metheringham Nigel.Metheringham@InTechnology.co.uk ]
[. - Comments in this message are my own and not ITO opinion/policy - ]
Re: Standardizing the Exim documentation
Nigel Metheringham wrote:

> However I could well be convinced at present to stay with SGCAL format
> and write stuff to spit out docbook/xml or some other good intermediate
> format.

The question is, what does SGCAL look like? Is it really that complex?

Nico
Re: Standardizing the Exim documentation
On Thu, 2004-11-04 at 14:18 +0100, Nico Erfurth wrote:
> Nigel Metheringham wrote:
>
> > However I could well be convinced at present to stay with SGCAL format
> > and write stuff to spit out docbook/xml or some other good intermediate
> > format.
>
> The question is, what does SGCAL look like? Is it really that complex?

Looks like something between RUNOFF/RUNOFQ (which is where I started
out) and troff to me.

I've put a shortish chunk in following on from the next line. Lines may
have additionally wrapped.

Nigel.

.section Message identification
.rset SECTmessiden "~~chapter.~~section"
.index message||ids, details of format
.index format||of message id
.index id of message
.index base62
.index base36
.index Darwin
.index Cygwin
Every message handled by Exim is given a \*message id*\ which is sixteen
characters long. It is divided into three parts, separated by hyphens, for
example \"16VDhn-0001bo-D3"\. Each part is a sequence of letters and digits,
normally encoding numbers in base 62. However, in the Darwin operating
system (Mac OS X) and when Exim is compiled to run under Cygwin, base 36
(avoiding the use of lower case letters) is used instead, because the message
id is used to construct file names, and the names of files in those systems are
not case-sensitive.

.index pid (process id)||re-use of
The detail of the contents of the message id have changed as Exim has evolved.
Earlier versions relied on the operating system not re-using a process id (pid)
within one second. On modern operating systems, this assumption can no longer
be made, so the algorithm had to be changed. To retain backward compatibility,
the format of the message id was retained, which is why the following rules are
somewhat eccentric:
.numberpars $.
The first six characters of the message id are the time at which the message
started to be received, to a granularity of one second. That is, this field
contains the number of seconds since the start of the epoch (the normal Unix
way of representing the date and time of day).
.nextp
After the first hyphen, the next six characters are the id of the process that
received the message.
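
The two fields described in the excerpt above can be decoded with a few lines of Python. This is a sketch under stated assumptions: the digit ordering (0-9, then A-Z, then a-z) is assumed rather than taken from the excerpt, the function names are hypothetical, and the third field is returned undecoded because the excerpt stops before describing it.

```python
from typing import Tuple

# Assumed digit ordering: 0-9, A-Z, a-z. The first 36 characters double
# as the base 36 alphabet that avoids lower case letters.
BASE62_DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"


def decode_field(field: str, base: int = 62) -> int:
    """Decode one message id field as a number in the given base."""
    digits = BASE62_DIGITS[:base]
    value = 0
    for ch in field:
        # str.index raises ValueError for characters outside the alphabet.
        value = value * base + digits.index(ch)
    return value


def split_message_id(message_id: str, base: int = 62) -> Tuple[int, int, str]:
    """Return (seconds since the epoch, process id, undecoded third part)."""
    time_part, pid_part, last_part = message_id.split("-")
    return decode_field(time_part, base), decode_field(pid_part, base), last_part
```

Under that assumed ordering, the example id "16VDhn-0001bo-D3" decodes to 1012231703 seconds since the epoch and process id 6188. Passing base=36 covers the Darwin and Cygwin case.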



--
[ Nigel Metheringham Nigel.Metheringham@InTechnology.co.uk ]
[. - Comments in this message are my own and not ITO opinion/policy - ]
Re: Standardizing the Exim documentation
On Thu, 2004-11-04 at 13:43 +0000, Yann Golanski wrote:
> Quoth Nigel Metheringham on Thu, Nov 04, 2004 at 13:09:41 +0000
> > * LaTeX hasn't died yet
>
> Far from it... I was wondering if it would not be a good idea to typeset
> the Exim docs in LaTeX. It can output to both HTML and PDF so that
> should cover most users.

If we used LaTeX it would have to be LaTeX with an extra macro set on
top to handle distinguishing between options, addresses, odds etc.

However LaTeX -> HTML is one of those places where demons lurk - I
haven't seen output from a converter of large documents that I really
liked. It actually has no advantage over texinfo for this.

Nigel.

--
[ Nigel Metheringham Nigel.Metheringham@InTechnology.co.uk ]
[. - Comments in this message are my own and not ITO opinion/policy - ]
Re: Standardizing the Exim documentation
On Thu, 4 Nov 2004, Nigel Metheringham wrote:

> I don't know how much texinfo is really needed - I think I asked for it
> in the early days because it was a good intermediate format that I could
> use to generate structured HTML from.

<namedrop>
Actually, it was RMS that originally asked for it.
</namedrop>

Philip

--
Philip Hazel University of Cambridge Computing Service,
ph10@cus.cam.ac.uk Cambridge, England. Phone: +44 1223 334714.
Re: Standardizing the Exim documentation
On Thu, 4 Nov 2004, Nigel Metheringham wrote:

> I've put a shortish chunk in following on from the next line. Lines may
> have additionally wrapped.

Note that SGCAL input is very soft. The markup used for Exim is in a
private "style" - the equivalent of a private DTD if you like. For
example:

> Every message handled by Exim is given a \*message id*\ which is sixteen
> characters long. It is divided into three parts, separated by hyphens, for
> example \"16VDhn-0001bo-D3"\.

I have chosen to use \*....*\ to mean "emphasized text" and \"...."\ to
mean "fixed pitch font or put in quotes (depending on output format)".
The SGCAL header file for the Exim documentation contains definitions as
to what to do with this markup when the document is processed. For HTML
and PS output, the first of those becomes italic and the second a
fixed-pitch font; for text output, the first is ignored and the second
becomes quoted.
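
The per-format treatment of these two markers can be sketched in Python as below. This is only an illustration, not how SGCAL or its header file works: the regular expressions, the function name and the choice of <i> and <tt> as the HTML tags are assumptions.

```python
import html
import re

# \*....*\ marks emphasized text; \"...."\ marks fixed-pitch or quoted text.
EMPHASIS = re.compile(r'\\\*(.+?)\*\\')
FIXED = re.compile(r'\\"(.+?)"\\')


def render_inline(text: str, output: str) -> str:
    """Render the two inline markers for 'html' or 'text' output."""
    if output == "html":
        # Escape &, < and > first; quotes are kept so the markers still match.
        text = html.escape(text, quote=False)
        text = EMPHASIS.sub(r"<i>\1</i>", text)
        return FIXED.sub(r"<tt>\1</tt>", text)
    if output == "text":
        # Emphasis is ignored in plain text; the second marker becomes quotes.
        text = EMPHASIS.sub(r"\1", text)
        return FIXED.sub(r'"\1"', text)
    raise ValueError(f"unknown output format: {output}")
```

Keeping the meaning of each marker in one place, separate from the document text, is what lets the same source produce different results per output format.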

> .index pid (process id)||re-use of

Likewise, the use of || to separate primary and secondary index terms is
my choice. SGCAL just spits out the index information; I have a Perl
script that processes it (into further SGCAL input).
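
For illustration, splitting such an index line into its primary and secondary terms might look like this in Python. The function name is hypothetical and this is not the Perl script mentioned above, just a sketch of the "||" convention.

```python
from typing import Optional, Tuple


def parse_index_line(line: str) -> Tuple[str, Optional[str]]:
    """Split '.index primary||secondary' into (primary, secondary or None)."""
    directive, _, terms = line.partition(" ")
    if directive != ".index":
        raise ValueError(f"not an index line: {line!r}")
    # Everything before the first "||" is the primary term.
    primary, sep, secondary = terms.partition("||")
    return primary.strip(), (secondary.strip() if sep else None)
```

A line without "||", such as ".index base62", yields a primary term only.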

If people want to take a serious look at SGCAL, I can make a tarball
available. It is, however, quite old code now, and I was not expecting
it to be developed further.

Philip

--
Philip Hazel University of Cambridge Computing Service,
ph10@cus.cam.ac.uk Cambridge, England. Phone: +44 1223 334714.
Re: Standardizing the Exim documentation
On Thu, 4 Nov 2004, Nigel Metheringham wrote:

> As for diagrams, SVG is good,

I'm planning to take a look. It might be possible to configure Aspic to
produce SVG output, which would be yet another useful conversion.

Philip

--
Philip Hazel University of Cambridge Computing Service,
ph10@cus.cam.ac.uk Cambridge, England. Phone: +44 1223 334714.