Mailing List Archive

Re: [VOTE] Update our documentation building tool chain
There are also these discussions at
https://lists.apache.org/list.html?docs@httpd.apache.org:2018-9 :

"HTML entities"

and

"[VOTE] Update our documentation building tool chain"

I then migrated the French doc to UTF-8 without a problem and could
forget about HTML entities.



On 21/02/2020 at 17:26, Mike Rumph wrote:
> Found a bug report related to this which was last changed on 2018-08-06:
> Bug 57878 - Using UTF-8 for all languages, and avoiding html-entities.
> - https://bz.apache.org/bugzilla/show_bug.cgi?id=57878
>
> Mike
>
> On Thu, Feb 20, 2020 at 2:23 PM André Malo <nd@perlig.de> wrote:
>
> Rich Bowen wrote:
> > On 1/9/20 2:14 AM, André Malo wrote:
> > > If you'd ask me, I'd rather kill XML/XSLT entirely, as the promises
> > > both for Java and XML/XSLT clearly don't hold anymore. However this
> > > also requires time and effort...
> >
> > Revisiting a thread that somewhat died out ... I would also be very
> > much in favor of moving away from XML/Docbook, and towards something
> > Markdown-based.
>
> Markdown is not powerful enough (by design, but well). I'd suggest
> restructured text instead. It needs some getting used to, but it
> should be able to map all features we need and/or want.
>
> Cheers,
> --
> "Solides und umfangreiches Buch"
>                                           -- aus einer Rezension
>
> <http://pub.perlig.de/books.html#apache2>
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: docs-unsubscribe@httpd.apache.org
> For additional commands, e-mail: docs-help@httpd.apache.org
>
Re: [VOTE] Update our documentation building tool chain
On 20/02/2020 at 23:23, André Malo wrote:
> Rich Bowen wrote:
>> On 1/9/20 2:14 AM, André Malo wrote:
>>> If you'd ask me, I'd rather kill XML/XSLT entirely, as the promises both
>>> for Java and XML/XSLT clearly don't hold anymore. However this also
>>> requires time and effort...
>> Revisiting a thread that somewhat died out ... I would also be very much
>> in favor of moving away from XML/Docbook, and towards something
>> Markdown-based.
> Markdown is not powerful enough (by design, but well). I'd suggest
> restructured text instead. It needs some getting used to, but it should be
> able to map all features we need and/or want.
>
> Cheers,

Hi,

I also had a look at RST, but IMHO some functionalities available
with our current framework are not possible with RST.

Does RST, or an RST->HTML generator, have the following functionalities?
   - sorting: our directives are sorted in the generated files
   - "theming": I like the way we display links to modules and
directives with different CSS. Can the same be achieved with RST? (not
that important anyway, just a matter of taste)
   - "syntax checking": XML and the underlying DTD have the big advantage
of checking that the input file is correct (e.g. all fields associated
with the description of a directive are there, and in the correct order).
Can the same "checks" be achieved one way or another with RST?
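
To make the "sorting" and "syntax checking" points concrete, here is a
minimal Python sketch of what such checks could look like outside of a
DTD. Everything in it is an assumption for illustration: the field names
in REQUIRED_FIELDS, the ":field: value" layout and the function names are
hypothetical, and no existing RST tool is claimed to work this way.

```python
import re

# Hypothetical field names and order; the real list would come from the DTD.
REQUIRED_FIELDS = ("name", "description", "syntax", "default", "context")

# Matches a field marker such as ":name:" at the start of a line.
FIELD_RE = re.compile(r"^:(\w+):", re.MULTILINE)


def check_directive(block: str) -> list[str]:
    """Return the problems found in one directive description."""
    found = FIELD_RE.findall(block)
    problems = [f"missing field: {f}" for f in REQUIRED_FIELDS if f not in found]
    # Compare the order actually used with the required order.
    # A duplicated field is also reported as an ordering problem.
    known = [f for f in found if f in REQUIRED_FIELDS]
    expected = [f for f in REQUIRED_FIELDS if f in known]
    if known != expected:
        problems.append("fields out of order: " + ", ".join(known))
    return problems


def sort_directive_names(names: list[str]) -> list[str]:
    """Sort directive names case-insensitively for the generated pages."""
    return sorted(names, key=str.lower)
```

Such a check would have to run as a separate step before the generator,
which is less convenient than validation built into the parser.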


What generator do you have in mind? Sphinx?

A good point I see is the ability to use gettext with rst files. This
would change our translation process a lot, but would provide a much
easier one, IMHO.

CJ


Re: [VOTE] Update our documentation building tool chain
On Sat, Sep 29, 2018 at 3:03 AM Christophe JAILLET
<christophe.jaillet@wanadoo.fr> wrote:
>
> Hi,
>
> There is a bug in Xalan, the XSLT 1.0 engine we are using, that
> prevents our doc building tool chain from generating correct documentation.
>
> By not correct, I mean:
> - ISO-8859-1 non ASCII characters are replaced again by their HTML
> entities equivalent (in the French doc for example)
> - breaks the man pages generation
>
> A possible workaround is to use UTF-8 instead of ISO-8859-1.
> The only drawback I see is that some html files will be slightly bigger
> and a bit less readable because of the use of entities. This is not
> that big a problem.
>

I feel foolish but I just realized a few things that should have been obvious:

- you have to run `extraclean` to reliably see results when you change anything
- the churn that bites me, and recently Mike, comes from links in
languages we choose to use non-utf8 for that point to languages we
choose to use utf-8 for
-- This means the "java11" behavior is more correct

Are there any known breaks for using utf-8 for English? Can we take
that baby step safely and add something to the docs-build build to
verify some key output files look correct? Maybe blocking old java
with e.g. -source that will blow up?
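
As a rough idea of what such a verification step could be, here is a
minimal Python sketch. It is only an assumption of how a check might
look, not something that exists in docs-build: the function name
check_output and the ALLOWED set are hypothetical. It reports output
that is not valid UTF-8, and named entities other than the XML built-ins.

```python
import re

# Named entities such as &eacute; stand in for non-ASCII characters.
# The five XML built-ins are expected in HTML output and are ignored.
# This set is an assumption and may need tuning (e.g. for &nbsp;).
ALLOWED = {"amp", "lt", "gt", "quot", "apos"}
ENTITY_RE = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def check_output(data: bytes) -> list[str]:
    """Return the problems found in the raw bytes of one generated file."""
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError as exc:
        return [f"not valid UTF-8: {exc.reason} at byte {exc.start}"]
    unexpected = sorted(set(ENTITY_RE.findall(text)) - ALLOWED)
    return [f"unexpected entity: &{name};" for name in unexpected]
```

It could be called on a few key files, read in binary mode, at the end
of the build, failing the build when the returned list is not empty.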

Re: [VOTE] Update our documentation building tool chain
On 30/09/2018 at 17:52, Lucien Gentis wrote:
>
>
> On 29/09/2018 at 15:42, Lucien Gentis wrote:
>>
>>
>> On 29/09/2018 at 09:03, Christophe JAILLET wrote:
>>> Hi,
>>>
>>> There is a bug in Xalan, the XSLT 1.0 engine we are using, that
>>> prevents our doc building tool chain from generating correct documentation.
>>>
>>> By not correct, I mean:
>>>    - ISO-8859-1 non ASCII characters are replaced again by their
>>> HTML entities equivalent (in the French doc for example)
>>>    - breaks the man pages generation
>>>
>>> A possible workaround is to use UTF-8 instead of ISO-8859-1.
>>> The only drawback I see is that some html files will be slightly
>>> bigger and a bit less readable because of the use of entities. This
>>> is not that big a problem.
>>>
>>> However, Xalan has looked mostly unmaintained for about 10 years.
>>> Xalan-Java 2.7.1, the one we are using, was released in November
>>> 2007. The latest release, Xalan-Java 2.7.2, was released in April
>>> 2014. This is mainly a maintenance release which fixes a CVE and a
>>> few bugs.
>>>
>>>
>>> As per Xalan documentation, the JDK or JRE 1.3.x (2000), 1.4.x
>>> (2002), or 5.x (2004) is required.
>>> As per our doc build documentation, we need at least Java 1.2 (1998)
>>> to build the doc.
>>> Java 8 (2014) is LTS and is supported until 2025
>>> Java 9 (2017)
>>> Java 10 (2018)
>>> Java 11 (2018) is apparently LTS
>>>
>>>
>>> Another XSLT engine could also be used. I've tried the latest Saxon
>>> 9.8.0.14 (well, a 9.9.0.1 was released 2 days ago, but there are no
>>> details yet).
>>> My first results are:
>>>    - a "build.sh all" takes ~3 min, instead of ~2 min with Xalan
>>>    - generated files look just fine
>>>    - it removes some whitespace-only XML nodes. The generated code is
>>>      slightly smaller.
>>>      The generated code could be slightly less readable in some
>>>      cases. I've not seen any issue in the rendering of the pages
>>>      without these few spaces
>>>    - Saxon is XSLT 3.0. This could be used to simplify our xsl files.
>>>      However, I've looked at the 2.0 and 3.0 changes, and I'm not
>>>      sure we could make real use of them. Maybe dynamic XPath, more
>>>      built-in functions available or Text Value Templates?
>>>      Not sure either that we need to upgrade the rules at all. It
>>>      already works great.
>>>    - this is a drop-in replacement. We just need to replace 2 jar files
>>>      with a new one. That's all
>>>    - We only need to change a <func:function to a <xsl:function and
>>>      the doc builds out of the box
>>>    - as per the Saxon doc, it requires Java 5+, 6+ or 8+ depending on
>>>      the version we take
>>>
>>> So, now it is your turn to give your opinion about it:
>>>
>>> Do we need to change something?
>>> ==============================
>>> [ ] this mail is too long, do whatever you want, I just want
>>> something that works
>>> [ ] no. I can live with the current tool chain
>>> [x] yes. Let's clean some dust and update what is needed
>>>
>>>
>>> What version of XSLT is best for us?
>>> ===================================
>>>
>>> [ ] 1.0 - this is what I'm used to, keep things stable
>>> [ ] 2.0
>>> [x] 3.0 - the later the better, and/or the new functionalities rock!
>>>
>>>
>>> Should we change our XSLT engine?
>>> ================================
>>> [ ] No, I love Xalan and it is ASF. Just move to UTF-8 everywhere.
>>> [x] Yes and Saxon is a good candidate. The license of the Home Edition
>>>     is Mozilla Public License version 2.0.
>>> [ ] Yes and ______ should be used instead
>>>
>>>
>>> What is the oldest version of Java we should support?
>>> ====================================================
>>> [ ] 1.2 - what we claim now
>>> [ ] 1.3 - what is required by Xalan 2.7.1
>>> [ ] 1.4
>>> [ ] 5.0 - what is required by Saxon 9.6
>>> [ ] 6 - what is required by Saxon 9.7 and 9.8
>>> [ ] 7
>>> [ ] 8 - what is required by the latest Saxon 9.9
>>> [ ] 9
>>> [ ] 10
>>> [ ] 11
>>>
>>>
>>> Depending on the minimum Java requirement consensus, we could also
>>> wonder if:
>>>    - we still need the jakarta-oro Regex parser (ASF, but retired since
>>>      2010-09-01). Regex support in Java has been considered stable for
>>>      a long time now
>>>      [ ] keep it
>>>      [ ] Axe it
>>>
>>>    - we need to upgrade Ant. (Latest is 1.10.5. Ant 1.9.*: JDK 1.5+,
>>>      Ant 1.8.*: JDK 1.4+, Ant 1.7.*: JDK 1.3+, Ant 1.6.*: JDK 1.2+)
>>>      [ ] keep 1.6.5, we don't need to change
>>>      [ ] 1.9.x, recent enough, still maintained, but not the latest.
>>>          Should be the most stable
>>>      [ ] 1.10.x, the later the better, and/or the new
>>>          functionalities rock!
>>>
>>>    - any other topic?
>> In addition to the XSLT version and XSLT engine updates, why not use
>> UTF-8 instead of ISO-8859-1?
>> It works for all languages, while ISO-8859-1 only covers Western
>> European languages.
>>
>> UTF-8 html files are just a bit longer than ISO-8859-1 ones: for
>> example, bind.html.fr is 1% longer in UTF-8 than in ISO-8859-1.
>> About readability, there's for me no difference between UTF-8 and
>> ISO-8859-1 files.
>> Last, it solves the two problems:
>> --- man pages are correctly encoded
>> --- html files are generated without HTML entities.
>>
>> I'll try this weekend to rebuild the French doc in UTF-8
>
> The French doc (trunk) is now in UTF-8
>
> I replaced all occurrences of "ISO-8859-1" with "UTF-8" in the
> following files:
> manual/style/manual.fr.xsl
> manual/style/xsl/util/designations.xml (only for fr line)
> manual/style/lang/fr.xml
>
> Feel free to give any feedback
>
> Lucien

Hi,

For information, I've updated the "en" files accordingly to switch to
UTF-8 on trunk. (see r1878788)

Looks good to me, both for html files and man pages.

Tested with OpenJDK 11.0.7
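
For anyone who wants to estimate the size overhead mentioned in the
quoted message for another file, a minimal Python sketch follows. The
function names are hypothetical, and it only converts bytes: any
encoding declared inside the file still has to be updated separately.

```python
def reencode(data: bytes, old: str = "iso-8859-1", new: str = "utf-8") -> bytes:
    """Decode bytes from the old charset and encode them in the new one."""
    return data.decode(old).encode(new)


def growth_percent(data: bytes) -> float:
    """Size increase, in percent, when ISO-8859-1 data is stored as UTF-8."""
    if not data:
        return 0.0
    return 100.0 * (len(reencode(data)) - len(data)) / len(data)
```

Only non-ASCII characters grow (from one byte to two), so the overhead
depends on how many accented characters the file contains.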

CJ

