Mailing List Archive

encoding problem
Hi,

I have just started to use Forrest 0.8 on the Japanese version of Windows Vista.
When I render my XML pages (saved in UTF-8) with 'forrest run',
everything goes well.
But rendering those same XML pages with 'forrest site' creates
unreadable HTML files
in which the Japanese characters are encoded incorrectly.

When I saved all the XML files as 'Shift_JIS' and changed the relevant part
of sitemap.xmap
as follows,

<map:serializer name="html" mime-type="text/html"
src="org.apache.cocoon.serialization.HTMLSerializer">
<doctype-public>-//W3C//DTD HTML 4.01 Transitional//EN</doctype-public>
<doctype-system>http://www.w3.org/TR/html4/loose.dtd</doctype-system>
<encoding>Shift_JIS</encoding>
</map:serializer>

both 'forrest run' and 'forrest site' created correct HTML files
in Shift_JIS encoding.

Does anyone know what causes the difference in the rendered results
between 'forrest run' and 'forrest site' when processing UTF-8 XML files?
Thanks.

--
E Gwangho
Re: encoding problem
Gwangho E wrote:
> Hi,
>
> I have just started to use forrest 0.8 on Japanese version of Windows Vista.
> When I render my xml pages(saved in UTF-8) by 'forrest run',
> everything goes well.
> But, rendering those same xml pages by 'forrest site' creates
> unreadable HTML files
> in which Japanese characters were encoded incorrectly.
>
> When I saved all xml files in 'Shift_JIS' and changed the related line
> of the sitemap.xmap
> as follows,
>
> <map:serializer name="html" mime-type="text/html"
> src="org.apache.cocoon.serialization.HTMLSerializer">
> <doctype-public>-//W3C//DTD HTML 4.01 Transitional//EN</doctype-public>
> <doctype-system>http://www.w3.org/TR/html4/loose.dtd</doctype-system>
> <encoding>Shift_JIS</encoding>
> </map:serializer>
>
> both of 'forrest run' and 'forrest site' created correct HTML files
> in Shift_JIS encoding.
>
> Is there anyone who knows what made the difference of rendered results
> between by 'forrest run' and by 'forrest site' in processing UTF-8 xml files?
> Thanks.
>

Another solution to this (assuming you're serving the files using
Apache httpd) is to use a .htaccess file to tell the web server what
encoding you're using:

AddDefaultCharset UTF-8

That line (followed by a newline) in a file called ".htaccess" in the root
of your generated content will tell httpd that the files are UTF-8.
This assumes that httpd is configured to allow .htaccess overrides
for that directory. It will let you change the encoding that httpd
declares without regenerating your content.

cheers
stuart


--
OSS Watch: http://www.oss-watch.ac.uk/
Re: encoding problem
Thank you for your reply, Stuart.

I tested your suggestion right away, but it didn't work.
Actually, my problem is not about the content delivered by httpd,
but about the results rendered locally by Forrest.

When I create my XML source files in Shift_JIS or EUC_JP
with the corresponding encoding parameter in sitemap.xmap,
both 'forrest run' and 'forrest site' work correctly.
However, for UTF-8 encoded XML source files,
only 'forrest run' gives me correctly rendered results.
'forrest site' produces HTML files encoded in Shift_JIS
but with <META http-equiv="Content-Type" content="text/html; charset=UTF-8">
tags in those files.
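
For anyone who wants to confirm this kind of mismatch on their own output,
here is a minimal sketch in Python (not part of Forrest; the function names
are made up for illustration). It reads the charset named in the META tag and
checks whether the file's bytes are actually valid in that encoding. It
assumes the META declaration itself is plain ASCII, which holds for UTF-8,
Shift_JIS and EUC_JP.

```python
import re
import sys

# Finds "charset=..." in the raw bytes. Hypothetical helper pattern; it
# simply takes the first charset declaration it sees in the file.
META_CHARSET = re.compile(rb'charset=["\']?([A-Za-z0-9_\-]+)', re.IGNORECASE)


def declared_charset(raw):
    """Return the charset named in the first declaration, or None."""
    match = META_CHARSET.search(raw)
    return match.group(1).decode("ascii") if match else None


def decodes_as(raw, encoding):
    """Return True if the bytes are valid in the given encoding."""
    try:
        raw.decode(encoding)
        return True
    except (UnicodeDecodeError, LookupError):
        # LookupError covers charset names Python does not know.
        return False


def check_page(raw):
    """Compare the declared charset with what the bytes really are."""
    declared = declared_charset(raw)
    if declared is None:
        return "no charset declared"
    if decodes_as(raw, declared):
        return "ok: bytes are valid " + declared
    return "mismatch: declared " + declared + ", but bytes are not valid " + declared


if __name__ == "__main__":
    # Pass the generated HTML files to check as command-line arguments.
    for path in sys.argv[1:]:
        with open(path, "rb") as handle:
            print(path, "->", check_page(handle.read()))
```

Running it over the HTML files written by 'forrest site' should report a
mismatch for pages like the ones described above. Note that it only detects
bytes that are invalid in the declared encoding; it does not identify which
encoding was really used.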

--
E Gwangho
Re: encoding problem
It seems that I have found a clue to this problem.
If there is a broken link when rendering with 'forrest site',
so that the build ends with 'BUILD FAILED' and the error message
"There appears to be a problem with your site build",
then I get correctly UTF-8 encoded HTML files including
the Japanese characters.
But I don't know why...

--
E Gwangho
Re: encoding problem
On Tue, 2007-11-20 at 13:20 +0900, Gwangho E wrote:
> It seems that I found a clue to solve this problem.
> If there is a broken link in the process of rendering by 'forrest site',
> so the 'BUILD FAILED' with the error message
> "There appears to be a problem with your site build",
> then I got correctly UTF-8 encoded html files including
> Japanese characters.
> But I don't know why...

Hmm, I'm not sure whether I understand.

When you got the broken link warning, did you have (or not have) the
correct encoding?

salu2

--
Thorsten Scherler thorsten.at.apache.org
Open Source Java consulting, training and solutions
Re: encoding problem
Hi,

> When you got the broken link warning, then you have (or have not) the
> correct encoding?

When I got the broken link warning, the generated HTML pages were encoded
correctly.

--
E Gwangho