Mailing List Archive

please help me with Chinese content problem
hi guys,

I'm using forrest to generate a site where most of the content is in Chinese.
I started by running "forrest seed-sample" to generate a template
project, then edited index.xml in the xdocs directory to add some
Chinese content, and ran "forrest site".

When I opened index.html in my browser, it showed "????"
instead of the Chinese characters. I checked the generated index.html and
the Chinese content itself had been turned into "????", so this is not caused
by a wrong http metadata setting. I also checked index.pdf; it has
"#####" at the corresponding positions.

I also tried "forrest run". The http content served by jetty is
correct, but the problem in index.pdf is still there.

I'm confused. Could anybody tell me how I can work this out? Thanks
in advance.

-fanieste
Re: please help me with Chinese content problem
Would you please open a new issue [1] and attach your example
index.xml file? I am not into foreign languages, so I probably
cannot help, but perhaps someone else can.

[1] http://forrest.apache.org/issues.html

-David
Re: please help me with Chinese content problem
I had problems with encoding for a static site. In my case I had
English, French and Chinese translations. I finally worked out that I
need the correct environment settings before running forrest. I am using
Linux.
Below is the shell script that I run from the folder that contains the src
and build folders. (If you are building in just one language,
then just type 'export LANG=zh_CN.UTF-8', or whatever is appropriate,
before running forrest.) Hope this helps someone. My full 'build' bash
script, htaccess file, and javascript snippets are below; together they
assemble a set of files suitable for use with apache.
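
For the single-language case mentioned in parentheses above, a quick check of the locale before building might look like the sketch below. The helper name is_utf8_locale is made up for illustration and is not part of forrest; the sketch only inspects the environment variables and assumes that a ".UTF-8" style suffix is what matters.

```shell
#!/bin/sh
# Hypothetical helper (not part of forrest): succeeds when a locale string
# names a UTF-8 charset. With no argument it checks the effective locale,
# where LC_ALL overrides LC_CTYPE, which overrides LANG.
is_utf8_locale() {
    effective=${1:-${LC_ALL:-${LC_CTYPE:-${LANG:-}}}}
    case "$effective" in
        *.UTF-8*|*.utf8*|*.UTF8*|*.utf-8*) return 0 ;;
        *) return 1 ;;
    esac
}

# Report what a build started from this shell would inherit.
if is_utf8_locale; then
    echo "locale is UTF-8"
else
    echo "locale is not UTF-8; try: export LANG=zh_CN.UTF-8"
fi
```

If the check reports a non-UTF-8 locale, exporting LANG as described above and then re-running "forrest site" is the thing to try.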

# start of build script
# remove old forrest generated site
rm -rf build
# rebuild all 3 languages
LANG=en_US.UTF-8 forrest
LANG=fr_CA.UTF-8 forrest
LANG=zh_CN.UTF-8 forrest
# combine sites in the directory combined/ suitable for apache
# clean out combined/
rm -rf combined
mkdir combined
# copy all files from the forrest build directory, renaming appropriately:
# build/site/<la>/<path> becomes combined/<path>.<la>
# (the list is split on whitespace, so file names must not contain spaces)
filelist=$(find build/site -type f -printf '%p ')
for i in $filelist
do
    # path relative to build/site, e.g. en/index.html
    ii=$(echo "$i" | sed 's|build/site/||')
    # two-letter language directory, e.g. en
    la=$(echo "$ii" | sed 's|\(..\).*|\1|')
    # path without the language directory, e.g. index.html
    dest=$(echo "$ii" | sed 's|../\(.*\)|\1|')
    install -D "$i" "combined/$dest.$la"
done

# copy htaccess file
cp htaccess combined/.htaccess

# finally, if all language versions exist and are identical, merge them
# into a single non-language version
filelist=$(find combined -type f -name '*.en' -printf '%p ')
for i in $filelist
do
    # strip the trailing .en suffix only
    root=$(echo "$i" | sed 's|\(.*\)\.en$|\1|')
    if [ -f "$i" ] && [ -f "$root.fr" ] && [ -f "$root.zh" ]
    then
        cmp -s "$i" "$root.fr"
        cmpfr=$?
        cmp -s "$i" "$root.zh"
        cmpzh=$?
        if [ "$cmpfr" -eq 0 ] && [ "$cmpzh" -eq 0 ]
        then
            # all three versions are the same: keep one copy without a suffix
            rm -f "$root.fr" "$root.zh"
            mv "$i" "$root"
        fi
    fi
done
# then run sitecopy or whatever to upload "combined" folder to
# end of bash script
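
To see what the renaming in the copy loop does to a single path, the three sed calls can be wrapped in a small function and run on an example. This is only an illustration: the function name combined_name and the sample path are made up, and the sketch assumes that forrest writes each language into a two-letter directory under build/site, as the script above implies.

```shell
#!/bin/sh
# Hypothetical helper mirroring the three sed calls in the copy loop.
combined_name() {
    # path relative to build/site, e.g. zh/docs/index.html
    rel=$(printf '%s\n' "$1" | sed 's|build/site/||')
    # first two characters: the language directory
    la=$(printf '%s\n' "$rel" | sed 's|\(..\).*|\1|')
    # everything after the language directory
    dest=$(printf '%s\n' "$rel" | sed 's|../\(.*\)|\1|')
    printf 'combined/%s.%s\n' "$dest" "$la"
}

combined_name build/site/zh/docs/index.html
# prints: combined/docs/index.html.zh
```

The language moves from a leading directory to a trailing file extension, which is the layout the AddLanguage directive in the htaccess file relies on.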

Here is the file which is copied to combined/.htaccess in the shell script

AddCharset UTF-8 .html .html.zh .html.fr .html.en
SetEnvIf Cookie "StellaeLang=(..)" prefer-language=$1
SetEnvIf Cookie "StellaeLang=def" !prefer-language
Header set ENV_PREFER_LANG %{prefer-language}e env=prefer-language
Header append Vary cookie
AddLanguage zh .zh
LanguagePriority en fr zh
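
The two SetEnvIf lines work as a pair: the first copies the two characters after "StellaeLang=" into prefer-language, and the second removes the variable again when the cookie value is "def" (which the first rule would otherwise read as "de"). The shell function below is a rough emulation of that logic for illustration only. It is not how apache evaluates the rules internally, and the name prefer_language is invented here.

```shell
#!/bin/sh
# Hypothetical emulation of the two SetEnvIf rules. Prints the resulting
# prefer-language value, or an empty line when the variable ends up unset.
prefer_language() {
    cookie=$1
    case "$cookie" in
        *StellaeLang=def*)
            # second rule: "def" unsets prefer-language
            lang= ;;
        *StellaeLang=??*)
            # first rule: take the two characters after "StellaeLang="
            rest=${cookie#*StellaeLang=}
            lang=$(printf '%s' "$rest" | cut -c1-2) ;;
        *)
            # no StellaeLang cookie: nothing is set
            lang= ;;
    esac
    printf '%s\n' "$lang"
}

prefer_language 'foo=1; StellaeLang=zh'   # prints: zh
prefer_language 'StellaeLang=def'         # prints an empty line
```

With the variable unset, apache falls back to its normal language negotiation and the LanguagePriority order.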

Following are the modifications to the forrest distribution.
I have modified main/webapp/skins/common/scripts/fontsize.js :

function init()
{ //embedded in the doc
    //ndeSetTextSize();
    StellaeInit();
}

function setStellaeLang(language)
{
    ndeCreateCookie('StellaeLang',language,200);
    window.location.reload();
    return false;
}

function StellaeInit()
{
    var l=ndeReadCookie('StellaeLang');
    switch (l)
    {
        case 'en' : document.getElementById('lngsel').selectedIndex = 1; break;
        case 'fr' : document.getElementById('lngsel').selectedIndex = 2; break;
        case 'zh' : document.getElementById('lngsel').selectedIndex = 3; break;
        default : document.getElementById('lngsel').selectedIndex = 0;
    }
    return false;
}


In main/webapp/skins/pelt/xslt/html/site-to-xhtml.xsl I insert the code
which will add the language selector (under the skinconf-podlink section)

<xsl:template match="div[@id='skinconf-langsel']">
  <div class="langsel trail" title="Set browser cookie for language"><i18n:text>Language</i18n:text>
    <select onchange="setStellaeLang(this.value);" size="1" id="lngsel" class="dida">
      <option value="def" selected="selected">Default</option>
      <option value="en">English</option>
      <option value="fr">Fran&#231;ais</option>
      <option value="zh">&#x4e2d;&#x6587;</option>
    </select>
  </div>
</xsl:template>

In main/webapp/skins/pelt/xslt/html/document-to-html.xsl I add the
following under the pdflink line, so that the xsl above will insert the
selector at the correct place

<div id="skinconf-langsel"/>


(Check http://stellaeboreales.ca/ for the actual site.)