Mailing List Archive

Should non-English docs be indexed as UTF-8?
Hi!
I have some XML documents encoded as windows-1251 (Russian). Should they be
converted to UTF-8 before they are indexed? Or can I index them as
windows-1251 and just encode the query (search) string as UTF-8?
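
To make the two options concrete, here is a minimal sketch in plain Java. It
uses only the JDK, not any Lucene API, and the class and method names
(EncodingSketch, decode, toUtf8) are made up for illustration. It assumes the
text is decoded into a Java String before it is handed to the indexer; under
that assumption the byte encoding of the file matters only at the decoding
step, and the same word decoded from either encoding gives an equal String.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingSketch {
    // Decode raw bytes into a Java String using the named charset.
    static String decode(byte[] raw, String charsetName) {
        return new String(raw, Charset.forName(charsetName));
    }

    // Option 1 from the question: convert windows-1251 bytes to UTF-8 bytes.
    static byte[] toUtf8(byte[] cp1251Bytes) {
        return decode(cp1251Bytes, "windows-1251").getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The Russian word for "search", written with escapes so the source
        // file's own encoding does not affect the example.
        String word = "\u043f\u043e\u0438\u0441\u043a";

        byte[] cp1251 = word.getBytes(Charset.forName("windows-1251"));
        byte[] utf8 = toUtf8(cp1251);

        // Same text, different byte representations.
        System.out.println(cp1251.length + " bytes as windows-1251, "
                + utf8.length + " bytes as UTF-8");

        // Once decoded with the matching charset, both give the same String.
        System.out.println(decode(cp1251, "windows-1251").equals(decode(utf8, "UTF-8")));
    }
}
```

If this assumption holds, what has to match between documents and queries is
the decoded text, so either option could work as long as each byte stream is
decoded with the charset it was actually written in.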

