
JapaneseReadingFormFilter cannot convert some hiragana to romaji
Hello,

I found a bug in JapaneseReadingFormFilter: some hiragana characters are not converted to romaji.
(For example, “ぐ” is not converted to “gu”. I noticed this when I searched for “ますきんぐ” and “マスキング” did not appear in the search results.)
I believe the cause is that some hiragana entries in the kuromoji dictionary have no explicitly defined reading.

# Proposed Solution
How about having the getRomanization function convert any hiragana it detects to katakana before romanizing?
https://github.com/apache/lucene/pull/12885
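
To illustrate the idea, here is a minimal sketch of the hiragana-to-katakana step on its own. It is not the code from the pull request. The class and method names (HiraganaToKatakana, toKatakana) are hypothetical, and it assumes only the basic hiragana letters U+3041 to U+3096, which sit exactly 0x60 below their katakana counterparts in Unicode.

```java
// Hypothetical helper for illustration only; not part of Lucene.
final class HiraganaToKatakana {
    // Assumption: only the basic hiragana letters U+3041..U+3096 are handled.
    // Each one has a katakana counterpart exactly 0x60 code points higher.
    private static final char HIRAGANA_START = 0x3041;
    private static final char HIRAGANA_END = 0x3096;
    private static final int KATAKANA_OFFSET = 0x60;

    // Returns a copy of s with hiragana letters replaced by katakana.
    // All other characters (katakana, kanji, ASCII, ...) are left unchanged.
    static String toKatakana(CharSequence s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c >= HIRAGANA_START && c <= HIRAGANA_END) {
                out.append((char) (c + KATAKANA_OFFSET));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // prints "マスキング"
        System.out.println(toKatakana("ますきんぐ"));
    }
}
```

With a step like this applied first, a katakana-based romanization routine would see “グ” instead of “ぐ” and could produce “gu” in the usual way.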

I apologize in advance if I have made any mistakes in the reporting process or procedures as this is my first time posting in this community.

--

Takuma Kuramitsu
