DelimitedBoostTokenFilterFactory Issue - Boosting and StandardTokenizerFactory
Hi there,

I’m developing a custom Java application with Lucene 8.5.0.

I've tried to use DelimitedBoostTokenFilterFactory, but I've run into a
problem. Please let me know if I'm doing something wrong.

I’m using StandardAnalyzer for search, and my SynonymGraphFilter is
configured as follows:

Map<String, String> synonymParam = new HashMap<>();
synonymParam.put("synonyms", synonymFileName);
synonymParam.put("ignoreCase", "true");
synonymParam.put("format", "solr");
synonymParam.put("expand", "true");
synonymParam.put("tokenizerFactory", "org.apache.lucene.analysis.core.StandardTokenizerFactory");

Map<String, String> delimitedBoostTokenFilterMap = new HashMap<>();
delimitedBoostTokenFilterMap.put("delimiter", "|");

Analyzer customAnalyzer = CustomAnalyzer.builder(Paths.get(synonymFolder))
    .withTokenizer(StandardTokenizerFactory.NAME)
    .addTokenFilter(SynonymGraphFilterFactory.NAME, synonymParam)
    .addTokenFilter(DelimitedBoostTokenFilterFactory.NAME, delimitedBoostTokenFilterMap)
    .build();


Here’s my debug output. The boost values (0.8, 0.7, 0.6) show up as
separate terms instead of being applied as boosts:

Query: +spanOr([spanNear([morphology_term_original_name:tumor,
morphology_term_original_name:0.8], 0, true),
spanNear([morphology_term_original_name:neoplasm,
morphology_term_original_name:0.7], 0, true),
spanNear([morphology_term_original_name:tumour,
morphology_term_original_name:0.6], 0, true)])
(spanOr([spanNear([morphology_term_pathognomonic:tumor,
morphology_term_pathognomonic:0.8], 0, true),
spanNear([morphology_term_pathognomonic:neoplasm,
morphology_term_pathognomonic:0.7], 0, true)
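
For illustration, here is a minimal sketch in plain Java only (no Lucene calls; DelimitedBoostSketch, BoostedToken, applyDelimiter and tokenizerSplitsFirst are hypothetical names). It assumes the boost filter splits a token such as tumor|0.8 at the delimiter into a term and a boost. It contrasts that with the case the debug output above seems to show, where the text is already broken at "|" before the filter sees it. This is a guess at the behaviour, not a description of Lucene's implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class DelimitedBoostSketch {

    /** Hypothetical stand-in for a token that carries a boost. */
    static final class BoostedToken {
        final String term;
        final float boost;

        BoostedToken(String term, float boost) {
            this.term = term;
            this.boost = boost;
        }

        @Override
        public String toString() {
            return term + "^" + boost;
        }
    }

    /**
     * Assumed delimiter handling: "tumor|0.8" becomes term "tumor" with boost 0.8.
     * A token without the delimiter keeps a neutral boost of 1.0.
     */
    static BoostedToken applyDelimiter(String token, char delimiter) {
        int idx = token.indexOf(delimiter);
        if (idx < 0) {
            return new BoostedToken(token, 1.0f);
        }
        String term = token.substring(0, idx);
        float boost = Float.parseFloat(token.substring(idx + 1));
        return new BoostedToken(term, boost);
    }

    /**
     * Crude stand-in for a tokenizer that treats the delimiter as a token break.
     * Each piece reaches the boost step without a delimiter, so no boost is parsed.
     */
    static List<BoostedToken> tokenizerSplitsFirst(String text, char delimiter) {
        List<BoostedToken> out = new ArrayList<>();
        for (String piece : text.split(Pattern.quote(String.valueOf(delimiter)))) {
            out.add(applyDelimiter(piece, delimiter));
        }
        return out;
    }

    public static void main(String[] args) {
        // Token arrives intact: the boost is recovered.
        System.out.println(applyDelimiter("tumor|0.8", '|'));       // tumor^0.8
        // Token was split at '|' beforehand: "0.8" becomes its own term.
        System.out.println(tokenizerSplitsFirst("tumor|0.8", '|')); // [tumor^1.0, 0.8^1.0]
    }
}
```

The second call gives two separate terms, tumor and 0.8, which looks like the tumor / 0.8 pairs in the spanNear clauses above.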

Thanks in advance!

Ivana