Hello,
In Lucene 6, I used the following to get all distinct values of a given
field, knowing its type:
public List<Object> getDistinctValues(IndexReader reader, String fieldname,
        Class<? extends Object> type) throws IOException {
    List<Object> values = new ArrayList<Object>();
    Fields fields = MultiFields.getFields(reader);
    if (fields == null) return values;
    Terms terms = fields.terms(fieldname);
    if (terms == null) return values;
    TermsEnum iterator = terms.iterator();
    BytesRef value = iterator.next();
    while (value != null) {
        if (type == Long.class) {
            values.add(LegacyNumericUtils.prefixCodedToLong(value));
        } else if (type == Integer.class) {
            values.add(LegacyNumericUtils.prefixCodedToInt(value));
        } else if (type == Boolean.class) {
            values.add(LegacyNumericUtils.prefixCodedToInt(value) == 1 ? TRUE : FALSE);
        } else if (type == Date.class) {
            values.add(new Date(LegacyNumericUtils.prefixCodedToLong(value)));
        } else if (type == String.class) {
            values.add(value.utf8ToString());
        } else {
            // ...
        }
        value = iterator.next();
    }
    return values;
}
I am now trying to upgrade to Lucene 9.
Two things have changed since then:
- LegacyNumericUtils has been removed in favor of point fields (PointValues)
- MultiFields.getFields() has been dropped, and I read that we are encouraged
to avoid Fields in general
What is the proper way to get the distinct values for a specific
field in a reader?
Thanks for your help,
vs