Mailing List Archive

How groupingSearch specifies SortedNumericDocValuesField
When I use GroupingSearch on a field indexed as SortedNumericDocValuesField,
I get an "unexpected docvalues type SORTED_NUMERIC for field 'id'
(expected=SORTED)" exception.

My code is as follows:
    String indexPath = "tmp/grouping";
    Analyzer standardAnalyzer = new StandardAnalyzer();
    Directory indexDir = FSDirectory.open(Paths.get(indexPath));
    IndexWriterConfig indexWriterConfig = new IndexWriterConfig(standardAnalyzer);
    indexWriterConfig.setOpenMode(IndexWriterConfig.OpenMode.CREATE);
    IndexWriter masterIndex = new IndexWriter(indexDir, indexWriterConfig);

    // Index four documents, each with a stored and a doc-values "id" field
    String name = "Tom";
    for (int i = 1; i < 5; i++) {
        Document doc = new Document();
        doc.add(new StringField("name", name + "_" + i, Field.Store.YES));
        doc.add(new SortedNumericDocValuesField("id", i));
        doc.add(new StoredField("id", i));
        masterIndex.addDocument(doc);
    }
    masterIndex.commit();

    IndexReader reader = DirectoryReader.open(FSDirectory.open(Paths.get(indexPath)));
    IndexSearcher searcher = new IndexSearcher(reader);

    // Group all documents by the "id" field
    GroupingSearch groupingSearch = new GroupingSearch("id");
    TopGroups topGroups = groupingSearch.search(searcher, new MatchAllDocsQuery(), 0, 100);

    System.out.println(topGroups.totalHitCount);
    reader.close();



The exception is as follows:
Exception in thread "main" java.lang.IllegalStateException: unexpected docvalues type SORTED_NUMERIC for field 'id' (expected=SORTED). Re-index with correct docvalues type.
    at org.apache.lucene.index.DocValues.checkField(DocValues.java:317)
    at org.apache.lucene.index.DocValues.getSorted(DocValues.java:369)
    at org.apache.lucene.search.grouping.TermGroupSelector.setNextReader(TermGroupSelector.java:56)
    at org.apache.lucene.search.grouping.FirstPassGroupingCollector.doSetNextReader(FirstPassGroupingCollector.java:348)
    at org.apache.lucene.search.SimpleCollector.getLeafCollector(SimpleCollector.java:33)
    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:643)
    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:443)
    at org.apache.lucene.search.grouping.GroupingSearch.groupByFieldOrFunction(GroupingSearch.java:141)
    at org.apache.lucene.search.grouping.GroupingSearch.search(GroupingSearch.java:113)


The version of Lucene I am using is 8.0.0.


Finally, I would like to know how to use GroupingSearch with each of these
three field types: NumericDocValuesField, SortedNumericDocValuesField, and
SortedSetDocValuesField.

Thank you for your attention to this matter.

Re: How groupingSearch specifies SortedNumericDocValuesField
Hi,

On Tue, May 14, 2019 at 8:28 PM ?? <liboemc@gmail.com> wrote:

> doc.add(new SortedNumericDocValuesField("id", i));
> doc.add(new StoredField("id", i));
>

Are you sure both fields should have the same name ("id")?


Re: How groupingSearch specifies SortedNumericDocValuesField
Hi,

This is a unit test. I changed the field to NumericDocValuesField and got a
similar error.

I tested NumericDocValuesField, SortedNumericDocValuesField, and
SortedSetDocValuesField, and none of the three can be used as the group field
in GroupingSearch. Does GroupingSearch only support SortedDocValuesField?



On May 15, 2019 at 1:51, Martin Grigorov <mgrigorov@apache.org> wrote:

> are you sure both fields should have the same name ("id") ?

Re: How groupingSearch specifies SortedNumericDocValuesField
Hi, I managed to retrieve the groups for a *SortedSetDocValuesField* in
*GroupingSearch* by initialising the grouping search with a
*SortedSetFieldSource*.

The problem is that when a document has multiple values in the
SortedSetDocValuesField, the grouping query does not return all
the groups.

Let me demonstrate with an example:

    // Indexing: the first document has the category "one"; the second
    // document has the categories "two" and "three".
    Document doc = new Document();
    doc.add(new FacetField("Author", "Bob"));
    doc.add(new SortedSetDocValuesField("category", new BytesRef("one")));
    indexWriter.addDocument(config.build(taxoWriter, doc));

    doc = new Document();
    doc.add(new FacetField("Author", "Lisa"));
    doc.add(new SortedSetDocValuesField("category", new BytesRef("two")));
    doc.add(new SortedSetDocValuesField("category", new BytesRef("three")));
    indexWriter.addDocument(config.build(taxoWriter, doc));

    // Initializing the grouping search
    ValueSource vs = new SortedSetFieldSource(groupField);
    groupingSearch = new GroupingSearch(vs, new HashMap<>());

    // Performing the group search
    TopGroups groups = groupingSearch.search(searcher, new MatchAllDocsQuery(), 0, 100);


It returns only 2 groups, but I would expect 3 groups ("one", "two", and
"three").

Is this a bug, or am I using the API incorrectly?
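
A possible explanation for the two groups, offered as a sketch rather than a
confirmed answer: function-based grouping derives a single group value per
document. Assuming the one-argument SortedSetFieldSource constructor selects
the minimum value of a multi-valued field (its default selector appears to be
SortedSetSelector.Type.MIN, but please verify this against the Javadoc of your
Lucene version), the second document would be grouped under "three" only,
because "three" sorts before "two". The plain-Java simulation below uses no
Lucene classes; MinValueGroupingSketch and its methods are hypothetical names
made up for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class MinValueGroupingSketch {

    // Picks one representative value per document: the smallest one.
    // ASSUMPTION: this mimics a "minimum value" selector on a multi-valued field.
    public static String minValue(List<String> values) {
        return Collections.min(values);
    }

    // Groups document ids by each document's single representative value.
    public static Map<String, List<Integer>> groupByMinValue(List<List<String>> docs) {
        Map<String, List<Integer>> groups = new LinkedHashMap<>();
        for (int docId = 0; docId < docs.size(); docId++) {
            String key = minValue(docs.get(docId));
            groups.computeIfAbsent(key, k -> new ArrayList<>()).add(docId);
        }
        return groups;
    }

    public static void main(String[] args) {
        // Same data as the example above: doc 0 has "one", doc 1 has "two" and "three".
        List<List<String>> docs = List.of(
                List.of("one"),
                List.of("two", "three"));
        System.out.println(groupByMinValue(docs)); // prints {one=[0], three=[1]}
    }
}
```

If that assumption holds, each document can belong to only one group, so two
documents can never yield three groups.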


