Hello, I am trying to understand the requirements for using the index-time join properly. In my use case, I am modeling a 1-N relationship where a parent document can have 0-N child documents. For now I am keeping my data very simple: each child has a single field. So my data currently looks like this:
Parent Doc      Children
--------------------------------------
id=id00000      none
id=id00001      program=P1
id=id00002      program=P1
                program=P2
id=id00003      none
id=id00004      program=P1
id=id00005      program=P1
                program=P2
So essentially I have 6 parent docs, doc 0 has no children, doc 1 has 1 child, doc 2 has 2 children, etc.
Certain queries are giving me incorrect results. For example:
BitSetProducer parentSet = new QueryBitSetProducer(new TermQuery(new Term("id", "id00003")));
Query q = new ToParentBlockJoinQuery(new WildcardQuery(new Term("program", "*")), parentSet, ScoreMode.None);
This returns "id00003", which is unexpected.
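To make the expectation concrete, here is a plain-Java sketch (no Lucene involved) of the sample data and the result I would expect from the query above. The class and method names (ExpectedJoin, sampleData, expectedParents) are made up for illustration, and the code only models what I expect the join to return, not how Lucene evaluates it.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Lucene: models the sample data in memory.
class ExpectedJoin {

    // Parent id -> "program" values of its children (empty list = no children).
    static Map<String, List<String>> sampleData() {
        Map<String, List<String>> data = new LinkedHashMap<>();
        data.put("id00000", List.of());
        data.put("id00001", List.of("P1"));
        data.put("id00002", List.of("P1", "P2"));
        data.put("id00003", List.of());
        data.put("id00004", List.of("P1"));
        data.put("id00005", List.of("P1", "P2"));
        return data;
    }

    // Of the candidate parents, keep those with at least one child that has
    // any "program" value (the intent of the program:* child query).
    static List<String> expectedParents(Map<String, List<String>> data,
                                        List<String> candidates) {
        List<String> hits = new ArrayList<>();
        for (String id : candidates) {
            List<String> programs = data.get(id);
            if (programs != null && !programs.isEmpty()) {
                hits.add(id);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        Map<String, List<String>> data = sampleData();
        // Restricting the parents to id00003, which has no children:
        System.out.println(expectedParents(data, List.of("id00003")));
        // Considering every parent:
        System.out.println(expectedParents(data, new ArrayList<>(data.keySet())));
    }
}
```

In other words, with the parents restricted to id00003 I expect no hits at all, and with all parents considered I expect id00001, id00002, id00004 and id00005.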
I opened a bug (https://issues.apache.org/jira/browse/LUCENE-8902) in my haste earlier (sorry), and it was mentioned there that "child free is not supported". I take that to mean that each parent should have at least one child. So let's say I add a "default" child to each parent:
Parent Doc      Children
--------------------------------------
id=id00000      field1=val1
id=id00001      field1=val1
                program=P1
id=id00002      field1=val1
                program=P1
                program=P2
id=id00003      field1=val1
id=id00004      field1=val1
                program=P1
id=id00005      field1=val1
                program=P1
                program=P2
So now every parent has at least one child. That made no difference; I still get the same result. What am I doing wrong here?
Thanks