Mailing List Archive

Fwd: best way (performance wise) to search for field without value?
To follow up: based on a quick JMH test with 2M docs containing some random
data, I see a speedup of 70% :)
That is a nice Friday-afternoon gift, thanks!

For people who are interested:

I added a BinaryDocValues field like this:

doc.add(new BinaryDocValuesField("GROUPS_ALLOWED_EMPTY", new BytesRef(new byte[] {0x01})));

And added it to the query as a SHOULD clause:

finalQuery.add(new DocValuesFieldExistsQuery("GROUPS_ALLOWED_EMPTY"), BooleanClause.Occur.SHOULD);
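
For anyone wondering why the marker field helps, here is a rough toy model of the two query shapes. It is plain Java, not Lucene code: the BitSets are only a stand-in for postings lists, and the class and method names (EmptyGroupsModel, visibleViaNegatedRange, visibleViaMarker) are made up for illustration. It assumes every doc id in 0..maxDoc-1 has been added. The point it tries to show is that the negated match-everything range has to visit every distinct group value, while the marker approach only touches the marker plus the user's own groups.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: BitSets stand in for postings lists; this is not Lucene code.
class EmptyGroupsModel {
    private final int maxDoc;                                      // doc ids are 0..maxDoc-1
    private final Map<String, BitSet> postings = new HashMap<>();  // group -> docs listing it
    private final BitSet emptyMarker = new BitSet();               // docs with no groups_allowed

    EmptyGroupsModel(int maxDoc) {
        this.maxDoc = maxDoc;
    }

    public void addDoc(int docId, List<String> groupsAllowed) {
        if (groupsAllowed.isEmpty()) {
            emptyMarker.set(docId);  // the extra marker written at index time
        }
        for (String group : groupsAllowed) {
            postings.computeIfAbsent(group, k -> new BitSet()).set(docId);
        }
    }

    // Docs whose groups_allowed contains at least one of the user's groups.
    private BitSet docsMatchingAny(List<String> userGroups) {
        BitSet hits = new BitSet();
        for (String group : userGroups) {
            BitSet p = postings.get(group);
            if (p != null) {
                hits.or(p);
            }
        }
        return hits;
    }

    // Old shape: (match all, MUST_NOT "any value at all") OR user's groups.
    // The union below walks every distinct group in the index.
    public BitSet visibleViaNegatedRange(List<String> userGroups) {
        BitSet hasSomeGroup = new BitSet();
        for (BitSet p : postings.values()) {
            hasSomeGroup.or(p);
        }
        BitSet visible = new BitSet();
        visible.set(0, maxDoc);        // match all docs
        visible.andNot(hasSomeGroup);  // drop docs that have any group value
        visible.or(docsMatchingAny(userGroups));
        return visible;
    }

    // New shape: "marker exists" OR user's groups.
    // Work depends only on the marker and the user's own groups.
    public BitSet visibleViaMarker(List<String> userGroups) {
        BitSet visible = (BitSet) emptyMarker.clone();
        visible.or(docsMatchingAny(userGroups));
        return visible;
    }

    public static void main(String[] args) {
        EmptyGroupsModel index = new EmptyGroupsModel(5);
        index.addDoc(0, List.of("a"));
        index.addDoc(1, List.of());          // visible to everyone
        index.addDoc(2, List.of("b", "c"));
        index.addDoc(3, List.of());          // visible to everyone
        index.addDoc(4, List.of("c"));

        List<String> user = List.of("c");
        System.out.println(index.visibleViaNegatedRange(user));
        System.out.println(index.visibleViaMarker(user));
    }
}
```

Both methods return the same set of documents; only the amount of work differs. How closely this mirrors what Lucene does internally is an assumption on my part, so treat it as intuition rather than a description of the real execution path.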

On Fri, Nov 13, 2020 at 2:09 PM Michael McCandless <
lucene@mikemccandless.com> wrote:

> Maybe NormsFieldExistsQuery as a MUST_NOT clause? Though, you must enable
> norms on your field to use that.
>
> TermRangeQuery is indeed a horribly costly way to execute this, but if you
> cache the result on each refresh, perhaps it is OK?
>
> You could also index a dedicated doc values field indicating that the
> field is empty and then use DocValuesFieldExistsQuery.
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
>
> On Fri, Nov 13, 2020 at 7:56 AM Rob Audenaerde <rob.audenaerde@gmail.com>
> wrote:
>
>> Hi all,
>>
>> We have implemented some security on our index by adding a field
>> 'groups_allowed' to documents and wrapping a boolean MUST query around the
>> original query, which checks whether one of the given user groups matches
>> at least one groups_allowed value.
>>
>> We chose to leave the groups_allowed field empty when the document should
>> be retrievable by all users, so we also need to select a document if
>> its 'groups_allowed' field is empty.
>>
>> What would be the fastest Query construction to do so?
>>
>>
>> Currently I use a TermRangeQuery that basically matches all values and put
>> that in a MUST_NOT clause combined with a MatchAllDocsQuery, but that gets
>> rather slow when the number of groups is high.
>>
>> Thanks!
>>
>