Mailing List Archive

search for empty field?
Is it possible to query for documents that have empty values for a field?

Say I need to find documents where category is empty. I tried a negative query:
-category:*
But it returns 0 documents. I think "category:*" is basically match-all, so
this "-category:*" doesn't work.

Thanks!

--
Chris Lu
-------------------------
Instant Scalable Full-Text Search On Any Database/Application
site: http://www.dbsight.net
demo: http://search.dbsight.com
Lucene Database Search in 3 minutes:
http://wiki.dbsight.com/index.php?title=Create_Lucene_Database_Search_in_3_minutes
DBSight customer, a shopping comparison site, (anonymous per request) got
2.6 Million Euro funding!
Re: search for empty field?
This has been discussed multiple times, so looking at the
searchable archive will give you more detailed info. But as
I remember, the consensus suggestion was to index some
"impossible" value for those documents that lack a field.
For instance, say your field was "sometimes". A document
that had nothing to index for that field could get a value of
"ZZZZZZZZZZZZZ".

Now your query is simply sometimes:ZZZZZZZZZZZZZ
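
A minimal sketch of the sentinel-value idea, assuming a plain Java Map as a stand-in for a Lucene document and a list of such maps as the index. The names SentinelFieldSketch, withSentinel and findMissing are hypothetical, and the marker string is only the example value from this message. Real code would add the field through Lucene's own document API instead.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of the sentinel-value idea; a Map stands in for a Lucene document. */
final class SentinelFieldSketch {
    // Assumed marker: pick a value that can never occur as real data.
    static final String MISSING = "ZZZZZZZZZZZZZ";

    /** Returns a copy of doc in which an absent or blank field holds the marker. */
    static Map<String, String> withSentinel(Map<String, String> doc, String field) {
        Map<String, String> copy = new HashMap<>(doc);
        String value = copy.get(field);
        if (value == null || value.trim().isEmpty()) {
            copy.put(field, MISSING);
        }
        return copy;
    }

    /** Stand-in for the term query field:ZZZZZZZZZZZZZ over the toy index. */
    static List<Map<String, String>> findMissing(List<Map<String, String>> index, String field) {
        List<Map<String, String>> hits = new ArrayList<>();
        for (Map<String, String> doc : index) {
            if (MISSING.equals(doc.get(field))) {
                hits.add(doc);
            }
        }
        return hits;
    }
}
```

The marker has to be written when the document is indexed, so documents that are already in the index without it would need to be re-indexed before the query finds them.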


Best
Erick

On Tue, Sep 2, 2008 at 4:23 AM, Chris Lu <chris.lu@gmail.com> wrote:

> Is it possible to query for documents that have empty values for a field?
>
> Say I need to find documents where category is empty. I tried a negative query:
> -category:*
> But it returns 0 documents. I think "category:*" is basically match-all, so
> this "-category:*" doesn't work.
Re: search for empty field?
Thanks Erick for reminding me of this!
I only need to validate an index and make sure the content is correctly
retrieved and the index doesn't have empty fields.
So I'd better simply go through all documents by id and check them directly.

Thanks!

--
Chris Lu

On Wed, Sep 3, 2008 at 11:49 AM, Erick Erickson <erickerickson@gmail.com> wrote:

> This has been discussed multiple times, so looking at the
> searchable archive will give you more detailed info. But as
> I remember, the consensus suggestion was to index some
> "impossible" value for those documents that lack a field.
> For instance, say your field was "sometimes". A document
> that had nothing to index for that field could get a value of
> "ZZZZZZZZZZZZZ".
>
> Now your query is simply sometimes:ZZZZZZZZZZZZZ
Re: search for empty field?
Oh.. I wonder if TermDocs/TermEnum would work for you
instead.....

Would it work to just create a document validator at index
time that threw an exception if all required fields weren't
present? Or is that outside your control?
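
One way such an index-time validator could look, as a sketch only. It assumes each document is available as a Map of field names to values before it is handed to the indexer. The class name RequiredFieldValidator and the field names in the usage note are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

/** Rejects documents that lack a required field before they reach the index. */
final class RequiredFieldValidator {
    private final List<String> requiredFields;

    RequiredFieldValidator(String... requiredFields) {
        this.requiredFields = Arrays.asList(requiredFields);
    }

    /** Throws IllegalArgumentException naming every required field that is absent or blank. */
    void validate(Map<String, String> doc) {
        List<String> missing = new ArrayList<>();
        for (String field : requiredFields) {
            String value = doc.get(field);
            if (value == null || value.trim().isEmpty()) {
                missing.add(field);
            }
        }
        if (!missing.isEmpty()) {
            throw new IllegalArgumentException("Missing required fields: " + missing);
        }
    }
}
```

A caller would build it once, for example new RequiredFieldValidator("id", "category"), and call validate(doc) just before adding each document.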

Best
Erick

On Wed, Sep 3, 2008 at 3:11 PM, Chris Lu <chris.lu@gmail.com> wrote:

> Thanks Erick for reminding me of this!
> I only need to validate an index and make sure the content is correctly
> retrieved and the index doesn't have empty fields.
> So I'd better simply go through all documents by id and check them directly.
Re: search for empty field?
I was hoping for a more efficient solution based on TermDocs/TermEnum, but
since the term is not there at all, I think the only thing we can do is work
by deduction.
I can copy the bitmap of all the deleted docs, walk every term of the field
with TermEnum/TermDocs, and set the bit for each document that has a term.
Then all the unset bits are documents with an empty field.

This should be reasonably efficient.
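
The deduction described above could look roughly like this. It is a sketch that uses java.util.BitSet and an in-memory map from term to document ids as a stand-in for what TermEnum/TermDocs would enumerate for one field. The class name, the method name and the postings map are assumptions for illustration, not Lucene API.

```java
import java.util.BitSet;
import java.util.Map;

/** Toy model: find live documents that have no term at all in one field. */
final class EmptyFieldDeduction {
    /**
     * maxDoc:   one more than the largest document id in the toy index.
     * deleted:  bits set for deleted documents.
     * postings: term -> ids of documents containing that term in the field.
     */
    static BitSet docsWithoutField(int maxDoc, BitSet deleted, Map<String, int[]> postings) {
        // Start from the deleted docs so they are never reported as "empty".
        BitSet seen = (BitSet) deleted.clone();
        for (int[] docIds : postings.values()) {
            for (int docId : docIds) {
                seen.set(docId); // this document has at least one term in the field
            }
        }
        // Everything in [0, maxDoc) that was never marked is a live doc with an empty field.
        BitSet empty = new BitSet(maxDoc);
        empty.set(0, maxDoc);
        empty.andNot(seen);
        return empty;
    }
}
```

The work is proportional to the number of postings in the field rather than to the number of stored documents that would otherwise have to be loaded and inspected.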

--
Chris Lu

On Wed, Sep 3, 2008 at 12:18 PM, Erick Erickson <erickerickson@gmail.com> wrote:

> Oh.. I wonder if TermDocs/TermEnum would work for you
> instead.....
>
> Would it work to just create a document validator at index
> time that threw an exception if all required fields weren't
> present? Or is that outside your control?
Re: search for empty field?
I don't think << category:* >> does what you think it does.

category:[* TO *] will find all docs that have any indexed tokens in the
category field, so combining that as a prohibited clause with a
mandatory MatchAllDocsQuery will give you all docs that don't have
anything indexed in the category field....

*:* -category:[* TO *]

(although I can't remember if *:* is a Solr extension or part of the core
QueryParser)
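
To illustrate why the prohibited clause needs a mandatory match-all clause next to it, here is a toy model in plain Java sets, not Lucene code. It assumes that a prohibited clause only removes documents from whatever the positive clauses selected, so a query made only of prohibited clauses has nothing to remove from. The names BooleanSetModel and evaluate and the document ids are made up.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

/** Set-based model of "positive clauses select, prohibited clauses subtract". */
final class BooleanSetModel {
    /**
     * selected:   docs matched by the positive clauses (empty when the query has none).
     * prohibited: docs matched by the prohibited clause.
     */
    static Set<Integer> evaluate(Set<Integer> selected, Set<Integer> prohibited) {
        Set<Integer> result = new TreeSet<>(selected);
        result.removeAll(prohibited);
        return result;
    }

    public static void main(String[] args) {
        Set<Integer> allDocs = new TreeSet<>(Arrays.asList(0, 1, 2, 3));     // *:*
        Set<Integer> hasCategory = new TreeSet<>(Arrays.asList(0, 2));       // category:[* TO *]
        Set<Integer> noPositiveClause = new TreeSet<>();

        // -category:[* TO *] on its own: nothing selected, so nothing is left.
        System.out.println(evaluate(noPositiveClause, hasCategory)); // prints []
        // *:* -category:[* TO *]: all docs minus those with a category.
        System.out.println(evaluate(allDocs, hasCategory));          // prints [1, 3]
    }
}
```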



-Hoss


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org