Mailing List Archive

2 Questions
1) Does Lucene allow you to sort results by date?

2) How do you execute a wildcard search? I have
indexed four million documents using the SimpleAnalyzer. When
I execute a wildcard search using the SimpleAnalyzer the results returned
are always 0.

Thanks,

Joel
RE: 2 Questions
Hi,
#1 Yes. Look at the DateFilter implementation.
#2 Look at the source code for SimpleAnalyzer to see what the stopwords are.
I believe it removes more words when you index. I use StandardAnalyzer, which
works better for me. If I remember right, searches for alphanumerics did not
work with SimpleAnalyzer. Also, make sure to use the same kind of analyzer
for both indexing and searching. A wildcard does not work when used as the
first character of a word, e.g. *og for dog. Also, I believe wildcard
searches are case sensitive.
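
To illustrate the case-sensitivity point, here is a rough plain-Java sketch.
It assumes the index holds lowercased terms (SimpleAnalyzer lowercases text
at index time) and that the wildcard pattern is compared against those terms
exactly as typed. It does not use any Lucene API; the class name, helper
methods, and sample terms are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.regex.Pattern;

public class WildcardCaseDemo {

    // Translate a wildcard pattern (* = any run of characters, ? = exactly
    // one character) into a regular expression; all else is literal.
    static Pattern toRegex(String wildcard) {
        StringBuilder sb = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '*') {
                sb.append(".*");
            } else if (c == '?') {
                sb.append('.');
            } else {
                sb.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(sb.toString());
    }

    // Return the indexed terms that the pattern matches, case-sensitively.
    static List<String> match(String wildcard, List<String> indexedTerms) {
        Pattern p = toRegex(wildcard);
        List<String> out = new ArrayList<>();
        for (String term : indexedTerms) {
            if (p.matcher(term).matches()) {
                out.add(term);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical terms, stored the way a lowercasing analyzer would.
        List<String> terms = Arrays.asList("dog", "dodge", "frog");

        // Mixed-case pattern: finds nothing among lowercased terms.
        System.out.println(match("Do*", terms));
        // Lowercasing the pattern first restores the matches.
        System.out.println(match("Do*".toLowerCase(Locale.ROOT), terms));
        // A leading wildcard is fine for the matching step itself.
        System.out.println(match("*og", terms));
    }
}
```

If a wildcard search against an index built with a lowercasing analyzer
returns nothing, lowercasing the pattern before building the query is a
cheap first thing to try.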
Aruna.
-----Original Message-----
From: Joel Bernstein [mailto:j.bernstein@ei.org]
Sent: Tuesday, April 30, 2002 1:04 PM
To: Lucene Users List
Subject: 2 Questions




--
To unsubscribe, e-mail: <mailto:lucene-user-unsubscribe@jakarta.apache.org>
For additional commands, e-mail: <mailto:lucene-user-help@jakarta.apache.org>
Re: 2 Questions
The DateFilter interface seems only to filter (include/exclude) the results
by date.
Does it also sort the results by date?

Thanks

----- Original Message -----
From: "Aruna Raghavan" <ArunaR@opin.com>
To: "'Lucene Users List'" <lucene-user@jakarta.apache.org>
Sent: Tuesday, April 30, 2002 2:11 PM
Subject: RE: 2 Questions




Re: 2 Questions
Lucene does not currently support sorting by fields (such as a date field).

This is one of the to-do items. I have implemented a sort by date on top of
Lucene (not built into Lucene's core), and plan to add it to the
contributions section once I have documented it a little better.

The method I used was suggested by Doug: create an array at startup holding
the contents of the field you want to sort by (in my case a date field).
For my list of 100,000 docs, this takes about 3 seconds. Then, after you get
back the results from the search, look up the Lucene unique doc IDs (you'll
have to change the Hits interface to make this ID accessible) in the array,
get the field values, and sort by them. For my search results of about
15,000, it adds about 0.04 seconds to the search. It does get slower with
more results. Note that this could be optimized with a TopDocs-like
implementation that only sorts the items near the top.
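
The approach described above can be sketched roughly as follows. This is a
plain-Java illustration only, not the implementation mentioned above, and it
uses no Lucene classes. It assumes dates are stored as sortable long values
indexed by doc ID and that newest-first is the wanted order; the class name,
method name, and sample data are all hypothetical.

```java
import java.util.Arrays;
import java.util.Comparator;

public class SortHitsByDate {

    // Reorder the doc IDs returned by a search, newest first, using a
    // per-document date array that was filled once at startup.
    static int[] sortByDateDesc(int[] hitDocIds, long[] dateByDocId) {
        // Box the IDs so a custom comparator can be used.
        Integer[] boxed = new Integer[hitDocIds.length];
        for (int i = 0; i < hitDocIds.length; i++) {
            boxed[i] = hitDocIds[i];
        }

        // Compare hits by looking each doc ID up in the date array.
        Arrays.sort(boxed,
                Comparator.comparingLong((Integer id) -> dateByDocId[id])
                          .reversed());

        int[] sorted = new int[boxed.length];
        for (int i = 0; i < boxed.length; i++) {
            sorted[i] = boxed[i];
        }
        return sorted;
    }

    public static void main(String[] args) {
        // Built once at startup: array position = doc ID, value = date
        // encoded as yyyymmdd (an assumed encoding).
        long[] dateByDocId = { 20020101L, 20020430L, 20011225L, 20020315L };

        // Doc IDs of the hits, in the order the search returned them.
        int[] hits = { 2, 0, 3 };

        System.out.println(Arrays.toString(sortByDateDesc(hits, dateByDocId)));
        // prints [3, 0, 2]
    }
}
```

The sort cost depends only on the number of hits, not on the size of the
index, which is consistent with the slowdown on larger result sets noted
above.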

--Peter

On 4/30/02 11:38 AM, "Joel Bernstein" <j.bernstein@ei.org> wrote:

> The DateFilter interface seems only to filter (include/exclude) the results
> by date.
> Does it also sort the results by date?
>
> Thanks


Re: 2 Questions
Hi,
I do not entirely agree that in Lucene *og would not match 'dog'.

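import java.util.ArrayList;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.Hits;
import org.apache.lucene.search.WildcardQuery;

// "searcher" is assumed to be an already opened searcher field.
public ArrayList wildCardSearch(String query) throws Exception {

    String fieldsToSearch[] = { "fieldOne", "path", "date", "uniqueId" };
    ArrayList docList = new ArrayList();

    // Run the same wildcard pattern against each field in turn.
    for (int i = 0; i < fieldsToSearch.length; i++) {

        BooleanQuery bQuery = new BooleanQuery();
        WildcardQuery wQuery = new WildcardQuery(
                new Term(fieldsToSearch[i], query));

        // required = true, prohibited = false
        bQuery.add(wQuery, true, false);

        Hits hits = searcher.search(bQuery);

        // Collect every matching document.
        for (int j = 0; j < hits.length(); j++) {
            Document doc = hits.doc(j);
            docList.add(doc);
        }
    }
    // Return only after all fields have been searched.
    return docList;
}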

Take this as a hint and adapt it as needed. It is certainly working for me.
Also take a look at this:

http://www.mail-archive.com/lucene-user@jakarta.apache.org/msg00505.html


-Jaggi



