Mailing List Archive

Lucene
Hi all,

Does Lucene have multilingual capabilities? I am trying to test Lucene with
other languages. Does anyone know how to do that?

Thanks



---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org
Re: lucene [ In reply to ]
Alberto:

Not a problem. You can add more data to an index even as it is being read by
your application.

HOWEVER, you must close and reopen your IndexReaders before the
newly-indexed data is available to the readers. This is an expensive
operation, so I would close/open the IndexReader judiciously.
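
One way to read "judiciously" is to cap how often the reader is swapped. The sketch below is plain Java with no Lucene dependency: ThrottledReopener, its opener and closer callbacks, and the injected clock are all hypothetical names, with the callbacks standing in for opening and closing an IndexReader. It also assumes no search is still running against the old reader when it is closed.

```java
import java.util.function.Consumer;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical helper: hands out a shared reader and reopens it at most
// once per minIntervalMillis. R stands in for Lucene's IndexReader.
final class ThrottledReopener<R> {
    private final Supplier<R> opener;   // e.g. opens a reader on the index
    private final Consumer<R> closer;   // e.g. closes the old reader
    private final LongSupplier clock;   // current time in milliseconds
    private final long minIntervalMillis;
    private R current;
    private long lastOpenedAt;

    ThrottledReopener(Supplier<R> opener, Consumer<R> closer,
                      long minIntervalMillis, LongSupplier clock) {
        this.opener = opener;
        this.closer = closer;
        this.minIntervalMillis = minIntervalMillis;
        this.clock = clock;
        this.current = opener.get();
        this.lastOpenedAt = clock.getAsLong();
    }

    // Returns the current reader, reopening it only if it is old enough.
    synchronized R get() {
        long now = clock.getAsLong();
        if (now - lastOpenedAt >= minIntervalMillis) {
            R fresh = opener.get();   // open the new reader first
            closer.accept(current);   // then release the old one
            current = fresh;
            lastOpenedAt = now;
        }
        return current;
    }
}
```

Callers would ask the helper for a reader before each search instead of holding one, so newly indexed data shows up after at most one interval.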

NOTE: when you instantiate the IndexWriter, use false for the create
parameter or you'll erase your existing index.

You can also create an index in a *different* directory and then use a
MultiReader or Searcher, but that requires re-compiling your application to
add in the new indexes, and I assume that's not what you want.

I don't know about Spanish support.

Best
Erick
Re: Lucene [ In reply to ]
> Hi
>
> I want to use Apache Lucene to do full-text search for PostgreSQL.
>
> May I know the setup requirements and whether it supports PostgreSQL?
>
> --
> Devinder




--
Devinder
Re: Lucene [ In reply to ]
Hi, Devinder,

Lucene is agnostic of any database configuration. You need to pull the data
out via JDBC, feed it to Lucene to create an index, and then use the Lucene
API to search it.
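
The flow described here (rows out of the database, documents into an index, queries against the index) can be sketched without either PostgreSQL or Lucene. In this plain-Java sketch, TinyIndex and its methods are hypothetical, the Map<Integer, String> stands in for rows already fetched over JDBC (primary key to text column), and the split on non-word characters is a crude stand-in for a Lucene analyzer.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in for a full-text index: maps each term to the
// ids of the rows that contain it.
final class TinyIndex {
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // Index one row: id is the primary key, text is the column to search.
    void add(int id, String text) {
        for (String term : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, t -> new TreeSet<>()).add(id);
            }
        }
    }

    // Index every row of a result set that was already copied into a map.
    void addAll(Map<Integer, String> rows) {
        rows.forEach(this::add);
    }

    // Return the ids of rows containing the term, in ascending order.
    Set<Integer> search(String term) {
        Set<Integer> hits = postings.get(term.toLowerCase(Locale.ROOT));
        if (hits == null) {
            return Collections.emptySet();
        }
        return Collections.unmodifiableSet(hits);
    }
}
```

The ids returned by a search are then used to load the full rows from the database again, which is why the primary key is the one thing worth storing alongside the indexed text.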

--
Chris Lu
-------------------------
Instant Scalable Full-Text Search On Any Database/Application
site: http://www.dbsight.net
demo: http://search.dbsight.com
Lucene Database Search in 3 minutes:
http://wiki.dbsight.com/index.php?title=Create_Lucene_Database_Search_in_3_minutes

Re: Lucene [ In reply to ]
Hi Chris

Thanks. Do you also have a channel on IRC so we can communicate?

Devinder


Re: Lucene [ In reply to ]
I want to know what type of setup I need to get started with Lucene.

Do I need Java and Apache Tomcat on Windows XP?


Devinder


Re: Lucene [ In reply to ]
Yes, with a little bit of work, as there is nothing out of the box
for it.

If you store term vectors (or re-analyze the document) you can use
the sample code from my ApacheCon 2005 talk
(http://www.cnlp.org/apachecon2005/, which also covers how to use
TermVectors) OR you can try implementing the new TermVectorMapper
functionality in the trunk version of Lucene.
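
For the "re-analyze the document" route, the counting itself needs nothing from Lucene. The following is a minimal plain-Java sketch, not the sample code from the talk: TopTerms.topTerms is a hypothetical helper, the lowercase-and-split tokenizer is a stand-in for a real analyzer, and no stop words are removed.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

final class TopTerms {
    // Returns up to n terms ordered by descending frequency; ties are
    // broken alphabetically so the result is deterministic.
    static List<String> topTerms(String text, int n) {
        Map<String, Integer> counts = new HashMap<>();
        for (String term : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!term.isEmpty()) {
                counts.merge(term, 1, Integer::sum);
            }
        }
        List<String> terms = new ArrayList<>(counts.keySet());
        terms.sort((a, b) -> {
            int byCount = Integer.compare(counts.get(b), counts.get(a));
            return byCount != 0 ? byCount : a.compareTo(b);
        });
        return new ArrayList<>(terms.subList(0, Math.min(n, terms.size())));
    }
}
```

Calling TopTerms.topTerms(articleText, 5) would give the five most frequent tokens; in practice common words dominate unless a stop-word list is applied first.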

Cheers,
Grant

On Oct 16, 2007, at 4:14 PM, Jae Joo wrote:

> Hi,
>
> Does Lucene have a function to return the top 5 most frequent
> keywords in an article?
>
> Thanks,
>
> Jae

--------------------------
Grant Ingersoll
http://lucene.grantingersoll.com

Lucene Boot Camp Training:
ApacheCon Atlanta, Nov. 12, 2007. Sign up now!
http://www.apachecon.com

Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ



---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
Re: Lucene [ In reply to ]
I don't understand what you're trying to do with "match extent". Perhaps
a bit more explanation of the problem you're trying to solve would get
you more helpful answers <G>...

Best
Erick

On Feb 4, 2008 9:34 AM, Allahbaksh Mohammedali Asadullah <
Allahbaksh_Asadullah@infosys.com> wrote:

> Hi,
>
> I have the following requirement:
>
>              Value   Match Extent
> Fieldname1   c1      23
> Fieldname2   c2      26
> Fieldname3   c8      85
>
> Can I use Lucene for this? If yes, what is the easiest and best way to
> use it?
>
>
>
> Regards,
>
> Allahbaksh
>
>
>
> Allahbaksh Mohammedali Asadullah,
>
> *S*oftware *E*ngineering & *T*echnology *L*abs,
>
> Infosys Technolgies Limited, Electronics City,
>
> Hosur Road, Bangalore 560 100, India.
>
> (Board: +91-80-28520261 | Extn: 53915 | Direct: 41173915.
>
> Fax: +91-80-28520362 | Mobile: +91-9845505322.
>
>
> **************** CAUTION - Disclaimer *****************
> This e-mail contains PRIVILEGED AND CONFIDENTIAL INFORMATION intended
> solely for the use of the addressee(s). If you are not the intended
> recipient, please notify the sender by e-mail and delete the original
> message. Further, you are not to copy, disclose, or distribute this e-mail
> or its contents to any other person and any such actions are unlawful. This
> e-mail may contain viruses. Infosys has taken every reasonable precaution to
> minimize this risk, but is not liable for any damage you may sustain as a
> result of any virus in this e-mail. You should carry out your own virus
> checks before opening the e-mail or attachment. Infosys reserves the right
> to monitor and review the content of all messages sent to or from this
> e-mail address. Messages sent to or from this e-mail address may be stored
> on the Infosys e-mail system.
> ***INFOSYS******** End of Disclaimer ********INFOSYS***
>
RE: Lucene [ In reply to ]
First I want to search for documents that have the value c1, then search for documents that have c1 as one of the field values. I know we can use TermQuery, but is that the way we should do it?
Can't we save something like fieldname1: c1-23 and, while parsing, get c1 and 23 as two fields? I should also be able to query fieldname1 with a range query.
If you can provide your telephone number, I can explain this to you.
Regards,
Allahbaksh
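
The "c1-23" idea in this message amounts to splitting one stored token into a value and a numeric match extent, then filtering on the number. A plain-Java sketch, assuming the last hyphen is the separator and the extent is an integer; ValueExtent, parse, and inRange are hypothetical names, not Lucene API.

```java
final class ValueExtent {
    final String value;
    final int extent;

    ValueExtent(String value, int extent) {
        this.value = value;
        this.extent = extent;
    }

    // Splits a token such as "c1-23" at its last hyphen.
    static ValueExtent parse(String token) {
        int dash = token.lastIndexOf('-');
        if (dash <= 0 || dash == token.length() - 1) {
            throw new IllegalArgumentException("expected value-extent, got: " + token);
        }
        return new ValueExtent(token.substring(0, dash),
                Integer.parseInt(token.substring(dash + 1)));
    }

    // True when the extent lies in [low, high], the check a range query would make.
    boolean inRange(int low, int high) {
        return extent >= low && extent <= high;
    }
}
```

If the extent is indexed as its own field instead, note that range queries in Lucene releases of this era compare terms as strings, so numeric values are usually zero-padded (for example 023) before indexing.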
Re: Lucene [ In reply to ]
1) For Lucene's scoring, there's this:

http://lucene.apache.org/java/2_4_0/scoring.html#Scoring

The book Lucene in Action also describes the scoring formula.

2) It's up to you to build a Lucene document from your content, so you
decide which parts of your content (body, link, meta) become which
fields in Lucene. At that point Lucene's scoring formula kicks in.
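
As a rough guide to what the linked page describes, the sketch below restates the three per-term factors of the default similarity (term frequency, inverse document frequency, and field length normalization) in plain Java. The class and method names mirror the concepts but are not Lucene's classes, and the combined termScore leaves out query normalization, coordination, and boosts, so it approximates a single-term query rather than reproducing Lucene's exact score.

```java
final class ClassicScoringSketch {
    // Term frequency factor: grows with the square root of the raw count.
    static double tf(int freq) {
        return Math.sqrt(freq);
    }

    // Inverse document frequency: rarer terms weigh more.
    static double idf(int docFreq, int numDocs) {
        return Math.log(numDocs / (double) (docFreq + 1)) + 1.0;
    }

    // Length normalization: shorter fields weigh more.
    static double lengthNorm(int numTermsInField) {
        return 1.0 / Math.sqrt(numTermsInField);
    }

    // Simplified single-term score: tf * idf^2 * lengthNorm.
    static double termScore(int freq, int docFreq, int numDocs, int numTermsInField) {
        double weight = idf(docFreq, numDocs);
        return tf(freq) * weight * weight * lengthNorm(numTermsInField);
    }
}
```

The inputs map onto the "atomic data" asked about: freq is the term's count in one field of one page, docFreq is the number of pages containing the term, and numTermsInField is the size of that field. Since each field (meta, link, body) is scored separately, the same term can contribute a different amount in each.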

Mike

Marco Palumbo - In4Tech wrote:

> Good morning,
>
> some days ago I sent the following e-mail, but I had no feedback on
> it. Could you please tell us if there is someone able to cooperate
> with us on this project?
>
> Thank you in advance,
>
> Marco Palumbo
>
> dott. Marco Palumbo
> Chief Financial Officer
> In4Tech s.r.l.
> c.so Canalgrande, n. 88
> 41100 Modena - Italy
> tel.: 0039 059 230651
> fax : 0039 059 244672
> www.in4tech.net
>
> From: Marco Palumbo - In4Tech
> Sent: Thursday, November 13, 2008 16:03
> To: java-user@lucene.apache.org
> Subject: Lucene
>
> Good morning,
>
> our company works in the field of industrial biotechnologies. We
> were interested in having software capable of classifying web sites
> (and so organizations) working in our field. So, one of our IT
> consultants set up a system based on Heritrix
> (http://crawler.archive.org/) and Lucene.
>
> As you know, Lucene calculates frequency-based scores. We would
> like to know/obtain:
> 1) the formula used by Lucene to calculate the scores;
> 2) for each page, the basic information used by Lucene to calculate
> the scores (atomic data: the term's frequency in meta, link, body;
> size of the page; ...).
>
> How can you help us obtain this information?
>
> Thanks.
>
> Marco Palumbo
RE: lucene [ In reply to ]
Hi,

Why not use PerFieldAnalyzerWrapper to provide the same thing and that's
already available?
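
The idea behind that suggestion is a single analyzer object that dispatches on the field name, so the stock parser can stay unchanged. The sketch below shows only that dispatch in plain Java: PerFieldTokenizer is a hypothetical class, and Function<String, List<String>> stands in for Lucene's Analyzer; it is not the PerFieldAnalyzerWrapper API itself.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

final class PerFieldTokenizer {
    // Keeps the whole value as one token, like a keyword-style analyzer.
    static final Function<String, List<String>> KEYWORD =
            text -> Collections.singletonList(text);

    // Lowercases and splits on whitespace, like a simple text analyzer.
    static final Function<String, List<String>> SIMPLE =
            text -> Arrays.asList(text.trim().toLowerCase(Locale.ROOT).split("\\s+"));

    private final Function<String, List<String>> defaultTokenizer;
    private final Map<String, Function<String, List<String>>> perField = new HashMap<>();

    PerFieldTokenizer(Function<String, List<String>> defaultTokenizer) {
        this.defaultTokenizer = defaultTokenizer;
    }

    // Registers a tokenizer that overrides the default for one field.
    PerFieldTokenizer addField(String field, Function<String, List<String>> tokenizer) {
        perField.put(field, tokenizer);
        return this;
    }

    // Tokenizes text with the field's own tokenizer, or the default.
    List<String> tokenize(String field, String text) {
        return perField.getOrDefault(field, defaultTokenizer).apply(text);
    }
}
```

With one such object configured as "keyword handling for the id field, default handling for everything else", the caller passes a single analyzer everywhere and no parallel array of analyzers is needed.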

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de


> -----Original Message-----
> From: Rafał Lenarczyk [mailto:rafal.lenarczyk.98@gmail.com]
> Sent: Thursday, March 17, 2011 4:41 PM
> To: java-user@lucene.apache.org
> Subject: lucene
>
> Hi,
> This mail is meant for the developers.
>
> I'm a Java developer and use your product in my application.
> I use a special QueryParser, MultiFieldQueryParser, through its static
> parse method, where I must pass the Lucene Version, an array of query
> strings, an array of field names, an array of flags, and an analyzer:
> MultiFieldQueryParser.parse(Version matchVersion, String[] queries,
> String[] fields, BooleanClause.Occur[] flags, Analyzer analyzer) throws
> ParseException;
>
> This method takes a single analyzer, but I have different fields, for
> example person name, person sname, or sometimes person id.
> I want to use KeywordAnalyzer for person id and another Analyzer for
> person name and sname.
> I wrote my own version of MultiFieldQueryParser.parse:
>
> import org.apache.lucene.analysis.Analyzer;
> import org.apache.lucene.queryParser.ParseException;
> import org.apache.lucene.queryParser.QueryParser;
> import org.apache.lucene.search.BooleanClause;
> import org.apache.lucene.search.BooleanQuery;
> import org.apache.lucene.search.Query;
> import org.apache.lucene.util.Version;
>
> public class MyMultifieldQueryParser {
>
>     // Like MultiFieldQueryParser.parse, but takes one Analyzer per field.
>     public static Query parse(Version matchVersion, String[] queries,
>             String[] fields, BooleanClause.Occur[] flags,
>             Analyzer[] analyzers) throws ParseException {
>         // All four arrays are indexed in parallel, so their lengths must match.
>         if (!(queries.length == fields.length
>                 && queries.length == flags.length
>                 && queries.length == analyzers.length))
>             throw new IllegalArgumentException(
>                 "queries, fields, flags, and analyzers arrays have different lengths");
>         BooleanQuery bQuery = new BooleanQuery();
>         for (int i = 0; i < fields.length; i++) {
>             // This is the changed line: each field gets its own analyzer.
>             QueryParser qp = new QueryParser(matchVersion, fields[i], analyzers[i]);
>             Query q = qp.parse(queries[i]);
>             if (q != null && // q never null, just being defensive
>                     (!(q instanceof BooleanQuery)
>                         || ((BooleanQuery) q).getClauses().length > 0)) {
>                 bQuery.add(q, flags[i]);
>             }
>         }
>         return bQuery;
>     }
> }
>
> I think it is a good idea, and you could add it in a new release.
>
> Regards,
> Rafal Lenarczyk


RE: Lucene [ In reply to ]
Hello - you are on the wrong list; this is the Lucene java-user list, not the Solr user mailing list. But this is what you are looking for:

https://cwiki.apache.org/confluence/display/solr/Uploading+Data+with+Solr+Cell+using+Apache+Tika
https://wiki.apache.org/solr/ExtractingRequestHandler

The first is official; the second is old but may still be relevant. Please note that this is usually not to be used in production.

Regards,
Markus



-----Original message-----
> From:Anthony Van <anthony.huy.van@gmail.com>
> Sent: Wednesday 8th February 2017 22:51
> To: java-user@lucene.apache.org
> Subject: Lucene
>
> Good afternoon everyone,
>
> I have Solr Lucene installed on Ubuntu and was wondering if I can add a
> file path in order to test out the search features.
>
> For example - I would like to add Y:\is\IS Shared to Lucene and then start
> testing out the search queries to see if it works.
>
> I have read a lot of documentation and I am still at a loss.
>
> --
>
> *Anthony Van *
>
> Anthony.huy.van@gmail.com
>
> Information Technology and Security Professional
>

Re: lucene [ In reply to ]
Routing this back to the list with trademarks@ on BCC

This, and any email along similar lines, is spam. The email should be
ignored and the sender blocked from sending further messages to the
mailing list.

Mark



On 05/10/2023 17:53, Adrien Grand wrote:
> Rerouting this to trademarks@a.o, java-user@lucene.a.o moved to Bcc.
>
> Hello Trademarks team,
>
> The Lucene project just received the below email on their user list, can
> you take care of it?
>
> Thank you,
> Adrien
>
> On Thu, Oct 5, 2023 at 3:23 PM Paul Liu <paul@chinesedomain.org
> <mailto:paul@chinesedomain.org>> wrote:
>
> (It's very urgent, therefore we kindly ask you to forward this email
> to your CEO. If you believe this has been sent to you in error,
> please ignore it. Thanks)Dear CEO,This is a formal email. We are the
> Domain Registration Service company in Shanghai, China. Here I have
> something to confirm with you. We received an application from
> Hongyuan Ltd on October 5, 2023. They want to request "lucene" as
> their internet keyword and China (CN) domain names (lucene.cn
> <http://lucene.cn>, lucene.com.cn <http://lucene.com.cn>,
> lucene.net.cn <http://lucene.net.cn>, lucene.org.cn
> <http://lucene.org.cn>). But after checking it, we find this name
> conflict with your company name or trademark. In order to deal with
> this matter better, it's necessary to send email to you and confirm
> whether this company is your distributor in China? Best Regards
> Paul Liu   Service & Operations Manager
>
> China Registry (Head Office)
>
>
>
> Tel: +86-2161918696
>
> Fax: +86-2161918697
>
> Mob: +86-13816428671
>
> No. 1780 Wuzhong Road, Shanghai 201103, China
>
> *****************************************
>
> This email contains privileged and confidential information intended
> for the addressee only. If you are not the intended recipient,
> please destroy this email and inform the sender immediately. We
> appreciate you respecting the confidentiality of this information by
> not disclosing or using the information in this email.
>
>
>
> --
> Adrien

Re: lucene [ In reply to ]
Thanks Mark, I'll know next time it occurs.

--
Adrien