Mailing List Archive

How to create fields from a txt file for Lucene indexing?
Hi all

I'd like to create fields based on a .txt file, like the following example:

File1.txt
Author: Eder
Description: Indexing txt files in Lucene Tutorial
Category: Software Development

File2.txt
Author: Cecilia
Title: Preventing Fever
Category: Health and Wellness

So, I'd like to create the fields "Author", "Description", "Title" and
"Category" by reading the files. Once I had the text, I would do something
like:

Document doc = new Document();
doc.add(new Field("Author", "Eder", Field.Store.YES, Field.Index.TOKENIZED));

But this info is in txt files, so how can I read the file and get the data?


Big hug,

Eder Rebouças dos Santos
Salvador / BA - Brasil
Re: How to create fields from a txt file for Lucene indexing?
You need to read in the file and parse it according to your business
rules (just like you would read in any file in your system) and then
create the appropriate Fields.

-Grant
On Oct 26, 2006, at 11:56 PM, Eder wrote:

> But this info is in txt files, so how can I read the file and get
> the data?

--------------------------
Grant Ingersoll
Sr. Software Engineer
Center for Natural Language Processing
Syracuse University
335 Hinds Hall
Syracuse, NY 13244
http://www.cnlp.org
Re: How to create fields from a txt file for Lucene indexing?
Hi, Grant

Sorry to bother you... I'm a newbie at using Lucene. Could you give me
a practical example of parsing a file? I tried to understand the luceneweb
demo, but it's very complicated.

Thanks a lot!

Eder


----- Original Message -----
From: "Grant Ingersoll" <gsingers@apache.org>
To: <general@lucene.apache.org>
Sent: Friday, October 27, 2006 10:43 AM
Subject: Re: How to create fields from a txt file for Lucene indexing?


You need to read in the file and parse it according to your business
rules (just like you would read in any file in your system) and then
create the appropriate Fields.

-Grant
Re: How to create fields from a txt file for Lucene indexing?
Hi Eder,

If you are using Java 5, take a look at java.util.Scanner to read your
lines, then use String[] split(String regex)
<http://java.sun.com/j2se/1.5.0/docs/api/java/lang/String.html> to split
each line on the colon, and read the first element of the array to decide
which field you have.
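
A minimal sketch of that approach, assuming each line has the form
"Name: value" as in the sample files. The class and method names are
hypothetical, and the rule of skipping lines without a colon is an
assumption, not something stated in the thread. The Lucene call appears
only as a comment and assumes the Lucene 2.0-era Field constructor.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Scanner;

// Hypothetical helper class; the name is not from the thread.
class ScannerFieldReader {

    // Reads "Name: value" lines and returns the pairs in file order.
    static Map<String, String> readFields(Scanner scanner) {
        Map<String, String> fields = new LinkedHashMap<String, String>();
        while (scanner.hasNextLine()) {
            // A limit of 2 splits on the first colon only, so colons in the value survive.
            String[] parts = scanner.nextLine().split(":", 2);
            if (parts.length < 2) {
                continue; // assumed rule: ignore lines without a colon
            }
            fields.put(parts[0].trim(), parts[1].trim());
        }
        return fields;
    }

    public static void main(String[] args) {
        // For a real file, use new Scanner(new java.io.File("File1.txt")) instead.
        Scanner scanner = new Scanner(
                "Author: Eder\n"
                + "Description: Indexing txt files in Lucene Tutorial\n"
                + "Category: Software Development\n");
        for (Map.Entry<String, String> field : readFields(scanner).entrySet()) {
            // With Lucene 2.0 on the classpath (an assumption), each pair could become:
            // doc.add(new Field(field.getKey(), field.getValue(),
            //         Field.Store.YES, Field.Index.TOKENIZED));
            System.out.println(field.getKey() + " -> " + field.getValue());
        }
        scanner.close();
    }
}
```

Because the field name comes from the file itself, a file with "Title"
instead of "Description" needs no code change.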

Hope this helps.

Patrick


On 10/27/06, Eder <ers.c@bol.com.br> wrote:
>
> Hi, Grant
>
> Sorry to bother you... I'm a newbie at using Lucene. Could you give me
> a practical example of parsing a file? I tried to understand the
> luceneweb demo, but it's very complicated.
>
> Thanks a lot!
>
> Eder