Mailing List Archive: Field.Text arguments

Mar 27, 2002, 4:22 PM

Post #1 of 3

I'm confused about using Fields.

Here are the two methods that are confusing me:
public static final Field Text(String name, Reader value)
public static final Field Text(String name, String value)

The difference is that one takes a reader and the other a string.

I have a field that will have pretty large contents after running through
my analyzer (1500 to 6000 characters).

When I use the second of the two methods above my string is not run
through the analyzer, but is stored in the index.

When I use the first method, by passing in a StringReader based on the
String, I don't get anything indexed at all (and therefore it's difficult
to know if it was analyzed).

Is there some other Field type that I should be using for text that I want
analyzed and indexed, and that can be fairly long?

Here's a rough outline of the order in which I'm doing things.
FragmentAnalyzer is my own custom class that normally seems to work:

Document document = new Document();
// Wrap the text in a Reader and add it as the "contents" field
Reader reader = new StringReader(text);
document.add(Field.Text("contents", reader));
...
// Index the document using the custom analyzer
FragmentAnalyzer analyzer = new FragmentAnalyzer();
IndexWriter writer = new IndexWriter(pathToIndex, analyzer, isCreateNewIndex);
writer.addDocument(document);
writer.close();

rob

--
To unsubscribe, e-mail: <mailto:lucene-user-unsubscribe@jakarta.apache.org>
For additional commands, e-mail: <mailto:lucene-user-help@jakarta.apache.org>

Re: Field.Text arguments

Joe.Hajek at blackbox

Mar 27, 2002, 4:18 PM

Post #2 of 3

Hi,

That's interesting. If you use Field.Text(String name, Reader value), the field should be indexed but not stored. Strange, I had no problems, but I didn't use a StringReader, just file readers.

Try creating your own customized field, passing a string that is not stored. I don't remember the documentation exactly, but this should be possible by passing the right parameters to the Field constructor.

Regards, Joe
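
As a rough illustration of the "right parameters" idea: if I recall the Lucene 1.x API correctly, the general constructor is Field(String name, String string, boolean store, boolean index, boolean token), and the static factory methods are shorthand for particular combinations of those three flags. The sketch below is plain Java, not Lucene code. FieldFlags and FieldFlagTable are hypothetical names, and the flag values are my recollection of the 1.x javadoc, so please check them against the version you are running.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the three flags a Lucene 1.x Field carries.
// This is NOT Lucene code; it only tabulates the combinations discussed here.
final class FieldFlags {
    final boolean stored;    // original text can be read back from a hit
    final boolean indexed;   // field is searchable
    final boolean tokenized; // value is run through the analyzer

    FieldFlags(boolean stored, boolean indexed, boolean tokenized) {
        this.stored = stored;
        this.indexed = indexed;
        this.tokenized = tokenized;
    }

    @Override
    public String toString() {
        return "stored=" + stored + ", indexed=" + indexed + ", tokenized=" + tokenized;
    }
}

final class FieldFlagTable {
    // Assumed mapping of factory method to flags, recalled from the 1.x javadoc.
    static Map<String, FieldFlags> table() {
        Map<String, FieldFlags> t = new LinkedHashMap<>();
        t.put("Field.Text(String, String)", new FieldFlags(true, true, true));
        t.put("Field.Text(String, Reader)", new FieldFlags(false, true, true));
        t.put("Field.UnStored(String, String)", new FieldFlags(false, true, true));
        t.put("Field.Keyword(String, String)", new FieldFlags(true, true, false));
        t.put("Field.UnIndexed(String, String)", new FieldFlags(true, false, false));
        return t;
    }

    public static void main(String[] args) {
        // Print one line per factory method with its assumed flags.
        for (Map.Entry<String, FieldFlags> e : table().entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

Under that assumption, new Field("contents", text, false, true, true) would behave like the Reader variant: analyzed and searchable, but with no stored value to read back.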

"Robert A. Decker" <decker@robdecker.com> writes on
Thu, 28 Mar 2002 00:22:36 +0100 (MET):

> I'm confused about using Fields.
>
> Here are the two methods that are confusing me:
> public static final Field Text(String name, Reader value)
> public static final Field Text(String name, String value)
>
> The difference is that one takes a reader and the other a string.
>
> I have a field that will have pretty large contents after running through
> my analyzer (1500 to 6000 characters).
>
> When I use the second of the two methods above my string is not run
> through the analyzer, but is stored in the index.
>
> When I use the first method, by passing in a StringReader based on the
> String, I don't get anything indexed at all (and therefore it's difficult
> to know if it was analyzed).
>
> Is there some other Field type that I should be using for text that I want
> analyzed and indexed, and that can be fairly long?
>
> Here's a rough outline of the order in which I'm doing things.
> FragmentAnalyzer is my own custom class that normally seems to work:
>
> Document document = new Document();
> Reader reader = new StringReader(text);
> document.add(Field.Text("contents", reader));
> ...
> FragmentAnalyzer analyzer = new FragmentAnalyzer();
> IndexWriter writer = new IndexWriter(pathToIndex, analyzer,
> isCreateNewIndex);
> writer.addDocument(document);
> writer.close();
>
>
> rob

Re: Field.Text arguments

decker at robdecker

Mar 27, 2002, 4:55 PM

Post #3 of 3

I think I may be confused about the terminology. What is meant by 'not
stored'? The comment on the method that takes a Reader as an argument
states that it 'is tokenized and indexed, but not stored in the index
verbatim'.

I took this to mean that it stores the version of the text after it is run
through the analyzer, which is exactly what I want.

Now that I've looked at the index files more closely, I'm starting to think
that perhaps the text is being stored after all. It's hard to tell, though.

I want to be able to get at the contents of the stored field, and can do
so easily when I use the method that takes a String as an argument.

Here's how I'm trying to get the field back:
Field contentsField = doc.getField("contents");

I get null back when I use the Reader-as-argument Field method, but get
the correct, but unanalyzed, text back when I use the String-as-argument
Field method.
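
The behavior described above is what one would expect if "stored" and "indexed" are two separate things: the analyzer output goes into the searchable term index, while a stored field keeps a verbatim copy of the original string. The toy model below sketches that split in plain Java. ToyIndex, its methods, and the lowercase-and-split analyzer are all hypothetical stand-ins of my own, not Lucene's implementation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model only: a hypothetical in-memory index, not Lucene's implementation.
final class ToyIndex {
    // "field:term" -> ids of documents whose analyzed text contains that term
    private final Map<String, Set<Integer>> postings = new HashMap<>();
    // per document: field name -> original text, kept only for stored fields
    private final List<Map<String, String>> storedFields = new ArrayList<>();

    // Stand-in analyzer: lowercase, then split on anything that is not a-z or 0-9.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) tokens.add(t);
        }
        return tokens;
    }

    // Adds a one-field document and returns its id.
    int addDocument(String field, String text, boolean store) {
        int docId = storedFields.size();
        Map<String, String> stored = new HashMap<>();
        if (store) stored.put(field, text); // verbatim copy, never the tokens
        storedFields.add(stored);
        for (String term : analyze(text)) {
            postings.computeIfAbsent(field + ":" + term, k -> new TreeSet<>()).add(docId);
        }
        return docId;
    }

    // Returns the ids of documents containing the (already analyzed) term.
    Set<Integer> search(String field, String term) {
        return postings.getOrDefault(field + ":" + term, new TreeSet<>());
    }

    // Returns the original text, or null when the field was not stored.
    String getStored(int docId, String field) {
        return storedFields.get(docId).get(field);
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        int unstored = index.addDocument("contents", "Hello, Lucene World!", false);
        int stored = index.addDocument("contents", "Hello, Stored World!", true);
        System.out.println(index.search("contents", "lucene"));    // [0]
        System.out.println(index.getStored(unstored, "contents")); // null
        System.out.println(index.getStored(stored, "contents"));   // Hello, Stored World!
    }
}
```

In this model the unstored document is still found by a search, which would be a more reliable check that the Reader-based field was analyzed and indexed than trying to read the field back.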

thanks,
rob

On Thu, 28 Mar 2002, Joe Hajek wrote:

> Hi,
>
> That's interesting. If you use Field.Text(String name, Reader value), the
> field should be indexed but not stored. Strange, I had no problems, but I
> didn't use a StringReader, just file readers.
>
> Try creating your own customized field, passing a string that is not
> stored. I don't remember the documentation exactly, but this should be
> possible by passing the right parameters to the Field constructor.
>
> Regards, Joe