Mailing List Archive: jdk 1.1.8?

I have three questions:

1.
The jguru faq says that it's possible to use lucene with the 1.1.8
jdk. However, there are quite a few calls to StringBuffer methods that
aren't present at that version of the jdk - things like delete,
deleteCharAt, substring...

Do people use a different version of StringBuffer with the 1.1.8 jdk?

2. What is a .jj file and how do I get it to build a StandTokenizer,
StandardTokenizerConstants, etc? Is this part of some third-party build
system?

3.
I'm wondering if Lucene will be a good choice for what I want to use it
for, and would like some third-person opinions.

I build a web application that has, as part of the overall application, a
document management system for managing medical information documents
concerning specific drugs, indications, etc... The application is in use
by health-care professionals at biotech/pharmaceutical companies.

Currently, we store these documents in Oracle as rtf files. When a user
wants to build a letter they select the files they want to include in the
letter and the app merges these, inserts some customer-related info, and
produces an rtf document.

However, we currently don't have full-text searching of the documents that
the user uses to build the letters.

We want to add full-text searching of documents. When a user uploads a
document I will strip out the raw text from the rtf and then add it to a
field in our table. However, before I do that I may want to run the text
through lucene.

I assume lucene will strip out redundant words, and do some additional
processing, like removing common words, removing 's' and 'es' endings,
etc. Even converting the word 'application' to 'applic'. (I assume Lucene
is related to AIAT - I believe they're the same author)

This is the text that I'll actually add to the database.

Then, when a user runs a query to search for a document, I will take their
query and run it through Lucene, and then send the modified query to
oracle.

The biggest reason I want to do this is to cut down on the number of
characters we store in Oracle - this will speed up searches.

I don't want to store the files on the filesystem, so I don't think I can
use the indexing options.

Let me know what you think.

thanks,
rob

--
To unsubscribe, e-mail: <mailto:lucene-user-unsubscribe@jakarta.apache.org>
For additional commands, e-mail: <mailto:lucene-user-help@jakarta.apache.org>

I can answer question 2.

.jj files are JavaCC projects. You can download and install JavaCC from the internet (just run a search fo it from Yahoo). Once you install it, simply run the command "javacc xxxxx.jj" (from javacc's bin directory) where xxxxx.jj is the .jj file you wish to unpack. This will expand hte .jj file into many .java files which can then be compiled as per normal.

>>> decker@robdecker.com 03/13/02 06:07AM >>>
I have three questions:

1.
The jguru faq says that it's possible to use lucene with the 1.1.8
jdk. However, there are quite a few calls to StringBuffer methods that
aren't present at that version of the jdk - things like delete,
deleteCharAt, substring...

Do people use a different version of StringBuffer with the 1.1.8 jdk?

2. What is a .jj file and how do I get it to build a StandTokenizer,
StandardTokenizerConstants, etc? Is this part of some third-party build
system?

3.
I'm wondering if Lucene will be a good choice for what I want to use it
for, and would like some third-person opinions.

I build a web application that has, as part of the overall application, a
document management system for managing medical information documents
concerning specific drugs, indications, etc... The application is in use
by health-care professionals at biotech/pharmaceutical companies.

Currently, we store these documents in Oracle as rtf files. When a user
wants to build a letter they select the files they want to include in the
letter and the app merges these, inserts some customer-related info, and
produces an rtf document.

However, we currently don't have full-text searching of the documents that
the user uses to build the letters.

We want to add full-text searching of documents. When a user uploads a
document I will strip out the raw text from the rtf and then add it to a
field in our table. However, before I do that I may want to run the text
through lucene.

I assume lucene will strip out redundant words, and do some additional
processing, like removing common words, removing 's' and 'es' endings,
etc. Even converting the word 'application' to 'applic'. (I assume Lucene
is related to AIAT - I believe they're the same author)

This is the text that I'll actually add to the database.

Then, when a user runs a query to search for a document, I will take their
query and run it through Lucene, and then send the modified query to
oracle.

The biggest reason I want to do this is to cut down on the number of
characters we store in Oracle - this will speed up searches.

I don't want to store the files on the filesystem, so I don't think I can
use the indexing options.

Let me know what you think.

thanks,
rob

--
To unsubscribe, e-mail: <mailto:lucene-user-unsubscribe@jakarta.apache.org>
For additional commands, e-mail: <mailto:lucene-user-help@jakarta.apache.org>

--
To unsubscribe, e-mail: <mailto:lucene-user-unsubscribe@jakarta.apache.org>
For additional commands, e-mail: <mailto:lucene-user-help@jakarta.apache.org>