Hi folks,
Just a little advertising message for those who are interested in semantic
expansions:
http://kant.lingway.com/DemoUN is a demo of a multilingual IR system based
on Lucene.
Please take a look at it - feedback is welcome!
Julien
----- Original Message -----
From: "Peter Carlson" <carlson@bookandhammer.com>
To: "Lucene Developers List" <lucene-dev@jakarta.apache.org>
Sent: Wednesday, May 15, 2002 7:06 AM
Subject: Re: Adding a TermExpansionQuery
> Hi Eric,
>
> Thanks for the feedback. My intention was to abstract the source, but one
> of my questions was: does Lucene have a configuration file that would
> enable this "Thesaurus" query, or will that have to be set up manually by
> the developer?
>
> Currently, Lucene does not provide a configuration file.
>
> As for whether the information belongs in the index directory: I was
> thinking this might be a nice place for it, since then it doesn't add any
> other overhead to the system (i.e. no configuration file) and it might be
> easier to support multiple sources, since the index has already been
> abstracted. If you wanted to share the "Thesaurus" across many different
> indices, you could "copy" or "merge" that index component into the data
> source. This could even be part of the build process for a file system.
>
> --Peter
>
> On 5/15/02 6:45 AM, "Eric D. Friedman" <eric@conveysoftware.com> wrote:
>
> > Whichever storage mechanism you choose, you should be sure to abstract
> > its interface so that people can make other choices. With that out of
> > the way, it doesn't matter too much whether you pick a properties file
> > or an XML file.
> >
> > That said, I wouldn't expect to find this data stored in the index
> > directory, since it's not part of the index and since users may want to
> > share the data across several indices. I would also lean toward the
> > XML file (for a file solution, that is -- an RDBMS should be supported
> > too), since that lends itself more naturally to describing one-to-many
> > relations than a properties file does.
> >
> > Personal opinion: "Thesaurus" is a more descriptive term than
> > "TermExpansion." To me, term expansion suggests some kind of text
> > globbing, whereas a thesaurus is a reference (a "lookup table") that
> > provides *semantic* expansions of the kind you describe. Oracle's
> > interMedia indexing engine has thesaurus features similar to what you
> > describe and calls them by that name.
>
>
> --
> To unsubscribe, e-mail: <mailto:lucene-dev-unsubscribe@jakarta.apache.org>
> For additional commands, e-mail: <mailto:lucene-dev-help@jakarta.apache.org>
>
>
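For anyone following the thread, here is a rough sketch of what Eric's
suggestion (abstract the storage interface, and treat the thesaurus as a
one-to-many lookup table) might look like in Java. Nothing here is Lucene
API: Thesaurus, InMemoryThesaurus, ThesaurusDemo and termsFor are
hypothetical names made up for illustration, and the in-memory map only
stands in for whatever backing store (properties file, XML file, RDBMS, or
the index directory) is eventually chosen.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical interface, not part of Lucene. It hides where the thesaurus
// data lives (properties file, XML file, RDBMS, or the index directory).
interface Thesaurus {
    // Returns the related terms for a term; empty if the term is unknown.
    Set<String> expand(String term);
}

// Simplest possible backing store, standing in for a file- or
// database-backed implementation.
class InMemoryThesaurus implements Thesaurus {
    private final Map<String, Set<String>> entries = new HashMap<>();

    // Records a one-to-many relation: one term, many expansions.
    void add(String term, String... expansions) {
        Set<String> set = entries.computeIfAbsent(term, k -> new LinkedHashSet<>());
        Collections.addAll(set, expansions);
    }

    @Override
    public Set<String> expand(String term) {
        Set<String> found = entries.get(term);
        return found == null
                ? Collections.<String>emptySet()
                : Collections.unmodifiableSet(found);
    }
}

class ThesaurusDemo {
    // Builds the full set of terms to search for: the original term first,
    // followed by its expansions in insertion order.
    static Set<String> termsFor(String term, Thesaurus thesaurus) {
        Set<String> terms = new LinkedHashSet<>();
        terms.add(term);
        terms.addAll(thesaurus.expand(term));
        return terms;
    }

    public static void main(String[] args) {
        InMemoryThesaurus thesaurus = new InMemoryThesaurus();
        thesaurus.add("car", "automobile", "vehicle");
        System.out.println(termsFor("car", thesaurus));  // [car, automobile, vehicle]
        System.out.println(termsFor("boat", thesaurus)); // [boat]
    }
}
```

Since the calling code only sees the Thesaurus interface, an XML-backed or
RDBMS-backed implementation could be swapped in later, and the same
thesaurus could be shared across several indices without touching the
search code.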