Hi!
Like many others, I want to use Lucene as a frontend for searching
content which is buried in a relational database. As far as I
can see this should be no problem: I would build one document for
each row in the tables. Since many of you have already taken this
approach, I would appreciate any suggestions on the following
issues:
- Consistency
What is the best way to maintain consistency between the database
and the Lucene index? I can think of two solutions:
- update the index on every insert
- ignore the index at insert and do a full reindex periodically
(e.g. nightly)
- Transactional issues
What is the best way to make a database insert plus an index insert
atomic?
- Content Separation
My content in the database is spread across multiple tables,
but there are clusters of related tables. For example, I have
three tables describing authors of papers. My solution would be a
separate index for each of those clusters. When the user runs
a search, every index must of course be searched separately.
Is maintaining a separate index for every "topic" a good idea?
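
To make the two consistency options above concrete, here is a minimal
sketch in plain Java. It is only an illustration under stated assumptions:
RowIndexSketch, insertAndIndex, insertOnly and reindexAll are made-up
names, the map called rows stands in for a database table, and the map
called index stands in for a Lucene index. No Lucene or JDBC calls are used.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in for the setup in the question. Nothing here is
// Lucene API: `rows` plays the table, `index` plays the search index.
class RowIndexSketch {
    // "Table": primary key -> text column.
    private final Map<Integer, String> rows = new LinkedHashMap<>();
    // "Index": term -> primary keys of the rows containing that term.
    private final Map<String, Set<Integer>> index = new HashMap<>();

    // Option 1: index the new row as part of every insert.
    void insertAndIndex(int id, String text) {
        rows.put(id, text);
        indexRow(id, text);
    }

    // Option 2, step 1: insert only; the index is stale until reindexAll().
    void insertOnly(int id, String text) {
        rows.put(id, text);
    }

    // Option 2, step 2: discard the index and rebuild it from all rows
    // (the "nightly" job).
    void reindexAll() {
        index.clear();
        for (Map.Entry<Integer, String> row : rows.entrySet()) {
            indexRow(row.getKey(), row.getValue());
        }
    }

    // One "document" per row: split the text into lowercase terms.
    private void indexRow(int id, String text) {
        for (String term : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!term.isEmpty()) {
                index.computeIfAbsent(term, t -> new TreeSet<>()).add(id);
            }
        }
    }

    // Returns the primary keys of rows whose indexed text contains the term.
    Set<Integer> search(String term) {
        return index.getOrDefault(term.toLowerCase(Locale.ROOT),
                                  Collections.emptySet());
    }
}
```

With option 1 a search sees a row right after the insert; with option 2
the row stays invisible to search until the next reindexAll() run.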
One might ask why I don't search against the database directly. Well,
I would have to build a search interface (think of boolean queries)
on my own, which is definitely something I do not have time for.
Additionally, my database (PostgreSQL) doesn't support full-text
searches (yet).
Any additional input from your experience is very welcome!
Thx in advance,
Peter
--
To unsubscribe, e-mail: <mailto:lucene-user-unsubscribe@jakarta.apache.org>
For additional commands, e-mail: <mailto:lucene-user-help@jakarta.apache.org>