Hello:
I'm investigating Lucene as a replacement for a special-purpose search
technology that was developed long before Lucene (or any of the current IR
libraries) became available.
The use case involves so-called print streams. Imagine 20,000 statements
concatenated into one large file suitable for delivery to a print system.
The document formats vary, but include AFP (an IBM printer format), PCL (an
HP format), PostScript, PDF, and even plain text.
The indexing application must track the total page count of the embedded
statements. On a hit, the search application must extract and return the
[possibly multi-page] statement embedded within the larger print-stream
file.
How would the search application learn from Lucene (or from the indexer)
the extent of each embedded statement, i.e. where it starts and ends within
the print-stream file?
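To make the question concrete, here is a rough sketch (in Python, independent
of any Lucene API) of what I imagine the search side needing on a hit. The
StatementExtent record, its field names, and extract_statement are all
hypothetical; the assumption is that the indexer records these values per
statement and that Lucene can hand them back with the hit, for example as
stored fields. It also assumes each statement occupies one contiguous byte
range, which may not hold for every format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StatementExtent:
    """Hypothetical per-statement record the indexer would keep with each hit."""
    stream_path: str   # print-stream file that contains the statement
    byte_offset: int   # where the statement starts in that file
    byte_length: int   # how many bytes the statement occupies
    page_count: int    # pages in the statement, recorded at index time


def extract_statement(extent: StatementExtent) -> bytes:
    """Read exactly the statement's bytes out of the larger print stream."""
    with open(extent.stream_path, "rb") as stream:
        stream.seek(extent.byte_offset)
        data = stream.read(extent.byte_length)
    # A short read means the recorded extent does not match the file.
    if len(data) != extent.byte_length:
        raise ValueError("extent runs past the end of the print stream")
    return data
```

If something along these lines is workable, the search application would
never need to parse the print stream itself; it would only seek and read.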
I'm not seeing this scenario discussed in forums or books. Does anyone have
comments or thoughts on Lucene's applicability as a solution?
Thanks.
Brad
--
View this message in context: http://www.nabble.com/Investigating-Lucene-for-Applicability-to--Unusual---Use-Case-tf3917031.html#a11106468
Sent from the Lucene - General mailing list archive at Nabble.com.