On Tuesday, April 30, 2002, at 10:46 PM, Otis Gospodnetic wrote:
> Hm, this should be a FAQ.
Maybe it should... ;-)
> Check Lucene contributions page, there are some starting points there,
Well, this seems to be a very popular request... In fact, I need
something like that too. Unfortunately, there seems to be no
authoritative answer on how to convert PDF files to text in a pure
Java environment... Maybe I'm missing something here, as usual?
Also, on a related note, what would be a good approach to converting
arbitrary documents into PDF? I was thinking of a two-step process for
document indexing in Lucene:
- First, convert everything to PDF (with Acrobat or something)
- Second, convert the PDF to text and index it.
Any practical suggestions about how to do that in a pure Java
environment would be very welcome.
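To make the idea concrete, here is a rough sketch of the shape I have in mind. Every name in it (PdfConverter, TextExtractor, TwoStepPipeline) is made up for illustration, the in-memory map is only a stand-in for a real Lucene index, and the two converters are placeholders, since the actual conversion code is exactly what I am missing:

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical names throughout; none of this is Lucene API.
interface PdfConverter {
    // Step 1: turn an arbitrary document into PDF bytes.
    byte[] toPdf(byte[] original);
}

interface TextExtractor {
    // Step 2: turn PDF bytes into plain text.
    String extract(byte[] pdf);
}

class TwoStepPipeline {
    // Stand-in for a Lucene index: term -> sorted set of document names.
    private final Map<String, Set<String>> index = new HashMap<>();
    private final PdfConverter converter;
    private final TextExtractor extractor;

    TwoStepPipeline(PdfConverter converter, TextExtractor extractor) {
        this.converter = converter;
        this.extractor = extractor;
    }

    // Run both steps on one document and add its terms to the index.
    void add(String name, byte[] content) {
        String text = extractor.extract(converter.toPdf(content));
        for (String term : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!term.isEmpty()) {
                index.computeIfAbsent(term, k -> new TreeSet<>()).add(name);
            }
        }
    }

    // Names of the documents containing the given term (empty set if none).
    Set<String> search(String term) {
        return index.getOrDefault(term.toLowerCase(Locale.ROOT), new TreeSet<>());
    }

    public static void main(String[] args) {
        // Placeholders: an identity "converter" and an "extractor" that
        // treats the bytes as UTF-8 text. Real ones would handle PDF.
        PdfConverter identity = original -> original;
        TextExtractor plain = bytes -> new String(bytes, StandardCharsets.UTF_8);

        TwoStepPipeline pipeline = new TwoStepPipeline(identity, plain);
        pipeline.add("a.pdf", "Lucene indexes text".getBytes(StandardCharsets.UTF_8));
        System.out.println(pipeline.search("lucene"));
    }
}
```

Keeping the two steps behind separate interfaces would let either one be swapped out (a native tool today, a pure Java library later) without touching the indexing side.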
Thanks :-)
PA.
--
To unsubscribe, e-mail: <mailto:lucene-user-unsubscribe@jakarta.apache.org>
For additional commands, e-mail: <mailto:lucene-user-help@jakarta.apache.org>