I am confused about how Lucene handles the parsing of an HTML document. It
doesn't seem to do any tag stripping (or does it?). Does that mean it
also indexes all the HTML tags? If so, a search for "body" would
return every HTML document previously indexed.
I'd appreciate it if anyone could shed some light on FAQ.10 about
indexing.
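
To make the question concrete, here is a minimal sketch of the kind of tag
stripping I have in mind, applied to the page text before it is handed to the
indexer. It uses only java.util.regex. The class name TagStripSketch and the
method stripTags are made up for illustration and are not part of Lucene, and
a regex like this is only a crude stand-in for a real HTML parser (it will not
cope with comments, scripts, or a ">" inside an attribute value).

```java
import java.util.regex.Pattern;

public class TagStripSketch {
    // Hypothetical helper, not a Lucene API: matches anything that looks like a tag.
    private static final Pattern TAG = Pattern.compile("<[^>]*>");

    public static String stripTags(String html) {
        // Replace each tag with a space so neighbouring words are not glued
        // together, then collapse runs of whitespace and trim the ends.
        return TAG.matcher(html).replaceAll(" ").replaceAll("\\s+", " ").trim();
    }

    public static void main(String[] args) {
        String html = "<html><body><p>Hello <b>Lucene</b></p></body></html>";
        // Only the visible text is left, so the word "body" from the tag
        // would never reach the index.
        System.out.println(stripTags(html));
    }
}
```

If only the output of something like stripTags were indexed, a search for
"body" would match just the documents whose visible text contains that word.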
--
To unsubscribe, e-mail: <mailto:lucene-user-unsubscribe@jakarta.apache.org>
For additional commands, e-mail: <mailto:lucene-user-help@jakarta.apache.org>