proposed changes to document (doug) vs AbstractDocument (pt 3/3)
Oops, I just read that title and it looks like the choice is between
Doug and the AbstractDocument class... :-)

That was instead supposed to imply it was an answer to Doug's question
:-).

Referring back to the earlier proposal, there is a lot of mess with
"Document Factory" classes. The idea behind this alternative is that
all of the specific knowledge for reading a file format would be in
subclasses of AbstractDocument. (You'd still have the default Document
class, and it's still final -- no backward compatibility impact).

Basically the AbstractDocument subclasses would take the place of the
Document Factory subclasses in the proposal.
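
To make the shape of that idea concrete, here is a minimal sketch. It
is not Lucene's actual API: AbstractDocument is only the class proposed
here, the Document below is a simplified stand-in for the existing one,
and PlainTextDocument, add(), get() and the field names are hypothetical
names made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the proposed base class; not Lucene's real API.
abstract class AbstractDocument {
    private final Map<String, String> fields = new LinkedHashMap<>();

    // Subclasses populate fields while they parse their own format.
    protected void add(String name, String value) {
        fields.put(name, value);
    }

    public String get(String name) {
        return fields.get(name);
    }
}

// Simplified stand-in for the default Document: still final, so existing
// callers that build documents by hand would be unaffected.
final class Document extends AbstractDocument {
    public void addField(String name, String value) {
        add(name, value);
    }
}

// Format-specific knowledge lives in the subclass instead of in a factory.
class PlainTextDocument extends AbstractDocument {
    PlainTextDocument(String rawText) {
        add("contents", rawText);
        // Assumption for illustration: treat the first line as the title.
        int newline = rawText.indexOf('\n');
        add("title", newline < 0 ? rawText : rawText.substring(0, newline));
    }
}
```

The indexing side would only need to accept an AbstractDocument, so a
hand-built Document and a format-aware subclass look the same to it.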

Anyhow, these patches were for a code impact study. I did this just
before I wrote the getting started guide, and made the following
discoveries:

1. Lucene needs a getting started guide and a Tomcat web demo :-)
2. There was no impact on performance
3. Only 5 existing classes would be impacted (4 of which were minimally
affected)

These patches do not demonstrate the goal, only the impact upon the API.
In the proposal the mime mapping was to "DocumentFactory" classes. These
could be eliminated in favor of common constructor methods in the
AbstractDocument subclasses (probably taking File and InputStream, where
the File constructor would just construct a FileInputStream and pass it
to the InputStream constructor).
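
A rough sketch of that constructor pair, under a few assumptions of
mine: TextFileDocument, setContents() and getContents() are hypothetical
names, the AbstractDocument here is a bare-bones stand-in, and UTF-8 is
assumed for the text.

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Bare-bones stand-in for the proposed base class (hypothetical API).
abstract class AbstractDocument {
    private String contents = "";

    protected void setContents(String text) {
        contents = text;
    }

    public String getContents() {
        return contents;
    }
}

class TextFileDocument extends AbstractDocument {
    // The File constructor only opens a stream and delegates.
    public TextFileDocument(File file) throws IOException {
        this(new FileInputStream(file));
    }

    // All of the format knowledge lives here. This constructor reads the
    // stream to the end and closes it when done.
    public TextFileDocument(InputStream in) throws IOException {
        try (InputStream stream = in) {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = stream.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
            // Assumption: the text is UTF-8 encoded.
            setContents(new String(buffer.toByteArray(), StandardCharsets.UTF_8));
        }
    }
}
```

If every subclass exposes the same two constructors, whatever does the
mime mapping can construct any of them the same way.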

Additionally, for the purpose of containment by the mime mapping, see
the following example of how this could be implemented:
(http://cvs.apache.org/viewcvs/~checkout~/jakarta-poi/src/java/org/apache/poi/hssf/record/RecordFactory.java?rev=1.1&content-type=text/plain).
Look specifically at how the record objects are stored in a map, as well
as at their constructors. Total runtime for the software using this
class increased by only about 2-3 milliseconds over a switch statement
implementation. It sounds complicated, but if you look at the code, it's
actually simpler than the struct version.

For Lucene, obviously we'd load these dynamically based on some
properties file, but that's irrelevant. (For the record, Glen
Stampoultzis and Marc Johnson came up with this method -- I was totally
against it unless they could prove it had no impact on performance...
they were right ;-) ).
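
For what it's worth, a map-of-constructors lookup driven by a
properties mapping could look roughly like the sketch below. This is
not the POI RecordFactory code and not anything in Lucene: the
DocumentRegistry class, the two document classes and the format()
method are hypothetical, and the sketch assumes every document class
has a public constructor taking an InputStream.

```java
import java.io.InputStream;
import java.lang.reflect.Constructor;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

// Bare-bones stand-in for the proposed base class (hypothetical API).
abstract class AbstractDocument {
    abstract String format();
}

class TextDocument extends AbstractDocument {
    public TextDocument(InputStream in) { /* parsing omitted in this sketch */ }
    String format() { return "text"; }
}

class HtmlDocument extends AbstractDocument {
    public HtmlDocument(InputStream in) { /* parsing omitted in this sketch */ }
    String format() { return "html"; }
}

// Maps mime types to constructors, resolved once when the mapping is loaded.
class DocumentRegistry {
    private final Map<String, Constructor<? extends AbstractDocument>> constructors =
            new HashMap<>();

    // Each property is "mime type = fully qualified class name".
    DocumentRegistry(Properties mapping) throws ReflectiveOperationException {
        for (String mimeType : mapping.stringPropertyNames()) {
            Class<? extends AbstractDocument> type =
                    Class.forName(mapping.getProperty(mimeType))
                         .asSubclass(AbstractDocument.class);
            constructors.put(mimeType, type.getConstructor(InputStream.class));
        }
    }

    // One map lookup plus one reflective call; no switch statement needed.
    AbstractDocument create(String mimeType, InputStream in)
            throws ReflectiveOperationException {
        Constructor<? extends AbstractDocument> constructor = constructors.get(mimeType);
        if (constructor == null) {
            throw new IllegalArgumentException("No document class for " + mimeType);
        }
        return constructor.newInstance(in);
    }
}
```

Adding a new format would then mean writing one subclass and adding one
line to the properties file, with no change to the registry itself.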

Please note the patches are only to study the impact, they are not
necessarily a study of the correct way to implement the proposed
changes.

My apologies if I'm slow in replying to anyone: tomorrow I'm leaving
for Boston for a week (I may have limited Internet access). I'll play
catch-up when I get back.

-Andy

--
www.superlinksoftware.com
www.sourceforge.net/projects/poi - port of Excel format to java
http://developer.java.sun.com/developer/bugParade/bugs/4487555.html
- fix java generics!


The avalanche has already started. It is too late for the pebbles to
vote.
-Ambassador Kosh

