Hi,
Now I'm finally done and am already thinking about the next LARM developments.
I see the following priorities:
1) Make all properties for a crawl configurable
2) Update documentation and make LARM more public (and probably get more
developers on the project)
3) Implement the "politeness" stuff. Your idea with the
one-thread-per-server seems good to me
4) Implement "sources" and "drains" and rework the processing pipelines so
that document processing can be split into several steps
5) Make "FetcherTask.run()" more generic by moving most of it into the
processing pipelines
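
To make 3) a bit more concrete, here is a minimal sketch of the
one-thread-per-server idea. It is not taken from the LARM code base: the
class name PoliteScheduler, its methods, and the fixed delay are assumptions
made for illustration, and it uses java.util.concurrent instead of LARM's
own thread pool. Each host gets its own single-threaded worker, so requests
to the same server are serialized and spaced out, while different servers
are fetched in parallel.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch, not LARM's actual scheduler.
public class PoliteScheduler {
    // One single-threaded worker per host name.
    private final Map<String, ExecutorService> workers = new LinkedHashMap<>();
    // Pause after each fetch on the same host (assumed fixed delay).
    private final long delayMillis;

    public PoliteScheduler(long delayMillis) {
        this.delayMillis = delayMillis;
    }

    // Queues a fetch on the worker that belongs to the URL's host.
    public synchronized void submit(String url, Runnable fetch) {
        String host = URI.create(url).getHost();
        if (host == null) {
            throw new IllegalArgumentException("URL has no host: " + url);
        }
        ExecutorService worker =
            workers.computeIfAbsent(host, h -> Executors.newSingleThreadExecutor());
        worker.execute(() -> {
            fetch.run();
            try {
                // Wait before this host's next queued request may start.
                Thread.sleep(delayMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
    }

    // Number of distinct hosts seen so far.
    public synchronized int hostCount() {
        return workers.size();
    }

    // Stops accepting work and waits for the queued fetches to finish.
    public synchronized void shutdownAndWait(long timeoutSeconds) {
        for (ExecutorService w : workers.values()) {
            w.shutdown();
        }
        try {
            for (ExecutorService w : workers.values()) {
                w.awaitTermination(timeoutSeconds, TimeUnit.SECONDS);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

A real crawler would cap the number of live threads and retire idle hosts;
the sketch leaves that out to keep the per-server ordering visible.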
Regarding 1), I will try a spike with Avalon Phoenix, which I still believe
perfectly matches what I outlined in the "Configuration RFC" mail. It
provides XML configuration, a split between architecture (assembly.xml) and
properties (config.xml), management of startable/stoppable components,
management of dependencies between components, a management console, etc.
The only glitch: the docs are still painful, there are (afaik) no resources
on the net other than the Avalon website, and it takes some time to
understand the concepts. That in turn raises the entry cost for other
people.
Clemens
----- Original Message -----
From: "Otis Gospodnetic" <otis_gospodnetic@yahoo.com>
To: "Clemens Marschner" <Clemens.Marschner@internet.lmu.de>; "Mehran Mehr"
<mehran@sharif.edu>
Sent: Tuesday, October 15, 2002 7:43 PM
Subject: LARM work
> Mehran, Clemens,
>
> Just curious, now that you are done with studies, Clemens, what is next
> for LARM?
>
> Otis
--
To unsubscribe, e-mail: <mailto:lucene-dev-unsubscribe@jakarta.apache.org>
For additional commands, e-mail: <mailto:lucene-dev-help@jakarta.apache.org>