Mailing List Archive

Re: LARM work
Hi,

Now I'm finally done and already thinking about the new LARM developments.

I see the following priorities:
1) Make all properties for a crawl configurable
2) Update documentation and make LARM more public (and probably get more
developers on the project)
3) Implement the "politeness" stuff. Your idea with the
one-thread-per-server seems good to me
4) Implement "sources" and "drains" and rework the processing pipelines so
that document processing can be split up into several steps
5) Make "FetcherTask.run()" more generic by moving most of it into the
processing pipelines
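
To make the one-thread-per-server idea from 3) concrete, here is a minimal sketch, not LARM code. It assumes that "politeness" means never running two requests against the same host at once and pausing between requests. The class name PerHostScheduler, its methods, and the fixed delay are all hypothetical and only serve the illustration.

```java
import java.net.URI;
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of LARM: every fetch for a given host is routed
// to that host's own single worker thread, so requests to one server never overlap.
class PerHostScheduler {
    private final Map<String, ExecutorService> workers = new ConcurrentHashMap<>();
    private final long delayMillis; // assumed fixed pause after each fetch

    PerHostScheduler(long delayMillis) {
        this.delayMillis = delayMillis;
    }

    // Lower-cased host name of the URL, used as the key for the worker map.
    static String hostKey(String url) {
        String host = URI.create(url).getHost();
        if (host == null) {
            throw new IllegalArgumentException("no host in URL: " + url);
        }
        return host.toLowerCase(Locale.ROOT);
    }

    // Queues the fetch on the host's worker; the worker sleeps after each fetch.
    void submit(String url, Runnable fetch) {
        ExecutorService worker = workers.computeIfAbsent(
                hostKey(url), h -> Executors.newSingleThreadExecutor());
        worker.execute(() -> {
            try {
                fetch.run();
            } finally {
                try {
                    Thread.sleep(delayMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });
    }

    // Number of distinct hosts seen so far.
    int hostCount() {
        return workers.size();
    }

    // Stops accepting work and waits for queued fetches; false on timeout or interrupt.
    boolean shutdown(long timeoutMillis) {
        for (ExecutorService w : workers.values()) {
            w.shutdown();
        }
        long deadline = System.currentTimeMillis() + timeoutMillis;
        try {
            for (ExecutorService w : workers.values()) {
                long remaining = Math.max(0, deadline - System.currentTimeMillis());
                if (!w.awaitTermination(remaining, TimeUnit.MILLISECONDS)) {
                    return false;
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return true;
    }
}
```

One thread per host is simple but does not scale to crawls with very many hosts; a real implementation would probably multiplex several host queues onto a bounded pool.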

Regarding 1), I will try a spike with Avalon Phoenix, which I still believe
perfectly matches what I outlined in the "Configuration RFC" mail. It
provides XML configuration, a split between architecture (assembly.xml) and
properties (config.xml), management of startable/stoppable components,
management of dependencies between components, a management console, etc.
The only glitch: the docs are still painful, there are (afaik) no resources
on the net other than the Avalon website, and it takes some time to
understand the concepts. That, in turn, raises the entry cost for other
people.

Clemens



----- Original Message -----
From: "Otis Gospodnetic" <otis_gospodnetic@yahoo.com>
To: "Clemens Marschner" <Clemens.Marschner@internet.lmu.de>; "Mehran Mehr"
<mehran@sharif.edu>
Sent: Tuesday, October 15, 2002 7:43 PM
Subject: LARM work


> Mehran, Clemens,
>
> Just curious, now that you are done with studies, Clemens, what is next
> for LARM?
>
> Otis



--
To unsubscribe, e-mail: <mailto:lucene-dev-unsubscribe@jakarta.apache.org>
For additional commands, e-mail: <mailto:lucene-dev-help@jakarta.apache.org>
Re: LARM work
Hi Clemens,

I'm new to both Lucene & LARM and am wondering whether some code from
JoBo (http://sourceforge.net/projects/jobo/) inspired you when
implementing some of LARM's functionality. Currently, I'm using (a highly
modified version of) JoBo to fetch pages from a few sites. I don't think
it's fantastic, but it gets the job done.

I know, for one, that it lets you set a timeout and limit bandwidth.
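
For illustration only, bandwidth limiting of this kind could be sketched as a stream wrapper like the one below. This is not JoBo's or LARM's implementation; the class ThrottledInputStream and its helper minMillisFor are hypothetical names. Timeouts are left out here, since the standard java.net.URLConnection already offers setConnectTimeout and setReadTimeout.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical wrapper (not JoBo or LARM code): caps the average read
// throughput, measured from the moment the stream is created.
class ThrottledInputStream extends FilterInputStream {
    private final long bytesPerSecond;
    private final long startNanos = System.nanoTime();
    private long bytesRead = 0;

    ThrottledInputStream(InputStream in, long bytesPerSecond) {
        super(in);
        if (bytesPerSecond <= 0) {
            throw new IllegalArgumentException("rate must be positive");
        }
        this.bytesPerSecond = bytesPerSecond;
    }

    // Minimum elapsed time in ms after which `bytes` may have been read at this rate.
    static long minMillisFor(long bytes, long bytesPerSecond) {
        return bytes * 1000 / bytesPerSecond;
    }

    @Override
    public int read() throws IOException {
        int b = in.read();
        if (b >= 0) {
            pause(1);
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = in.read(buf, off, len);
        if (n > 0) {
            pause(n);
        }
        return n;
    }

    // Sleeps until the total bytes read no longer exceed the allowed average rate.
    private void pause(int justRead) throws IOException {
        bytesRead += justRead;
        long elapsedMillis = (System.nanoTime() - startNanos) / 1_000_000;
        long waitMillis = minMillisFor(bytesRead, bytesPerSecond) - elapsedMillis;
        if (waitMillis > 0) {
            try {
                Thread.sleep(waitMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IOException("interrupted while throttling", e);
            }
        }
    }
}
```

Because the cap applies to the average since creation, a stream that sat idle for a while can burst afterwards; a token bucket would be the stricter choice.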

Stephane

