Mailing List Archive

lucene search
Hi!

I'm trying to use Lucene search but can't
get it working. I followed the steps
in http://forrest.apache.org/docs/searching.html
but it still doesn't work. It gives me:
cause
java.lang.RuntimeException: java.util.EmptyStackException
request-uri
/lucene-update.html
I'm using forrest-0.6-dev with Jetty (forrest run)
Can anybody help?

Regards
Johannes

--
User Interface Design GmbH * Teinacher Str. 38 * D-71634 Ludwigsburg
Phone +49 (0)7141 377 000 * Fax +49 (0)7141 377 00-99
Branch office: User Interface Design GmbH * Lehrer-Götz-Weg 11 * D-81825
München
www.uidesign.de
Re: lucene search
Johannes Schäfer wrote:
> Hi!
>
> I'm trying to use Lucene search but can't
> get it working. I followed the steps
> in http://forrest.apache.org/docs/searching.html
> but it still doesn't work. It gives me:
> cause
> java.lang.RuntimeException: java.util.EmptyStackException
> request-uri
> /lucene-update.html
> I'm using forrest-0.6-dev with Jetty (forrest run)
> Can anybody help?

This looks like a bug. From build/webapp/WEB-INF/logs/error.log:

FATAL_E (2004-07-13) 12:16.18:764 [core.xslt-processor]
(/lucene-update.html) PoolThread-4/TraxErrorHandler: Error in
TraxTransformer: javax.xml.transform.TransformerException:
java.lang.RuntimeException: org.xml.sax.SAXException: <lucene:index>
element can contain only <lucene:document> elements!


Try commenting out this section in site.xml:

<!--all label="All">
<whole_site_html label="Whole Site HTML" href="site.html"/>
<whole_site_pdf label="Whole Site PDF" href="site.pdf"/>
</all -->


until we find a way to fix it.

Cheers,
Cheche
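
The error message above states the constraint directly: only <lucene:document> elements are accepted inside <lucene:index>. A minimal Python sketch of that check, meant to be run outside Forrest on a saved copy of the index XML. The namespace URI and the sample document are assumptions made for illustration and do not come from this thread.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for the lucene: prefix; check it against your own output.
LUCENE_NS = "http://apache.org/cocoon/lucene/1.0"


def unexpected_children(index_xml):
    """Return the tags of direct children of the root that are not <lucene:document>."""
    root = ET.fromstring(index_xml)
    expected = "{%s}document" % LUCENE_NS
    return [child.tag for child in root if child.tag != expected]


# Invented sample: one valid document entry and one stray element.
SAMPLE_INDEX = """<lucene:index xmlns:lucene="http://apache.org/cocoon/lucene/1.0">
  <lucene:document url="index.html"><title>Welcome</title></lucene:document>
  <all label="All"/>
</lucene:index>"""

print(unexpected_children(SAMPLE_INDEX))
```

An empty list would mean every direct child is a <lucene:document>; anything else names the element that the indexer would presumably reject.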
RE: lucene search
> -----Original Message-----
> From: Juan Jose Pablos [mailto:cheche@che-che.com]
> Sent: Tuesday, July 13, 2004 1:00 PM
> To: user@forrest.apache.org
> Subject: Re: lucene search
>
> Try commenting out this section in site.xml:
>
> <!--all label="All">
> <whole_site_html label="Whole Site HTML" href="site.html"/>
> <whole_site_pdf label="Whole Site PDF" href="site.pdf"/>
> </all -->

Did it.
Now it looks like this:
cause
java.lang.RuntimeException:
org.apache.cocoon.ResourceNotFoundException:
No pipeline matched request: graphics/images.lucene
request-uri
/lucene-update.html

More files missing?!

Cheers
Johannes


Re: lucene search
Johannes Schäfer wrote:
> Now it looks like this:
> cause
> java.lang.RuntimeException:
> org.apache.cocoon.ResourceNotFoundException:
> No pipeline matched request: graphics/images.lucene
> request-uri
> /lucene-update.html
>
> More files missing?!
>
That is what the error message says. Do you have that reference in your
site.xml file?
RE: lucene search
> -----Original Message-----
> From: Juan Jose Pablos [mailto:cheche@che-che.com]
> Sent: Tuesday, July 13, 2004 6:37 PM
> To: user@forrest.apache.org
> Subject: Re: lucene search
>
> That is what the error message says. Do you have that reference in your
> site.xml file?
>

No, my site.xml file does not contain the word "lucene"
at all. I only added the comments around the <all>
section. In my skinconf.xml I have this:
<search name="mysite" domain="localhost" provider="lucene"/>
I don't know where it went wrong. The error log shows
a very similar FATAL_E message, except that it says
No pipeline matched request: graphics/images.lucene

I have no clue :-(

Cheers
Johannes

Re: lucene search
Johannes Schäfer wrote:
>
>
> No, my site.xml file does not contain the word "lucene"
> at all. I just put in the comments around the <all>
> section.

The word "lucene" is added by the search tool. Is there a reference to
something like graphics/image.* ?
RE: lucene search
> -----Original Message-----
> From: Juan Jose Pablos [mailto:cheche@che-che.com]
> Sent: Tuesday, July 13, 2004 7:15 PM
> Subject: Re: lucene search
<snip>
>
> The word lucene has been added by the search tool.
> What about something like graphics/image.* ?

I see: I have graphics/images.xml, and Lucene (?)
tries to build a file graphics/images.lucene but does
not succeed. After removing this file from site.xml I
get an Index Creation Report, which I guess means
success. Yeah! (It still doesn't work with the full HTML/PDF.)

But why "images.xml"? There is nothing special about
it; in fact it's rather short and missing content ;-)
I've tried renaming it (x_images.xml) in site.xml but
this doesn't work. I've tried reducing it to "nothing",
i.e. I took out all its contents. That doesn't work either.
The file itself has no special flags set (e.g. r/o).

Any idea?

Cheers
Johannes

RE: lucene search
Johannes Schäfer wrote:
> But why "images.xml"? There is nothing special about
> it, in fact it's rather short and missing content ;-)
> I've tried renaming it (x_images.xml) in site.xml but
> this doesn't work. I've tried reducing it to "nothing",
> i.e. took out all contents. Doesn't work.
> the file itself has no special flags set (e.g. r/o).
>
> Any idea?

It sounds like a default sitemap match is getting in your way.
Do the Cocoon log files in build/webapp/WEB-INF/logs/ reveal anything?

--
David Crossley
RE: lucene search
> -----Original Message-----
> From: David Crossley [mailto:crossley@apache.org]
> Sent: Wednesday, July 14, 2004 11:04 AM
> To: user@forrest.apache.org
> Subject: RE: lucene search
>
> It sounds like a default sitemap match is getting in your way.
> Do the cocoon logfiles reveal anything?
> build/webapp/WEB-INF/logs/

Yes, I forgot about regexp matches; renaming the
beast to "visuals.xml" works. Thanks!

It looks like images.xml conflicts with the images/
directory used for images. Is this a bug or a feature?!

Cheers
Johannes

RE: lucene search
Johannes Schäfer wrote:
> David Crossley wrote:
> >
> > It sounds like a default sitemap match is getting in your way.
> > Do the cocoon logfiles reveal anything?
> > build/webapp/WEB-INF/logs/
>
> Yes, I forgot about regexp matches; renaming the
> beast to "visuals.xml" works. Thanks!
>
> It looks like images.xml conflicts with the images/
> directory used for images. Is this a bug or a feature?!

A bug. The sitemap should be smarter than that.

--
David Crossley
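
One way to picture the clash discussed above is ordered pattern matching in which the first pattern that matches a request decides who handles it. A minimal Python sketch under that assumption; the two regular expressions and the handler names are hypothetical stand-ins, not the patterns in Forrest's actual sitemap.

```python
import re

# Hypothetical, ordered routing table: an over-broad image rule placed
# ahead of the rule for search-index requests.
ROUTES = [
    (re.compile(r"^.*images.*$"), "image-resources"),
    (re.compile(r"^.*\.lucene$"), "lucene-index"),
]


def first_match(uri):
    """Return the handler name of the first pattern that matches uri, or None."""
    for pattern, handler in ROUTES:
        if pattern.match(uri):
            return handler
    return None


for uri in ("graphics/images.lucene", "graphics/visuals.lucene"):
    print(uri, "->", first_match(uri))
```

With rules like these, a request for graphics/images.lucene never reaches the index rule, while the renamed graphics/visuals.lucene does, which would be consistent with the rename fixing the problem.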
Re: Lucene Search
On Mon, Nov 8, 2010 at 7:25 AM, Szabo, Patrick (LNG-VIE)
<patrick.szabo@lexisnexis.at> wrote:
> Hi,
>
> I'm new to Forrest, and my boss in all its wisdom decided that I have to
> administer our Forrest installation now.
> I wasn't involved in our installation until now.
>
> I've already read the Forrest documentation and it did help me, but
> there are quite a few questions that I couldn't answer.

Yeah, in trying to answer your question I realize that the
documentation in this area is weak. After you understand this stuff,
it'd be great if you could contribute to it.

> I understand that I can use fields to search with Lucene. Is there a
> list somewhere where I can see which fields are available?

For any given document, you can request it with a .lucene extension, and
the element names in the response are the searchable field names. These
are mostly title, subtitle, abstract, version, author, and content.
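
A minimal Python sketch of reading the field names out of such a .lucene view once it has been saved as text. The sample document and the namespace URI in it are invented for illustration; real output may be structured differently.

```python
import xml.etree.ElementTree as ET


def field_names(lucene_xml):
    """List the un-namespaced element names, in document order, without repeats."""
    names = []
    for element in ET.fromstring(lucene_xml).iter():
        # ElementTree writes namespaced tags as "{uri}name"; skip those wrappers.
        if not element.tag.startswith("{") and element.tag not in names:
            names.append(element.tag)
    return names


# Invented sample of what a .lucene view might contain.
SAMPLE = """<lucene:document xmlns:lucene="http://apache.org/cocoon/lucene/1.0" url="index.html">
  <title>Welcome</title>
  <abstract>Short summary</abstract>
  <content>Body text</content>
</lucene:document>"""

print(field_names(SAMPLE))
```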

> Can I somehow add fields?

I don't think there's a way to add per-project fields, but for your
Forrest installation, you can try adding them to
$FORREST_HOME/main/webapp/resources/stylesheets/xdoc-to-lucene.xsl

> We have an element <revised modified="23.06.2010"/> in our source XML
> and I would like to be able to search by that date. E.g. I want to see
> all the documents that were modified last week.

Hmm... I'm not sure, I reckon you'd have to index it based on its xdoc
equivalent.
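
One thing to watch for with a date like 23.06.2010: as a plain string, DD.MM.YYYY does not sort chronologically, so a range search over it would not behave like a date range. A minimal Python sketch of normalizing the value before indexing and building a range query in Lucene's classic query syntax. The field name "modified" is hypothetical, and whether a given search front end passes such a query through unchanged is an assumption.

```python
from datetime import date, datetime, timedelta

FIELD = "modified"  # hypothetical field name, not one Forrest defines


def to_sortable(value):
    """Convert 'DD.MM.YYYY' to 'YYYYMMDD' so that string order equals date order."""
    return datetime.strptime(value, "%d.%m.%Y").strftime("%Y%m%d")


def last_week_query(today):
    """Build an inclusive range query from seven days before today through today."""
    start = today - timedelta(days=7)
    return "%s:[%s TO %s]" % (FIELD, start.strftime("%Y%m%d"), today.strftime("%Y%m%d"))


print(to_sortable("23.06.2010"))
print(last_week_query(date(2010, 11, 9)))
```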

> I've got a few more questions but I don't want to pack them all into just
> one mail.

Yeah, one per thread is always preferred - mail threads are cheap though:)

I'm not sure what version of Forrest you are on but I had to do some
hacking just to get the indexing to work - I reckon it's been a
long-standing bug. I'll take that up on the dev@ list though. Good
luck!

--tim
Re: Lucene Search
Hi,

Thanks for your response.

I thought it would have something to do with xdoc-to-lucene.xsl, but I don't know how to implement a new field in that stylesheet. I can see how the other fields are stored, but I don't know what the xdoc version of my files looks like, so I can't extend the existing template.

I could just take a look at what the xml2xdoc stylesheet does, but it would be a lot easier if I could look at an actual xdoc file. Is there a way to store those intermediate xdoc files?

I'm not very familiar with Cocoon yet, so that might be a dumb question.

The pipeline goes like this: our XML (DITA) -> xdoc -> HTML, PDF, ... Right?

Thanks

kind regards


. . . . . . . . . . . . . . . . . . . . . . . . . .
Patrick Szabo
XSLT Developer
LexisNexis
Marxergasse 25, 1030 Wien

mailto:patrick.szabo@lexisnexis.at
Tel.: +43 (1) 534 52 - 1573
Fax: +43 (1) 534 52 - 146

http://shop.lexisnexis.at/
-----Original Message-----

From: Tim Williams [mailto:williamstw@gmail.com]
Sent: Tuesday, 9 November 2010 04:08
To: user@forrest.apache.org
Subject: Re: Lucene Search

<snip>
Re: Lucene Search
Szabo, Patrick (LNG-VIE) wrote:
>
> I thought it would have something to do with xdoc-to-lucene.xsl, but I don't know how to implement a new field in that stylesheet. I can see how the other fields are stored, but I don't know what the xdoc version of my files looks like, so I can't extend the existing template.
>
> I could just take a look at what the xml2xdoc stylesheet does, but it would be a lot easier if I could look at an actual xdoc file. Is there a way to store those intermediate xdoc files?

Do 'forrest run' then request
localhost:8888/index.xml

See some other tips in
http://forrest.apache.org/howto-dev.html

Also perhaps see the main/webapp/search.xmap
which will show some other internal requests
that might be useful such as
localhost:8888/index.lucene

> I'm not very familiar with Cocoon yet, so that might be a dumb question.
>
> The pipeline goes like this: our XML (DITA) -> xdoc -> HTML, PDF, ... Right?

Yes.

-David
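
To keep copies of those intermediate views, the requests above can be scripted while 'forrest run' is active. A minimal Python sketch; the http:// scheme, the helper names, and the idea that every page offers the same .xml and .lucene views as index are assumptions, since the thread only mentions index.xml and index.lucene.

```python
from pathlib import PurePosixPath
from urllib.request import urlopen

BASE = "http://localhost:8888/"  # host and port as quoted in the thread


def intermediate_urls(page, base=BASE):
    """Map a page such as 'index.html' to the URLs of its .xml and .lucene views."""
    stem = str(PurePosixPath(page).with_suffix(""))
    return {ext: f"{base}{stem}.{ext}" for ext in ("xml", "lucene")}


def save_intermediate(page, ext="xml", base=BASE):
    """Download one intermediate view into the current directory (needs a running server)."""
    url = intermediate_urls(page, base)[ext]
    target = PurePosixPath(page).with_suffix("." + ext).name
    with urlopen(url) as response, open(target, "wb") as out:
        out.write(response.read())
    return target


print(intermediate_urls("index.html"))
```

Calling save_intermediate("index.html") would then write index.xml next to the script, provided the local server is answering.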
Re: Lucene Search
Thanks a lot, David - your tips have already helped me ^^

kind regards




-----Original Message-----

From: David Crossley [mailto:crossley@apache.org]
Sent: Wednesday, 10 November 2010 15:23
To: user@forrest.apache.org
Subject: Re: Lucene Search

<snip>