Mailing List Archive: OUTOFMEMORY ERROR

OUTOFMEMORY ERROR

melola at seinet

Jul 6, 2005, 10:22 AM

Post #1 of 10 (6810 views)

Hi, I have a problem when I run a simple query, without sorting, against an index with 210,000 documents.
After executing the query several times I get an OutOfMemoryError.
I am creating a new IndexSearcher(pathDir) for every search.
I don't know whether I need to create only one IndexSearcher and cache it.
If I search an index with only 50,000 documents, the OutOfMemoryError doesn't appear.
------------------------
ENVIRONMENT DESCRIPTION:
------------------------

---SERVER---
MEMORY 2GB
APP SERVER Jboss3.2.3
JAVA_OPTS -Xmx640M -Xms640M

----LUCENE 1.4.3-------
INDEX approx. 210,000 documents
EACH DOCUMENT approx. 20 fields (metadata)
SIZE TEXT DOCUMENT 1k

------------------------
ERROR:
------------------------
18:52:18,657 ERROR [LogInterceptor] Unexpected Error:
java.lang.OutOfMemoryError
18:52:18,657 ERROR [LogInterceptor] Unexpected Error:
java.lang.OutOfMemoryError
18:52:18,660 ERROR [STDERR] java.rmi.ServerError: Unexpected Error; nested exception is:
java.lang.OutOfMemoryError
18:52:18,661 ERROR [STDERR] at org.jboss.ejb.plugins.LogInterceptor.handleException(LogInterceptor.java:374)
18:52:18,661 ERROR [STDERR] at org.jboss.ejb.plugins.LogInterceptor.invoke(LogInterceptor.java:195)
18:52:18,661 ERROR [STDERR] at org.jboss.ejb.plugins.ProxyFactoryFinderInterceptor.invoke(ProxyFactoryFinderInterceptor.java:122)
18:52:18,662 ERROR [STDERR] at org.jboss.ejb.StatelessSessionContainer.internalInvoke(StatelessSessionContainer.java:331)
18:52:18,662 ERROR [STDERR] at org.jboss.ejb.Container.invoke(Container.java:700)
18:52:18,662 ERROR [STDERR] at sun.reflect.GeneratedMethodAccessor40.invoke(Unknown Source)
18:52:18,662 ERROR [STDERR] at sun.reflect.DelegatingMethodAccessorImpl.invok
.
.
Exception java.lang.OutOfMemoryError: requested 4 bytes for CMS: Work queue overflow; try -XX:-CMSParallelRemarkEnabled. Out of swap space?

Could anybody help me???

Thanks in advance

Mari Luz
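
Note on the open question above of whether to create a single IndexSearcher and cache it: the usual Lucene advice is to open one searcher and share it between searches, because every new searcher loads its own index structures and searchers that are never closed keep holding them. The class below is a minimal, library-free sketch of that caching pattern, not code from this thread. `SharedInstance` and its methods are hypothetical names, and the factory stands in for something like `new IndexSearcher(pathDir)`.

```java
import java.util.function.Supplier;

/** Lazily creates one expensive object and hands the same instance to every caller. */
final class SharedInstance<T> {
    private final Supplier<T> factory; // how to open the resource (hypothetical stand-in)
    private T instance;                // created on first use, then reused

    SharedInstance(Supplier<T> factory) {
        this.factory = factory;
    }

    /** Returns the cached instance, opening it only on the first call. */
    synchronized T get() {
        if (instance == null) {
            instance = factory.get();
        }
        return instance;
    }

    /** Drops the cached instance so the next get() reopens it, e.g. after the index changes. */
    synchronized void invalidate() {
        instance = null;
    }
}
```

With Lucene the factory would wrap the searcher constructor and its checked IOException, and a real version should also close the old searcher inside `invalidate()`; both details are left out here to keep the sketch self-contained.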

Re: OUTOFMEMORY ERROR

erik at ehatchersolutions

Jul 6, 2005, 11:12 AM

Post #2 of 10 (6643 views)

We'll need some more details to help. What query was it?

Erik


Re: OUTOFMEMORY ERROR

melola at seinet

Jul 7, 2005, 3:02 AM

Post #3 of 10 (6647 views)

The query is ==> ID:0*
This query returns all the documents, exactly 210,000 documents.
If the user doesn't specify any criteria in the search user interface,
the server searches all the documents.

Mari Luz

---------------------------------------------------
Mari Luz Elola
Developer Engineer
Caleruega, 67
28033 Madrid (Spain)
Tel.: +34 91 768 46 58
mailto: melola@seinet.es
---------------------------------------------------
Privileged/Confidential Information may be contained in this message and is
intended solely for the use of the named addressee(s). Access to this e-mail
by anyone else is unauthorised. If you are not the intended recipient, any
disclosure, copying, distribution or re-use of the information contained in
it is prohibited and may be unlawful. Opinions, conclusions and any other
information contained in this message that do not relate to the official
business of Seinet shall be understood as neither given nor endorsed by it.
If you have received this communication in error, please notify us
immediately by replying to this mail and deleting it from your computer.
Thank you.

Re: OUTOFMEMORY ERROR

erik at ehatchersolutions

Jul 7, 2005, 5:46 AM

Post #4 of 10 (6625 views)

On Jul 7, 2005, at 6:02 AM, MariLuz Elola wrote:
> The query is ==> ID:0*
> This query returns all the documents, exactly 210,000 documents.
> If the user doesn't specify any criteria in the search user
> interface, the server searches all the documents.

Doing a prefix query (which ID:0* is) internally builds a
BooleanQuery OR'ing all unique terms in the ID field that begin with
a "0". The built-in limit is 1,024 clauses in a BooleanQuery.

You will need to re-think your approach. If the goal is to return
all documents, then use IndexReader to walk them. If the goal is to
have a general user query expression where ID:0* would be entered you
will need to account for that possibility with more system resources
and bumping up the BooleanQuery limit or indexing differently so that
there are not so many terms being put into the BooleanQuery. It is
difficult to offer specific advice as I'm not sure what your use
cases are.
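
To make the prefix expansion described above concrete, here is a small library-free sketch of the idea: every indexed term that starts with the prefix becomes one clause, and the expansion fails once the clause limit is passed. It is only a model, not Lucene's actual implementation; `PrefixExpansion`, `expand` and the exception type are hypothetical names, and 1,024 is the default limit quoted above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedSet;

final class PrefixExpansion {
    /** Mirrors the default clause limit mentioned above. */
    static final int MAX_CLAUSES = 1024;

    /** Collects every term starting with prefix, failing once maxClauses would be exceeded. */
    static List<String> expand(SortedSet<String> terms, String prefix, int maxClauses) {
        List<String> clauses = new ArrayList<>();
        // tailSet jumps to the first term >= prefix; matching terms are contiguous from there.
        for (String term : terms.tailSet(prefix)) {
            if (!term.startsWith(prefix)) {
                break; // past the last term that shares the prefix
            }
            if (clauses.size() >= maxClauses) {
                throw new IllegalStateException("more than " + maxClauses + " clauses");
            }
            clauses.add(term); // one OR clause per unique term
        }
        return clauses;
    }
}
```

If every document has its own ID starting with "0", the expansion needs one clause per document, so an index of 210,000 documents only gets past the limit when it is raised that far, and the query then carries 210,000 clauses.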

Erik

Re: OUTOFMEMORY ERROR

melola at seinet

Jul 7, 2005, 6:40 AM

Post #5 of 10 (6642 views)

Thanks Erik,
I was wrong: the query that actually throws the OutOfMemoryError is ==>
ID:0* -ID:xtent.
I have tried to reproduce the error with the query ID:0*, but the exception
doesn't appear.
I will use IndexReader instead of IndexSearcher to get all the
documents. It's a good idea.
Another thing: when the user searches without entering any query, internally I
create the following query ==> ID:0* OR NOT ID:xtent. When QueryParser parses
this query I get ID:0* -ID:xtent (which translates to ID:0* AND NOT
ID:xtent), doesn't it? Is QueryParser working incorrectly?
About maxClauseCount (1024 by default), I am setting this property:
org.apache.lucene.search.BooleanQuery.maxClauseCount=es.seinet.xtent.searchEngine.lucene.general.Util.MAX_LUCENE_DOCUMENTS;

Mari Luz


Re: OUTOFMEMORY ERROR

melola at seinet

Jul 7, 2005, 7:16 AM

Post #6 of 10 (6631 views)

Erik, I have a problem.
First, I have created several IndexWriters.
One of them has 210,000 documents, and in the future there will be IndexWriters
with millions of documents.
I need to obtain all the documents.
I am searching with the query ID:0* because it returns all the
documents.
Specifically, I am reading the ID metadata field (hits.doc(start).get(.ID)) to
get the IDs of all the documents of a specific IndexWriter.
I get an out-of-memory error doing this.
About maxClauseCount (by default 1024), I am setting this property:
org.apache.lucene.search.BooleanQuery.maxClauseCount=es.seinet.xtent.searchEngine.lucene.general.Util.MAX_LUCENE_DOCUMENTS;
You gave me an idea: to use IndexReader instead of IndexSearcher to
get all the documents.
I think that it is not possible to use IndexReader, because I need the ID,
not the physical files:

Directory directory = FSDirectory.getDirectory(path, false);
IndexReader reader = IndexReader.open(directory);
for (int i = 0; i < reader.maxDoc(); i++) ............

Moreover, "directory" has all the documents of all the IndexWriters.

Mari Luz


Re: OUTOFMEMORY ERROR

melola at seinet

Jul 7, 2005, 7:28 AM

Post #7 of 10 (6637 views)

Sorry, I was wrong again.
I can use IndexReader... forget the last email :-D
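
For readers following the IndexReader idea, the loop can be pictured as below. This is a library-free model under stated assumptions, not Lucene code: the array length plays the role of `reader.maxDoc()`, the `BitSet` plays the role of `reader.isDeleted(i)`, and each array entry plays the role of the stored ID field of document `i`. `DocWalk` and `forEachId` are hypothetical names. The point of the pattern is that each ID is handed to the caller as it is read, so nothing forces all 210,000 results to be held at once.

```java
import java.util.BitSet;
import java.util.function.Consumer;

final class DocWalk {
    /**
     * Visits the stored ID of every live document, one at a time.
     * Returns how many documents were visited.
     */
    static int forEachId(String[] storedIds, BitSet deleted, Consumer<String> sink) {
        int visited = 0;
        // Document numbers run from 0 up to (but not including) the maximum document number.
        for (int doc = 0; doc < storedIds.length; doc++) {
            if (deleted.get(doc)) {
                continue; // deleted slots are still numbered, so they must be skipped
            }
            sink.accept(storedIds[doc]); // hand the ID over immediately instead of collecting it
            visited++;
        }
        return visited;
    }
}
```

The deleted-document check matters because document numbers are not renumbered until the index is merged, so the loop would otherwise touch slots that no longer hold a document.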


Re: OUTOFMEMORY ERROR [ In reply to ]

erik at ehatchersolutions

Jul 7, 2005, 8:53 AM

Post #8 of 10 (6628 views)

On Jul 7, 2005, at 9:40 AM, MariLuz Elola wrote:
> Thanks Erik,
> I was wrong; the exact query that throws the OutOfMemory error is
> ==> ID:0* -ID:xtent.
> I have tried to reproduce the error with the query ID:0* alone, but the
> exception doesn't appear.

> Another thing: when the user searches without entering any query,
> internally I create the following query ==> ID:0* OR NOT ID:xtent.

That's a hairy query. I definitely do not recommend doing something
like that with prefix queries. Check out using a Filter for some of
this sort of thing also.

> And when QueryParser parses this query I get ID:0* -ID:xtent
> (translated ==> ID:0* AND NOT ID:xtent), is that right? Is QueryParser
> working incorrectly???

It depends. By default, QueryParser uses OR as the default operator.

> About maxClauseCount (1024 by default), I am setting this property:
> org.apache.lucene.search.BooleanQuery.maxClauseCount =
> es.seinet.xtent.searchEngine.lucene.general.Util.MAX_LUCENE_DOCUMENTS;

Bumping up that limit is not necessarily the best thing to do - I
recommend changing your approach to querying all documents rather
than trying to make BooleanQuery happy with an enormously inefficient
query.
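
To make the cost of that query concrete, here is a minimal sketch in plain Java (no Lucene involved) of what the expansion of ID:0* amounts to: one OR clause per unique term that starts with the prefix. The class name, the helper method, and the zero-padded ID format are all assumptions made up for illustration; only the 1,024 limit and the figure of roughly 210,000 documents come from this thread.

```java
import java.util.SortedSet;
import java.util.TreeSet;

public class PrefixExpansionSketch {
    // Default BooleanQuery clause limit quoted in this thread.
    static final int MAX_CLAUSES = 1024;

    // Counts the terms a prefix would expand to: one OR clause per matching term.
    static int countPrefixClauses(SortedSet<String> terms, String prefix) {
        int clauses = 0;
        // tailSet(prefix) starts at the first term >= prefix in sorted order.
        for (String term : terms.tailSet(prefix)) {
            if (!term.startsWith(prefix)) {
                break; // sorted order: no later term can match the prefix
            }
            clauses++;
        }
        return clauses;
    }

    public static void main(String[] args) {
        // Hypothetical ID field: 210,000 unique IDs, all starting with "0".
        SortedSet<String> idTerms = new TreeSet<>();
        for (int i = 0; i < 210_000; i++) {
            idTerms.add(String.format("0%06d", i));
        }
        int clauses = countPrefixClauses(idTerms, "0");
        System.out.println("ID:0* expands to " + clauses + " clauses");
        System.out.println("over the limit: " + (clauses > MAX_CLAUSES));
    }
}
```

Under these assumptions every document contributes its own clause, so the limit would have to be raised to the size of the index, which is why a different way of selecting all documents is the better fix.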

Erik


Re: OUTOFMEMORY ERROR [ In reply to ]

melola at seinet

Jul 7, 2005, 10:12 AM

Post #9 of 10 (6652 views)

Hi Erik, excuse me for all my questions. Thank you very much for your speedy
answers, and sorry for my bad English.
I am Spanish and I don't speak English very well.
Well, I have one more question.
In the end I am using IndexReader to return all the documents:
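    // Open the existing index (create = false) and walk documents by number.
    Directory directory = FSDirectory.getDirectory(path, false);
    IndexReader reader = IndexReader.open(directory);
    try {
        for (int start = base; start < end; start++) {
            if (reader.isDeleted(start)) continue;  // document() throws on deleted docs
            Document doc = reader.document(start);
            String id = doc.get(es.seinet.xtent.searchEngine.lucene.general.Util.ID);
            ides.add(id);
        }
    } finally {
        reader.close();  // release the index files
    }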
It works well and is fast. The only problem is that it is impossible to sort
the results by a metadata field (for example, to get all the documents ordered
by title).

My question is about the maxClauseCount parameter. I agree with you:
it is not a good idea to bump up the limit...
If I use the default value (1024) and search, I get this error:
[SearchCollection,executeQuery] caught a class
org.apache.lucene.search.BooleanQuery$TooManyClauses
with message: null

Is there any way to search all the documents (210,000 documents) and have
it work internally with only 1024, returning documents up to 1024, without
getting the TooManyClauses error??? I need to work efficiently with collections
of more than 250,000 records, and the users normally run complex queries (e.g.
DATE:[20050601 TO 20050701] AND TITLE:Lucene* ...... etc....)

Ah!! I have seen that you are Erik Hatcher, the author of Lucene in
Action!!!
I don't understand what you mean about the filter... well, I will read the
chapter on filtering a search :-D

Thanks in advance

Mari Luz


Re: OUTOFMEMORY ERROR [ In reply to ]

erik at ehatchersolutions

Jul 7, 2005, 12:50 PM

Post #10 of 10 (6634 views)

On Jul 7, 2005, at 1:12 PM, MariLuz Elola wrote:

> Hi Erik, excuse me for all my questions. Thank you very much for
> your speedy answers, and sorry for my bad English.
> I am Spanish and I don't speak English very well.
> Well, I have one more question.
> In the end I am using IndexReader to return all the documents:
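>     // Open the existing index (create = false) and walk documents by number.
>     Directory directory = FSDirectory.getDirectory(path, false);
>     IndexReader reader = IndexReader.open(directory);
>     try {
>         for (int start = base; start < end; start++) {
>             if (reader.isDeleted(start)) continue;  // document() throws on deleted docs
>             Document doc = reader.document(start);
>             String id = doc.get(es.seinet.xtent.searchEngine.lucene.general.Util.ID);
>             ides.add(id);
>         }
>     } finally {
>         reader.close();  // release the index files
>     }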
> It works well and is fast. The only problem is that it is impossible
> to sort the results by a metadata field (for example, to get all the
> documents ordered by title).

If you truly need to have a Query that can find all documents, then
add a special field to each document with a fixed value such as
doc:yes and then do a TermQuery for doc:yes. You could then leverage
Lucene's sorting capability.

> My question is about the maxClauseCount parameter. I agree with
> you: it is not a good idea to bump up the limit...
> If I use the default value (1024) and search, I get this
> error:
> [SearchCollection,executeQuery] caught a class
> org.apache.lucene.search.BooleanQuery$TooManyClauses
> with message: null
>
> Is there any way to search all the documents (210,000 documents)
> and have it work internally with only 1024, returning documents up
> to 1024, without getting the TooManyClauses error??? I need to work
> efficiently with collections of more than 250,000 records, and the
> users normally run complex queries (e.g. DATE:[20050601 TO 20050701]
> AND TITLE:Lucene* ...... etc....)

The issue is that PrefixQuery, WildcardQuery, RangeQuery, and
FuzzyQuery all expand to the terms that match in a BooleanQuery OR
fashion. You need to identify what terms those are and address them
individually. I can't offer specific advice since I don't know what
fields you're using and what values they may contain. But one
example is with dates. If you index dates at millisecond granularity
but you really only need to query by YEAR, then there is a good chance
that one of those query types will expand past the clause limit and
throw TooManyClauses. If, instead, you indexed dates as YYYY when all
you need is year granularity, then you have far fewer terms. I hope
this makes sense and helps.
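
The granularity point can be checked with a small sketch in plain Java (again no Lucene involved). It formats the same timestamps at three granularities and counts the distinct terms each would put into the index, which is the number of clauses a range over the whole period would expand to. The class name, the helper, the format patterns, and the synthetic timestamps are assumptions for illustration only, not anything taken from the poster's index.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class DateGranularitySketch {
    // Counts the distinct index terms produced when each timestamp is
    // formatted with the given pattern before indexing.
    static int distinctTerms(Iterable<LocalDateTime> stamps, String pattern) {
        DateTimeFormatter fmt = DateTimeFormatter.ofPattern(pattern);
        Set<String> terms = new TreeSet<>();
        for (LocalDateTime stamp : stamps) {
            terms.add(fmt.format(stamp));
        }
        return terms.size();
    }

    public static void main(String[] args) {
        // Hypothetical data: 5,000 documents, one every 3 hours and 13 ms.
        List<LocalDateTime> stamps = new ArrayList<>();
        LocalDateTime t = LocalDateTime.of(2003, 1, 1, 0, 0);
        for (int i = 0; i < 5000; i++) {
            stamps.add(t);
            t = t.plusHours(3).plusNanos(13_000_000L);
        }
        // A range covering every document expands to one clause per term.
        System.out.println("millisecond terms: " + distinctTerms(stamps, "yyyyMMddHHmmssSSS"));
        System.out.println("day terms:         " + distinctTerms(stamps, "yyyyMMdd"));
        System.out.println("year terms:        " + distinctTerms(stamps, "yyyy"));
    }
}
```

With this synthetic data the millisecond form yields one term per document, well over 1,024, while the day and year forms stay under the limit. A real index will behave differently depending on how the dates are distributed.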

Erik