Mailing List Archive

rc4 and FileNotFoundException: an update
Hello again,

I guess it's really not my day...

Just to make sure I'm not hallucinating too much, I downloaded the latest
and greatest: rc4. I changed all the package names to org.apache, updated
a method here and there to reflect the API changes, and ran my little
app. I would like to emphasize that, apart from updating to the latest
Lucene release, nothing else has changed.

Well, it's pretty ugly. Whatever was happening with Lucene in the
previous package (com.lucene) is magnified manyfold in rc4. After
processing a paltry 16 objects I got:

"SZFinder.findObjectsWithSpecificationInStore:
java.io.FileNotFoundException: _2.f14 (Too many open files)"

At least in the previous version, I would only see that after a couple
of thousand objects...

So it seems that there is something really rotten in the kingdom of
Denmark...

Any help much appreciated.

Thanks.


--
To unsubscribe, e-mail: <mailto:lucene-user-unsubscribe@jakarta.apache.org>
For additional commands, e-mail: <mailto:lucene-user-help@jakarta.apache.org>
Re: rc4 and FileNotFoundException: an update
Have you posted code that demonstrates this problem?
If so I missed it. If you send, to this list, the
shortest program you can come up with that demonstrates
the problem there is a fair chance that someone may
spot something. I, and many others, use that release
of Lucene to index far more than 16 objects so I think
that at this stage the assumption has to be that the
problem lies with your code.


--
Ian.
ian@digimem.net


> petite_abeille@mac.com (petite_abeille) wrote
>
> Hello again,
>
> I guess it's really not my day...
>
> Just to make sure I'm not hallucinating too much, I downloaded the latest
> and greatest: rc4. I changed all the package names to org.apache, updated
> a method here and there to reflect the API changes, and ran my little
> app. I would like to emphasize that, apart from updating to the latest
> Lucene release, nothing else has changed.
>
> Well, it's pretty ugly. Whatever was happening with Lucene in the
> previous package (com.lucene) is magnified manyfold in rc4. After
> processing a paltry 16 objects I got:
>
> "SZFinder.findObjectsWithSpecificationInStore:
> java.io.FileNotFoundException: _2.f14 (Too many open files)"
>
> At least in the previous version, I would only see that after a couple
> of thousand objects...
>
> So, it seems, that there is something really rotten in the kingdom of
> Denmark...
>
> Any help much appreciated.
>
> Thanks.

Re: rc4 and FileNotFoundException: an update
> Have you posted code that demonstrates this problem? If so I missed it.

Thanks for your help.

PA.


Re: rc4 and FileNotFoundException: an update
On Fri, Apr 26, 2002 at 07:05:23PM +0200, petite_abeille wrote:
> I guess it's really not my day...
> [...]
> Well, it's pretty ugly. Whatever was happening with Lucene in the
> previous package (com.lucene) is magnified manyfold in rc4. After
> processing a paltry 16 objects I got:
>
> "SZFinder.findObjectsWithSpecificationInStore:
> java.io.FileNotFoundException: _2.f14 (Too many open files)"

Sounds like a pretty nasty situation.

One suggestion I have for you is that Doug is usually very
helpful with problems like this IF you can first narrow down what is
happening to the point that you can post a clear, specific, isolated
test that consistently causes the problem to happen. This makes sense
- any effort to solve the problem will first involve isolating the
bug, and that's a task you're best suited for, since you know your
system best.

So maybe your best approach would be to take a copy of your
system as above, and start gradually stripping out stuff, testing
between each run, until you have most of the application-specific
stuff removed, but the problem is still recurring consistently.
Then post your code and ask if some of the more Lucene-knowledgeable
people can take a look.

Re: index integrity, I agree that it would be really, really nice
to have some sort of "sanity" check. I have yet to actually get into
the internals of the index, but I'd guess that there must be some sort
of at least superficial check, maybe some sort of format check.

If I were going to kludge something together, the first approach
I'd take would be to just open the index and roll through all of the
Documents in it, accessing all of the fields (or maybe just a few main
fields per Document). I'm not sure what I'd *do* with the field
values (printing them out to the screen might take a while), other
than perhaps checking for nulls. But I suspect that if the code gets
through that without causing an exception or getting null values,
then at least the index's internal format is intact. Maybe the test
code could save the number of lucene Document objects in the index in
between checks (and, of course, update this number when you add or
remove documents), and make sure it still has the right number of
documents.
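The walk-through check described above could be sketched roughly as follows. This is only an illustration: `DocumentSource`, `inMemory` and `check` are hypothetical names invented here, and the interface is a stand-in for whatever reader the index provides, not Lucene's own API. The sketch compares the document count against a number saved by the application, reads every document, and reports missing field values or read failures.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

final class IndexSanityCheck {

    /** Hypothetical stand-in for an index reader; this is not Lucene's API. */
    interface DocumentSource {
        int documentCount();
        Map<String, String> document(int i) throws Exception;
    }

    /** Wraps an in-memory list so the check can be tried without a real index. */
    static DocumentSource inMemory(final List<Map<String, String>> docs) {
        return new DocumentSource() {
            public int documentCount() { return docs.size(); }
            public Map<String, String> document(int i) { return docs.get(i); }
        };
    }

    /** Returns the problems found; an empty list means the walk succeeded. */
    static List<String> check(DocumentSource source, int expectedCount,
                              String... requiredFields) {
        List<String> problems = new ArrayList<String>();
        int actual = source.documentCount();
        if (actual != expectedCount) {
            problems.add("expected " + expectedCount + " documents, found " + actual);
        }
        for (int i = 0; i < actual; i++) {
            try {
                Map<String, String> doc = source.document(i);
                if (doc == null) {
                    problems.add("document " + i + " is null");
                    continue;
                }
                for (String field : requiredFields) {
                    if (doc.get(field) == null) {
                        problems.add("document " + i + " has no value for '" + field + "'");
                    }
                }
            } catch (Exception e) {
                // Any read failure is recorded rather than aborting the walk.
                problems.add("document " + i + " could not be read: " + e);
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        List<Map<String, String>> docs = Arrays.asList(
                Collections.singletonMap("id", "1"),
                Collections.<String, String>emptyMap());
        // The saved count (3) is deliberately wrong to show a reported problem.
        for (String problem : check(inMemory(docs), 3, "id")) {
            System.out.println(problem);
        }
    }
}
```

A passing walk only shows that documents can be read back; it says nothing about whether searches over the index return the right results.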

As for repairing an index, I think that's working sort of against
the grain of Lucene. In your case, it sounds like rebuilding the
index is important, because you're using Lucene as a data store. I
have some similar issues myself in some things I want to build (I end
up wanting both a data store and a search index; ultimately I've ended
up choosing to have a separate data store for the extra data). But
Lucene is a search index, meant to be used more in a cache-like style,
so there's an underlying assumption that the original data is always
around to reindex. Thus, repairing an index is less important, since
it is assumed you can always rebuild it.

I don't know much of the theories behind data store systems. It
occurs to me that using Lucene as a data store, you'll always be
working against the grain, always swimming upstream. Maybe it'd be a
better idea to figure out some way to use Lucene as the indexing
technology in a data store, the way traditional RDBMSes use indexes,
for speeding access.

Or possibly you should look at Xindice (http://xml.apache.org/xindice/)
which is an XML database. You might find it easier to adapt that to your
needs. I'm kind of curious as to how fast Xindice's XPath execution is, and
what their indexing is based on - there might be a use for Lucene there.

Steven J. Owens
puff@darksleep.com

Re: rc4 and FileNotFoundException: an update
Hi Steven,

> Sounds like a pretty nasty situation.

It is...

> This makes sense - any effort to solve the problem will first involve
> isolating the bug, and that's a task you're best suited for, since you
> know your system best.

OK... From what I understand, this situation arises depending on my
"usage pattern" of Lucene. For example, if I use it in "batch" mode (e.g.,
through some tools to stress test my app by loading a couple of million
objects), everything works perfectly fine. However, when running my
app in a more "interactive" mode (e.g., with user interaction, object
indexing, writing and searching at the same time) I run into this
exception very quickly. The problem seems to have something to do with
Searchers and/or how I'm using them. I need to investigate in that
direction... Also, what is the "magic" formula for keeping
RandomAccessFile usage in Lucene to a strict minimum? Is
IndexWriter.mergeFactor the only parameter I can play with, or am I
missing some other configuration that might help?
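One pattern that can exhaust file descriptors in interactive use is opening a new searcher for every query without closing the previous one. Whether that is what happens here is not established in this thread. The sketch below shows the general idea of sharing a single open handle and closing it before reopening. `SharedHandle` and its methods are hypothetical names, and the handle is any `Closeable`, not a Lucene class.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.function.Supplier;

/** Keeps at most one handle open and reuses it until it is invalidated. */
final class SharedHandle<T extends Closeable> implements Closeable {
    private final Supplier<T> opener; // how to open a fresh handle
    private T current;                // the single open handle, or null
    private int opened;               // handles opened so far
    private int closed;               // handles closed so far

    SharedHandle(Supplier<T> opener) {
        this.opener = opener;
    }

    /** Returns the shared handle, opening it only if none is open. */
    synchronized T get() {
        if (current == null) {
            current = opener.get();
            opened++;
        }
        return current;
    }

    /** Call after the underlying data changed; the next get() reopens. */
    synchronized void invalidate() {
        if (current != null) {
            try {
                current.close();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            } finally {
                closed++;
                current = null;
            }
        }
    }

    @Override
    public synchronized void close() {
        invalidate();
    }

    /** Number of handles currently open through this holder (0 or 1). */
    synchronized int openCount() {
        return opened - closed;
    }
}
```

This sketch does not wait for searches that are still using the old handle when `invalidate()` is called; an application with concurrent readers would need reference counting or a similar scheme on top of it.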

> Then post your code and ask if some of the more lucene-knowledgeable
> can take a look.

Unfortunately, it's not that straightforward as I'm using Lucene as part
of some sort of custom built oodbms and this behavior seems to be usage
related... You can check the app at http://homepage.mac.com/zoe_info/ if
that helps.

> Re: index integrity, I agree that it would be really, really nice to
> have some sort of "sanity" check.

I'm not familiar with Lucene internals, but is it conceivable to have
some sort of checksum per document and/or index that will help to
identify "corrupted" data?
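As an illustration of the per-document checksum idea, here is a minimal sketch using `java.util.zip.CRC32` from the standard library. It is not something Lucene provides as far as this thread shows. `DocumentChecksum`, `checksum` and `matches` are hypothetical names, and the sketch assumes field names and values are non-null strings that can be read back at check time.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;
import java.util.zip.CRC32;

final class DocumentChecksum {

    /** CRC32 over field names and values, visited in sorted field order. */
    static long checksum(Map<String, String> fields) {
        CRC32 crc = new CRC32();
        // Sorting makes the result independent of map iteration order.
        for (Map.Entry<String, String> e : new TreeMap<String, String>(fields).entrySet()) {
            update(crc, e.getKey());
            update(crc, e.getValue());
        }
        return crc.getValue();
    }

    private static void update(CRC32 crc, String s) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        crc.update(bytes, 0, bytes.length);
        crc.update(0); // separator byte between consecutive strings
    }

    /** True when the fields still produce the checksum stored earlier. */
    static boolean matches(Map<String, String> fields, long stored) {
        return checksum(fields) == stored;
    }
}
```

The application would compute the checksum when a document is indexed, keep it alongside the document, and recompute it during a check. This only covers values that can be retrieved again, so fields that are indexed but not stored would not be protected by it.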

> As for repairing an index, I think that's working sort of against the
> grain of Lucene.

:-(

> In your case, it sounds like rebuilding the index is important, because
> you're using Lucene as a data store.

Well, not exactly. I'm just using Lucene to index my data store (with a
bunch of Field.Keyword and Field.Unstored). The actual object storage is
handled externally to Lucene. However, I need a consistent index as I'm
using it as part of my object tree.

> Maybe it'd be a better idea to figure out some way to use Lucene as the
> indexing technology in a data store, the way traditional RDBMSes use
> indexes, for speeding access.

I agree. It's how I'm using it more or less. Nevertheless, for the sake
of reliability, I need to have some level of confidence that the
underlying indexes are "sane"... And a way to correct the problem if
they are not. In my case, I will happily trade speed for reliability as
I cannot afford to have inconsistent indexes. A corrupted index is of
no use to me.

> Or possibly you should look at Xindice (http://xml.apache.org/xindice/)
> which is an XML database.

I'm familiar with Xindice and other related toolboxes. However, I have
some "peculiar" requirements, so I decided to custom-build my own
persistence layer. It works fine so far, apart from this very annoying
exception. Also, this situation seems to arise on UNIX systems only, as I
have never heard anybody complain about it on any Windows-type platform...
Very odd in any case...

Thanks for your help in any case.

PA


Re: rc4 and FileNotFoundException: an update
Hi Petite,

> "SZFinder.findObjectsWithSpecificationInStore:
> java.io.FileNotFoundException: _2.f14 (Too many open files)"

I don't know what environment you're using Lucene in. However, we had this "too
many open files" problem on our Solaris box, and increasing the number of file
descriptors through the ulimit -n command fixed it.
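To see how close a running application is to the descriptor limit, the JVM can report both numbers itself on Unix-like systems. The sketch below assumes a HotSpot/OpenJDK-style JVM that exposes `com.sun.management.UnixOperatingSystemMXBean`; that interface is part of the `java.lang.management` family introduced in Java 5, so it is not available on older runtimes such as JRE 1.3.1. `FileDescriptorReport` is a hypothetical class name.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

final class FileDescriptorReport {

    /** Returns {open, max}, or {-1, -1} when the JVM does not expose them. */
    static long[] descriptorCounts() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            com.sun.management.UnixOperatingSystemMXBean unix =
                    (com.sun.management.UnixOperatingSystemMXBean) os;
            return new long[] {
                unix.getOpenFileDescriptorCount(),
                unix.getMaxFileDescriptorCount()
            };
        }
        // Non-Unix platform or a JVM without this extension.
        return new long[] { -1L, -1L };
    }

    public static void main(String[] args) {
        long[] counts = descriptorCounts();
        if (counts[0] < 0) {
            System.out.println("File descriptor counts are not available on this JVM.");
        } else {
            System.out.println("Open file descriptors: " + counts[0]);
            System.out.println("Maximum file descriptors: " + counts[1]);
        }
    }
}
```

Logging the open count before and after a batch of searches is one way to tell a leak (the count keeps growing) from a limit that is simply too low for the index.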

regards, Julian



Re: rc4 and FileNotFoundException: an update
> I don't know what environment you're using Lucene in. However, we had
> this "too many open files" problem on our Solaris box, and increasing
> the number of file descriptors through the ulimit -n command fixed it.

Thanks. That should help. However, I have a little desktop app and it
will be very cumbersome to require users to change some system
parameters just to run it... :-(

Thanks in any case.

PA


Re: rc4 and FileNotFoundException: an update
> I don't know what environment you're using Lucene in.

The problem seems to be especially bad on OS X (10.1.4 + JRE 1.3.1 +
latest updates).

PA.


Re: rc4 and FileNotFoundException: an update
--- petite_abeille <petite_abeille@mac.com> wrote:
> > I don't know what environment you're using Lucene in.
>
> The problem seems to be especially bad on OS X (10.1.4 + JRE 1.3.1 +
> latest updates).

Does this mean you tried it on other OSs and it worked?
Which ones?
What JDK did those have and what was their ulimit and what is the
ulimit on your OSX machine?
Just curious.

Otis


Re: rc4 and FileNotFoundException: an update
> Does this mean you tried it on other OSs and it worked?

Yes.

> Which ones?

Win2k SP2

> What JDK did those have

jre 1.4.0

> and what was their ulimit and what is the
> ulimit on your OSX machine?
> Just curious.

I don't know. Does it matter?

PA


Re: rc4 and FileNotFoundException: an update
Hello,

> > and what was their ulimit and what is the
> > ulimit on your OSX machine?
> > Just curious.
>
> I don't know. Does it matter?

Of course it does - a low (u)limit is a part of your problem, perhaps.

Otis
P.S.
I don't know how Winblows deals with file descriptors. Try your
application on some other flavour of Unix, if possible.

