Mailing List Archive

Lucene Indexing Speed Trips, Deadlocks
So I've never really been in the position of having both the time and the
tools to easily and efficiently pound on Lucene updates - I always lacked
one of the ingredients. Having a beer while a bit stocked up on both, I've
been doing some light hammering.

This is probably not a high-priority item. When I say efficient, I mean
lots of updates coming in fast with minimal context switching or outside
blocking and locking - not that I'm running some reactive / back-pressure
bulldozer. But I thought it might be interesting to someone, because I
think I've seen hints of it at a lesser scale, back when pointing any
fingers at Lucene was beyond the effort or value.

So this is just a drop-off; I don't mind a reply of "it seems to hurt when
I hit you" / "well, don't hit me so hard".

I'm on somewhat older code, but it looks like the issue could still be
present.

It seems to be the applyLock in FrozenUpdates. I don't recall all the paths
coming in on it (I can easily reproduce), but it takes a few things
occurring around the same time, and the result ranges from a massive
slowdown that eventually works itself through to what is essentially a
deadlock.

I've seen a lot of this kind of behavior before: hard breaks to deadlock,
but often with hints of progress unwinding. So I gave the fair option on
the lock a try, and it seemed to address the problem for me without a
noticeable penalty. I didn't do rigorous testing and came at it from one
angle, but it took me from off-and-on misery runs to smooth pounding.
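
For anyone who hasn't used the "fair option": below is a minimal sketch,
assuming the lock in question is a java.util.concurrent.locks.ReentrantLock.
FairLockSketch, apply() and hammer() are hypothetical names for
illustration, not Lucene code. Passing true to the constructor asks the
lock to favor the longest-waiting thread instead of letting newly arriving
threads barge ahead, which helps with starvation rather than with a true
lock-ordering deadlock.

```java
import java.util.concurrent.locks.ReentrantLock;

public class FairLockSketch {
    // Hypothetical stand-in for a lock that serializes applying updates.
    // fair == true requests roughly FIFO hand-off among waiting threads.
    final ReentrantLock applyLock;
    private long applied = 0;

    FairLockSketch(boolean fair) {
        this.applyLock = new ReentrantLock(fair);
    }

    void apply() {
        applyLock.lock();
        try {
            applied++; // placeholder for the real work done under the lock
        } finally {
            applyLock.unlock();
        }
    }

    long applied() {
        applyLock.lock();
        try {
            return applied;
        } finally {
            applyLock.unlock();
        }
    }

    // Hammer the lock from several threads; returns the total apply count.
    static long hammer(boolean fair, int threads, int perThread) {
        FairLockSketch sketch = new FairLockSketch(fair);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    sketch.apply();
                }
            });
            workers[i].start();
        }
        try {
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return sketch.applied();
    }

    public static void main(String[] args) {
        System.out.println("fair:   " + hammer(true, 8, 10_000));
        System.out.println("unfair: " + hammer(false, 8, 10_000));
    }
}
```

Fair locks usually cost some raw throughput, so whether the trade is worth
it depends on the workload.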


Also, unrelated, but since I saw someone struggling with it recently, I
might as well mention it: I think there may be a few poorly behaved
SPI-related static initializers in CharFilterFactory, TokenFilterFactory,
TokenizerFactory - don't quote me on the classes - and I believe a nice
static holder pattern there solves some fairly easy-to-trigger deadlock
issues. It worked well for me, and I later found a similar class that
already had this pattern, with a note saying the deadlock was the reason
for it.
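
As an illustration of the static holder idea (not the actual Lucene code):
the sketch below moves an expensive lookup out of the outer class's static
initializer and into a nested holder class. HolderSketch, scan() and
availableNames() are hypothetical names, and scan() merely stands in for an
SPI scan. The JVM initializes Holder only on first access to Holder.NAMES,
so initializing the outer class no longer triggers the lookup. That removes
one common way for two threads to block on each other's class
initialization.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public final class HolderSketch {
    // Counts how often the (hypothetical) expensive lookup actually runs.
    static final AtomicInteger loads = new AtomicInteger();

    private HolderSketch() {}

    // Hypothetical stand-in for an SPI scan that loads other classes.
    private static List<String> scan() {
        loads.incrementAndGet();
        return List.of("standard", "whitespace");
    }

    // Nested holder: initialized lazily, on first access to NAMES,
    // and not as part of initializing HolderSketch itself.
    private static final class Holder {
        static final List<String> NAMES = scan();
    }

    public static List<String> availableNames() {
        return Holder.NAMES;
    }

    public static void main(String[] args) {
        System.out.println(availableNames());
        System.out.println("scans run: " + loads.get());
    }
}
```

Class initialization is thread-safe, so the holder needs no explicit
locking and the lookup runs at most once.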

Sorry to pound; back pressure ain't easy, and "reactive" gets me about as
excited as Perl.

- Mark
Re: Lucene Indexing Speed Trips, Deadlocks
I pushed on this a little more last night. I kept occasionally running
into the slowdown / apparent deadlock early in the indexing process until
I pushed the fair lock treatment through the whole IndexWriter class
chain. It certainly has a noticeable effect on how soon a heavy indexing
load gets slowed down or locked up, and it avoids the cases where I seemed
to get locked up without ever getting out (perhaps if you wait long
enough...). But I still end up waiting a good chunk of time towards the
end of my dump for a force apply of frozen buffered deletes - I'm assuming
it is waiting out some merging or something. So I see some benefit with
lots of threads coming into the IndexWriter with docs, but perhaps the
wait times are more time-shifted than reduced, so further research is
needed.

- Mark

On Wed, Apr 21, 2021 at 12:02 AM Mark Miller <markrmiller@gmail.com> wrote:


--
- Mark

http://about.me/markrmiller