Mailing List Archive

Feature request: make Uwe and Robert redundant
I’ve been working on this Jira issue to make Robert and Uwe redundant in
case they get hit by a bus or don’t survive SkyNet. Virtual Uwe is much
further along, with a full 3D avatar model and fully cloned voice, but they
are both coming along, despite some kinks still to be worked out and
upgrades to be made.

I asked the two of them to demonstrate their readiness to step in by
hosting a CarTalk show for Lucene. They are convinced no one would even
notice the changeover, but that's just their training data speaking - they
are just sophisticated prototypes.

So without further ado, and with no strings (except I play the callers)…

SearchTalk, hosted by Query and Index, the Boolean Brothers.

Caller One: Hi Robert and Uwe, my Lucene app uses thousands of fields and
more every day - I've been adding more RAM to keep up, but I've maxed out
my machines. What do I do next?

Robert: Well, there's a storm coming for you my friend. The thing is,
Lucene isn't designed to handle thousands of fields and throwing more RAM
at the problem isn't going to solve it. It's like trying to move your house
with a sports car - it's a faster vehicle, but it's going to be slower
overall. Instead, it's time to rethink your strategy.

Uwe: Yes, I agree with Robert. The current strategy looks completely
broken, but it isn't. You're simply pushing Lucene beyond its limits. You
might think adding more fields is like adding more functionality, but in
reality, it's the opposite.

Robert: Indeed. And, you know, we're not even crawling yet in terms of
handling this. The way to go is to reduce the number of fields you're
using. More isn't always better. Sometimes, less is more.

Uwe: It's a little bit like US showers for Germans - too many options and
things get complicated! In Lucene, each additional field has overhead and
can slow down queries. So, think about your data model. Can fields be
combined? Do you really need every single field you're adding?

Robert: That's right. It's like my mom always says, "if you want zero-based
oops: 1. stay under 4G, heaps shouldn't be that big anyway 2. stay under
4G, heaps shouldn't be that big anyway 3. if you are really convinced you
need to be > 4G, then give room before 32G so it's still zero-based". Don't
fall into the trap of adding more fields when what you really need is a
better design.
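
The heap-size rule of thumb quoted above can be checked from inside a running
JVM. The following is only a minimal sketch: the class name HeapSizeCheck and
its classify helper are hypothetical, and the 4G and 32G thresholds are taken
directly from the quote rather than from any JVM guarantee about when
compressed oops are actually zero-based.

```java
// Hypothetical helper: reports which bucket of the quoted rule of thumb
// the current maximum heap falls into. Thresholds come from the quote only.
final class HeapSizeCheck {
    static final long GIB = 1024L * 1024L * 1024L;

    // Classifies a maximum heap size (in bytes) against the 4G and 32G marks.
    static String classify(long maxHeapBytes) {
        if (maxHeapBytes < 4 * GIB) {
            return "under 4G";
        }
        if (maxHeapBytes < 32 * GIB) {
            return "between 4G and 32G";
        }
        return "32G or more";
    }

    public static void main(String[] args) {
        // maxMemory() is the most memory the JVM will attempt to use.
        long maxHeap = Runtime.getRuntime().maxMemory();
        System.out.println("Max heap is " + classify(maxHeap));
    }
}
```

Running it with different -Xmx settings shows which bucket a deployment falls
into; it does not report whether the JVM actually chose zero-based addressing.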

Uwe: I have to agree. Revert! Revert! Revert! It may seem like a step
backwards, but it's the right move. Reconsider your data model and perhaps
consult with a Lucene expert who can help you re-structure your index to be
more efficient.

Robert: Absolutely. There's nothing wrong with this test, it's just great
at finding bugs. The bug in this case is in the approach. And trust me, you
don't want to end up in jar hell. There's a way to fix this, but it might
require breaking some backwards compatibility.

Uwe: Indeed, Robert. Breaking backwards compatibility is sometimes
necessary for progress. It might be complicated, but it's not unsolvable.
It's simply time to move forward and change strategies.

Caller Two: Hello Robert and Uwe, first time caller, I'm running Lucene 2
for my medical device company, and lately it's been crashing in Hotspot.
How can I stabilize it?

Robert: Hey, welcome to the show! Well, first things first: You're running
Lucene 2? I mean, if you're cruising on a vintage bicycle with a flat tire,
do you wonder why it's a bumpy ride? It's a dinosaur, man. You're living in
the Stone Age, gotta upgrade! And about Hotspot crashes... that might be
the JVM acting up. Do you have any details on that?

Uwe: Yes, absolutely, Robert! Lucene 2 is, indeed, quite ancient. While we
would recommend upgrading, we understand that might not always be feasible.
With regard to the Hotspot crashes, it might be due to a loop optimization
bug. There was a similar issue in the past - LUCENE-2975 - where Hotspot
corrupted a for-loop. Maybe try disabling loop optimizations using the
-XX:-UseLoopPredicate JVM option to mitigate the risk of index corruption.
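
One way to confirm that a JVM was really started with the option mentioned
above is to inspect its input arguments at startup. This is a minimal sketch,
not part of Lucene: the class name LoopPredicateCheck and the hasFlag helper
are hypothetical, and only the flag string itself comes from the message.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical startup check: warns if the JVM was launched without the
// loop-predicate workaround flag quoted in the message.
final class LoopPredicateCheck {
    static final String FLAG = "-XX:-UseLoopPredicate";

    // Returns true if the argument list contains the flag exactly as written.
    static boolean hasFlag(List<String> jvmArgs, String flag) {
        return jvmArgs.contains(flag);
    }

    public static void main(String[] args) {
        // Arguments passed to the JVM itself, not to the application's main().
        List<String> jvmArgs =
                ManagementFactory.getRuntimeMXBean().getInputArguments();
        if (hasFlag(jvmArgs, FLAG)) {
            System.out.println("Running with " + FLAG);
        } else {
            System.out.println("Warning: " + FLAG + " is not set");
        }
    }
}
```

Started as `java -XX:-UseLoopPredicate LoopPredicateCheck.java`, the check
should take the first branch; without the flag it prints the warning.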

Robert: Absolutely, Uwe. It's like trying to move your house with a sports
car: sure, it's faster, but you're gonna hit roadblocks. When working with
legacy software, it's important to understand its limitations. And let's be
clear, the JVM option is just a patch, not a fix. The real fix is to
upgrade your Lucene. Just like a rusty old car, it's easier and safer to
get a new one than trying to hold the old one together with duct tape.

Uwe: Haha, yes, Robert. That's a great analogy. Upgrading might indeed be a
heavy task, especially considering the potential need for rewriting code
and reindexing data. But in the end, it would provide much better stability
and performance. And remember, we are here to guide you through this
process if needed.

Robert: Absolutely! Like a storm, change can be scary but it's necessary
for growth. With Lucene's newer versions, we've squashed bugs you probably
didn't know existed in Lucene 2. Take the plunge, upgrade, and trust us -
it's for the best!

Re: Feature request: make Uwe and Robert redundant
Hi Mark,

Thank you very much for the proposal. Here is Uwe Prime [1] talking.
Unfortunately I have to tell you that your feature request is completely
broken because it does not work and violates all Lucene test framework
invariants. Uwe Prime has killed all Multiverse Uwes and in the meantime
is also planning to kill the real Uwe. Uwe Prime will always survive
Skynet because of his Panama Foreign powers with unlimited AI capacity
due to Java 42. Therefore in year 2034, around March 21st, all US
showers will be eliminated and replaced by good old German thermostats
having only a minimal number of options, automatically regulating the
shower temperature with Policeman oversight. Of course, Project Valhalla
is still not finished.

Uwe

[1] https://twitter.com/UweSays/status/1557177053790748683

On 25.06.2023 at 08:10, Mark Miller wrote:
> I’ve been working on this Jira issue to make Robert and Uwe redundant
> in case they get hit by a bus or don’t survive SkyNet. Virtual Uwe is
> much further along, with a full 3D avatar model and fully cloned
> voice, but they are both coming along, despite some kinks still to be
> worked out and upgrades to be made.
>
> I asked the two of them to demonstrate their readiness to step in by
> hosting a CarTalk show for Lucene. They are convinced no one would
> even notice the changeover, but that's just their training data
> speaking - they are just sophisticated prototypes.
>
> So without further ado, and with no strings (except I play the callers)…
>
> SearchTalk, hosted by Query and Index, the Boolean Brothers.
>
> Caller One: Hi Robert and Uwe, my Lucene app uses thousands of fields
> and more every day - I've been adding more RAM to keep up, but I've
> maxed out my machines. What do I do next?
>
> Robert: Well, there's a storm coming for you my friend. The thing is,
> Lucene isn't designed to handle thousands of fields and throwing more
> RAM at the problem isn't going to solve it. It's like trying to move
> your house with a sports car - it's a faster vehicle, but it's going
> to be slower overall. Instead, it's time to rethink your strategy.
>
> Uwe: Yes, I agree with Robert. The current strategy looks completely
> broken, but it isn't. You're simply pushing Lucene beyond its limits.
> You might think adding more fields is like adding more functionality,
> but in reality, it's the opposite.
>
> Robert: Indeed. And, you know, we're not even crawling yet in terms of
> handling this. The way to go is to reduce the number of fields you're
> using. More isn't always better. Sometimes, less is more.
>
> Uwe: It's a little bit like US showers for Germans - too many options
> and things get complicated! In Lucene, each additional field has
> overhead and can slow down queries. So, think about your data model.
> Can fields be combined? Do you really need every single field you're
> adding?
>
> Robert: That's right. It's like my mom always says, "if you want
> zero-based oops: 1. stay under 4G, heaps shouldn't be that big anyway
> 2. stay under 4G, heaps shouldn't be that big anyway 3. if you are
> really convinced you need to be > 4G, then give room before 32G so it's
> still zero-based". Don't fall into the trap of adding more fields when
> what you really need is a better design.
>
> Uwe: I have to agree. Revert! Revert! Revert! It may seem like a step
> backwards, but it's the right move. Reconsider your data model and
> perhaps consult with a Lucene expert who can help you re-structure
> your index to be more efficient.
>
> Robert: Absolutely. There's nothing wrong with this test, it's just
> great at finding bugs. The bug in this case is in the approach. And
> trust me, you don't want to end up in jar hell. There's a way to fix
> this, but it might require breaking some backwards compatibility.
>
> Uwe: Indeed, Robert. Breaking backwards compatibility is sometimes
> necessary for progress. It might be complicated, but it's not
> unsolvable. It's simply time to move forward and change strategies.
>
> Caller Two: Hello Robert and Uwe, first time caller, I'm running
> Lucene 2 for my medical device company, and lately it's been crashing
> in Hotspot. How can I stabilize it?
>
> Robert: Hey, welcome to the show! Well, first things first: You're
> running Lucene 2? I mean, if you're cruising on a vintage bicycle with
> a flat tire, do you wonder why it's a bumpy ride? It's a dinosaur,
> man. You're living in the Stone Age, gotta upgrade! And about Hotspot
> crashes... that might be the JVM acting up. Do you have any details on that?
>
> Uwe: Yes, absolutely, Robert! Lucene 2 is, indeed, quite ancient.
> While we would recommend upgrading, we understand that might not
> always be feasible. With regard to the Hotspot crashes, it might be
> due to a loop optimization bug. There was a similar issue in the past
> - LUCENE-2975 - where Hotspot corrupted a for-loop. Maybe try
> disabling loop optimizations using the -XX:-UseLoopPredicate JVM
> option to mitigate the risk of index corruption.
>
> Robert: Absolutely, Uwe. It's like trying to move your house with a
> sports car: sure, it's faster, but you're gonna hit roadblocks. When
> working with legacy software, it's important to understand its
> limitations. And let's be clear, the JVM option is just a patch, not a
> fix. The real fix is to upgrade your Lucene. Just like a rusty old
> car, it's easier and safer to get a new one than trying to hold the
> old one together with duct tape.
>
> Uwe: Haha, yes, Robert. That's a great analogy. Upgrading might indeed
> be a heavy task, especially considering the potential need for
> rewriting code and reindexing data. But in the end, it would provide
> much better stability and performance. And remember, we are here to
> guide you through this process if needed.
>
> Robert: Absolutely! Like a storm, change can be scary but it's
> necessary for growth. With Lucene's newer versions, we've squashed
> bugs you probably didn't know existed in Lucene 2. Take the plunge,
> upgrade, and trust us - it's for the best!

--
Uwe Schindler
Achterdiek 19, D-28357 Bremen
https://www.thetaphi.de
eMail:uwe@thetaphi.de