Mailing List Archive

Hibernate MassIndexer and Lucene connection closes on MariaDB
Hello, I'm new to this mailing list so I hope that I'm sending my question to the right place.


We recently experienced issues with the Lucene mass indexing process.
The connection is closed after a few seconds and the indexing is only partially completed.

I've got a piece of code that re-indexes a table containing contacts.
This code was running fine until we executed it on a table containing more than 2 million contacts.

In that configuration the process does not run to completion and stops with the following exception.
11/15 16:12:32 ERROR rg.hibernate.search.exception.impl.LogErrorHandler - HSEARCH000058: HSEARCH000211: An exception occurred while the MassIndexer was fetching the primary identifiers list
org.hibernate.exception.JDBCConnectionException: could not advance using next()
at org.hibernate.exception.internal.SQLExceptionTypeDelegate.convert(SQLExceptionTypeDelegate.java:48)
at org.hibernate.exception.internal.StandardSQLExceptionConverter.convert(StandardSQLExceptionConverter.java:42)
at org.hibernate.engine.jdbc.spi.SqlExceptionHelper.convert(SqlExceptionHelper.java:111)
at org.hibernate.engine.jdbc.spi.SqlExceptionHelper.convert(SqlExceptionHelper.java:97)
at org.hibernate.internal.ScrollableResultsImpl.convert(ScrollableResultsImpl.java:69)
at org.hibernate.internal.ScrollableResultsImpl.next(ScrollableResultsImpl.java:104)
at org.hibernate.search.batchindexing.impl.IdentifierProducer.loadAllIdentifiers(IdentifierProducer.java:148)
at org.hibernate.search.batchindexing.impl.IdentifierProducer.inTransactionWrapper(IdentifierProducer.java:109)
at org.hibernate.search.batchindexing.impl.IdentifierProducer.run(IdentifierProducer.java:85)
at org.hibernate.search.batchindexing.impl.OptionallyWrapInJTATransaction.runWithErrorHandler(OptionallyWrapInJTATransaction.java:69)
at org.hibernate.search.batchindexing.impl.ErrorHandledRunnable.run(ErrorHandledRunnable.java:32)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: java.sql.SQLNonTransientConnectionException: (conn=18) Server has closed the connection. If result set contain huge amount of data, Server expects client to read off the result set relatively fast. In this case, please consider increasing net_wait_timeout session variable / processing your result set faster (check Streaming result sets documentation for more information)
at org.mariadb.jdbc.internal.util.exceptions.ExceptionMapper.get(ExceptionMapper.java:234)
at org.mariadb.jdbc.internal.util.exceptions.ExceptionMapper.getException(ExceptionMapper.java:165)
at org.mariadb.jdbc.internal.com.read.resultset.SelectResultSet.handleIoException(SelectResultSet.java:381)
at org.mariadb.jdbc.internal.com.read.resultset.SelectResultSet.next(SelectResultSet.java:650)
at org.apache.commons.dbcp2.DelegatingResultSet.next(DelegatingResultSet.java:1160)
at org.apache.commons.dbcp2.DelegatingResultSet.next(DelegatingResultSet.java:1160)
at org.hibernate.internal.ScrollableResultsImpl.next(ScrollableResultsImpl.java:99)
... 10 more

Here is the piece of code:

private void index(EntityManager em, LongConsumer callBack) {
    // Obtain the Hibernate Search entry point from the JPA EntityManager
    FullTextEntityManager fullTextEntityManager = Search.getFullTextEntityManager(em);
    // Rebuild the index for the Contact entity
    MassIndexer indexer = fullTextEntityManager.createIndexer(Contact.class);
    indexer.batchSizeToLoadObjects(BATCH_SIZE);
    indexer.threadsToLoadObjects(NB_THREADS);
    indexer.progressMonitor(new IndexerProgressMonitor(callBack));
    indexer.start(); // asynchronous: returns immediately
}
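For context, the method is invoked roughly like this (the call site below is simplified for illustration; the logger and the way we obtain the EntityManager are placeholders, and IndexerProgressMonitor is our own progress-monitor class):

index(entityManager, count -> LOGGER.info("Indexed " + count + " contacts so far"));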


After several unsuccessful tries, the only thing that makes the process run to the end is to edit the MariaDB config file and set: net_write_timeout = 3600
The default timeout is 60 seconds; here we raise it to 1 hour, as the process takes about 45 minutes.
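If it matters, a per-connection variant of the same setting would presumably look like the sketch below instead of a global config change (sessionVariables is a MariaDB Connector/J URL parameter; the host and database names are placeholders, and we have not actually tested this):

// Hypothetical JDBC URL scoping the timeout to the indexing connections only;
// parameter support may depend on the driver version.
String jdbcUrl = "jdbc:mariadb://db-host:3306/contacts"
        + "?sessionVariables=net_write_timeout=3600";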
Does someone have an idea of what we did wrong? It does not seem reasonable to set such a huge value just to run a full re-index of the table.
Is there a way to ask the MassIndexer to process the data in limited chunks, or something else, to avoid keeping the connection open for too long?
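In case it helps frame the question: the closest knob I could find in the MassIndexer API is idFetchSize(...), which as far as I understand controls the JDBC fetch size used when scrolling the identifiers. A sketch of what I mean, with an arbitrary value; whether it has any effect on this timeout is exactly what I am unsure about:

MassIndexer indexer = fullTextEntityManager.createIndexer(Contact.class);
indexer.idFetchSize(150);                    // fetch size for the identifier scroll (value arbitrary)
indexer.batchSizeToLoadObjects(BATCH_SIZE);
indexer.threadsToLoadObjects(NB_THREADS);
indexer.progressMonitor(new IndexerProgressMonitor(callBack));
indexer.startAndWait();                      // blocking variant of start()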
Thanks for your time.
Re: Hibernate MassIndexer and Lucene connection closes on MariaDB
Hey Sylvain,

this seems like a Hibernate Search problem and not really a Lucene one.
Maybe you could post this to https://discourse.hibernate.org/ and we'll
continue the discussion there?

Have a nice day,
Marko
