Mailing List Archive: Issue related to SOLR Indexing

Hi,

I am indexing data into a 64-shard collection created on a SOLR 4.10.3 (CDH)
cluster of 19 nodes running over HDFS.
Indexing runs very well for the initial few hours (5-6), after which
different nodes of the cluster start showing health issues (varying
randomly across the nodes) and the indexing speed drops considerably.
I have applied the SOLR tuning guidelines from
https://www.cloudera.com/documentation/enterprise/5-8-x/topics/search_tuning_solr.html#csug_topic_10

but they did not resolve the problem. I observed that decreasing
"solr.hdfs.blockcache.slab.count" to a very low value (32) improves indexing
a lot, but only for the initial few hours.
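
For context on the slab count, here is a minimal sketch of the memory
arithmetic. It assumes the 128 MB slab size given in the Solr
HdfsDirectoryFactory documentation (8 KB blocks, 16384 blocks per slab); the
function name is illustrative and not part of SOLR. Under that assumption,
32 slabs ask for about 4 GB of off-heap memory per block cache, which has to
fit within the JVM's -XX:MaxDirectMemorySize when direct memory allocation
is enabled.

```python
# Assumption: each HDFS block cache slab is 128 MB (8 KB blocks x 16384 blocks).
BLOCK_SIZE_KB = 8
BLOCKS_PER_SLAB = 16384
SLAB_SIZE_MB = BLOCK_SIZE_KB * BLOCKS_PER_SLAB // 1024  # 128


def block_cache_mb(slab_count: int) -> int:
    """Return the off-heap memory, in MB, requested by one block cache."""
    if slab_count < 1:
        raise ValueError("slab_count must be at least 1")
    return slab_count * SLAB_SIZE_MB


if __name__ == "__main__":
    mb = block_cache_mb(32)  # the value of solr.hdfs.blockcache.slab.count tried above
    print(f"{mb} MB ({mb / 1024:.1f} GB)")  # prints "4096 MB (4.0 GB)"
```

Comparing this figure with the direct memory limit and the physical RAM of
each node is one way to check whether the cache size is a plausible factor.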

Some of the errors I see in the server-side logs are:
org.apache.curator.CuratorConnectionLossException: KeeperErrorCode = ConnectionLoss
org.apache.solr.core.SolrCore: org.apache.solr.common.SolrException: Cannot talk to ZooKeeper - Updates are disabled.
org.apache.solr.update.processor.DistributedUpdateProcessor: ClusterState says we are the leader, but locally we don't think so

The cluster never recovers on its own after hitting the errors mentioned
above.
Restarting the cluster does solve the problem, but it starts occurring
again after a few hours.

I would appreciate suggestions, guidelines, or helpful links on which
parameters I should consider, and their recommended values, to ensure
stable and smooth indexing.

--
Sathyam Doraswamy

You should probably post your question to the solr-user mailing list for a broader audience.
If you can share more details about your cluster and more logs, that would probably also help.

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

> On 6 Mar 2017, at 03:05, Sathyam <sathyam.doraswamy@gmail.com> wrote: