In an effort to prepare Elasticsearch for modularization, we are
investigating and eliminating split packages. The situation has improved
through recent refactoring in Lucene 9.0 [1], but a number of split
packages still remain. This message identifies one such so that it can
be discussed in isolation, with a view to a potential solution either in
Lucene or possibly within Elasticsearch itself.
Elasticsearch has a custom merge policy, ShuffleForcedMergePolicy, that
interleaves eldest and newest segments. SFMP has but a single
package-private dependency on lucene, namely `SegmentInfo::setDiagnostics`.
The `setDiagnostics` method is used by SFMP to add a single additional
ES-specific diagnostic key.
It is possible to recreate the whole SegmentCommitInfo (and the SI
within, along with the additional ES-specific diagnostic key), by using
just the accessible parts of the lucene API. This has been prototyped[2],
but shows that something is not quite right [3]. Resolving this issue is
likely better done in lucene.
The comment in `MergePolicy::setMergeInfo` - "Sets the SegmentCommitInfo
of the merged segment. Allows sub-classes to e.g. set diagnostics
properties"[4] - is telling. To better support the use-cases referred to
in this comment and to allow for other-package subclasses (which seems
completely reasonable given other parts of the API), it would be better
for lucene's SI::setDiagnostics method to be public (rather than
package-private). Such a change is a general improvement in the lucene
API, which can be leveraged by any custom merge policy.
-Chris.
[1] https://issues.apache.org/jira/browse/LUCENE-9319
[2] https://github.com/elastic/elasticsearch/pull/78253
[3] https://github.com/elastic/elasticsearch/pull/78253#issuecomment-925861893
[4] https://github.com/apache/lucene/blob/d5d6dc079395c47cd6d12dcce3bcfdd2c7d9dc63/lucene/core/src/java/org/apache/lucene/index/MergePolicy.java#L271
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org
investigating and eliminating split packages. The situation has improved
through recent refactoring in Lucene 9.0 [1], but a number of split
packages still remain. This message identifies one such so that it can
be discussed in isolation, with a view to a potential solution either in
Lucene or possibly within Elasticsearch itself.
Elasticsearch has a custom merge policy, ShuffleForcedMergePolicy, that
interleaves eldest and newest segments. SFMP has but a single
package-private dependency on lucene, namely `SegmentInfo::setDiagnostics`.
The `setDiagnostics` method is used by SFMP to add a single additional
ES-specific diagnostic key.
It is possible to recreate the whole SegmentCommitInfo (and the SI
within, along with the additional ES-specific diagnostic key), by using
just the accessible parts of the lucene API. This has been prototyped[2],
but shows that something is not quite right [3]. Resolving this issue is
likely better done in lucene.
The comment in `MergePolicy::setMergeInfo` - "Sets the SegmentCommitInfo
of the merged segment. Allows sub-classes to e.g. set diagnostics
properties"[4] - is telling. To better support the use-cases referred to
in this comment and to allow for other-package subclasses (which seems
completely reasonable given other parts of the API), it would be better
for lucene's SI::setDiagnostics method to be public (rather than
package-private). Such a change is a general improvement in the lucene
API, which can be leveraged by any custom merge policy.
-Chris.
[1] https://issues.apache.org/jira/browse/LUCENE-9319
[2] https://github.com/elastic/elasticsearch/pull/78253
[3] https://github.com/elastic/elasticsearch/pull/78253#issuecomment-925861893
[4] https://github.com/apache/lucene/blob/d5d6dc079395c47cd6d12dcce3bcfdd2c7d9dc63/lucene/core/src/java/org/apache/lucene/index/MergePolicy.java#L271
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org