Hi~
We are building an OLAP database on top of Lucene, and we rely heavily on Lucene's DocValues as our column store.
We would like to use DocValues to store array-typed fields. For example, if we want to store field1 and field2 from the JSON document below in DocValues, SORTED_NUMERIC and SORTED_SET seem to be our only options.
{
"field1": [ 3, 1, 1, 2 ],
"field2": [ "c", "a", "a", "b" ]
}
When we store field1 as SORTED_NUMERIC and field2 as SORTED_SET, we get the following:
field1:
* original: [3, 1, 1, 2]
* in SORTED_NUMERIC: [1, 1, 2, 3]
field2:
* original: ["c", "a", "a", "b"]
* in SORTED_SET: ords [0, 1, 2], terms ["a", "b", "c"]
In both cases the original order of the elements in the array is lost, and SORTED_SET additionally drops the duplicate "a".
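To make the behavior concrete, here is a minimal plain-Java model of what we observe. It does not call any Lucene API: the class DocValuesOrderSketch and its two methods are hypothetical names used only for illustration, and the model assumes ASCII terms (Lucene orders SORTED_SET terms by their bytes, which matches String ordering for these values).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Hypothetical model of the observed per-document behavior; not Lucene code.
public class DocValuesOrderSketch {

    // Models SORTED_NUMERIC: values come back in ascending order, duplicates kept.
    public static long[] asSortedNumeric(long[] values) {
        long[] copy = values.clone();
        Arrays.sort(copy);
        return copy;
    }

    // Models SORTED_SET: values come back deduplicated and in sorted order.
    public static List<String> asSortedSet(List<String> values) {
        return new ArrayList<>(new TreeSet<>(values));
    }

    public static void main(String[] args) {
        // prints [1, 1, 2, 3]
        System.out.println(Arrays.toString(asSortedNumeric(new long[] {3, 1, 1, 2})));
        // prints [a, b, c]
        System.out.println(asSortedSet(Arrays.asList("c", "a", "a", "b")));
    }
}
```

In neither result can we tell that the element at index 0 was originally 3 (or "c"), which is exactly the information we need.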
We're guessing that Lucene's DocValues are designed primarily for sorting and aggregation, where the original order of the elements may not matter.
But in our use case it is important to preserve the original order of the elements in the array, because we allow users to access array elements with the subscript operator.
Does Lucene have any plans to add a new DocValues type that can store arrays and preserve the original order of their elements?
Thanks!