I think it makes sense to un-deprecate that API (why did we deprecate it?),
but I'm not sure IW should be in the business of soft/hard limits on field
count?
I agree such limits make sense if the integrity of the index is at risk,
e.g. IW does enforce a max number of unique documents in one index.
But for number of fields, as long as we expose the API, then the layer
above Lucene can handle soft/hard limits, notifying the user correctly,
rejecting updates, etc.?
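
A rough sketch of what such a layer above Lucene might look like, assuming the application can get the current field names as a plain Set<String> (for example from IndexWriter.getFieldNames, if it stays available). FieldLimitGuard, Decision, and both limit parameters are hypothetical names for illustration, not Lucene APIs:

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical application-layer guard; nothing here is part of Lucene. */
public class FieldLimitGuard {

    /** Outcome of checking one incoming document against the limits. */
    public enum Decision { ACCEPT, WARN, REJECT }

    private final int softLimit;
    private final int hardLimit;

    public FieldLimitGuard(int softLimit, int hardLimit) {
        if (softLimit < 0 || hardLimit < softLimit) {
            throw new IllegalArgumentException("require 0 <= softLimit <= hardLimit");
        }
        this.softLimit = softLimit;
        this.hardLimit = hardLimit;
    }

    /**
     * existingFields: field names already in the index (assumed to be supplied
     * by the caller). incomingFields: field names of the document to be added.
     */
    public Decision check(Set<String> existingFields, Set<String> incomingFields) {
        // Count the distinct fields the index would hold after the update.
        Set<String> union = new HashSet<>(existingFields);
        union.addAll(incomingFields);
        int total = union.size();

        if (total > hardLimit) {
            return Decision.REJECT; // caller rejects the update
        }
        if (total > softLimit) {
            return Decision.WARN;   // caller notifies the user but still indexes
        }
        return Decision.ACCEPT;
    }
}
```

The caller would run check before handing the document to the writer and decide how to surface WARN and REJECT to the user, so IW itself never needs to know about the limits.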
Mike McCandless
http://blog.mikemccandless.com

On Thu, Jan 14, 2021 at 5:36 PM Marcus Eagan <marcuseagan@gmail.com> wrote:
> I like Oren's idea and Simon's proposal of unlimited by default but
> configurable.
> Marcus
>
> On Thu, Jan 14, 2021 at 12:16 AM Simon Willnauer <
> simon.willnauer@gmail.com> wrote:
>
>> I personally have had pretty positive experience with what I call soft
>> limits. At Elastic we use them all over the place to catch issues when a
>> user likely misconfigures something or when there is likely an issue on
>> the user's end.
>> I think having an option on the IW that allows limiting the number of
>> fields makes sense.
>> We can even extract a general limits object with total num docs etc. if we
>> want. We can still set stuff to unlimited by default.
>>
>> WDYT
>>
>> Sent from a mobile device
>>
>> On 14. Jan 2021, at 06:36, David Smiley <dsmiley@apache.org> wrote:
>>
>> I don't like the idea of IndexWriter limiting field names, but I do like
>> the idea of un-deprecating that method, which appeared to have a trivial
>> implementation. Try commenting on the issue for its deprecation, which
>> has various watchers, to get their attention.
>>
>> ~ David Smiley
>> Apache Lucene/Solr Search Developer
>> http://www.linkedin.com/in/davidwsmiley
>>
>>
>> On Wed, Jan 13, 2021 at 5:02 PM Oren Ovadia
>> <oren.ovadia@mongodb.com.invalid> wrote:
>>
>>> Hi All,
>>>
>>> I work on Lucene at MongoDB.
>>>
>>> I would like to limit the number of fields in an index to prevent
>>> tenants from causing a mapping explosion.
>>>
>>> Since IndexWriter.getFieldNames has been deprecated
>>> <https://issues.apache.org/jira/browse/LUCENE-8909>, there is no way to
>>> do this without using a reader (which comes with a set of problems
>>> regarding flush/commit rates).
>>>
>>> Would love to add to Lucene the ability to have IndexWriters limiting
>>> the number of fields. Curious to hear your thoughts.
>>>
>>> Thanks,
>>> Oren
>>>
>>>
>
> --
> Marcus Eagan
>
>