Mailing List Archive

Stored field already compressed
Hello,



I would like to store a field for a document in the index without any compression. The field is already a compressed byte[]. The application already uses ZStd and it’s very well optimized for this data. It doesn’t seem Lucene will allow me to store the field without compressing it again. This seems wasteful. Is there a solution to this? Or would I have to implement my own Codec or some such? I started digging down that route and it doesn’t look pretty.



Tony
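
To illustrate the concern about compressing twice, here is a minimal sketch that uses only the JDK. It uses java.util.zip.Deflater as a stand-in for a general-purpose compressor, since Zstandard is not part of the JDK and Lucene's own stored-fields compression is not reproduced here. The class name, the sample vocabulary, and the seed are made up for illustration. The second pass over already-compressed bytes is expected to save little or nothing while still costing CPU time.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Random;
import java.util.zip.Deflater;

final class DoubleCompressionDemo {

    // Compress a byte array with DEFLATE (a stand-in for any general-purpose compressor).
    static byte[] deflate(byte[] input) {
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        while (!deflater.finished()) {
            int n = deflater.deflate(buffer);
            out.write(buffer, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    // Build reproducible, moderately compressible text from a small hypothetical vocabulary.
    static byte[] sampleText(int words, long seed) {
        String[] vocab = {"lucene", "stored", "field", "codec", "index", "segment", "zstd", "bytes"};
        Random random = new Random(seed);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < words; i++) {
            sb.append(vocab[random.nextInt(vocab.length)]).append(' ');
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] raw = sampleText(20_000, 42L);
        byte[] once = deflate(raw);   // what the application already does before indexing
        byte[] twice = deflate(once); // what a compressing stored-fields format would add on top
        System.out.printf("raw=%d once=%d twice=%d%n", raw.length, once.length, twice.length);
    }
}
```

Running main prints the three sizes side by side; the exact numbers depend on the data and the compressor, so none are quoted here.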
Re: Stored field already compressed
Hi,
I think here's how you can construct it easily.
https://github.com/apache/lucene/blob/1ebee9e6116b1dbc5bcd410b4180df1f9c4c9d50/lucene/core/src/java/org/apache/lucene/index/SortingStoredFieldsConsumer.java#L78


On Tue, Nov 14, 2023 at 10:11 PM Tony Schwartz <tony@xfire.io.invalid>
wrote:

> Hello,
>
>
>
> I would like to store a field for a document in the index without any
> compression. The field is already a compressed byte[]. The application
> already uses ZStd and it’s very well optimized for this data. It doesn’t
> seem Lucene will allow me to store the field without compressing it again.
> This seems wasteful. Is there a solution to this? Or would I have to
> implement my own Codec or some such? I started digging down that route and
> it doesn’t look pretty.
>
>
>
> Tony
>
>

--
Sincerely yours
Mikhail Khludnev
RE: Stored field already compressed
Awesome! Thank you, Mikhail! I was able to use this code to create my own Codec. Had to learn how to register the Codec via Java's SPI. Seems to be working perfectly. Appreciate the help! Significant performance improvement right out of the gate!

Tony
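
For readers who reach the same SPI step: Lucene looks codecs up by name through Java's ServiceLoader mechanism, so a custom codec is usually registered with a provider-configuration file on the classpath. The fragment below is a minimal sketch. The class name and the src/main/resources location (a Maven/Gradle convention) are assumptions for illustration, not details from this thread.

```text
# File: src/main/resources/META-INF/services/org.apache.lucene.codecs.Codec
# One fully qualified class name per line. The name below is hypothetical.
com.example.search.UncompressedStoredFieldsCodec
```

The codec class needs a public no-argument constructor. The name it passes to the Codec constructor is recorded in each segment, so the same class must be on the classpath whenever the index is opened for reading. On the write side, the codec is selected with IndexWriterConfig.setCodec.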



-----Original Message-----
From: Mikhail Khludnev <mkhl@apache.org>
Sent: Tuesday, November 14, 2023 15:16
To: java-user@lucene.apache.org
Subject: Re: Stored field already compressed

Hi,
I think here's how you can construct it easily.
https://github.com/apache/lucene/blob/1ebee9e6116b1dbc5bcd410b4180df1f9c4c9d50/lucene/core/src/java/org/apache/lucene/index/SortingStoredFieldsConsumer.java#L78



