Mailing List Archive

Seeking guidance on uncompressed storage options in Lucene 9.7.0
Dear Lucene Community,

We're currently using Lucene 4.10.4 with Lucene410Codec for storing fields
without compression. However, transitioning to Lucene 9.7.0 presents a
challenge as we haven't found a similarly uncompressed codec.

Interestingly, even with the Mode.BEST_SPEED compression mode, the
retrieval speed for stored fields is significantly slower in Lucene 9.7.0
compared to Lucene 4.10.4 for our use case.

Our request is this:
Are there potential alternatives in Lucene 9.7.0 to achieve a storage
format similar to Lucene410Codec that avoids compression?
Any guidance or insights would be greatly appreciated. Thank you for your
time and assistance.

Hari
Re: Seeking guidance on uncompressed storage options in Lucene 9.7.0
Well, presumably it could be done with a custom codec, but that would incur
such a high effort that I'd rather revise the app logic, or check binary
docvalues first.
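
A minimal sketch of the binary docvalues route, assuming Lucene 9.x on the classpath. The class name BinaryDocValuesSketch and the field name "payload" are hypothetical, and an in-memory ByteBuffersDirectory stands in for a real index directory. Doc values hold at most one binary value per document per field and are stored column-wise, so whether this fits depends on how the application uses its stored fields.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import org.apache.lucene.document.BinaryDocValuesField;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.BinaryDocValues;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.LeafReaderContext;
import org.apache.lucene.search.DocIdSetIterator;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;
import org.apache.lucene.util.BytesRef;

public class BinaryDocValuesSketch {

  // Hypothetical field name chosen for this example.
  static final String FIELD = "payload";

  /** Indexes one document that carries the given bytes as a binary doc value. */
  static void indexPayload(Directory dir, byte[] payload) throws IOException {
    try (IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig())) {
      Document doc = new Document();
      doc.add(new BinaryDocValuesField(FIELD, new BytesRef(payload)));
      writer.addDocument(doc);
    } // closing the writer commits the pending document
  }

  /** Returns the payload of the first document that has one, or null if none does. */
  static byte[] readFirstPayload(Directory dir) throws IOException {
    try (DirectoryReader reader = DirectoryReader.open(dir)) {
      for (LeafReaderContext leaf : reader.leaves()) {
        BinaryDocValues values = leaf.reader().getBinaryDocValues(FIELD);
        if (values == null) {
          continue; // this segment has no values for the field
        }
        if (values.nextDoc() != DocIdSetIterator.NO_MORE_DOCS) {
          BytesRef ref = values.binaryValue();
          // Copy the slice: the BytesRef may be reused by the iterator.
          return Arrays.copyOfRange(ref.bytes, ref.offset, ref.offset + ref.length);
        }
      }
      return null;
    }
  }

  public static void main(String[] args) throws IOException {
    try (Directory dir = new ByteBuffersDirectory()) {
      indexPayload(dir, "hello".getBytes(StandardCharsets.UTF_8));
      byte[] back = readFirstPayload(dir);
      System.out.println(new String(back, StandardCharsets.UTF_8));
    }
  }
}
```

For lookups by document ID, BinaryDocValues.advanceExact(int) with a segment-local ID would replace the nextDoc() call.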

--
Sincerely yours
Mikhail Khludnev
RE: Seeking guidance on uncompressed storage options in Lucene 9.7.0
I asked a similar question a while ago and received a tip, so I implemented a codec of my own. To use it, you must register it in your .jar via a META-INF/services/org.apache.lucene.codecs.Codec file that lists the codec's fully qualified class name.

Here's my implementation:

import org.apache.lucene.codecs.Codec;
import org.apache.lucene.codecs.FilterCodec;
import org.apache.lucene.codecs.StoredFieldsFormat;
import org.apache.lucene.codecs.compressing.CompressionMode;
import org.apache.lucene.codecs.compressing.Compressor;
import org.apache.lucene.codecs.compressing.Decompressor;
import org.apache.lucene.codecs.lucene90.compressing.Lucene90CompressingStoredFieldsFormat;
import org.apache.lucene.codecs.lucene95.Lucene95Codec;
import org.apache.lucene.store.ByteBuffersDataInput;
import org.apache.lucene.store.DataInput;
import org.apache.lucene.store.DataOutput;
import org.apache.lucene.util.ArrayUtil;
import org.apache.lucene.util.BytesRef;

import java.io.IOException;

/**
 * Delegates everything to Lucene95Codec except stored fields, which are
 * written through a pass-through (no-op) compression mode.
 */
public class LhLuceneCodec extends FilterCodec {

    // Public no-arg constructor: needed so the codec can be instantiated
    // through the META-INF/services lookup when an index is opened.
    public LhLuceneCodec() {
        this( new Lucene95Codec( Lucene95Codec.Mode.BEST_SPEED ) );
    }

    protected LhLuceneCodec( Codec delegate ) {
        super( "LhLuceneCodec", delegate );
    }

    // Compression mode that copies bytes verbatim in both directions.
    private static final CompressionMode NO_COMPRESSION =
        new CompressionMode() {
            @Override
            public Compressor newCompressor() {
                return new Compressor() {
                    @Override
                    public void close() throws IOException {}

                    @Override
                    public void compress( ByteBuffersDataInput buffersInput, DataOutput out )
                            throws IOException {
                        // Write the input unchanged.
                        out.copyBytes( buffersInput, buffersInput.size() );
                    }
                };
            }

            @Override
            public Decompressor newDecompressor() {
                return new Decompressor() {
                    @Override
                    public void decompress( DataInput in, int originalLength, int offset, int length, BytesRef bytes )
                            throws IOException {
                        // Skip to the requested slice, then read exactly
                        // `length` bytes into the caller's buffer.
                        bytes.bytes = ArrayUtil.grow( bytes.bytes, length );
                        in.skipBytes( offset );
                        in.readBytes( bytes.bytes, 0, length );
                        bytes.offset = 0;
                        bytes.length = length;
                    }

                    @Override
                    public Decompressor clone() {
                        // Stateless, so the same instance can be shared.
                        return this;
                    }
                };
            }
        };

    private static final StoredFieldsFormat storedFieldsFormat = new Lucene90CompressingStoredFieldsFormat(
        "LhStoredFields",   // formatName
        NO_COMPRESSION,     // compressionMode
        128 * 1024,         // chunkSize
        1,                  // maxDocsPerChunk
        10                  // blockShift
    );

    @Override
    public StoredFieldsFormat storedFieldsFormat() {
        return storedFieldsFormat;
    }
}
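
A minimal sketch of how a codec like this could be wired into an IndexWriterConfig and checked for registration, assuming Lucene 9.5 or later on the classpath. The class name CodecWiringSketch and the field name "body" are hypothetical. So that the block compiles on its own, Codec.getDefault() stands in for the codec; with the class above on the classpath and registered, one would pass new LhLuceneCodec() instead.

```java
import java.io.IOException;
import org.apache.lucene.codecs.Codec;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.StoredField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;

public class CodecWiringSketch {

  /** True if a codec with this name is visible through the SPI lookup. */
  static boolean isRegistered(String codecName) {
    return Codec.availableCodecs().contains(codecName);
  }

  /** Writes one stored field with the given codec and reads it back. */
  static String roundTrip(Codec codec, String value) throws IOException {
    try (Directory dir = new ByteBuffersDirectory()) {
      IndexWriterConfig config = new IndexWriterConfig().setCodec(codec);
      try (IndexWriter writer = new IndexWriter(dir, config)) {
        Document doc = new Document();
        doc.add(new StoredField("body", value)); // hypothetical field name
        writer.addDocument(doc);
      }
      // The reader resolves the codec by the name recorded in the segment,
      // which is why a custom codec must be registered under META-INF/services.
      try (DirectoryReader reader = DirectoryReader.open(dir)) {
        return reader.storedFields().document(0).get("body");
      }
    }
  }

  public static void main(String[] args) throws IOException {
    Codec codec = Codec.getDefault(); // stand-in for new LhLuceneCodec()
    System.out.println("registered: " + isRegistered(codec.getName()));
    System.out.println(roundTrip(codec, "hello"));
  }
}
```

If isRegistered("LhLuceneCodec") returns false at runtime, the services file is probably missing from the .jar or lists the wrong class name.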

Tony Schwartz